diff mbox series

[RFC,4/4] x86/vdso: x86/sgx: Allow the user to exit the vDSO loop on interrupts

Message ID 20200818042405.12871-5-sean.j.christopherson@intel.com (mailing list archive)
State New, archived
Headers show
Series x86/vdso: x86/sgx: Rework SGX vDSO API | expand

Commit Message

Sean Christopherson Aug. 18, 2020, 4:24 a.m. UTC
Allow userspace to exit the vDSO on interrupts that are acknowledged
while the enclave is active.  This allows the user's runtime to switch
contexts at opportune times without additional overhead, e.g. when using
an M:N threading model (where M user threads run N TCSs, with N > M).

Suggested-by: Jethro Beekman <jethro@fortanix.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/entry/vdso/vsgx_enter_enclave.S | 27 ++++++++++++++++++++----
 arch/x86/include/uapi/asm/sgx.h          |  3 +++
 2 files changed, 26 insertions(+), 4 deletions(-)

Comments

Jarkko Sakkinen Aug. 18, 2020, 5 p.m. UTC | #1
On Mon, Aug 17, 2020 at 09:24:05PM -0700, Sean Christopherson wrote:
> Allow userspace to exit the vDSO on interrupts that are acknowledged
> while the enclave is active.  This allows the user's runtime to switch
> contexts at opportune times without additional overhead, e.g. when using
> an M:N threading model (where M user threads run N TCSs, with N > M).
> 
> Suggested-by: Jethro Beekman <jethro@fortanix.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/entry/vdso/vsgx_enter_enclave.S | 27 ++++++++++++++++++++----
>  arch/x86/include/uapi/asm/sgx.h          |  3 +++
>  2 files changed, 26 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> index b09e87dbe9334..33428c0f94b0d 100644
> --- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> +++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> @@ -21,6 +21,9 @@
>  
>  #define SGX_SYNCHRONOUS_EXIT	0
>  #define SGX_EXCEPTION_EXIT	1
> +#define SGX_INTERRUPT_EXIT	2
> +
> +#define SGX_EXIT_ON_INTERRUPTS	1

This naming scheme does not look too great.

I have nothing principal about code changes, *if* this is the approach
where we want to move. I did not look every detail because I'm not sure
what is the consensus yet.

/Jarkko
Andy Lutomirski Aug. 18, 2020, 5:15 p.m. UTC | #2
On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Allow userspace to exit the vDSO on interrupts that are acknowledged
> while the enclave is active.  This allows the user's runtime to switch
> contexts at opportune times without additional overhead, e.g. when using
> an M:N threading model (where M user threads run N TCSs, with N > M).

This is IMO rather odd.  We don't support this type of notification on
interrupts for normal user code.  The fact user code can detect
interrupts during enclave execution is IMO an oddity of SGX, and I
have asked Intel to consider replacing the AEX mechanism with
something more transparent to user mode.  If this ever happens, this
mechanism is toast.

Even without architecture changes, building a *reliable* M:N threading
mechanism on top of this will be difficult or impossible, as there is
no particular guarantee that a thread will get timing interrupts at
all or that these interrupts will get lucky and hit enclave code, thus
triggering an AEX.  We certainly don't, and probably never will,
support any corresponding feature for non-enclave code.

So this seems like an odd, and possibly unsupportable, feature to add.

--Andy
Sean Christopherson Aug. 18, 2020, 5:31 p.m. UTC | #3
On Tue, Aug 18, 2020 at 10:15:49AM -0700, Andy Lutomirski wrote:
> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > Allow userspace to exit the vDSO on interrupts that are acknowledged
> > while the enclave is active.  This allows the user's runtime to switch
> > contexts at opportune times without additional overhead, e.g. when using
> > an M:N threading model (where M user threads run N TCSs, with N > M).
> 
> This is IMO rather odd.  We don't support this type of notification on
> interrupts for normal user code.  The fact user code can detect
> interrupts during enclave execution is IMO an oddity of SGX, and I
> have asked Intel to consider replacing the AEX mechanism with
> something more transparent to user mode.  If this ever happens, this
> mechanism is toast.
> 
> Even without architecture changes, building a *reliable* M:N threading
> mechanism on top of this will be difficult or impossible, as there is
> no particular guarantee that a thread will get timing interrupts at
> all or that these interrupts will get lucky and hit enclave code, thus
> triggering an AEX.  We certainly don't, and probably never will,
> support any corresponding feature for non-enclave code.
> 
> So this seems like an odd, and possibly unsupportable, feature to add.

I 100% agree that allowing the user to act on interrupts is weird/fragile.
I'll happily kill this off if there's an "official" NAK, but I wanted to
force the issue so that we're not stuck in limbo wondering whether or not
this should be supported.
Andy Lutomirski Aug. 18, 2020, 7:05 p.m. UTC | #4
> On Aug 18, 2020, at 10:31 AM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> 
> On Tue, Aug 18, 2020 at 10:15:49AM -0700, Andy Lutomirski wrote:
>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
>>> <sean.j.christopherson@intel.com> wrote:
>>> 
>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
>>> while the enclave is active.  This allows the user's runtime to switch
>>> contexts at opportune times without additional overhead, e.g. when using
>>> an M:N threading model (where M user threads run N TCSs, with N > M).
>> 
>> This is IMO rather odd.  We don't support this type of notification on
>> interrupts for normal user code.  The fact user code can detect
>> interrupts during enclave execution is IMO an oddity of SGX, and I
>> have asked Intel to consider replacing the AEX mechanism with
>> something more transparent to user mode.  If this ever happens, this
>> mechanism is toast.
>> 
>> Even without architecture changes, building a *reliable* M:N threading
>> mechanism on top of this will be difficult or impossible, as there is
>> no particular guarantee that a thread will get timing interrupts at
>> all or that these interrupts will get lucky and hit enclave code, thus
>> triggering an AEX.  We certainly don't, and probably never will,
>> support any corresponding feature for non-enclave code.
>> 
>> So this seems like an odd, and possibly unsupportable, feature to add.
> 
> I 100% agree that allowing the user to act on interrupts is weird/fragile.
> I'll happily kill this off if there's an "official" NAK, but I wanted to
> force the issue so that we're not stuck in limbo wondering whether or not
> this should be supported.

If you would like an official NAK, here you go: NAK!

If users want M:N and SIGALRM isn’t good enough, add something better. Please don’t rely on an architectural wart as an alternative.
Jethro Beekman Aug. 19, 2020, 2:21 p.m. UTC | #5
On 2020-08-18 19:15, Andy Lutomirski wrote:
> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
>>
>> Allow userspace to exit the vDSO on interrupts that are acknowledged
>> while the enclave is active.  This allows the user's runtime to switch
>> contexts at opportune times without additional overhead, e.g. when using
>> an M:N threading model (where M user threads run N TCSs, with N > M).
> 
> This is IMO rather odd.  We don't support this type of notification on
> interrupts for normal user code.  The fact user code can detect
> interrupts during enclave execution is IMO an oddity of SGX, and I
> have asked Intel to consider replacing the AEX mechanism with
> something more transparent to user mode.  If this ever happens, this
> mechanism is toast.

Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.

> Even without architecture changes, building a *reliable* M:N threading
> mechanism on top of this will be difficult or impossible, as there is
> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
> triggering an AEX.  We certainly don't, and probably never will,
> support any corresponding feature for non-enclave code.

There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).

> So this seems like an odd, and possibly unsupportable, feature to add.

I can implement all this without the vDSO call today, so why not support it? That just means not everyone is going to use the vDSO call, again resulting in potential problems when multiple libraries want to use enclaves.

> --Andy
> 

--
Jethro Beekman | Fortanix
Andy Lutomirski Aug. 19, 2020, 3:02 p.m. UTC | #6
On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@fortanix.com> wrote:
>
> On 2020-08-18 19:15, Andy Lutomirski wrote:
> > On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> >>
> >> Allow userspace to exit the vDSO on interrupts that are acknowledged
> >> while the enclave is active.  This allows the user's runtime to switch
> >> contexts at opportune times without additional overhead, e.g. when using
> >> an M:N threading model (where M user threads run N TCSs, with N > M).
> >
> > This is IMO rather odd.  We don't support this type of notification on
> > interrupts for normal user code.  The fact user code can detect
> > interrupts during enclave execution is IMO an oddity of SGX, and I
> > have asked Intel to consider replacing the AEX mechanism with
> > something more transparent to user mode.  If this ever happens, this
> > mechanism is toast.
>
> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.

No.  If Intel fixes the architecture, the proposed interface will
*stop working*.

>
> > Even without architecture changes, building a *reliable* M:N threading
> > mechanism on top of this will be difficult or impossible, as there is
> > no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
> > triggering an AEX.  We certainly don't, and probably never will,
> > support any corresponding feature for non-enclave code.
>
> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).

If you want to abort an enclave, abort it the same way you abort any
other user code -- send a signal or something.  If something is wrong
with signal handling in the proposed vDSO interface, then by all
means, let's fix it.  If we need better library signal support, let's
add such a thing.  If you really want to abort an enclave *cleanly*
without any chance of garbage left around, stick it in a subprocess.
We are not playing the game where someone sets a flag, crosses their
fingers, and waits for an interrupt.

>
> > So this seems like an odd, and possibly unsupportable, feature to add.
>
> I can implement all this without the vDSO call today, so why not support it? That just means not everyone is going to use the vDSO call, again resulting in potential problems when multiple libraries want to use enclaves.

No offense, but if you want to write garbage software with entirely
user code, I can't stop you, but the kernel is not going to give this
anything resembling a stamp of approval.  When you inevitably discover
that it has broken for one reason or another and your software stack
stops making forward progress, I don't want anyone playing the "ABI
stability" card and asking the upstream kernel to work around the
breakage.

Now maybe I'm wrong and there's an actual legitimate use case for this
trick, in which case I'd like to be enlightened.  But M:N threading
and enclave aborting don't sound like legitimate use cases.
Jethro Beekman Aug. 20, 2020, 11:13 a.m. UTC | #7
Tested-by: Jethro Beekman <jethro@fortanix.com>

--
Jethro Beekman | Fortanix

On 2020-08-18 06:24, Sean Christopherson wrote:
> Allow userspace to exit the vDSO on interrupts that are acknowledged
> while the enclave is active.  This allows the user's runtime to switch
> contexts at opportune times without additional overhead, e.g. when using
> an M:N threading model (where M user threads run N TCSs, with N > M).
> 
> Suggested-by: Jethro Beekman <jethro@fortanix.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/entry/vdso/vsgx_enter_enclave.S | 27 ++++++++++++++++++++----
>  arch/x86/include/uapi/asm/sgx.h          |  3 +++
>  2 files changed, 26 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> index b09e87dbe9334..33428c0f94b0d 100644
> --- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> +++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> @@ -21,6 +21,9 @@
>  
>  #define SGX_SYNCHRONOUS_EXIT	0
>  #define SGX_EXCEPTION_EXIT	1
> +#define SGX_INTERRUPT_EXIT	2
> +
> +#define SGX_EXIT_ON_INTERRUPTS	1
>  
>  /* Offsets into sgx_enter_enclave.exception. */
>  #define EX_TRAPNR	0*8
> @@ -51,12 +54,17 @@ SYM_FUNC_START(__vdso_sgx_enter_enclave)
>  
>  	mov	RUN_OFFSET(%rbp), %rcx
>  
> -	/* No flags are currently defined/supported. */
> -	cmpq	$0, FLAGS_OFFSET(%rcx)
> -	jne	.Linvalid_input
> -
>  	/* Load TCS and AEP */
>  	mov	TCS_OFFEST(%rcx), %rbx
> +
> +	/* Use the alternate AEP if the user wants to exit on interrupts. */
> +	mov	FLAGS_OFFSET(%rcx), %rcx
> +	cmpq	$SGX_EXIT_ON_INTERRUPTS, %rcx
> +	je	.Lload_interrupts_aep
> +
> +	/* All other flags are reserved. */
> +	test	%rcx, %rcx
> +	jne	.Linvalid_input
>  	lea	.Lasync_exit_pointer(%rip), %rcx
>  
>  	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
> @@ -93,6 +101,17 @@ SYM_FUNC_START(__vdso_sgx_enter_enclave)
>  	mov	$(-EINVAL), %eax
>  	jmp	.Lout
>  
> +.Lload_interrupts_aep:
> +	lea	.Lhandle_interrupt(%rip), %rcx
> +	jmp	.Lenclu_eenter_eresume
> +
> +.Lhandle_interrupt:
> +	mov	RUN_OFFSET(%rbp), %rbx
> +
> +	/* Set the exit_reason and exception info. */
> +	movl	$SGX_INTERRUPT_EXIT, EXIT_REASON_OFFSET(%rbx)
> +	jmp	.Lhandle_exit
> +
>  .Lhandle_exception:
>  	mov	RUN_OFFSET(%rbp), %rbx
>  
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index 80a8b7a949a23..beeabfad6eb81 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -76,6 +76,7 @@ struct sgx_enclave_set_attribute {
>  
>  #define SGX_SYNCHRONOUS_EXIT	0
>  #define SGX_EXCEPTION_EXIT	1
> +#define SGX_INTERRUPT_EXIT	2
>  
>  struct sgx_enclave_run;
>  
> @@ -116,6 +117,8 @@ struct sgx_enclave_exception {
>  	__u64 address;
>  };
>  
> +#define SGX_EXIT_ON_INTERRUPTS	(1ULL << 0)
> +
>  /**
>   * struct sgx_enclave_run - Control structure for __vdso_sgx_enter_enclave()
>   *
>
Jethro Beekman Aug. 20, 2020, 11:20 a.m. UTC | #8
On 2020-08-19 17:02, Andy Lutomirski wrote:
> On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@fortanix.com> wrote:
>>
>> On 2020-08-18 19:15, Andy Lutomirski wrote:
>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
>>> <sean.j.christopherson@intel.com> wrote:
>>>>
>>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
>>>> while the enclave is active.  This allows the user's runtime to switch
>>>> contexts at opportune times without additional overhead, e.g. when using
>>>> an M:N threading model (where M user threads run N TCSs, with N > M).
>>>
>>> This is IMO rather odd.  We don't support this type of notification on
>>> interrupts for normal user code.  The fact user code can detect
>>> interrupts during enclave execution is IMO an oddity of SGX, and I
>>> have asked Intel to consider replacing the AEX mechanism with
>>> something more transparent to user mode.  If this ever happens, this
>>> mechanism is toast.
>>
>> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.
> 
> No.  If Intel fixes the architecture, the proposed interface will
> *stop working*.
> 
>>
>>> Even without architecture changes, building a *reliable* M:N threading
>>> mechanism on top of this will be difficult or impossible, as there is
>>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
>>> triggering an AEX.  We certainly don't, and probably never will,
>>> support any corresponding feature for non-enclave code.
>>
>> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).
> 
> If you want to abort an enclave, abort it the same way you abort any
> other user code -- send a signal or something.  If something is wrong> with signal handling in the proposed vDSO interface, then by all
> means, let's fix it.  If we need better library signal support, let's
> add such a thing.  If you really want to abort an enclave *cleanly*
> without any chance of garbage left around, stick it in a subprocess.
> We are not playing the game where someone sets a flag, crosses their
> fingers, and waits for an interrupt.

Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible.

> 
>>
>>> So this seems like an odd, and possibly unsupportable, feature to add.
>>
>> I can implement all this without the vDSO call today, so why not support it? That just means not everyone is going to use the vDSO call, again resulting in potential problems when multiple libraries want to use enclaves.
> 
> No offense, but if you want to write garbage software with entirely
> user code, I can't stop you, but the kernel is not going to give this
> anything resembling a stamp of approval.  When you inevitably discover
> that it has broken for one reason or another and your software stack
> stops making forward progress, I don't want anyone playing the "ABI
> stability" card and asking the upstream kernel to work around the
> breakage.
> 
> Now maybe I'm wrong and there's an actual legitimate use case for this
> trick, in which case I'd like to be enlightened.  But M:N threading
> and enclave aborting don't sound like legitimate use cases.
> 

Why is it ok for this code to return to userspace after 1 second:



void ignore_signal_but_dont_restart(int s) {}

// set SIGALRM handler if none set (FIXME: racy)
struct sigaction sa_old;
struct sigaction sa = {
    .sa_handler = ignore_signal_but_dont_restart,
    .sa_flags = 0,
};
sigaction(SIGALRM, &sa, &sa_old);
if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
    || (sa_old.sa_flags & SA_SIGINFO)) {
    sa_old.sa_flags &= ~SA_RESTART;
    sigaction(SIGALRM, &sa_old, NULL);
}

alarm(1);

char buf;
read(0, &buf, 1);



But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)?



void ignore_signal_but_dont_restart(int s) {}

// set SIGALRM handler if none set (FIXME: racy)
struct sigaction sa_old;
struct sigaction sa = {
    .sa_handler = ignore_signal_but_dont_restart,
    .sa_flags = 0,
};
sigaction(SIGALRM, &sa, &sa_old);
if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
    || (sa_old.sa_flags & SA_SIGINFO)) {
    sa_old.sa_flags &= ~SA_RESTART;
    sigaction(SIGALRM, &sa_old, NULL);
}

alarm(1);

__vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL);



Tested on Linux 5.9-rc1 with SGX patches v36.

--
Jethro Beekman | Fortanix
Andy Lutomirski Aug. 20, 2020, 5:44 p.m. UTC | #9
On Thu, Aug 20, 2020 at 4:20 AM Jethro Beekman <jethro@fortanix.com> wrote:
>
> On 2020-08-19 17:02, Andy Lutomirski wrote:
> > On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@fortanix.com> wrote:
> >>
> >> On 2020-08-18 19:15, Andy Lutomirski wrote:
> >>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
> >>> <sean.j.christopherson@intel.com> wrote:
> >>>>
> >>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
> >>>> while the enclave is active.  This allows the user's runtime to switch
> >>>> contexts at opportune times without additional overhead, e.g. when using
> >>>> an M:N threading model (where M user threads run N TCSs, with N > M).
> >>>
> >>> This is IMO rather odd.  We don't support this type of notification on
> >>> interrupts for normal user code.  The fact user code can detect
> >>> interrupts during enclave execution is IMO an oddity of SGX, and I
> >>> have asked Intel to consider replacing the AEX mechanism with
> >>> something more transparent to user mode.  If this ever happens, this
> >>> mechanism is toast.
> >>
> >> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.
> >
> > No.  If Intel fixes the architecture, the proposed interface will
> > *stop working*.
> >
> >>
> >>> Even without architecture changes, building a *reliable* M:N threading
> >>> mechanism on top of this will be difficult or impossible, as there is
> >>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
> >>> triggering an AEX.  We certainly don't, and probably never will,
> >>> support any corresponding feature for non-enclave code.
> >>
> >> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).
> >
> > If you want to abort an enclave, abort it the same way you abort any
> > other user code -- send a signal or something.  If something is wrong> with signal handling in the proposed vDSO interface, then by all
> > means, let's fix it.  If we need better library signal support, let's
> > add such a thing.  If you really want to abort an enclave *cleanly*
> > without any chance of garbage left around, stick it in a subprocess.
> > We are not playing the game where someone sets a flag, crosses their
> > fingers, and waits for an interrupt.
>
> Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible.
>

Why not?  If we are failing to deliver signals if
__vdso_sgx_enter_enclave() is running, we have a bug and we should fix
it.  We should actually deliver the signal and then either resume the
enclave or return -EINTR or equivalent.  Are we getting this wrong?
Looking at your code, I think we're doing it right but maybe we could
do better.

I suppose we also need a test case for this if we do anything fancy here.

> > Now maybe I'm wrong and there's an actual legitimate use case for this
> > trick, in which case I'd like to be enlightened.  But M:N threading
> > and enclave aborting don't sound like legitimate use cases.
> >
>
> Why is it ok for this code to return to userspace after 1 second:
>
>
>
> void ignore_signal_but_dont_restart(int s) {}
>
> // set SIGALRM handler if none set (FIXME: racy)
> struct sigaction sa_old;
> struct sigaction sa = {
>     .sa_handler = ignore_signal_but_dont_restart,
>     .sa_flags = 0,
> };
> sigaction(SIGALRM, &sa, &sa_old);
> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
>     || (sa_old.sa_flags & SA_SIGINFO)) {
>     sa_old.sa_flags &= ~SA_RESTART;
>     sigaction(SIGALRM, &sa_old, NULL);
> }
>
> alarm(1);
>
> char buf;
> read(0, &buf, 1);
>

Seems fine.  That's POSIX.  User code is receiving a signal and is
being affected by the signal.  A signal is a user-visible thing.

>
>
> But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)?
>
>
>
> void ignore_signal_but_dont_restart(int s) {}
>
> // set SIGALRM handler if none set (FIXME: racy)
> struct sigaction sa_old;
> struct sigaction sa = {
>     .sa_handler = ignore_signal_but_dont_restart,
>     .sa_flags = 0,
> };
> sigaction(SIGALRM, &sa, &sa_old);
> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
>     || (sa_old.sa_flags & SA_SIGINFO)) {
>     sa_old.sa_flags &= ~SA_RESTART;
>     sigaction(SIGALRM, &sa_old, NULL);
> }
>
> alarm(1);
>
> __vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL);
>

This is a genuine semantic question: is __vdso_sgx_enter_enclave()
like read on a pipe (returns -EINTR on a signal) or is it more like a
restartable syscall or a normal library function that just keeps
running if your signal handler does nothing?  You could siglongjmp()
out, but that's a bit gross.

I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
it might be distinctly nontrivial, though.

But this isn't what this patch does, and I suspect we've been talking
past each other.  This patch makes __vdso_sgx_enter_enclave() return
if there's an *interrupt*.  If you push a keyboard button, move your
mouse, get a network interrupt on that core, etc, it will return.
This is nonsense.

--Andy
Jethro Beekman Aug. 20, 2020, 5:53 p.m. UTC | #10
On 2020-08-20 19:44, Andy Lutomirski wrote:
> On Thu, Aug 20, 2020 at 4:20 AM Jethro Beekman <jethro@fortanix.com> wrote:
>>
>> On 2020-08-19 17:02, Andy Lutomirski wrote:
>>> On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@fortanix.com> wrote:
>>>>
>>>> On 2020-08-18 19:15, Andy Lutomirski wrote:
>>>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
>>>>> <sean.j.christopherson@intel.com> wrote:
>>>>>>
>>>>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
>>>>>> while the enclave is active.  This allows the user's runtime to switch
>>>>>> contexts at opportune times without additional overhead, e.g. when using
>>>>>> an M:N threading model (where M user threads run N TCSs, with N > M).
>>>>>
>>>>> This is IMO rather odd.  We don't support this type of notification on
>>>>> interrupts for normal user code.  The fact user code can detect
>>>>> interrupts during enclave execution is IMO an oddity of SGX, and I
>>>>> have asked Intel to consider replacing the AEX mechanism with
>>>>> something more transparent to user mode.  If this ever happens, this
>>>>> mechanism is toast.
>>>>
>>>> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.
>>>
>>> No.  If Intel fixes the architecture, the proposed interface will
>>> *stop working*.
>>>
>>>>
>>>>> Even without architecture changes, building a *reliable* M:N threading
>>>>> mechanism on top of this will be difficult or impossible, as there is
>>>>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
>>>>> triggering an AEX.  We certainly don't, and probably never will,
>>>>> support any corresponding feature for non-enclave code.
>>>>
>>>> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).
>>>
>>> If you want to abort an enclave, abort it the same way you abort any
>>> other user code -- send a signal or something.  If something is wrong> with signal handling in the proposed vDSO interface, then by all
>>> means, let's fix it.  If we need better library signal support, let's
>>> add such a thing.  If you really want to abort an enclave *cleanly*
>>> without any chance of garbage left around, stick it in a subprocess.
>>> We are not playing the game where someone sets a flag, crosses their
>>> fingers, and waits for an interrupt.
>>
>> Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible.
>>
> 
> Why not?  If we are failing to deliver signals if
> __vdso_sgx_enter_enclave() is running, we have a bug and we should fix
> it.  We should actually deliver the signal and then either resume the
> enclave or return -EINTR or equivalent.  Are we getting this wrong?
> Looking at your code, I think we're doing it right but maybe we could
> do better.
> 
> I suppose we also need a test case for this if we do anything fancy here.
> 
>>> Now maybe I'm wrong and there's an actual legitimate use case for this
>>> trick, in which case I'd like to be enlightened.  But M:N threading
>>> and enclave aborting don't sound like legitimate use cases.
>>>
>>
>> Why is it ok for this code to return to userspace after 1 second:
>>
>>
>>
>> void ignore_signal_but_dont_restart(int s) {}
>>
>> // set SIGALRM handler if none set (FIXME: racy)
>> struct sigaction sa_old;
>> struct sigaction sa = {
>>     .sa_handler = ignore_signal_but_dont_restart,
>>     .sa_flags = 0,
>> };
>> sigaction(SIGALRM, &sa, &sa_old);
>> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
>>     || (sa_old.sa_flags & SA_SIGINFO)) {
>>     sa_old.sa_flags &= ~SA_RESTART;
>>     sigaction(SIGALRM, &sa_old, NULL);
>> }
>>
>> alarm(1);
>>
>> char buf;
>> read(0, &buf, 1);
>>
> 
> Seems fine.  That's POSIX.  User code is receiving a signal and is
> being affected by the signal.  A signal is a user-visible thing.
> 
>>
>>
>> But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)?
>>
>>
>>
>> void ignore_signal_but_dont_restart(int s) {}
>>
>> // set SIGALRM handler if none set (FIXME: racy)
>> struct sigaction sa_old;
>> struct sigaction sa = {
>>     .sa_handler = ignore_signal_but_dont_restart,
>>     .sa_flags = 0,
>> };
>> sigaction(SIGALRM, &sa, &sa_old);
>> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
>>     || (sa_old.sa_flags & SA_SIGINFO)) {
>>     sa_old.sa_flags &= ~SA_RESTART;
>>     sigaction(SIGALRM, &sa_old, NULL);
>> }
>>
>> alarm(1);
>>
>> __vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL);
>>
> 
> This is a genuine semantic question: is __vdso_sgx_enter_enclave()
> like read on a pipe (returns -EINTR on a signal) or is it more like a
> restartable syscall or a normal library function that just keeps
> running if your signal handler does nothing?  You could siglongjmp()
> out, but that's a bit gross.
> 
> I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
> it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
> it might be distinctly nontrivial, though.
> 
> But this isn't what this patch does, and I suspect we've been talking
> past each other.  This patch makes __vdso_sgx_enter_enclave() return
> if there's an *interrupt*.  If you push a keyboard button, move your
> mouse, get a network interrupt on that core, etc, it will return.
> This is nonsense.

It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important. 

It sounds like you're saying you want to subdivide AEXs into “interrupts that lead to user-observable signals” and “other interrupts”, and then hide the second category from the user. I wouldn't object to that, but I don't know how to code this. It seems like a lot of work compared to the obvious solution (this patch).

--
Jethro Beekman | Fortanix
Andy Lutomirski Aug. 22, 2020, 9:55 p.m. UTC | #11
On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@fortanix.com> wrote:
>
> On 2020-08-20 19:44, Andy Lutomirski wrote:
> > On Thu, Aug 20, 2020 at 4:20 AM Jethro Beekman <jethro@fortanix.com> wrote:
> >>
> >> On 2020-08-19 17:02, Andy Lutomirski wrote:
> >>> On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@fortanix.com> wrote:
> >>>>
> >>>> On 2020-08-18 19:15, Andy Lutomirski wrote:
> >>>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
> >>>>> <sean.j.christopherson@intel.com> wrote:
> >>>>>>
> >>>>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
> >>>>>> while the enclave is active.  This allows the user's runtime to switch
> >>>>>> contexts at opportune times without additional overhead, e.g. when using
> >>>>>> an M:N threading model (where M user threads run N TCSs, with N > M).
> >>>>>
> >>>>> This is IMO rather odd.  We don't support this type of notification on
> >>>>> interrupts for normal user code.  The fact user code can detect
> >>>>> interrupts during enclave execution is IMO an oddity of SGX, and I
> >>>>> have asked Intel to consider replacing the AEX mechanism with
> >>>>> something more transparent to user mode.  If this ever happens, this
> >>>>> mechanism is toast.
> >>>>
> >>>> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.
> >>>
> >>> No.  If Intel fixes the architecture, the proposed interface will
> >>> *stop working*.
> >>>
> >>>>
> >>>>> Even without architecture changes, building a *reliable* M:N threading
> >>>>> mechanism on top of this will be difficult or impossible, as there is
> >>>>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
> >>>>> triggering an AEX.  We certainly don't, and probably never will,
> >>>>> support any corresponding feature for non-enclave code.
> >>>>
> >>>> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).
> >>>
> >>> If you want to abort an enclave, abort it the same way you abort any
> >>> other user code -- send a signal or something.  If something is wrong> with signal handling in the proposed vDSO interface, then by all
> >>> means, let's fix it.  If we need better library signal support, let's
> >>> add such a thing.  If you really want to abort an enclave *cleanly*
> >>> without any chance of garbage left around, stick it in a subprocess.
> >>> We are not playing the game where someone sets a flag, crosses their
> >>> fingers, and waits for an interrupt.
> >>
> >> Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible.
> >>
> >
> > Why not?  If we are failing to deliver signals if
> > __vdso_sgx_enter_enclave() is running, we have a bug and we should fix
> > it.  We should actually deliver the signal and then either resume the
> > enclave or return -EINTR or equivalent.  Are we getting this wrong?
> > Looking at your code, I think we're doing it right but maybe we could
> > do better.
> >
> > I suppose we also need a test case for this if we do anything fancy here.
> >
> >>> Now maybe I'm wrong and there's an actual legitimate use case for this
> >>> trick, in which case I'd like to be enlightened.  But M:N threading
> >>> and enclave aborting don't sound like legitimate use cases.
> >>>
> >>
> >> Why is it ok for this code to return to userspace after 1 second:
> >>
> >>
> >>
> >> void ignore_signal_but_dont_restart(int s) {}
> >>
> >> // set SIGALRM handler if none set (FIXME: racy)
> >> struct sigaction sa_old;
> >> struct sigaction sa = {
> >>     .sa_handler = ignore_signal_but_dont_restart,
> >>     .sa_flags = 0,
> >> };
> >> sigaction(SIGALRM, &sa, &sa_old);
> >> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
> >>     || (sa_old.sa_flags & SA_SIGINFO)) {
> >>     sa_old.sa_flags &= ~SA_RESTART;
> >>     sigaction(SIGALRM, &sa_old, NULL);
> >> }
> >>
> >> alarm(1);
> >>
> >> char buf;
> >> read(0, &buf, 1);
> >>
> >
> > Seems fine.  That's POSIX.  User code is receiving a signal and is
> > being affected by the signal.  A signal is a user-visible thing.
> >
> >>
> >>
> >> But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)?
> >>
> >>
> >>
> >> void ignore_signal_but_dont_restart(int s) {}
> >>
> >> // set SIGALRM handler if none set (FIXME: racy)
> >> struct sigaction sa_old;
> >> struct sigaction sa = {
> >>     .sa_handler = ignore_signal_but_dont_restart,
> >>     .sa_flags = 0,
> >> };
> >> sigaction(SIGALRM, &sa, &sa_old);
> >> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
> >>     || (sa_old.sa_flags & SA_SIGINFO)) {
> >>     sa_old.sa_flags &= ~SA_RESTART;
> >>     sigaction(SIGALRM, &sa_old, NULL);
> >> }
> >>
> >> alarm(1);
> >>
> >> __vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL);
> >>
> >
> > This is a genuine semantic question: is __vdso_sgx_enter_enclave()
> > like read on a pipe (returns -EINTR on a signal) or is it more like a
> > restartable syscall or a normal library function that just keeps
> > running if your signal handler does nothing?  You could siglongjmp()
> > out, but that's a bit gross.
> >
> > I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
> > it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
> > it might be distinctly nontrivial, though.
> >
> > But this isn't what this patch does, and I suspect we've been talking
> > past each other.  This patch makes __vdso_sgx_enter_enclave() return
> > if there's an *interrupt*.  If you push a keyboard button, move your
> > mouse, get a network interrupt on that core, etc, it will return.
> > This is nonsense.
>
> It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important.

Allow me to quote the changelog of the patch:

This allows the user's runtime to switch
contexts at opportune times without additional overhead, e.g. when using
an M:N threading model (where M user threads run N TCSs, with N > M).

NAK.

>
> It sounds like you're saying you want to subdivide AEXs into “interrupts that lead to user-observable signals” and “other interrupts”, and then hide the second category from the user. I wouldn't object to that, but I don't know how to code this. It seems like a lot of work compared to the obvious solution (this patch).

The obvious solution is NAKked.

Does siglongjmp() really not work for you?

--Andy
Jethro Beekman Aug. 24, 2020, 1:36 p.m. UTC | #12
On 2020-08-22 23:55, Andy Lutomirski wrote:
> On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@fortanix.com> wrote:
>>
>> On 2020-08-20 19:44, Andy Lutomirski wrote:
>>> This is a genuine semantic question: is __vdso_sgx_enter_enclave()
>>> like read on a pipe (returns -EINTR on a signal) or is it more like a
>>> restartable syscall or a normal library function that just keeps
>>> running if your signal handler does nothing?  You could siglongjmp()
>>> out, but that's a bit gross.
>>>
>>> I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
>>> it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
>>> it might be distinctly nontrivial, though.
>>>
>>> But this isn't what this patch does, and I suspect we've been talking
>>> past each other.  This patch makes __vdso_sgx_enter_enclave() return
>>> if there's an *interrupt*.  If you push a keyboard button, move your
>>> mouse, get a network interrupt on that core, etc, it will return.
>>> This is nonsense.
>>
>> It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important.
> 
> Allow me to quote the changelog of the patch:
> 
> This allows the user's runtime to switch
> contexts at opportune times without additional overhead, e.g. when using
> an M:N threading model (where M user threads run N TCSs, with N > M).
> 
> NAK.
> 
>>
>> It sounds like you're saying you want to subdivide AEXs into “interrupts that lead to user-observable signals” and “other interrupts”, and then hide the second category from the user. I wouldn't object to that, but I don't know how to code this. It seems like a lot of work compared to the obvious solution (this patch).
> 
> The obvious solution is NAKked.
> 
> Does siglongjmp() really not work for you?

Allow me to quote the changelog of "[PATCH v36 18/24] x86/vdso: Add support for exception fixup in vDSO functions":

Putting everything together, userspace enclaves will utilize a library
that must be prepared to handle any and (almost) all exceptions any time
at least one thread may be executing in an enclave.  Leveraging signals
to handle the enclave exceptions is unpleasant, to put it mildly, e.g.
the SGX library must constantly (un)register its signal handler based
on whether or not at least one thread is executing in an enclave, and
filter and forward exceptions that aren't related to its enclaves.  This
becomes particularly nasty when using multiple levels of libraries that
register signal handlers, e.g. running an enclave via cgo inside of the
Go runtime.


If I could use signal handlers I wouldn't use the VDSO call. (Corollary: if I wasn't using the VDSO call, I wouldn't have to think about this because what I want is already supported).

--
Jethro Beekman | Fortanix
Sean Christopherson Aug. 26, 2020, 6:32 p.m. UTC | #13
On Mon, Aug 24, 2020 at 03:36:29PM +0200, Jethro Beekman wrote:
> On 2020-08-22 23:55, Andy Lutomirski wrote:
> > On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@fortanix.com> wrote:
> >>
> >> On 2020-08-20 19:44, Andy Lutomirski wrote:
> >>> This is a genuine semantic question: is __vdso_sgx_enter_enclave()
> >>> like read on a pipe (returns -EINTR on a signal) or is it more like a
> >>> restartable syscall or a normal library function that just keeps
> >>> running if your signal handler does nothing?  You could siglongjmp()
> >>> out, but that's a bit gross.
> >>>
> >>> I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
> >>> it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
> >>> it might be distinctly nontrivial, though.
> >>>
> >>> But this isn't what this patch does, and I suspect we've been talking
> >>> past each other.  This patch makes __vdso_sgx_enter_enclave() return
> >>> if there's an *interrupt*.  If you push a keyboard button, move your
> >>> mouse, get a network interrupt on that core, etc, it will return.
> >>> This is nonsense.
> >>
> >> It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important.
> > 
> > Allow me to quote the changelog of the patch:
> > 
> > This allows the user's runtime to switch
> > contexts at opportune times without additional overhead, e.g. when using
> > an M:N threading model (where M user threads run N TCSs, with N > M).
> > 
> > NAK.
> > 
> >>
> >> It sounds like you're saying you want to subdivide AEXs into “interrupts
> >> that lead to user-observable signals” and “other interrupts”, and then
> >> hide the second category from the user. I wouldn't object to that, but I
> >> don't know how to code this. It seems like a lot of work compared to the
> >> obvious solution (this patch).
> > 
> > The obvious solution is NAKked.
> > 
> > Does siglongjmp() really not work for you?
> 
> Allow me to quote the changelog of "[PATCH v36 18/24] x86/vdso: Add support
> for exception fixup in vDSO functions":
> 
> Putting everything together, userspace enclaves will utilize a library
> that must be prepared to handle any and (almost) all exceptions any time
> at least one thread may be executing in an enclave.  Leveraging signals
> to handle the enclave exceptions is unpleasant, to put it mildly, e.g.
> the SGX library must constantly (un)register its signal handler based
> on whether or not at least one thread is executing in an enclave, and
> filter and forward exceptions that aren't related to its enclaves.  This
> becomes particularly nasty when using multiple levels of libraries that
> register signal handlers, e.g. running an enclave via cgo inside of the
> Go runtime.
> 
> 
> If I could use signal handlers I wouldn't use the VDSO call. (Corollary: if I
> wasn't using the VDSO call, I wouldn't have to think about this because what
> I want is already supported).

Jethro, I think what you're requesting is simply not possible.  Or rather,
the SGX vDSO can't provide any additional value relative to what can be
provided by userspace.

What Andy is saying is that having the vDSO take action on interrupts is
undesirable/NAK'd from a kernel perspective, as it's ugly and unreliable.
It happens to mostly work for SIGALRM because there is an interrupt
associated with the expiration of the timer[*].  But even then it will
work if and only if the interrupt arrives while the enclave is running.
If the interrupt arrives just before ENCLU, either on initial entry or
on resume from the callback, SIGALRM will be delivered and the userspace
M:N scheduler that is relying on an exit from the enclave will miss its
"tick".

The vDSO could check for pending signals, but that effectively provides no
value as opposed to checking for signals in the callback (or caller).

The goal of the SGX vDSO is to obviate the need for handling signals that
are due to exceptions that are directly associated with the enclave.  IMO,
the SIGALRM request is completely out of scope as it is not a problem that
is unique to SGX, i.e. wanting to do M:N scheduling in a library without
hooking SIGALRM is a generic request.  The fact that the SGX architcture has
a quirk that allows for a semi-functional hack is not justification for
supporting such a hack in the kernel.

[*] Is it even guaranteed that an interrupt will precede SIGALRM?  Not that
    it matters for this discussion.
Xing, Cedric Aug. 26, 2020, 7:09 p.m. UTC | #14
On 8/26/2020 11:32 AM, Sean Christopherson wrote:
> On Mon, Aug 24, 2020 at 03:36:29PM +0200, Jethro Beekman wrote:
>> On 2020-08-22 23:55, Andy Lutomirski wrote:
>>> On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@fortanix.com> wrote:
>>>>
>>>> On 2020-08-20 19:44, Andy Lutomirski wrote:
>>>>> This is a genuine semantic question: is __vdso_sgx_enter_enclave()
>>>>> like read on a pipe (returns -EINTR on a signal) or is it more like a
>>>>> restartable syscall or a normal library function that just keeps
>>>>> running if your signal handler does nothing?  You could siglongjmp()
>>>>> out, but that's a bit gross.
>>>>>
>>>>> I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
>>>>> it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
>>>>> it might be distinctly nontrivial, though.
>>>>>
>>>>> But this isn't what this patch does, and I suspect we've been talking
>>>>> past each other.  This patch makes __vdso_sgx_enter_enclave() return
>>>>> if there's an *interrupt*.  If you push a keyboard button, move your
>>>>> mouse, get a network interrupt on that core, etc, it will return.
>>>>> This is nonsense.
>>>>
>>>> It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important.
>>>
>>> Allow me to quote the changelog of the patch:
>>>
>>> This allows the user's runtime to switch
>>> contexts at opportune times without additional overhead, e.g. when using
>>> an M:N threading model (where M user threads run N TCSs, with N > M).
>>>
>>> NAK.
>>>
>>>>
>>>> It sounds like you're saying you want to subdivide AEXs into “interrupts
>>>> that lead to user-observable signals” and “other interrupts”, and then
>>>> hide the second category from the user. I wouldn't object to that, but I
>>>> don't know how to code this. It seems like a lot of work compared to the
>>>> obvious solution (this patch).
>>>
>>> The obvious solution is NAKked.
>>>
>>> Does siglongjmp() really not work for you?
>>
>> Allow me to quote the changelog of "[PATCH v36 18/24] x86/vdso: Add support
>> for exception fixup in vDSO functions":
>>
>> Putting everything together, userspace enclaves will utilize a library
>> that must be prepared to handle any and (almost) all exceptions any time
>> at least one thread may be executing in an enclave.  Leveraging signals
>> to handle the enclave exceptions is unpleasant, to put it mildly, e.g.
>> the SGX library must constantly (un)register its signal handler based
>> on whether or not at least one thread is executing in an enclave, and
>> filter and forward exceptions that aren't related to its enclaves.  This
>> becomes particularly nasty when using multiple levels of libraries that
>> register signal handlers, e.g. running an enclave via cgo inside of the
>> Go runtime.
>>
>>
>> If I could use signal handlers I wouldn't use the VDSO call. (Corollary: if I
>> wasn't using the VDSO call, I wouldn't have to think about this because what
>> I want is already supported).
> 
> Jethro, I think what you're requesting is simply not possible.  Or rather,
> the SGX vDSO can't provide any additional value relative to what can be
> provided by userspace.
> 
> What Andy is saying is that having the vDSO take action on interrupts is
> undesirable/NAK'd from a kernel perspective, as it's ugly and unreliable.
> It happens to mostly work for SIGALRM because there is an interrupt
> associated with the expiration of the timer[*].  But even then it will
> work if and only if the interrupt arrives while the enclave is running.
> If the interrupt arrives just before ENCLU, either on initial entry or
> on resume from the callback, SIGALRM will be delivered and the userspace
> M:N scheduler that is relying on an exit from the enclave will miss its
> "tick".
> 
> The vDSO could check for pending signals, but that effectively provides no
> value as opposed to checking for signals in the callback (or caller).
> 
> The goal of the SGX vDSO is to obviate the need for handling signals that
> are due to exceptions that are directly associated with the enclave.  IMO,
> the SIGALRM request is completely out of scope as it is not a problem that
> is unique to SGX, i.e. wanting to do M:N scheduling in a library without
> hooking SIGALRM is a generic request.  The fact that the SGX architcture has
> a quirk that allows for a semi-functional hack is not justification for
> supporting such a hack in the kernel.
> 
> [*] Is it even guaranteed that an interrupt will precede SIGALRM?  Not that
>      it matters for this discussion.
> 

Hi Jethro,

I'm with Andy and Sean that it's unprecedented and undesired to notify 
applications of interrupts from the kernel's perspective.

I think what you really want is to be notified on signals without a 
signal handler, in an effort to avoid conflicts between libraries and 
their user applications. But please keep in mind that regarding signals, 
there are always 2 parts - signaling and handling. Even though you could 
avoid conflicts in the handling, you still may not avoid conflicts in 
the signaling part. For example, say you want SIGALRM every 100ms for 
M:N scheduling in a lib but the user application also uses SIGALRM for 
something else. Even though you can avoid conflict by not registering a 
SIGALRM handler, the user application may use SIGALRM at a different 
frequency. Hence the lib and the application will still interfere with 
each other. Then what's the good of not registering a handler?
Jethro Beekman Aug. 27, 2020, 8:57 a.m. UTC | #15
On 2020-08-26 20:32, Sean Christopherson wrote:
> On Mon, Aug 24, 2020 at 03:36:29PM +0200, Jethro Beekman wrote:
>> On 2020-08-22 23:55, Andy Lutomirski wrote:
>>> On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@fortanix.com> wrote:
>>>>
>>>> On 2020-08-20 19:44, Andy Lutomirski wrote:
>>>>> This is a genuine semantic question: is __vdso_sgx_enter_enclave()
>>>>> like read on a pipe (returns -EINTR on a signal) or is it more like a
>>>>> restartable syscall or a normal library function that just keeps
>>>>> running if your signal handler does nothing?  You could siglongjmp()
>>>>> out, but that's a bit gross.
>>>>>
>>>>> I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
>>>>> it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
>>>>> it might be distinctly nontrivial, though.
>>>>>
>>>>> But this isn't what this patch does, and I suspect we've been talking
>>>>> past each other.  This patch makes __vdso_sgx_enter_enclave() return
>>>>> if there's an *interrupt*.  If you push a keyboard button, move your
>>>>> mouse, get a network interrupt on that core, etc, it will return.
>>>>> This is nonsense.
>>>>
>>>> It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important.
>>>
>>> Allow me to quote the changelog of the patch:
>>>
>>> This allows the user's runtime to switch
>>> contexts at opportune times without additional overhead, e.g. when using
>>> an M:N threading model (where M user threads run N TCSs, with N > M).
>>>
>>> NAK.
>>>
>>>>
>>>> It sounds like you're saying you want to subdivide AEXs into “interrupts
>>>> that lead to user-observable signals” and “other interrupts”, and then
>>>> hide the second category from the user. I wouldn't object to that, but I
>>>> don't know how to code this. It seems like a lot of work compared to the
>>>> obvious solution (this patch).
>>>
>>> The obvious solution is NAKked.
>>>
>>> Does siglongjmp() really not work for you?
>>
>> Allow me to quote the changelog of "[PATCH v36 18/24] x86/vdso: Add support
>> for exception fixup in vDSO functions":
>>
>> Putting everything together, userspace enclaves will utilize a library
>> that must be prepared to handle any and (almost) all exceptions any time
>> at least one thread may be executing in an enclave.  Leveraging signals
>> to handle the enclave exceptions is unpleasant, to put it mildly, e.g.
>> the SGX library must constantly (un)register its signal handler based
>> on whether or not at least one thread is executing in an enclave, and
>> filter and forward exceptions that aren't related to its enclaves.  This
>> becomes particularly nasty when using multiple levels of libraries that
>> register signal handlers, e.g. running an enclave via cgo inside of the
>> Go runtime.
>>
>>
>> If I could use signal handlers I wouldn't use the VDSO call. (Corollary: if I
>> wasn't using the VDSO call, I wouldn't have to think about this because what
>> I want is already supported).
> 
> Jethro, I think what you're requesting is simply not possible.  Or rather,
> the SGX vDSO can't provide any additional value relative to what can be
> provided by userspace.
> 
> What Andy is saying is that having the vDSO take action on interrupts is
> undesirable/NAK'd from a kernel perspective, as it's ugly and unreliable.
> It happens to mostly work for SIGALRM because there is an interrupt
> associated with the expiration of the timer[*].  But even then it will
> work if and only if the interrupt arrives while the enclave is running.
> If the interrupt arrives just before ENCLU, either on initial entry or
> on resume from the callback, SIGALRM will be delivered and the userspace
> M:N scheduler that is relying on an exit from the enclave will miss its
> "tick".
> 
> The vDSO could check for pending signals, but that effectively provides no
> value as opposed to checking for signals in the callback (or caller).
> 
> The goal of the SGX vDSO is to obviate the need for handling signals that
> are due to exceptions that are directly associated with the enclave.  IMO,
> the SIGALRM request is completely out of scope as it is not a problem that
> is unique to SGX, i.e. wanting to do M:N scheduling in a library without
> hooking SIGALRM is a generic request.  The fact that the SGX architcture has
> a quirk that allows for a semi-functional hack is not justification for
> supporting such a hack in the kernel.
> 
> [*] Is it even guaranteed that an interrupt will precede SIGALRM?  Not that
>     it matters for this discussion.
> 

Let's make sure the VDSO call has a flags parameter and we can talk about this after merge.

--
Jethro Beekman | Fortanix
diff mbox series

Patch

diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S b/arch/x86/entry/vdso/vsgx_enter_enclave.S
index b09e87dbe9334..33428c0f94b0d 100644
--- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
+++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
@@ -21,6 +21,9 @@ 
 
 #define SGX_SYNCHRONOUS_EXIT	0
 #define SGX_EXCEPTION_EXIT	1
+#define SGX_INTERRUPT_EXIT	2
+
+#define SGX_EXIT_ON_INTERRUPTS	1
 
 /* Offsets into sgx_enter_enclave.exception. */
 #define EX_TRAPNR	0*8
@@ -51,12 +54,17 @@  SYM_FUNC_START(__vdso_sgx_enter_enclave)
 
 	mov	RUN_OFFSET(%rbp), %rcx
 
-	/* No flags are currently defined/supported. */
-	cmpq	$0, FLAGS_OFFSET(%rcx)
-	jne	.Linvalid_input
-
 	/* Load TCS and AEP */
 	mov	TCS_OFFEST(%rcx), %rbx
+
+	/* Use the alternate AEP if the user wants to exit on interrupts. */
+	mov	FLAGS_OFFSET(%rcx), %rcx
+	cmpq	$SGX_EXIT_ON_INTERRUPTS, %rcx
+	je	.Lload_interrupts_aep
+
+	/* All other flags are reserved. */
+	test	%rcx, %rcx
+	jne	.Linvalid_input
 	lea	.Lasync_exit_pointer(%rip), %rcx
 
 	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
@@ -93,6 +101,17 @@  SYM_FUNC_START(__vdso_sgx_enter_enclave)
 	mov	$(-EINVAL), %eax
 	jmp	.Lout
 
+.Lload_interrupts_aep:
+	lea	.Lhandle_interrupt(%rip), %rcx
+	jmp	.Lenclu_eenter_eresume
+
+.Lhandle_interrupt:
+	mov	RUN_OFFSET(%rbp), %rbx
+
+	/* Set the exit_reason and exception info. */
+	movl	$SGX_INTERRUPT_EXIT, EXIT_REASON_OFFSET(%rbx)
+	jmp	.Lhandle_exit
+
 .Lhandle_exception:
 	mov	RUN_OFFSET(%rbp), %rbx
 
diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 80a8b7a949a23..beeabfad6eb81 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -76,6 +76,7 @@  struct sgx_enclave_set_attribute {
 
 #define SGX_SYNCHRONOUS_EXIT	0
 #define SGX_EXCEPTION_EXIT	1
+#define SGX_INTERRUPT_EXIT	2
 
 struct sgx_enclave_run;
 
@@ -116,6 +117,8 @@  struct sgx_enclave_exception {
 	__u64 address;
 };
 
+#define SGX_EXIT_ON_INTERRUPTS	(1ULL << 0)
+
 /**
  * struct sgx_enclave_run - Control structure for __vdso_sgx_enter_enclave()
  *