[RFC,v1,3/3] selftests/x86: Augment SGX selftest to test new __vdso_sgx_enter_enclave() and its callback interface
diff mbox series

Message ID f82e81c9cae31634684964d3fc4e9637e7565c69.1555965327.git.cedric.xing@intel.com
State New
Headers show
Series
  • An alternative __vdso_sgx_enter_enclave() to allow enclave/host parameter passing using untrusted stack
Related show

Commit Message

Xing, Cedric April 22, 2019, 8:42 p.m. UTC
Given the changes to __vdso_sgx_enter_enclave(), the selftest is augmented to test the newly added callback interface. This addtional test marks the whole enclave range as PROT_READ, and calls mprotect() upon #PFs to add necessary PTE permissions per PFEC (#PF Error Code) until the enclave finishes.

Signed-off-by: Cedric Xing <cedric.xing@intel.com>
---
 tools/testing/selftests/x86/sgx/main.c     | 123 ++++++++++++++++++---
 tools/testing/selftests/x86/sgx/sgx_call.S |  40 ++++++-
 2 files changed, 142 insertions(+), 21 deletions(-)

Comments

Andy Lutomirski April 23, 2019, 1:29 a.m. UTC | #1
On Mon, Apr 22, 2019 at 5:37 PM Cedric Xing <cedric.xing@intel.com> wrote:
>
> Given the changes to __vdso_sgx_enter_enclave(), the selftest is augmented to
> test the newly added callback interface. This addtional test marks the whole
> enclave range as PROT_READ, and calls mprotect() upon #PFs to add necessary PTE
> permissions per PFEC (#PF Error Code) until the enclave finishes.

Nifty.

What's not tested here is running this code with EFLAGS.TF set and
making sure that it unwinds correctly.  Also, Jarkko, unless I missed
something, the vDSO extable code likely has a bug.  If you run the
instruction right before ENCLU with EFLAGS.TF set, then do_debug()
will eat the SIGTRAP and skip to the exception handler.  Similarly, if
you put an instruction breakpoint on ENCLU, it'll get skipped.  Or is
the code actually correct and am I just remembering wrong?

--Andy
Sean Christopherson April 23, 2019, 1:48 a.m. UTC | #2
On Mon, Apr 22, 2019 at 06:29:06PM -0700, Andy Lutomirski wrote:
> On Mon, Apr 22, 2019 at 5:37 PM Cedric Xing <cedric.xing@intel.com> wrote:
> >
> > Given the changes to __vdso_sgx_enter_enclave(), the selftest is augmented to
> > test the newly added callback interface. This addtional test marks the whole
> > enclave range as PROT_READ, and calls mprotect() upon #PFs to add necessary PTE
> > permissions per PFEC (#PF Error Code) until the enclave finishes.
> 
> Nifty.
> 
> What's not tested here is running this code with EFLAGS.TF set and
> making sure that it unwinds correctly.  Also, Jarkko, unless I missed
> something, the vDSO extable code likely has a bug.  If you run the
> instruction right before ENCLU with EFLAGS.TF set, then do_debug()
> will eat the SIGTRAP and skip to the exception handler.  Similarly, if
> you put an instruction breakpoint on ENCLU, it'll get skipped.  Or is
> the code actually correct and am I just remembering wrong?

My money would be on the code being broken as opposed to you remembering
wrong.  I'll take a look at it tomorrow.
Sean Christopherson April 23, 2019, 6:59 p.m. UTC | #3
On Mon, Apr 22, 2019 at 06:29:06PM -0700, Andy Lutomirski wrote:
> What's not tested here is running this code with EFLAGS.TF set and
> making sure that it unwinds correctly.  Also, Jarkko, unless I missed
> something, the vDSO extable code likely has a bug.  If you run the
> instruction right before ENCLU with EFLAGS.TF set, then do_debug()
> will eat the SIGTRAP and skip to the exception handler.  Similarly, if
> you put an instruction breakpoint on ENCLU, it'll get skipped.  Or is
> the code actually correct and am I just remembering wrong?

The code is indeed broken, and I don't see a sane way to make it not
broken other than to never do vDSO fixup on #DB or #BP.  But that's
probably the right thing to do anyways since an attached debugger is
likely the intended recipient the 99.9999999% of the time.

The crux of the matter is that it's impossible to identify whether or
not a #DB/#BP originated from within an enclave, e.g. an INT3 in an
enclave will look identical to an INT3 at the AEP.  Even if hardware
provided a magic flag, #DB still has scenarios where the intended
recipient is ambiguous, e.g. data breakpoint encountered in the enclave
but on an address outside of the enclave, breakpoint encountered in the
enclave and a code breakpoint on the AEP, etc...
Andy Lutomirski April 23, 2019, 7:07 p.m. UTC | #4
On Tue, Apr 23, 2019 at 11:59 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Mon, Apr 22, 2019 at 06:29:06PM -0700, Andy Lutomirski wrote:
> > What's not tested here is running this code with EFLAGS.TF set and
> > making sure that it unwinds correctly.  Also, Jarkko, unless I missed
> > something, the vDSO extable code likely has a bug.  If you run the
> > instruction right before ENCLU with EFLAGS.TF set, then do_debug()
> > will eat the SIGTRAP and skip to the exception handler.  Similarly, if
> > you put an instruction breakpoint on ENCLU, it'll get skipped.  Or is
> > the code actually correct and am I just remembering wrong?
>
> The code is indeed broken, and I don't see a sane way to make it not
> broken other than to never do vDSO fixup on #DB or #BP.  But that's
> probably the right thing to do anyways since an attached debugger is
> likely the intended recipient the 99.9999999% of the time.
>
> The crux of the matter is that it's impossible to identify whether or
> not a #DB/#BP originated from within an enclave, e.g. an INT3 in an
> enclave will look identical to an INT3 at the AEP.  Even if hardware
> provided a magic flag, #DB still has scenarios where the intended
> recipient is ambiguous, e.g. data breakpoint encountered in the enclave
> but on an address outside of the enclave, breakpoint encountered in the
> enclave and a code breakpoint on the AEP, etc...

Ugh.  It sounds like ignoring the fixup for #DB is the right call.
But what happens if the enclave contains an INT3 or ICEBP instruction?
 Are they magically promoted to #GP, perhaps?

As a maybe possible alternative, if we made it so that the AEX address
was not the same as the ENCLU, could we usefully distinguish these
exceptions based on RIP?  I suppose it's also worth considering
whether page faults from *inside* the enclave should result in SIGSEGV
or result in a fixup.  We certainly want page faults from the ENCLU
instruction itself to get fixed up, but maybe we want most exceptions
inside the enclave to work a bit differently.  Of course, if we do
this, we need to make sure that the semantics of returning from the
signal handler are reasonable.
Sean Christopherson April 23, 2019, 8:11 p.m. UTC | #5
On Tue, Apr 23, 2019 at 12:07:26PM -0700, Andy Lutomirski wrote:
> On Tue, Apr 23, 2019 at 11:59 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Mon, Apr 22, 2019 at 06:29:06PM -0700, Andy Lutomirski wrote:
> > > What's not tested here is running this code with EFLAGS.TF set and
> > > making sure that it unwinds correctly.  Also, Jarkko, unless I missed
> > > something, the vDSO extable code likely has a bug.  If you run the
> > > instruction right before ENCLU with EFLAGS.TF set, then do_debug()
> > > will eat the SIGTRAP and skip to the exception handler.  Similarly, if
> > > you put an instruction breakpoint on ENCLU, it'll get skipped.  Or is
> > > the code actually correct and am I just remembering wrong?
> >
> > The code is indeed broken, and I don't see a sane way to make it not
> > broken other than to never do vDSO fixup on #DB or #BP.  But that's
> > probably the right thing to do anyways since an attached debugger is
> > likely the intended recipient the 99.9999999% of the time.
> >
> > The crux of the matter is that it's impossible to identify whether or
> > not a #DB/#BP originated from within an enclave, e.g. an INT3 in an
> > enclave will look identical to an INT3 at the AEP.  Even if hardware
> > provided a magic flag, #DB still has scenarios where the intended
> > recipient is ambiguous, e.g. data breakpoint encountered in the enclave
> > but on an address outside of the enclave, breakpoint encountered in the
> > enclave and a code breakpoint on the AEP, etc...
> 
> Ugh.  It sounds like ignoring the fixup for #DB is the right call.
> But what happens if the enclave contains an INT3 or ICEBP instruction?
>  Are they magically promoted to #GP, perhaps?

#UD for opt-out, a.k.a. non-debug, enclaves.  Delivered "normally" for
opt-in debug enclaves, except they're fault-like instead of trap-like.

> As a maybe possible alternative, if we made it so that the AEX address
> was not the same as the ENCLU, could we usefully distinguish these
> exceptions based on RIP?

Not really, because a user could set a code breakpoint on the AEX or
insert an INT3, e.g. to break on exit from the enclave.

Theoretically the kernel could cross-reference addresses to determine
whether or not a DRx match occurred on an enclave address, but a) that'd
be pretty ugly to implement and b) there would still be ambiguity, e.g. if
there's a code breakpoint on the AEX and a #DB occurs in the enclave, then
DR6 will record both the in-enclave DRx match and the AEX (non-enclave)
DRx match.

> I suppose it's also worth considering
> whether page faults from *inside* the enclave should result in SIGSEGV
> or result in a fixup.  We certainly want page faults from the ENCLU
> instruction itself to get fixed up, but maybe we want most exceptions
> inside the enclave to work a bit differently.  Of course, if we do
> this, we need to make sure that the semantics of returning from the
> signal handler are reasonable.

Hmm, I'm pretty sure any fault that is 100% in the domain of the enclave
should result in fixup.

Here are a few use cases off the top of my head that would require the
enclave's runtime to intercept the signal, either to reenter the enclave
or to feed the fault into the enclave's handler:

  - Handle EPC invalidation, e.g. due to VM migration, while a thread is
    in the enclave since the resulting #PF can occur inside the enclave.

  - During enclave development, configure the runtime to call into the
    enclave on any exception so that the enclave can dump register state.

  - Implement copy-on-write or lazy allocation using SGX2 instructions,
    which would require feeding the #PF back into the enclave.  Purely
    theoretical AFAIK, but lazy allocation in particular could be
    interesting, e.g. don't allocate .bss pages at startup time.

  - An enclave and its runtime might feed #UDs back into the enclave,
    e.g. to run an unmodified binary in an enclave by wrapping it in a
    shim of sorts.

Patch
diff mbox series

diff --git a/tools/testing/selftests/x86/sgx/main.c b/tools/testing/selftests/x86/sgx/main.c
index e2265f841fb0..234cfbad14a5 100644
--- a/tools/testing/selftests/x86/sgx/main.c
+++ b/tools/testing/selftests/x86/sgx/main.c
@@ -9,6 +9,7 @@ 
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <errno.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
@@ -18,6 +19,10 @@ 
 #include "../../../../../arch/x86/kernel/cpu/sgx/arch.h"
 #include "../../../../../arch/x86/include/uapi/asm/sgx.h"
 
+#define _Q(x)	__Q(x)
+#define __Q(x)	#x
+#define ERRLN	"Line " _Q(__LINE__)
+
 static const uint64_t MAGIC = 0x1122334455667788ULL;
 
 struct vdso_symtab {
@@ -138,7 +143,7 @@  static bool encl_create(int dev_fd, unsigned long bin_size,
 	base = mmap(NULL, secs->size, PROT_READ | PROT_WRITE | PROT_EXEC,
 		    MAP_SHARED, dev_fd, 0);
 	if (base == MAP_FAILED) {
-		perror("mmap");
+		perror(ERRLN);
 		return false;
 	}
 
@@ -224,24 +229,113 @@  static bool encl_load(struct sgx_secs *secs, unsigned long bin_size)
 	return false;
 }
 
-void sgx_call(void *rdi, void *rsi, void *tcs,
-	      struct sgx_enclave_exception *exception,
-	      void *eenter);
+int sgx_call(void *rdi, void *rsi, long rdx, void *rcx, void *r8, void *r9,
+	     void *tcs, struct sgx_enclave_exinfo *ei, void *cb, void *eenter);
+
+static void show_enclave_exinfo(const struct sgx_enclave_exinfo *exinfop,
+				const char *header)
+{
+	printf("%s: leaf:%d", header, exinfop->leaf);
+	if (exinfop->leaf != 4)
+		printf(" trap#:%d ec:%d addr:0x%llx\n", exinfop->trapnr,
+			exinfop->error_code, exinfop->address);
+	else printf("\n");
+}
+
+static void test1(void *eenter, struct sgx_secs *secs)
+{
+	uint64_t result = 0;
+	struct sgx_enclave_exinfo exinfo;
+
+	printf("[1] Entering the enclave without callback.\n");
+
+	printf("Input: 0x%lx\n Expect: Same as input\n", MAGIC);
+	sgx_call((void *)&MAGIC, &result, 0, NULL, NULL, NULL,
+		 (void *)secs->base, &exinfo, NULL, eenter);
+	if (result != MAGIC) {
+		fprintf(stderr, "0x%lx != 0x%lx\n", result, MAGIC);
+		exit(1);
+	}
+	printf(" Output: 0x%lx\n", result);
+
+	printf("Input: Null TCS\n Expect: #PF at EENTER\n");
+	sgx_call((void *)&MAGIC, &result, 0, NULL, NULL, NULL,
+		 NULL, &exinfo, NULL, eenter);
+	show_enclave_exinfo(&exinfo, " Exit");
+	if (exinfo.leaf != 2 /*EENTER*/ || exinfo.trapnr != 14 /*#PF*/)
+		exit(1);
+}
+
+static int enclave_ex_callback(long rdi, long rsi, long rdx,
+	struct sgx_enclave_exinfo *ei, long r8, long r9, void *tcs, long ursp)
+{
+	show_enclave_exinfo(ei, "  callback");
+
+	switch (ei->leaf)
+	{
+	case 4:
+		return 0;
+	case 3:
+	case 2:
+		if (ei->trapnr != 14 /*#PF*/ || (ei->error_code & 1) == 0) {
+			fprintf(stderr, ERRLN ": Unexpected exception\n");
+			exit(1);
+		}
+
+		if (mprotect((void*)(ei->address & -0x1000), 0x1000,
+			     ((ei->error_code & 2) ? PROT_WRITE : 0) |
+			     ((ei->error_code & 0x10) ? PROT_EXEC : 0) |
+			     PROT_READ)) {
+			perror(ERRLN);
+			exit(1);
+		}
+
+		return ei->leaf == 2 ? -EAGAIN : ei->leaf;
+	}
+	return -EINVAL;
+}
+
+static void test2(void *eenter, struct sgx_secs *secs)
+{
+	uint64_t result = 0;
+	struct sgx_enclave_exinfo exinfo;
+
+	printf("[2] Entering the enclave with callback.\n");
+
+	printf("Input: 0x%lx\n Expect: Same as input\n", MAGIC);
+	sgx_call((void *)&MAGIC, &result, 0, NULL, NULL, NULL,
+		 (void *)secs->base, &exinfo, enclave_ex_callback, eenter);
+	if (result != MAGIC) {
+		fprintf(stderr, "0x%lx != 0x%lx\n", result, MAGIC);
+		exit(1);
+	}
+	printf(" Output: 0x%lx\n", result);
+
+	printf("Input: Read-only enclave (0x%lx-0x%lx)\n"
+	       " Expect: #PFs to be fixed by callback\n",
+	       secs->base, secs->base + (encl_bin_end - encl_bin) - 1);
+	if (mprotect((void*)secs->base, encl_bin_end - encl_bin, PROT_READ)) {
+		perror(ERRLN);
+		exit(1);
+	}
+	while (sgx_call((void *)&MAGIC, &result, 0, NULL, NULL, NULL,
+			(void*)secs->base, &exinfo, enclave_ex_callback,
+			eenter) == -EAGAIN);
+	show_enclave_exinfo(&exinfo, " Exit");
+	if (exinfo.leaf != 4 /*EEXIT*/)
+		exit(1);
+}
 
 int main(int argc, char *argv[], char *envp[])
 {
 	unsigned long bin_size = encl_bin_end - encl_bin;
 	unsigned long ss_size = encl_ss_end - encl_ss;
-	struct sgx_enclave_exception exception;
 	Elf64_Sym *eenter_sym;
 	struct vdso_symtab symtab;
 	struct sgx_secs secs;
-	uint64_t result = 0;
 	void *eenter;
 	void *addr;
 
-	memset(&exception, 0, sizeof(exception));
-
 	addr = vdso_get_base_addr(envp);
 	if (!addr)
 		exit(1);
@@ -266,14 +360,7 @@  int main(int argc, char *argv[], char *envp[])
 	if (!encl_load(&secs, bin_size))
 		exit(1);
 
-	printf("Input: 0x%lx\n", MAGIC);
-	sgx_call((void *)&MAGIC, &result, (void *)secs.base, &exception,
-		 eenter);
-	if (result != MAGIC) {
-		fprintf(stderr, "0x%lx != 0x%lx\n", result, MAGIC);
-		exit(1);
-	}
-
-	printf("Output: 0x%lx\n", result);
-	exit(0);
+	test1(eenter, &secs);
+	test2(eenter, &secs);
+	return 0;
 }
diff --git a/tools/testing/selftests/x86/sgx/sgx_call.S b/tools/testing/selftests/x86/sgx/sgx_call.S
index 14bd0a044199..da8f687a60d2 100644
--- a/tools/testing/selftests/x86/sgx/sgx_call.S
+++ b/tools/testing/selftests/x86/sgx/sgx_call.S
@@ -7,9 +7,43 @@ 
 
 	.global sgx_call
 sgx_call:
+	.cfi_startproc
+	push	%r15
+	.cfi_adjust_cfa_offset	8
+	.cfi_rel_offset		%r15, 0
+	push	%r14
+	.cfi_adjust_cfa_offset	8
+	.cfi_rel_offset		%r14, 0
+	push	%r13
+	.cfi_adjust_cfa_offset	8
+	.cfi_rel_offset		%r13, 0
+	push	%r12
+	.cfi_adjust_cfa_offset	8
+	.cfi_rel_offset		%r12, 0
 	push	%rbx
-	mov	$0x02, %rax
-	mov	%rdx, %rbx
-	call	*%r8
+	.cfi_adjust_cfa_offset	8
+	.cfi_rel_offset		%rbx, 0
+	push	$0
+	.cfi_adjust_cfa_offset	8
+	push	0x48(%rsp)
+	.cfi_adjust_cfa_offset	8
+	push	0x48(%rsp)
+	.cfi_adjust_cfa_offset	8
+	push	0x48(%rsp)
+	.cfi_adjust_cfa_offset	8
+	mov	$2, %eax
+	call	*0x68(%rsp)
+	add	$0x20, %rsp
+	.cfi_adjust_cfa_offset	-0x20
 	pop	%rbx
+	.cfi_adjust_cfa_offset	-8
+	pop	%r12
+	.cfi_adjust_cfa_offset	-8
+	pop	%r13
+	.cfi_adjust_cfa_offset	-8
+	pop	%r14
+	.cfi_adjust_cfa_offset	-8
+	pop	%r15
+	.cfi_adjust_cfa_offset	-8
 	ret
+	.cfi_endproc