diff mbox series

[1/4] selftests/sgx: Fix segfault upon early test failure

Message ID 3c1d84724ecc7c94131ba1d94dc4c5de5aafc58f.1643393473.git.reinette.chatre@intel.com (mailing list archive)
State Accepted
Commit 2d03861e0d1d1ee81efc59338101cdd86a7474f6
Headers show
Series selftests/sgx: Early enclave loading error path fixes | expand

Commit Message

Reinette Chatre Jan. 28, 2022, 6:23 p.m. UTC
A segfault is encountered if there happens to be an
early failure of any of the SGX tests. One way to
reproduce this is to remove the enclave binary
"test_encl.elf" that will trigger early enclave loading
failure followed by a segfault.

The segfault occurs within encl_delete() that cleans up
after an enclave by umapping its mapped regions and closing
the file descriptor to the SGX driver. As integrated with
the kselftest harness encl_delete() is called upon exit
from every test, irrespective of test success. encl_delete()
is also called to clean up if an error is encountered during
enclave loading.

encl_delete() is thus responsible for cleaning any amount of
enclave state - including state that has already been cleaned.

encl_delete() starts by accessing encl->segment_tbl that may
not have been created yet due to a very early failure or may
already be cleaned up because of a failure encountered after
encl->segment_tbl was created.

Ensure encl->segment_tbl is valid before attempting to access
memory offset from it. The offset with which it is accessed,
encl->nr_segments, is initialized after encl->segment_tbl and
thus considered valid to use after the encl->segment_tbl check
succeeds.

Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/load.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Dave Hansen Jan. 28, 2022, 6:43 p.m. UTC | #1
On 1/28/22 10:23, Reinette Chatre wrote:
> A segfault is encountered if there happens to be an
> early failure of any of the SGX tests. One way to
> reproduce this is to remove the enclave binary
> "test_encl.elf" that will trigger early enclave loading
> failure followed by a segfault.
> 
> The segfault occurs within encl_delete() that cleans up
> after an enclave by umapping its mapped regions and closing
> the file descriptor to the SGX driver. As integrated with
> the kselftest harness encl_delete() is called upon exit
> from every test, irrespective of test success. encl_delete()
> is also called to clean up if an error is encountered during
> enclave loading.
> 
> encl_delete() is thus responsible for cleaning any amount of
> enclave state - including state that has already been cleaned.
> 
> encl_delete() starts by accessing encl->segment_tbl that may
> not have been created yet due to a very early failure or may
> already be cleaned up because of a failure encountered after
> encl->segment_tbl was created.
> 
> Ensure encl->segment_tbl is valid before attempting to access
> memory offset from it. The offset with which it is accessed,
> encl->nr_segments, is initialized after encl->segment_tbl and
> thus considered valid to use after the encl->segment_tbl check
> succeeds.

I'm thinking we can be a bit more concise about the problem:

== Background ==

The SGX selftests track parts of the enclave binaries in an array:
encl->segment_tbl[].  That array is dynamically allocated early (but not
first) in the test's lifetime.  The array is referenced at the end of
the test in encl_delete().

== Problem ==

encl->segment_tbl[] can be NULL if the test fails before its allocation.
 That leads to a NULL-pointer-dereference in encl_delete().  This is
triggered during early failures of the selftest like if the enclave
binary ("test_encl.elf") is deleted.

--

I think it's also best to refer to this as a NULL-pointer problem rather
than a segfault.   The segfault is really just the fallout from the NULL
pointer, *not* the primary problem.

> Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  tools/testing/selftests/sgx/load.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
> index 9d4322c946e2..006b464c8fc9 100644
> --- a/tools/testing/selftests/sgx/load.c
> +++ b/tools/testing/selftests/sgx/load.c
> @@ -21,7 +21,7 @@
>  
>  void encl_delete(struct encl *encl)
>  {
> -	struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
> +	struct encl_segment *heap_seg;
>  
>  	if (encl->encl_base)
>  		munmap((void *)encl->encl_base, encl->encl_size);
> @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl)
>  	if (encl->fd)
>  		close(encl->fd);
>  
> -	munmap(heap_seg->src, heap_seg->size);
> -
> -	if (encl->segment_tbl)
> +	if (encl->segment_tbl) {
> +		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
> +		munmap(heap_seg->src, heap_seg->size);

This probably deserves a comment linking heap_seg->src and
encl->segment_tbl together. They _look_ independent here.

>  		free(encl->segment_tbl);
> +	}
>  
>  	memset(encl, 0, sizeof(*encl));
>  }
Reinette Chatre Jan. 28, 2022, 7:22 p.m. UTC | #2
Hi Dave,

On 1/28/2022 10:43 AM, Dave Hansen wrote:
> On 1/28/22 10:23, Reinette Chatre wrote:
>> A segfault is encountered if there happens to be an
>> early failure of any of the SGX tests. One way to
>> reproduce this is to remove the enclave binary
>> "test_encl.elf" that will trigger early enclave loading
>> failure followed by a segfault.
>>
>> The segfault occurs within encl_delete() that cleans up
>> after an enclave by umapping its mapped regions and closing
>> the file descriptor to the SGX driver. As integrated with
>> the kselftest harness encl_delete() is called upon exit
>> from every test, irrespective of test success. encl_delete()
>> is also called to clean up if an error is encountered during
>> enclave loading.
>>
>> encl_delete() is thus responsible for cleaning any amount of
>> enclave state - including state that has already been cleaned.
>>
>> encl_delete() starts by accessing encl->segment_tbl that may
>> not have been created yet due to a very early failure or may
>> already be cleaned up because of a failure encountered after
>> encl->segment_tbl was created.
>>
>> Ensure encl->segment_tbl is valid before attempting to access
>> memory offset from it. The offset with which it is accessed,
>> encl->nr_segments, is initialized after encl->segment_tbl and
>> thus considered valid to use after the encl->segment_tbl check
>> succeeds.
> 
> I'm thinking we can be a bit more concise about the problem:
> 
> == Background ==
> 
> The SGX selftests track parts of the enclave binaries in an array:
> encl->segment_tbl[].  That array is dynamically allocated early (but not
> first) in the test's lifetime.  The array is referenced at the end of
> the test in encl_delete().
> 
> == Problem ==
> 
> encl->segment_tbl[] can be NULL if the test fails before its allocation.
>  That leads to a NULL-pointer-dereference in encl_delete().  This is
> triggered during early failures of the selftest like if the enclave
> binary ("test_encl.elf") is deleted.
> 
> --
> 
> I think it's also best to refer to this as a NULL-pointer problem rather
> than a segfault.   The segfault is really just the fallout from the NULL
> pointer, *not* the primary problem.

Will do. I plan to resubmit with your changes and just also append the
paragraph that documents the fix.

> 
>> Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave")
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>>  tools/testing/selftests/sgx/load.c | 9 +++++----
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
>> index 9d4322c946e2..006b464c8fc9 100644
>> --- a/tools/testing/selftests/sgx/load.c
>> +++ b/tools/testing/selftests/sgx/load.c
>> @@ -21,7 +21,7 @@
>>  
>>  void encl_delete(struct encl *encl)
>>  {
>> -	struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
>> +	struct encl_segment *heap_seg;
>>  
>>  	if (encl->encl_base)
>>  		munmap((void *)encl->encl_base, encl->encl_size);
>> @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl)
>>  	if (encl->fd)
>>  		close(encl->fd);
>>  
>> -	munmap(heap_seg->src, heap_seg->size);
>> -
>> -	if (encl->segment_tbl)
>> +	if (encl->segment_tbl) {
>> +		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
>> +		munmap(heap_seg->src, heap_seg->size);
> 
> This probably deserves a comment linking heap_seg->src and
> encl->segment_tbl together. They _look_ independent here.

How about:

diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
index 006b464c8fc9..946eb7c2a253 100644
--- a/tools/testing/selftests/sgx/load.c
+++ b/tools/testing/selftests/sgx/load.c
@@ -33,6 +33,14 @@ void encl_delete(struct encl *encl)
 		close(encl->fd);
 
 	if (encl->segment_tbl) {
+		/*
+		 * Most segments form part of the enclave binary
+		 * and have their mappings deleted with earlier
+		 * munmap() of encl->bin.
+		 * As a mapping of anonymous memory the heap
+		 * segment is separate from the enclave
+		 * binary and needs its mapping deleted separately.
+		 */
 		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
 		munmap(heap_seg->src, heap_seg->size);
 		free(encl->segment_tbl);

Reinette
Dave Hansen Jan. 28, 2022, 7:26 p.m. UTC | #3
On 1/28/22 11:22, Reinette Chatre wrote:
>  	if (encl->segment_tbl) {
> +		/*
> +		 * Most segments form part of the enclave binary
> +		 * and have their mappings deleted with earlier
> +		 * munmap() of encl->bin.
> +		 * As a mapping of anonymous memory the heap
> +		 * segment is separate from the enclave
> +		 * binary and needs its mapping deleted separately.
> +		 */
>  		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
>  		munmap(heap_seg->src, heap_seg->size);

I was more wondering why the status of heap_seg->src is tied to
encl->segment_tbl.
Reinette Chatre Jan. 28, 2022, 8:08 p.m. UTC | #4
Hi Dave,

On 1/28/2022 11:26 AM, Dave Hansen wrote:
> On 1/28/22 11:22, Reinette Chatre wrote:
>>  	if (encl->segment_tbl) {
>> +		/*
>> +		 * Most segments form part of the enclave binary
>> +		 * and have their mappings deleted with earlier
>> +		 * munmap() of encl->bin.
>> +		 * As a mapping of anonymous memory the heap
>> +		 * segment is separate from the enclave
>> +		 * binary and needs its mapping deleted separately.
>> +		 */
>>  		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
>>  		munmap(heap_seg->src, heap_seg->size);
> 
> I was more wondering why the status of heap_seg->src is tied to
> encl->segment_tbl.

Apologies but it is not clear to me what the concern is. Please bear
with me as I first try to create some context and then I hope to
navigate to the issue.

The test creates an SGX enclave from a binary, in the test it is
named test_encl.elf. To create the enclave the test loads the data
from the binary and populates the enclave with it. In order to create
the enclave correctly the binary needs to be loaded via its distinct
segments, initially this is: TCS, TEXT, DATA. This is done to ensure
when the enclave is created it is done with the correct page types
and permissions. For example, the pages from the TCS segment needs
to be loaded in to enclave pages of type TCS with RW permission, the
pages from the TEXT segment needs to be loaded into regular enclave
pages with RX permission, etc.

To ensure the enclave is created correctly the test thus initializes
the data for each segment that will be loaded into the enclave into
that enclave's "segment_tbl".

struct encl {
        ...
        struct encl_segment *segment_tbl;
        ...
};

/*
 * struct encl_segment - parameters that needed for
 * SGX_IOC_ENCLAVE_ADD_PAGES ioctl()
 */
struct encl_segment {
        void *src;
        off_t offset;
        size_t size;
        unsigned int prot;
        unsigned int flags;
        bool measure;
};

Commit 3200505d4de6 ("selftests/sgx: Create a heap for the test
enclave") introduced a new segment, the heap, in support of more
testing. While not loaded from the original binary it is
considered a segment of the enclave and needs all fields in 
struct encl_segment in order to add its pages to the enclave
with appropriate page permissions and flags.

This is thus how heap_seg->src ended up connected to
encl->segment_tbl.

Apologies for being long winded here but I hope with this we could
narrow down where your concerns are.

Reinette
Jarkko Sakkinen Feb. 15, 2022, 7:39 p.m. UTC | #5
On Fri, Jan 28, 2022 at 10:23:56AM -0800, Reinette Chatre wrote:
> A segfault is encountered if there happens to be an
> early failure of any of the SGX tests. One way to
> reproduce this is to remove the enclave binary
> "test_encl.elf" that will trigger early enclave loading
> failure followed by a segfault.
> 
> The segfault occurs within encl_delete() that cleans up
> after an enclave by umapping its mapped regions and closing
> the file descriptor to the SGX driver. As integrated with
> the kselftest harness encl_delete() is called upon exit
> from every test, irrespective of test success. encl_delete()
> is also called to clean up if an error is encountered during
> enclave loading.
> 
> encl_delete() is thus responsible for cleaning any amount of
> enclave state - including state that has already been cleaned.
> 
> encl_delete() starts by accessing encl->segment_tbl that may
> not have been created yet due to a very early failure or may
> already be cleaned up because of a failure encountered after
> encl->segment_tbl was created.
> 
> Ensure encl->segment_tbl is valid before attempting to access
> memory offset from it. The offset with which it is accessed,
> encl->nr_segments, is initialized after encl->segment_tbl and
> thus considered valid to use after the encl->segment_tbl check
> succeeds.

Nit: textwidth=75

Not something I would NAK but just saying.

> 
> Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  tools/testing/selftests/sgx/load.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
> index 9d4322c946e2..006b464c8fc9 100644
> --- a/tools/testing/selftests/sgx/load.c
> +++ b/tools/testing/selftests/sgx/load.c
> @@ -21,7 +21,7 @@
>  
>  void encl_delete(struct encl *encl)
>  {
> -	struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
> +	struct encl_segment *heap_seg;
>  
>  	if (encl->encl_base)
>  		munmap((void *)encl->encl_base, encl->encl_size);
> @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl)
>  	if (encl->fd)
>  		close(encl->fd);
>  
> -	munmap(heap_seg->src, heap_seg->size);
> -
> -	if (encl->segment_tbl)
> +	if (encl->segment_tbl) {
> +		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
> +		munmap(heap_seg->src, heap_seg->size);
>  		free(encl->segment_tbl);
> +	}
>  
>  	memset(encl, 0, sizeof(*encl));
>  }
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org> 

I tested this by a minor code mod to trigger to failure case.

/Jarkko
diff mbox series

Patch

diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
index 9d4322c946e2..006b464c8fc9 100644
--- a/tools/testing/selftests/sgx/load.c
+++ b/tools/testing/selftests/sgx/load.c
@@ -21,7 +21,7 @@ 
 
 void encl_delete(struct encl *encl)
 {
-	struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
+	struct encl_segment *heap_seg;
 
 	if (encl->encl_base)
 		munmap((void *)encl->encl_base, encl->encl_size);
@@ -32,10 +32,11 @@  void encl_delete(struct encl *encl)
 	if (encl->fd)
 		close(encl->fd);
 
-	munmap(heap_seg->src, heap_seg->size);
-
-	if (encl->segment_tbl)
+	if (encl->segment_tbl) {
+		heap_seg = &encl->segment_tbl[encl->nr_segments - 1];
+		munmap(heap_seg->src, heap_seg->size);
 		free(encl->segment_tbl);
+	}
 
 	memset(encl, 0, sizeof(*encl));
 }