diff mbox series

[03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

Message ID 9c2c83ccc7324390bfb302bd327d9236b890c679.1614590788.git.kai.huang@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM SGX virtualization support | expand

Commit Message

Huang, Kai March 1, 2021, 9:44 a.m. UTC
From: Jarkko Sakkinen <jarkko@kernel.org>

EREMOVE takes a pages and removes any association between that page and
an enclave.  It must be run on a page before it can be added into
another enclave.  Currently, EREMOVE is run as part of pages being freed
into the SGX page allocator.  It is not expected to fail.

KVM does not track how guest pages are used, which means that SGX
virtualization use of EREMOVE might fail.

Break out the EREMOVE call from the SGX page allocator.  This will allow
the SGX virtualization code to use the allocator directly.  (SGX/KVM
will also introduce a more permissive EREMOVE helper).

Implement original sgx_free_epc_page() as sgx_encl_free_epc_page() to be
more specific that it is used to free EPC page assigned to one enclave.
Print an error message when EREMOVE fails to explicitly call out EPC
page is leaked, and requires machine reboot to get leaked pages back.

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
RFC v6->v1:

 - Removed sgx_reset_epc_page() since with it, I found it is hard to find
   a place to print the msg saying EPC page is leaked.
 - Implemented original sgx_free_epc_page() as sgx_encl_free_epc_page(),
   and add pr_err_once() to print EPC page is leaked when EREMOVE failed.

---
 arch/x86/kernel/cpu/sgx/encl.c | 26 +++++++++++++++++++++++---
 arch/x86/kernel/cpu/sgx/main.c | 12 ++++--------
 2 files changed, 27 insertions(+), 11 deletions(-)

Comments

Sean Christopherson March 1, 2021, 5:29 p.m. UTC | #1
On Mon, Mar 01, 2021, Kai Huang wrote:
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 7449ef33f081..a7dc86e87a09 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -381,6 +381,26 @@ const struct vm_operations_struct sgx_vm_ops = {
>  	.access = sgx_vma_access,
>  };
>  
> +static void sgx_encl_free_epc_page(struct sgx_epc_page *epc_page)
> +{
> +	int ret;
> +
> +	WARN_ON_ONCE(epc_page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> +
> +	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
> +	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) {

This can be ENCLS_WARN, especially if you're printing a separate error message
about leaking the page.  That being said, I'm not sure a seperate error message
is a good idea.  If other stuff gets dumped to the kernel log between the WARN
and the pr_err_once(), it may not be clear to admins that the two events are
directly connected.  It's even possible the prints could come from two different
CPUs.

Why not dump a short blurb in the WARN itself?  The error message can be thrown
in a define if the line length is too obnoxious (it's ~109 chars if embedded
directly).

#define EREMOVE_ERROR_MESSAGE \
	"EREMOVE returned %d (0x%x).  EPC page leaked, reboot recommended."

	if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret))

> +		/*
> +		 * Give a message to remind EPC page is leaked, and requires
> +		 * machine reboot to get leaked pages back. This can be improved
> +		 * in the future by adding stats of leaked pages, etc.
> +		 */
> +		pr_err_once("EPC page is leaked. Require machine reboot to get leaked pages back.\n");
> +		return;
> +	}
> +
> +	sgx_free_epc_page(epc_page);
> +}
> +
>  /**
>   * sgx_encl_release - Destroy an enclave instance
>   * @kref:	address of a kref inside &sgx_encl
Huang, Kai March 2, 2021, 12:32 a.m. UTC | #2
On Mon, 2021-03-01 at 09:29 -0800, Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > index 7449ef33f081..a7dc86e87a09 100644
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -381,6 +381,26 @@ const struct vm_operations_struct sgx_vm_ops = {
> >  	.access = sgx_vma_access,
> >  };
> >  
> > 
> > 
> > 
> > +static void sgx_encl_free_epc_page(struct sgx_epc_page *epc_page)
> > +{
> > +	int ret;
> > +
> > +	WARN_ON_ONCE(epc_page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> > +
> > +	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
> > +	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) {
> 
> This can be ENCLS_WARN, especially if you're printing a separate error message
> about leaking the page.  That being said, I'm not sure a seperate error message
> is a good idea.  If other stuff gets dumped to the kernel log between the WARN
> and the pr_err_once(), it may not be clear to admins that the two events are
> directly connected.  It's even possible the prints could come from two different
> CPUs.

Good point. Thanks for educating me :)

> 
> Why not dump a short blurb in the WARN itself?  The error message can be thrown
> in a define if the line length is too obnoxious (it's ~109 chars if embedded
> directly).
> 
> #define EREMOVE_ERROR_MESSAGE \
> 	"EREMOVE returned %d (0x%x).  EPC page leaked, reboot recommended."
> 
> 	if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret))

Will do in your way. Thanks!

> 
> > +		/*
> > +		 * Give a message to remind EPC page is leaked, and requires
> > +		 * machine reboot to get leaked pages back. This can be improved
> > +		 * in the future by adding stats of leaked pages, etc.
> > +		 */
> > +		pr_err_once("EPC page is leaked. Require machine reboot to get leaked pages back.\n");
> > +		return;
> > +	}
> > +
> > +	sgx_free_epc_page(epc_page);
> > +}
> > +
> >  /**
> >   * sgx_encl_release - Destroy an enclave instance
> >   * @kref:	address of a kref inside &sgx_encl
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 7449ef33f081..a7dc86e87a09 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -381,6 +381,26 @@  const struct vm_operations_struct sgx_vm_ops = {
 	.access = sgx_vma_access,
 };
 
+static void sgx_encl_free_epc_page(struct sgx_epc_page *epc_page)
+{
+	int ret;
+
+	WARN_ON_ONCE(epc_page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
+
+	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
+	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) {
+		/*
+		 * Give a message to remind EPC page is leaked, and requires
+		 * machine reboot to get leaked pages back. This can be improved
+		 * in the future by adding stats of leaked pages, etc.
+		 */
+		pr_err_once("EPC page is leaked. Require machine reboot to get leaked pages back.\n");
+		return;
+	}
+
+	sgx_free_epc_page(epc_page);
+}
+
 /**
  * sgx_encl_release - Destroy an enclave instance
  * @kref:	address of a kref inside &sgx_encl
@@ -404,7 +424,7 @@  void sgx_encl_release(struct kref *ref)
 			if (sgx_unmark_page_reclaimable(entry->epc_page))
 				continue;
 
-			sgx_free_epc_page(entry->epc_page);
+			sgx_encl_free_epc_page(entry->epc_page);
 			encl->secs_child_cnt--;
 			entry->epc_page = NULL;
 		}
@@ -415,7 +435,7 @@  void sgx_encl_release(struct kref *ref)
 	xa_destroy(&encl->page_array);
 
 	if (!encl->secs_child_cnt && encl->secs.epc_page) {
-		sgx_free_epc_page(encl->secs.epc_page);
+		sgx_encl_free_epc_page(entry->epc_page);
 		encl->secs.epc_page = NULL;
 	}
 
@@ -423,7 +443,7 @@  void sgx_encl_release(struct kref *ref)
 		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
 					   list);
 		list_del(&va_page->list);
-		sgx_free_epc_page(va_page->epc_page);
+		sgx_encl_free_epc_page(entry->epc_page);
 		kfree(va_page);
 	}
 
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 8df81a3ed945..44fe91a5bfb3 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -598,18 +598,14 @@  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
  * sgx_free_epc_page() - Free an EPC page
  * @page:	an EPC page
  *
- * Call EREMOVE for an EPC page and insert it back to the list of free pages.
+ * Put the EPC page back to the list of free pages. It's the caller's
+ * responsibility to make sure that the page is in uninitialized state. In other
+ * words, do EREMOVE, EWB or whatever operation is necessary before calling
+ * this function.
  */
 void sgx_free_epc_page(struct sgx_epc_page *page)
 {
 	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
-	int ret;
-
-	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
-
-	ret = __eremove(sgx_get_epc_virt_addr(page));
-	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
-		return;
 
 	spin_lock(&section->lock);
 	list_add_tail(&page->list, &section->page_list);