From patchwork Mon Nov 28 22:35:56 2016
From: "Christopherson, Sean J"
To: "intel-sgx-kernel-dev@lists.01.org"
Date: Mon, 28 Nov 2016 22:35:56 +0000
Message-ID: <37306EFA9975BE469F115FDE982C075B9B693DC8@ORSMSX108.amr.corp.intel.com>
In-Reply-To: <1480372367-12679-1-git-send-email-sean.j.christopherson@intel.com>
Subject: Re: [intel-sgx-kernel-dev] [PATCH v2] intel_sgx: Add lock to make EPC eviction atomic
List-Id: "Project: Intel® Software Guard Extensions for Linux*: https://01.org/intel-software-guard-extensions"

To add a bit of context, as
I understand it, the underlying concern with holding a lock for the entire
eviction sequence is that doing so could block a non-eviction operation on
the enclave in question for an extended duration. In that case, adding a
dedicated per-enclave lock solves both the ETRACK issue and the potential
problems caused by blocking the entire enclave for an extended amount of
time.

An interesting side note: without any LRU algorithm, this patch causes a
massive performance regression, e.g. throughput drops to roughly 50% of
baseline, for workflows that spawn many enclaves under extreme EPC pressure.
Allowing EADD in parallel with EBLOCK/ETRACK/EWB reduces enclave
initialization time, which has the unfortunate side effect of causing
massive thrash if there is no LRU-based eviction.

Christopherson, Sean J wrote:
> Add an eviction-specific, per-enclave lock so that the lock can be held for
> the entire EPC eviction sequence without having to hold the overall
> per-enclave lock for an extended duration. Holding an enclave's overall lock
> prevents any other driver operations on the enclave, e.g. faulting a page
> into the enclave, and evicting 16 EPC pages takes upwards of half a
> millisecond.
>
> Holding a lock for the entire EBLOCK->ETRACK->EWB sequence is necessary to
> make the sequence atomic with respect to other evictions on the same
> enclave. The SGX architecture does not allow multiple ETRACK sequences to be
> in flight at the same time if the enclave being tracked has one or more
> active threads in the enclave. A second ETRACK will fail due to
> PREV_TRK_INCMPL if it is attempted before EWB is executed on all blocked
> pages (in the presence of active enclave threads).
>
> Currently the driver releases the enclave's lock after ETRACK and then
> reacquires the lock prior to EWB. Releasing the lock from the thread that
> performed ETRACK, T1, allows a different thread, T2, to acquire the lock for
> its own ETRACK.
> T2's ETRACK will fail due to the architectural restrictions if a third
> thread, T3, is executing inside the enclave whose pages are being swapped.
>
> T3: executing inside E1
> T1: lock_mutex(E1)
> T1: ETRACK(E1)
> T1: unlock_mutex(E1)
> T2: lock_mutex(E1)
> T2: ETRACK(E1) fails
>
> Signed-off-by: Sean Christopherson

-----Original Message-----
From: Christopherson, Sean J
Sent: Monday, November 28, 2016 2:33 PM
To: intel-sgx-kernel-dev@lists.01.org
Cc: Christopherson, Sean J
Subject: [PATCH v2] intel_sgx: Add lock to make EPC eviction atomic

Add an eviction-specific, per-enclave lock so that the lock can be held for
the entire EPC eviction sequence without having to hold the overall
per-enclave lock for an extended duration. Holding an enclave's overall lock
prevents any other driver operations on the enclave, e.g. faulting a page
into the enclave, and evicting 16 EPC pages takes upwards of half a
millisecond.

Holding a lock for the entire EBLOCK->ETRACK->EWB sequence is necessary to
make the sequence atomic with respect to other evictions on the same
enclave. The SGX architecture does not allow multiple ETRACK sequences to be
in flight at the same time if the enclave being tracked has one or more
active threads in the enclave. A second ETRACK will fail due to
PREV_TRK_INCMPL if it is attempted before EWB is executed on all blocked
pages (in the presence of active enclave threads).

Currently the driver releases the enclave's lock after ETRACK and then
reacquires the lock prior to EWB. Releasing the lock from the thread that
performed ETRACK, T1, allows a different thread, T2, to acquire the lock for
its own ETRACK. T2's ETRACK will fail due to the architectural restrictions
if a third thread, T3, is executing inside the enclave whose pages are being
swapped.
T3: executing inside E1
T1: lock_mutex(E1)
T1: ETRACK(E1)
T1: unlock_mutex(E1)
T2: lock_mutex(E1)
T2: ETRACK(E1) fails

Signed-off-by: Sean Christopherson
---
 drivers/platform/x86/intel_sgx.h            |  1 +
 drivers/platform/x86/intel_sgx_ioctl.c      |  1 +
 drivers/platform/x86/intel_sgx_page_cache.c | 69 +++++++++++++++++------------
 3 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/drivers/platform/x86/intel_sgx.h b/drivers/platform/x86/intel_sgx.h
index 464763d..d837789 100644
--- a/drivers/platform/x86/intel_sgx.h
+++ b/drivers/platform/x86/intel_sgx.h
@@ -137,6 +137,7 @@ struct sgx_encl {
 	unsigned int secs_child_cnt;
 	unsigned int vma_cnt;
 	struct mutex lock;
+	struct mutex eviction_lock;
 	struct task_struct *owner;
 	struct mm_struct *mm;
 	struct file *backing;
diff --git a/drivers/platform/x86/intel_sgx_ioctl.c b/drivers/platform/x86/intel_sgx_ioctl.c
index d54c410..72fe9aa 100644
--- a/drivers/platform/x86/intel_sgx_ioctl.c
+++ b/drivers/platform/x86/intel_sgx_ioctl.c
@@ -522,6 +522,7 @@ static long sgx_ioc_enclave_create(struct file *filep, unsigned int cmd,
 	INIT_LIST_HEAD(&encl->load_list);
 	INIT_LIST_HEAD(&encl->encl_list);
 	mutex_init(&encl->lock);
+	mutex_init(&encl->eviction_lock);
 	INIT_WORK(&encl->add_page_work, sgx_add_page_worker);
 	encl->owner = current->group_leader;
diff --git a/drivers/platform/x86/intel_sgx_page_cache.c b/drivers/platform/x86/intel_sgx_page_cache.c
index 8b1cc82..ce60b76 100644
--- a/drivers/platform/x86/intel_sgx_page_cache.c
+++ b/drivers/platform/x86/intel_sgx_page_cache.c
@@ -224,9 +224,9 @@ static int sgx_ewb(struct sgx_encl *encl,
 	return ret;
 }
 
-void sgx_free_encl_page(struct sgx_encl_page *entry,
-			struct sgx_encl *encl,
-			unsigned int flags)
+static void sgx_free_encl_page(struct sgx_encl_page *entry,
+			       struct sgx_encl *encl,
+			       unsigned int flags)
 {
 	sgx_free_page(entry->epc_page, encl, flags);
 	entry->epc_page = NULL;
@@ -238,7 +238,7 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
 	struct sgx_encl_page *entry;
 	struct sgx_encl_page *tmp;
 	struct page *pages[SGX_NR_SWAP_CLUSTER_MAX + 1];
-	struct vm_area_struct *evma;
+	struct vm_area_struct *vma;
 	int cnt = 0;
 	int i = 0;
 	int ret;
@@ -261,13 +261,15 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
 		return;
 	}
 
-	/* EBLOCK */
+	mutex_lock(&encl->eviction_lock);
 
+	/* EBLOCK */
 	list_for_each_entry_safe(entry, tmp, src, load_list) {
-		mutex_lock(&encl->lock);
-		evma = sgx_find_vma(encl, entry->addr);
-		if (!evma) {
+		vma = sgx_find_vma(encl, entry->addr);
+		if (!vma) {
 			list_del(&entry->load_list);
+
+			mutex_lock(&encl->lock);
 			sgx_free_encl_page(entry, encl, 0);
 			mutex_unlock(&encl->lock);
 			continue;
@@ -276,36 +278,32 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
 		pages[cnt] = sgx_get_backing(encl, entry);
 		if (IS_ERR(pages[cnt])) {
 			list_del(&entry->load_list);
+
+			mutex_lock(&encl->lock);
 			list_add_tail(&entry->load_list, &encl->load_list);
 			entry->flags &= ~SGX_ENCL_PAGE_RESERVED;
 			mutex_unlock(&encl->lock);
 			continue;
 		}
 
-		zap_vma_ptes(evma, entry->addr, PAGE_SIZE);
+		zap_vma_ptes(vma, entry->addr, PAGE_SIZE);
 		sgx_eblock(entry->epc_page);
 		cnt++;
-		mutex_unlock(&encl->lock);
 	}
 
-	/* ETRACK */
+	if (!cnt) {
+		mutex_unlock(&encl->eviction_lock);
+		goto out;
+	}
 
-	mutex_lock(&encl->lock);
+	/* ETRACK */
 	sgx_etrack(encl->secs_page.epc_page);
-	mutex_unlock(&encl->lock);
 
 	/* EWB */
-
-	mutex_lock(&encl->lock);
 	i = 0;
-
-	while (!list_empty(src)) {
-		entry = list_first_entry(src, struct sgx_encl_page,
-					 load_list);
-		list_del(&entry->load_list);
-
-		evma = sgx_find_vma(encl, entry->addr);
-		if (evma) {
+	list_for_each_entry(entry, src, load_list) {
+		vma = sgx_find_vma(encl, entry->addr);
+		if (vma) {
 			ret = sgx_ewb(encl, entry, pages[i]);
 			BUG_ON(ret != 0 && ret != SGX_NOT_TRACKED);
 			/* Only kick out threads with an IPI if needed. */
@@ -313,15 +311,30 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
 			smp_call_function(sgx_ipi_cb, NULL, 1);
 			BUG_ON(sgx_ewb(encl, entry, pages[i]));
 		}
-		encl->secs_child_cnt--;
+		sgx_put_backing(pages[i++], vma);
+	}
+
+	mutex_unlock(&encl->eviction_lock);
+
+	mutex_lock(&encl->lock);
+
+	/* Bookeeping */
+	while (!list_empty(src)) {
+		entry = list_first_entry(src, struct sgx_encl_page,
+					 load_list);
+		list_del(&entry->load_list);
 		sgx_free_encl_page(entry, encl,
-				   evma ? SGX_FREE_SKIP_EREMOVE : 0);
-		sgx_put_backing(pages[i++], evma);
+				   vma ? SGX_FREE_SKIP_EREMOVE : 0);
 	}
 
-	/* Allow SECS page eviction only when the encl is initialized. */
+	encl->secs_child_cnt -= cnt;
+
+	/* Allow SECS page eviction only when the encl is initialized.
+	 * The eviction lock does not need to be held to evict the SECS,
+	 * conflict is impossible as there are no other pages to evict.
+	 */
 	if (!encl->secs_child_cnt &&
 	    (encl->flags & SGX_ENCL_INITIALIZED)) {
 		pages[cnt] = sgx_get_backing(encl, &encl->secs_page);
@@ -336,9 +349,9 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
 			sgx_put_backing(pages[cnt], true);
 		}
 	}
-	mutex_unlock(&encl->lock);
 
+out:
 	sgx_unpin_mm(encl);
 }
-- 
2.7.4