From patchwork Wed Dec 1 19:22:59 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers
Date: Wed, 1 Dec 2021 11:22:59 -0800

The SGX ENCLS instruction uses EAX to specify an SGX function and may
require additional registers, depending on the SGX function. ENCLS
invokes the specified privileged SGX function for managing and
debugging enclaves.

Macros are used to wrap the ENCLS functionality, and several wrappers
in turn wrap the macros to make the different SGX functions accessible
in the code.

The wrappers of the supported SGX functions are cryptic. Add a short
description of each as a comment.
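For reference, the wrapper layering described above looks roughly like the
sketch below. This is a simplified rendition, not the kernel's exact macro:
the in-kernel macros additionally install exception-table fixup so that
faults are reported back as encoded error values.

	/*
	 * Simplified sketch of the ENCLS macro/wrapper layering. EAX
	 * selects the SGX function; RBX and RCX carry its operands.
	 * (.byte 0x0f, 0x01, 0xcf is the ENCLS opcode.)
	 */
	#define __encls_2(rax, rbx, rcx)                                \
		({                                                      \
			int ret;                                        \
			asm volatile(".byte 0x0f, 0x01, 0xcf" /* ENCLS */ \
				     : "=a"(ret)                        \
				     : "a"(rax), "b"(rbx), "c"(rcx)     \
				     : "memory", "cc");                 \
			ret;                                            \
		})

	/* A typed wrapper per SGX function then hides the register protocol: */
	static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
	{
		return __encls_2(ECREATE, pginfo, secs);
	}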
Suggested-by: Dave Hansen
Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/encls.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 9b204843b78d..241b766265d3 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -162,57 +162,68 @@ static inline bool encls_failed(int ret)
 	ret;					\
 	})

+/* Create an SECS page in the Enclave Page Cache (EPC) */
 static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
 {
 	return __encls_2(ECREATE, pginfo, secs);
 }

+/* Extend uninitialized enclave measurement */
 static inline int __eextend(void *secs, void *addr)
 {
 	return __encls_2(EEXTEND, secs, addr);
 }

+/* Add a page to an uninitialized enclave */
 static inline int __eadd(struct sgx_pageinfo *pginfo, void *addr)
 {
 	return __encls_2(EADD, pginfo, addr);
 }

+/* Initialize an enclave for execution */
 static inline int __einit(void *sigstruct, void *token, void *secs)
 {
 	return __encls_ret_3(EINIT, sigstruct, secs, token);
 }

+/* Remove a page from the Enclave Page Cache (EPC) */
 static inline int __eremove(void *addr)
 {
 	return __encls_ret_1(EREMOVE, addr);
 }

+/* Write to a debug enclave */
 static inline int __edbgwr(void *addr, unsigned long *data)
 {
 	return __encls_2(EDGBWR, *data, addr);
 }

+/* Read from a debug enclave */
 static inline int __edbgrd(void *addr, unsigned long *data)
 {
 	return __encls_1_1(EDGBRD, *data, addr);
 }

+/* Track threads operating inside the enclave */
 static inline int __etrack(void *addr)
 {
 	return __encls_ret_1(ETRACK, addr);
 }

+/* Load, verify, and unblock an Enclave Page Cache (EPC) page */
 static inline int __eldu(struct sgx_pageinfo *pginfo, void *addr,
			 void *va)
 {
 	return __encls_ret_3(ELDU, pginfo, addr, va);
 }

+/* Mark an Enclave Page Cache (EPC) page as blocked */
 static inline int __eblock(void *addr)
 {
 	return __encls_ret_1(EBLOCK, addr);
 }

+/* Add a Version Array (VA) page to the Enclave Page Cache (EPC) */
 static inline int __epa(void *addr)
 {
 	unsigned long rbx = SGX_PAGE_TYPE_VA;
@@ -220,6 +231,7 @@ static inline int __epa(void *addr)
 	return __encls_2(EPA, rbx, addr);
 }

+/* Invalidate an EPC page and write it out to main memory */
 static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
			void *va)
 {
From patchwork Wed Dec 1 19:23:00 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions
Date: Wed, 1 Dec 2021 11:23:00 -0800
Message-Id: <50ad5005c451f41951327853fb450ba302eadb40.1638381245.git.reinette.chatre@intel.com>

The SGX ENCLS instruction uses EAX to specify an SGX function and may
require additional registers, depending on the SGX function. ENCLS
invokes the specified privileged SGX function for managing and
debugging enclaves. Several macros are used to wrap the ENCLS
functionality.

Add ENCLS wrappers for the SGX2 EMODPR, EMODT, and EAUG functions that
can make changes to pages of an initialized SGX enclave. The EMODPR
function is used to restrict enclave page permissions as maintained by
the hardware (Enclave Page Cache Map (EPCM) permissions). The EMODT
function is used to change the type of an enclave page. The EAUG
function is used to dynamically add enclave pages to an initialized
enclave.

EMODPR and EMODT accept two parameters and can fault as well as return
an SGX error code. EAUG also accepts two parameters but does not return
an SGX error code. Use existing macros for all new functions.

Expand enum sgx_return_code with the possible EMODPR and EMODT return
codes.

Signed-off-by: Reinette Chatre
---
 arch/x86/include/asm/sgx.h      |  5 +++++
 arch/x86/kernel/cpu/sgx/encls.h | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index 05f3e21f01a7..ebae2a153c66 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -47,17 +47,22 @@ enum sgx_encls_function {

 /**
  * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
+ * %SGX_EPC_PAGE_CONFLICT:	Page is being written by other ENCLS function.
  * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
  *				been completed yet.
  * %SGX_CHILD_PRESENT		SECS has child pages present in the EPC.
  * %SGX_INVALID_EINITTOKEN:	EINITTOKEN is invalid and enclave signer's
  *				public key does not match IA32_SGXLEPUBKEYHASH.
+ * %SGX_PAGE_NOT_MODIFIABLE:	The EPC page cannot be modified because it
+ *				is in the PENDING or MODIFIED state.
  * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was received
 */
enum sgx_return_code {
+	SGX_EPC_PAGE_CONFLICT		= 7,
 	SGX_NOT_TRACKED			= 11,
 	SGX_CHILD_PRESENT		= 13,
 	SGX_INVALID_EINITTOKEN		= 16,
+	SGX_PAGE_NOT_MODIFIABLE		= 20,
 	SGX_UNMASKED_EVENT		= 128,
 };

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 241b766265d3..243c30301ddb 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -238,4 +238,22 @@ static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
 	return __encls_ret_3(EWB, pginfo, addr, va);
 }

+/* Restrict the permissions of an Enclave Page Cache (EPC) page */
+static inline int __emodpr(struct sgx_secinfo *secinfo, void *addr)
+{
+	return __encls_ret_2(EMODPR, secinfo, addr);
+}
+
+/* Change the type of an Enclave Page Cache (EPC) page */
+static inline int __emodt(struct sgx_secinfo *secinfo, void *addr)
+{
+	return __encls_ret_2(EMODT, secinfo, addr);
+}
+
+/* Add a page to an initialized enclave */
+static inline int __eaug(struct sgx_pageinfo *pginfo, void *addr)
+{
+	return __encls_2(EAUG, pginfo, addr);
+}
+
 #endif /* _X86_ENCLS_H */
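A hypothetical caller of the new EMODPR wrapper might consume the expanded
return codes as sketched below; the real call sites only arrive later in
this series, and sgx_restrict_perms() is an illustrative name, not kernel
code.

	/*
	 * Hypothetical sketch: EMODPR returns 0 on success or an SGX
	 * error code from enum sgx_return_code, e.g.
	 * SGX_PAGE_NOT_MODIFIABLE for a page in PENDING/MODIFIED state.
	 */
	static int sgx_restrict_perms(struct sgx_secinfo *secinfo, void *epc_addr)
	{
		int ret = __emodpr(secinfo, epc_addr);

		switch (ret) {
		case 0:
			return 0;		/* EPCM permissions restricted */
		case SGX_PAGE_NOT_MODIFIABLE:
			return -EBUSY;		/* retry once page settles */
		default:
			return -EFAULT;		/* faulted or other SGX error */
		}
	}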
From patchwork Wed Dec 1 19:23:01 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
Date: Wed, 1 Dec 2021 11:23:01 -0800
Message-Id: <7e622156315c9c22c3ef84a7c0aeb01b5c001ff9.1638381245.git.reinette.chatre@intel.com>

=== Summary ===

An SGX VMA can only be created if its permissions are the same or
weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
creation this rule continues to be enforced by the page fault handler.

With SGX2 the EPCM permissions of a page can change after VMA creation,
resulting in the VMA exceeding the EPCM permissions and the page fault
handler incorrectly blocking access.

Enable the VMA's pages to remain accessible while ensuring that the
page table entries are installed to match the EPCM permissions without
exceeding the VMA permissions.

=== Full Changelog ===

An SGX enclave is an area of memory where parts of an application can
reside. First an enclave is created and loaded (from non-enclave memory)
with the code and data of an application, then user space can map
(mmap()) the enclave memory to be able to enter the enclave at its
defined entry points for execution within it.

The hardware maintains a secure structure, the Enclave Page Cache Map
(EPCM), that tracks the contents of the enclave. Of interest here is
its tracking of the enclave page permissions. When a page is loaded
into the enclave its permissions are specified and recorded in the
EPCM. In parallel the OS maintains the page table permissions, and the
rule is that page table permissions are never allowed to exceed EPCM
permissions.

A new mapping (mmap()) of enclave memory can only succeed if the
mapping has the same or weaker permissions than the permissions that
were vetted during enclave creation. This is enforced by
sgx_encl_may_map(), which is called on the mmap() as well as mprotect()
paths. This permission verification remains.

One feature of SGX2 is to support the modification of enclave page
permissions after enclave creation. Enclave pages may thus already be
mapped at the time their enclave permissions are changed, resulting in
the VMA's permissions potentially exceeding the enclave page
permissions.

Enable permissions of existing VMAs to exceed enclave page permissions
in preparation for dynamic enclave page permission changes. New VMAs
that attempt to exceed enclave page permissions continue to be
unsupported.

Reasons why permissions of existing VMAs are allowed to exceed enclave
page permissions, instead of dynamically changing VMA permissions when
enclave page permissions change, are:
1) Changing VMA permissions involves splitting VMAs, which is an
   operation that can fail. Additionally, the actual changing of page
   permissions of a range of pages could also fail on any of the pages
   involved. Handling these error cases causes problems. For example,
   if an enclave page permission change fails and the VMA has already
   been split then it is not possible to undo the VMA split nor
   possible to undo the enclave page permission changes that did
   succeed before the failure.
2) The OS has little insight into the user space where EPCM permissions
   are controlled from. For example, a RW page may be made RO just
   before it is made RX, and splitting the VMAs for a state that may
   change again soon is unnecessary.

Remove the extra permission check, called on a page fault
(vm_operations_struct->fault) or during debugging
(vm_operations_struct->access) when loading the enclave page from swap,
that ensures that the VMA permissions do not exceed the enclave
permissions. Since a VMA could only exist if it passed the original
permission checks during mmap(), and a VMA may indeed exceed the page
permissions, this extra permission check is no longer appropriate.

With the permission check removed, ensure that page table entries do
not blindly inherit the VMA permissions but instead reflect the
permissions that the VMA and enclave agree on. PTEs for writable pages
(from VMA and enclave perspective) are installed with the writable bit
set, which limits the additional fault flow to the permission-mismatch
cases handled next.
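Concretely, the intersection described above behaves as in this worked
illustration; it uses the standard VM_* flag bits and mirrors, but is not,
the hunk below:

	/*
	 * A VMA mapped RW over a page whose EPCM permissions were
	 * reduced to R yields a read-only PTE; the VMA's extra W bit
	 * is never installed.
	 */
	unsigned long vm_prot_bits   = VM_READ | VM_WRITE; /* from vma->vm_flags */
	unsigned long encl_prot_bits = VM_READ;            /* tracked EPCM perms */
	unsigned long page_prot_bits = encl_prot_bits & vm_prot_bits; /* == VM_READ */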
Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..20e97d3abdce 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 }

 static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
-						unsigned long addr,
-						unsigned long vm_flags)
+						unsigned long addr)
 {
-	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl_page *entry;

@@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 	if (!entry)
 		return ERR_PTR(-EFAULT);

-	/*
-	 * Verify that the faulted page has equal or higher build time
-	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
-	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
-	 */
-	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
-		return ERR_PTR(-EFAULT);
-
 	/* Entry successfully located. */
 	if (entry->epc_page) {
 		if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
@@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
 	struct vm_area_struct *vma = vmf->vma;
+	unsigned long page_prot_bits;
 	struct sgx_encl_page *entry;
+	unsigned long vm_prot_bits;
 	unsigned long phys_addr;
 	struct sgx_encl *encl;
 	vm_fault_t ret;
@@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)

 	mutex_lock(&encl->lock);

-	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
+	entry = sgx_encl_load_page(encl, addr);
 	if (IS_ERR(entry)) {
 		mutex_unlock(&encl->lock);

@@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)

 	phys_addr = sgx_get_epc_phys_addr(entry->epc_page);

-	ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
+	/*
+	 * Insert PTE to match the EPCM page permissions ensured to not
+	 * exceed the VMA permissions.
+	 */
+	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+	/*
+	 * Add VM_SHARED so that PTE is made writable right away if VMA
+	 * and EPCM are writable (no COW in SGX).
+	 */
+	page_prot_bits |= (vma->vm_flags & VM_SHARED);
+	ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
+				  vm_get_page_prot(page_prot_bits));
 	if (ret != VM_FAULT_NOPAGE) {
 		mutex_unlock(&encl->lock);

@@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
 * Load an enclave page to EPC if required, and take encl->lock.
 */
 static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
-						   unsigned long addr,
-						   unsigned long vm_flags)
+						   unsigned long addr)
 {
 	struct sgx_encl_page *entry;

 	for ( ; ; ) {
 		mutex_lock(&encl->lock);

-		entry = sgx_encl_load_page(encl, addr, vm_flags);
+		entry = sgx_encl_load_page(encl, addr);
 		if (PTR_ERR(entry) != -EBUSY)
 			break;

@@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
 		return -EFAULT;

 	for (i = 0; i < len; i += cnt) {
-		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK,
-					      vma->vm_flags);
+		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK);
 		if (IS_ERR(entry)) {
 			ret = PTR_ERR(entry);
 			break;

From patchwork Wed Dec 1 19:23:02 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
Date: Wed, 1 Dec 2021 11:23:02 -0800
Message-Id: <9edf8cd20628cbf400886d88e359fb24265fdef0.1638381245.git.reinette.chatre@intel.com>

By default a write page fault on a present PTE inherits the permissions
of the VMA. Enclave page permissions maintained in the hardware's
Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
is created. A VMA's permissions may thus exceed the enclave page
permissions even though the VMA was originally created not to exceed
the enclave page permissions. Following the default behavior during a
page fault on a present PTE while the VMA permissions exceed the
enclave page permissions would result in the PTE for an enclave page
being writable even though the page is not writable according to the
enclave's permissions.

Consider the following scenario:
* An enclave page exists with RW EPCM permissions.
* A RW VMA maps the range spanning the enclave page.
* The enclave page's EPCM permissions are changed to read-only.
* There is no page table entry for the enclave page.

Q. What will user space observe when an attempt is made to write to the
   enclave page from within the enclave?
A. Initially the page table entry is not present, so the following is
   observed:
1) Instruction writing to enclave page is run from within the enclave.
2) A page fault with the second and third bits set (0x6) is encountered
   and handled by the SGX handler sgx_vma_fault(), which installs, per
   the previous patch, a page table entry with the permissions that the
   VMA and enclave agree on (read-only in this case).
3) Instruction writing to enclave page is re-attempted.
4) A page fault with the first three bits set (0x7) is encountered and
   transparently (from SGX and user space perspective) handled by the
   OS, with the page table entry made writable because the VMA is
   writable.
5) Instruction writing to enclave page is re-attempted.
6) Since the EPCM permissions prevent writing to the page, a new page
   fault is encountered, this time with the SGX flag set in the error
   code (0x8007). No action is taken by the OS for this page fault and
   execution returns to user space.
7) Typically such a fault will be passed on to an application with a
   signal, but if the enclave is entered with the vDSO function
   provided by the kernel, user space does not receive a signal;
   instead the vDSO function returns successfully with the exception
   information (vector=14, error code=0x8007, and address) in the
   exception fields of the vDSO function's struct sgx_enclave_run.

As can be observed, it is not possible for user space to write to an
enclave page if that page's enclave page permissions do not allow it,
no matter what the VMA or PTE allows. Even so, the OS should not allow
writing to a page if that page is not writable. The page table entry
should thus accurately reflect the enclave page permissions.

Do not blindly accept VMA permissions on a page fault due to a write
attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
that the VMA permissions agree with the enclave permissions in this
regard.

Considering the same scenario as above, this change results in the
following behavior change:

Q. What will user space observe when an attempt is made to write to the
   enclave page from within the enclave?
A. Initially the page table entry is not present, so the following is
   observed:
1) Instruction writing to enclave page is run from within the enclave.
2) A page fault with the second and third bits set (0x6) is encountered
   and handled by the SGX handler sgx_vma_fault(), which installs, per
   the previous patch, a page table entry with the permissions that the
   VMA and enclave agree on (read-only in this case).
3) Instruction writing to enclave page is re-attempted.
4) A page fault with the first three bits set (0x7) is encountered and
   passed to the pfn_mkwrite() handler for consideration. The handler
   determines that the page should not be writable and returns SIGBUS.
5) Typically such a fault will be passed on to an application with a
   signal, but if the enclave is entered with the vDSO function
   provided by the kernel, user space does not receive a signal;
   instead the vDSO function returns successfully with the exception
   information (vector=14, error code=0x7, and address) in the
   exception fields of the vDSO function's struct sgx_enclave_run.
The accurate exception information supports the SGX runtime, which is
virtually always implemented inside a shared library, in its management
of the SGX enclave.

Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/encl.c | 42 ++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 20e97d3abdce..60afa8eaf979 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -184,6 +184,47 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	return VM_FAULT_NOPAGE;
 }

+/*
+ * A fault occurred while writing to a present enclave PTE. Since PTE is
+ * present this will not be handled by sgx_vma_fault(). VMA may allow
+ * writing to the page while enclave does not. Do not follow the default
+ * of inheriting VMA permissions in this regard, ensure enclave also allows
+ * writing to the page.
+ */
+static vm_fault_t sgx_vma_pfn_mkwrite(struct vm_fault *vmf)
+{
+	unsigned long addr = (unsigned long)vmf->address;
+	struct vm_area_struct *vma = vmf->vma;
+	struct sgx_encl_page *entry;
+	struct sgx_encl *encl;
+	vm_fault_t ret = 0;
+
+	encl = vma->vm_private_data;
+
+	/*
+	 * It's very unlikely but possible that allocating memory for the
+	 * mm_list entry of a forked process failed in sgx_vma_open(). When
+	 * this happens, vm_private_data is set to NULL.
+	 */
+	if (unlikely(!encl))
+		return VM_FAULT_SIGBUS;
+
+	mutex_lock(&encl->lock);
+
+	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
+	if (!entry) {
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
+
+	if (!(entry->vm_max_prot_bits & VM_WRITE))
+		ret = VM_FAULT_SIGBUS;
+
+out:
+	mutex_unlock(&encl->lock);
+	return ret;
+}
+
 static void sgx_vma_open(struct vm_area_struct *vma)
 {
 	struct sgx_encl *encl = vma->vm_private_data;
@@ -381,6 +422,7 @@ const struct vm_operations_struct sgx_vm_ops = {
 	.mprotect = sgx_vma_mprotect,
 	.open = sgx_vma_open,
 	.access = sgx_vma_access,
+	.pfn_mkwrite = sgx_vma_pfn_mkwrite,
 };

 /**
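The page-fault error codes quoted in the scenarios decode using the
kernel's X86_PF_* bit definitions (arch/x86/include/asm/trap_pf.h); a
reference sketch:

	/*
	 * Decoding the error codes quoted above:
	 *
	 *   0x6    = X86_PF_WRITE | X86_PF_USER
	 *            write from user space, PTE not present
	 *   0x7    = X86_PF_PROT | X86_PF_WRITE | X86_PF_USER
	 *            write from user space to a present PTE
	 *   0x8007 = 0x7 | X86_PF_SGX (bit 15)
	 *            write denied by the EPCM, reported with the SGX flag
	 */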
From patchwork Wed Dec 1 19:23:03 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
Date: Wed, 1 Dec 2021 11:23:03 -0800
Message-Id: <2f6b04dd8949591ee6139072c72eb93da3dd07b0.1638381245.git.reinette.chatre@intel.com>

Enclave creators declare their paging permission intent at the time the
pages are added to the enclave. These paging permissions are vetted
when pages are added to the enclave and stashed off (in
sgx_encl_page->vm_max_prot_bits) for later comparison with enclave
PTEs.

Current permission support assumes that enclave page permissions remain
static for the lifetime of the enclave. This is about to change with
the addition of support for SGX2, where the permissions of enclave
pages belonging to an initialized enclave may be changed during the
enclave's lifetime.

Introduce runtime protection bits in preparation for support of enclave
page permission changes. These bits reflect the active permissions of
an enclave page and are not to exceed the maximum protection bits that
passed scrutiny during enclave creation.

Associate runtime protection bits with each enclave page. Initialize
the runtime protection bits to the vetted maximum protection bits on
page creation. Use the runtime protection bits for any access checks.

struct sgx_encl_page hosting this information is maintained for each
enclave page, so the space consumed by the struct is important. The
existing vm_max_prot_bits is an unsigned long of which only three bits
are used. Transition to a bitfield for the two members containing
protection bits.

Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/encl.c  | 6 +++---
 arch/x86/kernel/cpu/sgx/encl.h  | 3 ++-
 arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 60afa8eaf979..6fec68896e1b 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -164,7 +164,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	 * exceed the VMA permissions.
 	 */
 	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
-	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+	page_prot_bits = entry->vm_run_prot_bits & vm_prot_bits;
 	/*
 	 * Add VM_SHARED so that PTE is made writable right away if VMA
 	 * and EPCM are writable (no COW in SGX).
@@ -217,7 +217,7 @@ static vm_fault_t sgx_vma_pfn_mkwrite(struct vm_fault *vmf)
 		goto out;
 	}

-	if (!(entry->vm_max_prot_bits & VM_WRITE))
+	if (!(entry->vm_run_prot_bits & VM_WRITE))
 		ret = VM_FAULT_SIGBUS;

 out:
@@ -280,7 +280,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 	mutex_lock(&encl->lock);
 	xas_lock(&xas);
 	xas_for_each(&xas, page, PFN_DOWN(end - 1)) {
-		if (~page->vm_max_prot_bits & vm_prot_bits) {
+		if (~page->vm_run_prot_bits & vm_prot_bits) {
 			ret = -EACCES;
 			break;
 		}
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index fec43ca65065..dc262d843411 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -27,7 +27,8 @@ struct sgx_encl_page {
 	unsigned long desc;
-	unsigned long vm_max_prot_bits;
+	unsigned long vm_max_prot_bits:8;
+	unsigned long vm_run_prot_bits:8;
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl *encl;
 	struct sgx_va_page *va_page;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 83df20e3e633..7e0819a89532 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -197,6 +197,12 @@ static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 	/* Calculate maximum of the VM flags for the page. */
 	encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);

+	/*
+	 * At time of allocation, the runtime protection bits are the same
+	 * as the maximum protection bits.
+	 */
+	encl_page->vm_run_prot_bits = encl_page->vm_max_prot_bits;
+
 	return encl_page;
 }
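A sanity sketch (not part of the patch) of why 8-bit fields suffice: the
only values ever stored in these fields are combinations of VM_READ (0x1),
VM_WRITE (0x2) and VM_EXEC (0x4), which fit in the low byte with room for
the page type added later in this series.

	#include <linux/build_bug.h>
	#include <linux/mm.h>

	static inline void sgx_prot_bits_fit(void)
	{
		/* all stored bits fit in an 8-bit bitfield */
		BUILD_BUG_ON((VM_READ | VM_WRITE | VM_EXEC) > 0xff);
	}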
From patchwork Wed Dec 1 19:23:04 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function
Date: Wed, 1 Dec 2021 11:23:04 -0800

Knowing which CPUs might have executed an enclave is useful to ensure
that TLBs are cleared when changes are made to enclave pages. This
knowledge is currently only used within the reclaimer when an enclave
page is evicted. In support of this, the function that determines the
CPUs is called sgx_encl_ewb_cpumask() - a name that is specific to the
current usage.

Rename sgx_encl_ewb_cpumask() to sgx_encl_cpumask() in preparation for
its usage in more flows. Care should be taken to ensure that any future
usage maintains the current context requirement that ETRACK has been
called first. To highlight this, the existing comments regarding this
context are expanded and moved to a more prominent location.

With the usage of this function no longer unique to the reclaimer, it
is relocated to be with the rest of the enclave code in encl.c.

No functional change.

Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/encl.c | 67 ++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h |  1 +
 arch/x86/kernel/cpu/sgx/main.c | 31 +---------------
 3 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 6fec68896e1b..c29e10541d12 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -595,6 +595,73 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 	return 0;
 }

+/**
+ * sgx_encl_cpumask - Query which CPUs might be accessing the enclave
+ * @encl: the enclave
+ *
+ * Some SGX functions require that no cached linear-to-physical address
+ * mappings are present before they can succeed. For example, ENCLS[EWB]
+ * copies a page from the enclave page cache to regular main memory but
+ * it fails if it cannot ensure that there are no cached
+ * linear-to-physical address mappings referring to the page.
+ *
+ * SGX hardware flushes all cached linear-to-physical mappings on a CPU
+ * when an enclave is exited via ENCLU[EEXIT] or an Asynchronous Enclave
+ * Exit (AEX). Exiting an enclave will thus ensure cached linear-to-physical
+ * address mappings are cleared but coordination with the tracking done within
+ * the SGX hardware is needed to support the SGX functions that depend on this
+ * cache clearing.
+ *
+ * When the ENCLS[ETRACK] function is issued on an enclave the hardware
+ * tracks threads operating inside the enclave at that time. The SGX
+ * hardware tracking require that all the identified threads must have
+ * exited the enclave in order to flush the mappings before a function such
+ * as ENCLS[EWB] will be permitted
+ *
+ * The following flow is used to support SGX functions that require that
+ * no cached linear-to-physical address mappings are present:
+ * 1) Execute ENCLS[ETRACK] to initiate hardware tracking.
+ * 2) Use this function (sgx_encl_cpumask()) to query which CPUs might be
+ *    accessing the enclave.
+ * 3) Send IPI to identified CPUs, kicking them out of the enclave and
+ *    thus flushing all locally cached linear-to-physical address mappings.
+ * 4) Execute SGX function.
+ *
+ * Context: It is required to call this function after ENCLS[ETRACK].
+ *          This will ensure that if any new mm appears (racing with
+ *          sgx_encl_mm_add()) then the new mm will enter into the
+ *          enclave with fresh linear-to-physical address mappings.
+ *
+ *          It is required that all IPIs are completed before a new
+ *          ENCLS[ETRACK] is issued so be sure to protect steps 1 to 3
+ *          of the above flow with the enclave's mutex.
+ *
+ * Return: cpumask of CPUs that might be accessing @encl
+ */
+const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl)
+{
+	cpumask_t *cpumask = &encl->cpumask;
+	struct sgx_encl_mm *encl_mm;
+	int idx;
+
+	cpumask_clear(cpumask);
+
+	idx = srcu_read_lock(&encl->srcu);
+
+	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+		if (!mmget_not_zero(encl_mm->mm))
+			continue;
+
+		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
+
+		mmput_async(encl_mm->mm);
+	}
+
+	srcu_read_unlock(&encl->srcu, idx);
+
+	return cpumask;
+}
+
 static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
					      pgoff_t index)
 {
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index dc262d843411..becb68503baa 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -106,6 +106,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 void sgx_encl_release(struct kref *ref);
 int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
+const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl);
 int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
			 struct sgx_backing *backing);
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 6036328de255..e5be992897f4 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -201,35 +201,6 @@ static void sgx_ipi_cb(void *info)
 {
 }

-static const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
-{
-	cpumask_t *cpumask = &encl->cpumask;
-	struct sgx_encl_mm *encl_mm;
-	int idx;
-
-	/*
-	 * Can race with sgx_encl_mm_add(), but ETRACK has already been
-	 * executed, which means that the CPUs running in the new mm will enter
-	 * into the enclave with a fresh epoch.
-	 */
-	cpumask_clear(cpumask);
-
-	idx = srcu_read_lock(&encl->srcu);
-
-	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
-		if (!mmget_not_zero(encl_mm->mm))
-			continue;
-
-		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
-
-		mmput_async(encl_mm->mm);
-	}
-
-	srcu_read_unlock(&encl->srcu, idx);
-
-	return cpumask;
-}
-
 /*
  * Swap page to the regular memory transformed to the blocked state by using
  * EBLOCK, which means that it can no longer be referenced (no new TLB entries).
@@ -276,7 +247,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
			 * miss cpus that entered the enclave between
			 * generating the mask and incrementing epoch.
			 */
-			on_each_cpu_mask(sgx_encl_ewb_cpumask(encl),
+			on_each_cpu_mask(sgx_encl_cpumask(encl),
					 sgx_ipi_cb, NULL, 1);
			ret = __sgx_encl_ewb(epc_page, va_slot, backing);
		}
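The four-step flow documented in the new kernel-doc can be condensed into
code; a hedged sketch where __etrack() is the patch 01 wrapper, secs_addr
and __some_sgx_function() are placeholders, sgx_ipi_cb() is made available
outside main.c later in this series, and encl->lock is assumed held per
the Context note:

	static int sgx_track_flush_and_go(struct sgx_encl *encl, void *secs_addr)
	{
		int ret = __etrack(secs_addr);		/* 1) start hardware tracking */

		if (ret)
			return ret;

		/* 2) + 3) query CPUs and IPI them out of the enclave */
		on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);

		return __some_sgx_function();		/* 4) e.g. ENCLS[EWB] */
	}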
From patchwork Wed Dec 1 19:23:05 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
Date: Wed, 1 Dec 2021 11:23:05 -0800

The SGX reclaimer removes page table entries pointing to pages that are
moved to swap.

SGX2 enables changes to pages belonging to an initialized enclave, for
example changing page permissions. Supporting SGX2 requires the ability
to remove page table entries, which is already available in the SGX
reclaimer code.

Factor out the code removing page table entries to a separate function,
fixing the accuracy of comments in the process, and make it available
to other areas within the SGX code. Since the code will no longer be
unique to the reclaimer, it is relocated to be with the rest of the
enclave code in encl.c interacting with the page table.

Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/encl.c | 45 +++++++++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/sgx/encl.h |  2 +-
 arch/x86/kernel/cpu/sgx/main.c | 31 ++---------------------
 3 files changed, 47 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c29e10541d12..ba39186d5a28 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -587,7 +587,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 	spin_lock(&encl->mm_lock);
 	list_add_rcu(&encl_mm->list, &encl->mm_list);
-	/* Pairs with smp_rmb() in sgx_reclaimer_block(). */
+	/* Pairs with smp_rmb() in sgx_zap_enclave_ptes(). */
 	smp_wmb();
 	encl->mm_list_version++;
 	spin_unlock(&encl->mm_lock);
@@ -776,6 +776,49 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 	return ret;
 }

+/**
+ * sgx_zap_enclave_ptes - remove PTEs mapping the address from enclave
+ * @encl: the enclave
+ * @addr: page aligned pointer to single page for which PTEs will be removed
+ *
+ * Multiple VMAs may have an enclave page mapped. Remove the PTE mapping
+ * @addr from each VMA. Ensure that page fault handler is ready to handle
+ * new mappings of @addr before calling this function.
+ */
+void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
+{
+	unsigned long mm_list_version;
+	struct sgx_encl_mm *encl_mm;
+	struct vm_area_struct *vma;
+	int idx, ret;
+
+	do {
+		mm_list_version = encl->mm_list_version;
+
+		/* Pairs with smp_wmb() in sgx_encl_mm_add(). */
+		smp_rmb();
+
+		idx = srcu_read_lock(&encl->srcu);
+
+		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+			if (!mmget_not_zero(encl_mm->mm))
+				continue;
+
+			mmap_read_lock(encl_mm->mm);
+
+			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
+			if (!ret && encl == vma->vm_private_data)
+				zap_vma_ptes(vma, addr, PAGE_SIZE);
+
+			mmap_read_unlock(encl_mm->mm);
+
+			mmput_async(encl_mm->mm);
+		}
+
+		srcu_read_unlock(&encl->srcu, idx);
+	} while (unlikely(encl->mm_list_version != mm_list_version));
+}
+
 /**
  * sgx_alloc_va_page() - Allocate a Version Array (VA) page
  *
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index becb68503baa..82e21088e68b 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -112,7 +112,7 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
 int sgx_encl_test_and_clear_young(struct mm_struct *mm,
				  struct sgx_encl_page *page);
-
+void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
 struct sgx_epc_page *sgx_alloc_va_page(void);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index e5be992897f4..9b96f4e0a17a 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -135,36 +135,9 @@ static void sgx_reclaimer_block(struct sgx_epc_page *epc_page)
 	struct sgx_encl_page *page = epc_page->owner;
 	unsigned long addr = page->desc & PAGE_MASK;
 	struct sgx_encl *encl = page->encl;
-	unsigned long mm_list_version;
-	struct sgx_encl_mm *encl_mm;
-	struct vm_area_struct *vma;
-	int idx, ret;
-
-	do {
-		mm_list_version = encl->mm_list_version;
-
-		/* Pairs with smp_rmb() in sgx_encl_mm_add(). */
-		smp_rmb();
-
-		idx = srcu_read_lock(&encl->srcu);
-
-		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
-			if (!mmget_not_zero(encl_mm->mm))
-				continue;
-
-			mmap_read_lock(encl_mm->mm);
-
-			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
-			if (!ret && encl == vma->vm_private_data)
-				zap_vma_ptes(vma, addr, PAGE_SIZE);
-
-			mmap_read_unlock(encl_mm->mm);
-
-			mmput_async(encl_mm->mm);
-		}
+	int ret;

-		srcu_read_unlock(&encl->srcu, idx);
-	} while (unlikely(encl->mm_list_version != mm_list_version));
+	sgx_zap_enclave_ptes(encl, addr);

 	mutex_lock(&encl->lock);
From patchwork Wed Dec 1 19:23:06 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally
Date: Wed, 1 Dec 2021 11:23:06 -0800

The ETRACK instruction followed by an IPI to all CPUs within an enclave
is a common pattern with more frequent use in support of SGX2.

Make the (empty) IPI callback function available internally in
preparation for more usages.
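The callback can be empty because, as the sgx_encl_cpumask() kernel-doc in
patch 06 explains, delivering the interrupt forces an Asynchronous Enclave
Exit (AEX) on each targeted CPU, and exiting the enclave is what flushes
the cached linear-to-physical mappings. A minimal usage sketch:

	/* No work is needed in the handler; the IPI itself causes the AEX. */
	void sgx_ipi_cb(void *info)
	{
	}

	/* Caller, after ENCLS[ETRACK]: */
	on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);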
Signed-off-by: Reinette Chatre
---
 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 arch/x86/kernel/cpu/sgx/sgx.h  | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 9b96f4e0a17a..887648ce6084 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -170,7 +170,7 @@ static int __sgx_encl_ewb(struct sgx_epc_page *epc_page, void *va_slot,
 	return ret;
 }

-static void sgx_ipi_cb(void *info)
+void sgx_ipi_cb(void *info)
 {
 }

diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 9ec3136c7800..ca89d625aa74 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -89,6 +89,8 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);

+void sgx_ipi_cb(void *info);
+
 #ifdef CONFIG_X86_SGX_KVM
 int __init sgx_vepc_init(void);
 #else

From patchwork Wed Dec 1 19:23:07 2021
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 09/25] x86/sgx: Keep record of SGX page type
Date: Wed, 1 Dec 2021 11:23:07 -0800

SGX2 functions are not allowed on all page types. For example,
ENCLS[EMODPR] is only allowed on regular SGX enclave pages and
ENCLS[EMODPT] is only allowed on TCS and regular pages. If these
functions are attempted on another type of page, the hardware triggers
a fault.

Keep a record of the SGX page type so that there is more certainty
whether an SGX2 instruction can succeed and faults can be treated as
real failures.

The page type is made to be a property of struct sgx_encl_page and thus
does not cover the VA page type.
VA pages are maintained in separate structures and thus their type can
be determined in a different way. The SGX2 instructions being supported
do not operate on VA pages, and this is thus not a scenario needing to
be covered at this time.

With the protection bits consuming 16 bits of the unsigned long, there
is room available in the bitfield to include the page type information
without increasing the space consumed by the struct.

Signed-off-by: Reinette Chatre
Acked-by: Jarkko Sakkinen
---
 arch/x86/include/asm/sgx.h      | 3 +++
 arch/x86/kernel/cpu/sgx/encl.h  | 1 +
 arch/x86/kernel/cpu/sgx/ioctl.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index ebae2a153c66..8b6cbedada96 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -221,6 +221,9 @@ struct sgx_pageinfo {
  * %SGX_PAGE_TYPE_REG:	a regular page
  * %SGX_PAGE_TYPE_VA:	a VA page
  * %SGX_PAGE_TYPE_TRIM:	a page in trimmed state
+ *
+ * Make sure when making changes to this enum that its values can still fit
+ * in the bitfield within &struct sgx_encl_page
  */
 enum sgx_page_type {
 	SGX_PAGE_TYPE_SECS,
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 82e21088e68b..cb9f16d457ac 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -29,6 +29,7 @@ struct sgx_encl_page {
 	unsigned long desc;
 	unsigned long vm_max_prot_bits:8;
 	unsigned long vm_run_prot_bits:8;
+	enum sgx_page_type type:16;
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl *encl;
 	struct sgx_va_page *va_page;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 7e0819a89532..491d2700a54d 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -107,6 +107,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
 		set_bit(SGX_ENCL_DEBUG, &encl->flags);

 	encl->secs.encl = encl;
+	encl->secs.type = SGX_PAGE_TYPE_SECS;
 	encl->base = secs->base;
 	encl->size = secs->size;
 	encl->attributes = secs->attributes;
@@ -350,6 +351,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 	 */
 	encl_page->encl = encl;
 	encl_page->epc_page = epc_page;
+	encl_page->type = (secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
 	encl->secs_child_cnt++;

 	if (flags & SGX_PAGE_MEASURE) {
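The ">> 8" in the ioctl.c hunk above works because the SGX architecture
places the page type in bits 15:8 of SECINFO.FLAGS, with the R/W/X
permission bits in bits 2:0; a reference sketch of that layout:

	/*
	 * SECINFO.FLAGS layout relevant here (per the SGX architecture):
	 *   bits 2:0   page permissions (R/W/X)
	 *   bits 15:8  page type (SGX_PAGE_TYPE_*)
	 * so masking with SGX_SECINFO_PAGE_TYPE_MASK and shifting right
	 * by 8 recovers a value from enum sgx_page_type:
	 */
	page_type = (secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;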
by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 10/25] x86/sgx: Support enclave page permission changes Date: Wed, 1 Dec 2021 11:23:08 -0800 Message-Id: <44fe170cfd855760857660b9f56cae8c4747cc15.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org In the initial (SGX1) version of SGX, pages in an enclave need to be created with permissions that support all usages of the pages, from the time the enclave is initialized until it is unloaded. For example, pages used by a JIT compiler or when code needs to otherwise be relocated need to always have RWX permissions. SGX2 includes two functions that can be used to modify the enclave page permissions of regular enclave pages within an initialized enclave. ENCLS[EMODPR] is run from the OS and used to restrict enclave page permissions while ENCLU[EMODPE] is run from within the enclave to extend enclave page permissions. Enclave page permission changes need to be approached with care and for this reason this initial support allows enclave page permission changes _only_ if the new permissions are the same or more restrictive than the permissions originally vetted at the time the pages were added to the enclave. Support for extending enclave page permissions beyond what was originally vetted is deferred. Whether enclave page permissions are restricted or extended, it is necessary to ensure that the page table entries and enclave page permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to support enclave page permission changes. Since the OS has no insight into how permissions may have been extended from within the enclave, all page permission requests are treated as permission restrictions. This ioctl is used when enclave page permissions need to be restricted via the OS as well as after enclave page permissions have been extended from within the enclave (to ensure correct page table entries are generated). With this ioctl the user specifies a page range and the permissions to be applied to all pages in the provided range. The ioctl itself can return an error code based on failures encountered by the OS. It is also possible for SGX-specific failures to be encountered. Add a result output parameter to communicate the SGX return code. It is possible for the permission change request to fail on a particular page. To support partial success the ioctl returns the number of bytes (a multiple of the page size) that were successfully changed.
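For orientation, a rough sketch of how user space could drive this ioctl follows. This is illustrative only and not part of the patch; encl_fd is a placeholder for the file descriptor of an initialized enclave, and the target page is assumed to have been vetted with at least the requested permissions:

        #include <stdio.h>
        #include <string.h>
        #include <sys/ioctl.h>
        #include <sys/mman.h>   /* PROT_READ */
        #include <asm/sgx.h>    /* SGX_IOC_PAGE_MODP, struct sgx_page_modp */

        /* Restrict the enclave page at @offset to read-only. */
        static int restrict_to_read(int encl_fd, unsigned long offset)
        {
                struct sgx_page_modp ioc;
                int ret;

                memset(&ioc, 0, sizeof(ioc));
                ioc.offset = offset;    /* page aligned, relative to enclave base */
                ioc.length = 4096;      /* multiple of the page size */
                ioc.prot = PROT_READ;   /* new, more restrictive, permissions */

                ret = ioctl(encl_fd, SGX_IOC_PAGE_MODP, &ioc);
                if (ret == -1) {
                        perror("SGX_IOC_PAGE_MODP");
                        return -1;
                }
                if (ioc.count != ioc.length)
                        /* Partial success: ioc.result holds the ENCLS[EMODPR] result. */
                        fprintf(stderr, "changed %llu of %llu bytes\n",
                                (unsigned long long)ioc.count,
                                (unsigned long long)ioc.length);
                return 0;
        }

After a successful restriction the enclave is expected to run ENCLU[EACCEPT] on each affected page, as the selftest added later in this series demonstrates.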
Signed-off-by: Reinette Chatre --- arch/x86/include/uapi/asm/sgx.h | 20 +++ arch/x86/kernel/cpu/sgx/encl.c | 4 +- arch/x86/kernel/cpu/sgx/encl.h | 3 + arch/x86/kernel/cpu/sgx/ioctl.c | 235 ++++++++++++++++++++++++++++++++ 4 files changed, 260 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h index f4b81587e90b..24bebc31e336 100644 --- a/arch/x86/include/uapi/asm/sgx.h +++ b/arch/x86/include/uapi/asm/sgx.h @@ -29,6 +29,8 @@ enum sgx_page_flags { _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision) #define SGX_IOC_VEPC_REMOVE_ALL \ _IO(SGX_MAGIC, 0x04) +#define SGX_IOC_PAGE_MODP \ + _IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp) /** * struct sgx_enclave_create - parameter structure for the @@ -76,6 +78,24 @@ struct sgx_enclave_provision { __u64 fd; }; +/** + * struct sgx_page_modp - parameter structure for the %SGX_IOC_PAGE_MODP ioctl + * @offset: starting page offset (page aligned relative to enclave base + * address defined in SECS) + * @length: length of memory (multiple of the page size) + * @prot: new protection bits of pages in range described by @offset + * and @length + * @result: SGX result code of ENCLS[EMODPR] function + * @count: bytes successfully changed (multiple of page size) + */ +struct sgx_page_modp { + __u64 offset; + __u64 length; + __u64 prot; + __u64 result; + __u64 count; +}; + struct sgx_enclave_run; /** diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index ba39186d5a28..03c4d7e00b44 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -90,8 +90,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, return epc_page; } -static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, - unsigned long addr) +struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, + unsigned long addr) { struct sgx_epc_page *epc_page; struct sgx_encl_page *entry; diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index cb9f16d457ac..848a28d28d3d 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -120,4 +120,7 @@ void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); bool sgx_va_page_full(struct sgx_va_page *va_page); void sgx_encl_free_epc_page(struct sgx_epc_page *page); +struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, + unsigned long addr); + #endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 491d2700a54d..5dddb3c9f742 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -682,6 +682,238 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg) return sgx_set_attribute(&encl->attributes_mask, params.fd); } +/** + * sgx_page_modp - Align enclave (EPCM) and OS (PTE) view of page permissions + * @encl: Enclave to which the pages belong. + * @modp: Checked parameters from user about which pages need modifying + * and their new permissions. + * + * SGX2 distinguishes between extending and restricting the enclave page + * permissions maintained by the hardware (EPCM permissions) of pages + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT). + * + * EPCM permissions can be extended anytime directly from the enclave with + * no visibility from the OS. This is accomplished with ENCLU[EMODPE] + * run from within the enclave.
Accessing pages with the new, extended + * permissions requires the OS to update the PTE to handle the subsequent + * #PF correctly. + * + * EPCM permissions cannot be restricted from within the enclave; the enclave + * requires the OS to run the privileged level 0 instructions ENCLS[EMODPR] + * and ENCLS[ETRACK] to achieve this. + * + * Since the OS does not have insight into the enclave's ENCLU[EMODPE] calls, + * all EPCM permission changes are treated as restrictions of (EPCM) + * permissions. Page table entries are cleared to ensure that the fault + * handler installs new entries with correct permissions. + * + * Return: + * - 0: Success. + * - -errno: Otherwise. + */ +static long sgx_page_modp(struct sgx_encl *encl, struct sgx_page_modp *modp) +{ + unsigned long vm_prot, run_prot_restore; + struct sgx_encl_page *entry; + struct sgx_secinfo secinfo; + unsigned long addr; + u64 secinfo_perm; + unsigned long c; + void *epc_virt; + int ret; + + secinfo_perm = modp->prot & SGX_SECINFO_PERMISSION_MASK; + + if ((secinfo_perm & SGX_SECINFO_W) && !(secinfo_perm & SGX_SECINFO_R)) + return -EINVAL; + + memset(&secinfo, 0, sizeof(secinfo)); + + secinfo.flags = secinfo_perm; + + vm_prot = _calc_vm_trans(secinfo.flags, SGX_SECINFO_R, PROT_READ) | + _calc_vm_trans(secinfo.flags, SGX_SECINFO_W, PROT_WRITE) | + _calc_vm_trans(secinfo.flags, SGX_SECINFO_X, PROT_EXEC); + vm_prot = calc_vm_prot_bits(vm_prot, 0); + + for (c = 0 ; c < modp->length; c += PAGE_SIZE) { + addr = encl->base + modp->offset + c; + + mutex_lock(&encl->lock); + + entry = sgx_encl_load_page(encl, addr); + if (IS_ERR(entry)) { + ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT; + goto out_unlock; + } + + /* + * Changing EPCM permissions is only supported on regular + * SGX pages. Attempting this change on other pages will + * result in #PF. + */ + if (entry->type != SGX_PAGE_TYPE_REG) { + ret = -EINVAL; + goto out_unlock; + } + + /* + * Do not verify whether the current runtime protection bits + * match what is being requested. The enclave may have run + * permission extending calls without letting the OS know, and + * thus a permission restriction may still be needed even if, + * from the OS perspective, the permissions are unchanged. + */ + + /* Do not exceed permissions that have been vetted. */ + if ((entry->vm_max_prot_bits & vm_prot) != vm_prot) { + ret = -EPERM; + goto out_unlock; + } + + /* Make sure page stays around while releasing mutex. */ + if (sgx_unmark_page_reclaimable(entry->epc_page)) { + ret = -EAGAIN; + goto out_unlock; + } + + /* + * Change runtime protection before zapping PTEs to ensure + * any new #PF uses new permissions. EPCM permissions not + * changed yet. + */ + run_prot_restore = entry->vm_run_prot_bits; + entry->vm_run_prot_bits = vm_prot; + + mutex_unlock(&encl->lock); + /* + * Do not keep encl->lock because of dependency on + * mmap_lock acquired in sgx_zap_enclave_ptes(). + */ + sgx_zap_enclave_ptes(encl, addr); + + mutex_lock(&encl->lock); + + /* Change EPCM permissions. */ + epc_virt = sgx_get_epc_virt_addr(entry->epc_page); + ret = __emodpr(&secinfo, epc_virt); + if (encls_faulted(ret)) { + /* + * All possible faults should be avoidable: the + * parameters have been checked, only permissions of + * a regular page will be changed, and concurrent + * SGX1/SGX2 ENCLS instructions are prevented by + * the mutex.
+ */ + pr_err_once("EMODPR encountered exception %d\n", + ENCLS_TRAPNR(ret)); + ret = -EFAULT; + goto out_prot_restore; + } + if (encls_failed(ret)) { + modp->result = ret; + ret = -EFAULT; + goto out_prot_restore; + } + + epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page); + ret = __etrack(epc_virt); + if (ret) { + /* + * ETRACK only fails when there is an OS issue. For + * example, two consecutive ETRACKs were sent without + * a completed IPI in between. + */ + pr_err_once("ETRACK returned %d (0x%x)", ret, ret); + /* + * Send IPIs to kick CPUs out of the enclave and + * try ETRACK again. + */ + on_each_cpu_mask(sgx_encl_cpumask(encl), + sgx_ipi_cb, NULL, 1); + ret = __etrack(epc_virt); + if (ret) { + pr_err_once("ETRACK repeat returned %d (0x%x)", + ret, ret); + ret = -EFAULT; + goto out_reclaim; + } + } + on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1); + + sgx_mark_page_reclaimable(entry->epc_page); + mutex_unlock(&encl->lock); + } + + ret = 0; + goto out; + +out_prot_restore: + entry->vm_run_prot_bits = run_prot_restore; +out_reclaim: + sgx_mark_page_reclaimable(entry->epc_page); +out_unlock: + mutex_unlock(&encl->lock); +out: + modp->count = c; + + return ret; +} + +/** + * sgx_ioc_page_modp() - handler for %SGX_IOC_PAGE_MODP + * @encl: an enclave pointer + * @arg: userspace pointer to a &struct sgx_page_modp instance + * + * Return: + * - 0: Success + * - -errno: Otherwise + */ +static long sgx_ioc_page_modp(struct sgx_encl *encl, void __user *arg) +{ + struct sgx_page_modp params; + long ret; + + /* + * Ensure that there is a chance this could succeed: (1) SGX2 + * is required, and (2) only pages in an initialized enclave can + * be modified. + */ + if (!(cpu_feature_enabled(X86_FEATURE_SGX2))) + return -ENODEV; + + if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) + return -EINVAL; + + /* + * Obtain parameters from user and perform sanity checks.
+ */ + if (copy_from_user(&params, arg, sizeof(params))) + return -EFAULT; + + if (!IS_ALIGNED(params.offset, PAGE_SIZE)) + return -EINVAL; + + if (!params.length || params.length & (PAGE_SIZE - 1)) + return -EINVAL; + + if (params.offset + params.length - PAGE_SIZE >= encl->size) + return -EINVAL; + + if (params.prot & ~SGX_SECINFO_PERMISSION_MASK) + return -EINVAL; + + if (params.result || params.count) + return -EINVAL; + + ret = sgx_page_modp(encl, &params); + + if (copy_to_user(arg, &params, sizeof(params))) + return -EFAULT; + + return ret; +} + long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { struct sgx_encl *encl = filep->private_data; @@ -703,6 +935,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) case SGX_IOC_ENCLAVE_PROVISION: ret = sgx_ioc_enclave_provision(encl, (void __user *)arg); break; + case SGX_IOC_PAGE_MODP: + ret = sgx_ioc_page_modp(encl, (void __user *)arg); + break; default: ret = -ENOIOCTLCMD; break; From patchwork Wed Dec 1 19:23:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650891 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A23DFC433F5 for ; Wed, 1 Dec 2021 19:23:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352717AbhLAT1T (ORCPT ); Wed, 1 Dec 2021 14:27:19 -0500 Received: from mga04.intel.com ([192.55.52.120]:46907 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245180AbhLAT1I (ORCPT ); Wed, 1 Dec 2021 14:27:08 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267930" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="235267930" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380468" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 11/25] selftests/sgx: Add test for EPCM permission changes Date: Wed, 1 Dec 2021 11:23:09 -0800 Message-Id: <5f35768a649683eac1160db7bff9f02e00813115.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org EPCM permission changes can be made from within the enclave (to extend permissions) or from outside the enclave (to restrict permissions). OS support is needed when permissions are restricted: the OS runs the privileged ENCLS[EMODPR] instruction and ensures that PTEs allowing more than the restricted permissions are flushed. EPCM permissions can be extended via ENCLU[EMODPE] from within the enclave.
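For reference, extending permissions from within the enclave boils down to a single ENCLU leaf function that takes a SECINFO structure and the page address. A minimal sketch of such a call, mirroring the do_encl_emodpe() helper this patch adds to the test enclave (epc_addr is a placeholder for the enclave page being extended):

        /*
         * Sketch of extending an enclave page's EPCM permissions to RW
         * from within the enclave. ENCLU[EMODPE]: EAX = leaf (0x6),
         * RBX = SECINFO, RCX = enclave page address.
         */
        static void emodpe_rw(void *epc_addr)
        {
                struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};

                secinfo.flags = SGX_SECINFO_R | SGX_SECINFO_W;

                asm volatile(".byte 0x0f, 0x01, 0xd7"   /* ENCLU */
                             :
                             : "a" (0x6), "b" (&secinfo), "c" (epc_addr));
        }

EMODPE does not itself require an ENCLU[EACCEPT] afterwards, but the OS still needs to be told about the change so that the PTEs can be updated, which is what steps 6) and 7) of the test flow below exercise.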
Add a test that exercises a few of the enclave page permission flows: 1) The test starts with a RW (from the enclave and OS perspective) enclave page that is mapped via a RW VMA. 2) The SGX_IOC_PAGE_MODP ioctl is used to restrict the enclave (EPCM) page permissions to read-only (the OS removes the page table entry in the process). 3) Run ENCLU[EACCEPT] from within the enclave to accept the new page permissions. 4) Attempt to write to the enclave page from within the enclave - this should fail with a page fault on the page table entry since the page table entry accurately reflects the EPCM permissions. 5) Restore EPCM permissions to RW by running ENCLU[EMODPE] from within the enclave. 6) Attempt to write to the enclave page from within the enclave - this should fail again with a page fault because even though the EPCM permissions are RW the page table entries do not yet reflect that. 7) The SGX_IOC_PAGE_MODP ioctl is used to inform the OS of the new page permissions, after which the page table entries will accurately reflect the RW EPCM permissions. 8) Writing to the enclave page from within the enclave succeeds. 9) Ensure EPCM.PR is clear by running ENCLU[EACCEPT]. Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/defines.h | 15 ++ tools/testing/selftests/sgx/main.c | 264 ++++++++++++++++++++++++ tools/testing/selftests/sgx/test_encl.c | 38 ++++ 3 files changed, 317 insertions(+) diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index 02d775789ea7..b638eb98c80c 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -24,6 +24,8 @@ enum encl_op_type { ENCL_OP_PUT_TO_ADDRESS, ENCL_OP_GET_FROM_ADDRESS, ENCL_OP_NOP, + ENCL_OP_EACCEPT, + ENCL_OP_EMODPE, ENCL_OP_MAX, }; @@ -53,4 +55,17 @@ struct encl_op_get_from_addr { uint64_t addr; }; +struct encl_op_eaccept { + struct encl_op_header header; + uint64_t epc_addr; + uint64_t flags; + uint64_t ret; +}; + +struct encl_op_emodpe { + struct encl_op_header header; + uint64_t epc_addr; + uint64_t flags; +}; + #endif /* DEFINES_H */ diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 7e912db4c6c5..dbd071ba03fe 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -24,6 +24,18 @@ static const uint64_t MAGIC = 0x1122334455667788ULL; static const uint64_t MAGIC2 = 0x8877665544332211ULL; vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave; +/* + * The Security Information (SECINFO) data structure needed by a few SGX + * instructions (e.g. ENCLU[EACCEPT] and ENCLU[EMODPE]) holds meta-data + * about an enclave page. &enum sgx_secinfo_page_state specifies the + * secinfo flags used for page state. + */ +enum sgx_secinfo_page_state { + SGX_SECINFO_PENDING = (1 << 3), + SGX_SECINFO_MODIFIED = (1 << 4), + SGX_SECINFO_PR = (1 << 5), +}; + struct vdso_symtab { Elf64_Sym *elf_symtab; const char *elf_symstrtab; @@ -555,4 +567,256 @@ TEST_F(enclave, pte_permissions) EXPECT_EQ(self->run.exception_addr, 0); } +/* + * Enclave page permission test. + * + * Modify and restore an enclave page's EPCM (enclave) permissions from + * outside the enclave (ENCLS[EMODPR] via the OS) as well as from within + * the enclave (via ENCLU[EMODPE]). The kernel should ensure that the PTE + * permissions are the same as the EPCM permissions, so check for a page + * fault if the VMA allows access but the EPCM and PTE do not.
+ */ +TEST_F(enclave, epcm_permissions) +{ + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + struct encl_op_eaccept eaccept_op; + struct encl_op_emodpe emodpe_op; + unsigned long data_start; + struct sgx_page_modp ioc; + int ret, errno_save; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + /* + * Ensure that the kernel supports the needed ioctl and that the + * system supports the needed SGX functions. + */ + memset(&ioc, 0, sizeof(ioc)); + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc); + + if (ret == -1) { + if (errno == ENOTTY) + SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODP ioctl"); + else if (errno == ENODEV) + SKIP(return, "System does not support SGX2"); + } + + /* + * Invalid parameters were provided during the sanity check, + * expect the command to fail. + */ + EXPECT_EQ(ret, -1); + + /* + * The page that will have its permissions changed is the second data + * page in the .data segment. This forms part of the local encl_buffer + * within the enclave. + * + * At the start of the test @data_start should have EPCM as well as + * PTE permissions of RW. + */ + + data_start = self->encl.encl_base + + encl_get_data_offset(&self->encl) + PAGE_SIZE; + + /* + * Sanity check that the page at @data_start is writable before making + * any changes to page permissions. + * + * Start by writing MAGIC to the test page. + */ + put_addr_op.value = MAGIC; + put_addr_op.addr = data_start; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read the memory that was just written to, confirming that the + * page is writable. + */ + get_addr_op.value = 0; + get_addr_op.addr = data_start; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Change the EPCM permissions to read-only, with the PTE entry + * flushed by the OS in the process. + */ + memset(&ioc, 0, sizeof(ioc)); + + ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + ioc.length = PAGE_SIZE; + ioc.prot = PROT_READ; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(ioc.result, 0); + EXPECT_EQ(ioc.count, 4096); + + /* + * EPCM permissions changed from the OS, need to EACCEPT from the + * enclave. + */ + eaccept_op.epc_addr = data_start; + eaccept_op.flags = PROT_READ | SGX_SECINFO_REG | SGX_SECINFO_PR; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* + * The EPCM permissions of the page are now read-only, expect a #PF + * on the PTE (not EPCM) when attempting to write to the page from + * within the enclave.
+ */ + put_addr_op.value = MAGIC2; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EQ(self->run.function, ERESUME); + EXPECT_EQ(self->run.exception_vector, 14); + EXPECT_EQ(self->run.exception_error_code, 0x7); + EXPECT_EQ(self->run.exception_addr, data_start); + + self->run.exception_vector = 0; + self->run.exception_error_code = 0; + self->run.exception_addr = 0; + + /* + * Received AEX but cannot return to the enclave at the same entry + * point; a different TCS is needed from where the EPCM permissions + * can be made writable again. + */ + self->run.tcs = self->encl.encl_base + PAGE_SIZE; + + /* + * Enter the enclave at the new TCS to change the EPCM permissions + * back to writable and thus fix the page fault that triggered the + * AEX. + */ + + emodpe_op.epc_addr = data_start; + emodpe_op.flags = PROT_READ | PROT_WRITE; + emodpe_op.header.type = ENCL_OP_EMODPE; + + EXPECT_EQ(ENCL_CALL(&emodpe_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Attempt to return to the main TCS to resume execution at the + * faulting instruction, but the PTE should still prevent writing + * to the page. + */ + self->run.tcs = self->encl.encl_base; + + EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0, + ERESUME, 0, 0, + &self->run), + 0); + + EXPECT_EQ(self->run.function, ERESUME); + EXPECT_EQ(self->run.exception_vector, 14); + EXPECT_EQ(self->run.exception_error_code, 0x7); + EXPECT_EQ(self->run.exception_addr, data_start); + + self->run.exception_vector = 0; + self->run.exception_error_code = 0; + self->run.exception_addr = 0; + /* + * Inform the OS about the new permissions to have the PTEs match + * the EPCM. + */ + memset(&ioc, 0, sizeof(ioc)); + + ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + ioc.length = PAGE_SIZE; + ioc.prot = PROT_READ | PROT_WRITE; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(ioc.result, 0); + EXPECT_EQ(ioc.count, 4096); + + /* + * The wrong page permissions that caused the original fault have + * now been fixed via the EPCM permissions as well as the PTE. + * Resume execution in the main TCS to re-attempt the memory access. + */ + self->run.tcs = self->encl.encl_base; + + EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0, + ERESUME, 0, 0, + &self->run), + 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + get_addr_op.value = 0; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC2); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.user_data, 0); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * SGX_IOC_PAGE_MODP runs ENCLS[EMODPR], which sets EPCM.PR even + * if permissions are not actually restricted. The success of the + * previous memory access shows that the PR flag does not prevent + * access. Even so, include an ENCLU[EACCEPT] as a reference + * implementation to ensure the EPCM does not have a dangling PR + * bit set.
+ */ + + eaccept_op.epc_addr = data_start; + eaccept_op.flags = PROT_READ | PROT_WRITE | SGX_SECINFO_REG | SGX_SECINFO_PR; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); +} + TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index 4fca01cfd898..5b6c65331527 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -11,6 +11,42 @@ */ static uint8_t encl_buffer[8192] = { 1 }; +enum sgx_enclu_function { + EACCEPT = 0x5, + EMODPE = 0x6, +}; + +static void do_encl_emodpe(void *_op) +{ + struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0}; + struct encl_op_emodpe *op = _op; + + secinfo.flags = op->flags; + + asm volatile(".byte 0x0f, 0x01, 0xd7" + : + : "a" (EMODPE), + "b" (&secinfo), + "c" (op->epc_addr)); +} + +static void do_encl_eaccept(void *_op) +{ + struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0}; + struct encl_op_eaccept *op = _op; + int rax; + + secinfo.flags = op->flags; + + asm volatile(".byte 0x0f, 0x01, 0xd7" + : "=a" (rax) + : "a" (EACCEPT), + "b" (&secinfo), + "c" (op->epc_addr)); + + op->ret = rax; +} + static void *memcpy(void *dest, const void *src, size_t n) { size_t i; @@ -62,6 +98,8 @@ void encl_body(void *rdi, void *rsi) do_encl_op_put_to_addr, do_encl_op_get_from_addr, do_encl_op_nop, + do_encl_eaccept, + do_encl_emodpe, }; struct encl_op_header *op = (struct encl_op_header *)rdi; From patchwork Wed Dec 1 19:23:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44C5AC433FE for ; Wed, 1 Dec 2021 19:23:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245127AbhLAT1F (ORCPT ); Wed, 1 Dec 2021 14:27:05 -0500 Received: from mga05.intel.com ([192.55.52.43]:21702 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244564AbhLAT1F (ORCPT ); Wed, 1 Dec 2021 14:27:05 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784104" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784104" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380470" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 12/25] selftests/sgx: Add test for TCS page permission changes Date: Wed, 1 Dec 2021 11:23:10 -0800 
Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org The kernel should not allow permission changes on TCS pages. Add a test to confirm this behavior. Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/main.c | 70 ++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index dbd071ba03fe..c7c50d05e246 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -120,6 +120,24 @@ static Elf64_Sym *vdso_symtab_get(struct vdso_symtab *symtab, const char *name) return NULL; } +/* + * Return the offset in the enclave where the TCS segment can be found. + * The first RW segment loaded is the TCS. + */ +static off_t encl_get_tcs_offset(struct encl *encl) +{ + int i; + + for (i = 0; i < encl->nr_segments; i++) { + struct encl_segment *seg = &encl->segment_tbl[i]; + + if (i == 0 && seg->prot == (PROT_READ | PROT_WRITE)) + return seg->offset; + } + + return -1; +} + /* * Return the offset in the enclave where the data segment can be found. * The first RW segment loaded is the TCS, skip that to get info on the @@ -567,6 +585,58 @@ TEST_F(enclave, pte_permissions) EXPECT_EQ(self->run.exception_addr, 0); } +/* + * Modifying the permissions of a TCS page should not be possible. + */ +TEST_F(enclave, tcs_permissions) +{ + struct sgx_page_modp ioc; + int ret, errno_save; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + memset(&ioc, 0, sizeof(ioc)); + + /* + * Ensure that the kernel supports the needed ioctl and that the + * system supports the needed SGX functions. + */ + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc); + + if (ret == -1) { + if (errno == ENOTTY) + SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODP ioctl"); + else if (errno == ENODEV) + SKIP(return, "System does not support SGX2"); + } + + /* + * Invalid parameters were provided during the sanity check, + * expect the command to fail. + */ + EXPECT_EQ(ret, -1); + + /* + * Attempt to make the TCS page read-only. This is not allowed and + * should be prevented by the OS. + */ + ioc.offset = encl_get_tcs_offset(&self->encl); + ioc.length = PAGE_SIZE; + ioc.prot = PROT_READ; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, -1); + EXPECT_EQ(errno_save, EINVAL); + EXPECT_EQ(ioc.result, 0); + EXPECT_EQ(ioc.count, 0); +} + /* * Enclave page permission test.
* From patchwork Wed Dec 1 19:23:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16BCAC433EF for ; Wed, 1 Dec 2021 19:23:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1351880AbhLAT1R (ORCPT ); Wed, 1 Dec 2021 14:27:17 -0500 Received: from mga04.intel.com ([192.55.52.120]:46914 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245188AbhLAT1I (ORCPT ); Wed, 1 Dec 2021 14:27:08 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267931" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="235267931" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380473" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Date: Wed, 1 Dec 2021 11:23:11 -0800 Message-Id: <9ab661a845d242cb10a90ade997f8ebda33cc7c9.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org With SGX1 an enclave needs to be created with its maximum memory demands allocated. Pages cannot be added to an enclave after it is initialized. SGX2 introduces a new function, ENCLS[EAUG], that can be used to add pages to an initialized enclave. With SGX2 the enclave still needs to set aside address space for its maximum memory demands during enclave creation, but all pages need not be added before enclave initialization. Pages can be added during enclave runtime. Add support for dynamically adding pages to an initialized enclave, architecturally limited to RW permission. Add pages via the page fault handler at the time an enclave address without a backing enclave page is accessed, potentially directly reclaiming pages if no free pages are available. The enclave is still required to run ENCLU[EACCEPT] on the page before it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT] on an uninitialized address. This will trigger the page fault handler that will add the enclave page and return execution to the enclave to repeat the ENCLU[EACCEPT] instruction, this time successful. 
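A sketch of what this pre-emptive acceptance looks like from within the enclave follows; this mirrors the equivalent helper the selftests later in this series use (addr is a placeholder for a not-yet-backed address inside the enclave's range, and the SECINFO flag names are those used by the selftests):

        /*
         * Pre-emptively accept a not-yet-present page: the first EACCEPT
         * attempt #PFs, the kernel's page fault handler EAUGs the page,
         * and the restarted EACCEPT then succeeds.
         */
        static uint64_t eaccept_pending(void *addr)
        {
                struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
                uint64_t rax;

                secinfo.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG |
                                SGX_SECINFO_PENDING;

                asm volatile(".byte 0x0f, 0x01, 0xd7"   /* ENCLU */
                             : "=a" (rax)
                             : "a" (0x5) /* EACCEPT leaf */,
                               "b" (&secinfo), "c" (addr));

                return rax;     /* 0 on success */
        }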
If the enclave accesses an uninitialized address in another way, for example by expanding the enclave stack to a page that has not yet been added, then the page fault handler would add the page on the first write. Upon returning to the enclave the instruction that triggered the page fault would be repeated and, since ENCLU[EACCEPT] has not been run yet, it would trigger a second page fault, this time with the SGX flag set in the page fault error code. This can only be recovered by entering the enclave again and directly running the ENCLU[EACCEPT] instruction on the now initialized address. Accessing an uninitialized address from outside the enclave also triggers this flow but the page will remain in PENDING state until accepted from within the enclave. The page is added with the architecturally constrained RW permissions as both its runtime and maximum allowed permissions. It is understood that there are some use cases, for example code relocation, that require RWX maximum permissions. Supporting these use cases requires guidance from user space policy before such maximum permissions can be allowed. Integration with user policy is deferred to a follow-up series. Signed-off-by: Reinette Chatre --- arch/x86/kernel/cpu/sgx/encl.c | 133 ++++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/encl.h | 2 + arch/x86/kernel/cpu/sgx/ioctl.c | 4 +- 3 files changed, 137 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 03c4d7e00b44..342b97dd4c33 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -124,6 +124,128 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, return entry; } +/** + * sgx_encl_eaug_page - Dynamically add page to initialized enclave + * @vma: VMA obtained from fault info from where page is accessed + * @encl: enclave accessing the page + * @addr: address that triggered the page fault + * + * When an initialized enclave accesses a page with no backing EPC page + * on an SGX2 system, an EPC page can be added dynamically via the SGX2 + * ENCLS[EAUG] instruction. + * + * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed + * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise. + */ +static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, + struct sgx_encl *encl, unsigned long addr) +{ + struct sgx_pageinfo pginfo = {0}; + struct sgx_encl_page *encl_page; + struct sgx_epc_page *epc_page; + struct sgx_va_page *va_page; + unsigned long phys_addr; + unsigned long prot; + vm_fault_t vmret; + int ret; + + if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) + return VM_FAULT_SIGBUS; + + encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL); + if (!encl_page) + return VM_FAULT_OOM; + + encl_page->desc = addr; + encl_page->encl = encl; + + /* + * A regular page is being added, which is architecturally only + * allowed to be created with RW permissions. + * TBD: Interface with user space policy to support max permissions + * of RWX.
+ */ + prot = PROT_READ | PROT_WRITE; + encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot, 0); + encl_page->vm_max_prot_bits = encl_page->vm_run_prot_bits; + + epc_page = sgx_alloc_epc_page(encl_page, true); + if (IS_ERR(epc_page)) { + kfree(encl_page); + return VM_FAULT_SIGBUS; + } + + va_page = sgx_encl_grow(encl); + if (IS_ERR(va_page)) { + ret = PTR_ERR(va_page); + goto err_out_free; + } + + mutex_lock(&encl->lock); + + /* + * Copy comment from sgx_encl_add_page() to maintain guidance in + * this similar flow: + * Adding to encl->va_pages must be done under encl->lock. Ditto for + * deleting (via sgx_encl_shrink()) in the error path. + */ + if (va_page) + list_add(&va_page->list, &encl->va_pages); + + ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc), + encl_page, GFP_KERNEL); + /* + * If ret == -EBUSY then the page was created in another flow while + * running without encl->lock. + */ + if (ret) + goto err_out_unlock; + + pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page); + pginfo.addr = encl_page->desc & PAGE_MASK; + pginfo.metadata = 0; + + ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page)); + if (ret) + goto err_out; + + encl_page->encl = encl; + encl_page->epc_page = epc_page; + encl_page->type = SGX_PAGE_TYPE_REG; + encl->secs_child_cnt++; + + sgx_mark_page_reclaimable(encl_page->epc_page); + + phys_addr = sgx_get_epc_phys_addr(epc_page); + /* + * Do not undo everything when creating the PTE entry fails - the + * next #PF would find the page ready for a PTE. + * Use PAGE_SHARED because protection is forced to RW above and COW + * is not supported. + */ + vmret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr), + PAGE_SHARED); + if (vmret != VM_FAULT_NOPAGE) { + mutex_unlock(&encl->lock); + return VM_FAULT_SIGBUS; + } + mutex_unlock(&encl->lock); + return VM_FAULT_NOPAGE; + +err_out: + xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc)); + +err_out_unlock: + sgx_encl_shrink(encl, va_page); + mutex_unlock(&encl->lock); + +err_out_free: + sgx_encl_free_epc_page(epc_page); + kfree(encl_page); + + return VM_FAULT_SIGBUS; +} + static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) { unsigned long addr = (unsigned long)vmf->address; @@ -145,6 +267,17 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) if (unlikely(!encl)) return VM_FAULT_SIGBUS; + /* + * The page_array keeps track of all enclave pages, whether they + * are swapped out or not. If there is no entry for this page and + * the system supports SGX2 then it is possible to dynamically add + * a new enclave page. This is only possible for an initialized + * enclave, which is checked for right away.
+ */ + if (cpu_feature_enabled(X86_FEATURE_SGX2) && + (!xa_load(&encl->page_array, PFN_DOWN(addr)))) + return sgx_encl_eaug_page(vma, encl, addr); + mutex_lock(&encl->lock); entry = sgx_encl_load_page(encl, addr); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index 848a28d28d3d..1b6ce1da7c92 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -123,4 +123,6 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page); struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, unsigned long addr); +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl); +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page); #endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 5dddb3c9f742..de0bf68ee842 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -17,7 +17,7 @@ #include "encl.h" #include "encls.h" -static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) { struct sgx_va_page *va_page = NULL; void *err; @@ -43,7 +43,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) return va_page; } -static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page) +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page) { encl->page_cnt--; From patchwork Wed Dec 1 19:23:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650861 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD470C433F5 for ; Wed, 1 Dec 2021 19:23:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245152AbhLAT1H (ORCPT ); Wed, 1 Dec 2021 14:27:07 -0500 Received: from mga05.intel.com ([192.55.52.43]:21702 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245107AbhLAT1F (ORCPT ); Wed, 1 Dec 2021 14:27:05 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784106" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784106" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380477" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization Date: Wed, 1 Dec 2021 11:23:12 -0800 Message-Id: <66da195d44cbbed57b6840c5d20bb789c06fb99f.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Before an enclave is initialized the enclave's memory range is unknown. 
The enclave's memory range is learned at the time it is created via the SGX_IOC_ENCLAVE_CREATE ioctl where the provided memory range is obtained from an earlier mmap() of the sgx_enclave device. After an enclave is initialized its memory can be mapped into user space (mmap()) from where it can be entered at its defined entry points. With the enclave's memory range known after it is initialized, there is no reason why it should be possible to map memory outside this range. Lock down access to the initialized enclave's memory range by denying any attempt to map memory outside its memory range. Locking down the memory range also makes adding pages to an initialized enclave more efficient. Pages are added to an initialized enclave by accessing memory that belongs to the enclave's memory range but is not yet backed by an enclave page. If it is possible for user space to map memory that does not form part of the enclave then an access to this memory would eventually fail. Failures range from a prompt general protection fault if the access was an ENCLU[EACCEPT] from within the enclave, to a page fault via the vDSO if it was another access from within the enclave, to a SIGBUS (also resulting from a page fault) if the access was from outside the enclave. Disallowing invalid memory from being mapped in the first place avoids these preventable failures. Signed-off-by: Reinette Chatre --- arch/x86/kernel/cpu/sgx/encl.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 342b97dd4c33..37203da382f8 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -403,6 +403,10 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start, XA_STATE(xas, &encl->page_array, PFN_DOWN(start)); + if (test_bit(SGX_ENCL_INITIALIZED, &encl->flags) && + (start < encl->base || end > encl->base + encl->size)) + return -EACCES; + /* * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might * conflict with the enclave page permissions.
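In user space terms the new check means that, once the enclave is initialized, mapping memory outside the enclave's range through the enclave fd is denied. A hypothetical probe (encl_fd, encl_base and encl_size are placeholders for an initialized enclave's file descriptor and its SECS-defined base and size):

        #include <errno.h>
        #include <stdio.h>
        #include <sys/mman.h>

        /* Probe: mapping outside an initialized enclave's range is denied. */
        static void probe_outside_mapping(int encl_fd, unsigned long encl_base,
                                          unsigned long encl_size)
        {
                void *addr;

                /* One page just past the end of the enclave's range. */
                addr = mmap((void *)(encl_base + encl_size), 4096, PROT_READ,
                            MAP_SHARED | MAP_FIXED, encl_fd, 0);
                if (addr == MAP_FAILED && errno == EACCES)
                        printf("mapping outside the initialized enclave denied\n");
        }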
From patchwork Wed Dec 1 19:23:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650881 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57937C433F5 for ; Wed, 1 Dec 2021 19:23:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352649AbhLAT1N (ORCPT ); Wed, 1 Dec 2021 14:27:13 -0500 Received: from mga05.intel.com ([192.55.52.43]:21702 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245114AbhLAT1F (ORCPT ); Wed, 1 Dec 2021 14:27:05 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784109" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784109" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380481" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 15/25] selftests/sgx: Test two different SGX2 EAUG flows Date: Wed, 1 Dec 2021 11:23:13 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Enclave pages can be added to an initialized enclave when an address belonging to the enclave but without a backing page is accessed from within the enclave. Accessing memory without a backing enclave page from within an enclave can happen in different ways: 1) Pre-emptively run ENCLU[EACCEPT]. Since the addition of a page always needs to be accepted by the enclave via ENCLU[EACCEPT], this flow is efficient: the first execution of ENCLU[EACCEPT] triggers the addition of the page, and when execution returns to the same instruction the second execution succeeds as the acceptance of the page. 2) A direct read or write. In this flow execution cannot resume from the instruction (read/write) that triggered the fault; instead the enclave needs to be entered at a different entry point to run the needed ENCLU[EACCEPT] before execution can return to the original entry point and the read/write instruction that faulted. Add tests for both flows, with a condensed sketch of the second flow shown below.
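The second flow is the more involved one. Conceptually, its user space side looks like the following condensed sketch of the test added below (using the selftest's own helpers, fixture variables and types; error checking omitted):

        /*
         * The direct write faulted: an AEX occurred with the SGX flag set
         * in the page fault error code. Enter the enclave at an alternate
         * TCS to accept the newly added page ...
         */
        self->run.tcs = self->encl.encl_base + PAGE_SIZE;

        eaccept_op.epc_addr = self->encl.encl_base + total_size;
        eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG |
                           SGX_SECINFO_PENDING;
        eaccept_op.header.type = ENCL_OP_EACCEPT;
        ENCL_CALL(&eaccept_op, &self->run, true);

        /* ... then resume the faulted thread on the main TCS. */
        self->run.tcs = self->encl.encl_base;
        vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0, ERESUME,
                               0, 0, &self->run);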
Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/main.c | 260 +++++++++++++++++++++++++++++ 1 file changed, 260 insertions(+) diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index c7c50d05e246..bc8c7d06d74c 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -85,6 +85,30 @@ static bool vdso_get_symtab(void *addr, struct vdso_symtab *symtab) return true; } +static inline void __cpuid(unsigned int *eax, unsigned int *ebx, + unsigned int *ecx, unsigned int *edx) +{ + asm volatile("cpuid" + : "=a" (*eax), + "=b" (*ebx), + "=c" (*ecx), + "=d" (*edx) + : "0" (*eax), "2" (*ecx) + : "memory"); +} + +static inline int sgx2_supported(void) +{ + unsigned int eax, ebx, ecx, edx; + + eax = SGX_CPUID; + ecx = 0x0; + + __cpuid(&eax, &ebx, &ecx, &edx); + + return eax & 0x2; +} + static unsigned long elf_sym_hash(const char *name) { unsigned long h = 0, high; @@ -889,4 +913,240 @@ TEST_F(enclave, epcm_permissions) EXPECT_EQ(eaccept_op.ret, 0); } +/* + * Test the addition of pages to an initialized enclave via writing to + * a page belonging to the enclave's address space but was not added + * during enclave creation. + */ +TEST_F(enclave, augment) +{ + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + struct encl_op_eaccept eaccept_op; + size_t total_size = 0; + void *addr; + int i; + + if (!sgx2_supported()) + SKIP(return, "SGX2 not supported"); + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + for (i = 0; i < self->encl.nr_segments; i++) { + struct encl_segment *seg = &self->encl.segment_tbl[i]; + + total_size += seg->size; + } + + /* + * Actual enclave size is expected to be larger than the loaded + * test enclave since enclave size must be a power of 2 in bytes + * and test_encl does not consume it all. + */ + EXPECT_LT(total_size + PAGE_SIZE, self->encl.encl_size); + + /* + * Create memory mapping for the page that will be added. New + * memory mapping is for one page right after all existing + * mappings. + */ + addr = mmap((void *)self->encl.encl_base + total_size, PAGE_SIZE, + PROT_READ | PROT_WRITE | PROT_EXEC, + MAP_SHARED | MAP_FIXED, self->encl.fd, 0); + EXPECT_NE(addr, MAP_FAILED); + + self->run.exception_vector = 0; + self->run.exception_error_code = 0; + self->run.exception_addr = 0; + + /* + * Attempt to write to the new page from within enclave. + * Expected to fail since page is not (yet) part of the enclave. + * The first #PF will trigger the addition of the page to the + * enclave, but since the new page needs an EACCEPT from within the + * enclave before it can be used it would not be possible + * to successfully return to the failing instruction. This is the + * cause of the second #PF captured here having the SGX bit set, + * it is from hardware preventing the page from being used. 
+ */ + put_addr_op.value = MAGIC; + put_addr_op.addr = (unsigned long)addr; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EQ(self->run.function, ERESUME); + EXPECT_EQ(self->run.exception_vector, 14); + EXPECT_EQ(self->run.exception_addr, (unsigned long)addr); + + if (self->run.exception_error_code == 0x6) { + munmap(addr, PAGE_SIZE); + SKIP(return, "Kernel does not support adding pages to initialized enclave"); + } + + EXPECT_EQ(self->run.exception_error_code, 0x8007); + + self->run.exception_vector = 0; + self->run.exception_error_code = 0; + self->run.exception_addr = 0; + + /* Handle AEX by running EACCEPT from new entry point. */ + self->run.tcs = self->encl.encl_base + PAGE_SIZE; + + eaccept_op.epc_addr = self->encl.encl_base + total_size; + eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* Can now return to main TCS to resume execution. */ + self->run.tcs = self->encl.encl_base; + + EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0, + ERESUME, 0, 0, + &self->run), + 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read memory that was just written to, confirming that data + * previously written (MAGIC) is present. Only change two test + * parameters, rest are same as previous test. + */ + get_addr_op.value = 0; + get_addr_op.addr = (unsigned long)addr; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + munmap(addr, PAGE_SIZE); +} + +/* + * Test for the addition of pages to an initialized enclave via a + * pre-emptive run of EACCEPT on page to be added. + */ +TEST_F(enclave, augment_via_eaccept) +{ + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + struct encl_op_eaccept eaccept_op; + size_t total_size = 0; + void *addr; + int i; + + if (!sgx2_supported()) + SKIP(return, "SGX2 not supported"); + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + for (i = 0; i < self->encl.nr_segments; i++) { + struct encl_segment *seg = &self->encl.segment_tbl[i]; + + total_size += seg->size; + } + + /* + * Actual enclave size is expected to be larger than the loaded + * test enclave since enclave size must be a power of 2 in bytes while + * test_encl does not consume it all. + */ + EXPECT_LT(total_size + PAGE_SIZE, self->encl.encl_size); + + /* + * mmap() a page at end of existing enclave to be used for dynamic + * EPC page. 
+ */ + + addr = mmap((void *)self->encl.encl_base + total_size, PAGE_SIZE, + PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_FIXED, + self->encl.fd, 0); + EXPECT_NE(addr, MAP_FAILED); + + self->run.exception_vector = 0; + self->run.exception_error_code = 0; + self->run.exception_addr = 0; + + /* + * Run EACCEPT on new page to trigger the #PF->EAUG->EACCEPT(again + * without a #PF). All should be transparent to userspace. + */ + eaccept_op.epc_addr = self->encl.encl_base + total_size; + eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + if (self->run.exception_vector == 14 && + self->run.exception_error_code == 4 && + self->run.exception_addr == self->encl.encl_base + total_size) { + munmap(addr, PAGE_SIZE); + SKIP(return, "Kernel does not support adding pages to initialized enclave"); + } + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* + * New page should be accessible from within enclave - attempt to + * write to it. + */ + put_addr_op.value = MAGIC; + put_addr_op.addr = (unsigned long)addr; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read memory that was just written to, confirming that data + * previously written (MAGIC) is present. Only change two test + * parameters, rest are same as previous test. 
+ */ + get_addr_op.value = 0; + get_addr_op.addr = (unsigned long)addr; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + munmap(addr, PAGE_SIZE); +} + TEST_HARNESS_MAIN From patchwork Wed Dec 1 19:23:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650895 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 813C5C433F5 for ; Wed, 1 Dec 2021 19:24:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352724AbhLAT1X (ORCPT ); Wed, 1 Dec 2021 14:27:23 -0500 Received: from mga04.intel.com ([192.55.52.120]:46904 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352641AbhLAT1N (ORCPT ); Wed, 1 Dec 2021 14:27:13 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267932" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="235267932" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380484" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 16/25] x86/sgx: Support modifying SGX page type Date: Wed, 1 Dec 2021 11:23:14 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Every enclave contains one or more Thread Control Structures (TCS). The TCS contains meta-data used by the hardware to save and restore thread-specific information when entering/exiting the enclave. With SGX1 an enclave needs to be created with enough TCSs to support the largest number of threads expected to use the enclave and enough enclave pages to meet all its anticipated memory demands. In SGX1 all pages remain in the enclave until the enclave is unloaded. Earlier changes added support for the SGX2 feature where pages can be added dynamically to an initialized enclave. SGX2 introduces a new function, ENCLS[EMODT], that is used to change the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS) page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later removal). With the existing support for dynamically adding regular enclave pages to an initialized enclave and changing the page type to TCS, it is possible to dynamically increase the number of threads supported by an enclave.
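By way of illustration, converting a regular page into a TCS page with the interface introduced below could look as follows from user space. This is a sketch only; encl_fd and tcs_offset are placeholders, and SGX_IOC_PAGE_MODT with struct sgx_page_modt are defined later in this patch:

        #include <stdio.h>
        #include <string.h>
        #include <sys/ioctl.h>
        #include <asm/sgx.h>    /* SGX_IOC_PAGE_MODT, struct sgx_page_modt */

        /* Convert the regular enclave page at @tcs_offset into a TCS page. */
        static int make_tcs(int encl_fd, unsigned long tcs_offset)
        {
                struct sgx_page_modt ioc;

                memset(&ioc, 0, sizeof(ioc));
                ioc.offset = tcs_offset;        /* page aligned, relative to enclave base */
                ioc.length = 4096;              /* multiple of the page size */
                ioc.type = SGX_PAGE_TYPE_TCS;

                if (ioctl(encl_fd, SGX_IOC_PAGE_MODT, &ioc) == -1) {
                        perror("SGX_IOC_PAGE_MODT");
                        return -1;
                }
                return 0;
        }

As with the other SGX2 flows, the enclave is expected to approve the type change by running ENCLU[EACCEPT] on the page before using it as a TCS.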
Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step of dynamically removing pages from an initialized enclave. The complete page removal flow is: 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM using the ioctl introduced here. 2) Approve the page removal by running ENCLU[EACCEPT] from within the enclave. 3) Initiate actual page removal using the new ioctl introduced in the following patch. Support changing SGX enclave page types with a new ioctl. With this ioctl the user specifies a page range and the enclave page type to be applied to all pages in the provided range. The ioctl itself can return an error code based on failures encountered by the OS. It is also possible for SGX specific failures to be encountered. Add a result output parameter to communicate the SGX return code. It is possible for the enclave page type change request to fail on any page within the provided range. Support partial success by returning the number of pages that were successfully changed. After the page type is changed to SGX_PAGE_TYPE_TRIM the page continues to be accessible from the OS perspective with page table entries and internal state. The page may be moved to swap. Any invalid access (any access except ENCLU[EACCEPT]) will encounter a page fault with SGX flag set in error code until the page is removed. Removal of trimmed enclave pages on user request will be supported in following patch. Trimmed enclave pages are also removed when enclave is unloaded. Signed-off-by: Reinette Chatre --- arch/x86/include/uapi/asm/sgx.h | 19 +++ arch/x86/kernel/cpu/sgx/ioctl.c | 235 ++++++++++++++++++++++++++++++++ 2 files changed, 254 insertions(+) diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h index 24bebc31e336..f70caccd166c 100644 --- a/arch/x86/include/uapi/asm/sgx.h +++ b/arch/x86/include/uapi/asm/sgx.h @@ -31,6 +31,8 @@ enum sgx_page_flags { _IO(SGX_MAGIC, 0x04) #define SGX_IOC_PAGE_MODP \ _IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp) +#define SGX_IOC_PAGE_MODT \ + _IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt) /** * struct sgx_enclave_create - parameter structure for the @@ -96,6 +98,23 @@ struct sgx_page_modp { __u64 count; }; +/** + * struct sgx_page_modt - parameter structure for the %SGX_IOC_PAGE_MODT ioctl + * @offset: starting page offset (page aligned relative to enclave base + * address defined in SECS) + * @length: length of memory (multiple of the page size) + * @type: new type of pages in range described by @offset and @length + * @result: SGX result code of ENCLS[EMODT] function + * @count: bytes successfully changed (multiple of page size) + */ +struct sgx_page_modt { + __u64 offset; + __u64 length; + __u64 type; + __u64 result; + __u64 count; +}; + struct sgx_enclave_run; /** diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index de0bf68ee842..a952d608ab35 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -914,6 +914,238 @@ static long sgx_ioc_page_modp(struct sgx_encl *encl, void __user *arg) return ret; } +/** + * sgx_page_modt - Modify type of SGX enclave pages + * @encl: Enclave to which the pages belong. + * @modt: Checked parameters from user about which pages need modifying + * and their new type. + * + * Ability to change the enclave page type supports the following use cases: + * * It is possible to add TCS pages to enclave by changing the type of + * regular pages (SGX_PAGE_TYPE_REG) to TCS (SGX_PAGE_TYPE_TCS) pages. 
With + * this support the number of threads supported by an initialized enclave + * can be increased dynamically. + * * Regular or TCS pages can dynamically be removed from an initialized + * enclave by changing the page type to SGX_PAGE_TYPE_TRIM. Changing the + * page type to SGX_PAGE_TYPE_TRIM marks the page for removal with actual + * removal done by handler of %SGX_IOC_PAGE_REMOVE ioctl called after + * ENCLU[EACCEPT] is run on SGX_PAGE_TYPE_TRIM page from within the enclave. + * + * Return: + * - 0: Success + * - -errno: Otherwise + */ +static long sgx_page_modt(struct sgx_encl *encl, struct sgx_page_modt *modt) +{ + unsigned long max_prot_restore, run_prot_restore; + enum sgx_page_type page_type; + struct sgx_encl_page *entry; + struct sgx_secinfo secinfo; + unsigned long prot; + unsigned long addr; + unsigned long c; + void *epc_virt; + int ret; + + page_type = modt->type & SGX_PAGE_TYPE_MASK; + + /* + * The only new page types allowed by hardware are PT_TCS and PT_TRIM. + */ + if (page_type != SGX_PAGE_TYPE_TCS && page_type != SGX_PAGE_TYPE_TRIM) + return -EINVAL; + + memset(&secinfo, 0, sizeof(secinfo)); + + secinfo.flags = page_type << 8; + + for (c = 0 ; c < modt->length; c += PAGE_SIZE) { + addr = encl->base + modt->offset + c; + + mutex_lock(&encl->lock); + + entry = sgx_encl_load_page(encl, addr); + if (IS_ERR(entry)) { + ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT; + goto out_unlock; + } + + /* + * Borrow the logic from the Intel SDM. Regular pages + * (SGX_PAGE_TYPE_REG) can change type to SGX_PAGE_TYPE_TCS + * or SGX_PAGE_TYPE_TRIM but TCS pages can only be trimmed. + * CET pages not supported yet. + */ + if (!(entry->type == SGX_PAGE_TYPE_REG || + (entry->type == SGX_PAGE_TYPE_TCS && + page_type == SGX_PAGE_TYPE_TRIM))) { + ret = -EINVAL; + goto out_unlock; + } + + max_prot_restore = entry->vm_max_prot_bits; + run_prot_restore = entry->vm_run_prot_bits; + + /* + * Once a regular page becomes a TCS page it cannot be + * changed back. So the maximum allowed protection reflects + * the TCS page that is always RW from OS perspective but + * will be inaccessible from within enclave. Before doing + * so, do make sure that the new page type continues to + * respect the originally vetted page permissions. + */ + if (entry->type == SGX_PAGE_TYPE_REG && + page_type == SGX_PAGE_TYPE_TCS) { + if (~entry->vm_max_prot_bits & (VM_READ | VM_WRITE)) { + ret = -EPERM; + goto out_unlock; + } + prot = PROT_READ | PROT_WRITE; + entry->vm_max_prot_bits = calc_vm_prot_bits(prot, 0); + entry->vm_run_prot_bits = entry->vm_max_prot_bits; + + /* + * Prevent page from being reclaimed while mutex + * is released. + */ + if (sgx_unmark_page_reclaimable(entry->epc_page)) { + ret = -EAGAIN; + goto out_entry_changed; + } + + /* + * Do not keep encl->lock because of dependency on + * mmap_lock acquired in sgx_zap_enclave_ptes(). + */ + mutex_unlock(&encl->lock); + + sgx_zap_enclave_ptes(encl, addr); + + mutex_lock(&encl->lock); + + sgx_mark_page_reclaimable(entry->epc_page); + } + + /* Change EPC type */ + epc_virt = sgx_get_epc_virt_addr(entry->epc_page); + ret = __emodt(&secinfo, epc_virt); + if (encls_faulted(ret)) { + /* + * All possible faults should be avoidable: + * parameters have been checked, will only change + * valid page types, and no concurrent + * SGX1/SGX2 ENCLS instructions since these are + * protected with mutex. 
+			 */
+			pr_err_once("EMODT encountered exception %d\n",
+				    ENCLS_TRAPNR(ret));
+			ret = -EFAULT;
+			goto out_entry_changed;
+		}
+		if (encls_failed(ret)) {
+			modt->result = ret;
+			ret = -EFAULT;
+			goto out_entry_changed;
+		}
+
+		epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
+		ret = __etrack(epc_virt);
+		if (ret) {
+			/*
+			 * ETRACK only fails when there is an OS issue. For
+			 * example, two consecutive ETRACKs were issued without
+			 * a completing IPI in between.
+			 */
+			pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
+			/*
+			 * Send IPIs to kick CPUs out of the enclave and
+			 * try ETRACK again.
+			 */
+			on_each_cpu_mask(sgx_encl_cpumask(encl),
+					 sgx_ipi_cb, NULL, 1);
+			ret = __etrack(epc_virt);
+			if (ret) {
+				pr_err_once("ETRACK repeat returned %d (0x%x)",
+					    ret, ret);
+				ret = -EFAULT;
+				goto out_unlock;
+			}
+		}
+		on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
+
+		entry->type = page_type;
+
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_entry_changed:
+	entry->vm_max_prot_bits = max_prot_restore;
+	entry->vm_run_prot_bits = run_prot_restore;
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	modt->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_page_modt() - handler for %SGX_IOC_PAGE_MODT
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a &struct sgx_page_modt instance
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_page_modt(struct sgx_encl *encl, void __user *arg)
+{
+	struct sgx_page_modt params;
+	long ret;
+
+	/*
+	 * Ensure that there is a chance the request could succeed:
+	 * (1) SGX2 is required.
+	 * (2) Only pages in an initialized enclave can be modified.
+	 */
+	if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
+		return -ENODEV;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return -EINVAL;
+
+	/*
+	 * Obtain parameters from user and perform sanity checks.
+	 */
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (!IS_ALIGNED(params.offset, PAGE_SIZE))
+		return -EINVAL;
+
+	if (!params.length || params.length & (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	if (params.offset + params.length - PAGE_SIZE >= encl->size)
+		return -EINVAL;
+
+	if (params.type & ~SGX_PAGE_TYPE_MASK)
+		return -EINVAL;
+
+	if (params.result || params.count)
+		return -EINVAL;
+
+	ret = sgx_page_modt(encl, &params);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -938,6 +1170,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_PAGE_MODP:
 		ret = sgx_ioc_page_modp(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_PAGE_MODT:
+		ret = sgx_ioc_page_modt(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;

From patchwork Wed Dec 1 19:23:15 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Reinette Chatre
X-Patchwork-Id: 12650879
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ED8C8C43217 for ; Wed, 1 Dec 2021 19:23:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352630AbhLAT1N (ORCPT ); Wed, 1 Dec 2021 14:27:13 -0500
Received: from mga05.intel.com ([192.55.52.43]:21705 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245122AbhLAT1F (ORCPT ); Wed, 1 Dec 2021 14:27:05 -0500
X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784111"
X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784111"
Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800
X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380488"
Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 17/25] x86/sgx: Support complete page removal
Date: Wed, 1 Dec 2021 11:23:15 -0800
Message-Id: <737e651af6de9c0a7d43d2532047f20895d6c7d4.1638381245.git.reinette.chatre@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-sgx@vger.kernel.org

The SGX2 page removal flow was introduced in the previous patch and is as follows:
1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM using the ioctl introduced in the previous patch.
2) Approve the page removal by running ENCLU[EACCEPT] from within the enclave.
3) Initiate actual page removal using the new ioctl introduced here.

Support the final step of the SGX2 page removal flow with a new ioctl. With this ioctl the user specifies a page range that should be removed.
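Sketched from the user space side, with encl_fd and offset as placeholders and the in-enclave EACCEPT step represented by a hypothetical enclave_eaccept() helper, the complete flow is:

	struct sgx_page_modt modt = {
		.offset = offset,
		.length = PAGE_SIZE,
		.type   = SGX_PAGE_TYPE_TRIM,
	};
	struct sgx_page_remove rm = {
		.offset = offset,
		.length = PAGE_SIZE,
	};

	/* 1) Mark the page for removal. */
	if (ioctl(encl_fd, SGX_IOC_PAGE_MODT, &modt))
		err(1, "SGX_IOC_PAGE_MODT");

	/* 2) Approve the removal by running ENCLU[EACCEPT] in the enclave. */
	enclave_eaccept(offset, SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED);

	/* 3) Complete the removal; fails with EPERM if step 2 was skipped. */
	if (ioctl(encl_fd, SGX_IOC_PAGE_REMOVE, &rm))
		err(1, "SGX_IOC_PAGE_REMOVE");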
At this time, all pages in the provided range should have the SGX_PAGE_TYPE_TRIM page type and the ioctl will fail with EPERM (Operation not permitted) when it encounters a page that does not have the correct type.

Page removal can fail on any page within the provided range. Support partial success by returning the number of pages that were successfully removed.

Since actual page removal will succeed even if ENCLU[EACCEPT] was not run from within the enclave, the ENCLS[EMODPR] instruction with RWX permissions is used as a no-op mechanism to ensure ENCLU[EACCEPT] was successfully run from within the enclave before the enclave page is removed.

Signed-off-by: Reinette Chatre
---
 arch/x86/include/uapi/asm/sgx.h |  21 +++++
 arch/x86/kernel/cpu/sgx/ioctl.c | 159 ++++++++++++++++++++++++++++++++
 2 files changed, 180 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index f70caccd166c..6648ded960f8 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -33,6 +33,8 @@ enum sgx_page_flags {
 	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
 #define SGX_IOC_PAGE_MODT \
 	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)
+#define SGX_IOC_PAGE_REMOVE \
+	_IOWR(SGX_MAGIC, 0x07, struct sgx_page_remove)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -115,6 +117,25 @@ struct sgx_page_modt {
 	__u64 count;
 };
 
+/**
+ * struct sgx_page_remove - parameters for the %SGX_IOC_PAGE_REMOVE ioctl
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @count:	bytes successfully removed (multiple of page size)
+ *
+ * Regular (PT_REG) or TCS (PT_TCS) pages can be removed from an initialized
+ * enclave if the system supports SGX2. First, the %SGX_IOC_PAGE_MODT ioctl
+ * should be used to change the page type to PT_TRIM. After that succeeds,
+ * ENCLU[EACCEPT] should be run from within the enclave, and only then can
+ * this ioctl be used to complete the page removal.
+ */
+struct sgx_page_remove {
+	__u64 offset;
+	__u64 length;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index a952d608ab35..d11da6c53b26 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -1146,6 +1146,162 @@ static long sgx_ioc_page_modt(struct sgx_encl *encl, void __user *arg)
 	return ret;
 }
 
+/**
+ * sgx_page_remove - Remove trimmed pages from SGX enclave
+ * @encl:	Enclave to which the pages belong
+ * @params:	Checked parameters from user on which pages need to be removed
+ *
+ * Final step of the flow removing pages from an initialized enclave. The
+ * complete flow is:
+ * 1) User changes the type of the pages to be removed to %SGX_PAGE_TYPE_TRIM
+ *    using the %SGX_IOC_PAGE_MODT ioctl.
+ * 2) User approves the page removal by running ENCLU[EACCEPT] from within
+ *    the enclave.
+ * 3) User initiates actual page removal using the %SGX_IOC_PAGE_REMOVE
+ *    ioctl that is handled here.
+ *
+ * First remove any page table entries pointing to the page and then proceed
+ * with the actual removal of the enclave page and data in support of it.
+ *
+ * VA pages are not affected by this removal. It is thus possible that the
+ * enclave may end up with more VA pages than needed to support all its
+ * pages.
+ *
+ * Return:
+ * - 0:		Success.
+ * - -errno:	Otherwise.
+ */ +static long sgx_page_remove(struct sgx_encl *encl, + struct sgx_page_remove *params) +{ + struct sgx_encl_page *entry; + struct sgx_secinfo secinfo; + unsigned long addr; + unsigned long c; + void *epc_virt; + int ret; + + memset(&secinfo, 0, sizeof(secinfo)); + secinfo.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X; + + for (c = 0 ; c < params->length; c += PAGE_SIZE) { + addr = encl->base + params->offset + c; + + mutex_lock(&encl->lock); + + entry = sgx_encl_load_page(encl, addr); + if (IS_ERR(entry)) { + ret = -EFAULT; + goto out_unlock; + } + + if (entry->type != SGX_PAGE_TYPE_TRIM) { + ret = -EPERM; + goto out_unlock; + } + + /* + * ENCLS[EMODPR] is a no-op instruction used to inform if + * ENCLU[EACCEPT] was run from within the enclave. If + * ENCLS[EMODPR] is run with RWX on a trimmed page that is + * not yet accepted then it will return + * %SGX_PAGE_NOT_MODIFIABLE, after the trimmed page is + * accepted the instruction will encounter a page fault. + */ + epc_virt = sgx_get_epc_virt_addr(entry->epc_page); + ret = __emodpr(&secinfo, epc_virt); + if (!encls_faulted(ret) || ENCLS_TRAPNR(ret) != X86_TRAP_PF) { + ret = -EPERM; + goto out_unlock; + } + + if (sgx_unmark_page_reclaimable(entry->epc_page)) { + ret = -EBUSY; + goto out_unlock; + } + + /* + * Do not keep encl->lock because of dependency on + * mmap_lock acquired in sgx_zap_enclave_ptes(). + */ + mutex_unlock(&encl->lock); + + sgx_zap_enclave_ptes(encl, addr); + + mutex_lock(&encl->lock); + + sgx_encl_free_epc_page(entry->epc_page); + encl->secs_child_cnt--; + entry->epc_page = NULL; + xa_erase(&encl->page_array, PFN_DOWN(entry->desc)); + sgx_encl_shrink(encl, NULL); + kfree(entry); + + mutex_unlock(&encl->lock); + } + + ret = 0; + goto out; + +out_unlock: + mutex_unlock(&encl->lock); +out: + params->count = c; + + return ret; +} + +/** + * sgx_ioc_page_remove() - handler for %SGX_IOC_PAGE_REMOVE + * @encl: an enclave pointer + * @arg: userspace pointer to a struct sgx_page_remove instance + * + * Return: + * - 0: Success + * - -errno: Otherwise + */ +static long sgx_ioc_page_remove(struct sgx_encl *encl, void __user *arg) +{ + struct sgx_page_remove params; + long ret; + + /* + * Ensure that there is a chance the request could succeed: + * (1) SGX2 is required + * (2) Pages can only be removed from an initialized enclave + */ + if (!(cpu_feature_enabled(X86_FEATURE_SGX2))) + return -ENODEV; + + if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) + return -EINVAL; + + /* + * Obtain parameters from user and perform sanity checks. 
+	 */
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (!IS_ALIGNED(params.offset, PAGE_SIZE))
+		return -EINVAL;
+
+	if (!params.length || params.length & (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	if (params.offset + params.length - PAGE_SIZE >= encl->size)
+		return -EINVAL;
+
+	if (params.count)
+		return -EINVAL;
+
+	ret = sgx_page_remove(encl, &params);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -1173,6 +1329,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_PAGE_MODT:
 		ret = sgx_ioc_page_modt(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_PAGE_REMOVE:
+		ret = sgx_ioc_page_remove(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;

From patchwork Wed Dec 1 19:23:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Reinette Chatre
X-Patchwork-Id: 12650897
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50283C433EF for ; Wed, 1 Dec 2021 19:24:05 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352663AbhLAT1Z (ORCPT ); Wed, 1 Dec 2021 14:27:25 -0500
Received: from mga04.intel.com ([192.55.52.120]:46907 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352655AbhLAT1N (ORCPT ); Wed, 1 Dec 2021 14:27:13 -0500
X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267934"
X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="235267934"
Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800
X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380491"
Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800
From: Reinette Chatre
To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org
Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org
Subject: [PATCH 18/25] selftests/sgx: Introduce dynamic entry point
Date: Wed, 1 Dec 2021 11:23:16 -0800
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-sgx@vger.kernel.org

The test enclave (test_encl.elf) is built with two initialized Thread Control Structures (TCS) included in the binary. Both TCSs are initialized with the same entry point, encl_entry, that correctly computes the absolute address of the stack based on the stack of each TCS that is also built into the binary.

A new TCS can be added dynamically to the enclave and needs to be initialized with an entry point used to enter the enclave. Since the existing entry point, encl_entry, assumes that the TCS and its stack exist at particular offsets within the binary, it is not able to handle a dynamically added TCS and its stack.
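The layout relied upon by the new entry point introduced below can be sketched as follows (offsets are relative to the enclave base; this mirrors how the selftests later in this series lay out the dynamically added region):

	/*
	 * Dynamically added region (sketch):
	 *
	 *   base + total_size                  stack page (top of stack ends
	 *                                      where the TCS begins)
	 *   base + total_size + PAGE_SIZE      TCS page
	 *   base + total_size + 2 * PAGE_SIZE  SSA page
	 *
	 * On entry RBX holds the TCS address, so the stack pointer can be
	 * derived from the TCS address alone, without a per-thread symbol.
	 */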
Introduce a new entry point, encl_dyn_entry, that initializes the absolute address of that thread's stack to the address immediately preceding the TCS itself. It is now possible to dynamically add a contiguous memory region to the enclave with the new stack preceding the new TCS. With the new TCS initialized with encl_dyn_entry as entry point the absolute address of the stack is computed correctly on entry. Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/test_encl_bootstrap.S | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S index 82fb0dfcbd23..03ae0f57e29d 100644 --- a/tools/testing/selftests/sgx/test_encl_bootstrap.S +++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S @@ -45,6 +45,12 @@ encl_entry: # TCS #2. By adding the value of encl_stack to it, we get # the absolute address for the stack. lea (encl_stack)(%rbx), %rax + jmp encl_entry_core +encl_dyn_entry: + # Entry point for dynamically created TCS page expected to follow + # its stack directly. + lea -1(%rbx), %rax +encl_entry_core: xchg %rsp, %rax push %rax From patchwork Wed Dec 1 19:23:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650865 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FD84C4167D for ; Wed, 1 Dec 2021 19:23:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245163AbhLAT1H (ORCPT ); Wed, 1 Dec 2021 14:27:07 -0500 Received: from mga05.intel.com ([192.55.52.43]:21702 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245132AbhLAT1F (ORCPT ); Wed, 1 Dec 2021 14:27:05 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784113" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784113" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380494" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 19/25] selftests/sgx: Introduce TCS initialization enclave operation Date: Wed, 1 Dec 2021 11:23:17 -0800 Message-Id: <3ab3f4f3c2a0da64a8c889f561f2484788eda365.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org The Thread Control Structure (TCS) contains meta-data used by the hardware to save and restore thread specific information when entering/exiting the enclave. 
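The TCS fields initialized by this patch can be sketched as the following packed layout (offsets match those used in the code below; the struct itself is illustrative, not part of the patch):

	/* TCS layout as initialized by do_encl_init_tcs_page() (sketch). */
	struct tcs_sketch {
		uint64_t state;			/* offset  0: zeroed */
		uint64_t flags;			/* offset  8: zeroed */
		uint64_t ossa;			/* offset 16: SSA offset from enclave base */
		uint32_t cssa;			/* offset 24: zeroed */
		uint32_t nssa;			/* offset 28: set to 1 */
		uint64_t oentry;		/* offset 32: entry point offset from base */
		uint64_t aep;			/* offset 40: zeroed */
		uint64_t ofsbase;		/* offset 48: zeroed */
		uint64_t ogsbase;		/* offset 56: zeroed */
		uint32_t fslimit;		/* offset 64: 0xFFFFFFFF */
		uint32_t gslimit;		/* offset 68: 0xFFFFFFFF */
		uint8_t  reserved[4024];	/* offset 72: zeroed */
	};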
A TCS can be added to an initialized enclave by first adding a new regular enclave page, initializing the content of the new page from within the enclave, and then changing that page's type to a TCS. Support the initialization of a TCS from within the enclave. The variable information needed that should be provided from outside the enclave is the address of the TCS, address of the State Save Area (SSA), and the entry point that the thread should use to enter the enclave. With this information provided all needed fields of a TCS can be initialized. Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/defines.h | 8 +++++++ tools/testing/selftests/sgx/test_encl.c | 30 +++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index b638eb98c80c..d8587c971941 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -26,6 +26,7 @@ enum encl_op_type { ENCL_OP_NOP, ENCL_OP_EACCEPT, ENCL_OP_EMODPE, + ENCL_OP_INIT_TCS_PAGE, ENCL_OP_MAX, }; @@ -68,4 +69,11 @@ struct encl_op_emodpe { uint64_t flags; }; +struct encl_op_init_tcs_page { + struct encl_op_header header; + uint64_t tcs_page; + uint64_t ssa; + uint64_t entry; +}; + #endif /* DEFINES_H */ diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index 5b6c65331527..c0d6397295e3 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -57,6 +57,35 @@ static void *memcpy(void *dest, const void *src, size_t n) return dest; } +static void *memset(void *dest, int c, size_t n) +{ + size_t i; + + for (i = 0; i < n; i++) + ((char *)dest)[i] = c; + + return dest; +} + +static void do_encl_init_tcs_page(void *_op) +{ + struct encl_op_init_tcs_page *op = _op; + void *tcs = (void *)op->tcs_page; + uint32_t val_32; + + memset(tcs, 0, 16); /* STATE and FLAGS */ + memcpy(tcs + 16, &op->ssa, 8); /* OSSA */ + memset(tcs + 24, 0, 4); /* CSSA */ + val_32 = 1; + memcpy(tcs + 28, &val_32, 4); /* NSSA */ + memcpy(tcs + 32, &op->entry, 8); /* OENTRY */ + memset(tcs + 40, 0, 24); /* AEP, OFSBASE, OGSBASE */ + val_32 = 0xFFFFFFFF; + memcpy(tcs + 64, &val_32, 4); /* FSLIMIT */ + memcpy(tcs + 68, &val_32, 4); /* GSLIMIT */ + memset(tcs + 72, 0, 4024); /* Reserved */ +} + static void do_encl_op_put_to_buf(void *op) { struct encl_op_put_to_buf *op2 = op; @@ -100,6 +129,7 @@ void encl_body(void *rdi, void *rsi) do_encl_op_nop, do_encl_eaccept, do_encl_emodpe, + do_encl_init_tcs_page, }; struct encl_op_header *op = (struct encl_op_header *)rdi; From patchwork Wed Dec 1 19:23:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650899 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F103C4332F for ; Wed, 1 Dec 2021 19:24:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245217AbhLAT1Z (ORCPT ); Wed, 1 Dec 2021 14:27:25 -0500 Received: from mga04.intel.com ([192.55.52.120]:46907 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352708AbhLAT1R (ORCPT ); Wed, 1 Dec 2021 14:27:17 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267936" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; 
d="scan'208";a="235267936" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380497" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 20/25] selftests/sgx: Test complete changing of page type flow Date: Wed, 1 Dec 2021 11:23:18 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Support for changing an enclave page's type enables an initialized enclave to be expanded with support for more threads by changing the type of a regular enclave page to that of a Thread Control Structure (TCS). Additionally, being able to change a TCS or regular enclave page's type to be trimmed (SGX_PAGE_TYPE_TRIM) initiates the removal of the page from the enclave. Test changing page type to TCS as well as page removal flows in two phases: In the first phase support for a new thread is dynamically added to an initialized enclave and in the second phase the pages associated with the new thread are removed from the enclave. As an additional sanity check after the second phase the page used as a TCS page during the first phase is added back as a regular page and ensured that it can be written to (which is not possible if it was a TCS page). 
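In condensed form, the flow exercised by the test is (a sketch of the code that follows, not code from the patch):

	/*
	 * Phase 1 (sketch):
	 *   mmap() three pages at the end of the enclave address range
	 *   EACCEPT each page from within the enclave (#PF -> EAUG -> EACCEPT)
	 *   initialize the TCS contents via ENCL_OP_INIT_TCS_PAGE
	 *   SGX_IOC_PAGE_MODT: change the middle page to SGX_PAGE_TYPE_TCS
	 *   EACCEPT the type change, then run a workload on the new TCS
	 *
	 * Phase 2 (sketch):
	 *   SGX_IOC_PAGE_MODT: change all three pages to SGX_PAGE_TYPE_TRIM
	 *   EACCEPT each page from within the enclave
	 *   SGX_IOC_PAGE_REMOVE: complete the removal
	 *   touch the old TCS address to fault in a fresh regular page and
	 *   verify that it is writable
	 */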
Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/load.c | 41 ++++ tools/testing/selftests/sgx/main.c | 343 +++++++++++++++++++++++++++++ tools/testing/selftests/sgx/main.h | 1 + 3 files changed, 385 insertions(+) diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index 9d4322c946e2..41b9d2031799 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -129,6 +129,47 @@ static bool encl_ioc_add_pages(struct encl *encl, struct encl_segment *seg) return true; } +/* + * Parse the enclave code's symbol table to locate and return address of + * the provided symbol + */ +uint64_t encl_get_entry(struct encl *encl, const char *symbol) +{ + Elf64_Shdr *sections; + Elf64_Sym *symtab; + Elf64_Ehdr *ehdr; + char *sym_names; + int num_sym; + int i; + + ehdr = encl->bin; + sections = encl->bin + ehdr->e_shoff; + + for (i = 0; i < ehdr->e_shnum; i++) { + if (sections[i].sh_type == SHT_SYMTAB) { + symtab = (Elf64_Sym *)((char *)encl->bin + sections[i].sh_offset); + num_sym = sections[i].sh_size / sections[i].sh_entsize; + break; + } + } + + for (i = 0; i < ehdr->e_shnum; i++) { + if (sections[i].sh_type == SHT_STRTAB) { + sym_names = (char *)encl->bin + sections[i].sh_offset; + break; + } + } + + for (i = 0; i < num_sym; i++) { + Elf64_Sym *sym = &symtab[i]; + + if (!strcmp(symbol, sym_names + sym->st_name)) + return (uint64_t)sym->st_value; + } + + return 0; +} + bool encl_load(const char *path, struct encl *encl, unsigned long heap_size) { const char device_path[] = "/dev/sgx_enclave"; diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index bc8c7d06d74c..d73ea2a02d4b 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -1149,4 +1149,347 @@ TEST_F(enclave, augment_via_eaccept) munmap(addr, PAGE_SIZE); } +/* + * SGX2 page type modification test in two phases: + * Phase 1: + * Create a new TCS, consisting out of three new pages (stack page with regular + * page type, SSA page with regular page type, and TCS page with TCS page + * type) in an initialized enclave and run a simple workload within it. + * Phase 2: + * Remove the three pages added in phase 1, add a new regular page at the + * same address that previously hosted the TCS page and verify that it can + * be modified. + */ +TEST_F(enclave, tcs_create) +{ + struct encl_op_init_tcs_page init_tcs_page_op; + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + struct encl_op_get_from_buf get_buf_op; + struct encl_op_put_to_buf put_buf_op; + void *addr, *tcs, *stack_end, *ssa; + struct encl_op_eaccept eaccept_op; + struct sgx_page_remove remove_ioc; + struct sgx_page_modt modt_ioc; + size_t total_size = 0; + uint64_t val_64; + int errno_save; + int ret, i; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, + _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + /* + * Hardware (SGX2) and OS support is needed for this test. Start + * with check that test has a chance of succeeding. + */ + memset(&modt_ioc, 0, sizeof(modt_ioc)); + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + + if (ret == -1) { + if (errno == ENOTTY) + SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl"); + else if (errno == ENODEV) + SKIP(return, "System does not support SGX2"); + } + + /* + * Invalid parameters were provided during sanity check, + * expect command to fail. 
+ */ + EXPECT_EQ(ret, -1); + + /* + * Add three regular pages via EAUG: one will be the TCS stack, one + * will be the TCS SSA, and one will be the new TCS. The stack and + * SSA will remain as regular pages, the TCS page will need its + * type changed after populated with needed data. + */ + for (i = 0; i < self->encl.nr_segments; i++) { + struct encl_segment *seg = &self->encl.segment_tbl[i]; + + total_size += seg->size; + } + + /* + * Actual enclave size is expected to be larger than the loaded + * test enclave since enclave size must be a power of 2 in bytes while + * test_encl does not consume it all. + */ + EXPECT_LT(total_size + 3 * PAGE_SIZE, self->encl.encl_size); + + /* + * mmap() three pages at end of existing enclave to be used for the + * three new pages. + */ + addr = mmap((void *)self->encl.encl_base + total_size, 3 * PAGE_SIZE, + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, + self->encl.fd, 0); + EXPECT_NE(addr, MAP_FAILED); + + self->run.exception_vector = 0; + self->run.exception_error_code = 0; + self->run.exception_addr = 0; + + stack_end = (void *)self->encl.encl_base + total_size; + tcs = (void *)self->encl.encl_base + total_size + PAGE_SIZE; + ssa = (void *)self->encl.encl_base + total_size + 2 * PAGE_SIZE; + + /* + * Run EACCEPT on each new page to trigger the + * EACCEPT->(#PF)->EAUG->EACCEPT(again without a #PF) flow. + */ + + eaccept_op.epc_addr = (unsigned long)stack_end; + eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + if (self->run.exception_vector == 14 && + self->run.exception_error_code == 4 && + self->run.exception_addr == (unsigned long)stack_end) { + munmap(addr, 3 * PAGE_SIZE); + SKIP(return, "Kernel does not support adding pages to initialized enclave"); + } + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + eaccept_op.epc_addr = (unsigned long)ssa; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + eaccept_op.epc_addr = (unsigned long)tcs; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* + * Three new pages added to enclave. Now populate the TCS page with + * needed data. This should be done from within enclave. Provide + * the function that will do the actual data population with needed + * data. + */ + + /* + * New TCS will use the "encl_dyn_entry" entrypoint that expects + * stack to begin in page before TCS page. 
+ */ + val_64 = encl_get_entry(&self->encl, "encl_dyn_entry"); + EXPECT_NE(val_64, 0); + + init_tcs_page_op.tcs_page = (unsigned long)tcs; + init_tcs_page_op.ssa = (unsigned long)total_size + 2 * PAGE_SIZE; + init_tcs_page_op.entry = val_64; + init_tcs_page_op.header.type = ENCL_OP_INIT_TCS_PAGE; + + EXPECT_EQ(ENCL_CALL(&init_tcs_page_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* Change TCS page type to TCS. */ + memset(&modt_ioc, 0, sizeof(modt_ioc)); + + modt_ioc.offset = total_size + PAGE_SIZE; + modt_ioc.length = PAGE_SIZE; + modt_ioc.type = SGX_PAGE_TYPE_TCS; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(modt_ioc.result, 0); + EXPECT_EQ(modt_ioc.count, 4096); + + /* EACCEPT new TCS page from enclave. */ + eaccept_op.epc_addr = (unsigned long)tcs; + eaccept_op.flags = SGX_SECINFO_TCS | SGX_SECINFO_MODIFIED; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* Run workload from new TCS. */ + self->run.tcs = (unsigned long)tcs; + + /* + * Simple workload to write to data buffer and read value back. + */ + put_buf_op.header.type = ENCL_OP_PUT_TO_BUFFER; + put_buf_op.value = MAGIC; + + EXPECT_EQ(ENCL_CALL(&put_buf_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + get_buf_op.header.type = ENCL_OP_GET_FROM_BUFFER; + get_buf_op.value = 0; + + EXPECT_EQ(ENCL_CALL(&get_buf_op, &self->run, true), 0); + + EXPECT_EQ(get_buf_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Phase 2 of test: + * Remove pages associated with new TCS, create a regular page + * where TCS page used to be and verify it can be used as a regular + * page. + */ + + /* Start page removal by requesting change of page type to PT_TRIM. */ + memset(&modt_ioc, 0, sizeof(modt_ioc)); + + modt_ioc.offset = total_size; + modt_ioc.length = 3 * PAGE_SIZE; + modt_ioc.type = SGX_PAGE_TYPE_TRIM; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(modt_ioc.result, 0); + EXPECT_EQ(modt_ioc.count, 3 * PAGE_SIZE); + + /* + * Enter enclave via TCS #1 and approve page removal by sending + * EACCEPT for each of three removed pages. 
+ */ + self->run.tcs = self->encl.encl_base; + + eaccept_op.epc_addr = (unsigned long)stack_end; + eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + eaccept_op.epc_addr = (unsigned long)tcs; + eaccept_op.ret = 0; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + eaccept_op.epc_addr = (unsigned long)ssa; + eaccept_op.ret = 0; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* Send final ioctl to complete page removal. */ + memset(&remove_ioc, 0, sizeof(remove_ioc)); + + remove_ioc.offset = total_size; + remove_ioc.length = 3 * PAGE_SIZE; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(remove_ioc.count, 3 * PAGE_SIZE); + + /* + * Enter enclave via TCS #1 and access location where TCS #3 was to + * trigger dynamic add of regular page at that location. + */ + eaccept_op.epc_addr = (unsigned long)tcs; + eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* + * New page should be accessible from within enclave - write to it. + */ + put_addr_op.value = MAGIC; + put_addr_op.addr = (unsigned long)tcs; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read memory that was just written to, confirming that data + * previously written (MAGIC) is present. Only change two test + * parameters, rest are same as previous test. 
+ */ + get_addr_op.value = 0; + get_addr_op.addr = (unsigned long)tcs; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + munmap(addr, 3 * PAGE_SIZE); +} + TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/sgx/main.h b/tools/testing/selftests/sgx/main.h index b45c52ec7ab3..fc585be97e2f 100644 --- a/tools/testing/selftests/sgx/main.h +++ b/tools/testing/selftests/sgx/main.h @@ -38,6 +38,7 @@ void encl_delete(struct encl *ctx); bool encl_load(const char *path, struct encl *encl, unsigned long heap_size); bool encl_measure(struct encl *encl); bool encl_build(struct encl *encl); +uint64_t encl_get_entry(struct encl *encl, const char *symbol); int sgx_enter_enclave(void *rdi, void *rsi, long rdx, u32 function, void *r8, void *r9, struct sgx_enclave_run *run); From patchwork Wed Dec 1 19:23:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650867 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7AFC5C4332F for ; Wed, 1 Dec 2021 19:23:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245171AbhLAT1I (ORCPT ); Wed, 1 Dec 2021 14:27:08 -0500 Received: from mga05.intel.com ([192.55.52.43]:21705 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245133AbhLAT1G (ORCPT ); Wed, 1 Dec 2021 14:27:06 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784115" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784115" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380500" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 21/25] selftests/sgx: Test faulty enclave behavior Date: Wed, 1 Dec 2021 11:23:19 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Removing a page from an initialized enclave involves three steps: first the user requests changing the page type to SGX_PAGE_TYPE_TRIM via an ioctl, on success the ENCLU[EACCEPT] instruction needs to be run from within the enclave to accept the page removal, finally the user requests page removal to be completed via an ioctl. Only after acceptance (ENCLU[EACCEPT]) from within the enclave can the kernel remove the page from a running enclave. 
Test the behavior when the user's request to change the page type succeeds, but the ENCLU[EACCEPT] instruction is not run before the ioctl requesting page removal is run. This should not be permitted. Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/main.c | 114 +++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index d73ea2a02d4b..f71f943099fb 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -1492,4 +1492,118 @@ TEST_F(enclave, tcs_create) munmap(addr, 3 * PAGE_SIZE); } +/* + * Ensure sane behavior if user requests page removal, does not run + * EACCEPT from within enclave but still attempts to finalize page removal + * with the SGX_IOC_PAGE_REMOVE ioctl. The latter should fail because the + * removal was not EACCEPTed from within the enclave. + */ +TEST_F(enclave, remove_added_page_no_eaccept) +{ + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + struct sgx_page_remove remove_ioc; + struct sgx_page_modt modt_ioc; + unsigned long data_start; + int ret, errno_save; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + /* + * Hardware (SGX2) and OS support is needed for this test. Start + * with check that test has a chance of succeeding. + */ + memset(&modt_ioc, 0, sizeof(modt_ioc)); + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + + if (ret == -1) { + if (errno == ENOTTY) + SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl"); + else if (errno == ENODEV) + SKIP(return, "System does not support SGX2"); + } + + /* + * Invalid parameters were provided during sanity check, + * expect command to fail. + */ + EXPECT_EQ(ret, -1); + + /* + * Page that will be removed is the second data page in the .data + * segment. This forms part of the local encl_buffer within the + * enclave. + */ + data_start = self->encl.encl_base + + encl_get_data_offset(&self->encl) + PAGE_SIZE; + + /* + * Sanity check that page at @data_start is writable before + * removing it. + * + * Start by writing MAGIC to test page. + */ + put_addr_op.value = MAGIC; + put_addr_op.addr = data_start; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read memory that was just written to, confirming that data + * previously written (MAGIC) is present. Only change two test + * parameters, rest are same as previous test. + */ + get_addr_op.value = 0; + get_addr_op.addr = data_start; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* Start page removal by requesting change of page type to PT_TRIM */ + memset(&modt_ioc, 0, sizeof(modt_ioc)); + + modt_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + modt_ioc.length = PAGE_SIZE; + modt_ioc.type = SGX_PAGE_TYPE_TRIM; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + errno_save = ret == -1 ? 
errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(modt_ioc.result, 0); + EXPECT_EQ(modt_ioc.count, 4096); + + /* Skip EACCEPT */ + + /* Send final ioctl to complete page removal */ + memset(&remove_ioc, 0, sizeof(remove_ioc)); + + remove_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + remove_ioc.length = PAGE_SIZE; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc); + errno_save = ret == -1 ? errno : 0; + + /* Operation not permitted since EACCEPT was omitted. */ + EXPECT_EQ(ret, -1); + EXPECT_EQ(errno_save, EPERM); + EXPECT_EQ(remove_ioc.count, 0); +} + TEST_HARNESS_MAIN From patchwork Wed Dec 1 19:23:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650901 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4359DC433EF for ; Wed, 1 Dec 2021 19:24:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352665AbhLAT11 (ORCPT ); Wed, 1 Dec 2021 14:27:27 -0500 Received: from mga04.intel.com ([192.55.52.120]:46904 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352710AbhLAT1R (ORCPT ); Wed, 1 Dec 2021 14:27:17 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267937" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="235267937" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380504" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:42 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 22/25] selftests/sgx: Test invalid access to removed enclave page Date: Wed, 1 Dec 2021 11:23:20 -0800 Message-Id: <88af581809b945b1774c32186bcb0190d3d462e9.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Removing a page from an initialized enclave involves three steps: (1) the user requests changing the page type to SGX_PAGE_TYPE_TRIM via the SGX_IOC_PAGE_MODT ioctl, (2) on success the ENCLU[EACCEPT] instruction is run from within the enclave to accept the page removal, (3) the user initiates the actual removal of the page via the SGX_IOC_PAGE_REMOVE ioctl. Test two possible invalid accesses during the page removal flow: * Test the behavior when a request to remove the page by changing its type to SGX_PAGE_TYPE_TRIM completes successfully but instead of executing ENCLU[EACCEPT] from within the enclave the enclave attempts to read from the page. Even though the page is accessible from the page table entries its type is SGX_PAGE_TYPE_TRIM and thus not accessible according to SGX. The expected behavior is a page fault with the SGX flag set in the error code. 
* Test the behavior when the page type is changed successfully and ENCLU[EACCEPT] was run from within the enclave. The final ioctl, SGX_IOC_PAGE_REMOVE, is omitted and replaced with an attempt to access the page. Even though the page is accessible from the page table entries its type is SGX_PAGE_TYPE_TRIM and thus not accessible according to SGX. The expected behavior is a page fault with the SGX flag set in the error code. Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/main.c | 243 +++++++++++++++++++++++++++++ 1 file changed, 243 insertions(+) diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index f71f943099fb..3bee8fa557fd 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -1606,4 +1606,247 @@ TEST_F(enclave, remove_added_page_no_eaccept) EXPECT_EQ(remove_ioc.count, 0); } +/* + * Request enclave page removal but instead of correctly following with + * EACCEPT a read attempt to page is made from within the enclave. + */ +TEST_F(enclave, remove_added_page_invalid_access) +{ + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + unsigned long data_start; + struct sgx_page_modt ioc; + int ret, errno_save; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + /* + * Hardware (SGX2) and OS support is needed for this test. Start + * with check that test has a chance of succeeding. + */ + memset(&ioc, 0, sizeof(ioc)); + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc); + + if (ret == -1) { + if (errno == ENOTTY) + SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl"); + else if (errno == ENODEV) + SKIP(return, "System does not support SGX2"); + } + + /* + * Invalid parameters were provided during sanity check, + * expect command to fail. + */ + EXPECT_EQ(ret, -1); + + /* + * Page that will be removed is the second data page in the .data + * segment. This forms part of the local encl_buffer within the + * enclave. + */ + data_start = self->encl.encl_base + + encl_get_data_offset(&self->encl) + PAGE_SIZE; + + /* + * Sanity check that page at @data_start is writable before + * removing it. + * + * Start by writing MAGIC to test page. + */ + put_addr_op.value = MAGIC; + put_addr_op.addr = data_start; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read memory that was just written to, confirming that data + * previously written (MAGIC) is present. Only change two test + * parameters, rest are same as previous test. + */ + get_addr_op.value = 0; + get_addr_op.addr = data_start; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* Start page removal by requesting change of page type to PT_TRIM */ + memset(&ioc, 0, sizeof(ioc)); + + ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + ioc.length = PAGE_SIZE; + ioc.type = SGX_PAGE_TYPE_TRIM; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc); + errno_save = ret == -1 ? 
errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(ioc.result, 0); + EXPECT_EQ(ioc.count, 4096); + + /* + * Read from page that was just removed. + */ + get_addr_op.value = 0; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + /* + * From OS perspective the page is present but according to SGX the + * page should not be accessible so a #PF with SGX bit set is + * expected. + */ + + EXPECT_EQ(self->run.function, ERESUME); + EXPECT_EQ(self->run.exception_vector, 14); + EXPECT_EQ(self->run.exception_error_code, 0x8005); + EXPECT_EQ(self->run.exception_addr, data_start); +} + +/* + * Request enclave page removal and correctly follow with + * EACCEPT but do not follow with removal ioctl but instead a read attempt + * to removed page is made from within the enclave. + */ +TEST_F(enclave, remove_added_page_invalid_access_after_eaccept) +{ + struct encl_op_get_from_addr get_addr_op; + struct encl_op_put_to_addr put_addr_op; + struct encl_op_eaccept eaccept_op; + unsigned long data_start; + struct sgx_page_modt ioc; + int ret, errno_save; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + /* + * Hardware (SGX2) and OS support is needed for this test. Start + * with check that test has a chance of succeeding. + */ + memset(&ioc, 0, sizeof(ioc)); + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc); + + if (ret == -1) { + if (errno == ENOTTY) + SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl"); + else if (errno == ENODEV) + SKIP(return, "System does not support SGX2"); + } + + /* + * Invalid parameters were provided during sanity check, + * expect command to fail. + */ + EXPECT_EQ(ret, -1); + + /* + * Page that will be removed is the second data page in the .data + * segment. This forms part of the local encl_buffer within the + * enclave. + */ + data_start = self->encl.encl_base + + encl_get_data_offset(&self->encl) + PAGE_SIZE; + + /* + * Sanity check that page at @data_start is writable before + * removing it. + * + * Start by writing MAGIC to test page. + */ + put_addr_op.value = MAGIC; + put_addr_op.addr = data_start; + put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* + * Read memory that was just written to, confirming that data + * previously written (MAGIC) is present. Only change two test + * parameters, rest are same as previous test. + */ + get_addr_op.value = 0; + get_addr_op.addr = data_start; + get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + EXPECT_EQ(get_addr_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + + /* Start page removal by requesting change of page type to PT_TRIM */ + memset(&ioc, 0, sizeof(ioc)); + + ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + ioc.length = PAGE_SIZE; + ioc.type = SGX_PAGE_TYPE_TRIM; + + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc); + errno_save = ret == -1 ? 
errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(ioc.result, 0); + EXPECT_EQ(ioc.count, 4096); + + eaccept_op.epc_addr = (unsigned long)data_start; + eaccept_op.ret = 0; + eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + /* Skip the ioctl to remove the page */ + + /* + * Read from the page that is in the process of being removed. + */ + get_addr_op.value = 0; + + EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0); + + /* + * From the OS perspective the page is present, but according to SGX + * the page should not be accessible, so a #PF with the SGX bit set + * is expected. + */ + + EXPECT_EQ(self->run.function, ERESUME); + EXPECT_EQ(self->run.exception_vector, 14); + EXPECT_EQ(self->run.exception_error_code, 0x8005); + EXPECT_EQ(self->run.exception_addr, data_start); +} + TEST_HARNESS_MAIN From patchwork Wed Dec 1 19:23:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650875 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 613B1C4332F for ; Wed, 1 Dec 2021 19:23:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352437AbhLAT1M (ORCPT ); Wed, 1 Dec 2021 14:27:12 -0500 Received: from mga05.intel.com ([192.55.52.43]:21702 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229655AbhLAT1G (ORCPT ); Wed, 1 Dec 2021 14:27:06 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784117" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784117" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380507" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 23/25] selftests/sgx: Test reclaiming of untouched page Date: Wed, 1 Dec 2021 11:23:21 -0800 Message-Id: <5dfa762816df7aad50303ea6519cdf049b19ffed.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Removing a page from an initialized enclave involves three steps: (1) the user requests changing the page type to PT_TRIM via the SGX_IOC_PAGE_MODT ioctl, (2) on success the ENCLU[EACCEPT] instruction is run from within the enclave to accept the page removal, (3) the user initiates the actual removal of the page via the SGX_IOC_PAGE_REMOVE ioctl. Remove a page that has never been accessed. This means that when the first ioctl requesting page removal arrives, there will be no page table entry, yet a valid page table entry needs to exist for the ENCLU[EACCEPT] function to succeed. This test verifies that a page table entry can still be installed for a page that is in the process of being removed.
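Expressed as user space code, the full removal sequence for a single page looks roughly like this (a condensed sketch; encl_fd and page_offset are illustrative placeholders, while the structures and ioctls are the ones introduced in this series):

struct sgx_page_modt modt = {
	.offset = page_offset,		/* enclave-relative offset of the page */
	.length = PAGE_SIZE,
	.type = SGX_PAGE_TYPE_TRIM,	/* (1) request page type change to PT_TRIM */
};
ioctl(encl_fd, SGX_IOC_PAGE_MODT, &modt);

/* (2) from within the enclave: ENCLU[EACCEPT] with SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED */

struct sgx_page_remove remove = {
	.offset = page_offset,
	.length = PAGE_SIZE,		/* (3) complete the actual page removal */
};
ioctl(encl_fd, SGX_IOC_PAGE_REMOVE, &remove);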
Suggested-by: Haitao Huang Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/main.c | 58 ++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 3bee8fa557fd..618f5ff0601b 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -1849,4 +1849,62 @@ TEST_F(enclave, remove_added_page_invalid_access_after_eaccept) EXPECT_EQ(self->run.exception_addr, data_start); } +TEST_F(enclave, remove_untouched_page) +{ + struct encl_op_eaccept eaccept_op; + struct sgx_page_remove remove_ioc; + struct sgx_page_modt modt_ioc; + unsigned long data_start; + int ret, errno_save; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + data_start = self->encl.encl_base + + encl_get_data_offset(&self->encl) + PAGE_SIZE; + + memset(&modt_ioc, 0, sizeof(modt_ioc)); + + modt_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + modt_ioc.length = PAGE_SIZE; + modt_ioc.type = SGX_PAGE_TYPE_TRIM; + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(modt_ioc.result, 0); + EXPECT_EQ(modt_ioc.count, 4096); + + /* + * Enter the enclave via TCS #1 and approve the page removal by + * sending EACCEPT for the removed page. + */ + + eaccept_op.epc_addr = data_start; + eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED; + eaccept_op.ret = 0; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + + memset(&remove_ioc, 0, sizeof(remove_ioc)); + + remove_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE; + remove_ioc.length = PAGE_SIZE; + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc); + errno_save = ret == -1 ?
errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(remove_ioc.count, 4096); +} + TEST_HARNESS_MAIN From patchwork Wed Dec 1 19:23:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650893 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22E4CC4332F for ; Wed, 1 Dec 2021 19:24:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352668AbhLAT1U (ORCPT ); Wed, 1 Dec 2021 14:27:20 -0500 Received: from mga04.intel.com ([192.55.52.120]:46914 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352657AbhLAT1N (ORCPT ); Wed, 1 Dec 2021 14:27:13 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="235267938" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="235267938" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380510" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:43 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges Date: Wed, 1 Dec 2021 11:23:22 -0800 Message-Id: <4d3cc24f06ae75d99449f1d505cbff9ba53819a9.1638381245.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org The page reclaimer ensures availability of EPC pages across all enclaves. In support of this it runs independently of the individual enclaves so that it can take the locks of the different enclaves as it writes their pages to swap. When a page needs to be loaded from swap, a free EPC page must be available for its contents to be loaded into. Loading an existing enclave page from swap does not reclaim EPC pages directly if none are available; instead the reclaimer is woken when the number of available EPC pages falls below a watermark. When iterating over a large number of pages in an oversubscribed environment there is a race between waking the reclaimer and the reclaimer freeing EPC pages fast enough for the page operations to proceed. Instead of tuning this race, have the page operations themselves ensure that EPC pages are available before proceeding.
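The sgx_direct_reclaim() helper added below reuses the existing sgx_should_reclaim() check, only the tail of which is visible in the context lines of the diff. A sketch of that check, assuming a free-page counter named sgx_nr_free_pages (the counter name and type are assumptions of this sketch, not shown in the diff):

static bool sgx_should_reclaim(unsigned long watermark)
{
	/*
	 * Reclaim is warranted when the number of free EPC pages has
	 * dropped below the given watermark and there are reclaimable
	 * pages on the active page list.
	 */
	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
	       !list_empty(&sgx_active_page_list);
}

Calling sgx_direct_reclaim() before each per-page operation thus bounds how far below the watermark the free EPC pool can fall, instead of depending on ksgxd waking up in time.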
Signed-off-by: Reinette Chatre --- arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++ arch/x86/kernel/cpu/sgx/main.c | 6 ++++++ arch/x86/kernel/cpu/sgx/sgx.h | 1 + 3 files changed, 13 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index d11da6c53b26..fc2737b3c7cc 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -739,6 +739,8 @@ static long sgx_page_modp(struct sgx_encl *encl, struct sgx_page_modp *modp) for (c = 0 ; c < modp->length; c += PAGE_SIZE) { addr = encl->base + modp->offset + c; + sgx_direct_reclaim(); + mutex_lock(&encl->lock); entry = sgx_encl_load_page(encl, addr); @@ -962,6 +964,8 @@ static long sgx_page_modt(struct sgx_encl *encl, struct sgx_page_modt *modt) for (c = 0 ; c < modt->length; c += PAGE_SIZE) { addr = encl->base + modt->offset + c; + sgx_direct_reclaim(); + mutex_lock(&encl->lock); entry = sgx_encl_load_page(encl, addr); @@ -1187,6 +1191,8 @@ static long sgx_page_remove(struct sgx_encl *encl, for (c = 0 ; c < params->length; c += PAGE_SIZE) { addr = encl->base + params->offset + c; + sgx_direct_reclaim(); + mutex_lock(&encl->lock); entry = sgx_encl_load_page(encl, addr); diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 887648ce6084..bc3fc57f5f08 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -376,6 +376,12 @@ static bool sgx_should_reclaim(unsigned long watermark) !list_empty(&sgx_active_page_list); } +void sgx_direct_reclaim(void) +{ + if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) + sgx_reclaim_pages(); +} + static int ksgxd(void *p) { set_freezable(); diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index ca89d625aa74..02af24acaacc 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -85,6 +85,7 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) struct sgx_epc_page *__sgx_alloc_epc_page(void); void sgx_free_epc_page(struct sgx_epc_page *page); +void sgx_direct_reclaim(void); void sgx_mark_page_reclaimable(struct sgx_epc_page *page); int sgx_unmark_page_reclaimable(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); From patchwork Wed Dec 1 19:23:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 12650873 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89254C433EF for ; Wed, 1 Dec 2021 19:23:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245352AbhLAT1K (ORCPT ); Wed, 1 Dec 2021 14:27:10 -0500 Received: from mga05.intel.com ([192.55.52.43]:21705 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245134AbhLAT1G (ORCPT ); Wed, 1 Dec 2021 14:27:06 -0500 X-IronPort-AV: E=McAfee;i="6200,9189,10185"; a="322784118" X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="322784118" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 11:23:44 -0800 X-IronPort-AV: E=Sophos;i="5.87,279,1631602800"; d="scan'208";a="500380512" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2021 
11:23:43 -0800 From: Reinette Chatre To: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, kai.huang@intel.com, cathy.zhang@intel.com, cedric.xing@intel.com, haitao.huang@intel.com, mark.shanahan@intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org Subject: [PATCH 25/25] selftests/sgx: Page removal stress test Date: Wed, 1 Dec 2021 11:23:23 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org Create an enclave with an additional heap that consumes all physical SGX memory, and then remove all of its pages. Depending on the available SGX memory this test could take a significant time (several minutes) to run as it (1) creates the enclave, (2) changes the type of every page to be trimmed, (3) enters the enclave once per page to run EACCEPT, before (4) the pages are finally removed.
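The test sizes its heap using get_total_epc_mem(), which enumerates the platform's EPC sections. A sketch of such an enumeration via CPUID leaf 0x12, following the SDM field layout (the selftest's own helper may differ in detail):

#include <cpuid.h>

static unsigned long get_total_epc_mem(void)
{
	unsigned int eax, ebx, ecx, edx;
	unsigned long total = 0;
	int i;

	/* EPC sections are enumerated from CPUID.0x12 subleaf 2 onwards */
	for (i = 2; ; i++) {
		__cpuid_count(0x12, i, eax, ebx, ecx, edx);

		if ((eax & 0xf) != 0x1)	/* 0x1 marks a valid EPC section */
			break;

		/* Section size: ECX[31:12] holds bits 31:12, EDX[19:0] bits 51:32 */
		total += (ecx & 0xfffff000) +
			 ((unsigned long)(edx & 0xfffff) << 32);
	}

	return total;
}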
Signed-off-by: Reinette Chatre --- tools/testing/selftests/sgx/main.c | 98 ++++++++++++++++++++++++++++++ 1 file changed, 98 insertions(+) diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 618f5ff0601b..c1495eaafe79 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -393,7 +393,105 @@ TEST_F(enclave, unclobbered_vdso_oversubscribed) EXPECT_EQ(get_op.value, MAGIC); EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); +} + +TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) +{ + struct encl_op_get_from_buf get_op; + struct encl_op_put_to_buf put_op; + struct sgx_page_modt modt_ioc; + struct encl_segment *heap; + unsigned long total_mem; + int ret, errno_save; + unsigned long addr; + struct encl_op_eaccept eaccept_op; + struct sgx_page_remove remove_ioc; + unsigned long i; + + /* + * Create an enclave with an additional heap that is as big as all + * available physical SGX memory. + */ + total_mem = get_total_epc_mem(); + ASSERT_NE(total_mem, 0); + TH_LOG("Creating an enclave with %lu bytes heap may take a while ...", + total_mem); + ASSERT_TRUE(setup_test_encl(total_mem, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + heap = &self->encl.segment_tbl[self->encl.nr_segments - 1]; + + put_op.header.type = ENCL_OP_PUT_TO_BUFFER; + put_op.value = MAGIC; + + EXPECT_EQ(ENCL_CALL(&put_op, &self->run, false), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.user_data, 0); + + get_op.header.type = ENCL_OP_GET_FROM_BUFFER; + get_op.value = 0; + + EXPECT_EQ(ENCL_CALL(&get_op, &self->run, false), 0); + + EXPECT_EQ(get_op.value, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.user_data, 0); + + /* Trim entire heap */ + memset(&modt_ioc, 0, sizeof(modt_ioc)); + + modt_ioc.offset = heap->offset; + modt_ioc.length = heap->size; + modt_ioc.type = SGX_PAGE_TYPE_TRIM; + + TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", + heap->size); + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(modt_ioc.result, 0); + EXPECT_EQ(modt_ioc.count, heap->size); + /* EACCEPT all pages being removed */ + addr = self->encl.encl_base + heap->offset; + + eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED; + eaccept_op.header.type = ENCL_OP_EACCEPT; + + TH_LOG("Entering enclave to run EACCEPT for each page of %zd bytes may take a while ...", + heap->size); + for (i = 0; i < heap->size; i += 4096) { + eaccept_op.epc_addr = addr + i; + eaccept_op.ret = 0; + + EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); + EXPECT_EQ(eaccept_op.ret, 0); + } + + /* Complete page removal */ + memset(&remove_ioc, 0, sizeof(remove_ioc)); + + remove_ioc.offset = heap->offset; + remove_ioc.length = heap->size; + + TH_LOG("Removing %zd bytes from enclave may take a while ...", + heap->size); + ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc); + errno_save = ret == -1 ? errno : 0; + + EXPECT_EQ(ret, 0); + EXPECT_EQ(errno_save, 0); + EXPECT_EQ(remove_ioc.count, heap->size); } TEST_F(enclave, clobbered_vdso)