Message ID: 0555a4b4a5e8879eb8f879ab3d9908302000f11c.1644274683.git.reinette.chatre@intel.com (mailing list archive)
State: New, archived
Series: x86/sgx and selftests/sgx: Support SGX2
On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote:
> === Summary ===
>
> An SGX VMA can only be created if its permissions are the same or
> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
> creation this same rule is again enforced by the page fault handler:
> faulted enclave pages are required to have equal or more relaxed
> EPCM permissions than the VMA permissions.
>
> On SGX1 systems the additional enforcement in the page fault handler
> is redundant and on SGX2 systems it incorrectly prevents access.
> On SGX1 systems it is unnecessary to repeat the enforcement of the
> permission rule. The rule used during original VMA creation will
> ensure that any access attempt will use correct permissions.
> With SGX2 the EPCM permissions of a page can change after VMA
> creation, resulting in the VMA permissions potentially being more
> relaxed than the EPCM permissions and the page fault handler
> incorrectly blocking valid access attempts.
>
> Enable the VMA's pages to remain accessible while ensuring that
> the PTEs are installed to match the EPCM permissions but not be
> more relaxed than the VMA permissions.
>
> === Full Changelog ===
>
> An SGX enclave is an area of memory where parts of an application
> can reside. First an enclave is created and loaded (from
> non-enclave memory) with the code and data of an application,
> then user space can map (mmap()) the enclave memory to
> be able to enter the enclave at its defined entry points for
> execution within it.
>
> The hardware maintains a secure structure, the Enclave Page Cache Map
> (EPCM), that tracks the contents of the enclave. Of interest here is
> its tracking of the enclave page permissions. When a page is loaded
> into the enclave its permissions are specified and recorded in the
> EPCM. In parallel the kernel maintains permissions within the
> page table entries (PTEs) and the rule is that PTE permissions
> are not allowed to be more relaxed than the EPCM permissions.
>
> A new mapping (mmap()) of enclave memory can only succeed if the
> mapping has the same or weaker permissions than the permissions that
> were vetted during enclave creation. This is enforced by
> sgx_encl_may_map(), which is called on the mmap() as well as mprotect()
> paths. This rule remains.
>
> One feature of SGX2 is to support the modification of EPCM permissions
> after enclave initialization. Enclave pages may thus already be part
> of a VMA at the time their EPCM permissions are changed, resulting
> in the VMA's permissions potentially being more relaxed than the EPCM
> permissions.
>
> Allow permissions of existing VMAs to be more relaxed than EPCM
> permissions in preparation for dynamic EPCM permission changes
> made possible in SGX2. New VMAs that attempt to have more relaxed
> permissions than EPCM permissions continue to be unsupported.
>
> Reasons why permissions of existing VMAs are allowed to be more relaxed
> than EPCM permissions, instead of dynamically changing VMA permissions
> when EPCM permissions change, are:
>
> 1) Changing VMA permissions involves splitting VMAs, which is an
>    operation that can fail. Additionally, changing EPCM permissions of
>    a range of pages could also fail on any of the pages involved.
>    Handling these error cases causes problems. For example, if an
>    EPCM permission change fails and the VMA has already been split
>    then it is not possible to undo the VMA split nor possible to
>    undo the EPCM permission changes that did succeed before the
>    failure.
> 2) The kernel has little insight into the user space where EPCM
>    permissions are controlled from. For example, a RW page may
>    be made RO just before it is made RX, and splitting the VMAs
>    while the VMAs may change soon is unnecessary.
>
> Remove the extra permission check called on a page fault
> (vm_operations_struct->fault) or during debugging
> (vm_operations_struct->access) when loading the enclave page from swap
> that ensures that the VMA permissions are not more relaxed than the
> EPCM permissions. Since a VMA could only exist if it passed the
> original permission checks during mmap(), and a VMA may indeed
> have more relaxed permissions than the EPCM permissions, this extra
> permission check is no longer appropriate.
>
> With the permission check removed, ensure that PTEs do
> not blindly inherit the VMA permissions but instead the permissions
> that the VMA and EPCM agree on. PTEs for writable pages (from VMA
> and enclave perspective) are installed with the writable bit set,
> reducing the need for this additional flow to the permission mismatch
> cases handled next.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V1:
> - Reword commit message (Jarkko).
> - Use "relax" instead of "exceed" when referring to permissions (Dave).
> - Add snippet to Documentation/x86/sgx.rst that highlights the
>   relationship between VMA, EPCM, and PTE permissions on SGX
>   systems (Andy).
>
>  Documentation/x86/sgx.rst      | 10 +++++++++
>  arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++----------------
>  2 files changed, 30 insertions(+), 18 deletions(-)
>
> diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
> index 89ff924b1480..5659932728a5 100644
> --- a/Documentation/x86/sgx.rst
> +++ b/Documentation/x86/sgx.rst
> @@ -99,6 +99,16 @@ The relationships between the different permission masks are:
>  * PTEs are installed to match the EPCM permissions, but not be more
>    relaxed than the VMA permissions.
>
> +On systems supporting SGX2 EPCM permissions may change while the
> +enclave page belongs to a VMA without impacting the VMA permissions.
> +This means that a running VMA may appear to allow access to an enclave
> +page that is not allowed by its EPCM permissions. For example, when an
> +enclave page with RW EPCM permissions is mapped by a RW VMA but is
> +subsequently changed to have read-only EPCM permissions. The kernel
> +continues to maintain correct access to the enclave page through the
> +PTE that will ensure that only access allowed by both the VMA
> +and EPCM permissions are permitted.
> +
>  Application interface
>  =====================
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 48afe96ae0f0..b6105d9e7c46 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  }
>
>  static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
> -						unsigned long addr,
> -						unsigned long vm_flags)
> +						unsigned long addr)
>  {
> -	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
>  	struct sgx_epc_page *epc_page;
>  	struct sgx_encl_page *entry;
>
> @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>  	if (!entry)
>  		return ERR_PTR(-EFAULT);
>
> -	/*
> -	 * Verify that the faulted page has equal or higher build time
> -	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
> -	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
> -	 */
> -	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
> -		return ERR_PTR(-EFAULT);
> -
>  	/* Entry successfully located. */
>  	if (entry->epc_page) {
>  		if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
> @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>  {
>  	unsigned long addr = (unsigned long)vmf->address;
>  	struct vm_area_struct *vma = vmf->vma;
> +	unsigned long page_prot_bits;
>  	struct sgx_encl_page *entry;
> +	unsigned long vm_prot_bits;
>  	unsigned long phys_addr;
>  	struct sgx_encl *encl;
>  	vm_fault_t ret;
> @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>
>  	mutex_lock(&encl->lock);
>
> -	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
> +	entry = sgx_encl_load_page(encl, addr);
>  	if (IS_ERR(entry)) {
>  		mutex_unlock(&encl->lock);
>
> @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>
>  	phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
>
> -	ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
> +	/*
> +	 * Insert PTE to match the EPCM page permissions ensured to not
> +	 * exceed the VMA permissions.
> +	 */
> +	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
> +	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
> +	/*
> +	 * Add VM_SHARED so that PTE is made writable right away if VMA
> +	 * and EPCM are writable (no COW in SGX).
> +	 */
> +	page_prot_bits |= (vma->vm_flags & VM_SHARED);
> +	ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
> +				  vm_get_page_prot(page_prot_bits));
>  	if (ret != VM_FAULT_NOPAGE) {
>  		mutex_unlock(&encl->lock);
>
> @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
>   * Load an enclave page to EPC if required, and take encl->lock.
>   */
>  static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
> -						   unsigned long addr,
> -						   unsigned long vm_flags)
> +						   unsigned long addr)
>  {
>  	struct sgx_encl_page *entry;
>
>  	for ( ; ; ) {
>  		mutex_lock(&encl->lock);
>
> -		entry = sgx_encl_load_page(encl, addr, vm_flags);
> +		entry = sgx_encl_load_page(encl, addr);
>  		if (PTR_ERR(entry) != -EBUSY)
>  			break;
>
> @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
>  		return -EFAULT;
>
>  	for (i = 0; i < len; i += cnt) {
> -		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK,
> -					      vma->vm_flags);
> +		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK);
>  		if (IS_ERR(entry)) {
>  			ret = PTR_ERR(entry);
>  			break;
> --
> 2.25.1
>

If you unconditionally set vm_max_prot_bits to RWX for dynamically created
pages, you would not need to do this.

These patches could then be safely dropped:

- [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions
- [PATCH V2 08/32] x86/sgx: x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes
- [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions

And that would also keep full ABI compatibility without exceptions to the
existing mainline code.

BR, Jarkko
Hi Jarkko,

On 3/7/2022 9:10 AM, Jarkko Sakkinen wrote:
> On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote:
>> [quoted patch and changelog trimmed]
>
> If you unconditionally set vm_max_prot_bits to RWX for dynamically created
> pages, you would not need to do this.
>
> These patches could then be safely dropped:
>
> - [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions
> - [PATCH V2 08/32] x86/sgx: x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes
> - [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions
>
> And that would also keep full ABI compatibility without exceptions to the
> existing mainline code.

Dropping these changes does not just impact dynamically created pages. Dropping
these patches would result in EPCM page permission restriction being supported
for all pages, those added before enclave initialization as well as dynamically
added pages, but their PTEs would not be impacted.

For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and
then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, Linux
would keep allowing and installing RW PTEs for this page.
Allowing this goes against something explicitly disallowed from the beginning
of SGX, as per:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/x86/sgx.rst#n74

"EPCM permissions are separate from the normal page tables. This prevents the
kernel from, for instance, allowing writes to data which an enclave wishes to
remain read-only."

Reinette
On Mon, Mar 07, 2022 at 09:36:36AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
>
> [quoted patch and earlier discussion trimmed]
>
> Dropping these changes does not just impact dynamically created pages. Dropping
> these patches would result in EPCM page permission restriction being supported
> for all pages, those added before enclave initialization as well as dynamically
> added pages, but their PTEs would not be impacted.
>
> For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and
> then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, Linux
> would keep allowing and installing RW PTEs for this page.
I think that would be perfectly fine, if someone wants to do that. There is no collateral damage in doing that. The kernel does not get messed up because of it. It's a use case that does not make sense in the first place, so it'd be stupid to build anything extensive around it in the kernel. Shooting yourself in the foot is something that the kernel does not and should not protect user space from, unless there is a risk of messing up the state of the kernel itself. Much worse is that we have e.g. a completely artificial ioctl, SGX_IOC_ENCLAVE_RELAX_PERMISSIONS, to support this scheme, which could e.g. cause extra round trips for a simple EMODPE. Also, this means not having to include 06/32, which keeps 100% backwards compatibility in run-time behaviour with the mainline while not restricting dynamically created pages at all. And we get rid of the complex bookkeeping of vm_run_prot_bits. And generally the whole model is then very easy to understand and explain. If I had to give a presentation on the current mess in this patch set at a conference, I can honestly say that I would be in serious trouble. It's not a clean and clear security model, which is a risk by itself. BR, Jarkko
On Tue, Mar 08, 2022 at 10:14:42AM +0200, Jarkko Sakkinen wrote: > On Mon, Mar 07, 2022 at 09:36:36AM -0800, Reinette Chatre wrote: > > Hi Jarkko, > > > > On 3/7/2022 9:10 AM, Jarkko Sakkinen wrote: > > > On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote: > > >> === Summary === > > >> > > >> An SGX VMA can only be created if its permissions are the same or > > >> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA > > >> creation this same rule is again enforced by the page fault handler: > > >> faulted enclave pages are required to have equal or more relaxed > > >> EPCM permissions than the VMA permissions. > > >> > > >> On SGX1 systems the additional enforcement in the page fault handler > > >> is redundant and on SGX2 systems it incorrectly prevents access. > > >> On SGX1 systems it is unnecessary to repeat the enforcement of the > > >> permission rule. The rule used during original VMA creation will > > >> ensure that any access attempt will use correct permissions. > > >> With SGX2 the EPCM permissions of a page can change after VMA > > >> creation resulting in the VMA permissions potentially being more > > >> relaxed than the EPCM permissions and the page fault handler > > >> incorrectly blocking valid access attempts. > > >> > > >> Enable the VMA's pages to remain accessible while ensuring that > > >> the PTEs are installed to match the EPCM permissions but not be > > >> more relaxed than the VMA permissions. > > >> > > >> === Full Changelog === > > >> > > >> An SGX enclave is an area of memory where parts of an application > > >> can reside. First an enclave is created and loaded (from > > >> non-enclave memory) with the code and data of an application, > > >> then user space can map (mmap()) the enclave memory to > > >> be able to enter the enclave at its defined entry points for > > >> execution within it. 
> > >> > > >> The hardware maintains a secure structure, the Enclave Page Cache Map > > >> (EPCM), that tracks the contents of the enclave. Of interest here is > > >> its tracking of the enclave page permissions. When a page is loaded > > >> into the enclave its permissions are specified and recorded in the > > >> EPCM. In parallel the kernel maintains permissions within the > > >> page table entries (PTEs) and the rule is that PTE permissions > > >> are not allowed to be more relaxed than the EPCM permissions. > > >> > > >> A new mapping (mmap()) of enclave memory can only succeed if the > > >> mapping has the same or weaker permissions than the permissions that > > >> were vetted during enclave creation. This is enforced by > > >> sgx_encl_may_map() that is called on the mmap() as well as mprotect() > > >> paths. This rule remains. > > >> > > >> One feature of SGX2 is to support the modification of EPCM permissions > > >> after enclave initialization. Enclave pages may thus already be part > > >> of a VMA at the time their EPCM permissions are changed resulting > > >> in the VMA's permissions potentially being more relaxed than the EPCM > > >> permissions. > > >> > > >> Allow permissions of existing VMAs to be more relaxed than EPCM > > >> permissions in preparation for dynamic EPCM permission changes > > >> made possible in SGX2. New VMAs that attempt to have more relaxed > > >> permissions than EPCM permissions continue to be unsupported. > > >> > > >> Reasons why permissions of existing VMAs are allowed to be more relaxed > > >> than EPCM permissions instead of dynamically changing VMA permissions > > >> when EPCM permissions change are: > > >> 1) Changing VMA permissions involve splitting VMAs which is an > > >> operation that can fail. Additionally changing EPCM permissions of > > >> a range of pages could also fail on any of the pages involved. > > >> Handling these error cases causes problems. 
For example, if an > > >> EPCM permission change fails and the VMA has already been split > > >> then it is not possible to undo the VMA split nor possible to > > >> undo the EPCM permission changes that did succeed before the > > >> failure. > > >> 2) The kernel has little insight into the user space where EPCM > > >> permissions are controlled from. For example, a RW page may > > >> be made RO just before it is made RX and splitting the VMAs > > >> while the VMAs may change soon is unnecessary. > > >> > > >> Remove the extra permission check called on a page fault > > >> (vm_operations_struct->fault) or during debugging > > >> (vm_operations_struct->access) when loading the enclave page from swap > > >> that ensures that the VMA permissions are not more relaxed than the > > >> EPCM permissions. Since a VMA could only exist if it passed the > > >> original permission checks during mmap() and a VMA may indeed > > >> have more relaxed permissions than the EPCM permissions this extra > > >> permission check is no longer appropriate. > > >> > > >> With the permission check removed, ensure that PTEs do > > >> not blindly inherit the VMA permissions but instead the permissions > > >> that the VMA and EPCM agree on. PTEs for writable pages (from VMA > > >> and enclave perspective) are installed with the writable bit set, > > >> reducing the need for this additional flow to the permission mismatch > > >> cases handled next. > > >> > > >> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> > > >> --- > > >> Changes since V1: > > >> - Reword commit message (Jarkko). > > >> - Use "relax" instead of "exceed" when referring to permissions (Dave). > > >> - Add snippet to Documentation/x86/sgx.rst that highlights the > > >> relationship between VMA, EPCM, and PTE permissions on SGX > > >> systems (Andy). 
> > >> > > >> Documentation/x86/sgx.rst | 10 +++++++++ > > >> arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++---------------- > > >> 2 files changed, 30 insertions(+), 18 deletions(-) > > >> > > >> diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst > > >> index 89ff924b1480..5659932728a5 100644 > > >> --- a/Documentation/x86/sgx.rst > > >> +++ b/Documentation/x86/sgx.rst > > >> @@ -99,6 +99,16 @@ The relationships between the different permission masks are: > > >> * PTEs are installed to match the EPCM permissions, but not be more > > >> relaxed than the VMA permissions. > > >> > > >> +On systems supporting SGX2 EPCM permissions may change while the > > >> +enclave page belongs to a VMA without impacting the VMA permissions. > > >> +This means that a running VMA may appear to allow access to an enclave > > >> +page that is not allowed by its EPCM permissions. For example, when an > > >> +enclave page with RW EPCM permissions is mapped by a RW VMA but is > > >> +subsequently changed to have read-only EPCM permissions. The kernel > > >> +continues to maintain correct access to the enclave page through the > > >> +PTE that will ensure that only access allowed by both the VMA > > >> +and EPCM permissions are permitted. 
> > >> + > > >> Application interface > > >> ===================== > > >> > > >> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > > >> index 48afe96ae0f0..b6105d9e7c46 100644 > > >> --- a/arch/x86/kernel/cpu/sgx/encl.c > > >> +++ b/arch/x86/kernel/cpu/sgx/encl.c > > >> @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, > > >> } > > >> > > >> static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, > > >> - unsigned long addr, > > >> - unsigned long vm_flags) > > >> + unsigned long addr) > > >> { > > >> - unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); > > >> struct sgx_epc_page *epc_page; > > >> struct sgx_encl_page *entry; > > >> > > >> @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, > > >> if (!entry) > > >> return ERR_PTR(-EFAULT); > > >> > > >> - /* > > >> - * Verify that the faulted page has equal or higher build time > > >> - * permissions than the VMA permissions (i.e. the subset of {VM_READ, > > >> - * VM_WRITE, VM_EXECUTE} in vma->vm_flags). > > >> - */ > > >> - if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) > > >> - return ERR_PTR(-EFAULT); > > >> - > > >> /* Entry successfully located. 
*/ > > >> if (entry->epc_page) { > > >> if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED) > > >> @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > > >> { > > >> unsigned long addr = (unsigned long)vmf->address; > > >> struct vm_area_struct *vma = vmf->vma; > > >> + unsigned long page_prot_bits; > > >> struct sgx_encl_page *entry; > > >> + unsigned long vm_prot_bits; > > >> unsigned long phys_addr; > > >> struct sgx_encl *encl; > > >> vm_fault_t ret; > > >> @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > > >> > > >> mutex_lock(&encl->lock); > > >> > > >> - entry = sgx_encl_load_page(encl, addr, vma->vm_flags); > > >> + entry = sgx_encl_load_page(encl, addr); > > >> if (IS_ERR(entry)) { > > >> mutex_unlock(&encl->lock); > > > > > >> @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > > >> > > >> phys_addr = sgx_get_epc_phys_addr(entry->epc_page); > > >> > > >> - ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); > > >> + /* > > >> + * Insert PTE to match the EPCM page permissions ensured to not > > >> + * exceed the VMA permissions. > > >> + */ > > >> + vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC); > > >> + page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits; > > >> + /* > > >> + * Add VM_SHARED so that PTE is made writable right away if VMA > > >> + * and EPCM are writable (no COW in SGX). > > >> + */ > > >> + page_prot_bits |= (vma->vm_flags & VM_SHARED); > > >> + ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr), > > >> + vm_get_page_prot(page_prot_bits)); > > >> if (ret != VM_FAULT_NOPAGE) { > > >> mutex_unlock(&encl->lock); > > >> > > >> @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag > > >> * Load an enclave page to EPC if required, and take encl->lock. 
> > >> */ > > >> static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl, > > >> - unsigned long addr, > > >> - unsigned long vm_flags) > > >> + unsigned long addr) > > >> { > > >> struct sgx_encl_page *entry; > > >> > > >> for ( ; ; ) { > > >> mutex_lock(&encl->lock); > > >> > > >> - entry = sgx_encl_load_page(encl, addr, vm_flags); > > >> + entry = sgx_encl_load_page(encl, addr); > > >> if (PTR_ERR(entry) != -EBUSY) > > >> break; > > >> > > >> @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, > > >> return -EFAULT; > > >> > > >> for (i = 0; i < len; i += cnt) { > > >> - entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK, > > >> - vma->vm_flags); > > >> + entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK); > > >> if (IS_ERR(entry)) { > > >> ret = PTR_ERR(entry); > > >> break; > > >> -- > > >> 2.25.1 > > >> > > > > > > If you unconditionally set vm_max_prot_bits to RWX for dynamically created > > > pags, you would not need to do this. > > > > > > These patches could be then safely dropped then: > > > > > > - [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions > > > - [PATCH V2 08/32] x86/sgx: x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes > > > - [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions > > > > > > And that would also keep full ABI compatibility without exceptions to the > > > existing mainline code. > > > > > > > Dropping these changes do not just impact dynamically created pages. Dropping > > these patches would result in EPCM page permission restriction being supported > > for all pages, those added before enclave initialization as well as dynamically > > added pages, but their PTEs will not be impacted. 
> > > > For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and > > then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS then Linux > > would keep allowing and installing RW PTEs to this page. > > I think that would be perfectly fine, if someone wants to do that. There is > no collateral damage in doing that. The kernel does not get messed up because of > it. It's a use case that does not make sense in the first place, so it'd > be stupid to build anything extensive around it in the kernel. > > Shooting yourself in the foot is something that the kernel does not and should not > protect user space from, unless there is a risk of messing up the state of the > kernel itself. > > Much worse is that we have e.g. a completely artificial ioctl, > SGX_IOC_ENCLAVE_RELAX_PERMISSIONS, to support this scheme, which could e.g. > cause extra round trips for a simple EMODPE. > > Also, this means not having to include 06/32, which keeps 100% backwards > compatibility in run-time behaviour with the mainline while not restricting > dynamically created pages at all. And we get rid of the complex bookkeeping > of vm_run_prot_bits. > > And generally the whole model is then very easy to understand and explain. > If I had to give a presentation on the current mess in this patch set at a > conference, I can honestly say that I would be in serious trouble. It's > not a clean and clear security model, which is a risk by itself. I.e.: 1. For EADD'd pages: stick to what has been the invariant for 1.5 years now. Do not change it by any means (e.g. 06/32). 2. For EAUG'd pages: set vm_max_prot_bits to RWX, which essentially means do whatever you want with PTEs and the EPCM. It's a clear and understandable model that does nothing bad to the kernel, and a run-time developer can surely find a way to keep things going. For user space, the most important thing is clarity in kernel behaviour, and this does deliver that clarity. It's not perfect, but it does do the job and anyone can get it. BR, Jarkko
On Tue, Mar 08, 2022 at 11:06:46AM +0200, Jarkko Sakkinen wrote: > On Tue, Mar 08, 2022 at 10:14:42AM +0200, Jarkko Sakkinen wrote: > > On Mon, Mar 07, 2022 at 09:36:36AM -0800, Reinette Chatre wrote: > > > Hi Jarkko, > > > > > > On 3/7/2022 9:10 AM, Jarkko Sakkinen wrote: > > > > On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote: > > > >> === Summary === > > > >> > > > >> An SGX VMA can only be created if its permissions are the same or > > > >> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA > > > >> creation this same rule is again enforced by the page fault handler: > > > >> faulted enclave pages are required to have equal or more relaxed > > > >> EPCM permissions than the VMA permissions. > > > >> > > > >> On SGX1 systems the additional enforcement in the page fault handler > > > >> is redundant and on SGX2 systems it incorrectly prevents access. > > > >> On SGX1 systems it is unnecessary to repeat the enforcement of the > > > >> permission rule. The rule used during original VMA creation will > > > >> ensure that any access attempt will use correct permissions. > > > >> With SGX2 the EPCM permissions of a page can change after VMA > > > >> creation resulting in the VMA permissions potentially being more > > > >> relaxed than the EPCM permissions and the page fault handler > > > >> incorrectly blocking valid access attempts. > > > >> > > > >> Enable the VMA's pages to remain accessible while ensuring that > > > >> the PTEs are installed to match the EPCM permissions but not be > > > >> more relaxed than the VMA permissions. > > > >> > > > >> === Full Changelog === > > > >> > > > >> An SGX enclave is an area of memory where parts of an application > > > >> can reside. 
First an enclave is created and loaded (from > > > >> non-enclave memory) with the code and data of an application, > > > >> then user space can map (mmap()) the enclave memory to > > > >> be able to enter the enclave at its defined entry points for > > > >> execution within it. > > > >> > > > >> The hardware maintains a secure structure, the Enclave Page Cache Map > > > >> (EPCM), that tracks the contents of the enclave. Of interest here is > > > >> its tracking of the enclave page permissions. When a page is loaded > > > >> into the enclave its permissions are specified and recorded in the > > > >> EPCM. In parallel the kernel maintains permissions within the > > > >> page table entries (PTEs) and the rule is that PTE permissions > > > >> are not allowed to be more relaxed than the EPCM permissions. > > > >> > > > >> A new mapping (mmap()) of enclave memory can only succeed if the > > > >> mapping has the same or weaker permissions than the permissions that > > > >> were vetted during enclave creation. This is enforced by > > > >> sgx_encl_may_map() that is called on the mmap() as well as mprotect() > > > >> paths. This rule remains. > > > >> > > > >> One feature of SGX2 is to support the modification of EPCM permissions > > > >> after enclave initialization. Enclave pages may thus already be part > > > >> of a VMA at the time their EPCM permissions are changed resulting > > > >> in the VMA's permissions potentially being more relaxed than the EPCM > > > >> permissions. > > > >> > > > >> Allow permissions of existing VMAs to be more relaxed than EPCM > > > >> permissions in preparation for dynamic EPCM permission changes > > > >> made possible in SGX2. New VMAs that attempt to have more relaxed > > > >> permissions than EPCM permissions continue to be unsupported. 
> > > >> > > > >> Reasons why permissions of existing VMAs are allowed to be more relaxed > > > >> than EPCM permissions instead of dynamically changing VMA permissions > > > >> when EPCM permissions change are: > > > >> 1) Changing VMA permissions involve splitting VMAs which is an > > > >> operation that can fail. Additionally changing EPCM permissions of > > > >> a range of pages could also fail on any of the pages involved. > > > >> Handling these error cases causes problems. For example, if an > > > >> EPCM permission change fails and the VMA has already been split > > > >> then it is not possible to undo the VMA split nor possible to > > > >> undo the EPCM permission changes that did succeed before the > > > >> failure. > > > >> 2) The kernel has little insight into the user space where EPCM > > > >> permissions are controlled from. For example, a RW page may > > > >> be made RO just before it is made RX and splitting the VMAs > > > >> while the VMAs may change soon is unnecessary. > > > >> > > > >> Remove the extra permission check called on a page fault > > > >> (vm_operations_struct->fault) or during debugging > > > >> (vm_operations_struct->access) when loading the enclave page from swap > > > >> that ensures that the VMA permissions are not more relaxed than the > > > >> EPCM permissions. Since a VMA could only exist if it passed the > > > >> original permission checks during mmap() and a VMA may indeed > > > >> have more relaxed permissions than the EPCM permissions this extra > > > >> permission check is no longer appropriate. > > > >> > > > >> With the permission check removed, ensure that PTEs do > > > >> not blindly inherit the VMA permissions but instead the permissions > > > >> that the VMA and EPCM agree on. PTEs for writable pages (from VMA > > > >> and enclave perspective) are installed with the writable bit set, > > > >> reducing the need for this additional flow to the permission mismatch > > > >> cases handled next. 
> > > >> > > > >> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> > > > >> --- > > > >> Changes since V1: > > > >> - Reword commit message (Jarkko). > > > >> - Use "relax" instead of "exceed" when referring to permissions (Dave). > > > >> - Add snippet to Documentation/x86/sgx.rst that highlights the > > > >> relationship between VMA, EPCM, and PTE permissions on SGX > > > >> systems (Andy). > > > >> > > > >> Documentation/x86/sgx.rst | 10 +++++++++ > > > >> arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++---------------- > > > >> 2 files changed, 30 insertions(+), 18 deletions(-) > > > >> > > > >> diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst > > > >> index 89ff924b1480..5659932728a5 100644 > > > >> --- a/Documentation/x86/sgx.rst > > > >> +++ b/Documentation/x86/sgx.rst > > > >> @@ -99,6 +99,16 @@ The relationships between the different permission masks are: > > > >> * PTEs are installed to match the EPCM permissions, but not be more > > > >> relaxed than the VMA permissions. > > > >> > > > >> +On systems supporting SGX2 EPCM permissions may change while the > > > >> +enclave page belongs to a VMA without impacting the VMA permissions. > > > >> +This means that a running VMA may appear to allow access to an enclave > > > >> +page that is not allowed by its EPCM permissions. For example, when an > > > >> +enclave page with RW EPCM permissions is mapped by a RW VMA but is > > > >> +subsequently changed to have read-only EPCM permissions. The kernel > > > >> +continues to maintain correct access to the enclave page through the > > > >> +PTE that will ensure that only access allowed by both the VMA > > > >> +and EPCM permissions are permitted. 
> > > >> + > > > >> Application interface > > > >> ===================== > > > >> > > > >> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > > > >> index 48afe96ae0f0..b6105d9e7c46 100644 > > > >> --- a/arch/x86/kernel/cpu/sgx/encl.c > > > >> +++ b/arch/x86/kernel/cpu/sgx/encl.c > > > >> @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, > > > >> } > > > >> > > > >> static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, > > > >> - unsigned long addr, > > > >> - unsigned long vm_flags) > > > >> + unsigned long addr) > > > >> { > > > >> - unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); > > > >> struct sgx_epc_page *epc_page; > > > >> struct sgx_encl_page *entry; > > > >> > > > >> @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, > > > >> if (!entry) > > > >> return ERR_PTR(-EFAULT); > > > >> > > > >> - /* > > > >> - * Verify that the faulted page has equal or higher build time > > > >> - * permissions than the VMA permissions (i.e. the subset of {VM_READ, > > > >> - * VM_WRITE, VM_EXECUTE} in vma->vm_flags). > > > >> - */ > > > >> - if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) > > > >> - return ERR_PTR(-EFAULT); > > > >> - > > > >> /* Entry successfully located. 
*/ > > > >> if (entry->epc_page) { > > > >> if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED) > > > >> @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > > > >> { > > > >> unsigned long addr = (unsigned long)vmf->address; > > > >> struct vm_area_struct *vma = vmf->vma; > > > >> + unsigned long page_prot_bits; > > > >> struct sgx_encl_page *entry; > > > >> + unsigned long vm_prot_bits; > > > >> unsigned long phys_addr; > > > >> struct sgx_encl *encl; > > > >> vm_fault_t ret; > > > >> @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > > > >> > > > >> mutex_lock(&encl->lock); > > > >> > > > >> - entry = sgx_encl_load_page(encl, addr, vma->vm_flags); > > > >> + entry = sgx_encl_load_page(encl, addr); > > > >> if (IS_ERR(entry)) { > > > >> mutex_unlock(&encl->lock); > > > > > > > >> @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > > > >> > > > >> phys_addr = sgx_get_epc_phys_addr(entry->epc_page); > > > >> > > > >> - ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); > > > >> + /* > > > >> + * Insert PTE to match the EPCM page permissions ensured to not > > > >> + * exceed the VMA permissions. > > > >> + */ > > > >> + vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC); > > > >> + page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits; > > > >> + /* > > > >> + * Add VM_SHARED so that PTE is made writable right away if VMA > > > >> + * and EPCM are writable (no COW in SGX). > > > >> + */ > > > >> + page_prot_bits |= (vma->vm_flags & VM_SHARED); > > > >> + ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr), > > > >> + vm_get_page_prot(page_prot_bits)); > > > >> if (ret != VM_FAULT_NOPAGE) { > > > >> mutex_unlock(&encl->lock); > > > >> > > > >> @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag > > > >> * Load an enclave page to EPC if required, and take encl->lock. 
> > > >> */ > > > >> static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl, > > > >> - unsigned long addr, > > > >> - unsigned long vm_flags) > > > >> + unsigned long addr) > > > >> { > > > >> struct sgx_encl_page *entry; > > > >> > > > >> for ( ; ; ) { > > > >> mutex_lock(&encl->lock); > > > >> > > > >> - entry = sgx_encl_load_page(encl, addr, vm_flags); > > > >> + entry = sgx_encl_load_page(encl, addr); > > > >> if (PTR_ERR(entry) != -EBUSY) > > > >> break; > > > >> > > > >> @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, > > > >> return -EFAULT; > > > >> > > > >> for (i = 0; i < len; i += cnt) { > > > >> - entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK, > > > >> - vma->vm_flags); > > > >> + entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK); > > > >> if (IS_ERR(entry)) { > > > >> ret = PTR_ERR(entry); > > > >> break; > > > >> -- > > > >> 2.25.1 > > > >> > > > > > > > > If you unconditionally set vm_max_prot_bits to RWX for dynamically created > > > > pags, you would not need to do this. > > > > > > > > These patches could be then safely dropped then: > > > > > > > > - [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions > > > > - [PATCH V2 08/32] x86/sgx: x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes > > > > - [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions > > > > > > > > And that would also keep full ABI compatibility without exceptions to the > > > > existing mainline code. > > > > > > > > > > Dropping these changes do not just impact dynamically created pages. Dropping > > > these patches would result in EPCM page permission restriction being supported > > > for all pages, those added before enclave initialization as well as dynamically > > > added pages, but their PTEs will not be impacted. 
> > > > > > For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and > > > then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS then Linux > > > would keep allowing and installing RW PTEs to this page. > > > > I think that would be perfectly fine, if someone wants to do that. There is > > no collateral damage in doing that. The kernel does not get messed up because of > > it. It's a use case that does not make sense in the first place, so it'd > > be stupid to build anything extensive around it in the kernel. > > > > Shooting yourself in the foot is something that the kernel does not and should not > > protect user space from, unless there is a risk of messing up the state of the > > kernel itself. > > > > Much worse is that we have e.g. a completely artificial ioctl, > > SGX_IOC_ENCLAVE_RELAX_PERMISSIONS, to support this scheme, which could e.g. > > cause extra round trips for a simple EMODPE. > > > > Also, this means not having to include 06/32, which keeps 100% backwards > > compatibility in run-time behaviour with the mainline while not restricting > > dynamically created pages at all. And we get rid of the complex bookkeeping > > of vm_run_prot_bits. > > > > And generally the whole model is then very easy to understand and explain. > > If I had to give a presentation on the current mess in this patch set at a > > conference, I can honestly say that I would be in serious trouble. It's > > not a clean and clear security model, which is a risk by itself. > > I.e.: > > 1. For EADD'd pages: stick to what has been the invariant for 1.5 years now. Do > not change it by any means (e.g. 06/32). > 2. For EAUG'd pages: set vm_max_prot_bits to RWX, which essentially means do > whatever you want with PTEs and the EPCM. > > It's a clear and understandable model that does nothing bad to the kernel, > and a run-time developer can surely find a way to keep things going. For > user space, the most important thing is clarity in kernel behaviour, > and this does deliver that clarity.
It's not perfect, but it does do the > job and anyone can get it. Also, a quantitative argument for this is that simplifying the security model this way means one ioctl less, which must be counted as a +1. We do not want to add new ioctls unless it is something we absolutely cannot live without. We absolutely can live without SGX_IOC_ENCLAVE_RELAX_PERMISSIONS. BR, Jarkko
Hi Jarkko, On 3/8/2022 1:12 AM, Jarkko Sakkinen wrote: > On Tue, Mar 08, 2022 at 11:06:46AM +0200, Jarkko Sakkinen wrote: >> On Tue, Mar 08, 2022 at 10:14:42AM +0200, Jarkko Sakkinen wrote: >>> On Mon, Mar 07, 2022 at 09:36:36AM -0800, Reinette Chatre wrote: >>>> Hi Jarkko, >>>> >>>> On 3/7/2022 9:10 AM, Jarkko Sakkinen wrote: >>>>> On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote: >>>>>> === Summary === >>>>>> >>>>>> An SGX VMA can only be created if its permissions are the same or >>>>>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA >>>>>> creation this same rule is again enforced by the page fault handler: >>>>>> faulted enclave pages are required to have equal or more relaxed >>>>>> EPCM permissions than the VMA permissions. >>>>>> >>>>>> On SGX1 systems the additional enforcement in the page fault handler >>>>>> is redundant and on SGX2 systems it incorrectly prevents access. >>>>>> On SGX1 systems it is unnecessary to repeat the enforcement of the >>>>>> permission rule. The rule used during original VMA creation will >>>>>> ensure that any access attempt will use correct permissions. >>>>>> With SGX2 the EPCM permissions of a page can change after VMA >>>>>> creation resulting in the VMA permissions potentially being more >>>>>> relaxed than the EPCM permissions and the page fault handler >>>>>> incorrectly blocking valid access attempts. >>>>>> >>>>>> Enable the VMA's pages to remain accessible while ensuring that >>>>>> the PTEs are installed to match the EPCM permissions but not be >>>>>> more relaxed than the VMA permissions. >>>>>> >>>>>> === Full Changelog === >>>>>> >>>>>> An SGX enclave is an area of memory where parts of an application >>>>>> can reside. 
First an enclave is created and loaded (from >>>>>> non-enclave memory) with the code and data of an application, >>>>>> then user space can map (mmap()) the enclave memory to >>>>>> be able to enter the enclave at its defined entry points for >>>>>> execution within it. >>>>>> >>>>>> The hardware maintains a secure structure, the Enclave Page Cache Map >>>>>> (EPCM), that tracks the contents of the enclave. Of interest here is >>>>>> its tracking of the enclave page permissions. When a page is loaded >>>>>> into the enclave its permissions are specified and recorded in the >>>>>> EPCM. In parallel the kernel maintains permissions within the >>>>>> page table entries (PTEs) and the rule is that PTE permissions >>>>>> are not allowed to be more relaxed than the EPCM permissions. >>>>>> >>>>>> A new mapping (mmap()) of enclave memory can only succeed if the >>>>>> mapping has the same or weaker permissions than the permissions that >>>>>> were vetted during enclave creation. This is enforced by >>>>>> sgx_encl_may_map() that is called on the mmap() as well as mprotect() >>>>>> paths. This rule remains. >>>>>> >>>>>> One feature of SGX2 is to support the modification of EPCM permissions >>>>>> after enclave initialization. Enclave pages may thus already be part >>>>>> of a VMA at the time their EPCM permissions are changed resulting >>>>>> in the VMA's permissions potentially being more relaxed than the EPCM >>>>>> permissions. >>>>>> >>>>>> Allow permissions of existing VMAs to be more relaxed than EPCM >>>>>> permissions in preparation for dynamic EPCM permission changes >>>>>> made possible in SGX2. New VMAs that attempt to have more relaxed >>>>>> permissions than EPCM permissions continue to be unsupported. 
>>>>>> >>>>>> Reasons why permissions of existing VMAs are allowed to be more relaxed >>>>>> than EPCM permissions instead of dynamically changing VMA permissions >>>>>> when EPCM permissions change are: >>>>>> 1) Changing VMA permissions involve splitting VMAs which is an >>>>>> operation that can fail. Additionally changing EPCM permissions of >>>>>> a range of pages could also fail on any of the pages involved. >>>>>> Handling these error cases causes problems. For example, if an >>>>>> EPCM permission change fails and the VMA has already been split >>>>>> then it is not possible to undo the VMA split nor possible to >>>>>> undo the EPCM permission changes that did succeed before the >>>>>> failure. >>>>>> 2) The kernel has little insight into the user space where EPCM >>>>>> permissions are controlled from. For example, a RW page may >>>>>> be made RO just before it is made RX and splitting the VMAs >>>>>> while the VMAs may change soon is unnecessary. >>>>>> >>>>>> Remove the extra permission check called on a page fault >>>>>> (vm_operations_struct->fault) or during debugging >>>>>> (vm_operations_struct->access) when loading the enclave page from swap >>>>>> that ensures that the VMA permissions are not more relaxed than the >>>>>> EPCM permissions. Since a VMA could only exist if it passed the >>>>>> original permission checks during mmap() and a VMA may indeed >>>>>> have more relaxed permissions than the EPCM permissions this extra >>>>>> permission check is no longer appropriate. >>>>>> >>>>>> With the permission check removed, ensure that PTEs do >>>>>> not blindly inherit the VMA permissions but instead the permissions >>>>>> that the VMA and EPCM agree on. PTEs for writable pages (from VMA >>>>>> and enclave perspective) are installed with the writable bit set, >>>>>> reducing the need for this additional flow to the permission mismatch >>>>>> cases handled next. 
>>>>>> >>>>>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> >>>>>> --- >>>>>> Changes since V1: >>>>>> - Reword commit message (Jarkko). >>>>>> - Use "relax" instead of "exceed" when referring to permissions (Dave). >>>>>> - Add snippet to Documentation/x86/sgx.rst that highlights the >>>>>> relationship between VMA, EPCM, and PTE permissions on SGX >>>>>> systems (Andy). >>>>>> >>>>>> Documentation/x86/sgx.rst | 10 +++++++++ >>>>>> arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++---------------- >>>>>> 2 files changed, 30 insertions(+), 18 deletions(-) >>>>>> >>>>>> diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst >>>>>> index 89ff924b1480..5659932728a5 100644 >>>>>> --- a/Documentation/x86/sgx.rst >>>>>> +++ b/Documentation/x86/sgx.rst >>>>>> @@ -99,6 +99,16 @@ The relationships between the different permission masks are: >>>>>> * PTEs are installed to match the EPCM permissions, but not be more >>>>>> relaxed than the VMA permissions. >>>>>> >>>>>> +On systems supporting SGX2 EPCM permissions may change while the >>>>>> +enclave page belongs to a VMA without impacting the VMA permissions. >>>>>> +This means that a running VMA may appear to allow access to an enclave >>>>>> +page that is not allowed by its EPCM permissions. For example, when an >>>>>> +enclave page with RW EPCM permissions is mapped by a RW VMA but is >>>>>> +subsequently changed to have read-only EPCM permissions. The kernel >>>>>> +continues to maintain correct access to the enclave page through the >>>>>> +PTE that will ensure that only access allowed by both the VMA >>>>>> +and EPCM permissions are permitted. 
>>>>>> + >>>>>> Application interface >>>>>> ===================== >>>>>> >>>>>> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c >>>>>> index 48afe96ae0f0..b6105d9e7c46 100644 >>>>>> --- a/arch/x86/kernel/cpu/sgx/encl.c >>>>>> +++ b/arch/x86/kernel/cpu/sgx/encl.c >>>>>> @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, >>>>>> } >>>>>> >>>>>> static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, >>>>>> - unsigned long addr, >>>>>> - unsigned long vm_flags) >>>>>> + unsigned long addr) >>>>>> { >>>>>> - unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); >>>>>> struct sgx_epc_page *epc_page; >>>>>> struct sgx_encl_page *entry; >>>>>> >>>>>> @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, >>>>>> if (!entry) >>>>>> return ERR_PTR(-EFAULT); >>>>>> >>>>>> - /* >>>>>> - * Verify that the faulted page has equal or higher build time >>>>>> - * permissions than the VMA permissions (i.e. the subset of {VM_READ, >>>>>> - * VM_WRITE, VM_EXECUTE} in vma->vm_flags). >>>>>> - */ >>>>>> - if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) >>>>>> - return ERR_PTR(-EFAULT); >>>>>> - >>>>>> /* Entry successfully located. 
*/ >>>>>> if (entry->epc_page) { >>>>>> if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED) >>>>>> @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) >>>>>> { >>>>>> unsigned long addr = (unsigned long)vmf->address; >>>>>> struct vm_area_struct *vma = vmf->vma; >>>>>> + unsigned long page_prot_bits; >>>>>> struct sgx_encl_page *entry; >>>>>> + unsigned long vm_prot_bits; >>>>>> unsigned long phys_addr; >>>>>> struct sgx_encl *encl; >>>>>> vm_fault_t ret; >>>>>> @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) >>>>>> >>>>>> mutex_lock(&encl->lock); >>>>>> >>>>>> - entry = sgx_encl_load_page(encl, addr, vma->vm_flags); >>>>>> + entry = sgx_encl_load_page(encl, addr); >>>>>> if (IS_ERR(entry)) { >>>>>> mutex_unlock(&encl->lock); >>>>> >>>>>> @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) >>>>>> >>>>>> phys_addr = sgx_get_epc_phys_addr(entry->epc_page); >>>>>> >>>>>> - ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); >>>>>> + /* >>>>>> + * Insert PTE to match the EPCM page permissions ensured to not >>>>>> + * exceed the VMA permissions. >>>>>> + */ >>>>>> + vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC); >>>>>> + page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits; >>>>>> + /* >>>>>> + * Add VM_SHARED so that PTE is made writable right away if VMA >>>>>> + * and EPCM are writable (no COW in SGX). >>>>>> + */ >>>>>> + page_prot_bits |= (vma->vm_flags & VM_SHARED); >>>>>> + ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr), >>>>>> + vm_get_page_prot(page_prot_bits)); >>>>>> if (ret != VM_FAULT_NOPAGE) { >>>>>> mutex_unlock(&encl->lock); >>>>>> >>>>>> @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag >>>>>> * Load an enclave page to EPC if required, and take encl->lock. 
>>>>>> */ >>>>>> static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl, >>>>>> - unsigned long addr, >>>>>> - unsigned long vm_flags) >>>>>> + unsigned long addr) >>>>>> { >>>>>> struct sgx_encl_page *entry; >>>>>> >>>>>> for ( ; ; ) { >>>>>> mutex_lock(&encl->lock); >>>>>> >>>>>> - entry = sgx_encl_load_page(encl, addr, vm_flags); >>>>>> + entry = sgx_encl_load_page(encl, addr); >>>>>> if (PTR_ERR(entry) != -EBUSY) >>>>>> break; >>>>>> >>>>>> @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, >>>>>> return -EFAULT; >>>>>> >>>>>> for (i = 0; i < len; i += cnt) { >>>>>> - entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK, >>>>>> - vma->vm_flags); >>>>>> + entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK); >>>>>> if (IS_ERR(entry)) { >>>>>> ret = PTR_ERR(entry); >>>>>> break; >>>>>> -- >>>>>> 2.25.1 >>>>>> >>>>> >>>>> If you unconditionally set vm_max_prot_bits to RWX for dynamically created >>>>> pags, you would not need to do this. >>>>> >>>>> These patches could be then safely dropped then: >>>>> >>>>> - [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions >>>>> - [PATCH V2 08/32] x86/sgx: x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes >>>>> - [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions >>>>> >>>>> And that would also keep full ABI compatibility without exceptions to the >>>>> existing mainline code. >>>>> >>>> >>>> Dropping these changes do not just impact dynamically created pages. Dropping >>>> these patches would result in EPCM page permission restriction being supported >>>> for all pages, those added before enclave initialization as well as dynamically >>>> added pages, but their PTEs will not be impacted. 
>>>> >>>> For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and >>>> then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS then Linux >>>> would keep allowing and installing RW PTEs to this page. >>> >>> I think that would be perfectly fine, if someone wants to do that. There is >>> no collateral damage in doing that. The kernel does not get messed up because of >>> that. It's a use case that does not make sense in the first place, so it'd >>> be stupid to build anything extensive around it in the kernel. >>> >>> Shooting yourself in the foot is something that the kernel does not and should not >>> protect user space from unless there is a risk of messing up the state of the >>> kernel itself. >>> >>> Much worse is that we have e.g. the completely artificial ioctl >>> SGX_IOC_ENCLAVE_RELAX_PERMISSIONS to support this scheme, which could e.g. >>> cause extra roundtrips for a simple EMODPE. >>> >>> Also this means not having to include 06/32, which keeps 100% backwards >>> compatibility in run-time behaviour with the mainline while not restricting >>> dynamically created pages at all. And we get rid of the complex bookkeeping >>> of vm_run_prot_bits. >>> >>> And generally the whole model is then very easy to understand and explain. >>> If I had to give a presentation on the current mess in the patch set at a >>> conference, I can honestly say that I would be in serious trouble. It's >>> not a clean and clear security model, which is a risk by itself. >> >> I.e. >> >> 1. For EADD'd pages: stick to what has been the invariant for 1.5 years now. Do >> not change it by any means (e.g. 06/32). >> 2. For EAUG'd pages: set vm_max_prot_bits to RWX, which essentially means do >> whatever you want with PTEs and the EPCM. >> >> It's a clear and understandable model that does nothing bad to the kernel, >> and a run-time developer can surely find a way to get things going. 
For >> user space, the most important thing is the clarity of kernel behaviour, >> and this does deliver that clarity. It's not perfect but it does do the >> job and anyone can get it. > > Also a quantitative argument for this is that by simplifying the security model > this way it is one ioctl less, which must be considered a +1. We do not > want to add new ioctls unless it is something we absolutely cannot live > without. We absolutely can live without SGX_IOC_ENCLAVE_RELAX_PERMISSIONS. > OK, with the implications understood and accepted I will proceed with a new series that separates EPCM permissions from PTEs and makes RWX PTEs possible by default for EAUG pages. This has a broader impact than just removing the three patches you list. "[PATCH 07/32] x86/sgx: Add pfn_mkwrite() handler for present PTEs" is also no longer needed and there is no longer a need to flush PTEs after restricting permissions. New changes also need to be considered - at least to the current documentation. I'll rework the series. Reinette
On Tue, Mar 08, 2022 at 08:04:33AM -0800, Reinette Chatre wrote: > Hi Jarkko, > > On 3/8/2022 1:12 AM, Jarkko Sakkinen wrote: > > On Tue, Mar 08, 2022 at 11:06:46AM +0200, Jarkko Sakkinen wrote: > >> On Tue, Mar 08, 2022 at 10:14:42AM +0200, Jarkko Sakkinen wrote: > >>> On Mon, Mar 07, 2022 at 09:36:36AM -0800, Reinette Chatre wrote: > >>>> Hi Jarkko, > >>>> > >>>> On 3/7/2022 9:10 AM, Jarkko Sakkinen wrote: > >>>>> On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote: > >>>>>> === Summary === > >>>>>> > >>>>>> An SGX VMA can only be created if its permissions are the same or > >>>>>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA > >>>>>> creation this same rule is again enforced by the page fault handler: > >>>>>> faulted enclave pages are required to have equal or more relaxed > >>>>>> EPCM permissions than the VMA permissions. > >>>>>> > >>>>>> On SGX1 systems the additional enforcement in the page fault handler > >>>>>> is redundant and on SGX2 systems it incorrectly prevents access. > >>>>>> On SGX1 systems it is unnecessary to repeat the enforcement of the > >>>>>> permission rule. The rule used during original VMA creation will > >>>>>> ensure that any access attempt will use correct permissions. > >>>>>> With SGX2 the EPCM permissions of a page can change after VMA > >>>>>> creation resulting in the VMA permissions potentially being more > >>>>>> relaxed than the EPCM permissions and the page fault handler > >>>>>> incorrectly blocking valid access attempts. > >>>>>> > >>>>>> Enable the VMA's pages to remain accessible while ensuring that > >>>>>> the PTEs are installed to match the EPCM permissions but not be > >>>>>> more relaxed than the VMA permissions. > >>>>>> > >>>>>> === Full Changelog === > >>>>>> > >>>>>> An SGX enclave is an area of memory where parts of an application > >>>>>> can reside. 
First an enclave is created and loaded (from > >>>>>> non-enclave memory) with the code and data of an application, > >>>>>> then user space can map (mmap()) the enclave memory to > >>>>>> be able to enter the enclave at its defined entry points for > >>>>>> execution within it. > >>>>>> > >>>>>> The hardware maintains a secure structure, the Enclave Page Cache Map > >>>>>> (EPCM), that tracks the contents of the enclave. Of interest here is > >>>>>> its tracking of the enclave page permissions. When a page is loaded > >>>>>> into the enclave its permissions are specified and recorded in the > >>>>>> EPCM. In parallel the kernel maintains permissions within the > >>>>>> page table entries (PTEs) and the rule is that PTE permissions > >>>>>> are not allowed to be more relaxed than the EPCM permissions. > >>>>>> > >>>>>> A new mapping (mmap()) of enclave memory can only succeed if the > >>>>>> mapping has the same or weaker permissions than the permissions that > >>>>>> were vetted during enclave creation. This is enforced by > >>>>>> sgx_encl_may_map() that is called on the mmap() as well as mprotect() > >>>>>> paths. This rule remains. > >>>>>> > >>>>>> One feature of SGX2 is to support the modification of EPCM permissions > >>>>>> after enclave initialization. Enclave pages may thus already be part > >>>>>> of a VMA at the time their EPCM permissions are changed resulting > >>>>>> in the VMA's permissions potentially being more relaxed than the EPCM > >>>>>> permissions. > >>>>>> > >>>>>> Allow permissions of existing VMAs to be more relaxed than EPCM > >>>>>> permissions in preparation for dynamic EPCM permission changes > >>>>>> made possible in SGX2. New VMAs that attempt to have more relaxed > >>>>>> permissions than EPCM permissions continue to be unsupported. 
> >>>>>> > >>>>>> Reasons why permissions of existing VMAs are allowed to be more relaxed > >>>>>> than EPCM permissions instead of dynamically changing VMA permissions > >>>>>> when EPCM permissions change are: > >>>>>> 1) Changing VMA permissions involve splitting VMAs which is an > >>>>>> operation that can fail. Additionally changing EPCM permissions of > >>>>>> a range of pages could also fail on any of the pages involved. > >>>>>> Handling these error cases causes problems. For example, if an > >>>>>> EPCM permission change fails and the VMA has already been split > >>>>>> then it is not possible to undo the VMA split nor possible to > >>>>>> undo the EPCM permission changes that did succeed before the > >>>>>> failure. > >>>>>> 2) The kernel has little insight into the user space where EPCM > >>>>>> permissions are controlled from. For example, a RW page may > >>>>>> be made RO just before it is made RX and splitting the VMAs > >>>>>> while the VMAs may change soon is unnecessary. > >>>>>> > >>>>>> Remove the extra permission check called on a page fault > >>>>>> (vm_operations_struct->fault) or during debugging > >>>>>> (vm_operations_struct->access) when loading the enclave page from swap > >>>>>> that ensures that the VMA permissions are not more relaxed than the > >>>>>> EPCM permissions. Since a VMA could only exist if it passed the > >>>>>> original permission checks during mmap() and a VMA may indeed > >>>>>> have more relaxed permissions than the EPCM permissions this extra > >>>>>> permission check is no longer appropriate. > >>>>>> > >>>>>> With the permission check removed, ensure that PTEs do > >>>>>> not blindly inherit the VMA permissions but instead the permissions > >>>>>> that the VMA and EPCM agree on. PTEs for writable pages (from VMA > >>>>>> and enclave perspective) are installed with the writable bit set, > >>>>>> reducing the need for this additional flow to the permission mismatch > >>>>>> cases handled next. 
> >>>>>> > >>>>>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> > >>>>>> --- > >>>>>> Changes since V1: > >>>>>> - Reword commit message (Jarkko). > >>>>>> - Use "relax" instead of "exceed" when referring to permissions (Dave). > >>>>>> - Add snippet to Documentation/x86/sgx.rst that highlights the > >>>>>> relationship between VMA, EPCM, and PTE permissions on SGX > >>>>>> systems (Andy). > >>>>>> > >>>>>> Documentation/x86/sgx.rst | 10 +++++++++ > >>>>>> arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++---------------- > >>>>>> 2 files changed, 30 insertions(+), 18 deletions(-) > >>>>>> > >>>>>> diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst > >>>>>> index 89ff924b1480..5659932728a5 100644 > >>>>>> --- a/Documentation/x86/sgx.rst > >>>>>> +++ b/Documentation/x86/sgx.rst > >>>>>> @@ -99,6 +99,16 @@ The relationships between the different permission masks are: > >>>>>> * PTEs are installed to match the EPCM permissions, but not be more > >>>>>> relaxed than the VMA permissions. > >>>>>> > >>>>>> +On systems supporting SGX2 EPCM permissions may change while the > >>>>>> +enclave page belongs to a VMA without impacting the VMA permissions. > >>>>>> +This means that a running VMA may appear to allow access to an enclave > >>>>>> +page that is not allowed by its EPCM permissions. For example, when an > >>>>>> +enclave page with RW EPCM permissions is mapped by a RW VMA but is > >>>>>> +subsequently changed to have read-only EPCM permissions. The kernel > >>>>>> +continues to maintain correct access to the enclave page through the > >>>>>> +PTE that will ensure that only access allowed by both the VMA > >>>>>> +and EPCM permissions are permitted. 
> >>>>>> + > >>>>>> Application interface > >>>>>> ===================== > >>>>>> > >>>>>> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > >>>>>> index 48afe96ae0f0..b6105d9e7c46 100644 > >>>>>> --- a/arch/x86/kernel/cpu/sgx/encl.c > >>>>>> +++ b/arch/x86/kernel/cpu/sgx/encl.c > >>>>>> @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, > >>>>>> } > >>>>>> > >>>>>> static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, > >>>>>> - unsigned long addr, > >>>>>> - unsigned long vm_flags) > >>>>>> + unsigned long addr) > >>>>>> { > >>>>>> - unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); > >>>>>> struct sgx_epc_page *epc_page; > >>>>>> struct sgx_encl_page *entry; > >>>>>> > >>>>>> @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, > >>>>>> if (!entry) > >>>>>> return ERR_PTR(-EFAULT); > >>>>>> > >>>>>> - /* > >>>>>> - * Verify that the faulted page has equal or higher build time > >>>>>> - * permissions than the VMA permissions (i.e. the subset of {VM_READ, > >>>>>> - * VM_WRITE, VM_EXECUTE} in vma->vm_flags). > >>>>>> - */ > >>>>>> - if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) > >>>>>> - return ERR_PTR(-EFAULT); > >>>>>> - > >>>>>> /* Entry successfully located. 
*/ > >>>>>> if (entry->epc_page) { > >>>>>> if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED) > >>>>>> @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > >>>>>> { > >>>>>> unsigned long addr = (unsigned long)vmf->address; > >>>>>> struct vm_area_struct *vma = vmf->vma; > >>>>>> + unsigned long page_prot_bits; > >>>>>> struct sgx_encl_page *entry; > >>>>>> + unsigned long vm_prot_bits; > >>>>>> unsigned long phys_addr; > >>>>>> struct sgx_encl *encl; > >>>>>> vm_fault_t ret; > >>>>>> @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > >>>>>> > >>>>>> mutex_lock(&encl->lock); > >>>>>> > >>>>>> - entry = sgx_encl_load_page(encl, addr, vma->vm_flags); > >>>>>> + entry = sgx_encl_load_page(encl, addr); > >>>>>> if (IS_ERR(entry)) { > >>>>>> mutex_unlock(&encl->lock); > >>>>> > >>>>>> @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) > >>>>>> > >>>>>> phys_addr = sgx_get_epc_phys_addr(entry->epc_page); > >>>>>> > >>>>>> - ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); > >>>>>> + /* > >>>>>> + * Insert PTE to match the EPCM page permissions ensured to not > >>>>>> + * exceed the VMA permissions. > >>>>>> + */ > >>>>>> + vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC); > >>>>>> + page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits; > >>>>>> + /* > >>>>>> + * Add VM_SHARED so that PTE is made writable right away if VMA > >>>>>> + * and EPCM are writable (no COW in SGX). > >>>>>> + */ > >>>>>> + page_prot_bits |= (vma->vm_flags & VM_SHARED); > >>>>>> + ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr), > >>>>>> + vm_get_page_prot(page_prot_bits)); > >>>>>> if (ret != VM_FAULT_NOPAGE) { > >>>>>> mutex_unlock(&encl->lock); > >>>>>> > >>>>>> @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag > >>>>>> * Load an enclave page to EPC if required, and take encl->lock. 
> >>>>>> */ > >>>>>> static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl, > >>>>>> - unsigned long addr, > >>>>>> - unsigned long vm_flags) > >>>>>> + unsigned long addr) > >>>>>> { > >>>>>> struct sgx_encl_page *entry; > >>>>>> > >>>>>> for ( ; ; ) { > >>>>>> mutex_lock(&encl->lock); > >>>>>> > >>>>>> - entry = sgx_encl_load_page(encl, addr, vm_flags); > >>>>>> + entry = sgx_encl_load_page(encl, addr); > >>>>>> if (PTR_ERR(entry) != -EBUSY) > >>>>>> break; > >>>>>> > >>>>>> @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, > >>>>>> return -EFAULT; > >>>>>> > >>>>>> for (i = 0; i < len; i += cnt) { > >>>>>> - entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK, > >>>>>> - vma->vm_flags); > >>>>>> + entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK); > >>>>>> if (IS_ERR(entry)) { > >>>>>> ret = PTR_ERR(entry); > >>>>>> break; > >>>>>> -- > >>>>>> 2.25.1 > >>>>>> > >>>>> > >>>>> If you unconditionally set vm_max_prot_bits to RWX for dynamically created > >>>>> pags, you would not need to do this. > >>>>> > >>>>> These patches could be then safely dropped then: > >>>>> > >>>>> - [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions > >>>>> - [PATCH V2 08/32] x86/sgx: x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes > >>>>> - [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions > >>>>> > >>>>> And that would also keep full ABI compatibility without exceptions to the > >>>>> existing mainline code. > >>>>> > >>>> > >>>> Dropping these changes do not just impact dynamically created pages. Dropping > >>>> these patches would result in EPCM page permission restriction being supported > >>>> for all pages, those added before enclave initialization as well as dynamically > >>>> added pages, but their PTEs will not be impacted. 
> >>>> > >>>> For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and > >>>> then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS then Linux > >>>> would keep allowing and installing RW PTEs to this page. > >>> > >>> I think that would be perfectly fine, if someone wants to do that. There is > >>> no collateral damage in doing that. The kernel does not get messed up because of > >>> that. It's a use case that does not make sense in the first place, so it'd > >>> be stupid to build anything extensive around it in the kernel. > >>> > >>> Shooting yourself in the foot is something that the kernel does not and should not > >>> protect user space from unless there is a risk of messing up the state of the > >>> kernel itself. > >>> > >>> Much worse is that we have e.g. the completely artificial ioctl > >>> SGX_IOC_ENCLAVE_RELAX_PERMISSIONS to support this scheme, which could e.g. > >>> cause extra roundtrips for a simple EMODPE. > >>> > >>> Also this means not having to include 06/32, which keeps 100% backwards > >>> compatibility in run-time behaviour with the mainline while not restricting > >>> dynamically created pages at all. And we get rid of the complex bookkeeping > >>> of vm_run_prot_bits. > >>> > >>> And generally the whole model is then very easy to understand and explain. > >>> If I had to give a presentation on the current mess in the patch set at a > >>> conference, I can honestly say that I would be in serious trouble. It's > >>> not a clean and clear security model, which is a risk by itself. > >> > >> I.e. > >> > >> 1. For EADD'd pages: stick to what has been the invariant for 1.5 years now. Do > >> not change it by any means (e.g. 06/32). > >> 2. For EAUG'd pages: set vm_max_prot_bits to RWX, which essentially means do > >> whatever you want with PTEs and the EPCM. > >> > >> It's a clear and understandable model that does nothing bad to the kernel, > >> and a run-time developer can surely find a way to get things going. 
For > >> user space, the most important thing is the clarity in kernel behaviour, > >> and this does deliver that clarity. It's not perfect but it does do the > >> job and anyone can get it. > > > > Also a quantitive argument for this is that by simplifying security model > > this way it is one ioctl less, which must be considered as +1. We do not > > want to add new ioctls unless it is something we absolutely cannnot live > > without. We absolutely can live without SGX_IOC_ENCLAVE_RELAX_PERMISSIONS. > > > > ok, with the implications understood and accepted I will proceed with a new > series that separates EPCM from PTEs and make RWX PTEs possible by default > for EAUG pages. This has broader impact than just removing > the three patches you list. "[PATCH 07/32] x86/sgx: Add pfn_mkwrite() handler > for present PTEs" is also no longer needed and there is no longer a need > to flush PTEs after restricting permissions. New changes also need to > be considered - at least the current documentation. I'll rework the series. Yes, I really think it is a solid plan. Any possible LSM hooks would most likely attach to build product, not the dynamic behaviour. As far as the page fault handler goes, Haitao is correct after the all discussions that it makes sense. The purpose of MAP_POPULATE series is not to replace it but instead complement it. Just wanted to clear this up as I said otherwise earlier this week. Thank you. BR, Jarkko
Hi Jarkko, On 3/8/2022 9:00 AM, Jarkko Sakkinen wrote: > On Tue, Mar 08, 2022 at 08:04:33AM -0800, Reinette Chatre wrote: >> Hi Jarkko, >> >> On 3/8/2022 1:12 AM, Jarkko Sakkinen wrote: >>> On Tue, Mar 08, 2022 at 11:06:46AM +0200, Jarkko Sakkinen wrote: >>>> On Tue, Mar 08, 2022 at 10:14:42AM +0200, Jarkko Sakkinen wrote: >>>>> On Mon, Mar 07, 2022 at 09:36:36AM -0800, Reinette Chatre wrote: >>>>>> Hi Jarkko, >>>>>> >>>>>> On 3/7/2022 9:10 AM, Jarkko Sakkinen wrote: >>>>>>> On Mon, Feb 07, 2022 at 04:45:28PM -0800, Reinette Chatre wrote: >>>>>>>> === Summary === >>>>>>>> >>>>>>>> An SGX VMA can only be created if its permissions are the same or >>>>>>>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA >>>>>>>> creation this same rule is again enforced by the page fault handler: >>>>>>>> faulted enclave pages are required to have equal or more relaxed >>>>>>>> EPCM permissions than the VMA permissions. >>>>>>>> >>>>>>>> On SGX1 systems the additional enforcement in the page fault handler >>>>>>>> is redundant and on SGX2 systems it incorrectly prevents access. >>>>>>>> On SGX1 systems it is unnecessary to repeat the enforcement of the >>>>>>>> permission rule. The rule used during original VMA creation will >>>>>>>> ensure that any access attempt will use correct permissions. >>>>>>>> With SGX2 the EPCM permissions of a page can change after VMA >>>>>>>> creation resulting in the VMA permissions potentially being more >>>>>>>> relaxed than the EPCM permissions and the page fault handler >>>>>>>> incorrectly blocking valid access attempts. >>>>>>>> >>>>>>>> Enable the VMA's pages to remain accessible while ensuring that >>>>>>>> the PTEs are installed to match the EPCM permissions but not be >>>>>>>> more relaxed than the VMA permissions. >>>>>>>> >>>>>>>> === Full Changelog === >>>>>>>> >>>>>>>> An SGX enclave is an area of memory where parts of an application >>>>>>>> can reside. 
First an enclave is created and loaded (from >>>>>>>> non-enclave memory) with the code and data of an application, >>>>>>>> then user space can map (mmap()) the enclave memory to >>>>>>>> be able to enter the enclave at its defined entry points for >>>>>>>> execution within it. >>>>>>>> >>>>>>>> The hardware maintains a secure structure, the Enclave Page Cache Map >>>>>>>> (EPCM), that tracks the contents of the enclave. Of interest here is >>>>>>>> its tracking of the enclave page permissions. When a page is loaded >>>>>>>> into the enclave its permissions are specified and recorded in the >>>>>>>> EPCM. In parallel the kernel maintains permissions within the >>>>>>>> page table entries (PTEs) and the rule is that PTE permissions >>>>>>>> are not allowed to be more relaxed than the EPCM permissions. >>>>>>>> >>>>>>>> A new mapping (mmap()) of enclave memory can only succeed if the >>>>>>>> mapping has the same or weaker permissions than the permissions that >>>>>>>> were vetted during enclave creation. This is enforced by >>>>>>>> sgx_encl_may_map() that is called on the mmap() as well as mprotect() >>>>>>>> paths. This rule remains. >>>>>>>> >>>>>>>> One feature of SGX2 is to support the modification of EPCM permissions >>>>>>>> after enclave initialization. Enclave pages may thus already be part >>>>>>>> of a VMA at the time their EPCM permissions are changed resulting >>>>>>>> in the VMA's permissions potentially being more relaxed than the EPCM >>>>>>>> permissions. >>>>>>>> >>>>>>>> Allow permissions of existing VMAs to be more relaxed than EPCM >>>>>>>> permissions in preparation for dynamic EPCM permission changes >>>>>>>> made possible in SGX2. New VMAs that attempt to have more relaxed >>>>>>>> permissions than EPCM permissions continue to be unsupported. 
>>>>>>>> >>>>>>>> Reasons why permissions of existing VMAs are allowed to be more relaxed >>>>>>>> than EPCM permissions instead of dynamically changing VMA permissions >>>>>>>> when EPCM permissions change are: >>>>>>>> 1) Changing VMA permissions involve splitting VMAs which is an >>>>>>>> operation that can fail. Additionally changing EPCM permissions of >>>>>>>> a range of pages could also fail on any of the pages involved. >>>>>>>> Handling these error cases causes problems. For example, if an >>>>>>>> EPCM permission change fails and the VMA has already been split >>>>>>>> then it is not possible to undo the VMA split nor possible to >>>>>>>> undo the EPCM permission changes that did succeed before the >>>>>>>> failure. >>>>>>>> 2) The kernel has little insight into the user space where EPCM >>>>>>>> permissions are controlled from. For example, a RW page may >>>>>>>> be made RO just before it is made RX and splitting the VMAs >>>>>>>> while the VMAs may change soon is unnecessary. >>>>>>>> >>>>>>>> Remove the extra permission check called on a page fault >>>>>>>> (vm_operations_struct->fault) or during debugging >>>>>>>> (vm_operations_struct->access) when loading the enclave page from swap >>>>>>>> that ensures that the VMA permissions are not more relaxed than the >>>>>>>> EPCM permissions. Since a VMA could only exist if it passed the >>>>>>>> original permission checks during mmap() and a VMA may indeed >>>>>>>> have more relaxed permissions than the EPCM permissions this extra >>>>>>>> permission check is no longer appropriate. >>>>>>>> >>>>>>>> With the permission check removed, ensure that PTEs do >>>>>>>> not blindly inherit the VMA permissions but instead the permissions >>>>>>>> that the VMA and EPCM agree on. PTEs for writable pages (from VMA >>>>>>>> and enclave perspective) are installed with the writable bit set, >>>>>>>> reducing the need for this additional flow to the permission mismatch >>>>>>>> cases handled next. 
>>>>>>>>
>>>>>>>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>>>>>>>> ---
>>>>>>>> Changes since V1:
>>>>>>>> - Reword commit message (Jarkko).
>>>>>>>> - Use "relax" instead of "exceed" when referring to permissions (Dave).
>>>>>>>> - Add snippet to Documentation/x86/sgx.rst that highlights the
>>>>>>>>   relationship between VMA, EPCM, and PTE permissions on SGX
>>>>>>>>   systems (Andy).
>>>>>>>>
>>>>>>>>  Documentation/x86/sgx.rst      | 10 +++++++++
>>>>>>>>  arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++----------------
>>>>>>>>  2 files changed, 30 insertions(+), 18 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
>>>>>>>> index 89ff924b1480..5659932728a5 100644
>>>>>>>> --- a/Documentation/x86/sgx.rst
>>>>>>>> +++ b/Documentation/x86/sgx.rst
>>>>>>>> @@ -99,6 +99,16 @@ The relationships between the different permission masks are:
>>>>>>>>  * PTEs are installed to match the EPCM permissions, but not be more
>>>>>>>>    relaxed than the VMA permissions.
>>>>>>>>
>>>>>>>> +On systems supporting SGX2 EPCM permissions may change while the
>>>>>>>> +enclave page belongs to a VMA without impacting the VMA permissions.
>>>>>>>> +This means that a running VMA may appear to allow access to an enclave
>>>>>>>> +page that is not allowed by its EPCM permissions. For example, when an
>>>>>>>> +enclave page with RW EPCM permissions is mapped by a RW VMA but is
>>>>>>>> +subsequently changed to have read-only EPCM permissions. The kernel
>>>>>>>> +continues to maintain correct access to the enclave page through the
>>>>>>>> +PTE that will ensure that only access allowed by both the VMA
>>>>>>>> +and EPCM permissions are permitted.
>>>>>>>> +
>>>>>>>>  Application interface
>>>>>>>>  =====================
>>>>>>>>
>>>>>>>> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
>>>>>>>> index 48afe96ae0f0..b6105d9e7c46 100644
>>>>>>>> --- a/arch/x86/kernel/cpu/sgx/encl.c
>>>>>>>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
>>>>>>>> @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
>>>>>>>>  }
>>>>>>>>
>>>>>>>>  static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>>>>>>>> -						unsigned long addr,
>>>>>>>> -						unsigned long vm_flags)
>>>>>>>> +						unsigned long addr)
>>>>>>>>  {
>>>>>>>> -	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
>>>>>>>>  	struct sgx_epc_page *epc_page;
>>>>>>>>  	struct sgx_encl_page *entry;
>>>>>>>>
>>>>>>>> @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>>>>>>>>  	if (!entry)
>>>>>>>>  		return ERR_PTR(-EFAULT);
>>>>>>>>
>>>>>>>> -	/*
>>>>>>>> -	 * Verify that the faulted page has equal or higher build time
>>>>>>>> -	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
>>>>>>>> -	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
>>>>>>>> -	 */
>>>>>>>> -	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
>>>>>>>> -		return ERR_PTR(-EFAULT);
>>>>>>>> -
>>>>>>>>  	/* Entry successfully located. */
>>>>>>>>  	if (entry->epc_page) {
>>>>>>>>  		if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
>>>>>>>> @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>>>>>>>>  {
>>>>>>>>  	unsigned long addr = (unsigned long)vmf->address;
>>>>>>>>  	struct vm_area_struct *vma = vmf->vma;
>>>>>>>> +	unsigned long page_prot_bits;
>>>>>>>>  	struct sgx_encl_page *entry;
>>>>>>>> +	unsigned long vm_prot_bits;
>>>>>>>>  	unsigned long phys_addr;
>>>>>>>>  	struct sgx_encl *encl;
>>>>>>>>  	vm_fault_t ret;
>>>>>>>> @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>>>>>>>>
>>>>>>>>  	mutex_lock(&encl->lock);
>>>>>>>>
>>>>>>>> -	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
>>>>>>>> +	entry = sgx_encl_load_page(encl, addr);
>>>>>>>>  	if (IS_ERR(entry)) {
>>>>>>>>  		mutex_unlock(&encl->lock);
>>>>>>>>
>>>>>>>> @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>>>>>>>>
>>>>>>>>  	phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
>>>>>>>>
>>>>>>>> -	ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
>>>>>>>> +	/*
>>>>>>>> +	 * Insert PTE to match the EPCM page permissions ensured to not
>>>>>>>> +	 * exceed the VMA permissions.
>>>>>>>> +	 */
>>>>>>>> +	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
>>>>>>>> +	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
>>>>>>>> +	/*
>>>>>>>> +	 * Add VM_SHARED so that PTE is made writable right away if VMA
>>>>>>>> +	 * and EPCM are writable (no COW in SGX).
>>>>>>>> +	 */
>>>>>>>> +	page_prot_bits |= (vma->vm_flags & VM_SHARED);
>>>>>>>> +	ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
>>>>>>>> +				  vm_get_page_prot(page_prot_bits));
>>>>>>>>  	if (ret != VM_FAULT_NOPAGE) {
>>>>>>>>  		mutex_unlock(&encl->lock);
>>>>>>>>
>>>>>>>> @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
>>>>>>>>   * Load an enclave page to EPC if required, and take encl->lock.
>>>>>>>>   */
>>>>>>>>  static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
>>>>>>>> -						   unsigned long addr,
>>>>>>>> -						   unsigned long vm_flags)
>>>>>>>> +						   unsigned long addr)
>>>>>>>>  {
>>>>>>>>  	struct sgx_encl_page *entry;
>>>>>>>>
>>>>>>>>  	for ( ; ; ) {
>>>>>>>>  		mutex_lock(&encl->lock);
>>>>>>>>
>>>>>>>> -		entry = sgx_encl_load_page(encl, addr, vm_flags);
>>>>>>>> +		entry = sgx_encl_load_page(encl, addr);
>>>>>>>>  		if (PTR_ERR(entry) != -EBUSY)
>>>>>>>>  			break;
>>>>>>>>
>>>>>>>> @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
>>>>>>>>  		return -EFAULT;
>>>>>>>>
>>>>>>>>  	for (i = 0; i < len; i += cnt) {
>>>>>>>> -		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK,
>>>>>>>> -					      vma->vm_flags);
>>>>>>>> +		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK);
>>>>>>>>  		if (IS_ERR(entry)) {
>>>>>>>>  			ret = PTR_ERR(entry);
>>>>>>>>  			break;
>>>>>>>> --
>>>>>>>> 2.25.1
>>>>>>>>
>>>>>>>
>>>>>>> If you unconditionally set vm_max_prot_bits to RWX for dynamically created
>>>>>>> pages, you would not need to do this.
>>>>>>>
>>>>>>> These patches could then be safely dropped:
>>>>>>>
>>>>>>> - [PATCH V2 06/32] x86/sgx: Support VMA permissions more relaxed than enclave permissions
>>>>>>> - [PATCH V2 08/32] x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes
>>>>>>> - [PATCH V2 15/32] x86/sgx: Support relaxing of enclave page permissions
>>>>>>>
>>>>>>> And that would also keep full ABI compatibility without exceptions to the
>>>>>>> existing mainline code.
>>>>>>>
>>>>>>
>>>>>> Dropping these changes does not just impact dynamically created pages. Dropping
>>>>>> these patches would result in EPCM page permission restriction being supported
>>>>>> for all pages, those added before enclave initialization as well as dynamically
>>>>>> added pages, but their PTEs will not be impacted.
>>>>>> For example, if a RW enclave page is added via SGX_IOC_ENCLAVE_ADD_PAGES and
>>>>>> then later made read-only via SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS then Linux
>>>>>> would keep allowing and installing RW PTEs for this page.
>>>>>
>>>>> I think that would be perfectly fine, if someone wants to do that. There is
>>>>> no collateral damage in doing that. The kernel does not get messed up because
>>>>> of that. It's a use case that does not make sense in the first place, so it'd
>>>>> be stupid to build anything extensive around it in the kernel.
>>>>>
>>>>> Shooting yourself in the foot is something the kernel does not and should not
>>>>> protect user space from, unless there is a risk of messing up the state of the
>>>>> kernel itself.
>>>>>
>>>>> Much worse is that we have e.g. the completely artificial ioctl
>>>>> SGX_IOC_ENCLAVE_RELAX_PERMISSIONS to support this scheme, which could e.g.
>>>>> cause extra roundtrips for a simple EMODPE.
>>>>>
>>>>> Also this means not having to include 06/32, which keeps 100% backwards
>>>>> compatibility in run-time behaviour with the mainline while not restricting
>>>>> dynamically created pages at all. And we get rid of the complex bookkeeping
>>>>> of vm_run_prot_bits.
>>>>>
>>>>> And generally the whole model is then very easy to understand and explain.
>>>>> If I had to give a presentation of the current mess in the patch set at a
>>>>> conference, I can honestly say that I would be in serious trouble. It's
>>>>> not a clean and clear security model, which is a risk by itself.
>>>>
>>>> I.e.
>>>>
>>>> 1. For EADD'd pages: stick with what has been the invariant for 1.5 years
>>>>    now. Do not change it by any means (e.g. 06/32).
>>>> 2. For EAUG'd pages: set vm_max_prot_bits to RWX, which essentially means
>>>>    do whatever you want with PTEs and EPCM.
>>>>
>>>> It's a clear and understandable model that does nothing bad to the kernel,
>>>> and a run-time developer can surely find a way to get things going. For
>>>> user space, the most important thing is clarity in kernel behaviour,
>>>> and this does deliver that clarity. It's not perfect but it does do the
>>>> job and anyone can get it.
>>>
>>> Also a quantitative argument for this is that by simplifying the security
>>> model this way it is one ioctl less, which must be considered a +1. We do
>>> not want to add new ioctls unless it is something we absolutely cannot live
>>> without. We absolutely can live without SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.
>>>
>>
>> ok, with the implications understood and accepted I will proceed with a new
>> series that separates EPCM from PTEs and makes RWX PTEs possible by default
>> for EAUG pages. This has broader impact than just removing
>> the three patches you list. "[PATCH 07/32] x86/sgx: Add pfn_mkwrite() handler
>> for present PTEs" is also no longer needed and there is no longer a need
>> to flush PTEs after restricting permissions. New changes also need to
>> be considered - at least the current documentation. I'll rework the series.
>
> Yes, I really think it is a solid plan. Any possible LSM hooks would most
> likely attach to the build product, not the dynamic behaviour.
>
> As far as the page fault handler goes, Haitao is correct after all the
> discussions that it makes sense. The purpose of the MAP_POPULATE series is
> not to replace it but instead to complement it. Just wanted to clear this
> up as I said otherwise earlier this week.
>

Understood. I will keep the implementation where EAUG is done in the page
fault handler. I do plan to pick up your patch "x86/sgx: Export
sgx_encl_page_alloc()" since a consequence of the other changes is that this
can now be shared.

Reinette
On Tue, Mar 08, 2022 at 09:49:01AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
>
> On 3/8/2022 9:00 AM, Jarkko Sakkinen wrote:
> > On Tue, Mar 08, 2022 at 08:04:33AM -0800, Reinette Chatre wrote:

[...]

> > As far as the page fault handler goes, Haitao is correct after all the
> > discussions that it makes sense. The purpose of the MAP_POPULATE series is
> > not to replace it but instead to complement it. Just wanted to clear this
> > up as I said otherwise earlier this week.
>
> Understood. I will keep the implementation where EAUG is done in the page
> fault handler. I do plan to pick up your patch "x86/sgx: Export
> sgx_encl_page_alloc()" since a consequence of the other changes is that
> this can now be shared.

Yeah, I think we might be able to get this polished for v5.19. I'd expect
a revision or few for polishing the corners, but other than that this looks
to be going on the right track now.

> Reinette

BR, Jarkko
On Tue, Mar 08, 2022 at 07:00:03PM +0200, Jarkko Sakkinen wrote:

Good morning, I hope this note finds the week ending well for everyone.

Based on previous experiences, I wasn't going to respond to this
conversation. However, after seeing Thomas' response to Cathy's microcode
patch series, and reflecting on things a bit, I thought it would be useful
to do so in order to stimulate further scholarly discussion.

For the record, Thomas was spot-on with concerns about how much sense it
makes to persist enclaves through a microcode update. Everyone despises
downtime, but if you are serious about security, idling a machine out to
fully 'reset' it in the face of a microcode update seems to be the only
prudent action.

However, with all due respect, the assertion that Linux is all about the
highest standards in technical honesty and soundness of engineering is
specious, at least in the context of the development of this driver.

I see that Jarkko noted his conversations with Mark Shanahan on SGX
micro-architectural details. For the record, I have been fortunate enough
to have engaged with Shanahan, Rozas, Johnson and a number of the other
engineers behind this technology. I've also had the opportunity to engage
with and provide recommendations to the SGX engineering team in Israel,
before the demise of the Platform Security Group and the exile of SGX to
servers only.

I believe that Simon Johnson received a Principal Engineer nomination for
his work on SGX. He called me one morning here at the lake and we had a
rather lengthy and entertaining discussion about the issues surrounding
doing packet inspection from inside of an enclave, I believe at 40 Gbps
line rates, maybe 10, I can't remember. In any event, way faster than what
could be accomplished in the face of the ~60,000 cycle overhead of
untrusted<->trusted context switches, which in turn motivated the
'switchless' architecture.
So with the length and girth discussions at bay, reflections on the
technical and security issues at hand follow below, for whatever they are
worth.

> On Tue, Mar 08, 2022 at 08:04:33AM -0800, Reinette Chatre wrote:

> > ok, with the implications understood and accepted I will proceed
> > with a new series that separates EPCM from PTEs and make RWX PTEs
> > possible by default for EAUG pages. This has broader impact than
> > just removing the three patches you list. "[PATCH 07/32] x86/sgx:
> > Add pfn_mkwrite() handler for present PTEs" is also no longer
> > needed and there is no longer a need to flush PTEs after
> > restricting permissions. New changes also need to be considered -
> > at least the current documentation. I'll rework the series.

> Yes, I really think it is a solid plan. Any possible LSM hooks would
> most likely attach to build product, not the dynamic behaviour.

I assume everyone remembers, if not the kernel archives will have full
details, that we had a very lively discussion about these issues starting
well over two years ago.

Jarkko's rather dogmatic assertion that it should simply be 'The Wild
West' with respect to PTE memory permissions on dynamically allocated
enclave memory suggests that all of the hand wringing and proselytizing
about SGX being a way to circumvent LSM controls on executable memory were
political and technical grandstanding, amounting to nothing but security
theater.

Apologies if this is perceived to be a bit strident; it had been a long
week already by Wednesday morning.

I made the point at the time, which remained unacknowledged, that none of
the machinations involved had any practical security value with respect to
where everyone wanted this technology to go, ie. a driver with full
Enclave Dynamic Memory Management (EDMM) support, which is the precipice
on which we now stand. I noted that the only valid security controls for
this technology were reputational controls based on cryptographic
identities.
In fact, we developed, and posted, a rather complete implementation of
such an infrastructure. Here is a URL to the last patch that we had time
to fuss with putting up:

ftp://ftp.enjellic.com/pub/sgx/kernel/SFLC-5.12.patch

I think we have a 5.13, if not a 5.14, patch laying around as well.

The response, from one of the luminaries in these discussions, was: "I
dare you to post the patches so I can immediately NACK them". I guess that
pretty much covers the question of why there may be perceived reluctance
in some quarters about spending time trying to upstream kernel
functionality.

So it was with some interest that I noted Reinette Chatre's recent e-mail
which indicated that, in the face of the EDMM driver and the security
implications it presents, a proof-of-concept implementation for
reputational security controls had been developed. That implementation is
based on MRENCLAVE and/or MRSIGNER values, both cryptographically based
identities, as was ours. Although we didn't bother with MRENCLAVE values,
for largely the same reason why SGX_KEYPOLICY_MRENCLAVE isn't considered
useful for symmetric key generation inside of an enclave, with perhaps the
exception of shrouding keys to defeat speculation attacks.

So, to assist the conversation, and for the 'Lore' record: in an EDMM
environment, anyone with adverse intent is going to simply ship an enclave
that amounts to nothing more than a bootloader. Said enclave will set up a
network connection to an external code repository, which will verify that
it is only talking to a known enclave through remote attestation, and then
download whatever code, via a cryptographically secured connection, that
they actually want to run in the enclave.

How do I know that? I know that because we were paid to develop those
types of systems by customers who wanted to run proprietary and/or
confidential code in the 'cloud'. It was interesting to see the number of
groups that looked at SGX as a means to protect their 'secret sauce'.
Concluding, from Jarkko's comment above, that the kernel is going to
simply ignore this threat scenario seems vaguely unwise.

So perhaps the best way to advance a profitable discussion is for the
involved kernel developers to state, for the benefit of those of us
less enlightened, how effective LSMs are going to be developed for the
EDMM threat model.

I can offer up a few strawman approaches:

- Refuse any socket connections to an application that maps an
  enclave.

- Refuse to allow an application mapping an enclave to access any
  files, particularly if it looks like they contain encrypted content.

- Count the number of page mappings requested and decide that NN pages
  are OK but NN+ are not, MAP_POPULATE returns E2BIG??

- Implement seccomp-like controls to analyze and interpret OCALL
  behavior.

The first three seem a bit like show-stoppers when it comes to doing
anything useful with SGX; seccomp has a reputation of being hard to
get 'right', and that would seem to be particularly the case here.

Perhaps something more exotic. How about a BPF-enabled LSM that
monitors enclave page access patterns, so a concerned system
administrator can build custom variants of the directed page
side-channel attack that was demonstrated at Oakland against Haven?
The authors, in their conclusions, described their approach as
'devastating' to the notion that SGX could prevent an adversarial OS
from knowing what is going on inside of an enclave.

The conundrum is pretty simple and straightforward. Either the
technology works as advertised, which means, by definition, that the
operating system has no effective insight into what an enclave is
doing. Or some of the kernel developers are right, and there is a way
for the OS to have effective insight, and thus control, over what an
enclave is trying to do. A fact that effectively implies that
Fortanix, Asylo, Enarx, Gramine, Occlum et al. are peddling the
equivalent of Security Snake Oil when it comes to SGX-enabled
'confidential' computing.
The above shouldn't be considered pejorative to those products,
companies or initiatives. I have an acquaintance in the IT business
who tells me I worry too much about whether things work and how,
because he has made a ton of money selling people stuff that he knows
doesn't work.

For the record, I'm completely ambivalent with respect to how any of
this gets done or what the PTE permissions for dynamic content are.
For those who may not be ambivalent about whether Linux gets security
'right', let me leave the following thoughts.

It hasn't been my experience that good engineers design things or
processes for the sake of designing them. The architecture for
EDMM/SGX was proposed and presented in the form of two papers at
HASP-2016.

Let's see, the operating system 'flow' model counts as an author none
other than Mark Shanahan himself, with Bin Xing and Rebekah
Leslie-Hurd. The SGX instructions and architecture paper was done by
Frank McKeen, Ilya Alexandrovich, Ittai Anati, Dror Caspi, Simon
Johnson, Rebekah Leslie-Hurd and Carlos Rozas.

The interactions I've had left me feeling like these were really
smart people. Maybe they were on psychedelics when they designed this
and wrote the papers? It would seem to be helpful for these
discussions to know if this was the case.

Or maybe, perhaps, there are some subtleties, hidden 'gotchas',
micro-architectural peculiarities and/or security considerations that
influenced the documented EDMM flow. Secure kernel development would
seem to benefit from knowing that, rather than concluding that this
method is slow, and perhaps hard to implement, so Linux is going to
ignore these potential issues.

The EDMM papers, if anyone should happen to read them, indicate in
the acknowledgments that the design was done in collaboration with OS
designers. I'm presuming that was Peinado's group at Microsoft
Research, given that is where Haven came from; maybe they were
dabbling with psychedelics.
I believe they did manage to document the early micro-architectural
errata about certain page access patterns causing processor faults.
That leads one to believe they couldn't have been too confused about
what was going on.

When I brought all of this up last time I was told I was trying to
present a 'scary boogeyman'. That could be; I will leave that for
others to judge.

My pragmatic response to that accusation is to ask why we would argue
to have systemd and the distros change system behaviors to
accommodate the ability to apply an LSM to a 500K 'bootloader', when
that 'bootloader' can turn around and potentially pull down gigabytes
of code that we appear to have decided don't need any controls
applied whatsoever.

Caution would seem to suggest the need to understand the implications
of these issues a bit better than we currently do.

> Thank you.
>
> BR, Jarkko

Best wishes for a pleasant weekend to everyone.

Dr. Greg

As always,
Dr. Greg Wettstein, Ph.D, Worker      Autonomously self-defensive
Enjellic Systems Development, LLC     IOT platforms and edge devices.
4206 N. 19th Ave.
Fargo, ND  58102
PH: 701-281-1686                      EMAIL: dg@enjellic.com
------------------------------------------------------------------------------
"Thinking implies disagreement; and disagreement implies
 non-conformity; and non-conformity implies heresy; and heresy
 implies disloyalty -- so obviously thinking must be stopped."
                                -- Adlai Stevenson
                                   [Call to Greatness, 1954]
diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
index 89ff924b1480..5659932728a5 100644
--- a/Documentation/x86/sgx.rst
+++ b/Documentation/x86/sgx.rst
@@ -99,6 +99,16 @@ The relationships between the different permission masks are:
 * PTEs are installed to match the EPCM permissions, but not be more
   relaxed than the VMA permissions.
 
+On systems supporting SGX2 EPCM permissions may change while the
+enclave page belongs to a VMA without impacting the VMA permissions.
+This means that a running VMA may appear to allow access to an enclave
+page that is not allowed by its EPCM permissions. For example, when an
+enclave page with RW EPCM permissions is mapped by a RW VMA but is
+subsequently changed to have read-only EPCM permissions. The kernel
+continues to maintain correct access to the enclave page through the
+PTE that will ensure that only access allowed by both the VMA
+and EPCM permissions are permitted.
+
 Application interface
 =====================
 
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 48afe96ae0f0..b6105d9e7c46 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 }
 
 static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
-						unsigned long addr,
-						unsigned long vm_flags)
+						unsigned long addr)
 {
-	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl_page *entry;
 
@@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 	if (!entry)
 		return ERR_PTR(-EFAULT);
 
-	/*
-	 * Verify that the faulted page has equal or higher build time
-	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
-	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
-	 */
-	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
-		return ERR_PTR(-EFAULT);
-
 	/* Entry successfully located. */
 	if (entry->epc_page) {
 		if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
@@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
 	struct vm_area_struct *vma = vmf->vma;
+	unsigned long page_prot_bits;
 	struct sgx_encl_page *entry;
+	unsigned long vm_prot_bits;
 	unsigned long phys_addr;
 	struct sgx_encl *encl;
 	vm_fault_t ret;
@@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 
 	mutex_lock(&encl->lock);
 
-	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
+	entry = sgx_encl_load_page(encl, addr);
 	if (IS_ERR(entry)) {
 		mutex_unlock(&encl->lock);
 
@@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 
 	phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
 
-	ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
+	/*
+	 * Insert PTE to match the EPCM page permissions ensured to not
+	 * exceed the VMA permissions.
+	 */
+	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+	/*
+	 * Add VM_SHARED so that PTE is made writable right away if VMA
+	 * and EPCM are writable (no COW in SGX).
+	 */
+	page_prot_bits |= (vma->vm_flags & VM_SHARED);
+	ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
+				  vm_get_page_prot(page_prot_bits));
 	if (ret != VM_FAULT_NOPAGE) {
 		mutex_unlock(&encl->lock);
 
@@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
  * Load an enclave page to EPC if required, and take encl->lock.
  */
 static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
-						   unsigned long addr,
-						   unsigned long vm_flags)
+						   unsigned long addr)
 {
 	struct sgx_encl_page *entry;
 
 	for ( ; ; ) {
 		mutex_lock(&encl->lock);
 
-		entry = sgx_encl_load_page(encl, addr, vm_flags);
+		entry = sgx_encl_load_page(encl, addr);
 		if (PTR_ERR(entry) != -EBUSY)
 			break;
 
@@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
 		return -EFAULT;
 
 	for (i = 0; i < len; i += cnt) {
-		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK,
-					      vma->vm_flags);
+		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK);
 		if (IS_ERR(entry)) {
 			ret = PTR_ERR(entry);
 			break;
=== Summary ===

An SGX VMA can only be created if its permissions are the same or
weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
creation this same rule is again enforced by the page fault handler:
faulted enclave pages are required to have equal or more relaxed
EPCM permissions than the VMA permissions.

On SGX1 systems the additional enforcement in the page fault handler
is redundant and on SGX2 systems it incorrectly prevents access.
On SGX1 systems it is unnecessary to repeat the enforcement of the
permission rule. The rule used during original VMA creation will
ensure that any access attempt will use correct permissions.
With SGX2 the EPCM permissions of a page can change after VMA
creation resulting in the VMA permissions potentially being more
relaxed than the EPCM permissions and the page fault handler
incorrectly blocking valid access attempts.

Enable the VMA's pages to remain accessible while ensuring that
the PTEs are installed to match the EPCM permissions but not be
more relaxed than the VMA permissions.

=== Full Changelog ===

An SGX enclave is an area of memory where parts of an application
can reside. First an enclave is created and loaded (from
non-enclave memory) with the code and data of an application,
then user space can map (mmap()) the enclave memory to
be able to enter the enclave at its defined entry points for
execution within it.

The hardware maintains a secure structure, the Enclave Page Cache Map
(EPCM), that tracks the contents of the enclave. Of interest here is
its tracking of the enclave page permissions. When a page is loaded
into the enclave its permissions are specified and recorded in the
EPCM. In parallel the kernel maintains permissions within the page
table entries (PTEs) and the rule is that PTE permissions are not
allowed to be more relaxed than the EPCM permissions.
A new mapping (mmap()) of enclave memory can only succeed if the
mapping has the same or weaker permissions than the permissions that
were vetted during enclave creation. This is enforced by
sgx_encl_may_map() that is called on the mmap() as well as mprotect()
paths. This rule remains.

One feature of SGX2 is to support the modification of EPCM permissions
after enclave initialization. Enclave pages may thus already be part
of a VMA at the time their EPCM permissions are changed, resulting in
the VMA's permissions potentially being more relaxed than the EPCM
permissions.

Allow permissions of existing VMAs to be more relaxed than EPCM
permissions in preparation for dynamic EPCM permission changes made
possible in SGX2. New VMAs that attempt to have more relaxed
permissions than EPCM permissions continue to be unsupported.

Reasons why permissions of existing VMAs are allowed to be more
relaxed than EPCM permissions instead of dynamically changing VMA
permissions when EPCM permissions change are:

1) Changing VMA permissions involves splitting VMAs, which is an
   operation that can fail. Additionally, changing EPCM permissions of
   a range of pages could also fail on any of the pages involved.
   Handling these error cases causes problems. For example, if an EPCM
   permission change fails and the VMA has already been split then it
   is not possible to undo the VMA split nor possible to undo the EPCM
   permission changes that did succeed before the failure.

2) The kernel has little insight into the user space where EPCM
   permissions are controlled from. For example, a RW page may be made
   RO just before it is made RX, and splitting the VMAs while the VMAs
   may change soon is unnecessary.

Remove the extra permission check called on a page fault
(vm_operations_struct->fault) or during debugging
(vm_operations_struct->access) when loading the enclave page from swap
that ensures that the VMA permissions are not more relaxed than the
EPCM permissions.
Since a VMA could only exist if it passed the original permission
checks during mmap() and a VMA may indeed have more relaxed
permissions than the EPCM permissions, this extra permission check is
no longer appropriate.

With the permission check removed, ensure that PTEs do not blindly
inherit the VMA permissions but instead the permissions that the VMA
and EPCM agree on. PTEs for writable pages (from VMA and enclave
perspective) are installed with the writable bit set, reducing the
need for this additional flow to the permission mismatch cases
handled next.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V1:
- Reword commit message (Jarkko).
- Use "relax" instead of "exceed" when referring to permissions
  (Dave).
- Add snippet to Documentation/x86/sgx.rst that highlights the
  relationship between VMA, EPCM, and PTE permissions on SGX systems
  (Andy).

 Documentation/x86/sgx.rst      | 10 +++++++++
 arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++----------------
 2 files changed, 30 insertions(+), 18 deletions(-)