
[V3,3/5] x86/sgx: Obtain backing storage page with enclave mutex held

Message ID fa2e04c561a8555bfe1f4e7adc37d60efc77387b.1652389823.git.reinette.chatre@intel.com (mailing list archive)
State New, archived
Series [V3,1/5] x86/sgx: Disconnect backing page references from dirty status

Commit Message

Reinette Chatre May 12, 2022, 9:50 p.m. UTC
Haitao reported encountering a WARN triggered by the ENCLS[ELDU]
instruction faulting with a #GP.

The WARN is encountered when the reclaimer evicts a range of
pages from the enclave and the same pages are immediately
faulted back in.

The SGX backing storage is accessed on two paths: when there
are insufficient free pages in the EPC the reclaimer moves
enclave pages to the backing storage, and when an enclave
accesses a page that has been moved to the backing storage it
is retrieved from there as part of page fault handling.

An oversubscribed SGX system will often run the reclaimer and
the page fault handler concurrently and needs to ensure that
both access the backing store safely. This is currently not
the case: the reclaimer accesses the backing store without the
enclave mutex, while the page fault handler accesses it with
the enclave mutex held.

Consider the scenario where a page is faulted in while a page
sharing a PCMD page with the faulted page is being reclaimed.
The consequence is a race between the reclaimer and the page
fault handler: the reclaimer attempts to access a PCMD page at
the same time it is truncated by the page fault handler. This
could result in lost PCMD data. Data may still be lost if the
reclaimer wins the race; that case is addressed in the
following patch.
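
Why two distinct enclave pages can collide on one PCMD page: a
PCMD entry (struct sgx_pcmd) is 128 bytes, so 32 consecutive
enclave pages have their PCMD entries packed into the same 4KB
page of the backing store. The sketch below is for illustration
only; share_pcmd_page() is a hypothetical helper, not kernel
code, and assumes the PCMD region of the backing store starts
page-aligned, as in the layout used by sgx_encl_get_backing():

/*
 * Illustration only (hypothetical helper, not kernel code): why
 * reclaiming one enclave page can touch the PCMD page that also
 * covers another enclave page.
 */
#define PCMDS_PER_PAGE	(PAGE_SIZE / sizeof(struct sgx_pcmd))	/* 4096 / 128 = 32 */

static bool share_pcmd_page(unsigned long page_index_a,
			    unsigned long page_index_b)
{
	/* Enclave pages whose indices fall in the same group of 32 share a PCMD page. */
	return page_index_a / PCMDS_PER_PAGE == page_index_b / PCMDS_PER_PAGE;
}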

The reclaimer accesses pages in the backing storage without
holding the enclave mutex and thus risks racing with the page
fault handler, which does access the backing storage with the
enclave mutex held.

In the scenario below a PCMD page is truncated from the backing
store after all of the enclave pages sharing it have been loaded
into the enclave, at the same time that the PCMD page is loaded
from the backing store because one of those pages is being
reclaimed:

sgx_reclaim_pages() {              sgx_vma_fault() {
                                     ...
                                     mutex_lock(&encl->lock);
                                     ...
                                     __sgx_encl_eldu() {
                                       ...
                                       if (pcmd_page_empty) {
/*
 * EPC page being reclaimed              /*
 * shares a PCMD page with an             * PCMD page truncated
 * enclave page that is being             * while requested from
 * faulted in.                            * reclaimer.
 */                                       */
sgx_encl_get_backing()  <---------->      sgx_encl_truncate_backing_page()
                                        }
                                       mutex_unlock(&encl->lock);
}                                    }

In this scenario there is a race between the reclaimer and the page fault
handler when the reclaimer attempts to access the same PCMD page that is
being truncated. This could result in the reclaimer writing to the PCMD
page just before it is truncated, causing the PCMD data to be lost, or in
a new PCMD page being allocated. PCMD data may still be lost even after
the backing store access is protected with the mutex; that case is fixed
in the next patch. Ensuring that the backing store is accessed with the
mutex held also keeps the enclave page state accurate, with the
SGX_ENCL_PAGE_BEING_RECLAIMED flag reflecting that a page is in the
process of being reclaimed.

Consistently protect the reclaimer's backing store access with the
enclave's mutex to ensure that it can safely run concurrently with the
page fault handler.
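
Condensed from the diff below (a sketch of the resulting flow,
not the verbatim kernel code): the reclaimer now calls
sgx_encl_get_backing() and sets SGX_ENCL_PAGE_BEING_RECLAIMED
inside one critical section, and sgx_encl_put_backing() moves
into sgx_reclaimer_write(), where the enclave mutex is already
held around sgx_encl_ewb():

	/* sgx_reclaim_pages(), per page selected for reclaim: */
	mutex_lock(&encl_page->encl->lock);
	ret = sgx_encl_get_backing(encl_page->encl, page_index, &backing[i]);
	if (ret) {
		mutex_unlock(&encl_page->encl->lock);
		goto skip;
	}
	encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
	mutex_unlock(&encl_page->encl->lock);

	/* sgx_reclaimer_write(), under encl->lock (taken earlier in that function): */
	sgx_encl_ewb(epc_page, backing);
	encl_page->epc_page = NULL;
	encl->secs_child_cnt--;
	sgx_encl_put_backing(backing);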

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Haitao Huang <haitao.huang@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Haitao Huang <haitao.huang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Move description of "scenario a" to the new patch in series that marks
  page as dirty with enclave mutex held.
- Add Haitao's "Tested-by" tag.
- Add Jarkko's "Reviewed-by" and "Tested-by" tags.

 arch/x86/kernel/cpu/sgx/main.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Jarkko Sakkinen May 13, 2022, 2:39 p.m. UTC | #1
On Thu, May 12, 2022 at 02:50:59PM -0700, Reinette Chatre wrote:
> [...]

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

Patch

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index e71df40a4f38..ab4ec54bbdd9 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -310,6 +310,7 @@  static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
 	sgx_encl_ewb(epc_page, backing);
 	encl_page->epc_page = NULL;
 	encl->secs_child_cnt--;
+	sgx_encl_put_backing(backing);
 
 	if (!encl->secs_child_cnt && test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) {
 		ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
@@ -381,11 +382,14 @@  static void sgx_reclaim_pages(void)
 			goto skip;
 
 		page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base);
+
+		mutex_lock(&encl_page->encl->lock);
 		ret = sgx_encl_get_backing(encl_page->encl, page_index, &backing[i]);
-		if (ret)
+		if (ret) {
+			mutex_unlock(&encl_page->encl->lock);
 			goto skip;
+		}
 
-		mutex_lock(&encl_page->encl->lock);
 		encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
 		mutex_unlock(&encl_page->encl->lock);
 		continue;
@@ -413,7 +417,6 @@  static void sgx_reclaim_pages(void)
 
 		encl_page = epc_page->owner;
 		sgx_reclaimer_write(epc_page, &backing[i]);
-		sgx_encl_put_backing(&backing[i]);
 
 		kref_put(&encl_page->encl->refcount, sgx_encl_release);
 		epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;