[v2,00/17] v23 updates

Message ID	20190916041417.12533-1-jarkko.sakkinen@linux.intel.com (mailing list archive)
Headers	show Return-Path: <SRS0=4JBP=XL=vger.kernel.org=linux-sgx-owner@kernel.org> From: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> To: linux-sgx@vger.kernel.org Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Subject: [PATCH v2 00/17] v23 updates Date: Mon, 16 Sep 2019 07:14:00 +0300 Message-Id: <20190916041417.12533-1-jarkko.sakkinen@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-sgx-owner@vger.kernel.org Precedence: bulk
Series	v23 updates \| expand [v2,00/17] v23 updates [v2,01/17] selftest/x86/sgx: Remove encl_piggy.h [v2,02/17] x86/sgx: Clean up internal includes [v2,03/17] x86/sgx: Write backing storage only if EWB is successful [v2,04/17] x86/sgx: Rename 'j' as 'cnt' in sgx_reclaim_pages() [v2,05/17] x86/sgx: Turn encls_failed() as inline function [v2,06/17] x86/sgx: Move sgx_einit() to encls.c [v2,07/17] x86/sgx: Remove pages in sgx_reclaimer_write() [v2,08/17] x86/sgx: Calculate page index in sgx_reclaimer_write() [v2,09/17] x86/sgx: Move SGX_ENCL_DEAD check to sgx_reclaimer_write() [v2,10/17] x86/sgx: Free VA slot when the EWB flow fails [v2,11/17] x86/sgx: Call sgx_encl_destroy() when the EWB flow fails [v2,12/17] x86/sgx: Open code sgx_reclaimer_get() and sgx_reclaimer_put() [v2,13/17] x86/sgx: Introduce sgx_can_reclaim() [v2,14/17] x86/sgx: Replace section->free_cnt with a global sgx_nr_free_pages [v2,15/17] x86/sgx: sgx_vma_access(): Do not return -ECANCELED on invalid TCS pages [v2,16/17] x86/sgx: Introduce sgx_encl_get_backing() [v2,17/17] x86/sgx: Fix pages in the BLOCKED state ending up to the free pool

Message ID

20190916041417.12533-1-jarkko.sakkinen@linux.intel.com (mailing list archive)

Headers

From: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
To: linux-sgx@vger.kernel.org
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Subject: [PATCH v2 00/17] v23 updates
Date: Mon, 16 Sep 2019 07:14:00 +0300
Message-Id: <20190916041417.12533-1-jarkko.sakkinen@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-sgx-owner@vger.kernel.org
Precedence: bulk

Series

v23 updates | expand

Message

Jarkko Sakkinen Sept. 16, 2019, 4:14 a.m. UTC

My flush of updates for v23. Contains a bunch of clean ups and bug
fixes with the main focus on the page reclaimer. The main goal has
been to disclose all the other possibilities for failure after
ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
invalidated.

Jarkko Sakkinen (17):
  selftest/x86/sgx: Remove encl_piggy.h
  x86/sgx: Clean up internal includes
  x86/sgx: Write backing storage only if EWB is successful
  x86/sgx: Rename 'j' as 'cnt' in sgx_reclaim_pages()
  x86/sgx: Turn encls_failed() as inline function
  x86/sgx: Move sgx_einit() to encls.c
  x86/sgx: Remove pages in sgx_reclaimer_write()
  x86/sgx: Calculate page index in sgx_reclaimer_write()
  x86/sgx: Move SGX_ENCL_DEAD check to sgx_reclaimer_write()
  x86/sgx: Free VA slot when the EWB flow fails
  x86/sgx: Call sgx_encl_destroy() when the EWB flow fails
  x86/sgx: Open code sgx_reclaimer_get() and sgx_reclaimer_put()
  x86/sgx: Introduce sgx_can_reclaim()
  x86/sgx: Replace section->free_cnt with a global sgx_nr_free_pages
  x86/sgx: sgx_vma_access(): Do not return -ECANCELED on invalid TCS
    pages
  x86/sgx: Introduce sgx_encl_get_backing()
  x86/sgx: Fix pages in the BLOCKED state ending up to the free pool

 arch/x86/kernel/cpu/sgx/driver.c             |   2 +
 arch/x86/kernel/cpu/sgx/driver.h             |   3 -
 arch/x86/kernel/cpu/sgx/encl.c               | 104 +++----
 arch/x86/kernel/cpu/sgx/encl.h               |  23 +-
 arch/x86/kernel/cpu/sgx/encls.c              |  54 +++-
 arch/x86/kernel/cpu/sgx/encls.h              |  23 +-
 arch/x86/kernel/cpu/sgx/ioctl.c              |   5 +-
 arch/x86/kernel/cpu/sgx/main.c               |  64 +---
 arch/x86/kernel/cpu/sgx/reclaim.c            | 299 +++++++++----------
 arch/x86/kernel/cpu/sgx/sgx.h                |   6 +-
 tools/testing/selftests/x86/sgx/encl_piggy.h |  14 -
 tools/testing/selftests/x86/sgx/main.c       |   1 -
 12 files changed, 276 insertions(+), 322 deletions(-)
 delete mode 100644 tools/testing/selftests/x86/sgx/encl_piggy.h

Comments

Jarkko Sakkinen Sept. 16, 2019, 7:58 a.m. UTC | #1

On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> My flush of updates for v23. Contains a bunch of clean ups and bug
> fixes with the main focus on the page reclaimer. The main goal has
> been to disclose all the other possibilities for failure after
> ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> invalidated.

I have at least one more update to the reclaimer but want to merge these
first.

It adds optional struct epc_page **reclaimed_page to
sgx_reclaim_pages(). If NULL, the function will just append everything
to the free pool. Otherwise, it will use it to return one of the
reclaimed pages if there are any.

sgx_alloc_page() then does the following when @reclaim=true:

1. If page in free page pool, take one.
2. If not, try to reclaim one.
3. If nothing was reclaimed -ENOMEM.

Right now sgx_alloc_page() can in theory take however long.

I wonder why we do not return -ENOMEM also when @reclaim=false. Where
did this returning -EBUSY came from? Can't recall.

/Jarkko

Jarkko Sakkinen Sept. 16, 2019, 8:01 a.m. UTC | #2

On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > My flush of updates for v23. Contains a bunch of clean ups and bug
> > fixes with the main focus on the page reclaimer. The main goal has
> > been to disclose all the other possibilities for failure after
> > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > invalidated.
> 
> I have at least one more update to the reclaimer but want to merge these
> first.
> 
> It adds optional struct epc_page **reclaimed_page to
> sgx_reclaim_pages(). If NULL, the function will just append everything
> to the free pool. Otherwise, it will use it to return one of the
> reclaimed pages if there are any.
> 
> sgx_alloc_page() then does the following when @reclaim=true:
> 
> 1. If page in free page pool, take one.
> 2. If not, try to reclaim one.
> 3. If nothing was reclaimed -ENOMEM.
> 
> Right now sgx_alloc_page() can in theory take however long.
> 
> I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> did this returning -EBUSY came from? Can't recall.

Checked. I guess it is just for ELDU flow but does not make sense
otherwise. Tuning sgx_vma_fault() should be enough. I mean with
the above change we would start to return -EBUSY sometimes in
OOM situations.

/Jarkko

Sean Christopherson Sept. 16, 2019, 6:37 p.m. UTC | #3

On Mon, Sep 16, 2019 at 11:01:43AM +0300, Jarkko Sakkinen wrote:
> On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> > On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > > My flush of updates for v23. Contains a bunch of clean ups and bug
> > > fixes with the main focus on the page reclaimer. The main goal has
> > > been to disclose all the other possibilities for failure after
> > > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > > invalidated.
> > 
> > I have at least one more update to the reclaimer but want to merge these
> > first.
> > 
> > It adds optional struct epc_page **reclaimed_page to
> > sgx_reclaim_pages(). If NULL, the function will just append everything
> > to the free pool. Otherwise, it will use it to return one of the
> > reclaimed pages if there are any.
> > 
> > sgx_alloc_page() then does the following when @reclaim=true:
> > 
> > 1. If page in free page pool, take one.
> > 2. If not, try to reclaim one.
> > 3. If nothing was reclaimed -ENOMEM.
> > 
> > Right now sgx_alloc_page() can in theory take however long.
> > 
> > I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> > did this returning -EBUSY came from? Can't recall.
> 
> Checked. I guess it is just for ELDU flow but does not make sense
> otherwise. Tuning sgx_vma_fault() should be enough. I mean with
> the above change we would start to return -EBUSY sometimes in
> OOM situations.

Returning -EBUSY is done to differentiate between the case where reclaim
is possible, i.e. sgx_active_page_list is *not* empty, but disallowed, and
the case where reclaim is impossible, i.e. sgx_active_page_list is empty.
If reclaim is impossible then the fault handler should signal SIGSEGV so
that processes start dying and/or killing enclaves to free up EPC.

Barring a kernel bug, I don't think it's possible for sgx_active_page_list
to be empty when only the driver is supported, but both KVM and EPC cgroup
support will introduce (relatively common) scenarios where there are no
pages on the active/reclaimable list.  Technically we probably don't need
the -EBUSY logic, but my vote is to keep it since it's a nice fallback in
case there are kernel bugs.

Jarkko Sakkinen Sept. 17, 2019, 7:08 p.m. UTC | #4

On Mon, Sep 16, 2019 at 11:37:43AM -0700, Sean Christopherson wrote:
> On Mon, Sep 16, 2019 at 11:01:43AM +0300, Jarkko Sakkinen wrote:
> > On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> > > On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > > > My flush of updates for v23. Contains a bunch of clean ups and bug
> > > > fixes with the main focus on the page reclaimer. The main goal has
> > > > been to disclose all the other possibilities for failure after
> > > > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > > > invalidated.
> > > 
> > > I have at least one more update to the reclaimer but want to merge these
> > > first.
> > > 
> > > It adds optional struct epc_page **reclaimed_page to
> > > sgx_reclaim_pages(). If NULL, the function will just append everything
> > > to the free pool. Otherwise, it will use it to return one of the
> > > reclaimed pages if there are any.
> > > 
> > > sgx_alloc_page() then does the following when @reclaim=true:
> > > 
> > > 1. If page in free page pool, take one.
> > > 2. If not, try to reclaim one.
> > > 3. If nothing was reclaimed -ENOMEM.
> > > 
> > > Right now sgx_alloc_page() can in theory take however long.
> > > 
> > > I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> > > did this returning -EBUSY came from? Can't recall.
> > 
> > Checked. I guess it is just for ELDU flow but does not make sense
> > otherwise. Tuning sgx_vma_fault() should be enough. I mean with
> > the above change we would start to return -EBUSY sometimes in
> > OOM situations.
> 
> Returning -EBUSY is done to differentiate between the case where reclaim
> is possible, i.e. sgx_active_page_list is *not* empty, but disallowed, and
> the case where reclaim is impossible, i.e. sgx_active_page_list is empty.
> If reclaim is impossible then the fault handler should signal SIGSEGV so
> that processes start dying and/or killing enclaves to free up EPC.
> 
> Barring a kernel bug, I don't think it's possible for sgx_active_page_list
> to be empty when only the driver is supported, but both KVM and EPC cgroup
> support will introduce (relatively common) scenarios where there are no
> pages on the active/reclaimable list.  Technically we probably don't need
> the -EBUSY logic, but my vote is to keep it since it's a nice fallback in
> case there are kernel bugs.

OK, my root question is I guess, why want to differentiate those cases?
Both are as far as I'm concerned situations where there is no memory
available.

And now my changes after these patches add yet another case: active
page list was not empty but nothing could be reclaimed. Is the
granularity really needed for something here?

/Jarkko

Sean Christopherson Sept. 17, 2019, 7:27 p.m. UTC | #5

On Tue, Sep 17, 2019 at 10:08:31PM +0300, Jarkko Sakkinen wrote:
> On Mon, Sep 16, 2019 at 11:37:43AM -0700, Sean Christopherson wrote:
> > On Mon, Sep 16, 2019 at 11:01:43AM +0300, Jarkko Sakkinen wrote:
> > > On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> > > > On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > > > > My flush of updates for v23. Contains a bunch of clean ups and bug
> > > > > fixes with the main focus on the page reclaimer. The main goal has
> > > > > been to disclose all the other possibilities for failure after
> > > > > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > > > > invalidated.
> > > > 
> > > > I have at least one more update to the reclaimer but want to merge these
> > > > first.
> > > > 
> > > > It adds optional struct epc_page **reclaimed_page to
> > > > sgx_reclaim_pages(). If NULL, the function will just append everything
> > > > to the free pool. Otherwise, it will use it to return one of the
> > > > reclaimed pages if there are any.
> > > > 
> > > > sgx_alloc_page() then does the following when @reclaim=true:
> > > > 
> > > > 1. If page in free page pool, take one.
> > > > 2. If not, try to reclaim one.
> > > > 3. If nothing was reclaimed -ENOMEM.
> > > > 
> > > > Right now sgx_alloc_page() can in theory take however long.
> > > > 
> > > > I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> > > > did this returning -EBUSY came from? Can't recall.
> > > 
> > > Checked. I guess it is just for ELDU flow but does not make sense
> > > otherwise. Tuning sgx_vma_fault() should be enough. I mean with
> > > the above change we would start to return -EBUSY sometimes in
> > > OOM situations.
> > 
> > Returning -EBUSY is done to differentiate between the case where reclaim
> > is possible, i.e. sgx_active_page_list is *not* empty, but disallowed, and
> > the case where reclaim is impossible, i.e. sgx_active_page_list is empty.
> > If reclaim is impossible then the fault handler should signal SIGSEGV so
> > that processes start dying and/or killing enclaves to free up EPC.
> > 
> > Barring a kernel bug, I don't think it's possible for sgx_active_page_list
> > to be empty when only the driver is supported, but both KVM and EPC cgroup
> > support will introduce (relatively common) scenarios where there are no
> > pages on the active/reclaimable list.  Technically we probably don't need
> > the -EBUSY logic, but my vote is to keep it since it's a nice fallback in
> > case there are kernel bugs.
> 
> OK, my root question is I guess, why want to differentiate those cases?
> Both are as far as I'm concerned situations where there is no memory
> available.
> 
> And now my changes after these patches add yet another case: active
> page list was not empty but nothing could be reclaimed. Is the
> granularity really needed for something here?

Yes.  If there are reclaimable pages, then letting userspace re-fault is
correct as the process can make forward progress.  Restarting userspace
when there are no reclaimable pages will soft hang userspace, i.e. it'll
fault indefinitely.