mbox series

[v2,00/17] v23 updates

Message ID 20190916041417.12533-1-jarkko.sakkinen@linux.intel.com (mailing list archive)
Headers show
Series v23 updates | expand

Message

Jarkko Sakkinen Sept. 16, 2019, 4:14 a.m. UTC
My flush of updates for v23. Contains a bunch of clean ups and bug
fixes with the main focus on the page reclaimer. The main goal has
been to disclose all the other possibilities for failure after
ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
invalidated.

Jarkko Sakkinen (17):
  selftest/x86/sgx: Remove encl_piggy.h
  x86/sgx: Clean up internal includes
  x86/sgx: Write backing storage only if EWB is successful
  x86/sgx: Rename 'j' as 'cnt' in sgx_reclaim_pages()
  x86/sgx: Turn encls_failed() as inline function
  x86/sgx: Move sgx_einit() to encls.c
  x86/sgx: Remove pages in sgx_reclaimer_write()
  x86/sgx: Calculate page index in sgx_reclaimer_write()
  x86/sgx: Move SGX_ENCL_DEAD check to sgx_reclaimer_write()
  x86/sgx: Free VA slot when the EWB flow fails
  x86/sgx: Call sgx_encl_destroy() when the EWB flow fails
  x86/sgx: Open code sgx_reclaimer_get() and sgx_reclaimer_put()
  x86/sgx: Introduce sgx_can_reclaim()
  x86/sgx: Replace section->free_cnt with a global sgx_nr_free_pages
  x86/sgx: sgx_vma_access(): Do not return -ECANCELED on invalid TCS
    pages
  x86/sgx: Introduce sgx_encl_get_backing()
  x86/sgx: Fix pages in the BLOCKED state ending up to the free pool

 arch/x86/kernel/cpu/sgx/driver.c             |   2 +
 arch/x86/kernel/cpu/sgx/driver.h             |   3 -
 arch/x86/kernel/cpu/sgx/encl.c               | 104 +++----
 arch/x86/kernel/cpu/sgx/encl.h               |  23 +-
 arch/x86/kernel/cpu/sgx/encls.c              |  54 +++-
 arch/x86/kernel/cpu/sgx/encls.h              |  23 +-
 arch/x86/kernel/cpu/sgx/ioctl.c              |   5 +-
 arch/x86/kernel/cpu/sgx/main.c               |  64 +---
 arch/x86/kernel/cpu/sgx/reclaim.c            | 299 +++++++++----------
 arch/x86/kernel/cpu/sgx/sgx.h                |   6 +-
 tools/testing/selftests/x86/sgx/encl_piggy.h |  14 -
 tools/testing/selftests/x86/sgx/main.c       |   1 -
 12 files changed, 276 insertions(+), 322 deletions(-)
 delete mode 100644 tools/testing/selftests/x86/sgx/encl_piggy.h

Comments

Jarkko Sakkinen Sept. 16, 2019, 7:58 a.m. UTC | #1
On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> My flush of updates for v23. Contains a bunch of clean ups and bug
> fixes with the main focus on the page reclaimer. The main goal has
> been to disclose all the other possibilities for failure after
> ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> invalidated.

I have at least one more update to the reclaimer but want to merge these
first.

It adds optional struct epc_page **reclaimed_page to
sgx_reclaim_pages(). If NULL, the function will just append everything
to the free pool. Otherwise, it will use it to return one of the
reclaimed pages if there are any.

sgx_alloc_page() then does the following when @reclaim=true:

1. If page in free page pool, take one.
2. If not, try to reclaim one.
3. If nothing was reclaimed -ENOMEM.

Right now sgx_alloc_page() can in theory take however long.

I wonder why we do not return -ENOMEM also when @reclaim=false. Where
did this returning -EBUSY came from? Can't recall.

/Jarkko
Jarkko Sakkinen Sept. 16, 2019, 8:01 a.m. UTC | #2
On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > My flush of updates for v23. Contains a bunch of clean ups and bug
> > fixes with the main focus on the page reclaimer. The main goal has
> > been to disclose all the other possibilities for failure after
> > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > invalidated.
> 
> I have at least one more update to the reclaimer but want to merge these
> first.
> 
> It adds optional struct epc_page **reclaimed_page to
> sgx_reclaim_pages(). If NULL, the function will just append everything
> to the free pool. Otherwise, it will use it to return one of the
> reclaimed pages if there are any.
> 
> sgx_alloc_page() then does the following when @reclaim=true:
> 
> 1. If page in free page pool, take one.
> 2. If not, try to reclaim one.
> 3. If nothing was reclaimed -ENOMEM.
> 
> Right now sgx_alloc_page() can in theory take however long.
> 
> I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> did this returning -EBUSY came from? Can't recall.

Checked. I guess it is just for ELDU flow but does not make sense
otherwise. Tuning sgx_vma_fault() should be enough. I mean with
the above change we would start to return -EBUSY sometimes in
OOM situations.

/Jarkko
Sean Christopherson Sept. 16, 2019, 6:37 p.m. UTC | #3
On Mon, Sep 16, 2019 at 11:01:43AM +0300, Jarkko Sakkinen wrote:
> On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> > On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > > My flush of updates for v23. Contains a bunch of clean ups and bug
> > > fixes with the main focus on the page reclaimer. The main goal has
> > > been to disclose all the other possibilities for failure after
> > > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > > invalidated.
> > 
> > I have at least one more update to the reclaimer but want to merge these
> > first.
> > 
> > It adds optional struct epc_page **reclaimed_page to
> > sgx_reclaim_pages(). If NULL, the function will just append everything
> > to the free pool. Otherwise, it will use it to return one of the
> > reclaimed pages if there are any.
> > 
> > sgx_alloc_page() then does the following when @reclaim=true:
> > 
> > 1. If page in free page pool, take one.
> > 2. If not, try to reclaim one.
> > 3. If nothing was reclaimed -ENOMEM.
> > 
> > Right now sgx_alloc_page() can in theory take however long.
> > 
> > I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> > did this returning -EBUSY came from? Can't recall.
> 
> Checked. I guess it is just for ELDU flow but does not make sense
> otherwise. Tuning sgx_vma_fault() should be enough. I mean with
> the above change we would start to return -EBUSY sometimes in
> OOM situations.

Returning -EBUSY is done to differentiate between the case where reclaim
is possible, i.e. sgx_active_page_list is *not* empty, but disallowed, and
the case where reclaim is impossible, i.e. sgx_active_page_list is empty.
If reclaim is impossible then the fault handler should signal SIGSEGV so
that processes start dying and/or killing enclaves to free up EPC.

Barring a kernel bug, I don't think it's possible for sgx_active_page_list
to be empty when only the driver is supported, but both KVM and EPC cgroup
support will introduce (relatively common) scenarios where there are no
pages on the active/reclaimable list.  Technically we probably don't need
the -EBUSY logic, but my vote is to keep it since it's a nice fallback in
case there are kernel bugs.
Jarkko Sakkinen Sept. 17, 2019, 7:08 p.m. UTC | #4
On Mon, Sep 16, 2019 at 11:37:43AM -0700, Sean Christopherson wrote:
> On Mon, Sep 16, 2019 at 11:01:43AM +0300, Jarkko Sakkinen wrote:
> > On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> > > On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > > > My flush of updates for v23. Contains a bunch of clean ups and bug
> > > > fixes with the main focus on the page reclaimer. The main goal has
> > > > been to disclose all the other possibilities for failure after
> > > > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > > > invalidated.
> > > 
> > > I have at least one more update to the reclaimer but want to merge these
> > > first.
> > > 
> > > It adds optional struct epc_page **reclaimed_page to
> > > sgx_reclaim_pages(). If NULL, the function will just append everything
> > > to the free pool. Otherwise, it will use it to return one of the
> > > reclaimed pages if there are any.
> > > 
> > > sgx_alloc_page() then does the following when @reclaim=true:
> > > 
> > > 1. If page in free page pool, take one.
> > > 2. If not, try to reclaim one.
> > > 3. If nothing was reclaimed -ENOMEM.
> > > 
> > > Right now sgx_alloc_page() can in theory take however long.
> > > 
> > > I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> > > did this returning -EBUSY came from? Can't recall.
> > 
> > Checked. I guess it is just for ELDU flow but does not make sense
> > otherwise. Tuning sgx_vma_fault() should be enough. I mean with
> > the above change we would start to return -EBUSY sometimes in
> > OOM situations.
> 
> Returning -EBUSY is done to differentiate between the case where reclaim
> is possible, i.e. sgx_active_page_list is *not* empty, but disallowed, and
> the case where reclaim is impossible, i.e. sgx_active_page_list is empty.
> If reclaim is impossible then the fault handler should signal SIGSEGV so
> that processes start dying and/or killing enclaves to free up EPC.
> 
> Barring a kernel bug, I don't think it's possible for sgx_active_page_list
> to be empty when only the driver is supported, but both KVM and EPC cgroup
> support will introduce (relatively common) scenarios where there are no
> pages on the active/reclaimable list.  Technically we probably don't need
> the -EBUSY logic, but my vote is to keep it since it's a nice fallback in
> case there are kernel bugs.

OK, my root question is I guess, why want to differentiate those cases?
Both are as far as I'm concerned situations where there is no memory
available.

And now my changes after these patches add yet another case: active
page list was not empty but nothing could be reclaimed. Is the
granularity really needed for something here?

/Jarkko
Sean Christopherson Sept. 17, 2019, 7:27 p.m. UTC | #5
On Tue, Sep 17, 2019 at 10:08:31PM +0300, Jarkko Sakkinen wrote:
> On Mon, Sep 16, 2019 at 11:37:43AM -0700, Sean Christopherson wrote:
> > On Mon, Sep 16, 2019 at 11:01:43AM +0300, Jarkko Sakkinen wrote:
> > > On Mon, Sep 16, 2019 at 10:58:06AM +0300, Jarkko Sakkinen wrote:
> > > > On Mon, Sep 16, 2019 at 07:14:00AM +0300, Jarkko Sakkinen wrote:
> > > > > My flush of updates for v23. Contains a bunch of clean ups and bug
> > > > > fixes with the main focus on the page reclaimer. The main goal has
> > > > > been to disclose all the other possibilities for failure after
> > > > > ENCLS[EBLOCK] other than EPCM conflict when the whole EPC is
> > > > > invalidated.
> > > > 
> > > > I have at least one more update to the reclaimer but want to merge these
> > > > first.
> > > > 
> > > > It adds optional struct epc_page **reclaimed_page to
> > > > sgx_reclaim_pages(). If NULL, the function will just append everything
> > > > to the free pool. Otherwise, it will use it to return one of the
> > > > reclaimed pages if there are any.
> > > > 
> > > > sgx_alloc_page() then does the following when @reclaim=true:
> > > > 
> > > > 1. If page in free page pool, take one.
> > > > 2. If not, try to reclaim one.
> > > > 3. If nothing was reclaimed -ENOMEM.
> > > > 
> > > > Right now sgx_alloc_page() can in theory take however long.
> > > > 
> > > > I wonder why we do not return -ENOMEM also when @reclaim=false. Where
> > > > did this returning -EBUSY came from? Can't recall.
> > > 
> > > Checked. I guess it is just for ELDU flow but does not make sense
> > > otherwise. Tuning sgx_vma_fault() should be enough. I mean with
> > > the above change we would start to return -EBUSY sometimes in
> > > OOM situations.
> > 
> > Returning -EBUSY is done to differentiate between the case where reclaim
> > is possible, i.e. sgx_active_page_list is *not* empty, but disallowed, and
> > the case where reclaim is impossible, i.e. sgx_active_page_list is empty.
> > If reclaim is impossible then the fault handler should signal SIGSEGV so
> > that processes start dying and/or killing enclaves to free up EPC.
> > 
> > Barring a kernel bug, I don't think it's possible for sgx_active_page_list
> > to be empty when only the driver is supported, but both KVM and EPC cgroup
> > support will introduce (relatively common) scenarios where there are no
> > pages on the active/reclaimable list.  Technically we probably don't need
> > the -EBUSY logic, but my vote is to keep it since it's a nice fallback in
> > case there are kernel bugs.
> 
> OK, my root question is I guess, why want to differentiate those cases?
> Both are as far as I'm concerned situations where there is no memory
> available.
> 
> And now my changes after these patches add yet another case: active
> page list was not empty but nothing could be reclaimed. Is the
> granularity really needed for something here?

Yes.  If there are reclaimable pages, then letting userspace re-fault is
correct as the process can make forward progress.  Restarting userspace
when there are no reclaimable pages will soft hang userspace, i.e. it'll
fault indefinitely.