diff mbox series

[05/25] x86/sgx: Introduce runtime protection bits

Message ID 2f6b04dd8949591ee6139072c72eb93da3dd07b0.1638381245.git.reinette.chatre@intel.com (mailing list archive)
State New, archived
Headers show
Series x86/sgx and selftests/sgx: Support SGX2 | expand

Commit Message

Reinette Chatre Dec. 1, 2021, 7:23 p.m. UTC
Enclave creators declare their paging permission intent at the time
the pages are added to the enclave. These paging permissions are
vetted when pages are added to the enclave and stashed off
(in sgx_encl_page->vm_max_prot_bits) for later comparison with
enclave PTEs.

Current permission support assume that enclave page permissions
remain static for the lifetime of the enclave. This is about to change
with the addition of support for SGX2 where the permissions of enclave
pages belonging to an initialized enclave may be changed during the
enclave's lifetime.

Introduce runtime protection bits in preparation for support of
enclave page permission changes. These bits reflect the active
permissions of an enclave page and are not to exceed the maximum
protection bits that passed scrutiny during enclave creation.

Associate runtime protection bits with each enclave page. Initialize
the runtime protection bits to the vetted maximum protection bits
on page creation. Use the runtime protection bits for any access
checks.

struct sgx_encl_page hosting this information is maintained for each
enclave page so the space consumed by the struct is important.
The existing vm_max_prot_bits is already unsigned long while only using
three bits. Transition to a bitfield for the two members containing
protection bits.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c  | 6 +++---
 arch/x86/kernel/cpu/sgx/encl.h  | 3 ++-
 arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
 3 files changed, 11 insertions(+), 4 deletions(-)

Comments

Andy Lutomirski Dec. 3, 2021, 7:28 p.m. UTC | #1
On 12/1/21 11:23, Reinette Chatre wrote:
> Enclave creators declare their paging permission intent at the time
> the pages are added to the enclave. These paging permissions are
> vetted when pages are added to the enclave and stashed off
> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
> enclave PTEs.
> 

I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the 
EPCM permission bits however it likes with no oversight from the kernel. 
  So we end up with a whole bunch of permission masks:

The PTE: controlled by complex kernel policy

The VMA: with your series, this is entirely controlled by userspace.  I 
think I'm fine with that.

vm_max_prot_bits: populated from secinfo at setup time, unless I missed 
something that changes it later.  Maybe I'm confused or missed something 
in one of the patches,

vm_run_prot_bits: populated from some combination of ioctls.  I'm 
entirely lost as to what this is for.

EPCM bits: controlled by the guest.  basically useless for any host 
purpose on SGX2 hardware (with or without kernel support -- the enclave 
can do ENCLU[EMODPE] whether we like it or not, even on old kernels)

So I guess I don't understand the purpose of this patch	or of the rules 
in the later patches, and I feel like this is getting more complicated 
than makes sense.


Could we perhaps make vm_max_prot_bits dynamic or at least controllable 
in some useful way?  My initial proposal (years ago) was for 
vm_max_prot_bits to be *separately* configured at initial load time 
instead of being inferred from secinfo with the intent being that the 
user untrusted runtime would set it appropriately.  I have no problem 
with allowing runtime changes as long as the security policy makes sense 
and it's kept consistent with PTEs.

Also, I think we need a changelog message or, even better, actual docs 
in kernel, explaining the actual final set of rules and invariants for 
all these masks.

--Andy
Reinette Chatre Dec. 3, 2021, 10:12 p.m. UTC | #2
Hi Andy,

On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
> On 12/1/21 11:23, Reinette Chatre wrote:
>> Enclave creators declare their paging permission intent at the time
>> the pages are added to the enclave. These paging permissions are
>> vetted when pages are added to the enclave and stashed off
>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>> enclave PTEs.
>>
> 
> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the 
> EPCM permission bits however it likes with no oversight from the kernel. 
>   So we end up with a whole bunch of permission masks:

Before jumping to the permission masks I would like to step back and 
just confirm the context. We need to consider the following three 
permissions:

EPCM permissions: the enclave page permissions maintained in the SGX 
hardware. The OS is constrained here in that it cannot query the current 
EPCM permissions. Even so, the OS needs to ensure PTEs are installed 
appropriately (we do not want a RW PTE for a read-only enclave page) and 
thus the OS keeps its own record of EPCM permissions to support this.
As you note later even in current kernel the enclave can change these 
permissions without OS knowing. EPCM permissions can only be relaxed 
without the OS knowledge though so the OS record of EPCM permissions can 
only ever be stricter than the actual EPCM permissions.

VMA permissions: Current behavior (not changed in this series) is that 
the OS enforces that a new VMA should have the same or weaker 
permissions than the EPCM permissions.

Page table entries: These should match the EPCM permissions without 
exceeding VMA permissions.

> The PTE: controlled by complex kernel policy
> 
> The VMA: with your series, this is entirely controlled by userspace.  I 
> think I'm fine with that.
> 
> vm_max_prot_bits: populated from secinfo at setup time, unless I missed 
> something that changes it later.  Maybe I'm confused or missed something 
> in one of the patches,

Yes, vm_max_prot_bits is currently and continues to be populated from 
secinfo for pages added before the enclave is initialized and in a later 
patch it would be hardcoded to RW for pages that are added after the 
enclave is initialized.  In the current implementation vm_max_prot_bits 
is the OS's record of the EPCM permissions used to guide VMA and PTE 
permissions.

On a higher level, the implementation decision is that vm_max_prot_bits 
is the static "vetted" permissions of a page - the maximum permissions a 
page is allowed to have during its entire lifetime. This matches the 
current implementation. In the current implementation permissions are 
only able to change via VMA and PTE ... for example a read-only VMA can 
access an enclave page with vm_max_prot_bits of RW. With the SGX2 
support permission changes are allowed to EPCM permissions - but in this 
implementation they are not allowed to exceed the originally vetted 
vm_max_prot_bits.

In this SGX2 implementation an enclave page could thus be added to an 
enclave with secinfo and vm_max_prot_bits of RW that would only allow 
that page to have R or RW permissions (VMA, PTE, and OS view of EPCM 
permissions) in its lifetime, never RX or RWX. Yes, it may be possible 
for the enclave to change the EPCM permissions from within the enclave 
using ENCLU[EMODPE] but to access the page the enclave would need the OS 
to install the appropriate PTE and the OS would not do so if 
vm_max_prot_bits does not allow it. Neither would the OS allow an 
executable VMA.

> 
> vm_run_prot_bits: populated from some combination of ioctls.  I'm 
> entirely lost as to what this is for.

With SGX2 it is possible to change the EPCM permissions of an enclave 
page after the enclave is initialized. vm_max_prot_bits would provide 
guidance to what permissions a page is allowed to have while 
vm_run_prot_bits contains the current view of EPCM permissions used by 
the OS to guide whether requested VMA permissions are allowed and guide 
what PTE permissions should be.

Consider this example how vm_max_prot_bits and vm_run_prot_bits are used:

(1) Add enclave page with secinfo of RW to uninitialized enclave
     vm_max_prot_bits = RW
     vm_run_prot_bits = RW

(2) User space runs SGX_IOC_PAGE_MODP to change the permissions to read-
     only. This is allowed because vm_max_prot_bits = RW. Now:
     vm_max_prot_bits = RW
     vm_run_prot_bits = R

     Now VMAs are created and PTEs installed based on value of
     vm_run_prot_bits - write access will not be allowed.

(3) User space runs SGX_IOC_PAGE_MODP to change the permissions to RX.
     This will be denied because vm_max_prot_bits = RW.

(3) User space runs SGX_IOC_PAGE_MODP to change the permissions to RW.
     This will be allowed because vm_max_prot_bits = RW.

     Now VMAs are created and PTEs installed based on value of
     vm_run_prot_bits - write access will again be allowed.

> 
> EPCM bits: controlled by the guest.  basically useless for any host 
> purpose on SGX2 hardware (with or without kernel support -- the enclave 
> can do ENCLU[EMODPE] whether we like it or not, even on old kernels)

Indeed - permissions can only be relaxed without the OS knowledge so the 
OS's view would always be the same or stricter than the enclave.

> So I guess I don't understand the purpose of this patch    or of the 
> rules in the later patches, and I feel like this is getting more 
> complicated than makes sense.
> 
> 
> Could we perhaps make vm_max_prot_bits dynamic or at least controllable 
> in some useful way?  My initial proposal (years ago) was for 
> vm_max_prot_bits to be *separately* configured at initial load time 
> instead of being inferred from secinfo with the intent being that the 
> user untrusted runtime would set it appropriately.  I have no problem 
> with allowing runtime changes as long as the security policy makes sense 
> and it's kept consistent with PTEs.

This SGX2 enabling indeed builds on the current implementation where 
vm_max_prot_bits is inferred from secinfo. The intent is for 
vm_max_prot_bits to reflect the maximum allowed vetted permissions.

At this time vm_max_prot_bits is indeed static and thus creates the need 
for (dynamic) vm_run_prot_bits that reflects the current EPCM 
permissions and guides VMA and PTE permissions while vm_max_prot_bits 
guides new permission requests. From what I understand this 
implementation follows the current security policy - permissions are 
never allowed to exceed the originally vetted permissions. PTEs are kept 
consistent in that they match the (vetted, OS view of) EPCM permissions.

> Also, I think we need a changelog message or, even better, actual docs 
> in kernel, explaining the actual final set of rules and invariants for 
> all these masks.

I will add a section to Documentation/x86/sgx.rst.

Reinette
Andy Lutomirski Dec. 4, 2021, 12:38 a.m. UTC | #3
On 12/3/21 14:12, Reinette Chatre wrote:
> Hi Andy,
> 
> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>> On 12/1/21 11:23, Reinette Chatre wrote:
>>> Enclave creators declare their paging permission intent at the time
>>> the pages are added to the enclave. These paging permissions are
>>> vetted when pages are added to the enclave and stashed off
>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>> enclave PTEs.
>>>
>>
>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>> the EPCM permission bits however it likes with no oversight from the 
>> kernel.   So we end up with a whole bunch of permission masks:
> 
> Before jumping to the permission masks I would like to step back and 
> just confirm the context. We need to consider the following three 
> permissions:
> 
> EPCM permissions: the enclave page permissions maintained in the SGX 
> hardware. The OS is constrained here in that it cannot query the current 
> EPCM permissions. Even so, the OS needs to ensure PTEs are installed 
> appropriately (we do not want a RW PTE for a read-only enclave page)

Why not?  What's wrong with an RW PTE for a read-only enclave page?

If you convince me that this is actually important, then I'll read all 
the stuff below.
Reinette Chatre Dec. 4, 2021, 1:14 a.m. UTC | #4
Hi Andy,

On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
> On 12/3/21 14:12, Reinette Chatre wrote:
>> Hi Andy,
>>
>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>> Enclave creators declare their paging permission intent at the time
>>>> the pages are added to the enclave. These paging permissions are
>>>> vetted when pages are added to the enclave and stashed off
>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>> enclave PTEs.
>>>>
>>>
>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>>> the EPCM permission bits however it likes with no oversight from the 
>>> kernel.   So we end up with a whole bunch of permission masks:
>>
>> Before jumping to the permission masks I would like to step back and 
>> just confirm the context. We need to consider the following three 
>> permissions:
>>
>> EPCM permissions: the enclave page permissions maintained in the SGX 
>> hardware. The OS is constrained here in that it cannot query the 
>> current EPCM permissions. Even so, the OS needs to ensure PTEs are 
>> installed appropriately (we do not want a RW PTE for a read-only 
>> enclave page)
> 
> Why not?  What's wrong with an RW PTE for a read-only enclave page?
> 
> If you convince me that this is actually important, then I'll read all 
> the stuff below.

Perhaps it is my misunderstanding/misinterpretation of the current 
implementation? From what I understand the current requirement, as 
enforced in the current mmap(), mprotect() as well as fault() hooks, is 
that mappings are required to have identical or weaker permission than 
the enclave permission.

Could you please elaborate how you envision PTEs should be managed in 
this implementation?

Thank you

Reinette
Andy Lutomirski Dec. 4, 2021, 5:56 p.m. UTC | #5
On 12/3/21 17:14, Reinette Chatre wrote:
> Hi Andy,
> 
> On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
>> On 12/3/21 14:12, Reinette Chatre wrote:
>>> Hi Andy,
>>>
>>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>>> Enclave creators declare their paging permission intent at the time
>>>>> the pages are added to the enclave. These paging permissions are
>>>>> vetted when pages are added to the enclave and stashed off
>>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>>> enclave PTEs.
>>>>>
>>>>
>>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>>>> the EPCM permission bits however it likes with no oversight from the 
>>>> kernel.   So we end up with a whole bunch of permission masks:
>>>
>>> Before jumping to the permission masks I would like to step back and 
>>> just confirm the context. We need to consider the following three 
>>> permissions:
>>>
>>> EPCM permissions: the enclave page permissions maintained in the SGX 
>>> hardware. The OS is constrained here in that it cannot query the 
>>> current EPCM permissions. Even so, the OS needs to ensure PTEs are 
>>> installed appropriately (we do not want a RW PTE for a read-only 
>>> enclave page)
>>
>> Why not?  What's wrong with an RW PTE for a read-only enclave page?
>>
>> If you convince me that this is actually important, then I'll read all 
>> the stuff below.
> 
> Perhaps it is my misunderstanding/misinterpretation of the current 
> implementation? From what I understand the current requirement, as 
> enforced in the current mmap(), mprotect() as well as fault() hooks, is 
> that mappings are required to have identical or weaker permission than 
> the enclave permission.

The current implementation does require that, but for a perhaps 
counterintuitive reason.  If a SELinux-restricted (or similarly 
restricted) process that is *not* permitted to do JIT-like things loads 
an enclave, it's entirely okay for it to initialize RW enclave pages 
however it likes and it's entirely okay for it to initialize RX (or XO 
if that ever becomes a thing) enclave pages from appropriately files on 
disk.  But it's not okay for it to create RWX enclave pages or to 
initialize RX enclave pages from untrusted application memory. [0]

So we have a half-baked implementation right now: the permission to 
execute a page is decided based on secinfo (max permissions) when the 
enclave is set up, and it's enforced at the PTE level.  The PTE 
enforcement is because, on SGX2 hardware, the enclave can do EMODPE and 
bypass any supposed restrictions in the EPCM.

The only coupling between EPCM and PTE here is that the max_perm is 
initialized together with EPCM, but it didn't have to be that way.

An SGX2 implementation needs to be more fully baked, because in a 
dynamic environment enclaves need to be able to use EMODPE and actually 
end up with permissions that exceed the initial secinfo permissions.  So 
it needs to be possible to make a page that starts out R (or RW or 
whatever) but nonetheless has max_perm=RWX so that the enclave can use a 
combination of EMODPE and (ioctl-based) EMODPR to do JIT.  So I think 
you should make it possible to set up pages like this, but I see no 
reason to couple the PTE and the EPCM permissions.

> 
> Could you please elaborate how you envision PTEs should be managed in 
> this implementation?

As above: PTE permissions may not exceed max_perm, and EPCM is entirely 
separate except to the extent needed for ABI compatibility with SGX1 
runtimes.


[0] I'm not sure anyone actually has a system set up like this or that 
the necessary LSM support is in the kernel.  But it's supposed to be 
possible without changing the ABI.
Jarkko Sakkinen Dec. 4, 2021, 10:50 p.m. UTC | #6
What about:

"x86/sgx: Add encl_page->vm_run_prot_bits for dynamic permission changes"

On Wed, Dec 01, 2021 at 11:23:03AM -0800, Reinette Chatre wrote:
> Enclave creators declare their paging permission intent at the time
> the pages are added to the enclave. These paging permissions are
> vetted when pages are added to the enclave and stashed off
> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
> enclave PTEs.
> 
> Current permission support assume that enclave page permissions
> remain static for the lifetime of the enclave. This is about to change
> with the addition of support for SGX2 where the permissions of enclave
> pages belonging to an initialized enclave may be changed during the
> enclave's lifetime.
> 
> Introduce runtime protection bits in preparation for support of

By writing "Introduce runtime protection bits", instead of simply "Add
encl_page->vm_run_prot_bits", the only thing you are adding is obfuscation.

Try to refer to the "exact thing", instead of English rephrasing
whenever possible.

> enclave page permission changes. These bits reflect the active
> permissions of an enclave page and are not to exceed the maximum
> protection bits that passed scrutiny during enclave creation.
> 
> Associate runtime protection bits with each enclave page. Initialize
> the runtime protection bits to the vetted maximum protection bits
> on page creation. Use the runtime protection bits for any access
> checks.

I guess the first sentence in this paragraph is completely redundant
as the first sentence of the previous paragraph tells the exact
same story.

> struct sgx_encl_page hosting this information is maintained for each
> enclave page so the space consumed by the struct is important.
> The existing vm_max_prot_bits is already unsigned long while only using
> three bits. Transition to a bitfield for the two members containing
> protection bits.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

So this commit message left the most important thing unanswered,
or I missed it (which happens quite often): why two fields instead
of one? Why vm_max_port_bits needs to stay constant?

It's something that should be clearly documented.

/Jarkko
Reinette Chatre Dec. 4, 2021, 11:55 p.m. UTC | #7
Hi Andy,

On 12/4/2021 9:56 AM, Andy Lutomirski wrote:
> On 12/3/21 17:14, Reinette Chatre wrote:
>> Hi Andy,
>>
>> On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
>>> On 12/3/21 14:12, Reinette Chatre wrote:
>>>> Hi Andy,
>>>>
>>>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>>>> Enclave creators declare their paging permission intent at the time
>>>>>> the pages are added to the enclave. These paging permissions are
>>>>>> vetted when pages are added to the enclave and stashed off
>>>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>>>> enclave PTEs.
>>>>>>
>>>>>
>>>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>>>>> the EPCM permission bits however it likes with no oversight from 
>>>>> the kernel.   So we end up with a whole bunch of permission masks:
>>>>
>>>> Before jumping to the permission masks I would like to step back and 
>>>> just confirm the context. We need to consider the following three 
>>>> permissions:
>>>>
>>>> EPCM permissions: the enclave page permissions maintained in the SGX 
>>>> hardware. The OS is constrained here in that it cannot query the 
>>>> current EPCM permissions. Even so, the OS needs to ensure PTEs are 
>>>> installed appropriately (we do not want a RW PTE for a read-only 
>>>> enclave page)
>>>
>>> Why not?  What's wrong with an RW PTE for a read-only enclave page?
>>>
>>> If you convince me that this is actually important, then I'll read 
>>> all the stuff below.
>>
>> Perhaps it is my misunderstanding/misinterpretation of the current 
>> implementation? From what I understand the current requirement, as 
>> enforced in the current mmap(), mprotect() as well as fault() hooks, 
>> is that mappings are required to have identical or weaker permission 
>> than the enclave permission.
> 
> The current implementation does require that, but for a perhaps 
> counterintuitive reason.  If a SELinux-restricted (or similarly 
> restricted) process that is *not* permitted to do JIT-like things loads 
> an enclave, it's entirely okay for it to initialize RW enclave pages 
> however it likes and it's entirely okay for it to initialize RX (or XO 
> if that ever becomes a thing) enclave pages from appropriately files on 
> disk.  But it's not okay for it to create RWX enclave pages or to 
> initialize RX enclave pages from untrusted application memory. [0]
> 
> So we have a half-baked implementation right now: the permission to 
> execute a page is decided based on secinfo (max permissions) when the 
> enclave is set up, and it's enforced at the PTE level.  The PTE 
> enforcement is because, on SGX2 hardware, the enclave can do EMODPE and 
> bypass any supposed restrictions in the EPCM.
> 
> The only coupling between EPCM and PTE here is that the max_perm is 
> initialized together with EPCM, but it didn't have to be that way.
> 
> An SGX2 implementation needs to be more fully baked, because in a 
> dynamic environment enclaves need to be able to use EMODPE and actually 
> end up with permissions that exceed the initial secinfo permissions.  So 

Could you please elaborate why this is a requirement? In this 
implementation the secinfo of a page added before enclave initialization 
(via SGX_IOC_ENCLAVE_ADD_PAGES) would indicate the maximum permissions 
it may have during its lifetime. Pages needing to be writable and 
executable during their lifetime can be created with RWX secinfo and 
during the enclave runtime the pages could obtain all combinations of 
permissions: RWX, R, RW, RX. A page added with RW secinfo may have R or 
RW permissions during its lifetime but never RX or RWX.

So far our inquiries on whether this is acceptable has been positive and 
is also what Dave attempted to put a spotlight on in:
https://lore.kernel.org/lkml/94d8d631-5345-66c4-52a3-941e52500f84@intel.com/

This above is specific to pages added before enclave initialization. In 
this implementation pages added after enclave initialization, those 
needing the ENCLS[EAUG] SGX2 instruction, are added with max permissions 
of RW so could only have R or RW permissions during their lifetime. This 
is an understood limitation and it is understood that integration with 
user policy is required to support these pages obtaining executable 
permission. The plan is to handle user policy integration in a series 
that follows this core SGX2 enabling.

> it needs to be possible to make a page that starts out R (or RW or 
> whatever) but nonetheless has max_perm=RWX so that the enclave can use a 
> combination of EMODPE and (ioctl-based) EMODPR to do JIT.  So I think 
> you should make it possible to set up pages like this, but I see no 
> reason to couple the PTE and the EPCM permissions.
> 
>>
>> Could you please elaborate how you envision PTEs should be managed in 
>> this implementation?
> 
> As above: PTE permissions may not exceed max_perm, and EPCM is entirely 
> separate except to the extent needed for ABI compatibility with SGX1 
> runtimes.

ok, so if I understand correctly you, since PTE permissions may not 
exceed max_perm and EPCM are separate, this seems to get back to your 
previous question of "What's wrong with an RW PTE for a read-only 
enclave page?"

This is indeed something that we could allow but not doing so (that is 
PTEs not exceeding EPCM permissions) would better support the SGX 
runtime. That is why I separated out the addition of the pfn_mkwrite() 
callback in the previous patch (04/25). Like in your example, there is a 
RW mapping of a read-only enclave page that first results in a RW PTE 
for the read-only enclave page. That would result in a #PF with the SGX 
flag set (0x8007). If the PTE matches the enclave permissions the page 
fault would have familiar 0x7 error code.

In either case user space would encounter a #PF so technically there is 
nothing "wrong" with allowing this - even so, as motivated in the 
previous patch: accurate exception information supports the SGX runtime, 
which is virtually always implemented inside a shared library, by 
providing accurate information in support of its management of the SGX 
enclave.


> [0] I'm not sure anyone actually has a system set up like this or that 
> the necessary LSM support is in the kernel.  But it's supposed to be 
> possible without changing the ABI.
> 

Reinette
Jarkko Sakkinen Dec. 4, 2021, 11:57 p.m. UTC | #8
On Fri, Dec 03, 2021 at 11:28:04AM -0800, Andy Lutomirski wrote:
> On 12/1/21 11:23, Reinette Chatre wrote:
> > Enclave creators declare their paging permission intent at the time
> > the pages are added to the enclave. These paging permissions are
> > vetted when pages are added to the enclave and stashed off
> > (in sgx_encl_page->vm_max_prot_bits) for later comparison with
> > enclave PTEs.
> > 
> 
> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the EPCM
> permission bits however it likes with no oversight from the kernel.  So we
> end up with a whole bunch of permission masks:
> 
> The PTE: controlled by complex kernel policy
> 
> The VMA: with your series, this is entirely controlled by userspace.  I
> think I'm fine with that.
> 
> vm_max_prot_bits: populated from secinfo at setup time, unless I missed
> something that changes it later.  Maybe I'm confused or missed something in
> one of the patches,
> 
> vm_run_prot_bits: populated from some combination of ioctls.  I'm entirely
> lost as to what this is for.
> 
> EPCM bits: controlled by the guest.  basically useless for any host purpose
> on SGX2 hardware (with or without kernel support -- the enclave can do
> ENCLU[EMODPE] whether we like it or not, even on old kernels)
> 
> So I guess I don't understand the purpose of this patch	or of the rules in
> the later patches, and I feel like this is getting more complicated than
> makes sense.
> 
> 
> Could we perhaps make vm_max_prot_bits dynamic or at least controllable in
> some useful way?  My initial proposal (years ago) was for vm_max_prot_bits
> to be *separately* configured at initial load time instead of being inferred
> from secinfo with the intent being that the user untrusted runtime would set
> it appropriately.  I have no problem with allowing runtime changes as long
> as the security policy makes sense and it's kept consistent with PTEs.

This is a valid question. Since EMODPE exists why not just make things for
EMODPE, and ignore EMODPR altogether?

> Also, I think we need a changelog message or, even better, actual docs in
> kernel, explaining the actual final set of rules and invariants for all
> these masks.
> 
> --Andy

/Jarkko
Reinette Chatre Dec. 6, 2021, 9:20 p.m. UTC | #9
Hi Jarkko,

On 12/4/2021 3:57 PM, Jarkko Sakkinen wrote:
> On Fri, Dec 03, 2021 at 11:28:04AM -0800, Andy Lutomirski wrote:
>> On 12/1/21 11:23, Reinette Chatre wrote:
>>> Enclave creators declare their paging permission intent at the time
>>> the pages are added to the enclave. These paging permissions are
>>> vetted when pages are added to the enclave and stashed off
>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>> enclave PTEs.
>>>
>>
>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the EPCM
>> permission bits however it likes with no oversight from the kernel.  So we
>> end up with a whole bunch of permission masks:
>>
>> The PTE: controlled by complex kernel policy
>>
>> The VMA: with your series, this is entirely controlled by userspace.  I
>> think I'm fine with that.
>>
>> vm_max_prot_bits: populated from secinfo at setup time, unless I missed
>> something that changes it later.  Maybe I'm confused or missed something in
>> one of the patches,
>>
>> vm_run_prot_bits: populated from some combination of ioctls.  I'm entirely
>> lost as to what this is for.
>>
>> EPCM bits: controlled by the guest.  basically useless for any host purpose
>> on SGX2 hardware (with or without kernel support -- the enclave can do
>> ENCLU[EMODPE] whether we like it or not, even on old kernels)
>>
>> So I guess I don't understand the purpose of this patch	or of the rules in
>> the later patches, and I feel like this is getting more complicated than
>> makes sense.
>>
>>
>> Could we perhaps make vm_max_prot_bits dynamic or at least controllable in
>> some useful way?  My initial proposal (years ago) was for vm_max_prot_bits
>> to be *separately* configured at initial load time instead of being inferred
>> from secinfo with the intent being that the user untrusted runtime would set
>> it appropriately.  I have no problem with allowing runtime changes as long
>> as the security policy makes sense and it's kept consistent with PTEs.
> 
> This is a valid question. Since EMODPE exists why not just make things for
> EMODPE, and ignore EMODPR altogether?
> 

I believe that we should support the best practice of principle of least 
privilege - once a page no longer needs a particular permission there 
should be a way to remove it (the unneeded permission).

Reinette
Reinette Chatre Dec. 6, 2021, 9:28 p.m. UTC | #10
Hi Jarkko,

On 12/4/2021 2:50 PM, Jarkko Sakkinen wrote:
> What about:
> 
> "x86/sgx: Add encl_page->vm_run_prot_bits for dynamic permission changes"

Sure.

> 
> On Wed, Dec 01, 2021 at 11:23:03AM -0800, Reinette Chatre wrote:
>> Enclave creators declare their paging permission intent at the time
>> the pages are added to the enclave. These paging permissions are
>> vetted when pages are added to the enclave and stashed off
>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>> enclave PTEs.
>>
>> Current permission support assume that enclave page permissions
>> remain static for the lifetime of the enclave. This is about to change
>> with the addition of support for SGX2 where the permissions of enclave
>> pages belonging to an initialized enclave may be changed during the
>> enclave's lifetime.
>>
>> Introduce runtime protection bits in preparation for support of
> 
> By writing "Introduce runtime protection bits", instead of simply "Add
> encl_page->vm_run_prot_bits", the only thing you are adding is obfuscation.
> 
> Try to refer to the "exact thing", instead of English rephrasing
> whenever possible.
> 
>> enclave page permission changes. These bits reflect the active
>> permissions of an enclave page and are not to exceed the maximum
>> protection bits that passed scrutiny during enclave creation.
>>
>> Associate runtime protection bits with each enclave page. Initialize
>> the runtime protection bits to the vetted maximum protection bits
>> on page creation. Use the runtime protection bits for any access
>> checks.
> 
> I guess the first sentence in this paragraph is completely redundant
> as the first sentence of the previous paragraph tells the exact
> same story.

The previous paragraph introduces what these bits are and what they mean 
and the second describes how they are used. I can merge the paragraphs.

> 
>> struct sgx_encl_page hosting this information is maintained for each
>> enclave page so the space consumed by the struct is important.
>> The existing vm_max_prot_bits is already unsigned long while only using
>> three bits. Transition to a bitfield for the two members containing
>> protection bits.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> 
> So this commit message left the most important thing unanswered,
> or I missed it (which happens quite often): why two fields instead
> of one? Why vm_max_port_bits needs to stay constant?
> 
> It's something that should be clearly documented.

vm_max_prot_bits is the vetted EPCM permissions an enclave is allowed to 
have (for EADDed pages it is the value from secinfo). Permissions can be 
changed using SGX2 but they should never exceed vm_max_prot_bits.
vm_run_prot_bits reflects the current (from OS perspective) active EPCM 
permissions and replaces the current usages of vm_max_prot_bits in 
runtime (VMA and PTE) permission checks.

Consider this example how vm_max_prot_bits and vm_run_prot_bits are used:

(1) Add enclave page with secinfo of RW to uninitialized enclave
     vm_max_prot_bits = RW
     vm_run_prot_bits = RW

(2) User space runs SGX_IOC_PAGE_MODP (renamed
     to SGX_IOC_ENCLAVE_MOD_PROTECTIONS) to change the permissions to
     read-only. This is allowed because vm_max_prot_bits = RW. Now:
     vm_max_prot_bits = RW
     vm_run_prot_bits = R

     At this point only new read-only VMAs would be allowed to access
     this page and PTEs would not allow write access ... this is guided
     by vm_run_prot_bits.

(3) User space runs SGX_IOC_PAGE_MODP (renamed
     to SGX_IOC_ENCLAVE_MOD_PROTECTIONS) to change the permissions to RX.
     This will be denied because vm_max_prot_bits = RW.

(3) User space runs SGX_IOC_PAGE_MODP (renamed
     to SGX_IOC_ENCLAVE_MOD_PROTECTIONS) to change the permissions to RW.
     This will be allowed because vm_max_prot_bits = RW.

If this is helpful I can add this example to the changelog.

Reinette
Jarkko Sakkinen Dec. 11, 2021, 7:42 a.m. UTC | #11
On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
> > This is a valid question. Since EMODPE exists why not just make things for
> > EMODPE, and ignore EMODPR altogether?
> > 
> 
> I believe that we should support the best practice of principle of least 
> privilege - once a page no longer needs a particular permission there 
> should be a way to remove it (the unneeded permission).

What if EMODPR was not used at all, since EMODPE is there anyway?

This could be achieved e.g. by having ioctl to change protection
bits in encl->page_tree.

This would simplify things a lot given that there would be only
two, instead of three, EACCEPT code paths.

/Jarkko
Reinette Chatre Dec. 13, 2021, 10:10 p.m. UTC | #12
Hi Jarkko,

On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
>>> This is a valid question. Since EMODPE exists why not just make things for
>>> EMODPE, and ignore EMODPR altogether?
>>>
>>
>> I believe that we should support the best practice of principle of least
>> privilege - once a page no longer needs a particular permission there
>> should be a way to remove it (the unneeded permission).
> 
> What if EMODPR was not used at all, since EMODPE is there anyway?

EMODPR and EMODPE are not equivalent.

EMODPE can only be used to "extend"/relax permissions while EMODPR can 
only be used to restrict permissions.

Notice in the EMODPE instruction reference of the SDM:

(* Update EPCM permissions *)
EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;

So, when using EMODPE it is only possible to add permissions, not remove 
permissions.

If a user wants to remove permissions from an EPCM page it is only 
possible when using EMODPR. Notice in its instruction reference found in 
the SDM how it in turn can only be used to restrict permissions:

(* Update EPCM permissions *)
EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;

> This could be achieved e.g. by having ioctl to change protection
> bits in encl->page_tree.
> 
> This would simplify things a lot given that there would be only
> two, instead of three, EACCEPT code paths.

Reinette
Reinette Chatre Dec. 13, 2021, 10:34 p.m. UTC | #13
Hi Andy,

I would like to check in if you had some time to digest my responses 
with a few high level questions below ...

On 12/4/2021 3:55 PM, Reinette Chatre wrote:
> On 12/4/2021 9:56 AM, Andy Lutomirski wrote:
>> On 12/3/21 17:14, Reinette Chatre wrote:
>>> On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
>>>> On 12/3/21 14:12, Reinette Chatre wrote:
>>>>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>>>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>>>>> Enclave creators declare their paging permission intent at the time
>>>>>>> the pages are added to the enclave. These paging permissions are
>>>>>>> vetted when pages are added to the enclave and stashed off
>>>>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>>>>> enclave PTEs.
>>>>>>>
>>>>>>
>>>>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to 
>>>>>> change the EPCM permission bits however it likes with no oversight 
>>>>>> from the kernel.   So we end up with a whole bunch of permission 
>>>>>> masks:
>>>>>
>>>>> Before jumping to the permission masks I would like to step back 
>>>>> and just confirm the context. We need to consider the following 
>>>>> three permissions:
>>>>>
>>>>> EPCM permissions: the enclave page permissions maintained in the 
>>>>> SGX hardware. The OS is constrained here in that it cannot query 
>>>>> the current EPCM permissions. Even so, the OS needs to ensure PTEs 
>>>>> are installed appropriately (we do not want a RW PTE for a 
>>>>> read-only enclave page)
>>>>
>>>> Why not?  What's wrong with an RW PTE for a read-only enclave page?
>>>>
>>>> If you convince me that this is actually important, then I'll read 
>>>> all the stuff below.
>>>
>>> Perhaps it is my misunderstanding/misinterpretation of the current 
>>> implementation? From what I understand the current requirement, as 
>>> enforced in the current mmap(), mprotect() as well as fault() hooks, 
>>> is that mappings are required to have identical or weaker permission 
>>> than the enclave permission.
>>
>> The current implementation does require that, but for a perhaps 
>> counterintuitive reason.  If a SELinux-restricted (or similarly 
>> restricted) process that is *not* permitted to do JIT-like things 
>> loads an enclave, it's entirely okay for it to initialize RW enclave 
>> pages however it likes and it's entirely okay for it to initialize RX 
>> (or XO if that ever becomes a thing) enclave pages from appropriately 
>> files on disk.  But it's not okay for it to create RWX enclave pages 
>> or to initialize RX enclave pages from untrusted application memory. [0]
>>
>> So we have a half-baked implementation right now: the permission to 
>> execute a page is decided based on secinfo (max permissions) when the 
>> enclave is set up, and it's enforced at the PTE level.  The PTE 
>> enforcement is because, on SGX2 hardware, the enclave can do EMODPE 
>> and bypass any supposed restrictions in the EPCM.
>>
>> The only coupling between EPCM and PTE here is that the max_perm is 
>> initialized together with EPCM, but it didn't have to be that way.
>>
>> An SGX2 implementation needs to be more fully baked, because in a 
>> dynamic environment enclaves need to be able to use EMODPE and 
>> actually end up with permissions that exceed the initial secinfo 
>> permissions.  So 
> 
> Could you please elaborate why this is a requirement? In this 
> implementation the secinfo of a page added before enclave initialization 
> (via SGX_IOC_ENCLAVE_ADD_PAGES) would indicate the maximum permissions 
> it may have during its lifetime. Pages needing to be writable and 
> executable during their lifetime can be created with RWX secinfo and 
> during the enclave runtime the pages could obtain all combinations of 
> permissions: RWX, R, RW, RX. A page added with RW secinfo may have R or 
> RW permissions during its lifetime but never RX or RWX.
> 
> So far our inquiries on whether this is acceptable has been positive and 
> is also what Dave attempted to put a spotlight on in:
> https://lore.kernel.org/lkml/94d8d631-5345-66c4-52a3-941e52500f84@intel.com/ 
> 
> 
> This above is specific to pages added before enclave initialization. In 
> this implementation pages added after enclave initialization, those 
> needing the ENCLS[EAUG] SGX2 instruction, are added with max permissions 
> of RW so could only have R or RW permissions during their lifetime. This 
> is an understood limitation and it is understood that integration with 
> user policy is required to support these pages obtaining executable 
> permission. The plan is to handle user policy integration in a series 
> that follows this core SGX2 enabling.

Are you ok with the strategy to support modification of enclave page 
permissions?

> 
>> it needs to be possible to make a page that starts out R (or RW or 
>> whatever) but nonetheless has max_perm=RWX so that the enclave can use 
>> a combination of EMODPE and (ioctl-based) EMODPR to do JIT.  So I 
>> think you should make it possible to set up pages like this, but I see 
>> no reason to couple the PTE and the EPCM permissions.
>>
>>>
>>> Could you please elaborate how you envision PTEs should be managed in 
>>> this implementation?
>>
>> As above: PTE permissions may not exceed max_perm, and EPCM is 
>> entirely separate except to the extent needed for ABI compatibility 
>> with SGX1 runtimes.
> 
> ok, so if I understand correctly you, since PTE permissions may not 
> exceed max_perm and EPCM are separate, this seems to get back to your 
> previous question of "What's wrong with an RW PTE for a read-only 
> enclave page?"
> 
> This is indeed something that we could allow but not doing so (that is 
> PTEs not exceeding EPCM permissions) would better support the SGX 
> runtime. That is why I separated out the addition of the pfn_mkwrite() 
> callback in the previous patch (04/25). Like in your example, there is a 
> RW mapping of a read-only enclave page that first results in a RW PTE 
> for the read-only enclave page. That would result in a #PF with the SGX 
> flag set (0x8007). If the PTE matches the enclave permissions the page 
> fault would have familiar 0x7 error code.
> 
> In either case user space would encounter a #PF so technically there is 
> nothing "wrong" with allowing this - even so, as motivated in the 
> previous patch: accurate exception information supports the SGX runtime, 
> which is virtually always implemented inside a shared library, by 
> providing accurate information in support of its management of the SGX 
> enclave.

Are you ok with managing PTEs in this way? It matches your requirement 
that PTE permissions may not exceed max_perm and ABI is compatible with 
SGX1. Additionally, PTEs are not allowed to exceed EPCM permissions, 
which is not an ABI change since it was not a consideration during SGX1 
where EPCM permissions could not change.


>> [0] I'm not sure anyone actually has a system set up like this or that 
>> the necessary LSM support is in the kernel.  But it's supposed to be 
>> possible without changing the ABI.
>>

Thank you very much

Reinette
Jarkko Sakkinen Dec. 28, 2021, 2:52 p.m. UTC | #14
On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
> > > > This is a valid question. Since EMODPE exists why not just make things for
> > > > EMODPE, and ignore EMODPR altogether?
> > > > 
> > > 
> > > I believe that we should support the best practice of principle of least
> > > privilege - once a page no longer needs a particular permission there
> > > should be a way to remove it (the unneeded permission).
> > 
> > What if EMODPR was not used at all, since EMODPE is there anyway?
> 
> EMODPR and EMODPE are not equivalent.
> 
> EMODPE can only be used to "extend"/relax permissions while EMODPR can only
> be used to restrict permissions.
> 
> Notice in the EMODPE instruction reference of the SDM:
> 
> (* Update EPCM permissions *)
> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
> 
> So, when using EMODPE it is only possible to add permissions, not remove
> permissions.
> 
> If a user wants to remove permissions from an EPCM page it is only possible
> when using EMODPR. Notice in its instruction reference found in the SDM how
> it in turn can only be used to restrict permissions:
> 
> (* Update EPCM permissions *)
> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;

OK, so the question is: do we need both or would a mechanism just to extend
permissions be sufficient?

/Jarkko
Reinette Chatre Jan. 6, 2022, 5:46 p.m. UTC | #15
Hi Jarkko,

On 12/28/2021 6:52 AM, Jarkko Sakkinen wrote:
> On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
>>> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
>>>>> This is a valid question. Since EMODPE exists why not just make things for
>>>>> EMODPE, and ignore EMODPR altogether?
>>>>>
>>>>
>>>> I believe that we should support the best practice of principle of least
>>>> privilege - once a page no longer needs a particular permission there
>>>> should be a way to remove it (the unneeded permission).
>>>
>>> What if EMODPR was not used at all, since EMODPE is there anyway?
>>
>> EMODPR and EMODPE are not equivalent.
>>
>> EMODPE can only be used to "extend"/relax permissions while EMODPR can only
>> be used to restrict permissions.
>>
>> Notice in the EMODPE instruction reference of the SDM:
>>
>> (* Update EPCM permissions *)
>> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
>> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
>> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
>>
>> So, when using EMODPE it is only possible to add permissions, not remove
>> permissions.
>>
>> If a user wants to remove permissions from an EPCM page it is only possible
>> when using EMODPR. Notice in its instruction reference found in the SDM how
>> it in turn can only be used to restrict permissions:
>>
>> (* Update EPCM permissions *)
>> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
>> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
>> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
> 
> OK, so the question is: do we need both or would a mechanism just to extend
> permissions be sufficient?

I do believe that we need both in order to support pages having only
the permissions required to support their intended use during the time the
particular access is required. While technically it is possible to grant
pages all permissions they may need during their lifetime it is safer to
remove permissions when no longer required.

Reinette
Jarkko Sakkinen Jan. 7, 2022, 12:16 p.m. UTC | #16
On Thu, Jan 06, 2022 at 09:46:06AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/28/2021 6:52 AM, Jarkko Sakkinen wrote:
> > On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
> >>> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
> >>>>> This is a valid question. Since EMODPE exists why not just make things for
> >>>>> EMODPE, and ignore EMODPR altogether?
> >>>>>
> >>>>
> >>>> I believe that we should support the best practice of principle of least
> >>>> privilege - once a page no longer needs a particular permission there
> >>>> should be a way to remove it (the unneeded permission).
> >>>
> >>> What if EMODPR was not used at all, since EMODPE is there anyway?
> >>
> >> EMODPR and EMODPE are not equivalent.
> >>
> >> EMODPE can only be used to "extend"/relax permissions while EMODPR can only
> >> be used to restrict permissions.
> >>
> >> Notice in the EMODPE instruction reference of the SDM:
> >>
> >> (* Update EPCM permissions *)
> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
> >>
> >> So, when using EMODPE it is only possible to add permissions, not remove
> >> permissions.
> >>
> >> If a user wants to remove permissions from an EPCM page it is only possible
> >> when using EMODPR. Notice in its instruction reference found in the SDM how
> >> it in turn can only be used to restrict permissions:
> >>
> >> (* Update EPCM permissions *)
> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
> > 
> > OK, so the question is: do we need both or would a mechanism just to extend
> > permissions be sufficient?
> 
> I do believe that we need both in order to support pages having only
> the permissions required to support their intended use during the time the
> particular access is required. While technically it is possible to grant
> pages all permissions they may need during their lifetime it is safer to
> remove permissions when no longer required.

So if we imagine a run-time: how EMODPR would be useful, and how using it
would make things safer?

/Jarkko
Haitao Huang Jan. 7, 2022, 4:14 p.m. UTC | #17
On Fri, 07 Jan 2022 06:16:21 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Thu, Jan 06, 2022 at 09:46:06AM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/28/2021 6:52 AM, Jarkko Sakkinen wrote:
>> > On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
>> >> Hi Jarkko,
>> >>
>> >> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
>> >>> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
>> >>>>> This is a valid question. Since EMODPE exists why not just make  
>> things for
>> >>>>> EMODPE, and ignore EMODPR altogether?
>> >>>>>
>> >>>>
>> >>>> I believe that we should support the best practice of principle of  
>> least
>> >>>> privilege - once a page no longer needs a particular permission  
>> there
>> >>>> should be a way to remove it (the unneeded permission).
>> >>>
>> >>> What if EMODPR was not used at all, since EMODPE is there anyway?
>> >>
>> >> EMODPR and EMODPE are not equivalent.
>> >>
>> >> EMODPE can only be used to "extend"/relax permissions while EMODPR  
>> can only
>> >> be used to restrict permissions.
>> >>
>> >> Notice in the EMODPE instruction reference of the SDM:
>> >>
>> >> (* Update EPCM permissions *)
>> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
>> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
>> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
>> >>
>> >> So, when using EMODPE it is only possible to add permissions, not  
>> remove
>> >> permissions.
>> >>
>> >> If a user wants to remove permissions from an EPCM page it is only  
>> possible
>> >> when using EMODPR. Notice in its instruction reference found in the  
>> SDM how
>> >> it in turn can only be used to restrict permissions:
>> >>
>> >> (* Update EPCM permissions *)
>> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
>> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
>> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
>> >
>> > OK, so the question is: do we need both or would a mechanism just to  
>> extend
>> > permissions be sufficient?
>>
>> I do believe that we need both in order to support pages having only
>> the permissions required to support their intended use during the time  
>> the
>> particular access is required. While technically it is possible to grant
>> pages all permissions they may need during their lifetime it is safer to
>> remove permissions when no longer required.
>
> So if we imagine a run-time: how EMODPR would be useful, and how using it
> would make things safer?
>
In scenarios of JIT compilers, once code is generated into RW pages,  
modifying both PTE and EPCM permissions to RX would be a good defensive  
measure. In that case, EMODPR is useful.

Haitao
Jarkko Sakkinen Jan. 8, 2022, 3:45 p.m. UTC | #18
On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > OK, so the question is: do we need both or would a mechanism just
> > > to extend
> > > > permissions be sufficient?
> > > 
> > > I do believe that we need both in order to support pages having only
> > > the permissions required to support their intended use during the
> > > time the
> > > particular access is required. While technically it is possible to grant
> > > pages all permissions they may need during their lifetime it is safer to
> > > remove permissions when no longer required.
> > 
> > So if we imagine a run-time: how EMODPR would be useful, and how using it
> > would make things safer?
> > 
> In scenarios of JIT compilers, once code is generated into RW pages,
> modifying both PTE and EPCM permissions to RX would be a good defensive
> measure. In that case, EMODPR is useful.

What is the exact threat we are talking about?

/Jarkko
Jarkko Sakkinen Jan. 8, 2022, 3:51 p.m. UTC | #19
On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > OK, so the question is: do we need both or would a mechanism just
> > > > to extend
> > > > > permissions be sufficient?
> > > > 
> > > > I do believe that we need both in order to support pages having only
> > > > the permissions required to support their intended use during the
> > > > time the
> > > > particular access is required. While technically it is possible to grant
> > > > pages all permissions they may need during their lifetime it is safer to
> > > > remove permissions when no longer required.
> > > 
> > > So if we imagine a run-time: how EMODPR would be useful, and how using it
> > > would make things safer?
> > > 
> > In scenarios of JIT compilers, once code is generated into RW pages,
> > modifying both PTE and EPCM permissions to RX would be a good defensive
> > measure. In that case, EMODPR is useful.
> 
> What is the exact threat we are talking about?

To add: it should be *significantly* critical thread, given that not
supporting only EAUG would leave us only one complex call pattern with
EACCEPT involvement.

I'd even go to suggest to leave EMODPR out of the patch set, and introduce
it when there is PoC code for any of the existing run-time that
demonstrates the demand for it. Right now this way too speculative.

Supporting EMODPE is IMHO by factors more critical.

/Jarkko
Jarkko Sakkinen Jan. 8, 2022, 4:22 p.m. UTC | #20
On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > OK, so the question is: do we need both or would a mechanism just
> > > > > to extend
> > > > > > permissions be sufficient?
> > > > > 
> > > > > I do believe that we need both in order to support pages having only
> > > > > the permissions required to support their intended use during the
> > > > > time the
> > > > > particular access is required. While technically it is possible to grant
> > > > > pages all permissions they may need during their lifetime it is safer to
> > > > > remove permissions when no longer required.
> > > > 
> > > > So if we imagine a run-time: how EMODPR would be useful, and how using it
> > > > would make things safer?
> > > > 
> > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > modifying both PTE and EPCM permissions to RX would be a good defensive
> > > measure. In that case, EMODPR is useful.
> > 
> > What is the exact threat we are talking about?
> 
> To add: it should be *significantly* critical thread, given that not
> supporting only EAUG would leave us only one complex call pattern with
> EACCEPT involvement.
> 
> I'd even go to suggest to leave EMODPR out of the patch set, and introduce
> it when there is PoC code for any of the existing run-time that
> demonstrates the demand for it. Right now this way too speculative.
> 
> Supporting EMODPE is IMHO by factors more critical.

At least it does not protected against enclave code because an enclave can
always choose not to EACCEPT any of the EMODPR requests. I'm not only
confused here about the actual threat but also the potential adversary and
target.

/Jarkko
Haitao Huang Jan. 10, 2022, 10:05 p.m. UTC | #21
On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>> > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>> > > > > > OK, so the question is: do we need both or would a mechanism  
>> just
>> > > > > to extend
>> > > > > > permissions be sufficient?
>> > > > >
>> > > > > I do believe that we need both in order to support pages having  
>> only
>> > > > > the permissions required to support their intended use during  
>> the
>> > > > > time the
>> > > > > particular access is required. While technically it is possible  
>> to grant
>> > > > > pages all permissions they may need during their lifetime it is  
>> safer to
>> > > > > remove permissions when no longer required.
>> > > >
>> > > > So if we imagine a run-time: how EMODPR would be useful, and how  
>> using it
>> > > > would make things safer?
>> > > >
>> > > In scenarios of JIT compilers, once code is generated into RW pages,
>> > > modifying both PTE and EPCM permissions to RX would be a good  
>> defensive
>> > > measure. In that case, EMODPR is useful.
>> >
>> > What is the exact threat we are talking about?
>>
>> To add: it should be *significantly* critical thread, given that not
>> supporting only EAUG would leave us only one complex call pattern with
>> EACCEPT involvement.
>>
>> I'd even go to suggest to leave EMODPR out of the patch set, and  
>> introduce
>> it when there is PoC code for any of the existing run-time that
>> demonstrates the demand for it. Right now this way too speculative.
>>
>> Supporting EMODPE is IMHO by factors more critical.
>
> At least it does not protected against enclave code because an enclave  
> can
> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> confused here about the actual threat but also the potential adversary  
> and
> target.
>
I'm not sure I follow your thoughts here. The sequence should be for  
enclave to request  EMODPR in the first place through runtime to kernel,  
then to verify with EACCEPT that the OS indeed has done EMODPR.
If enclave does not verify with EACCEPT, then its own code has  
vulnerability. But this does not justify OS not providing the mechanism to  
request EMODPR.

Similar to how we don't want have RWX code pages for normal Linux  
application, when an enclave loads code pages (either directly or JIT  
compiled from high level code ) into EAUG'd page (which has RW), we do not  
want leave pages to be RWX for code to be executable, hence the need of  
EMODPR request OS to reduce the permissions to RX once the code is ready  
to execute.

I believe this is needed for LibOS runtimes (e.g.,Gramine) loading  
unmodified app binaries, or an enclave with JIT compiler (I think Enarx in  
this category?). Experts from those project can confirm or contradict.  
Intel SDK currently also has implementation to reduce permissions of RELRO  
sections in ELF binaries to ReadOnly after relocation is done. In our new  
EDMM user support[1] based on this patch series, we also support flows to  
reduce permissions using EMODPR in a generic way.

[1]https://github.com/intel/linux-sgx/pull/751
Jarkko Sakkinen Jan. 11, 2022, 1:53 a.m. UTC | #22
On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > OK, so the question is: do we need both or would a
> > > mechanism just
> > > > > > > to extend
> > > > > > > > permissions be sufficient?
> > > > > > >
> > > > > > > I do believe that we need both in order to support pages
> > > having only
> > > > > > > the permissions required to support their intended use
> > > during the
> > > > > > > time the
> > > > > > > particular access is required. While technically it is
> > > possible to grant
> > > > > > > pages all permissions they may need during their lifetime it
> > > is safer to
> > > > > > > remove permissions when no longer required.
> > > > > >
> > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > how using it
> > > > > > would make things safer?
> > > > > >
> > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > defensive
> > > > > measure. In that case, EMODPR is useful.
> > > >
> > > > What is the exact threat we are talking about?
> > > 
> > > To add: it should be *significantly* critical thread, given that not
> > > supporting only EAUG would leave us only one complex call pattern with
> > > EACCEPT involvement.
> > > 
> > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > introduce
> > > it when there is PoC code for any of the existing run-time that
> > > demonstrates the demand for it. Right now this way too speculative.
> > > 
> > > Supporting EMODPE is IMHO by factors more critical.
> > 
> > At least it does not protected against enclave code because an enclave
> > can
> > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > confused here about the actual threat but also the potential adversary
> > and
> > target.
> > 
> I'm not sure I follow your thoughts here. The sequence should be for enclave
> to request  EMODPR in the first place through runtime to kernel, then to
> verify with EACCEPT that the OS indeed has done EMODPR.
> If enclave does not verify with EACCEPT, then its own code has
> vulnerability. But this does not justify OS not providing the mechanism to
> request EMODPR.

The question is really simple: what is the threat scenario? In order to use
the word "vulnerability", you would need one.

Given the complexity of the whole dance with EMODPR it is mandatory to have
one, in order to ack it to the mainline.

> Similar to how we don't want have RWX code pages for normal Linux
> application, when an enclave loads code pages (either directly or JIT
> compiled from high level code ) into EAUG'd page (which has RW), we do not
> want leave pages to be RWX for code to be executable, hence the need of
> EMODPR request OS to reduce the permissions to RX once the code is ready to
> execute.

You cannot compare *enforced* permissions outside the enclave, and claim that
they would be equivalent to the permissions of the already sandboxed code
inside the enclave, with permissions that are not enforced but are based
on good will of the enclave code.

/Jarkko
Jarkko Sakkinen Jan. 11, 2022, 1:55 a.m. UTC | #23
On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > wrote:
> > 
> > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > > OK, so the question is: do we need both or would a
> > > > mechanism just
> > > > > > > > to extend
> > > > > > > > > permissions be sufficient?
> > > > > > > >
> > > > > > > > I do believe that we need both in order to support pages
> > > > having only
> > > > > > > > the permissions required to support their intended use
> > > > during the
> > > > > > > > time the
> > > > > > > > particular access is required. While technically it is
> > > > possible to grant
> > > > > > > > pages all permissions they may need during their lifetime it
> > > > is safer to
> > > > > > > > remove permissions when no longer required.
> > > > > > >
> > > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > > how using it
> > > > > > > would make things safer?
> > > > > > >
> > > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > > defensive
> > > > > > measure. In that case, EMODPR is useful.
> > > > >
> > > > > What is the exact threat we are talking about?
> > > > 
> > > > To add: it should be *significantly* critical thread, given that not
> > > > supporting only EAUG would leave us only one complex call pattern with
> > > > EACCEPT involvement.
> > > > 
> > > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > introduce
> > > > it when there is PoC code for any of the existing run-time that
> > > > demonstrates the demand for it. Right now this way too speculative.
> > > > 
> > > > Supporting EMODPE is IMHO by factors more critical.
> > > 
> > > At least it does not protected against enclave code because an enclave
> > > can
> > > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > confused here about the actual threat but also the potential adversary
> > > and
> > > target.
> > > 
> > I'm not sure I follow your thoughts here. The sequence should be for enclave
> > to request  EMODPR in the first place through runtime to kernel, then to
> > verify with EACCEPT that the OS indeed has done EMODPR.
> > If enclave does not verify with EACCEPT, then its own code has
> > vulnerability. But this does not justify OS not providing the mechanism to
> > request EMODPR.
> 
> The question is really simple: what is the threat scenario? In order to use
> the word "vulnerability", you would need one.
> 
> Given the complexity of the whole dance with EMODPR it is mandatory to have
> one, in order to ack it to the mainline.
> 
> > Similar to how we don't want have RWX code pages for normal Linux
> > application, when an enclave loads code pages (either directly or JIT
> > compiled from high level code ) into EAUG'd page (which has RW), we do not
> > want leave pages to be RWX for code to be executable, hence the need of
> > EMODPR request OS to reduce the permissions to RX once the code is ready to
> > execute.
> 
> You cannot compare *enforced* permissions outside the enclave, and claim that
> they would be equivalent to the permissions of the already sandboxed code
> inside the enclave, with permissions that are not enforced but are based
> on good will of the enclave code.

To add, you can already do "EMODPR" by simply adjusting VMA permissions to be
more restrictive. How this would be worse than this collaboration based 
thing?

/Jarkko
Jarkko Sakkinen Jan. 11, 2022, 2:03 a.m. UTC | #24
On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > wrote:
> > > 
> > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > > > OK, so the question is: do we need both or would a
> > > > > mechanism just
> > > > > > > > > to extend
> > > > > > > > > > permissions be sufficient?
> > > > > > > > >
> > > > > > > > > I do believe that we need both in order to support pages
> > > > > having only
> > > > > > > > > the permissions required to support their intended use
> > > > > during the
> > > > > > > > > time the
> > > > > > > > > particular access is required. While technically it is
> > > > > possible to grant
> > > > > > > > > pages all permissions they may need during their lifetime it
> > > > > is safer to
> > > > > > > > > remove permissions when no longer required.
> > > > > > > >
> > > > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > > > how using it
> > > > > > > > would make things safer?
> > > > > > > >
> > > > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > > > defensive
> > > > > > > measure. In that case, EMODPR is useful.
> > > > > >
> > > > > > What is the exact threat we are talking about?
> > > > > 
> > > > > To add: it should be *significantly* critical thread, given that not
> > > > > supporting only EAUG would leave us only one complex call pattern with
> > > > > EACCEPT involvement.
> > > > > 
> > > > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > > introduce
> > > > > it when there is PoC code for any of the existing run-time that
> > > > > demonstrates the demand for it. Right now this way too speculative.
> > > > > 
> > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > 
> > > > At least it does not protected against enclave code because an enclave
> > > > can
> > > > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > > confused here about the actual threat but also the potential adversary
> > > > and
> > > > target.
> > > > 
> > > I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > to request  EMODPR in the first place through runtime to kernel, then to
> > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > If enclave does not verify with EACCEPT, then its own code has
> > > vulnerability. But this does not justify OS not providing the mechanism to
> > > request EMODPR.
> > 
> > The question is really simple: what is the threat scenario? In order to use
> > the word "vulnerability", you would need one.
> > 
> > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > one, in order to ack it to the mainline.
> > 
> > > Similar to how we don't want have RWX code pages for normal Linux
> > > application, when an enclave loads code pages (either directly or JIT
> > > compiled from high level code ) into EAUG'd page (which has RW), we do not
> > > want leave pages to be RWX for code to be executable, hence the need of
> > > EMODPR request OS to reduce the permissions to RX once the code is ready to
> > > execute.
> > 
> > You cannot compare *enforced* permissions outside the enclave, and claim that
> > they would be equivalent to the permissions of the already sandboxed code
> > inside the enclave, with permissions that are not enforced but are based
> > on good will of the enclave code.
> 
> To add, you can already do "EMODPR" by simply adjusting VMA permissions to be
> more restrictive. How this would be worse than this collaboration based 
> thing?

... or you could even make soft version of EMODPR without using that opcode
by writing an ioctl to update our xarray to allow lower permissions. That
ties the hands of the process who is doing the mmap() already. 

/Jarkko
Jarkko Sakkinen Jan. 11, 2022, 2:15 a.m. UTC | #25
On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > > wrote:
> > > > 
> > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > > > > OK, so the question is: do we need both or would a
> > > > > > mechanism just
> > > > > > > > > > to extend
> > > > > > > > > > > permissions be sufficient?
> > > > > > > > > >
> > > > > > > > > > I do believe that we need both in order to support pages
> > > > > > having only
> > > > > > > > > > the permissions required to support their intended use
> > > > > > during the
> > > > > > > > > > time the
> > > > > > > > > > particular access is required. While technically it is
> > > > > > possible to grant
> > > > > > > > > > pages all permissions they may need during their lifetime it
> > > > > > is safer to
> > > > > > > > > > remove permissions when no longer required.
> > > > > > > > >
> > > > > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > > > > how using it
> > > > > > > > > would make things safer?
> > > > > > > > >
> > > > > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > > > > defensive
> > > > > > > > measure. In that case, EMODPR is useful.
> > > > > > >
> > > > > > > What is the exact threat we are talking about?
> > > > > > 
> > > > > > To add: it should be *significantly* critical thread, given that not
> > > > > > supporting only EAUG would leave us only one complex call pattern with
> > > > > > EACCEPT involvement.
> > > > > > 
> > > > > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > > > introduce
> > > > > > it when there is PoC code for any of the existing run-time that
> > > > > > demonstrates the demand for it. Right now this way too speculative.
> > > > > > 
> > > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > > 
> > > > > At least it does not protected against enclave code because an enclave
> > > > > can
> > > > > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > > > confused here about the actual threat but also the potential adversary
> > > > > and
> > > > > target.
> > > > > 
> > > > I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > > to request  EMODPR in the first place through runtime to kernel, then to
> > > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > > If enclave does not verify with EACCEPT, then its own code has
> > > > vulnerability. But this does not justify OS not providing the mechanism to
> > > > request EMODPR.
> > > 
> > > The question is really simple: what is the threat scenario? In order to use
> > > the word "vulnerability", you would need one.
> > > 
> > > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > one, in order to ack it to the mainline.
> > > 
> > > > Similar to how we don't want have RWX code pages for normal Linux
> > > > application, when an enclave loads code pages (either directly or JIT
> > > > compiled from high level code ) into EAUG'd page (which has RW), we do not
> > > > want leave pages to be RWX for code to be executable, hence the need of
> > > > EMODPR request OS to reduce the permissions to RX once the code is ready to
> > > > execute.
> > > 
> > > You cannot compare *enforced* permissions outside the enclave, and claim that
> > > they would be equivalent to the permissions of the already sandboxed code
> > > inside the enclave, with permissions that are not enforced but are based
> > > on good will of the enclave code.
> > 
> > To add, you can already do "EMODPR" by simply adjusting VMA permissions to be
> > more restrictive. How this would be worse than this collaboration based 
> > thing?
> 
> ... or you could even make soft version of EMODPR without using that opcode
> by writing an ioctl to update our xarray to allow lower permissions. That
> ties the hands of the process who is doing the mmap() already. 

E.g. why not just

#define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
#define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)

struct sgx_enclave_restrict_page_permissions {
	__u64 src;
	__u64 offset;
	__u64 length;
	__u64 secinfo;
	__u64 count;
};
struct sgx_enclave_extend_page_permissions {
	__u64 src;
	__u64 offset;
	__u64 length;
	__u64 secinfo;
	__u64 count;
};

These would simply update the xarray and nothing else. I'd go with two
ioctls (with the necessary checks for secinfo) in order to provide hook
up points in the future for LSMs.

This leaves only EAUG and EMODT requiring the EACCEPT handshake.

/Jarkko
Haitao Huang Jan. 11, 2022, 3:48 a.m. UTC | #26
On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
>> On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
>> > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
>> > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>> > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen  
>> <jarkko@kernel.org>
>> > > > wrote:
>> > > >
>> > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>> > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen  
>> wrote:
>> > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang  
>> wrote:
>> > > > > > > > > > > OK, so the question is: do we need both or would a
>> > > > > > mechanism just
>> > > > > > > > > > to extend
>> > > > > > > > > > > permissions be sufficient?
>> > > > > > > > > >
>> > > > > > > > > > I do believe that we need both in order to support  
>> pages
>> > > > > > having only
>> > > > > > > > > > the permissions required to support their intended use
>> > > > > > during the
>> > > > > > > > > > time the
>> > > > > > > > > > particular access is required. While technically it is
>> > > > > > possible to grant
>> > > > > > > > > > pages all permissions they may need during their  
>> lifetime it
>> > > > > > is safer to
>> > > > > > > > > > remove permissions when no longer required.
>> > > > > > > > >
>> > > > > > > > > So if we imagine a run-time: how EMODPR would be  
>> useful, and
>> > > > > > how using it
>> > > > > > > > > would make things safer?
>> > > > > > > > >
>> > > > > > > > In scenarios of JIT compilers, once code is generated  
>> into RW pages,
>> > > > > > > > modifying both PTE and EPCM permissions to RX would be a  
>> good
>> > > > > > defensive
>> > > > > > > > measure. In that case, EMODPR is useful.
>> > > > > > >
>> > > > > > > What is the exact threat we are talking about?
>> > > > > >
>> > > > > > To add: it should be *significantly* critical thread, given  
>> that not
>> > > > > > supporting only EAUG would leave us only one complex call  
>> pattern with
>> > > > > > EACCEPT involvement.
>> > > > > >
>> > > > > > I'd even go to suggest to leave EMODPR out of the patch set,  
>> and
>> > > > > > introduce
>> > > > > > it when there is PoC code for any of the existing run-time  
>> that
>> > > > > > demonstrates the demand for it. Right now this way too  
>> speculative.
>> > > > > >
>> > > > > > Supporting EMODPE is IMHO by factors more critical.
>> > > > >
>> > > > > At least it does not protected against enclave code because an  
>> enclave
>> > > > > can
>> > > > > always choose not to EACCEPT any of the EMODPR requests. I'm  
>> not only
>> > > > > confused here about the actual threat but also the potential  
>> adversary
>> > > > > and
>> > > > > target.
>> > > > >
>> > > > I'm not sure I follow your thoughts here. The sequence should be  
>> for enclave
>> > > > to request  EMODPR in the first place through runtime to kernel,  
>> then to
>> > > > verify with EACCEPT that the OS indeed has done EMODPR.
>> > > > If enclave does not verify with EACCEPT, then its own code has
>> > > > vulnerability. But this does not justify OS not providing the  
>> mechanism to
>> > > > request EMODPR.
>> > >
>> > > The question is really simple: what is the threat scenario? In  
>> order to use
>> > > the word "vulnerability", you would need one.
>> > >
>> > > Given the complexity of the whole dance with EMODPR it is mandatory  
>> to have
>> > > one, in order to ack it to the mainline.
>> > >
>> > > > Similar to how we don't want have RWX code pages for normal Linux
>> > > > application, when an enclave loads code pages (either directly or  
>> JIT
>> > > > compiled from high level code ) into EAUG'd page (which has RW),  
>> we do not
>> > > > want leave pages to be RWX for code to be executable, hence the  
>> need of
>> > > > EMODPR request OS to reduce the permissions to RX once the code  
>> is ready to
>> > > > execute.
>> > >
>> > > You cannot compare *enforced* permissions outside the enclave, and  
>> claim that
>> > > they would be equivalent to the permissions of the already  
>> sandboxed code
>> > > inside the enclave, with permissions that are not enforced but are  
>> based
>> > > on good will of the enclave code.
>> >
>> > To add, you can already do "EMODPR" by simply adjusting VMA  
>> permissions to be
>> > more restrictive. How this would be worse than this collaboration  
>> based
>> > thing?
>>
>> ... or you could even make soft version of EMODPR without using that  
>> opcode
>> by writing an ioctl to update our xarray to allow lower permissions.  
>> That
>> ties the hands of the process who is doing the mmap() already.
>
> E.g. why not just
>
> #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
> 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
> #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
> 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
>
> struct sgx_enclave_restrict_page_permissions {
> 	__u64 src;
> 	__u64 offset;
> 	__u64 length;
> 	__u64 secinfo;
> 	__u64 count;
> };
> struct sgx_enclave_extend_page_permissions {
> 	__u64 src;
> 	__u64 offset;
> 	__u64 length;
> 	__u64 secinfo;
> 	__u64 count;
> };
>
> These would simply update the xarray and nothing else. I'd go with two
> ioctls (with the necessary checks for secinfo) in order to provide hook
> up points in the future for LSMs.
>
> This leaves only EAUG and EMODT requiring the EACCEPT handshake.
>
> /Jarkko
The trusted code base here is the enclave. It can't trust any code outside  
for enforcement. There is also need for TLB shootdown.

To answer your earlier question about threat, the threat is  
malicious/compromised code inside enclave. Yes, you can say the whole  
thing is sand-boxed, but the runtime inside enclave could load complex  
upper layer code.  Therefore the runtime needs to have a trusted mechanism  
to ensure code pages not writable so that there is less/no chance for  
compromised malicious enclave to modify existing code pages. I still  
consider it to be similar to normal Linux elf-loader/dynamic linker  
relying on mmap/mprotect and trusting OS to enforce permissions, but here  
the enclave runtime only trust the HW provided mechanism: EMODPR to change  
EPCM records and EACCEPT to verify.
Reinette Chatre Jan. 11, 2022, 5:13 p.m. UTC | #27
Hi Jarkko,

On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>> wrote:
>>
>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>> OK, so the question is: do we need both or would a
>>>> mechanism just
>>>>>>>> to extend
>>>>>>>>> permissions be sufficient?
>>>>>>>>
>>>>>>>> I do believe that we need both in order to support pages
>>>> having only
>>>>>>>> the permissions required to support their intended use
>>>> during the
>>>>>>>> time the
>>>>>>>> particular access is required. While technically it is
>>>> possible to grant
>>>>>>>> pages all permissions they may need during their lifetime it
>>>> is safer to
>>>>>>>> remove permissions when no longer required.
>>>>>>>
>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>> how using it
>>>>>>> would make things safer?
>>>>>>>
>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>> defensive
>>>>>> measure. In that case, EMODPR is useful.
>>>>>
>>>>> What is the exact threat we are talking about?
>>>>
>>>> To add: it should be *significantly* critical thread, given that not
>>>> supporting only EAUG would leave us only one complex call pattern with
>>>> EACCEPT involvement.
>>>>
>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>> introduce
>>>> it when there is PoC code for any of the existing run-time that
>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>
>>>> Supporting EMODPE is IMHO by factors more critical.
>>>
>>> At least it does not protected against enclave code because an enclave
>>> can
>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>> confused here about the actual threat but also the potential adversary
>>> and
>>> target.
>>>
>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>> to request  EMODPR in the first place through runtime to kernel, then to
>> verify with EACCEPT that the OS indeed has done EMODPR.
>> If enclave does not verify with EACCEPT, then its own code has
>> vulnerability. But this does not justify OS not providing the mechanism to
>> request EMODPR.
> 
> The question is really simple: what is the threat scenario? In order to use
> the word "vulnerability", you would need one.
> 
> Given the complexity of the whole dance with EMODPR it is mandatory to have
> one, in order to ack it to the mainline.
> 

Which complexity related to EMODPR are you concerned about? In a later message
you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
so it seems that you are perhaps concerned about the flow involving EACCEPT?
The OS does not require nor depend on EACCEPT being called as part of these flows
so a faulty or misbehaving user space omitting an EACCEPT call would not impact
these flows in the OS, but would of course impact the enclave.

Reinette
Jarkko Sakkinen Jan. 12, 2022, 11:48 p.m. UTC | #28
On Mon, Jan 10, 2022 at 09:48:15PM -0600, Haitao Huang wrote:
> On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
> > > On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> > > > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen
> > > <jarkko@kernel.org>
> > > > > > wrote:
> > > > > >
> > > > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen
> > > wrote:
> > > > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang
> > > wrote:
> > > > > > > > > > > > > OK, so the question is: do we need both or would a
> > > > > > > > mechanism just
> > > > > > > > > > > > to extend
> > > > > > > > > > > > > permissions be sufficient?
> > > > > > > > > > > >
> > > > > > > > > > > > I do believe that we need both in order to support
> > > pages
> > > > > > > > having only
> > > > > > > > > > > > the permissions required to support their intended use
> > > > > > > > during the
> > > > > > > > > > > > time the
> > > > > > > > > > > > particular access is required. While technically it is
> > > > > > > > possible to grant
> > > > > > > > > > > > pages all permissions they may need during their
> > > lifetime it
> > > > > > > > is safer to
> > > > > > > > > > > > remove permissions when no longer required.
> > > > > > > > > > >
> > > > > > > > > > > So if we imagine a run-time: how EMODPR would be
> > > useful, and
> > > > > > > > how using it
> > > > > > > > > > > would make things safer?
> > > > > > > > > > >
> > > > > > > > > > In scenarios of JIT compilers, once code is generated
> > > into RW pages,
> > > > > > > > > > modifying both PTE and EPCM permissions to RX would be
> > > a good
> > > > > > > > defensive
> > > > > > > > > > measure. In that case, EMODPR is useful.
> > > > > > > > >
> > > > > > > > > What is the exact threat we are talking about?
> > > > > > > >
> > > > > > > > To add: it should be *significantly* critical thread,
> > > given that not
> > > > > > > > supporting only EAUG would leave us only one complex call
> > > pattern with
> > > > > > > > EACCEPT involvement.
> > > > > > > >
> > > > > > > > I'd even go to suggest to leave EMODPR out of the patch
> > > set, and
> > > > > > > > introduce
> > > > > > > > it when there is PoC code for any of the existing run-time
> > > that
> > > > > > > > demonstrates the demand for it. Right now this way too
> > > speculative.
> > > > > > > >
> > > > > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > > > >
> > > > > > > At least it does not protected against enclave code because
> > > an enclave
> > > > > > > can
> > > > > > > always choose not to EACCEPT any of the EMODPR requests. I'm
> > > not only
> > > > > > > confused here about the actual threat but also the potential
> > > adversary
> > > > > > > and
> > > > > > > target.
> > > > > > >
> > > > > > I'm not sure I follow your thoughts here. The sequence should
> > > be for enclave
> > > > > > to request  EMODPR in the first place through runtime to
> > > kernel, then to
> > > > > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > > > > If enclave does not verify with EACCEPT, then its own code has
> > > > > > vulnerability. But this does not justify OS not providing the
> > > mechanism to
> > > > > > request EMODPR.
> > > > >
> > > > > The question is really simple: what is the threat scenario? In
> > > order to use
> > > > > the word "vulnerability", you would need one.
> > > > >
> > > > > Given the complexity of the whole dance with EMODPR it is
> > > mandatory to have
> > > > > one, in order to ack it to the mainline.
> > > > >
> > > > > > Similar to how we don't want have RWX code pages for normal Linux
> > > > > > application, when an enclave loads code pages (either directly
> > > or JIT
> > > > > > compiled from high level code ) into EAUG'd page (which has
> > > RW), we do not
> > > > > > want leave pages to be RWX for code to be executable, hence
> > > the need of
> > > > > > EMODPR request OS to reduce the permissions to RX once the
> > > code is ready to
> > > > > > execute.
> > > > >
> > > > > You cannot compare *enforced* permissions outside the enclave,
> > > and claim that
> > > > > they would be equivalent to the permissions of the already
> > > sandboxed code
> > > > > inside the enclave, with permissions that are not enforced but
> > > are based
> > > > > on good will of the enclave code.
> > > >
> > > > To add, you can already do "EMODPR" by simply adjusting VMA
> > > permissions to be
> > > > more restrictive. How this would be worse than this collaboration
> > > based
> > > > thing?
> > > 
> > > ... or you could even make soft version of EMODPR without using that
> > > opcode
> > > by writing an ioctl to update our xarray to allow lower permissions.
> > > That
> > > ties the hands of the process who is doing the mmap() already.
> > 
> > E.g. why not just
> > 
> > #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
> > 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
> > #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
> > 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
> > 
> > struct sgx_enclave_restrict_page_permissions {
> > 	__u64 src;
> > 	__u64 offset;
> > 	__u64 length;
> > 	__u64 secinfo;
> > 	__u64 count;
> > };
> > struct sgx_enclave_extend_page_permissions {
> > 	__u64 src;
> > 	__u64 offset;
> > 	__u64 length;
> > 	__u64 secinfo;
> > 	__u64 count;
> > };
> > 
> > These would simply update the xarray and nothing else. I'd go with two
> > ioctls (with the necessary checks for secinfo) in order to provide hook
> > up points in the future for LSMs.
> > 
> > This leaves only EAUG and EMODT requiring the EACCEPT handshake.
> > 
> > /Jarkko
> The trusted code base here is the enclave. It can't trust any code outside
> for enforcement. There is also need for TLB shootdown.
> 
> To answer your earlier question about threat, the threat is
> malicious/compromised code inside enclave. Yes, you can say the whole thing
> is sand-boxed, but the runtime inside enclave could load complex upper layer
> code.  Therefore the runtime needs to have a trusted mechanism to ensure
> code pages not writable so that there is less/no chance for compromised
> malicious enclave to modify existing code pages. I still consider it to be
> similar to normal Linux elf-loader/dynamic linker relying on mmap/mprotect
> and trusting OS to enforce permissions, but here the enclave runtime only
> trust the HW provided mechanism: EMODPR to change EPCM records and EACCEPT
> to verify.

So what if:

1. User space does EMODPR ioctl.
2. Enclave does EACCEPT.
3. Enclave does EMODPE.

The problem here is the asymmetry of these operations. If EMODPE also
required EACCEPT from the run-time, EMODPR would also make sense.

Please give a code example on how EMODPR improves trust.

/Jarkko
Jarkko Sakkinen Jan. 12, 2022, 11:50 p.m. UTC | #29
On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >> wrote:
> >>
> >>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>> OK, so the question is: do we need both or would a
> >>>> mechanism just
> >>>>>>>> to extend
> >>>>>>>>> permissions be sufficient?
> >>>>>>>>
> >>>>>>>> I do believe that we need both in order to support pages
> >>>> having only
> >>>>>>>> the permissions required to support their intended use
> >>>> during the
> >>>>>>>> time the
> >>>>>>>> particular access is required. While technically it is
> >>>> possible to grant
> >>>>>>>> pages all permissions they may need during their lifetime it
> >>>> is safer to
> >>>>>>>> remove permissions when no longer required.
> >>>>>>>
> >>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>> how using it
> >>>>>>> would make things safer?
> >>>>>>>
> >>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>> defensive
> >>>>>> measure. In that case, EMODPR is useful.
> >>>>>
> >>>>> What is the exact threat we are talking about?
> >>>>
> >>>> To add: it should be *significantly* critical thread, given that not
> >>>> supporting only EAUG would leave us only one complex call pattern with
> >>>> EACCEPT involvement.
> >>>>
> >>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>> introduce
> >>>> it when there is PoC code for any of the existing run-time that
> >>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>
> >>>> Supporting EMODPE is IMHO by factors more critical.
> >>>
> >>> At least it does not protected against enclave code because an enclave
> >>> can
> >>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>> confused here about the actual threat but also the potential adversary
> >>> and
> >>> target.
> >>>
> >> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >> to request  EMODPR in the first place through runtime to kernel, then to
> >> verify with EACCEPT that the OS indeed has done EMODPR.
> >> If enclave does not verify with EACCEPT, then its own code has
> >> vulnerability. But this does not justify OS not providing the mechanism to
> >> request EMODPR.
> > 
> > The question is really simple: what is the threat scenario? In order to use
> > the word "vulnerability", you would need one.
> > 
> > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > one, in order to ack it to the mainline.
> > 
> 
> Which complexity related to EMODPR are you concerned about? In a later message
> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> The OS does not require nor depend on EACCEPT being called as part of these flows
> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> these flows in the OS, but would of course impact the enclave.

I'd say *any* complexity because I see no benefit of supporting it. E.g.
EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
EMODPR going to help with any sort of workload?

/Jarkko
Jarkko Sakkinen Jan. 12, 2022, 11:56 p.m. UTC | #30
On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > Hi Jarkko,
> > 
> > On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > >> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > >> wrote:
> > >>
> > >>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > >>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > >>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > >>>>>>>>> OK, so the question is: do we need both or would a
> > >>>> mechanism just
> > >>>>>>>> to extend
> > >>>>>>>>> permissions be sufficient?
> > >>>>>>>>
> > >>>>>>>> I do believe that we need both in order to support pages
> > >>>> having only
> > >>>>>>>> the permissions required to support their intended use
> > >>>> during the
> > >>>>>>>> time the
> > >>>>>>>> particular access is required. While technically it is
> > >>>> possible to grant
> > >>>>>>>> pages all permissions they may need during their lifetime it
> > >>>> is safer to
> > >>>>>>>> remove permissions when no longer required.
> > >>>>>>>
> > >>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > >>>> how using it
> > >>>>>>> would make things safer?
> > >>>>>>>
> > >>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > >>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > >>>> defensive
> > >>>>>> measure. In that case, EMODPR is useful.
> > >>>>>
> > >>>>> What is the exact threat we are talking about?
> > >>>>
> > >>>> To add: it should be *significantly* critical thread, given that not
> > >>>> supporting only EAUG would leave us only one complex call pattern with
> > >>>> EACCEPT involvement.
> > >>>>
> > >>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > >>>> introduce
> > >>>> it when there is PoC code for any of the existing run-time that
> > >>>> demonstrates the demand for it. Right now this way too speculative.
> > >>>>
> > >>>> Supporting EMODPE is IMHO by factors more critical.
> > >>>
> > >>> At least it does not protected against enclave code because an enclave
> > >>> can
> > >>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > >>> confused here about the actual threat but also the potential adversary
> > >>> and
> > >>> target.
> > >>>
> > >> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > >> to request  EMODPR in the first place through runtime to kernel, then to
> > >> verify with EACCEPT that the OS indeed has done EMODPR.
> > >> If enclave does not verify with EACCEPT, then its own code has
> > >> vulnerability. But this does not justify OS not providing the mechanism to
> > >> request EMODPR.
> > > 
> > > The question is really simple: what is the threat scenario? In order to use
> > > the word "vulnerability", you would need one.
> > > 
> > > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > one, in order to ack it to the mainline.
> > > 
> > 
> > Which complexity related to EMODPR are you concerned about? In a later message
> > you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > The OS does not require nor depend on EACCEPT being called as part of these flows
> > so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > these flows in the OS, but would of course impact the enclave.
> 
> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> EMODPR going to help with any sort of workload?

I've even started think should we just always allow mmap()? The worst thing
that can happen is that the enclave crashes. Does that matter all that
much? I'm asking because access control is the main theme in SGX2 patch set
that IMHO should be considered to the ground. It really "stress tests" that
area. If we can settle on  that, then other things are just technical details
that we can surely sort out.

/Jarkko
Haitao Huang Jan. 13, 2022, 2:41 a.m. UTC | #31
On Wed, 12 Jan 2022 17:48:48 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Mon, Jan 10, 2022 at 09:48:15PM -0600, Haitao Huang wrote:
>> On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>> wrote:
>>
>> > On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
>> > > On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
>> > > > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
>> > > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>> > > > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen
>> > > <jarkko@kernel.org>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen  
>> wrote:
>> > > > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen
>> > > wrote:
>> > > > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang
>> > > wrote:
>> > > > > > > > > > > > > OK, so the question is: do we need both or  
>> would a
>> > > > > > > > mechanism just
>> > > > > > > > > > > > to extend
>> > > > > > > > > > > > > permissions be sufficient?
>> > > > > > > > > > > >
>> > > > > > > > > > > > I do believe that we need both in order to support
>> > > pages
>> > > > > > > > having only
>> > > > > > > > > > > > the permissions required to support their  
>> intended use
>> > > > > > > > during the
>> > > > > > > > > > > > time the
>> > > > > > > > > > > > particular access is required. While technically  
>> it is
>> > > > > > > > possible to grant
>> > > > > > > > > > > > pages all permissions they may need during their
>> > > lifetime it
>> > > > > > > > is safer to
>> > > > > > > > > > > > remove permissions when no longer required.
>> > > > > > > > > > >
>> > > > > > > > > > > So if we imagine a run-time: how EMODPR would be
>> > > useful, and
>> > > > > > > > how using it
>> > > > > > > > > > > would make things safer?
>> > > > > > > > > > >
>> > > > > > > > > > In scenarios of JIT compilers, once code is generated
>> > > into RW pages,
>> > > > > > > > > > modifying both PTE and EPCM permissions to RX would be
>> > > a good
>> > > > > > > > defensive
>> > > > > > > > > > measure. In that case, EMODPR is useful.
>> > > > > > > > >
>> > > > > > > > > What is the exact threat we are talking about?
>> > > > > > > >
>> > > > > > > > To add: it should be *significantly* critical thread,
>> > > given that not
>> > > > > > > > supporting only EAUG would leave us only one complex call
>> > > pattern with
>> > > > > > > > EACCEPT involvement.
>> > > > > > > >
>> > > > > > > > I'd even go to suggest to leave EMODPR out of the patch
>> > > set, and
>> > > > > > > > introduce
>> > > > > > > > it when there is PoC code for any of the existing run-time
>> > > that
>> > > > > > > > demonstrates the demand for it. Right now this way too
>> > > speculative.
>> > > > > > > >
>> > > > > > > > Supporting EMODPE is IMHO by factors more critical.
>> > > > > > >
>> > > > > > > At least it does not protected against enclave code because
>> > > an enclave
>> > > > > > > can
>> > > > > > > always choose not to EACCEPT any of the EMODPR requests. I'm
>> > > not only
>> > > > > > > confused here about the actual threat but also the potential
>> > > adversary
>> > > > > > > and
>> > > > > > > target.
>> > > > > > >
>> > > > > > I'm not sure I follow your thoughts here. The sequence should
>> > > be for enclave
>> > > > > > to request  EMODPR in the first place through runtime to
>> > > kernel, then to
>> > > > > > verify with EACCEPT that the OS indeed has done EMODPR.
>> > > > > > If enclave does not verify with EACCEPT, then its own code has
>> > > > > > vulnerability. But this does not justify OS not providing the
>> > > mechanism to
>> > > > > > request EMODPR.
>> > > > >
>> > > > > The question is really simple: what is the threat scenario? In
>> > > order to use
>> > > > > the word "vulnerability", you would need one.
>> > > > >
>> > > > > Given the complexity of the whole dance with EMODPR it is
>> > > mandatory to have
>> > > > > one, in order to ack it to the mainline.
>> > > > >
>> > > > > > Similar to how we don't want have RWX code pages for normal  
>> Linux
>> > > > > > application, when an enclave loads code pages (either directly
>> > > or JIT
>> > > > > > compiled from high level code ) into EAUG'd page (which has
>> > > RW), we do not
>> > > > > > want leave pages to be RWX for code to be executable, hence
>> > > the need of
>> > > > > > EMODPR request OS to reduce the permissions to RX once the
>> > > code is ready to
>> > > > > > execute.
>> > > > >
>> > > > > You cannot compare *enforced* permissions outside the enclave,
>> > > and claim that
>> > > > > they would be equivalent to the permissions of the already
>> > > sandboxed code
>> > > > > inside the enclave, with permissions that are not enforced but
>> > > are based
>> > > > > on good will of the enclave code.
>> > > >
>> > > > To add, you can already do "EMODPR" by simply adjusting VMA
>> > > permissions to be
>> > > > more restrictive. How this would be worse than this collaboration
>> > > based
>> > > > thing?
>> > >
>> > > ... or you could even make soft version of EMODPR without using that
>> > > opcode
>> > > by writing an ioctl to update our xarray to allow lower permissions.
>> > > That
>> > > ties the hands of the process who is doing the mmap() already.
>> >
>> > E.g. why not just
>> >
>> > #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
>> > 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
>> > #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
>> > 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
>> >
>> > struct sgx_enclave_restrict_page_permissions {
>> > 	__u64 src;
>> > 	__u64 offset;
>> > 	__u64 length;
>> > 	__u64 secinfo;
>> > 	__u64 count;
>> > };
>> > struct sgx_enclave_extend_page_permissions {
>> > 	__u64 src;
>> > 	__u64 offset;
>> > 	__u64 length;
>> > 	__u64 secinfo;
>> > 	__u64 count;
>> > };
>> >
>> > These would simply update the xarray and nothing else. I'd go with two
>> > ioctls (with the necessary checks for secinfo) in order to provide  
>> hook
>> > up points in the future for LSMs.
>> >
>> > This leaves only EAUG and EMODT requiring the EACCEPT handshake.
>> >
>> > /Jarkko
>> The trusted code base here is the enclave. It can't trust any code  
>> outside
>> for enforcement. There is also need for TLB shootdown.
>>
>> To answer your earlier question about threat, the threat is
>> malicious/compromised code inside enclave. Yes, you can say the whole  
>> thing
>> is sand-boxed, but the runtime inside enclave could load complex upper  
>> layer
>> code.  Therefore the runtime needs to have a trusted mechanism to ensure
>> code pages not writable so that there is less/no chance for compromised
>> malicious enclave to modify existing code pages. I still consider it to  
>> be
>> similar to normal Linux elf-loader/dynamic linker relying on  
>> mmap/mprotect
>> and trusting OS to enforce permissions, but here the enclave runtime  
>> only
>> trust the HW provided mechanism: EMODPR to change EPCM records and  
>> EACCEPT
>> to verify.
>
> So what if:
>
> 1. User space does EMODPR ioctl.
> 2. Enclave does EACCEPT.
> 3. Enclave does EMODPE.
>
Could you elaborate on your exact concern here? EMODPE won't be able to  
restrict permissions, only add, so no way to cancel what's done by EMODPR  
if that's your concern.

And EMODPE would only affect EPCM not PTE. So if OS set PTE no matching  
EPCM, the enclave won't be able to use the page for added access.

> The problem here is the asymmetry of these operations. If EMODPE also
> required EACCEPT from the run-time, EMODPR would also make sense.
>

The asymmetry is on the user space side as Reinette stated in her reply. I  
could not see why this a relevant concern for kernel.

> Please give a code example on how EMODPR improves trust.
>
It's not that EMODPR itself improves trust. What I try to say is that the  
enclave runtime can use EACCET to verify EPCM permissions which is  
trusted, and not relying on PTE permissions which is controlled by OS. It  
must do EACCEPT for EMODPR and other ENCLS ops like EMODT,EAUG, etc. as  
enclave security model considers OS untrusted.

EMODPR is the only way to restrict permissions in EPCM for enclave pages.  
So if it is not supported by kernel then there is no way for enclave  
runtimes to support the use cases I stated previously. That means RWX  
required in EPCM for dynamic loaded/JIT compiled code pages.

Thanks
Haitao
Nathaniel McCallum Jan. 13, 2022, 8:09 p.m. UTC | #32
On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > >
> > > On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > >> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > >> wrote:
> > > >>
> > > >>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > >>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > >>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > >>>>>>>>> OK, so the question is: do we need both or would a
> > > >>>> mechanism just
> > > >>>>>>>> to extend
> > > >>>>>>>>> permissions be sufficient?
> > > >>>>>>>>
> > > >>>>>>>> I do believe that we need both in order to support pages
> > > >>>> having only
> > > >>>>>>>> the permissions required to support their intended use
> > > >>>> during the
> > > >>>>>>>> time the
> > > >>>>>>>> particular access is required. While technically it is
> > > >>>> possible to grant
> > > >>>>>>>> pages all permissions they may need during their lifetime it
> > > >>>> is safer to
> > > >>>>>>>> remove permissions when no longer required.
> > > >>>>>>>
> > > >>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > >>>> how using it
> > > >>>>>>> would make things safer?
> > > >>>>>>>
> > > >>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > >>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > >>>> defensive
> > > >>>>>> measure. In that case, EMODPR is useful.
> > > >>>>>
> > > >>>>> What is the exact threat we are talking about?
> > > >>>>
> > > >>>> To add: it should be *significantly* critical thread, given that not
> > > >>>> supporting only EAUG would leave us only one complex call pattern with
> > > >>>> EACCEPT involvement.
> > > >>>>
> > > >>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > >>>> introduce
> > > >>>> it when there is PoC code for any of the existing run-time that
> > > >>>> demonstrates the demand for it. Right now this way too speculative.
> > > >>>>
> > > >>>> Supporting EMODPE is IMHO by factors more critical.
> > > >>>
> > > >>> At least it does not protected against enclave code because an enclave
> > > >>> can
> > > >>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > >>> confused here about the actual threat but also the potential adversary
> > > >>> and
> > > >>> target.
> > > >>>
> > > >> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > >> to request  EMODPR in the first place through runtime to kernel, then to
> > > >> verify with EACCEPT that the OS indeed has done EMODPR.
> > > >> If enclave does not verify with EACCEPT, then its own code has
> > > >> vulnerability. But this does not justify OS not providing the mechanism to
> > > >> request EMODPR.
> > > >
> > > > The question is really simple: what is the threat scenario? In order to use
> > > > the word "vulnerability", you would need one.
> > > >
> > > > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > > one, in order to ack it to the mainline.
> > > >
> > >
> > > Which complexity related to EMODPR are you concerned about? In a later message
> > > you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > The OS does not require nor depend on EACCEPT being called as part of these flows
> > > so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > these flows in the OS, but would of course impact the enclave.
> >
> > I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > EMODPR going to help with any sort of workload?
>
> I've even started think should we just always allow mmap()?

I suspect this may be the most ergonomic way forward. Instructions
like EAUG/EMODPR/etc are really irrelevant implementation details to
what the enclave wants, which is a memory mapping in the enclave. Why
make the enclave runner do multiple context switches just to change
the memory map of an enclave?

> The worst thing
> that can happen is that the enclave crashes. Does that matter all that
> much? I'm asking because access control is the main theme in SGX2 patch set
> that IMHO should be considered to the ground. It really "stress tests" that
> area. If we can settle on  that, then other things are just technical details
> that we can surely sort out.
>
> /Jarkko
Reinette Chatre Jan. 13, 2022, 9:42 p.m. UTC | #33
Hi Jarkko and Nathaniel,

On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>
>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
>>>> Hi Jarkko,
>>>>
>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>>>>>> wrote:
>>>>>>
>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>>>>>> OK, so the question is: do we need both or would a
>>>>>>>> mechanism just
>>>>>>>>>>>> to extend
>>>>>>>>>>>>> permissions be sufficient?
>>>>>>>>>>>>
>>>>>>>>>>>> I do believe that we need both in order to support pages
>>>>>>>> having only
>>>>>>>>>>>> the permissions required to support their intended use
>>>>>>>> during the
>>>>>>>>>>>> time the
>>>>>>>>>>>> particular access is required. While technically it is
>>>>>>>> possible to grant
>>>>>>>>>>>> pages all permissions they may need during their lifetime it
>>>>>>>> is safer to
>>>>>>>>>>>> remove permissions when no longer required.
>>>>>>>>>>>
>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>>>>>> how using it
>>>>>>>>>>> would make things safer?
>>>>>>>>>>>
>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>>>>>> defensive
>>>>>>>>>> measure. In that case, EMODPR is useful.
>>>>>>>>>
>>>>>>>>> What is the exact threat we are talking about?
>>>>>>>>
>>>>>>>> To add: it should be *significantly* critical thread, given that not
>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
>>>>>>>> EACCEPT involvement.
>>>>>>>>
>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>>>>>> introduce
>>>>>>>> it when there is PoC code for any of the existing run-time that
>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>>>>>
>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
>>>>>>>
>>>>>>> At least it does not protected against enclave code because an enclave
>>>>>>> can
>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>>>>>> confused here about the actual threat but also the potential adversary
>>>>>>> and
>>>>>>> target.
>>>>>>>
>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
>>>>>> If enclave does not verify with EACCEPT, then its own code has
>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
>>>>>> request EMODPR.
>>>>>
>>>>> The question is really simple: what is the threat scenario? In order to use
>>>>> the word "vulnerability", you would need one.
>>>>>
>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
>>>>> one, in order to ack it to the mainline.
>>>>>
>>>>
>>>> Which complexity related to EMODPR are you concerned about? In a later message
>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
>>>> these flows in the OS, but would of course impact the enclave.
>>>
>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
>>> EMODPR going to help with any sort of workload?
>>
>> I've even started think should we just always allow mmap()?
> 
> I suspect this may be the most ergonomic way forward. Instructions
> like EAUG/EMODPR/etc are really irrelevant implementation details to
> what the enclave wants, which is a memory mapping in the enclave. Why
> make the enclave runner do multiple context switches just to change
> the memory map of an enclave?

The enclave runner is not forced to make any changes to a memory mapping. To start,
this implementation supports and does not change the existing ABI where a new
memory mapping can only be created if its permissions are the same or weaker
than the EPCM permissions. After the memory mapping is created the EPCM permissions
can change (thanks to SGX2) and when they do there are no forced nor required
changes to the memory mapping - pages remain accessible where the memory mapping
and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
to an enclave page (EMODPE) then the memory mapping may need to be changed as
should be expected to access a page with permissions that the memory mapping
did not previously allow.

Are you saying that the permissions of a new memory mapping should now be allowed
to exceed EPCM permissions and thus the enclave runner would not need to modify a
memory mapping when EPCM permissions are relaxed? As mentioned above this may be
considered a change in ABI but something we could support on SGX2 systems.

I would also like to highlight Haitao's earlier comment that a foundation of SGX is
that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
and EMODPE to manage enclave page permissions.

Reinette
Jarkko Sakkinen Jan. 14, 2022, 9:36 p.m. UTC | #34
On Wed, Jan 12, 2022 at 08:41:18PM -0600, Haitao Huang wrote:
> On Wed, 12 Jan 2022 17:48:48 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Mon, Jan 10, 2022 at 09:48:15PM -0600, Haitao Huang wrote:
> > > On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > wrote:
> > > 
> > > > On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
> > > > > On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> > > > > > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > > > > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > > > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen
> > > > > <jarkko@kernel.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko
> > > Sakkinen wrote:
> > > > > > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen
> > > > > wrote:
> > > > > > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang
> > > > > wrote:
> > > > > > > > > > > > > > > OK, so the question is: do we need both or
> > > would a
> > > > > > > > > > mechanism just
> > > > > > > > > > > > > > to extend
> > > > > > > > > > > > > > > permissions be sufficient?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I do believe that we need both in order to support
> > > > > pages
> > > > > > > > > > having only
> > > > > > > > > > > > > > the permissions required to support their
> > > intended use
> > > > > > > > > > during the
> > > > > > > > > > > > > > time the
> > > > > > > > > > > > > > particular access is required. While
> > > technically it is
> > > > > > > > > > possible to grant
> > > > > > > > > > > > > > pages all permissions they may need during their
> > > > > lifetime it
> > > > > > > > > > is safer to
> > > > > > > > > > > > > > remove permissions when no longer required.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So if we imagine a run-time: how EMODPR would be
> > > > > useful, and
> > > > > > > > > > how using it
> > > > > > > > > > > > > would make things safer?
> > > > > > > > > > > > >
> > > > > > > > > > > > In scenarios of JIT compilers, once code is generated
> > > > > into RW pages,
> > > > > > > > > > > > modifying both PTE and EPCM permissions to RX would be
> > > > > a good
> > > > > > > > > > defensive
> > > > > > > > > > > > measure. In that case, EMODPR is useful.
> > > > > > > > > > >
> > > > > > > > > > > What is the exact threat we are talking about?
> > > > > > > > > >
> > > > > > > > > > To add: it should be *significantly* critical thread,
> > > > > given that not
> > > > > > > > > > supporting only EAUG would leave us only one complex call
> > > > > pattern with
> > > > > > > > > > EACCEPT involvement.
> > > > > > > > > >
> > > > > > > > > > I'd even go to suggest to leave EMODPR out of the patch
> > > > > set, and
> > > > > > > > > > introduce
> > > > > > > > > > it when there is PoC code for any of the existing run-time
> > > > > that
> > > > > > > > > > demonstrates the demand for it. Right now this way too
> > > > > speculative.
> > > > > > > > > >
> > > > > > > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > > > > > >
> > > > > > > > > At least it does not protected against enclave code because
> > > > > an enclave
> > > > > > > > > can
> > > > > > > > > always choose not to EACCEPT any of the EMODPR requests. I'm
> > > > > not only
> > > > > > > > > confused here about the actual threat but also the potential
> > > > > adversary
> > > > > > > > > and
> > > > > > > > > target.
> > > > > > > > >
> > > > > > > > I'm not sure I follow your thoughts here. The sequence should
> > > > > be for enclave
> > > > > > > > to request  EMODPR in the first place through runtime to
> > > > > kernel, then to
> > > > > > > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > > > > > > If enclave does not verify with EACCEPT, then its own code has
> > > > > > > > vulnerability. But this does not justify OS not providing the
> > > > > mechanism to
> > > > > > > > request EMODPR.
> > > > > > >
> > > > > > > The question is really simple: what is the threat scenario? In
> > > > > order to use
> > > > > > > the word "vulnerability", you would need one.
> > > > > > >
> > > > > > > Given the complexity of the whole dance with EMODPR it is
> > > > > mandatory to have
> > > > > > > one, in order to ack it to the mainline.
> > > > > > >
> > > > > > > > Similar to how we don't want have RWX code pages for
> > > normal Linux
> > > > > > > > application, when an enclave loads code pages (either directly
> > > > > or JIT
> > > > > > > > compiled from high level code ) into EAUG'd page (which has
> > > > > RW), we do not
> > > > > > > > want leave pages to be RWX for code to be executable, hence
> > > > > the need of
> > > > > > > > EMODPR request OS to reduce the permissions to RX once the
> > > > > code is ready to
> > > > > > > > execute.
> > > > > > >
> > > > > > > You cannot compare *enforced* permissions outside the enclave,
> > > > > and claim that
> > > > > > > they would be equivalent to the permissions of the already
> > > > > sandboxed code
> > > > > > > inside the enclave, with permissions that are not enforced but
> > > > > are based
> > > > > > > on good will of the enclave code.
> > > > > >
> > > > > > To add, you can already do "EMODPR" by simply adjusting VMA
> > > > > permissions to be
> > > > > > more restrictive. How this would be worse than this collaboration
> > > > > based
> > > > > > thing?
> > > > >
> > > > > ... or you could even make soft version of EMODPR without using that
> > > > > opcode
> > > > > by writing an ioctl to update our xarray to allow lower permissions.
> > > > > That
> > > > > ties the hands of the process who is doing the mmap() already.
> > > >
> > > > E.g. why not just
> > > >
> > > > #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
> > > > 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
> > > > #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
> > > > 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
> > > >
> > > > struct sgx_enclave_restrict_page_permissions {
> > > > 	__u64 src;
> > > > 	__u64 offset;
> > > > 	__u64 length;
> > > > 	__u64 secinfo;
> > > > 	__u64 count;
> > > > };
> > > > struct sgx_enclave_extend_page_permissions {
> > > > 	__u64 src;
> > > > 	__u64 offset;
> > > > 	__u64 length;
> > > > 	__u64 secinfo;
> > > > 	__u64 count;
> > > > };
> > > >
> > > > These would simply update the xarray and nothing else. I'd go with two
> > > > ioctls (with the necessary checks for secinfo) in order to provide
> > > hook
> > > > up points in the future for LSMs.
> > > >
> > > > This leaves only EAUG and EMODT requiring the EACCEPT handshake.
> > > >
> > > > /Jarkko
> > > The trusted code base here is the enclave. It can't trust any code
> > > outside
> > > for enforcement. There is also need for TLB shootdown.
> > > 
> > > To answer your earlier question about threat, the threat is
> > > malicious/compromised code inside enclave. Yes, you can say the
> > > whole thing
> > > is sand-boxed, but the runtime inside enclave could load complex
> > > upper layer
> > > code.  Therefore the runtime needs to have a trusted mechanism to ensure
> > > code pages not writable so that there is less/no chance for compromised
> > > malicious enclave to modify existing code pages. I still consider it
> > > to be
> > > similar to normal Linux elf-loader/dynamic linker relying on
> > > mmap/mprotect
> > > and trusting OS to enforce permissions, but here the enclave runtime
> > > only
> > > trust the HW provided mechanism: EMODPR to change EPCM records and
> > > EACCEPT
> > > to verify.
> > 
> > So what if:
> > 
> > 1. User space does EMODPR ioctl.
> > 2. Enclave does EACCEPT.
> > 3. Enclave does EMODPE.
> > 
> Could you elaborate on your exact concern here? EMODPE won't be able to
> restrict permissions, only add, so no way to cancel what's done by EMODPR if
> that's your concern.
> 
> And EMODPE would only affect EPCM not PTE. So if OS set PTE no matching
> EPCM, the enclave won't be able to use the page for added access.

The problem I see is clearly visible in your last sentence, if you think
about it. That's all I can add more to this discussion for the moment.

/Jarkko
Jarkko Sakkinen Jan. 14, 2022, 9:53 p.m. UTC | #35
On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> Hi Jarkko and Nathaniel,
> 
> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>
> >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>>
> >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> >>>>>>>> mechanism just
> >>>>>>>>>>>> to extend
> >>>>>>>>>>>>> permissions be sufficient?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I do believe that we need both in order to support pages
> >>>>>>>> having only
> >>>>>>>>>>>> the permissions required to support their intended use
> >>>>>>>> during the
> >>>>>>>>>>>> time the
> >>>>>>>>>>>> particular access is required. While technically it is
> >>>>>>>> possible to grant
> >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> >>>>>>>> is safer to
> >>>>>>>>>>>> remove permissions when no longer required.
> >>>>>>>>>>>
> >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>>>>>> how using it
> >>>>>>>>>>> would make things safer?
> >>>>>>>>>>>
> >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>>>>>> defensive
> >>>>>>>>>> measure. In that case, EMODPR is useful.
> >>>>>>>>>
> >>>>>>>>> What is the exact threat we are talking about?
> >>>>>>>>
> >>>>>>>> To add: it should be *significantly* critical thread, given that not
> >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> >>>>>>>> EACCEPT involvement.
> >>>>>>>>
> >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>>>>>> introduce
> >>>>>>>> it when there is PoC code for any of the existing run-time that
> >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>>>>>
> >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> >>>>>>>
> >>>>>>> At least it does not protected against enclave code because an enclave
> >>>>>>> can
> >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>>>>>> confused here about the actual threat but also the potential adversary
> >>>>>>> and
> >>>>>>> target.
> >>>>>>>
> >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> >>>>>> If enclave does not verify with EACCEPT, then its own code has
> >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> >>>>>> request EMODPR.
> >>>>>
> >>>>> The question is really simple: what is the threat scenario? In order to use
> >>>>> the word "vulnerability", you would need one.
> >>>>>
> >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> >>>>> one, in order to ack it to the mainline.
> >>>>>
> >>>>
> >>>> Which complexity related to EMODPR are you concerned about? In a later message
> >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> >>>> these flows in the OS, but would of course impact the enclave.
> >>>
> >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> >>> EMODPR going to help with any sort of workload?
> >>
> >> I've even started think should we just always allow mmap()?
> > 
> > I suspect this may be the most ergonomic way forward. Instructions
> > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > what the enclave wants, which is a memory mapping in the enclave. Why
> > make the enclave runner do multiple context switches just to change
> > the memory map of an enclave?
> 
> The enclave runner is not forced to make any changes to a memory mapping. To start,
> this implementation supports and does not change the existing ABI where a new
> memory mapping can only be created if its permissions are the same or weaker
> than the EPCM permissions. After the memory mapping is created the EPCM permissions
> can change (thanks to SGX2) and when they do there are no forced nor required
> changes to the memory mapping - pages remain accessible where the memory mapping
> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> to an enclave page (EMODPE) then the memory mapping may need to be changed as
> should be expected to access a page with permissions that the memory mapping
> did not previously allow.
> 
> Are you saying that the permissions of a new memory mapping should now be allowed
> to exceed EPCM permissions and thus the enclave runner would not need to modify a
> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> considered a change in ABI but something we could support on SGX2 systems.
> 
> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> and EMODPE to manage enclave page permissions.

Thanks, this was very informative response. I'll try to elaborate why
EMODPR gives me headaches.

I'm having hard time to connect the dots between OS mistrust and
restricting enclave by changing EPCM permissions. To make EMODPR actually
legit, it needs really at least some sort of example of a scenario where
mistrusted OS is the adversary and enclave is the attack target. Otherwise,
we are just waving our hands.

Generally speaking a restriction is not a restriction if cannot be enforced. 

I see two non-EMODPR options: you could relax this,  *or* you could make it
soft restriction by not doing EMODPR but instead just updating the internal
xarray. The 2nd option would be fully backwards compatible with the
existing invariant.

It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
or will be supported.

> Reinette
 
/Jarkko
Jarkko Sakkinen Jan. 14, 2022, 9:57 p.m. UTC | #36
On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > Hi Jarkko and Nathaniel,
> > 
> > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >>
> > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > >>>> Hi Jarkko,
> > >>>>
> > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > >>>>>>>> mechanism just
> > >>>>>>>>>>>> to extend
> > >>>>>>>>>>>>> permissions be sufficient?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > >>>>>>>> having only
> > >>>>>>>>>>>> the permissions required to support their intended use
> > >>>>>>>> during the
> > >>>>>>>>>>>> time the
> > >>>>>>>>>>>> particular access is required. While technically it is
> > >>>>>>>> possible to grant
> > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > >>>>>>>> is safer to
> > >>>>>>>>>>>> remove permissions when no longer required.
> > >>>>>>>>>>>
> > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > >>>>>>>> how using it
> > >>>>>>>>>>> would make things safer?
> > >>>>>>>>>>>
> > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > >>>>>>>> defensive
> > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > >>>>>>>>>
> > >>>>>>>>> What is the exact threat we are talking about?
> > >>>>>>>>
> > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > >>>>>>>> EACCEPT involvement.
> > >>>>>>>>
> > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > >>>>>>>> introduce
> > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > >>>>>>>>
> > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > >>>>>>>
> > >>>>>>> At least it does not protected against enclave code because an enclave
> > >>>>>>> can
> > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > >>>>>>> confused here about the actual threat but also the potential adversary
> > >>>>>>> and
> > >>>>>>> target.
> > >>>>>>>
> > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > >>>>>> request EMODPR.
> > >>>>>
> > >>>>> The question is really simple: what is the threat scenario? In order to use
> > >>>>> the word "vulnerability", you would need one.
> > >>>>>
> > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > >>>>> one, in order to ack it to the mainline.
> > >>>>>
> > >>>>
> > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > >>>> these flows in the OS, but would of course impact the enclave.
> > >>>
> > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > >>> EMODPR going to help with any sort of workload?
> > >>
> > >> I've even started think should we just always allow mmap()?
> > > 
> > > I suspect this may be the most ergonomic way forward. Instructions
> > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > make the enclave runner do multiple context switches just to change
> > > the memory map of an enclave?
> > 
> > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > this implementation supports and does not change the existing ABI where a new
> > memory mapping can only be created if its permissions are the same or weaker
> > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > can change (thanks to SGX2) and when they do there are no forced nor required
> > changes to the memory mapping - pages remain accessible where the memory mapping
> > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > should be expected to access a page with permissions that the memory mapping
> > did not previously allow.
> > 
> > Are you saying that the permissions of a new memory mapping should now be allowed
> > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > considered a change in ABI but something we could support on SGX2 systems.
> > 
> > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > and EMODPE to manage enclave page permissions.
> 
> Thanks, this was very informative response. I'll try to elaborate why
> EMODPR gives me headaches.
> 
> I'm having hard time to connect the dots between OS mistrust and
> restricting enclave by changing EPCM permissions. To make EMODPR actually
> legit, it needs really at least some sort of example of a scenario where
> mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> we are just waving our hands.
> 
> Generally speaking a restriction is not a restriction if cannot be enforced. 
> 
> I see two non-EMODPR options: you could relax this,  *or* you could make it
> soft restriction by not doing EMODPR but instead just updating the internal
> xarray. The 2nd option would be fully backwards compatible with the
> existing invariant.
> 
> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> or will be supported.

Off-topic: I was able to compile a kernel with your SGX2 patches and run
kselftests. I cannot give tested-by's before the design is locked-in but
in that sense I don't think we are far away of some solution. EAUG side
looks pretty good to me.

/Jarkko
Jarkko Sakkinen Jan. 14, 2022, 10 p.m. UTC | #37
On Fri, Jan 14, 2022 at 11:57:08PM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko and Nathaniel,
> > > 
> > > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >>
> > > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>>
> > > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > > >>>>>>>> mechanism just
> > > >>>>>>>>>>>> to extend
> > > >>>>>>>>>>>>> permissions be sufficient?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > > >>>>>>>> having only
> > > >>>>>>>>>>>> the permissions required to support their intended use
> > > >>>>>>>> during the
> > > >>>>>>>>>>>> time the
> > > >>>>>>>>>>>> particular access is required. While technically it is
> > > >>>>>>>> possible to grant
> > > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > > >>>>>>>> is safer to
> > > >>>>>>>>>>>> remove permissions when no longer required.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > >>>>>>>> how using it
> > > >>>>>>>>>>> would make things safer?
> > > >>>>>>>>>>>
> > > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > >>>>>>>> defensive
> > > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > > >>>>>>>>>
> > > >>>>>>>>> What is the exact threat we are talking about?
> > > >>>>>>>>
> > > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > > >>>>>>>> EACCEPT involvement.
> > > >>>>>>>>
> > > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > >>>>>>>> introduce
> > > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > > >>>>>>>>
> > > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > > >>>>>>>
> > > >>>>>>> At least it does not protected against enclave code because an enclave
> > > >>>>>>> can
> > > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > >>>>>>> confused here about the actual threat but also the potential adversary
> > > >>>>>>> and
> > > >>>>>>> target.
> > > >>>>>>>
> > > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > > >>>>>> request EMODPR.
> > > >>>>>
> > > >>>>> The question is really simple: what is the threat scenario? In order to use
> > > >>>>> the word "vulnerability", you would need one.
> > > >>>>>
> > > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > >>>>> one, in order to ack it to the mainline.
> > > >>>>>
> > > >>>>
> > > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > >>>> these flows in the OS, but would of course impact the enclave.
> > > >>>
> > > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > > >>> EMODPR going to help with any sort of workload?
> > > >>
> > > >> I've even started think should we just always allow mmap()?
> > > > 
> > > > I suspect this may be the most ergonomic way forward. Instructions
> > > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > > make the enclave runner do multiple context switches just to change
> > > > the memory map of an enclave?
> > > 
> > > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > > this implementation supports and does not change the existing ABI where a new
> > > memory mapping can only be created if its permissions are the same or weaker
> > > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > > can change (thanks to SGX2) and when they do there are no forced nor required
> > > changes to the memory mapping - pages remain accessible where the memory mapping
> > > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > > should be expected to access a page with permissions that the memory mapping
> > > did not previously allow.
> > > 
> > > Are you saying that the permissions of a new memory mapping should now be allowed
> > > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > > considered a change in ABI but something we could support on SGX2 systems.
> > > 
> > > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > > and EMODPE to manage enclave page permissions.
> > 
> > Thanks, this was very informative response. I'll try to elaborate why
> > EMODPR gives me headaches.
> > 
> > I'm having hard time to connect the dots between OS mistrust and
> > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > legit, it needs really at least some sort of example of a scenario where
> > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > we are just waving our hands.
> > 
> > Generally speaking a restriction is not a restriction if cannot be enforced. 
> > 
> > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > soft restriction by not doing EMODPR but instead just updating the internal
> > xarray. The 2nd option would be fully backwards compatible with the
> > existing invariant.
> > 
> > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > or will be supported.
> 
> Off-topic: I was able to compile a kernel with your SGX2 patches and run
> kselftests. I cannot give tested-by's before the design is locked-in but
> in that sense I don't think we are far away of some solution. EAUG side
> looks pretty good to me.

Also, I'll add the missing ioctl's to my man page patch before sending a
new version so that it lacks only SGX2 ioctls. I think you are right in
your review feedback that they should be part of the patch so that we get
it in-sync.

/Jarkko
Jarkko Sakkinen Jan. 14, 2022, 10:17 p.m. UTC | #38
On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > Hi Jarkko and Nathaniel,
> > 
> > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >>
> > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > >>>> Hi Jarkko,
> > >>>>
> > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > >>>>>>>> mechanism just
> > >>>>>>>>>>>> to extend
> > >>>>>>>>>>>>> permissions be sufficient?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > >>>>>>>> having only
> > >>>>>>>>>>>> the permissions required to support their intended use
> > >>>>>>>> during the
> > >>>>>>>>>>>> time the
> > >>>>>>>>>>>> particular access is required. While technically it is
> > >>>>>>>> possible to grant
> > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > >>>>>>>> is safer to
> > >>>>>>>>>>>> remove permissions when no longer required.
> > >>>>>>>>>>>
> > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > >>>>>>>> how using it
> > >>>>>>>>>>> would make things safer?
> > >>>>>>>>>>>
> > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > >>>>>>>> defensive
> > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > >>>>>>>>>
> > >>>>>>>>> What is the exact threat we are talking about?
> > >>>>>>>>
> > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > >>>>>>>> EACCEPT involvement.
> > >>>>>>>>
> > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > >>>>>>>> introduce
> > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > >>>>>>>>
> > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > >>>>>>>
> > >>>>>>> At least it does not protected against enclave code because an enclave
> > >>>>>>> can
> > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > >>>>>>> confused here about the actual threat but also the potential adversary
> > >>>>>>> and
> > >>>>>>> target.
> > >>>>>>>
> > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > >>>>>> request EMODPR.
> > >>>>>
> > >>>>> The question is really simple: what is the threat scenario? In order to use
> > >>>>> the word "vulnerability", you would need one.
> > >>>>>
> > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > >>>>> one, in order to ack it to the mainline.
> > >>>>>
> > >>>>
> > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > >>>> these flows in the OS, but would of course impact the enclave.
> > >>>
> > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > >>> EMODPR going to help with any sort of workload?
> > >>
> > >> I've even started think should we just always allow mmap()?
> > > 
> > > I suspect this may be the most ergonomic way forward. Instructions
> > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > make the enclave runner do multiple context switches just to change
> > > the memory map of an enclave?
> > 
> > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > this implementation supports and does not change the existing ABI where a new
> > memory mapping can only be created if its permissions are the same or weaker
> > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > can change (thanks to SGX2) and when they do there are no forced nor required
> > changes to the memory mapping - pages remain accessible where the memory mapping
> > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > should be expected to access a page with permissions that the memory mapping
> > did not previously allow.
> > 
> > Are you saying that the permissions of a new memory mapping should now be allowed
> > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > considered a change in ABI but something we could support on SGX2 systems.
> > 
> > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > and EMODPE to manage enclave page permissions.
> 
> Thanks, this was very informative response. I'll try to elaborate why
> EMODPR gives me headaches.
> 
> I'm having hard time to connect the dots between OS mistrust and
> restricting enclave by changing EPCM permissions. To make EMODPR actually
> legit, it needs really at least some sort of example of a scenario where
> mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> we are just waving our hands.
> 
> Generally speaking a restriction is not a restriction if cannot be enforced. 
> 
> I see two non-EMODPR options: you could relax this,  *or* you could make it
> soft restriction by not doing EMODPR but instead just updating the internal
> xarray. The 2nd option would be fully backwards compatible with the
> existing invariant.
> 
> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> or will be supported.

I think I *might* have a supporting scenario for EMODPR.

Enclave might want to accept EMODPR request because a bug in functionality
triggered with TCS entries might allow otherwise to rewrite enclave data,
i.e. provide a write primitive outside the enclave. With some other way to
exploit you could have a read primitive and thus have a full access to the
internal data of the enclave.

/Jarkko
Jarkko Sakkinen Jan. 14, 2022, 10:23 p.m. UTC | #39
On Sat, Jan 15, 2022 at 12:17:06AM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko and Nathaniel,
> > > 
> > > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >>
> > > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>>
> > > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > > >>>>>>>> mechanism just
> > > >>>>>>>>>>>> to extend
> > > >>>>>>>>>>>>> permissions be sufficient?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > > >>>>>>>> having only
> > > >>>>>>>>>>>> the permissions required to support their intended use
> > > >>>>>>>> during the
> > > >>>>>>>>>>>> time the
> > > >>>>>>>>>>>> particular access is required. While technically it is
> > > >>>>>>>> possible to grant
> > > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > > >>>>>>>> is safer to
> > > >>>>>>>>>>>> remove permissions when no longer required.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > >>>>>>>> how using it
> > > >>>>>>>>>>> would make things safer?
> > > >>>>>>>>>>>
> > > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > >>>>>>>> defensive
> > > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > > >>>>>>>>>
> > > >>>>>>>>> What is the exact threat we are talking about?
> > > >>>>>>>>
> > > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > > >>>>>>>> EACCEPT involvement.
> > > >>>>>>>>
> > > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > >>>>>>>> introduce
> > > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > > >>>>>>>>
> > > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > > >>>>>>>
> > > >>>>>>> At least it does not protected against enclave code because an enclave
> > > >>>>>>> can
> > > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > >>>>>>> confused here about the actual threat but also the potential adversary
> > > >>>>>>> and
> > > >>>>>>> target.
> > > >>>>>>>
> > > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > > >>>>>> request EMODPR.
> > > >>>>>
> > > >>>>> The question is really simple: what is the threat scenario? In order to use
> > > >>>>> the word "vulnerability", you would need one.
> > > >>>>>
> > > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > >>>>> one, in order to ack it to the mainline.
> > > >>>>>
> > > >>>>
> > > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > >>>> these flows in the OS, but would of course impact the enclave.
> > > >>>
> > > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > > >>> EMODPR going to help with any sort of workload?
> > > >>
> > > >> I've even started think should we just always allow mmap()?
> > > > 
> > > > I suspect this may be the most ergonomic way forward. Instructions
> > > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > > make the enclave runner do multiple context switches just to change
> > > > the memory map of an enclave?
> > > 
> > > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > > this implementation supports and does not change the existing ABI where a new
> > > memory mapping can only be created if its permissions are the same or weaker
> > > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > > can change (thanks to SGX2) and when they do there are no forced nor required
> > > changes to the memory mapping - pages remain accessible where the memory mapping
> > > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > > should be expected to access a page with permissions that the memory mapping
> > > did not previously allow.
> > > 
> > > Are you saying that the permissions of a new memory mapping should now be allowed
> > > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > > considered a change in ABI but something we could support on SGX2 systems.
> > > 
> > > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > > and EMODPE to manage enclave page permissions.
> > 
> > Thanks, this was very informative response. I'll try to elaborate why
> > EMODPR gives me headaches.
> > 
> > I'm having hard time to connect the dots between OS mistrust and
> > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > legit, it needs really at least some sort of example of a scenario where
> > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > we are just waving our hands.
> > 
> > Generally speaking a restriction is not a restriction if cannot be enforced. 
> > 
> > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > soft restriction by not doing EMODPR but instead just updating the internal
> > xarray. The 2nd option would be fully backwards compatible with the
> > existing invariant.
> > 
> > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > or will be supported.
> 
> I think I *might* have a supporting scenario for EMODPR.
> 
> Enclave might want to accept EMODPR request because a bug in functionality
> triggered with TCS entries might allow otherwise to rewrite enclave data,
> i.e. provide a write primitive outside the enclave. With some other way to
> exploit you could have a read primitive and thus have a full access to the
> internal data of the enclave.

I.e. because of this it would be "for profit case" for the enclave not to
cancel the effect of EMODPR by applying EMODPE because it can protect
itself by doing that from malformed input data.

I get that the whole point is the OS mistrust but you really need to bring
up the rationale to the specifics what you mean by it in the context of the
kernel patch. Otherwise, anything would go by saying that we do this
because OS mistrust.

/Jarkko
Jarkko Sakkinen Jan. 14, 2022, 10:34 p.m. UTC | #40
On Sat, Jan 15, 2022 at 12:23:46AM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 15, 2022 at 12:17:06AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> > > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > > > Hi Jarkko and Nathaniel,
> > > > 
> > > > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > > >>
> > > > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > > > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > > >>>> Hi Jarkko,
> > > > >>>>
> > > > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > > > >>>>>>>> mechanism just
> > > > >>>>>>>>>>>> to extend
> > > > >>>>>>>>>>>>> permissions be sufficient?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > > > >>>>>>>> having only
> > > > >>>>>>>>>>>> the permissions required to support their intended use
> > > > >>>>>>>> during the
> > > > >>>>>>>>>>>> time the
> > > > >>>>>>>>>>>> particular access is required. While technically it is
> > > > >>>>>>>> possible to grant
> > > > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > > > >>>>>>>> is safer to
> > > > >>>>>>>>>>>> remove permissions when no longer required.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > > >>>>>>>> how using it
> > > > >>>>>>>>>>> would make things safer?
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > > >>>>>>>> defensive
> > > > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > > > >>>>>>>>>
> > > > >>>>>>>>> What is the exact threat we are talking about?
> > > > >>>>>>>>
> > > > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > > > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > > > >>>>>>>> EACCEPT involvement.
> > > > >>>>>>>>
> > > > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > >>>>>>>> introduce
> > > > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > > > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > > > >>>>>>>>
> > > > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > > > >>>>>>>
> > > > >>>>>>> At least it does not protected against enclave code because an enclave
> > > > >>>>>>> can
> > > > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > > >>>>>>> confused here about the actual threat but also the potential adversary
> > > > >>>>>>> and
> > > > >>>>>>> target.
> > > > >>>>>>>
> > > > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > > > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > > > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > > > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > > > >>>>>> request EMODPR.
> > > > >>>>>
> > > > >>>>> The question is really simple: what is the threat scenario? In order to use
> > > > >>>>> the word "vulnerability", you would need one.
> > > > >>>>>
> > > > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > > >>>>> one, in order to ack it to the mainline.
> > > > >>>>>
> > > > >>>>
> > > > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > > > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > > > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > > >>>> these flows in the OS, but would of course impact the enclave.
> > > > >>>
> > > > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > > > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > > > >>> EMODPR going to help with any sort of workload?
> > > > >>
> > > > >> I've even started think should we just always allow mmap()?
> > > > > 
> > > > > I suspect this may be the most ergonomic way forward. Instructions
> > > > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > > > make the enclave runner do multiple context switches just to change
> > > > > the memory map of an enclave?
> > > > 
> > > > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > > > this implementation supports and does not change the existing ABI where a new
> > > > memory mapping can only be created if its permissions are the same or weaker
> > > > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > > > can change (thanks to SGX2) and when they do there are no forced nor required
> > > > changes to the memory mapping - pages remain accessible where the memory mapping
> > > > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > > > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > > > should be expected to access a page with permissions that the memory mapping
> > > > did not previously allow.
> > > > 
> > > > Are you saying that the permissions of a new memory mapping should now be allowed
> > > > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > > > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > > > considered a change in ABI but something we could support on SGX2 systems.
> > > > 
> > > > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > > > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > > > and EMODPE to manage enclave page permissions.
> > > 
> > > Thanks, this was very informative response. I'll try to elaborate why
> > > EMODPR gives me headaches.
> > > 
> > > I'm having hard time to connect the dots between OS mistrust and
> > > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > > legit, it needs really at least some sort of example of a scenario where
> > > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > > we are just waving our hands.
> > > 
> > > Generally speaking a restriction is not a restriction if cannot be enforced. 
> > > 
> > > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > > soft restriction by not doing EMODPR but instead just updating the internal
> > > xarray. The 2nd option would be fully backwards compatible with the
> > > existing invariant.
> > > 
> > > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > > or will be supported.
> > 
> > I think I *might* have a supporting scenario for EMODPR.
> > 
> > Enclave might want to accept EMODPR request because a bug in functionality
> > triggered with TCS entries might allow otherwise to rewrite enclave data,
> > i.e. provide a write primitive outside the enclave. With some other way to
> > exploit you could have a read primitive and thus have a full access to the
> > internal data of the enclave.
> 
> I.e. because of this it would be "for profit case" for the enclave not to
> cancel the effect of EMODPR by applying EMODPE because it can protect
> itself by doing that from malformed input data.
> 
> I get that the whole point is the OS mistrust but you really need to bring
> up the rationale to the specifics what you mean by it in the context of the
> kernel patch. Otherwise, anything would go by saying that we do this
> because OS mistrust.

My scenario is illegit because:

1. An attacker can choose not to do EMODPR and still take advantage of the
   exploit, and get the write primitive.
2. Enclave has very theoretical chances to counter-measure that because
   introspection is not possible, only the "mistrusted OS" has that
   capability, i.e. the attacker. ERDINFO is AFAIK ENCLS leaf.

/Jarkko
Reinette Chatre Jan. 14, 2022, 11:05 p.m. UTC | #41
Hi Jarkko,

On 1/14/2022 1:53 PM, Jarkko Sakkinen wrote:
> On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
>> Hi Jarkko and Nathaniel,
>>
>> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
>>> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>
>>>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
>>>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
>>>>>> Hi Jarkko,
>>>>>>
>>>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
>>>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>>>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>>>>>>>> OK, so the question is: do we need both or would a
>>>>>>>>>> mechanism just
>>>>>>>>>>>>>> to extend
>>>>>>>>>>>>>>> permissions be sufficient?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do believe that we need both in order to support pages
>>>>>>>>>> having only
>>>>>>>>>>>>>> the permissions required to support their intended use
>>>>>>>>>> during the
>>>>>>>>>>>>>> time the
>>>>>>>>>>>>>> particular access is required. While technically it is
>>>>>>>>>> possible to grant
>>>>>>>>>>>>>> pages all permissions they may need during their lifetime it
>>>>>>>>>> is safer to
>>>>>>>>>>>>>> remove permissions when no longer required.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>>>>>>>> how using it
>>>>>>>>>>>>> would make things safer?
>>>>>>>>>>>>>
>>>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>>>>>>>> defensive
>>>>>>>>>>>> measure. In that case, EMODPR is useful.
>>>>>>>>>>>
>>>>>>>>>>> What is the exact threat we are talking about?
>>>>>>>>>>
>>>>>>>>>> To add: it should be *significantly* critical thread, given that not
>>>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
>>>>>>>>>> EACCEPT involvement.
>>>>>>>>>>
>>>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>>>>>>>> introduce
>>>>>>>>>> it when there is PoC code for any of the existing run-time that
>>>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>>>>>>>
>>>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
>>>>>>>>>
>>>>>>>>> At least it does not protected against enclave code because an enclave
>>>>>>>>> can
>>>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>>>>>>>> confused here about the actual threat but also the potential adversary
>>>>>>>>> and
>>>>>>>>> target.
>>>>>>>>>
>>>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>>>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
>>>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
>>>>>>>> If enclave does not verify with EACCEPT, then its own code has
>>>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
>>>>>>>> request EMODPR.
>>>>>>>
>>>>>>> The question is really simple: what is the threat scenario? In order to use
>>>>>>> the word "vulnerability", you would need one.
>>>>>>>
>>>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
>>>>>>> one, in order to ack it to the mainline.
>>>>>>>
>>>>>>
>>>>>> Which complexity related to EMODPR are you concerned about? In a later message
>>>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
>>>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
>>>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
>>>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
>>>>>> these flows in the OS, but would of course impact the enclave.
>>>>>
>>>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
>>>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
>>>>> EMODPR going to help with any sort of workload?
>>>>
>>>> I've even started think should we just always allow mmap()?
>>>
>>> I suspect this may be the most ergonomic way forward. Instructions
>>> like EAUG/EMODPR/etc are really irrelevant implementation details to
>>> what the enclave wants, which is a memory mapping in the enclave. Why
>>> make the enclave runner do multiple context switches just to change
>>> the memory map of an enclave?
>>
>> The enclave runner is not forced to make any changes to a memory mapping. To start,
>> this implementation supports and does not change the existing ABI where a new
>> memory mapping can only be created if its permissions are the same or weaker
>> than the EPCM permissions. After the memory mapping is created the EPCM permissions
>> can change (thanks to SGX2) and when they do there are no forced nor required
>> changes to the memory mapping - pages remain accessible where the memory mapping
>> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
>> to an enclave page (EMODPE) then the memory mapping may need to be changed as
>> should be expected to access a page with permissions that the memory mapping
>> did not previously allow.
>>
>> Are you saying that the permissions of a new memory mapping should now be allowed
>> to exceed EPCM permissions and thus the enclave runner would not need to modify a
>> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
>> considered a change in ABI but something we could support on SGX2 systems.
>>
>> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
>> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
>> and EMODPE to manage enclave page permissions.
> 
> Thanks, this was very informative response. I'll try to elaborate why
> EMODPR gives me headaches.
> 
> I'm having hard time to connect the dots between OS mistrust and
> restricting enclave by changing EPCM permissions. To make EMODPR actually
> legit, it needs really at least some sort of example of a scenario where
> mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> we are just waving our hands.

The enclave itself should run in an environment that respects the foundational
security principle of least privilege. There are valid scenarios where an enclave
may need to write to a page and then later execute from it, but it should
ideally not have those permissions for its entire lifetime. As Haitao mentioned
this is to protect it from itself, which can be bugs in the code or maybe even
malicious code and is similar to how non enclave code is treated today. So in your
request I interpret this to mean that it is the enclave that is both the
adversary and the target.

At the same time the ones running the enclaves do not trust the OS to manage
access to enclave pages and that is how we ended up with the need for these SGX2
permission functions needed to change the EPCM permissions.

> 
> Generally speaking a restriction is not a restriction if cannot be enforced. 

The EPCM permissions are enforced by the hardware and in addition
in this implementation the OS will not add PTEs that exceed the EPCM permissions.

> 
> I see two non-EMODPR options: you could relax this,  *or* you could make it
> soft restriction by not doing EMODPR but instead just updating the internal
> xarray. The 2nd option would be fully backwards compatible with the
> existing invariant.

Just updating OS structures would not be sufficient. In addition to needing to
update its EPCM permissions for its security the enclave needs to ensure that
there are no cached addresses in the TLB that may still allow access to pages
with permissions that were removed.

> 
> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> or will be supported.

EMODPE is currently supported and you can see an example of its use
in the testing code that forms part of this series. Please see
"[PATCH 11/25] selftests/sgx: Add test for EPCM permission changes" where you
will find support for calling ENCLU[EMODPE] added to the test enclave and the 
"epcm_permissions" test making use of it.

After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
to communicate the new permissions to the OS. Since the OS does not know when/if 
ENCLU[EMODPE] has indeed been called all permission changes are treated as
permission restriction and will thus, in addition to removing the page table
entries, call ENCLS[EMODPR], which will be a no-op (apart from EPCM.PR being set)
if user space only did call ENCLU[EMODPE]. The page remains accessible even though
EPCM.PR is set but the current implementations follows a
SGX_IOC_ENCLAVE_MOD_PROTECTIONS call with an ENCLU[EACCEPT] with the PR bit set
to ensure there are no dangling PR bit set in EPCM.

Reinette
Jarkko Sakkinen Jan. 14, 2022, 11:15 p.m. UTC | #42
On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/14/2022 1:53 PM, Jarkko Sakkinen wrote:
> > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko and Nathaniel,
> >>
> >> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> >>> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>>>
> >>>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> >>>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> >>>>>> Hi Jarkko,
> >>>>>>
> >>>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> >>>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >>>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>>>>>>>> OK, so the question is: do we need both or would a
> >>>>>>>>>> mechanism just
> >>>>>>>>>>>>>> to extend
> >>>>>>>>>>>>>>> permissions be sufficient?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I do believe that we need both in order to support pages
> >>>>>>>>>> having only
> >>>>>>>>>>>>>> the permissions required to support their intended use
> >>>>>>>>>> during the
> >>>>>>>>>>>>>> time the
> >>>>>>>>>>>>>> particular access is required. While technically it is
> >>>>>>>>>> possible to grant
> >>>>>>>>>>>>>> pages all permissions they may need during their lifetime it
> >>>>>>>>>> is safer to
> >>>>>>>>>>>>>> remove permissions when no longer required.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>>>>>>>> how using it
> >>>>>>>>>>>>> would make things safer?
> >>>>>>>>>>>>>
> >>>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>>>>>>>> defensive
> >>>>>>>>>>>> measure. In that case, EMODPR is useful.
> >>>>>>>>>>>
> >>>>>>>>>>> What is the exact threat we are talking about?
> >>>>>>>>>>
> >>>>>>>>>> To add: it should be *significantly* critical thread, given that not
> >>>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> >>>>>>>>>> EACCEPT involvement.
> >>>>>>>>>>
> >>>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>>>>>>>> introduce
> >>>>>>>>>> it when there is PoC code for any of the existing run-time that
> >>>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>>>>>>>
> >>>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> >>>>>>>>>
> >>>>>>>>> At least it does not protected against enclave code because an enclave
> >>>>>>>>> can
> >>>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>>>>>>>> confused here about the actual threat but also the potential adversary
> >>>>>>>>> and
> >>>>>>>>> target.
> >>>>>>>>>
> >>>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >>>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> >>>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> >>>>>>>> If enclave does not verify with EACCEPT, then its own code has
> >>>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> >>>>>>>> request EMODPR.
> >>>>>>>
> >>>>>>> The question is really simple: what is the threat scenario? In order to use
> >>>>>>> the word "vulnerability", you would need one.
> >>>>>>>
> >>>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> >>>>>>> one, in order to ack it to the mainline.
> >>>>>>>
> >>>>>>
> >>>>>> Which complexity related to EMODPR are you concerned about? In a later message
> >>>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> >>>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> >>>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> >>>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> >>>>>> these flows in the OS, but would of course impact the enclave.
> >>>>>
> >>>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> >>>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> >>>>> EMODPR going to help with any sort of workload?
> >>>>
> >>>> I've even started think should we just always allow mmap()?
> >>>
> >>> I suspect this may be the most ergonomic way forward. Instructions
> >>> like EAUG/EMODPR/etc are really irrelevant implementation details to
> >>> what the enclave wants, which is a memory mapping in the enclave. Why
> >>> make the enclave runner do multiple context switches just to change
> >>> the memory map of an enclave?
> >>
> >> The enclave runner is not forced to make any changes to a memory mapping. To start,
> >> this implementation supports and does not change the existing ABI where a new
> >> memory mapping can only be created if its permissions are the same or weaker
> >> than the EPCM permissions. After the memory mapping is created the EPCM permissions
> >> can change (thanks to SGX2) and when they do there are no forced nor required
> >> changes to the memory mapping - pages remain accessible where the memory mapping
> >> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> >> to an enclave page (EMODPE) then the memory mapping may need to be changed as
> >> should be expected to access a page with permissions that the memory mapping
> >> did not previously allow.
> >>
> >> Are you saying that the permissions of a new memory mapping should now be allowed
> >> to exceed EPCM permissions and thus the enclave runner would not need to modify a
> >> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> >> considered a change in ABI but something we could support on SGX2 systems.
> >>
> >> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> >> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> >> and EMODPE to manage enclave page permissions.
> > 
> > Thanks, this was very informative response. I'll try to elaborate why
> > EMODPR gives me headaches.
> > 
> > I'm having hard time to connect the dots between OS mistrust and
> > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > legit, it needs really at least some sort of example of a scenario where
> > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > we are just waving our hands.
> 
> The enclave itself should run in an environment that respects the foundational
> security principle of least privilege. There are valid scenarios where an enclave
> may need to write to a page and then later execute from it, but it should
> ideally not have those permissions for its entire lifetime. As Haitao mentioned
> this is to protect it from itself, which can be bugs in the code or maybe even
> malicious code and is similar to how non enclave code is treated today. So in your
> request I interpret this to mean that it is the enclave that is both the
> adversary and the target.
> 
> At the same time the ones running the enclaves do not trust the OS to manage
> access to enclave pages and that is how we ended up with the need for these SGX2
> permission functions needed to change the EPCM permissions.
> 
> > 
> > Generally speaking a restriction is not a restriction if cannot be enforced. 
> 
> The EPCM permissions are enforced by the hardware and in addition
> in this implementation the OS will not add PTEs that exceed the EPCM permissions.
> 
> > 
> > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > soft restriction by not doing EMODPR but instead just updating the internal
> > xarray. The 2nd option would be fully backwards compatible with the
> > existing invariant.
> 
> Just updating OS structures would not be sufficient. In addition to needing to
> update its EPCM permissions for its security the enclave needs to ensure that
> there are no cached addresses in the TLB that may still allow access to pages
> with permissions that were removed.

How enclave can check a page range that EPCM has the expected permissions?

I'd expect OS mistrusting enclave to do such check at the start of TCS.

Otherwise, it cannot be sure whether EMODPR was ever requested, and thus
it plays zero part in the game.

You would get equivalent level of security by just modifying the xarray,
and not doing EMODPR.

> > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > or will be supported.
> 
> EMODPE is currently supported and you can see an example of its use
> in the testing code that forms part of this series. Please see
> "[PATCH 11/25] selftests/sgx: Add test for EPCM permission changes" where you
> will find support for calling ENCLU[EMODPE] added to the test enclave and the 
> "epcm_permissions" test making use of it.
> 
> After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS

OK, great.

A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 

> to communicate the new permissions to the OS. Since the OS does not know when/if 
> ENCLU[EMODPE] has indeed been called all permission changes are treated as
> permission restriction and will thus, in addition to removing the page table
> entries, call ENCLS[EMODPR], which will be a no-op (apart from EPCM.PR being set)
> if user space only did call ENCLU[EMODPE]. The page remains accessible even though
> EPCM.PR is set but the current implementations follows a
> SGX_IOC_ENCLAVE_MOD_PROTECTIONS call with an ENCLU[EACCEPT] with the PR bit set
> to ensure there are no dangling PR bit set in EPCM.
> 
> Reinette

/Jarkko
Reinette Chatre Jan. 15, 2022, 12:01 a.m. UTC | #43
Hi Jarkko,

On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
> 
> How enclave can check a page range that EPCM has the expected permissions?

Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
time the enclave provides the expected permissions and that will fail
if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).

> 
> I'd expect OS mistrusting enclave to do such check at the start of TCS.
> 
> Otherwise, it cannot be sure whether EMODPR was ever requested, and thus
> it plays zero part in the game.

The EPCM would always have a PR bit set after EMODPR was requested.

> 
> You would get equivalent level of security by just modifying the xarray,
> and not doing EMODPR.

The xarray is an internal SGX driver structure that can guide how the OS manages
page permissions (via VMA permissions or PTEs). This brings us back to the
fact that the OS is not trusted and a malicious OS may install PTEs that
allow full access to enclave pages and that is why the enclave needs/wants
to control its own permissions via the EPCM permissions that are managed
with the ENCLS[EMODPR] and ENCLU[EMODPE] instructions.

 
>>> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
>>> or will be supported.
>>
>> EMODPE is currently supported and you can see an example of its use
>> in the testing code that forms part of this series. Please see
>> "[PATCH 11/25] selftests/sgx: Add test for EPCM permission changes" where you
>> will find support for calling ENCLU[EMODPE] added to the test enclave and the 
>> "epcm_permissions" test making use of it.
>>
>> After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
> 
> OK, great.
> 
> A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 

Sure. (btw ... I was following guidance from:
https://lore.kernel.org/lkml/Yav0%2F3jeJsuT3yEq@iki.fi/ ).

Reinette
Jarkko Sakkinen Jan. 15, 2022, 12:27 a.m. UTC | #44
On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> > 
> > How enclave can check a page range that EPCM has the expected permissions?
> 
> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> time the enclave provides the expected permissions and that will fail
> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).

This is a very valid point but that does make the introspection possible
only at the time of EACCEPT.

It does not give tools for enclave to make sure that EMODPR-ETRACK dance
was ever exercised.

/Jarkko
Reinette Chatre Jan. 15, 2022, 12:41 a.m. UTC | #45
Hi Jarkko,

On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
>>>> Hi Jarkko,
>>>
>>> How enclave can check a page range that EPCM has the expected permissions?
>>
>> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
>> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
>> time the enclave provides the expected permissions and that will fail
>> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> 
> This is a very valid point but that does make the introspection possible
> only at the time of EACCEPT.
> 
> It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> was ever exercised.

Could you please elaborate? EACCEPT is available to the enclave as a tool
and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).

Here is the relevant snippet from the SDM from the section where it
describes EACCEPT:

IF (Tracking not correct)
    THEN
        RFLAGS.ZF := 1;
        RAX := SGX_NOT_TRACKED;
        GOTO DONE;
FI;

Reinette
Jarkko Sakkinen Jan. 15, 2022, 1:18 a.m. UTC | #46
On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>
> >>> How enclave can check a page range that EPCM has the expected permissions?
> >>
> >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> >> time the enclave provides the expected permissions and that will fail
> >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > 
> > This is a very valid point but that does make the introspection possible
> > only at the time of EACCEPT.
> > 
> > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > was ever exercised.
> 
> Could you please elaborate? EACCEPT is available to the enclave as a tool
> and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> 
> Here is the relevant snippet from the SDM from the section where it
> describes EACCEPT:
> 
> IF (Tracking not correct)
>     THEN
>         RFLAGS.ZF := 1;
>         RAX := SGX_NOT_TRACKED;
>         GOTO DONE;
> FI;
> 
> Reinette

Yes, if enclave calls EACCEPT it does the necessary introspection and makes
sure that ETRACK is completed. I have trouble understanding how enclave
makes sure that EACCEPT was called.

/Jarkko
Jarkko Sakkinen Jan. 15, 2022, 11:56 a.m. UTC | #47
On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > Hi Jarkko,
> > 
> > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > >> Hi Jarkko,
> > >>
> > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > >>>> Hi Jarkko,
> > >>>
> > >>> How enclave can check a page range that EPCM has the expected permissions?
> > >>
> > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > >> time the enclave provides the expected permissions and that will fail
> > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > 
> > > This is a very valid point but that does make the introspection possible
> > > only at the time of EACCEPT.
> > > 
> > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > was ever exercised.
> > 
> > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > 
> > Here is the relevant snippet from the SDM from the section where it
> > describes EACCEPT:
> > 
> > IF (Tracking not correct)
> >     THEN
> >         RFLAGS.ZF := 1;
> >         RAX := SGX_NOT_TRACKED;
> >         GOTO DONE;
> > FI;
> > 
> > Reinette
> 
> Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> sure that ETRACK is completed. I have trouble understanding how enclave
> makes sure that EACCEPT was called.

I'm not concerned of anything going wrong once EMODPR has been started.

The problem nails down to that the whole EMODPR process is spawned by
the entity that is not trusted so maybe that should further broke down
to three roles:

1. Build process B
2. Runner process R.
3. Enclave E.

And to the costraint that we trust B *more* than R. Once B has done all the
needed EMODPR calls it would send the file descriptor to R. Even if R would
have full access to /dev/sgx_enclave, it would not matter, since B has done
EMODPR-EACCEPT dance with E.

So what you can achieve with EMODPR is not protection against mistrusted
*OS*. There's absolutely no chance you could use it for that purpose
because mistrusted OS controls the whole process.

EMODPR is to help to protect enclave against mistrusted *process*, i.e.
in the above scenario R.

/Jarkko
Jarkko Sakkinen Jan. 15, 2022, 11:59 a.m. UTC | #48
On Sat, Jan 15, 2022 at 01:56:55PM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > >> Hi Jarkko,
> > > >>
> > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>
> > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > >>
> > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > >> time the enclave provides the expected permissions and that will fail
> > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > 
> > > > This is a very valid point but that does make the introspection possible
> > > > only at the time of EACCEPT.
> > > > 
> > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > was ever exercised.
> > > 
> > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > 
> > > Here is the relevant snippet from the SDM from the section where it
> > > describes EACCEPT:
> > > 
> > > IF (Tracking not correct)
> > >     THEN
> > >         RFLAGS.ZF := 1;
> > >         RAX := SGX_NOT_TRACKED;
> > >         GOTO DONE;
> > > FI;
> > > 
> > > Reinette
> > 
> > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > sure that ETRACK is completed. I have trouble understanding how enclave
> > makes sure that EACCEPT was called.
> 
> I'm not concerned of anything going wrong once EMODPR has been started.
> 
> The problem nails down to that the whole EMODPR process is spawned by
> the entity that is not trusted so maybe that should further broke down
> to three roles:
> 
> 1. Build process B
> 2. Runner process R.
> 3. Enclave E.
> 
> And to the costraint that we trust B *more* than R. Once B has done all the
> needed EMODPR calls it would send the file descriptor to R. Even if R would
> have full access to /dev/sgx_enclave, it would not matter, since B has done
> EMODPR-EACCEPT dance with E.
> 
> So what you can achieve with EMODPR is not protection against mistrusted
> *OS*. There's absolutely no chance you could use it for that purpose
> because mistrusted OS controls the whole process.
> 
> EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> in the above scenario R.

My suggestion for this is that let's make EMODPR either opt-in or opt-out
with flags parameter, I don't care which way around. That way you can pick
between performance or extra layer of hardening if you care about the above
scenario.

/Jarkko
Jarkko Sakkinen Jan. 15, 2022, 4:49 p.m. UTC | #49
On Sat, Jan 15, 2022 at 01:15:53AM +0200, Jarkko Sakkinen wrote:
> > After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
> 
> OK, great.
> 
> A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 

I'm not confident after looking through the test case and ioctl
about EMODPE support but I do not want disturb this anymore. Bunch
of things have been nailed and I'm now running the code, which is
great.

The obviously wrong implementation choice in this ioctl is that
it is multi-function. It should be just split it into two ioctls:
sgx_restrict_page_permissions and sgx_extend_page_permissions.

They are conceptually different flows and I'm also basing this on earlier
discussion in this mailing list from which I conclude that it is also
consensus to not have such ioctls.

Might sound clanky but it is much easier to comprehend what is going
on "in the blackbox" by doing that split.

/Jarkko
Nathaniel McCallum Jan. 17, 2022, 1:13 p.m. UTC | #50
On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > >
> > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > >> Hi Jarkko,
> > > >>
> > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>
> > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > >>
> > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > >> time the enclave provides the expected permissions and that will fail
> > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > >
> > > > This is a very valid point but that does make the introspection possible
> > > > only at the time of EACCEPT.
> > > >
> > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > was ever exercised.
> > >
> > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > >
> > > Here is the relevant snippet from the SDM from the section where it
> > > describes EACCEPT:
> > >
> > > IF (Tracking not correct)
> > >     THEN
> > >         RFLAGS.ZF := 1;
> > >         RAX := SGX_NOT_TRACKED;
> > >         GOTO DONE;
> > > FI;
> > >
> > > Reinette
> >
> > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > sure that ETRACK is completed. I have trouble understanding how enclave
> > makes sure that EACCEPT was called.
>
> I'm not concerned of anything going wrong once EMODPR has been started.
>
> The problem nails down to that the whole EMODPR process is spawned by
> the entity that is not trusted so maybe that should further broke down
> to three roles:
>
> 1. Build process B
> 2. Runner process R.
> 3. Enclave E.
>
> And to the costraint that we trust B *more* than R. Once B has done all the
> needed EMODPR calls it would send the file descriptor to R. Even if R would
> have full access to /dev/sgx_enclave, it would not matter, since B has done
> EMODPR-EACCEPT dance with E.
>
> So what you can achieve with EMODPR is not protection against mistrusted
> *OS*. There's absolutely no chance you could use it for that purpose
> because mistrusted OS controls the whole process.
>
> EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> in the above scenario R.

There are two general cases that I can see. Both are valid.

1. The OS moves from a trusted to an untrusted state. This could be
the multi-process system you've described. But it could also be that
the kernel becomes compromised after the enclave is fully initialized.

2. The OS is untrustworthy from the start.

The second case is the stronger one and if you can solve it, the first
one is solved implicitly. And our end goal is that if the OS does
anything malicious we will crash in a controlled way.

A defensive enclave will always want to have the least number of
privileges for the maximum protection. Therefore, the enclave will
want the OS to call EMODPR. If that were it, the host could just lie.
But the enclave also verifies that the EMODPR operation was, in fact,
executed by doing EACCEPT. When the enclave calls EACCEPT, if the
kernel hasn't restricted permissions then we get a controlled crash.
Therefore, we have solved the second case.
Nathaniel McCallum Jan. 17, 2022, 1:27 p.m. UTC | #51
On Thu, Jan 13, 2022 at 4:43 PM Reinette Chatre
<reinette.chatre@intel.com> wrote:
>
> Hi Jarkko and Nathaniel,
>
> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>
> >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>>
> >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> >>>>>>>> mechanism just
> >>>>>>>>>>>> to extend
> >>>>>>>>>>>>> permissions be sufficient?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I do believe that we need both in order to support pages
> >>>>>>>> having only
> >>>>>>>>>>>> the permissions required to support their intended use
> >>>>>>>> during the
> >>>>>>>>>>>> time the
> >>>>>>>>>>>> particular access is required. While technically it is
> >>>>>>>> possible to grant
> >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> >>>>>>>> is safer to
> >>>>>>>>>>>> remove permissions when no longer required.
> >>>>>>>>>>>
> >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>>>>>> how using it
> >>>>>>>>>>> would make things safer?
> >>>>>>>>>>>
> >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>>>>>> defensive
> >>>>>>>>>> measure. In that case, EMODPR is useful.
> >>>>>>>>>
> >>>>>>>>> What is the exact threat we are talking about?
> >>>>>>>>
> >>>>>>>> To add: it should be *significantly* critical thread, given that not
> >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> >>>>>>>> EACCEPT involvement.
> >>>>>>>>
> >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>>>>>> introduce
> >>>>>>>> it when there is PoC code for any of the existing run-time that
> >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>>>>>
> >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> >>>>>>>
> >>>>>>> At least it does not protected against enclave code because an enclave
> >>>>>>> can
> >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>>>>>> confused here about the actual threat but also the potential adversary
> >>>>>>> and
> >>>>>>> target.
> >>>>>>>
> >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> >>>>>> If enclave does not verify with EACCEPT, then its own code has
> >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> >>>>>> request EMODPR.
> >>>>>
> >>>>> The question is really simple: what is the threat scenario? In order to use
> >>>>> the word "vulnerability", you would need one.
> >>>>>
> >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> >>>>> one, in order to ack it to the mainline.
> >>>>>
> >>>>
> >>>> Which complexity related to EMODPR are you concerned about? In a later message
> >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> >>>> these flows in the OS, but would of course impact the enclave.
> >>>
> >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> >>> EMODPR going to help with any sort of workload?
> >>
> >> I've even started think should we just always allow mmap()?
> >
> > I suspect this may be the most ergonomic way forward. Instructions
> > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > what the enclave wants, which is a memory mapping in the enclave. Why
> > make the enclave runner do multiple context switches just to change
> > the memory map of an enclave?
>
> The enclave runner is not forced to make any changes to a memory mapping. To start,
> this implementation supports and does not change the existing ABI where a new
> memory mapping can only be created if its permissions are the same or weaker
> than the EPCM permissions. After the memory mapping is created the EPCM permissions
> can change (thanks to SGX2) and when they do there are no forced nor required
> changes to the memory mapping - pages remain accessible where the memory mapping
> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> to an enclave page (EMODPE) then the memory mapping may need to be changed as
> should be expected to access a page with permissions that the memory mapping
> did not previously allow.
>
> Are you saying that the permissions of a new memory mapping should now be allowed
> to exceed EPCM permissions and thus the enclave runner would not need to modify a
> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> considered a change in ABI but something we could support on SGX2 systems.
>
> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> and EMODPE to manage enclave page permissions.

As I understand the problem, there are two permission sets:

 * The EPCM permissions
 * The mmap() permissions

The mmap() permissions cannot exceed the EPCM permissions, for obvious reasons.

Hypothesis: there is no practical reason the EPCM permissions should
exceed mmap() permissions.

If the hypothesis is true, then userspace shouldn't have an API to
manage EPCM permissions distinct from mmap() permissions. Instead,
userspace should just call mmap() and the kernel should internally
adjust the EPCM permissions to match the mmap() permissions. This has
a performance advantage that every permissions change is one syscall
rather than two.

So what is the use case where an enclave would want to restrict mmap()
permissions but not restrict EPCM?
Jarkko Sakkinen Jan. 18, 2022, 1:59 a.m. UTC | #52
On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > > Hi Jarkko,
> > > >
> > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > > >> Hi Jarkko,
> > > > >>
> > > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > > >>>> Hi Jarkko,
> > > > >>>
> > > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > > >>
> > > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > > >> time the enclave provides the expected permissions and that will fail
> > > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > >
> > > > > This is a very valid point but that does make the introspection possible
> > > > > only at the time of EACCEPT.
> > > > >
> > > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > > was ever exercised.
> > > >
> > > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > >
> > > > Here is the relevant snippet from the SDM from the section where it
> > > > describes EACCEPT:
> > > >
> > > > IF (Tracking not correct)
> > > >     THEN
> > > >         RFLAGS.ZF := 1;
> > > >         RAX := SGX_NOT_TRACKED;
> > > >         GOTO DONE;
> > > > FI;
> > > >
> > > > Reinette
> > >
> > > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > > sure that ETRACK is completed. I have trouble understanding how enclave
> > > makes sure that EACCEPT was called.
> >
> > I'm not concerned of anything going wrong once EMODPR has been started.
> >
> > The problem nails down to that the whole EMODPR process is spawned by
> > the entity that is not trusted so maybe that should further broke down
> > to three roles:
> >
> > 1. Build process B
> > 2. Runner process R.
> > 3. Enclave E.
> >
> > And to the costraint that we trust B *more* than R. Once B has done all the
> > needed EMODPR calls it would send the file descriptor to R. Even if R would
> > have full access to /dev/sgx_enclave, it would not matter, since B has done
> > EMODPR-EACCEPT dance with E.
> >
> > So what you can achieve with EMODPR is not protection against mistrusted
> > *OS*. There's absolutely no chance you could use it for that purpose
> > because mistrusted OS controls the whole process.
> >
> > EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> > in the above scenario R.
> 
> There are two general cases that I can see. Both are valid.
> 
> 1. The OS moves from a trusted to an untrusted state. This could be
> the multi-process system you've described. But it could also be that
> the kernel becomes compromised after the enclave is fully initialized.
> 
> 2. The OS is untrustworthy from the start.
> 
> The second case is the stronger one and if you can solve it, the first
> one is solved implicitly. And our end goal is that if the OS does
> anything malicious we will crash in a controlled way.
> 
> A defensive enclave will always want to have the least number of
> privileges for the maximum protection. Therefore, the enclave will
> want the OS to call EMODPR. If that were it, the host could just lie.
> But the enclave also verifies that the EMODPR operation was, in fact,
> executed by doing EACCEPT. When the enclave calls EACCEPT, if the
> kernel hasn't restricted permissions then we get a controlled crash.
> Therefore, we have solved the second case.

So you're referring to this part of the SDM pseude code in the SDM:

(* Check the destination EPC page for concurrency *)
IF ( EPC page in use )
    THEN #GP(0); FI;

I wonder does "EPC page in use" unconditionally trigger when EACCEPT
is invoked for a page for which all of these conditions hold:

- .PR := 0 (no EMODPR in progress)
- .MODIFIED := 0 (no EMODT in progress)
- .PENDING := 0 (no EMODPR in progress)

I don't know the exact scope and scale of "EPC page in use".

Then, yes, EACCEPT could be at least used to validate that one of the
three operations above was requested. However, enclave thread cannot say
which one was it, so it is guesswork.

I guess that still is enough to quantitatively argue that EMODPR is an
operation that does improve the confidentiality properities of an
enclave, even if it has it flaws.

/Jarkko
Jarkko Sakkinen Jan. 18, 2022, 2:22 a.m. UTC | #53
On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
> > On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >
> > > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > > > Hi Jarkko,
> > > > >
> > > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > > > >> Hi Jarkko,
> > > > > >>
> > > > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > > > >>>> Hi Jarkko,
> > > > > >>>
> > > > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > > > >>
> > > > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > > > >> time the enclave provides the expected permissions and that will fail
> > > > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > > >
> > > > > > This is a very valid point but that does make the introspection possible
> > > > > > only at the time of EACCEPT.
> > > > > >
> > > > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > > > was ever exercised.
> > > > >
> > > > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > > >
> > > > > Here is the relevant snippet from the SDM from the section where it
> > > > > describes EACCEPT:
> > > > >
> > > > > IF (Tracking not correct)
> > > > >     THEN
> > > > >         RFLAGS.ZF := 1;
> > > > >         RAX := SGX_NOT_TRACKED;
> > > > >         GOTO DONE;
> > > > > FI;
> > > > >
> > > > > Reinette
> > > >
> > > > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > > > sure that ETRACK is completed. I have trouble understanding how enclave
> > > > makes sure that EACCEPT was called.
> > >
> > > I'm not concerned of anything going wrong once EMODPR has been started.
> > >
> > > The problem nails down to that the whole EMODPR process is spawned by
> > > the entity that is not trusted so maybe that should further broke down
> > > to three roles:
> > >
> > > 1. Build process B
> > > 2. Runner process R.
> > > 3. Enclave E.
> > >
> > > And to the costraint that we trust B *more* than R. Once B has done all the
> > > needed EMODPR calls it would send the file descriptor to R. Even if R would
> > > have full access to /dev/sgx_enclave, it would not matter, since B has done
> > > EMODPR-EACCEPT dance with E.
> > >
> > > So what you can achieve with EMODPR is not protection against mistrusted
> > > *OS*. There's absolutely no chance you could use it for that purpose
> > > because mistrusted OS controls the whole process.
> > >
> > > EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> > > in the above scenario R.
> > 
> > There are two general cases that I can see. Both are valid.
> > 
> > 1. The OS moves from a trusted to an untrusted state. This could be
> > the multi-process system you've described. But it could also be that
> > the kernel becomes compromised after the enclave is fully initialized.
> > 
> > 2. The OS is untrustworthy from the start.
> > 
> > The second case is the stronger one and if you can solve it, the first
> > one is solved implicitly. And our end goal is that if the OS does
> > anything malicious we will crash in a controlled way.
> > 
> > A defensive enclave will always want to have the least number of
> > privileges for the maximum protection. Therefore, the enclave will
> > want the OS to call EMODPR. If that were it, the host could just lie.
> > But the enclave also verifies that the EMODPR operation was, in fact,
> > executed by doing EACCEPT. When the enclave calls EACCEPT, if the
> > kernel hasn't restricted permissions then we get a controlled crash.
> > Therefore, we have solved the second case.
> 
> So you're referring to this part of the SDM pseude code in the SDM:
> 
> (* Check the destination EPC page for concurrency *)
> IF ( EPC page in use )
>     THEN #GP(0); FI;
> 
> I wonder does "EPC page in use" unconditionally trigger when EACCEPT
> is invoked for a page for which all of these conditions hold:
> 
> - .PR := 0 (no EMODPR in progress)
> - .MODIFIED := 0 (no EMODT in progress)
> - .PENDING := 0 (no EMODPR in progress)
> 
> I don't know the exact scope and scale of "EPC page in use".
> 
> Then, yes, EACCEPT could be at least used to validate that one of the
> three operations above was requested. However, enclave thread cannot say
> which one was it, so it is guesswork.

OK, I got it, and this last paragraph is not true. SECINFO given EACCEPT
will lock in rest of the details and make the operation deterministic.

The only question mark then is the condition when no requests are active.

/Jarkko
Jarkko Sakkinen Jan. 18, 2022, 3:31 a.m. UTC | #54
On Tue, Jan 18, 2022 at 04:22:45AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
> > > On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >
> > > > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > > > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > > > > Hi Jarkko,
> > > > > >
> > > > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > > > > >> Hi Jarkko,
> > > > > > >>
> > > > > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > > > > >>>> Hi Jarkko,
> > > > > > >>>
> > > > > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > > > > >>
> > > > > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > > > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > > > > >> time the enclave provides the expected permissions and that will fail
> > > > > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > > > >
> > > > > > > This is a very valid point but that does make the introspection possible
> > > > > > > only at the time of EACCEPT.
> > > > > > >
> > > > > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > > > > was ever exercised.
> > > > > >
> > > > > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > > > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > > > >
> > > > > > Here is the relevant snippet from the SDM from the section where it
> > > > > > describes EACCEPT:
> > > > > >
> > > > > > IF (Tracking not correct)
> > > > > >     THEN
> > > > > >         RFLAGS.ZF := 1;
> > > > > >         RAX := SGX_NOT_TRACKED;
> > > > > >         GOTO DONE;
> > > > > > FI;
> > > > > >
> > > > > > Reinette
> > > > >
> > > > > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > > > > sure that ETRACK is completed. I have trouble understanding how enclave
> > > > > makes sure that EACCEPT was called.
> > > >
> > > > I'm not concerned of anything going wrong once EMODPR has been started.
> > > >
> > > > The problem nails down to that the whole EMODPR process is spawned by
> > > > the entity that is not trusted so maybe that should further broke down
> > > > to three roles:
> > > >
> > > > 1. Build process B
> > > > 2. Runner process R.
> > > > 3. Enclave E.
> > > >
> > > > And to the costraint that we trust B *more* than R. Once B has done all the
> > > > needed EMODPR calls it would send the file descriptor to R. Even if R would
> > > > have full access to /dev/sgx_enclave, it would not matter, since B has done
> > > > EMODPR-EACCEPT dance with E.
> > > >
> > > > So what you can achieve with EMODPR is not protection against mistrusted
> > > > *OS*. There's absolutely no chance you could use it for that purpose
> > > > because mistrusted OS controls the whole process.
> > > >
> > > > EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> > > > in the above scenario R.
> > > 
> > > There are two general cases that I can see. Both are valid.
> > > 
> > > 1. The OS moves from a trusted to an untrusted state. This could be
> > > the multi-process system you've described. But it could also be that
> > > the kernel becomes compromised after the enclave is fully initialized.
> > > 
> > > 2. The OS is untrustworthy from the start.
> > > 
> > > The second case is the stronger one and if you can solve it, the first
> > > one is solved implicitly. And our end goal is that if the OS does
> > > anything malicious we will crash in a controlled way.
> > > 
> > > A defensive enclave will always want to have the least number of
> > > privileges for the maximum protection. Therefore, the enclave will
> > > want the OS to call EMODPR. If that were it, the host could just lie.
> > > But the enclave also verifies that the EMODPR operation was, in fact,
> > > executed by doing EACCEPT. When the enclave calls EACCEPT, if the
> > > kernel hasn't restricted permissions then we get a controlled crash.
> > > Therefore, we have solved the second case.
> > 
> > So you're referring to this part of the SDM pseude code in the SDM:
> > 
> > (* Check the destination EPC page for concurrency *)
> > IF ( EPC page in use )
> >     THEN #GP(0); FI;
> > 
> > I wonder does "EPC page in use" unconditionally trigger when EACCEPT
> > is invoked for a page for which all of these conditions hold:
> > 
> > - .PR := 0 (no EMODPR in progress)
> > - .MODIFIED := 0 (no EMODT in progress)
> > - .PENDING := 0 (no EMODPR in progress)
> > 
> > I don't know the exact scope and scale of "EPC page in use".
> > 
> > Then, yes, EACCEPT could be at least used to validate that one of the
> > three operations above was requested. However, enclave thread cannot say
> > which one was it, so it is guesswork.
> 
> OK, I got it, and this last paragraph is not true. SECINFO given EACCEPT
> will lock in rest of the details and make the operation deterministic.
> 
> The only question mark then is the condition when no requests are active.

This the big picture model how I look at SGX2 patches, and perhaps this
could bring some more common sense idioms to this discussion.

There is not one but two passes of measurement:

1. A static pass.
2. A dynamic pass.

The static pass is the full process of doing ioctls that trigger ECREATE,
EADD and EEXTEND. It ends to EINIT, which triggers to pass/fail condition.

The 2nd pass is the dynamic pass. It's the pass where enclave measures
itself and is performed by the full set of all EACCEPT operations done by
the enclave. It ends to either all EACCEPT operations succeeding, or any
of them #GP.

Confidentiality is the state established exactly when both passes are
completed.

/Jarkko
Reinette Chatre Jan. 18, 2022, 8:59 p.m. UTC | #55
Hi Jarkko,

On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
>> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
>>> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>
>>>> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
>>>>> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
>>>>>> Hi Jarkko,
>>>>>>
>>>>>> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
>>>>>>> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
>>>>>>>> Hi Jarkko,
>>>>>>>>
>>>>>>>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
>>>>>>>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
>>>>>>>>>> Hi Jarkko,
>>>>>>>>>
>>>>>>>>> How enclave can check a page range that EPCM has the expected permissions?
>>>>>>>>
>>>>>>>> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
>>>>>>>> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
>>>>>>>> time the enclave provides the expected permissions and that will fail
>>>>>>>> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
>>>>>>>
>>>>>>> This is a very valid point but that does make the introspection possible
>>>>>>> only at the time of EACCEPT.
>>>>>>>
>>>>>>> It does not give tools for enclave to make sure that EMODPR-ETRACK dance
>>>>>>> was ever exercised.
>>>>>>
>>>>>> Could you please elaborate? EACCEPT is available to the enclave as a tool
>>>>>> and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
>>>>>>
>>>>>> Here is the relevant snippet from the SDM from the section where it
>>>>>> describes EACCEPT:
>>>>>>
>>>>>> IF (Tracking not correct)
>>>>>>     THEN
>>>>>>         RFLAGS.ZF := 1;
>>>>>>         RAX := SGX_NOT_TRACKED;
>>>>>>         GOTO DONE;
>>>>>> FI;
>>>>>>
>>>>>> Reinette
>>>>>
>>>>> Yes, if enclave calls EACCEPT it does the necessary introspection and makes
>>>>> sure that ETRACK is completed. I have trouble understanding how enclave
>>>>> makes sure that EACCEPT was called.
>>>>
>>>> I'm not concerned of anything going wrong once EMODPR has been started.
>>>>
>>>> The problem nails down to that the whole EMODPR process is spawned by
>>>> the entity that is not trusted so maybe that should further broke down
>>>> to three roles:
>>>>
>>>> 1. Build process B
>>>> 2. Runner process R.
>>>> 3. Enclave E.
>>>>
>>>> And to the costraint that we trust B *more* than R. Once B has done all the
>>>> needed EMODPR calls it would send the file descriptor to R. Even if R would
>>>> have full access to /dev/sgx_enclave, it would not matter, since B has done
>>>> EMODPR-EACCEPT dance with E.
>>>>
>>>> So what you can achieve with EMODPR is not protection against mistrusted
>>>> *OS*. There's absolutely no chance you could use it for that purpose
>>>> because mistrusted OS controls the whole process.
>>>>
>>>> EMODPR is to help to protect enclave against mistrusted *process*, i.e.
>>>> in the above scenario R.
>>>
>>> There are two general cases that I can see. Both are valid.
>>>
>>> 1. The OS moves from a trusted to an untrusted state. This could be
>>> the multi-process system you've described. But it could also be that
>>> the kernel becomes compromised after the enclave is fully initialized.
>>>
>>> 2. The OS is untrustworthy from the start.
>>>
>>> The second case is the stronger one and if you can solve it, the first
>>> one is solved implicitly. And our end goal is that if the OS does
>>> anything malicious we will crash in a controlled way.
>>>
>>> A defensive enclave will always want to have the least number of
>>> privileges for the maximum protection. Therefore, the enclave will
>>> want the OS to call EMODPR. If that were it, the host could just lie.
>>> But the enclave also verifies that the EMODPR operation was, in fact,
>>> executed by doing EACCEPT. When the enclave calls EACCEPT, if the
>>> kernel hasn't restricted permissions then we get a controlled crash.
>>> Therefore, we have solved the second case.
>>
>> So you're referring to this part of the SDM pseude code in the SDM:
>>
>> (* Check the destination EPC page for concurrency *)
>> IF ( EPC page in use )
>>     THEN #GP(0); FI;
>>
>> I wonder does "EPC page in use" unconditionally trigger when EACCEPT
>> is invoked for a page for which all of these conditions hold:
>>
>> - .PR := 0 (no EMODPR in progress)
>> - .MODIFIED := 0 (no EMODT in progress)
>> - .PENDING := 0 (no EMODPR in progress)
>>
>> I don't know the exact scope and scale of "EPC page in use".
>>
>> Then, yes, EACCEPT could be at least used to validate that one of the
>> three operations above was requested. However, enclave thread cannot say
>> which one was it, so it is guesswork.
> 
> OK, I got it, and this last paragraph is not true. SECINFO given EACCEPT
> will lock in rest of the details and make the operation deterministic.

Indeed - so the SDM pseudo code that is relevant here can be found under
the "(* Verify that accept request matches current EPC page settings *)"
comment where the enclave can verify that all EPCM values are as they should
and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
amiss.

> 
> The only question mark then is the condition when no requests are active.

Could you please elaborate what you mean with this question? If no request
is active then I understand that to mean that no request has started. 

Reinette
Reinette Chatre Jan. 18, 2022, 9:11 p.m. UTC | #56
Hi Nathaniel,

On 1/17/2022 5:27 AM, Nathaniel McCallum wrote:
> On Thu, Jan 13, 2022 at 4:43 PM Reinette Chatre
> <reinette.chatre@intel.com> wrote:
>>
>> Hi Jarkko and Nathaniel,
>>
>> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
>>> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>
>>>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
>>>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
>>>>>> Hi Jarkko,
>>>>>>
>>>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
>>>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>>>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>>>>>>>> OK, so the question is: do we need both or would a
>>>>>>>>>> mechanism just
>>>>>>>>>>>>>> to extend
>>>>>>>>>>>>>>> permissions be sufficient?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do believe that we need both in order to support pages
>>>>>>>>>> having only
>>>>>>>>>>>>>> the permissions required to support their intended use
>>>>>>>>>> during the
>>>>>>>>>>>>>> time the
>>>>>>>>>>>>>> particular access is required. While technically it is
>>>>>>>>>> possible to grant
>>>>>>>>>>>>>> pages all permissions they may need during their lifetime it
>>>>>>>>>> is safer to
>>>>>>>>>>>>>> remove permissions when no longer required.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>>>>>>>> how using it
>>>>>>>>>>>>> would make things safer?
>>>>>>>>>>>>>
>>>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>>>>>>>> defensive
>>>>>>>>>>>> measure. In that case, EMODPR is useful.
>>>>>>>>>>>
>>>>>>>>>>> What is the exact threat we are talking about?
>>>>>>>>>>
>>>>>>>>>> To add: it should be *significantly* critical thread, given that not
>>>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
>>>>>>>>>> EACCEPT involvement.
>>>>>>>>>>
>>>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>>>>>>>> introduce
>>>>>>>>>> it when there is PoC code for any of the existing run-time that
>>>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>>>>>>>
>>>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
>>>>>>>>>
>>>>>>>>> At least it does not protected against enclave code because an enclave
>>>>>>>>> can
>>>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>>>>>>>> confused here about the actual threat but also the potential adversary
>>>>>>>>> and
>>>>>>>>> target.
>>>>>>>>>
>>>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>>>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
>>>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
>>>>>>>> If enclave does not verify with EACCEPT, then its own code has
>>>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
>>>>>>>> request EMODPR.
>>>>>>>
>>>>>>> The question is really simple: what is the threat scenario? In order to use
>>>>>>> the word "vulnerability", you would need one.
>>>>>>>
>>>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
>>>>>>> one, in order to ack it to the mainline.
>>>>>>>
>>>>>>
>>>>>> Which complexity related to EMODPR are you concerned about? In a later message
>>>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
>>>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
>>>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
>>>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
>>>>>> these flows in the OS, but would of course impact the enclave.
>>>>>
>>>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
>>>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
>>>>> EMODPR going to help with any sort of workload?
>>>>
>>>> I've even started think should we just always allow mmap()?
>>>
>>> I suspect this may be the most ergonomic way forward. Instructions
>>> like EAUG/EMODPR/etc are really irrelevant implementation details to
>>> what the enclave wants, which is a memory mapping in the enclave. Why
>>> make the enclave runner do multiple context switches just to change
>>> the memory map of an enclave?
>>
>> The enclave runner is not forced to make any changes to a memory mapping. To start,
>> this implementation supports and does not change the existing ABI where a new
>> memory mapping can only be created if its permissions are the same or weaker
>> than the EPCM permissions. After the memory mapping is created the EPCM permissions
>> can change (thanks to SGX2) and when they do there are no forced nor required
>> changes to the memory mapping - pages remain accessible where the memory mapping
>> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
>> to an enclave page (EMODPE) then the memory mapping may need to be changed as
>> should be expected to access a page with permissions that the memory mapping
>> did not previously allow.
>>
>> Are you saying that the permissions of a new memory mapping should now be allowed
>> to exceed EPCM permissions and thus the enclave runner would not need to modify a
>> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
>> considered a change in ABI but something we could support on SGX2 systems.
>>
>> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
>> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
>> and EMODPE to manage enclave page permissions.
> 
> As I understand the problem, there are two permission sets:
> 
>  * The EPCM permissions
>  * The mmap() permissions

It may be easier to think of there being three permission sets:
* EPCM permissions
* VMA (the mmap() permissions)
* PTE permissions

> 
> The mmap() permissions cannot exceed the EPCM permissions, for obvious reasons.

That is the current rule - when a new memory mapping is created it cannot exceed
the EPCM permissions. This rule remains in SGX2 but there is a caveat that
the EPCM permissions may change during runtime while the memory is mapped and thus
the VMA permissions may indeed exceed the EPCM permissions. This is where the
importance of PTE permissions is highlighted.

You may ask - when EPCM permissions are changed, why not just change the VMA
permissions? Please see the commit message below that contains details about
this and the reasons why VMA permissions are allowed to exceed EPCM permissions:
https://lore.kernel.org/lkml/7e622156315c9c22c3ef84a7c0aeb01b5c001ff9.1638381245.git.reinette.chatre@intel.com/

> 
> Hypothesis: there is no practical reason the EPCM permissions should
> exceed mmap() permissions.
> 
> If the hypothesis is true, then userspace shouldn't have an API to
> manage EPCM permissions distinct from mmap() permissions. Instead,
> userspace should just call mmap() and the kernel should internally
> adjust the EPCM permissions to match the mmap() permissions. This has
> a performance advantage that every permissions change is one syscall
> rather than two.
> 
> So what is the use case where an enclave would want to restrict mmap()
> permissions but not restrict EPCM?

An enclave with its EPCM permissions can be used by different tasks, each
should only access the enclave with the least privileges needed. Multiple
tasks may thus map a portion of an enclave with different permissions.

This implementation also supports the workflow that if a portion of an enclave
is already mapped then it is possible to change the EPCM permissions without
requiring the memory to be remapped.

Reinette
Reinette Chatre Jan. 18, 2022, 9:18 p.m. UTC | #57
Hi Jarkko,

On 1/15/2022 8:49 AM, Jarkko Sakkinen wrote:
> On Sat, Jan 15, 2022 at 01:15:53AM +0200, Jarkko Sakkinen wrote:
>>> After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
>>
>> OK, great.
>>
>> A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 
> 
> I'm not confident after looking through the test case and ioctl
> about EMODPE support but I do not want disturb this anymore. Bunch
> of things have been nailed and I'm now running the code, which is
> great.
> 
> The obviously wrong implementation choice in this ioctl is that
> it is multi-function. It should be just split it into two ioctls:
> sgx_restrict_page_permissions and sgx_extend_page_permissions.

Sure, I can move it to two ioctls.

To keep the naming consistent, what do you think of:
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS

Please refer to message below as motivation for the "relax" term:
https://lore.kernel.org/lkml/24447a03-139a-c7e0-9ad5-34e2019c4df5@intel.com/

> 
> They are conceptually different flows and I'm also basing this on earlier
> discussion in this mailing list from which I conclude that it is also
> consensus to not have such ioctls.
> 
> Might sound clanky but it is much easier to comprehend what is going
> on "in the blackbox" by doing that split.

Reinette
Jarkko Sakkinen Jan. 20, 2022, 12:53 p.m. UTC | #58
On Tue, 2022-01-18 at 12:59 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
> > On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum
> > > wrote:
> > > > On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen
> > > > <jarkko@kernel.org> wrote:
> > > > > 
> > > > > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen
> > > > > wrote:
> > > > > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre
> > > > > > wrote:
> > > > > > > Hi Jarkko,
> > > > > > > 
> > > > > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette
> > > > > > > > Chatre wrote:
> > > > > > > > > Hi Jarkko,
> > > > > > > > > 
> > > > > > > > > On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > > > > > > > On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette
> > > > > > > > > > Chatre wrote:
> > > > > > > > > > > Hi Jarkko,
> > > > > > > > > > 
> > > > > > > > > > How enclave can check a page range that EPCM has
> > > > > > > > > > the expected permissions?
> > > > > > > > > 
> > > > > > > > > Only way to change EPCM permissions from outside
> > > > > > > > > enclave is to run ENCLS[EMODPR]
> > > > > > > > > that needs to be accepted from within the enclave via
> > > > > > > > > ENCLU[EACCEPT]. At that
> > > > > > > > > time the enclave provides the expected permissions
> > > > > > > > > and that will fail
> > > > > > > > > if there is a mismatch with the EPCM permissions
> > > > > > > > > (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > > > > > 
> > > > > > > > This is a very valid point but that does make the
> > > > > > > > introspection possible
> > > > > > > > only at the time of EACCEPT.
> > > > > > > > 
> > > > > > > > It does not give tools for enclave to make sure that
> > > > > > > > EMODPR-ETRACK dance
> > > > > > > > was ever exercised.
> > > > > > > 
> > > > > > > Could you please elaborate? EACCEPT is available to the
> > > > > > > enclave as a tool
> > > > > > > and it would fail if ETRACK was not completed (error
> > > > > > > SGX_NOT_TRACKED).
> > > > > > > 
> > > > > > > Here is the relevant snippet from the SDM from the
> > > > > > > section where it
> > > > > > > describes EACCEPT:
> > > > > > > 
> > > > > > > IF (Tracking not correct)
> > > > > > >     THEN
> > > > > > >         RFLAGS.ZF := 1;
> > > > > > >         RAX := SGX_NOT_TRACKED;
> > > > > > >         GOTO DONE;
> > > > > > > FI;
> > > > > > > 
> > > > > > > Reinette
> > > > > > 
> > > > > > Yes, if enclave calls EACCEPT it does the necessary
> > > > > > introspection and makes
> > > > > > sure that ETRACK is completed. I have trouble understanding
> > > > > > how enclave
> > > > > > makes sure that EACCEPT was called.
> > > > > 
> > > > > I'm not concerned of anything going wrong once EMODPR has
> > > > > been started.
> > > > > 
> > > > > The problem nails down to that the whole EMODPR process is
> > > > > spawned by
> > > > > the entity that is not trusted so maybe that should further
> > > > > broke down
> > > > > to three roles:
> > > > > 
> > > > > 1. Build process B
> > > > > 2. Runner process R.
> > > > > 3. Enclave E.
> > > > > 
> > > > > And to the costraint that we trust B *more* than R. Once B
> > > > > has done all the
> > > > > needed EMODPR calls it would send the file descriptor to R.
> > > > > Even if R would
> > > > > have full access to /dev/sgx_enclave, it would not matter,
> > > > > since B has done
> > > > > EMODPR-EACCEPT dance with E.
> > > > > 
> > > > > So what you can achieve with EMODPR is not protection against
> > > > > mistrusted
> > > > > *OS*. There's absolutely no chance you could use it for that
> > > > > purpose
> > > > > because mistrusted OS controls the whole process.
> > > > > 
> > > > > EMODPR is to help to protect enclave against mistrusted
> > > > > *process*, i.e.
> > > > > in the above scenario R.
> > > > 
> > > > There are two general cases that I can see. Both are valid.
> > > > 
> > > > 1. The OS moves from a trusted to an untrusted state. This
> > > > could be
> > > > the multi-process system you've described. But it could also be
> > > > that
> > > > the kernel becomes compromised after the enclave is fully
> > > > initialized.
> > > > 
> > > > 2. The OS is untrustworthy from the start.
> > > > 
> > > > The second case is the stronger one and if you can solve it,
> > > > the first
> > > > one is solved implicitly. And our end goal is that if the OS
> > > > does
> > > > anything malicious we will crash in a controlled way.
> > > > 
> > > > A defensive enclave will always want to have the least number
> > > > of
> > > > privileges for the maximum protection. Therefore, the enclave
> > > > will
> > > > want the OS to call EMODPR. If that were it, the host could
> > > > just lie.
> > > > But the enclave also verifies that the EMODPR operation was, in
> > > > fact,
> > > > executed by doing EACCEPT. When the enclave calls EACCEPT, if
> > > > the
> > > > kernel hasn't restricted permissions then we get a controlled
> > > > crash.
> > > > Therefore, we have solved the second case.
> > > 
> > > So you're referring to this part of the SDM pseude code in the
> > > SDM:
> > > 
> > > (* Check the destination EPC page for concurrency *)
> > > IF ( EPC page in use )
> > >     THEN #GP(0); FI;
> > > 
> > > I wonder does "EPC page in use" unconditionally trigger when
> > > EACCEPT
> > > is invoked for a page for which all of these conditions hold:
> > > 
> > > - .PR := 0 (no EMODPR in progress)
> > > - .MODIFIED := 0 (no EMODT in progress)
> > > - .PENDING := 0 (no EMODPR in progress)
> > > 
> > > I don't know the exact scope and scale of "EPC page in use".
> > > 
> > > Then, yes, EACCEPT could be at least used to validate that one of
> > > the
> > > three operations above was requested. However, enclave thread
> > > cannot say
> > > which one was it, so it is guesswork.
> > 
> > OK, I got it, and this last paragraph is not true. SECINFO given
> > EACCEPT
> > will lock in rest of the details and make the operation
> > deterministic.
> 
> Indeed - so the SDM pseudo code that is relevant here can be found
> under
> the "(* Verify that accept request matches current EPC page settings
> *)"
> comment where the enclave can verify that all EPCM values are as they
> should
> and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
> amiss.
> 
> > 
> > The only question mark then is the condition when no requests are
> > active.
> 
> Could you please elaborate what you mean with this question? If no
> request
> is active then I understand that to mean that no request has started.

My issue was that when:

- .PR := 0 (no EMODPR in progress)
- .MODIFIED := 0 (no EMODT in progress)
- .PENDING := 0 (no EMODPR in progress)

Does this trigger #GP when you call EACCEPT?

I don't think the answer matters that much tho sice if e.g. EMODPR was never
done, and enclave expected a change, #GP would trigger eventually in SECINFO
validation.

The way I look at EACCEPT is a memory verification tool it does the same at
run-time as EINIT does before run-time.

/Jarkko
Reinette Chatre Jan. 20, 2022, 4:52 p.m. UTC | #59
Hi Jarkko,

On 1/20/2022 4:53 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-01-18 at 12:59 -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
>>> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
>>>> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum
>>>> wrote:
>>>>> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen
>>>>> <jarkko@kernel.org> wrote:
>>>>>>
>>>>>> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen
>>>>>> wrote:
>>>>>>> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre
>>>>>>> wrote:
>>>>>>>> Hi Jarkko,
>>>>>>>>
>>>>>>>> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
>>>>>>>>> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette
>>>>>>>>> Chatre wrote:
>>>>>>>>>> Hi Jarkko,
>>>>>>>>>>
>>>>>>>>>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
>>>>>>>>>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette
>>>>>>>>>>> Chatre wrote:
>>>>>>>>>>>> Hi Jarkko,
>>>>>>>>>>>
>>>>>>>>>>> How enclave can check a page range that EPCM has
>>>>>>>>>>> the expected permissions?
>>>>>>>>>>
>>>>>>>>>> Only way to change EPCM permissions from outside
>>>>>>>>>> enclave is to run ENCLS[EMODPR]
>>>>>>>>>> that needs to be accepted from within the enclave via
>>>>>>>>>> ENCLU[EACCEPT]. At that
>>>>>>>>>> time the enclave provides the expected permissions
>>>>>>>>>> and that will fail
>>>>>>>>>> if there is a mismatch with the EPCM permissions
>>>>>>>>>> (SGX_PAGE_ATTRIBUTES_MISMATCH).
>>>>>>>>>
>>>>>>>>> This is a very valid point but that does make the
>>>>>>>>> introspection possible
>>>>>>>>> only at the time of EACCEPT.
>>>>>>>>>
>>>>>>>>> It does not give tools for enclave to make sure that
>>>>>>>>> EMODPR-ETRACK dance
>>>>>>>>> was ever exercised.
>>>>>>>>
>>>>>>>> Could you please elaborate? EACCEPT is available to the
>>>>>>>> enclave as a tool
>>>>>>>> and it would fail if ETRACK was not completed (error
>>>>>>>> SGX_NOT_TRACKED).
>>>>>>>>
>>>>>>>> Here is the relevant snippet from the SDM from the
>>>>>>>> section where it
>>>>>>>> describes EACCEPT:
>>>>>>>>
>>>>>>>> IF (Tracking not correct)
>>>>>>>>     THEN
>>>>>>>>         RFLAGS.ZF := 1;
>>>>>>>>         RAX := SGX_NOT_TRACKED;
>>>>>>>>         GOTO DONE;
>>>>>>>> FI;
>>>>>>>>
>>>>>>>> Reinette
>>>>>>>
>>>>>>> Yes, if enclave calls EACCEPT it does the necessary
>>>>>>> introspection and makes
>>>>>>> sure that ETRACK is completed. I have trouble understanding
>>>>>>> how enclave
>>>>>>> makes sure that EACCEPT was called.
>>>>>>
>>>>>> I'm not concerned of anything going wrong once EMODPR has
>>>>>> been started.
>>>>>>
>>>>>> The problem nails down to that the whole EMODPR process is
>>>>>> spawned by
>>>>>> the entity that is not trusted so maybe that should further
>>>>>> broke down
>>>>>> to three roles:
>>>>>>
>>>>>> 1. Build process B
>>>>>> 2. Runner process R.
>>>>>> 3. Enclave E.
>>>>>>
>>>>>> And to the costraint that we trust B *more* than R. Once B
>>>>>> has done all the
>>>>>> needed EMODPR calls it would send the file descriptor to R.
>>>>>> Even if R would
>>>>>> have full access to /dev/sgx_enclave, it would not matter,
>>>>>> since B has done
>>>>>> EMODPR-EACCEPT dance with E.
>>>>>>
>>>>>> So what you can achieve with EMODPR is not protection against
>>>>>> mistrusted
>>>>>> *OS*. There's absolutely no chance you could use it for that
>>>>>> purpose
>>>>>> because mistrusted OS controls the whole process.
>>>>>>
>>>>>> EMODPR is to help to protect enclave against mistrusted
>>>>>> *process*, i.e.
>>>>>> in the above scenario R.
>>>>>
>>>>> There are two general cases that I can see. Both are valid.
>>>>>
>>>>> 1. The OS moves from a trusted to an untrusted state. This
>>>>> could be
>>>>> the multi-process system you've described. But it could also be
>>>>> that
>>>>> the kernel becomes compromised after the enclave is fully
>>>>> initialized.
>>>>>
>>>>> 2. The OS is untrustworthy from the start.
>>>>>
>>>>> The second case is the stronger one and if you can solve it,
>>>>> the first
>>>>> one is solved implicitly. And our end goal is that if the OS
>>>>> does
>>>>> anything malicious we will crash in a controlled way.
>>>>>
>>>>> A defensive enclave will always want to have the least number
>>>>> of
>>>>> privileges for the maximum protection. Therefore, the enclave
>>>>> will
>>>>> want the OS to call EMODPR. If that were it, the host could
>>>>> just lie.
>>>>> But the enclave also verifies that the EMODPR operation was, in
>>>>> fact,
>>>>> executed by doing EACCEPT. When the enclave calls EACCEPT, if
>>>>> the
>>>>> kernel hasn't restricted permissions then we get a controlled
>>>>> crash.
>>>>> Therefore, we have solved the second case.
>>>>
>>>> So you're referring to this part of the SDM pseude code in the
>>>> SDM:
>>>>
>>>> (* Check the destination EPC page for concurrency *)
>>>> IF ( EPC page in use )
>>>>     THEN #GP(0); FI;
>>>>
>>>> I wonder does "EPC page in use" unconditionally trigger when
>>>> EACCEPT
>>>> is invoked for a page for which all of these conditions hold:
>>>>
>>>> - .PR := 0 (no EMODPR in progress)
>>>> - .MODIFIED := 0 (no EMODT in progress)
>>>> - .PENDING := 0 (no EMODPR in progress)
>>>>
>>>> I don't know the exact scope and scale of "EPC page in use".
>>>>
>>>> Then, yes, EACCEPT could be at least used to validate that one of
>>>> the
>>>> three operations above was requested. However, enclave thread
>>>> cannot say
>>>> which one was it, so it is guesswork.
>>>
>>> OK, I got it, and this last paragraph is not true. SECINFO given
>>> EACCEPT
>>> will lock in rest of the details and make the operation
>>> deterministic.
>>
>> Indeed - so the SDM pseudo code that is relevant here can be found
>> under
>> the "(* Verify that accept request matches current EPC page settings
>> *)"
>> comment where the enclave can verify that all EPCM values are as they
>> should
>> and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
>> amiss.
>>
>>>
>>> The only question mark then is the condition when no requests are
>>> active.
>>
>> Could you please elaborate what you mean with this question? If no
>> request
>> is active then I understand that to mean that no request has started.
> 
> My issue was that when:
> 
> - .PR := 0 (no EMODPR in progress)
> - .MODIFIED := 0 (no EMODT in progress)
> - .PENDING := 0 (no EMODPR in progress)
> 
> Does this trigger #GP when you call EACCEPT?

From what I understand a #GP would be triggered if the EACCEPT does not
specify at least one of these. That would be a problem with the EACCEPT
instruction as opposed to the EPCM contents or OS flow though. This
can be found under the following comment in the SDM pseudo code:

(* Check that the combination of requested PT, PENDING and MODIFIED is legal *)

As far as the actual checking of EPCM values goes, it would not result
in a #GP but for an unexpected value of MODIFIED or PENDING the EACCEPT
will fail with SGX_PAGE_ATTRIBUTES_MISMATCH. EACCEPT does not enforce the PR
bit but it _does_ enforce the individual permission bits.

> I don't think the answer matters that much tho sice if e.g. EMODPR was never
> done, and enclave expected a change, #GP would trigger eventually in SECINFO
> validation.

Similar here as I understand it will not be a #GP but EACCEPT failure with
error SGX_PAGE_ATTRIBUTES_MISMATCH. The relevant pseudo-code in the SDM is
below and you can see how MODIFIED and PENDING are matched but PR not (while
the individual permission bits are):

(* Verify that accept request matches current EPC page settings *)
IF ( (EPCM(DS:RCX).ENCLAVEADDRESS ≠ DS:RCX) or (EPCM(DS:RCX).PENDING ≠ SCRATCH_SECINFO.FLAGS.PENDING) or
(EPCM(DS:RCX).MODIFIED ≠ SCRATCH_SECINFO.FLAGS.MODIFIED) or (EPCM(DS:RCX).R ≠ SCRATCH_SECINFO.FLAGS.R) or
(EPCM(DS:RCX).W ≠ SCRATCH_SECINFO.FLAGS.W) or (EPCM(DS:RCX).X ≠ SCRATCH_SECINFO.FLAGS.X) or
(EPCM(DS:RCX).PT ≠ SCRATCH_SECINFO.FLAGS.PT) )
THEN
     RFLAGS.ZF := 1;
     RAX := SGX_PAGE_ATTRIBUTES_MISMATCH;
     GOTO DONE;
FI;


> 
> The way I look at EACCEPT is a memory verification tool it does the same at
> run-time as EINIT does before run-time.

Indeed.

Reinette
Jarkko Sakkinen Jan. 26, 2022, 2:41 p.m. UTC | #60
On Thu, Jan 20, 2022 at 08:52:28AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/20/2022 4:53 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-01-18 at 12:59 -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
> >>> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> >>>> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum
> >>>> wrote:
> >>>>> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen
> >>>>> <jarkko@kernel.org> wrote:
> >>>>>>
> >>>>>> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen
> >>>>>> wrote:
> >>>>>>> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre
> >>>>>>> wrote:
> >>>>>>>> Hi Jarkko,
> >>>>>>>>
> >>>>>>>> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> >>>>>>>>> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette
> >>>>>>>>> Chatre wrote:
> >>>>>>>>>> Hi Jarkko,
> >>>>>>>>>>
> >>>>>>>>>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> >>>>>>>>>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette
> >>>>>>>>>>> Chatre wrote:
> >>>>>>>>>>>> Hi Jarkko,
> >>>>>>>>>>>
> >>>>>>>>>>> How enclave can check a page range that EPCM has
> >>>>>>>>>>> the expected permissions?
> >>>>>>>>>>
> >>>>>>>>>> Only way to change EPCM permissions from outside
> >>>>>>>>>> enclave is to run ENCLS[EMODPR]
> >>>>>>>>>> that needs to be accepted from within the enclave via
> >>>>>>>>>> ENCLU[EACCEPT]. At that
> >>>>>>>>>> time the enclave provides the expected permissions
> >>>>>>>>>> and that will fail
> >>>>>>>>>> if there is a mismatch with the EPCM permissions
> >>>>>>>>>> (SGX_PAGE_ATTRIBUTES_MISMATCH).
> >>>>>>>>>
> >>>>>>>>> This is a very valid point but that does make the
> >>>>>>>>> introspection possible
> >>>>>>>>> only at the time of EACCEPT.
> >>>>>>>>>
> >>>>>>>>> It does not give tools for enclave to make sure that
> >>>>>>>>> EMODPR-ETRACK dance
> >>>>>>>>> was ever exercised.
> >>>>>>>>
> >>>>>>>> Could you please elaborate? EACCEPT is available to the
> >>>>>>>> enclave as a tool
> >>>>>>>> and it would fail if ETRACK was not completed (error
> >>>>>>>> SGX_NOT_TRACKED).
> >>>>>>>>
> >>>>>>>> Here is the relevant snippet from the SDM from the
> >>>>>>>> section where it
> >>>>>>>> describes EACCEPT:
> >>>>>>>>
> >>>>>>>> IF (Tracking not correct)
> >>>>>>>>     THEN
> >>>>>>>>         RFLAGS.ZF := 1;
> >>>>>>>>         RAX := SGX_NOT_TRACKED;
> >>>>>>>>         GOTO DONE;
> >>>>>>>> FI;
> >>>>>>>>
> >>>>>>>> Reinette
> >>>>>>>
> >>>>>>> Yes, if enclave calls EACCEPT it does the necessary
> >>>>>>> introspection and makes
> >>>>>>> sure that ETRACK is completed. I have trouble understanding
> >>>>>>> how enclave
> >>>>>>> makes sure that EACCEPT was called.
> >>>>>>
> >>>>>> I'm not concerned of anything going wrong once EMODPR has
> >>>>>> been started.
> >>>>>>
> >>>>>> The problem nails down to that the whole EMODPR process is
> >>>>>> spawned by
> >>>>>> the entity that is not trusted so maybe that should further
> >>>>>> broke down
> >>>>>> to three roles:
> >>>>>>
> >>>>>> 1. Build process B
> >>>>>> 2. Runner process R.
> >>>>>> 3. Enclave E.
> >>>>>>
> >>>>>> And to the costraint that we trust B *more* than R. Once B
> >>>>>> has done all the
> >>>>>> needed EMODPR calls it would send the file descriptor to R.
> >>>>>> Even if R would
> >>>>>> have full access to /dev/sgx_enclave, it would not matter,
> >>>>>> since B has done
> >>>>>> EMODPR-EACCEPT dance with E.
> >>>>>>
> >>>>>> So what you can achieve with EMODPR is not protection against
> >>>>>> mistrusted
> >>>>>> *OS*. There's absolutely no chance you could use it for that
> >>>>>> purpose
> >>>>>> because mistrusted OS controls the whole process.
> >>>>>>
> >>>>>> EMODPR is to help to protect enclave against mistrusted
> >>>>>> *process*, i.e.
> >>>>>> in the above scenario R.
> >>>>>
> >>>>> There are two general cases that I can see. Both are valid.
> >>>>>
> >>>>> 1. The OS moves from a trusted to an untrusted state. This
> >>>>> could be
> >>>>> the multi-process system you've described. But it could also be
> >>>>> that
> >>>>> the kernel becomes compromised after the enclave is fully
> >>>>> initialized.
> >>>>>
> >>>>> 2. The OS is untrustworthy from the start.
> >>>>>
> >>>>> The second case is the stronger one and if you can solve it,
> >>>>> the first
> >>>>> one is solved implicitly. And our end goal is that if the OS
> >>>>> does
> >>>>> anything malicious we will crash in a controlled way.
> >>>>>
> >>>>> A defensive enclave will always want to have the least number
> >>>>> of
> >>>>> privileges for the maximum protection. Therefore, the enclave
> >>>>> will
> >>>>> want the OS to call EMODPR. If that were it, the host could
> >>>>> just lie.
> >>>>> But the enclave also verifies that the EMODPR operation was, in
> >>>>> fact,
> >>>>> executed by doing EACCEPT. When the enclave calls EACCEPT, if
> >>>>> the
> >>>>> kernel hasn't restricted permissions then we get a controlled
> >>>>> crash.
> >>>>> Therefore, we have solved the second case.
> >>>>
> >>>> So you're referring to this part of the SDM pseude code in the
> >>>> SDM:
> >>>>
> >>>> (* Check the destination EPC page for concurrency *)
> >>>> IF ( EPC page in use )
> >>>>     THEN #GP(0); FI;
> >>>>
> >>>> I wonder does "EPC page in use" unconditionally trigger when
> >>>> EACCEPT
> >>>> is invoked for a page for which all of these conditions hold:
> >>>>
> >>>> - .PR := 0 (no EMODPR in progress)
> >>>> - .MODIFIED := 0 (no EMODT in progress)
> >>>> - .PENDING := 0 (no EMODPR in progress)
> >>>>
> >>>> I don't know the exact scope and scale of "EPC page in use".
> >>>>
> >>>> Then, yes, EACCEPT could be at least used to validate that one of
> >>>> the
> >>>> three operations above was requested. However, enclave thread
> >>>> cannot say
> >>>> which one was it, so it is guesswork.
> >>>
> >>> OK, I got it, and this last paragraph is not true. SECINFO given
> >>> EACCEPT
> >>> will lock in rest of the details and make the operation
> >>> deterministic.
> >>
> >> Indeed - so the SDM pseudo code that is relevant here can be found
> >> under
> >> the "(* Verify that accept request matches current EPC page settings
> >> *)"
> >> comment where the enclave can verify that all EPCM values are as they
> >> should
> >> and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
> >> amiss.
> >>
> >>>
> >>> The only question mark then is the condition when no requests are
> >>> active.
> >>
> >> Could you please elaborate what you mean with this question? If no
> >> request
> >> is active then I understand that to mean that no request has started.
> > 
> > My issue was that when:
> > 
> > - .PR := 0 (no EMODPR in progress)
> > - .MODIFIED := 0 (no EMODT in progress)
> > - .PENDING := 0 (no EMODPR in progress)
> > 
> > Does this trigger #GP when you call EACCEPT?
> 
> From what I understand a #GP would be triggered if the EACCEPT does not
> specify at least one of these. That would be a problem with the EACCEPT
> instruction as opposed to the EPCM contents or OS flow though. This
> can be found under the following comment in the SDM pseudo code:
> 
> (* Check that the combination of requested PT, PENDING and MODIFIED is legal *)
> 
> As far as the actual checking of EPCM values goes, it would not result
> in a #GP but for an unexpected value of MODIFIED or PENDING the EACCEPT
> will fail with SGX_PAGE_ATTRIBUTES_MISMATCH. EACCEPT does not enforce the PR
> bit but it _does_ enforce the individual permission bits.
> 
> > I don't think the answer matters that much tho sice if e.g. EMODPR was never
> > done, and enclave expected a change, #GP would trigger eventually in SECINFO
> > validation.
> 
> Similar here as I understand it will not be a #GP but EACCEPT failure with
> error SGX_PAGE_ATTRIBUTES_MISMATCH. The relevant pseudo-code in the SDM is
> below and you can see how MODIFIED and PENDING are matched but PR not (while
> the individual permission bits are):
> 
> (* Verify that accept request matches current EPC page settings *)
> IF ( (EPCM(DS:RCX).ENCLAVEADDRESS ≠ DS:RCX) or (EPCM(DS:RCX).PENDING ≠ SCRATCH_SECINFO.FLAGS.PENDING) or
> (EPCM(DS:RCX).MODIFIED ≠ SCRATCH_SECINFO.FLAGS.MODIFIED) or (EPCM(DS:RCX).R ≠ SCRATCH_SECINFO.FLAGS.R) or
> (EPCM(DS:RCX).W ≠ SCRATCH_SECINFO.FLAGS.W) or (EPCM(DS:RCX).X ≠ SCRATCH_SECINFO.FLAGS.X) or
> (EPCM(DS:RCX).PT ≠ SCRATCH_SECINFO.FLAGS.PT) )
> THEN
>      RFLAGS.ZF := 1;
>      RAX := SGX_PAGE_ATTRIBUTES_MISMATCH;
>      GOTO DONE;
> FI;
> 
> 
> > 
> > The way I look at EACCEPT is a memory verification tool it does the same at
> > run-time as EINIT does before run-time.
> 
> Indeed.

I think I got this now. Thank you anyway for further explanation :-)

> Reinette

/Jarkko
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 60afa8eaf979..6fec68896e1b 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -164,7 +164,7 @@  static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	 * exceed the VMA permissions.
 	 */
 	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
-	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+	page_prot_bits = entry->vm_run_prot_bits & vm_prot_bits;
 	/*
 	 * Add VM_SHARED so that PTE is made writable right away if VMA
 	 * and EPCM are writable (no COW in SGX).
@@ -217,7 +217,7 @@  static vm_fault_t sgx_vma_pfn_mkwrite(struct vm_fault *vmf)
 		goto out;
 	}
 
-	if (!(entry->vm_max_prot_bits & VM_WRITE))
+	if (!(entry->vm_run_prot_bits & VM_WRITE))
 		ret = VM_FAULT_SIGBUS;
 
 out:
@@ -280,7 +280,7 @@  int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 	mutex_lock(&encl->lock);
 	xas_lock(&xas);
 	xas_for_each(&xas, page, PFN_DOWN(end - 1)) {
-		if (~page->vm_max_prot_bits & vm_prot_bits) {
+		if (~page->vm_run_prot_bits & vm_prot_bits) {
 			ret = -EACCES;
 			break;
 		}
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index fec43ca65065..dc262d843411 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -27,7 +27,8 @@ 
 
 struct sgx_encl_page {
 	unsigned long desc;
-	unsigned long vm_max_prot_bits;
+	unsigned long vm_max_prot_bits:8;
+	unsigned long vm_run_prot_bits:8;
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl *encl;
 	struct sgx_va_page *va_page;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 83df20e3e633..7e0819a89532 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -197,6 +197,12 @@  static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 	/* Calculate maximum of the VM flags for the page. */
 	encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
 
+	/*
+	 * At time of allocation, the runtime protection bits are the same
+	 * as the maximum protection bits.
+	 */
+	encl_page->vm_run_prot_bits = encl_page->vm_max_prot_bits;
+
 	return encl_page;
 }