diff mbox series

[RFC,v3,04/12] x86/sgx: Require userspace to define enclave pages' protection bits

Message ID 20190617222438.2080-5-sean.j.christopherson@intel.com (mailing list archive)
State New, archived
Headers show
Series security: x86/sgx: SGX vs. LSM, round 3 | expand

Commit Message

Sean Christopherson June 17, 2019, 10:24 p.m. UTC
Existing Linux Security Module policies restrict userspace's ability to
map memory, e.g. may require priveleged permissions to map a page that
is simultaneously writable and executable.  Said permissions are often
tied to the file which backs the mapped memory, i.e. vm_file.

For reasons explained below, SGX does not allow LSMs to enforce policies
using existing LSM hooks such as file_mprotect().  Explicitly track the
protection bits for an enclave page (separate from the vma/pte bits) and
require userspace to explicit define a page's protection bits when the
page is added to the enclave.  Enclave page protection bits paves the
way to adding security_enclave_load() LSM hook as an SGX equivalent to
security_file_mprotect(), e.g. SGX can pass the page's protection bits
and source vma to LSMs.   The source vma will allow LSMs to tie
permissions to files, e.g. the file containing the enclave's code and
initial data, and the protection bits will allow LSMs to make decisions
based on the capabilities of the process, e.g. if a process is allowed
to load unmeasured code or load code from anonymous memory.

Due to the nature of the Enclave Page Cache, and because the EPC is
manually managed by SGX, all enclave vmas are backed by the same file,
i.e. /dev/sgx/enclave.  Specifically, a single file allows SGX to use
file op hooks to move pages in/out of the EPC.

Furthermore, EPC pages for any given enclave are fundamentally shared
between processes, i.e. CoW semantics are not possible with EPC pages
due to hardware restrictions such as 1:1 mappings between virtual and
physical addresses (within the enclave).

Lastly, all real world enclaves will need read, write and execute
permissions to EPC pages.

As a result, SGX does not play nice with existing LSM behavior as it is
impossible to apply policies to enclaves with reasonable granularity,
e.g. an LSM can deny access to EPC altogether, but can't deny
potentially unwanted behavior such as mapping pages WX, loading code
from anonymous memory, loading unmeasured code, etc...

For example, because all (practical) enclaves need RW pages for data and
RX pages for code, SELinux's existing policies will require all enclaves
to have FILE__READ, FILE__WRITE and FILE__EXECUTE permissions on
/dev/sgx/enclave.  Witholding FILE__WRITE or FILE__EXECUTE in an attempt
to deny RW->RX or RWX would prevent running *any* enclave, even those
that cleanly separate RW and RX pages.  And because /dev/sgx/enclave
requires MAP_SHARED, the anonymous/CoW checks that would trigger
FILE__EXECMOD or PROCESS__EXECMEM permissions will never fire.

Taking protection bits has a second use in that it can be used to
prevent loading an enclave from a noexec file system.  On SGX2 hardware,
regardless of kernel support for SGX2, userspace could EADD a page from
a noexec path using read-only permissions and later mprotect() and
ENCLU[EMODPE] the page to gain execute permissions.  By requiring
the enclave's page protections up front, SGX will be able to enforce
noexec paths when building enclaves.

To prevent userspace from circumventing the allowed protections, do not
allow PROT_{READ,WRITE,EXEC} mappings to an enclave without an
associated enclave page, i.e. prevent creating a mapping with unchecked
protection bits.

Many alternatives[1][2] have been explored, most notably the concept of
having SGX check (at load time) and save the permissions of the enclave
loader.  The permissions would then be enforced by SGX at run time, e.g.
via mmap()/mprotect() hooks of some form.  The basic functionality of
pre-checking permissions is relatively straightforward, but supporting
LSM auditing is complex and fraught with pitfalls.  If auditing is done
at the time of denial then the audit logs will potentially show a large
number of false positives.  Auditing when the denial is enforced, e.g.
at mprotect(), suffers from its own problems, e.g.:

  - Requires LSMs to pre-generate audit messages so that they can be
    replayed by SGX when the denial is actually enforced.

  - System changes can result in stale audit messages, e.g. if files
    are removed from the system, an LSM profile is modified, etc...

  - A process could log what is essentially a false positive denial,
    e.g. if the current process has the requisite capability but the
    original enclave loader did not.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/uapi/asm/sgx.h        |  5 +--
 arch/x86/kernel/cpu/sgx/driver/ioctl.c | 15 +++++---
 arch/x86/kernel/cpu/sgx/driver/main.c  | 49 ++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h         |  1 +
 tools/testing/selftests/x86/sgx/main.c | 32 +++++++++++++++--
 5 files changed, 93 insertions(+), 9 deletions(-)

Comments

Jarkko Sakkinen June 19, 2019, 2:43 p.m. UTC | #1
On Mon, 2019-06-17 at 15:24 -0700, Sean Christopherson wrote:
> +	__u32	flags;

This should be changed to secinfo_flags_mask containing a mask of the
allowed bits for the secinfo flags because of two obvious reasons:

1. Protection flags are used mainly with syscalls and contain also other
   things than just the permissions that do not apply in this context.
2. Having a mask for all secinfo flags is more future proof.

With the protection flags you end up reserving bits forever for things
that we will never have any use for (e.g. PROT_SEM).

Looking the change you convert 'flags' (wondering why it isn't called
'prot') to VM flags, which means that you essentially gain absolutely
nothing and loose some potential versatility as a side-effect by doing
that.

/Jarkko
Sean Christopherson June 19, 2019, 3:20 p.m. UTC | #2
On Wed, Jun 19, 2019 at 05:43:11PM +0300, Jarkko Sakkinen wrote:
> On Mon, 2019-06-17 at 15:24 -0700, Sean Christopherson wrote:
> > +	__u32	flags;
> 
> This should be changed to secinfo_flags_mask containing a mask of the
> allowed bits for the secinfo flags because of two obvious reasons:
> 
> 1. Protection flags are used mainly with syscalls and contain also other
>    things than just the permissions that do not apply in this context.
> 2. Having a mask for all secinfo flags is more future proof.
> 
> With the protection flags you end up reserving bits forever for things
> that we will never have any use for (e.g. PROT_SEM).
> 
> Looking the change you convert 'flags' (wondering why it isn't called
> 'prot') to VM flags, which means that you essentially gain absolutely
> nothing and loose some potential versatility as a side-effect by doing
> that.

Ah, I see where you're coming from.  My intent was that supported flags
would be SGX specific, not generic PROT_* flags.  I.e. bits 2:0 are used
for PROT_{READ,WRITE,EXEC}, bit 7 can be used for SGX_ZERO_PAGE, etc...

I have two objections to 'secinfo_flags_mask':

  - A full SECINFO mask is problematic for literally every other bit/field
    currently defined in SECINFO.FLAGS, e.g. masking PAGE_TYPE, PENDING
    and MODIFIED adds no value that I can think of, but would require the
    kernel do to weird things like reject page types and EMODPR requests
    (due to their PENDING/MODIFIED interaction).

  - The kernel doesn't actually restrict SECINFO based on the param, it's
    restricting VM_MAY* flags in the vmas.  'secinfo_flags_mask' implies
    the kernel is somehow masking SECINFO.

What about something like this?

/**
 * struct sgx_enclave_add_page - parameter structure for the
 *                               %SGX_IOC_ENCLAVE_ADD_PAGE ioctl
 * @addr:	address within the ELRANGE
 * @src:	address for the page data
 * @secinfo:	address for the SECINFO data
 * @mrmask:	bitmask for the measured 256 byte chunks
 * @prot:	maximal PROT_{READ,WRITE,EXEC} permissions for the page
 */
struct sgx_enclave_add_page {
	__u64		addr;
	__u64		src;
	__u64		secinfo;
	__u16		mrmask;
	__u8		prot;
	__u8		pad;
	__u64[2]	reserved;
};
Jarkko Sakkinen June 20, 2019, 10:17 p.m. UTC | #3
On Wed, Jun 19, 2019 at 08:20:18AM -0700, Sean Christopherson wrote:
> On Wed, Jun 19, 2019 at 05:43:11PM +0300, Jarkko Sakkinen wrote:
> > On Mon, 2019-06-17 at 15:24 -0700, Sean Christopherson wrote:
> > > +	__u32	flags;
> > 
> > This should be changed to secinfo_flags_mask containing a mask of the
> > allowed bits for the secinfo flags because of two obvious reasons:
> > 
> > 1. Protection flags are used mainly with syscalls and contain also other
> >    things than just the permissions that do not apply in this context.
> > 2. Having a mask for all secinfo flags is more future proof.
> > 
> > With the protection flags you end up reserving bits forever for things
> > that we will never have any use for (e.g. PROT_SEM).
> > 
> > Looking the change you convert 'flags' (wondering why it isn't called
> > 'prot') to VM flags, which means that you essentially gain absolutely
> > nothing and loose some potential versatility as a side-effect by doing
> > that.
> 
> Ah, I see where you're coming from.  My intent was that supported flags
> would be SGX specific, not generic PROT_* flags.  I.e. bits 2:0 are used
> for PROT_{READ,WRITE,EXEC}, bit 7 can be used for SGX_ZERO_PAGE, etc...
> 
> I have two objections to 'secinfo_flags_mask':
> 
>   - A full SECINFO mask is problematic for literally every other bit/field
>     currently defined in SECINFO.FLAGS, e.g. masking PAGE_TYPE, PENDING
>     and MODIFIED adds no value that I can think of, but would require the
>     kernel do to weird things like reject page types and EMODPR requests
>     (due to their PENDING/MODIFIED interaction).

You're probably right that in practice it would hard to do much with
EMODT.

>   - The kernel doesn't actually restrict SECINFO based on the param, it's
>     restricting VM_MAY* flags in the vmas.  'secinfo_flags_mask' implies
>     the kernel is somehow masking SECINFO.
>
> What about something like this?
> 
> /**
>  * struct sgx_enclave_add_page - parameter structure for the
>  *                               %SGX_IOC_ENCLAVE_ADD_PAGE ioctl
>  * @addr:	address within the ELRANGE
>  * @src:	address for the page data
>  * @secinfo:	address for the SECINFO data
>  * @mrmask:	bitmask for the measured 256 byte chunks
>  * @prot:	maximal PROT_{READ,WRITE,EXEC} permissions for the page
>  */
> struct sgx_enclave_add_page {
> 	__u64		addr;
> 	__u64		src;
> 	__u64		secinfo;
> 	__u16		mrmask;
> 	__u8		prot;
> 	__u8		pad;
> 	__u64[2]	reserved;
> };

LGTM but why it isn't like:

__u16 mrmask;
__u8 prot;
__u8 reserved[5];

/Jarkko
Sean Christopherson July 7, 2019, 7:08 p.m. UTC | #4
On Fri, Jun 21, 2019 at 01:17:02AM +0300, Jarkko Sakkinen wrote:
> On Wed, Jun 19, 2019 at 08:20:18AM -0700, Sean Christopherson wrote:
> > On Wed, Jun 19, 2019 at 05:43:11PM +0300, Jarkko Sakkinen wrote:
> > > On Mon, 2019-06-17 at 15:24 -0700, Sean Christopherson wrote:
> > > > +	__u32	flags;
> > > 
> > > This should be changed to secinfo_flags_mask containing a mask of the
> > > allowed bits for the secinfo flags because of two obvious reasons:
> > > 
> > > 1. Protection flags are used mainly with syscalls and contain also other
> > >    things than just the permissions that do not apply in this context.
> > > 2. Having a mask for all secinfo flags is more future proof.
> > > 
> > > With the protection flags you end up reserving bits forever for things
> > > that we will never have any use for (e.g. PROT_SEM).
> > > 
> > > Looking the change you convert 'flags' (wondering why it isn't called
> > > 'prot') to VM flags, which means that you essentially gain absolutely
> > > nothing and loose some potential versatility as a side-effect by doing
> > > that.
> > 
> > Ah, I see where you're coming from.  My intent was that supported flags
> > would be SGX specific, not generic PROT_* flags.  I.e. bits 2:0 are used
> > for PROT_{READ,WRITE,EXEC}, bit 7 can be used for SGX_ZERO_PAGE, etc...
> > 
> > I have two objections to 'secinfo_flags_mask':
> > 
> >   - A full SECINFO mask is problematic for literally every other bit/field
> >     currently defined in SECINFO.FLAGS, e.g. masking PAGE_TYPE, PENDING
> >     and MODIFIED adds no value that I can think of, but would require the
> >     kernel do to weird things like reject page types and EMODPR requests
> >     (due to their PENDING/MODIFIED interaction).
> 
> You're probably right that in practice it would hard to do much with
> EMODT.
> 
> >   - The kernel doesn't actually restrict SECINFO based on the param, it's
> >     restricting VM_MAY* flags in the vmas.  'secinfo_flags_mask' implies
> >     the kernel is somehow masking SECINFO.
> >
> > What about something like this?
> > 
> > /**
> >  * struct sgx_enclave_add_page - parameter structure for the
> >  *                               %SGX_IOC_ENCLAVE_ADD_PAGE ioctl
> >  * @addr:	address within the ELRANGE
> >  * @src:	address for the page data
> >  * @secinfo:	address for the SECINFO data
> >  * @mrmask:	bitmask for the measured 256 byte chunks
> >  * @prot:	maximal PROT_{READ,WRITE,EXEC} permissions for the page
> >  */
> > struct sgx_enclave_add_page {
> > 	__u64		addr;
> > 	__u64		src;
> > 	__u64		secinfo;
> > 	__u16		mrmask;
> > 	__u8		prot;
> > 	__u8		pad;
> > 	__u64[2]	reserved;
> > };
> 
> LGTM but why it isn't like:
> 
> __u16 mrmask;
> __u8 prot;
> __u8 reserved[5];

Because math is hard :-)  Though I think we'd want

  __u16 mrmask
  __u8  prot
  __u8  pad[5];

with an optional

  __u64 reserved[?];

"pad" can be ignored, e.g. doesn't need to be explicitly zeroed by
userspace, whereas "reserved" requires explicit zeroing and probably an
associated kernel check.
Jarkko Sakkinen July 8, 2019, 3:23 p.m. UTC | #5
On Sun, 2019-07-07 at 12:08 -0700, Sean Christopherson wrote:
> LGTM but why it isn't like:
> > 
> > __u16 mrmask;
> > __u8 prot;
> > __u8 reserved[5];
> 
> Because math is hard :-)  Though I think we'd want
> 
>   __u16 mrmask
>   __u8  prot
>   __u8  pad[5];
> 
> with an optional
> 
>   __u64 reserved[?];
> 
> "pad" can be ignored, e.g. doesn't need to be explicitly zeroed by
> userspace, whereas "reserved" requires explicit zeroing and probably an
> associated kernel check.

OK, cool with the change itself. Still need to get a better idea
how these make sense in architectural sense.

Things that would help with overall picture:

1. We have to figure out how this can be useful when LSM's are not used.
That gives at least some evidence that the security model is overally
good if it makes sense with and without LSM. Right now this looks like
dead functionality if not coupled with an LSM.
2. Probably some "user story" type of examples would help with the
discussion overall [1] i.e. how one would use this for
her own good.

[1] Probably many of the folks who work x86 tree have ignored major
    part of the discussion. Somehow these should be brought to
    nutshell so that anyone can get whatever the model is. Anyone
    should get it basically.

/Jarkko
Sean Christopherson July 8, 2019, 4:19 p.m. UTC | #6
On Mon, Jul 08, 2019 at 06:23:46PM +0300, Jarkko Sakkinen wrote:
> On Sun, 2019-07-07 at 12:08 -0700, Sean Christopherson wrote:
> > LGTM but why it isn't like:
> > > 
> > > __u16 mrmask;
> > > __u8 prot;
> > > __u8 reserved[5];
> > 
> > Because math is hard :-)  Though I think we'd want
> > 
> >   __u16 mrmask
> >   __u8  prot
> >   __u8  pad[5];
> > 
> > with an optional
> > 
> >   __u64 reserved[?];
> > 
> > "pad" can be ignored, e.g. doesn't need to be explicitly zeroed by
> > userspace, whereas "reserved" requires explicit zeroing and probably an
> > associated kernel check.
> 
> OK, cool with the change itself. Still need to get a better idea
> how these make sense in architectural sense.
> 
> Things that would help with overall picture:
> 
> 1. We have to figure out how this can be useful when LSM's are not used.

Declaring PROT_EXEC is useful to enforce noexec filesystems.  Beyond that,
I am not aware of any meaningful use case.

> That gives at least some evidence that the security model is overally
> good if it makes sense with and without LSM. Right now this looks like
> dead functionality if not coupled with an LSM.

I agree that it's effectively dead functionality without LSMs, but keep
in mind that this LSM rat hole was opened specifically because SGX could
be used to circumvent existing LSM security policies.  In other words,
the purpose of the UAPI extension is to achieve LSM compatibility without
incurring significant complexity in the LSM subsystem.

> 2. Probably some "user story" type of examples would help with the
> discussion overall [1] i.e. how one would use this for
> her own good.

The compelling story is Andy's original concern that userspace could
circumvent existing security policies by running code in an enclave.

AIUI, closing the LSM loophole is the minimal requirement to get SGX
upstreamed.  The extensive discussion has largely been focused on
ensuring that whatever mechanism is used to close the loophole will
play nice with future SGX functionality and/or LSM security policies.

> [1] Probably many of the folks who work x86 tree have ignored major
>     part of the discussion. Somehow these should be brought to
>     nutshell so that anyone can get whatever the model is. Anyone
>     should get it basically.
> 
> /Jarkko
>
Jarkko Sakkinen July 9, 2019, 4:06 p.m. UTC | #7
On Mon, Jul 08, 2019 at 09:19:32AM -0700, Sean Christopherson wrote:
> > 2. Probably some "user story" type of examples would help with the
> > discussion overall [1] i.e. how one would use this for
> > her own good.
> 
> The compelling story is Andy's original concern that userspace could
> circumvent existing security policies by running code in an enclave.
> 
> AIUI, closing the LSM loophole is the minimal requirement to get SGX
> upstreamed.  The extensive discussion has largely been focused on
> ensuring that whatever mechanism is used to close the loophole will
> play nice with future SGX functionality and/or LSM security policies.

OK, might be getting here where I fall out of the wagon so:

Doesn't Andy's example anyway require a process that has privileges to
make pages executable i.e. it could run arbitrary code even without an
enclave?

/Jarkko
Sean Christopherson July 10, 2019, 5:25 p.m. UTC | #8
On Tue, Jul 09, 2019 at 07:06:34PM +0300, Jarkko Sakkinen wrote:
> On Mon, Jul 08, 2019 at 09:19:32AM -0700, Sean Christopherson wrote:
> > > 2. Probably some "user story" type of examples would help with the
> > > discussion overall [1] i.e. how one would use this for
> > > her own good.
> > 
> > The compelling story is Andy's original concern that userspace could
> > circumvent existing security policies by running code in an enclave.
> > 
> > AIUI, closing the LSM loophole is the minimal requirement to get SGX
> > upstreamed.  The extensive discussion has largely been focused on
> > ensuring that whatever mechanism is used to close the loophole will
> > play nice with future SGX functionality and/or LSM security policies.
> 
> OK, might be getting here where I fall out of the wagon so:
> 
> Doesn't Andy's example anyway require a process that has privileges to
> make pages executable i.e. it could run arbitrary code even without an
> enclave?

Ah, no.  He did raise that concern, but it'd only be an issue if the
enclave fd were backed by an anon inode, in which case all enclaves would
need EXECMEM in order to gain PROT_EXEC on EPC.  Because the fd is backed
/dev/sgx/enclave, userspace just needs FILE__EXECUTE on /dev/sgx/enclave.
Andy Lutomirski July 15, 2019, 10:29 p.m. UTC | #9
On Wed, Jul 10, 2019 at 10:25 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Jul 09, 2019 at 07:06:34PM +0300, Jarkko Sakkinen wrote:
> > On Mon, Jul 08, 2019 at 09:19:32AM -0700, Sean Christopherson wrote:
> > > > 2. Probably some "user story" type of examples would help with the
> > > > discussion overall [1] i.e. how one would use this for
> > > > her own good.
> > >
> > > The compelling story is Andy's original concern that userspace could
> > > circumvent existing security policies by running code in an enclave.
> > >
> > > AIUI, closing the LSM loophole is the minimal requirement to get SGX
> > > upstreamed.  The extensive discussion has largely been focused on
> > > ensuring that whatever mechanism is used to close the loophole will
> > > play nice with future SGX functionality and/or LSM security policies.
> >
> > OK, might be getting here where I fall out of the wagon so:
> >
> > Doesn't Andy's example anyway require a process that has privileges to
> > make pages executable i.e. it could run arbitrary code even without an
> > enclave?
>
> Ah, no.  He did raise that concern, but it'd only be an issue if the
> enclave fd were backed by an anon inode, in which case all enclaves would
> need EXECMEM in order to gain PROT_EXEC on EPC.  Because the fd is backed
> /dev/sgx/enclave, userspace just needs FILE__EXECUTE on /dev/sgx/enclave.

I would say it differently: regardless of exactly how /dev/sgx/enclave
is wired up under the hood, we want a way that a process can be
granted permission to usefully run enclaves without being granted
permission to execute whatever bytes of code it wants.  Preferably
without requiring LSMs to maintain some form of enclave signature
whitelist.

This is pretty much the only hard requirement I see.  We really could
achieve this, in a somewhat limited form, by saying that LSMs can
approve or reject the SIGSTRUCT.  But doing it that way is a bit nasty
as we've noticed, for a few reasons.  Several of you have raised
objections to requiring SIGSTRUCT to come from a .sigstruct file.  We
also need to worry about a SIGSTRUCT that refers to an enclave that
forgot to measure its text.  And we need to worry about SGX2.

So this whole messy exercise boils down to: a bunch of security policy
authors think that EXECMEM and similar are not to be given out
lightly.  How to we allow policies like that to be compatible with
SGX?
Jarkko Sakkinen Aug. 1, 2019, 4:38 p.m. UTC | #10
On Mon, Jul 15, 2019 at 03:29:23PM -0700, Andy Lutomirski wrote:
> I would say it differently: regardless of exactly how /dev/sgx/enclave
> is wired up under the hood, we want a way that a process can be
> granted permission to usefully run enclaves without being granted
> permission to execute whatever bytes of code it wants.  Preferably
> without requiring LSMs to maintain some form of enclave signature
> whitelist.

Would it be better to have a signer whitelist instead or some
combination? E.g. you could whiteliste either by signer or
enclave signature.

/Jarkko
Andy Lutomirski Aug. 4, 2019, 10:20 p.m. UTC | #11
On Thu, Aug 1, 2019 at 9:38 AM Jarkko Sakkinen
<jarkko.sakkinen@linux.intel.com> wrote:
>
> On Mon, Jul 15, 2019 at 03:29:23PM -0700, Andy Lutomirski wrote:
> > I would say it differently: regardless of exactly how /dev/sgx/enclave
> > is wired up under the hood, we want a way that a process can be
> > granted permission to usefully run enclaves without being granted
> > permission to execute whatever bytes of code it wants.  Preferably
> > without requiring LSMs to maintain some form of enclave signature
> > whitelist.
>
> Would it be better to have a signer whitelist instead or some
> combination? E.g. you could whiteliste either by signer or
> enclave signature.
>

I'm not sure, and also don't really think we need to commit to an
answer right now.  I do think that the eventual solution should be
more flexible than just whitelisting the signers.  In particular, it
should be possible to make secure enclaves, open-source or otherwise,
that are reproducibly buildable.  This more or less requires that the
signing private key not be a secret, which means that no one would
want to whitelist the signing key.  The enclave would be trusted, and
would seal data, on the basis of its MRENCLAVE, and the policy, if
any, would want to whitelist the MRENCLAVE or perhaps the whole
SIGSTRUCT.

But my overall point is that it should be possible to have a conherent
policy that allows any enclave whatsoever to run but that still
respects EXECMEM and such.
Jarkko Sakkinen Aug. 5, 2019, 8:51 p.m. UTC | #12
On Sun, Aug 04, 2019 at 03:20:24PM -0700, Andy Lutomirski wrote:
> On Thu, Aug 1, 2019 at 9:38 AM Jarkko Sakkinen
> <jarkko.sakkinen@linux.intel.com> wrote:
> >
> > On Mon, Jul 15, 2019 at 03:29:23PM -0700, Andy Lutomirski wrote:
> > > I would say it differently: regardless of exactly how /dev/sgx/enclave
> > > is wired up under the hood, we want a way that a process can be
> > > granted permission to usefully run enclaves without being granted
> > > permission to execute whatever bytes of code it wants.  Preferably
> > > without requiring LSMs to maintain some form of enclave signature
> > > whitelist.
> >
> > Would it be better to have a signer whitelist instead or some
> > combination? E.g. you could whiteliste either by signer or
> > enclave signature.
> >
> 
> I'm not sure, and also don't really think we need to commit to an
> answer right now.  I do think that the eventual solution should be
> more flexible than just whitelisting the signers.  In particular, it
> should be possible to make secure enclaves, open-source or otherwise,
> that are reproducibly buildable.  This more or less requires that the
> signing private key not be a secret, which means that no one would
> want to whitelist the signing key.  The enclave would be trusted, and
> would seal data, on the basis of its MRENCLAVE, and the policy, if
> any, would want to whitelist the MRENCLAVE or perhaps the whole
> SIGSTRUCT.
> 
> But my overall point is that it should be possible to have a conherent
> policy that allows any enclave whatsoever to run but that still
> respects EXECMEM and such.

So could kernel embed a fixed signing key that would be made available
through sysfs for signing? Already have one for my selftest.

/Jarkko
Andy Lutomirski Aug. 5, 2019, 9:30 p.m. UTC | #13
On Mon, Aug 5, 2019 at 1:51 PM Jarkko Sakkinen
<jarkko.sakkinen@linux.intel.com> wrote:
>
> On Sun, Aug 04, 2019 at 03:20:24PM -0700, Andy Lutomirski wrote:
> > On Thu, Aug 1, 2019 at 9:38 AM Jarkko Sakkinen
> > <jarkko.sakkinen@linux.intel.com> wrote:
> > >
> > > On Mon, Jul 15, 2019 at 03:29:23PM -0700, Andy Lutomirski wrote:
> > > > I would say it differently: regardless of exactly how /dev/sgx/enclave
> > > > is wired up under the hood, we want a way that a process can be
> > > > granted permission to usefully run enclaves without being granted
> > > > permission to execute whatever bytes of code it wants.  Preferably
> > > > without requiring LSMs to maintain some form of enclave signature
> > > > whitelist.
> > >
> > > Would it be better to have a signer whitelist instead or some
> > > combination? E.g. you could whiteliste either by signer or
> > > enclave signature.
> > >
> >
> > I'm not sure, and also don't really think we need to commit to an
> > answer right now.  I do think that the eventual solution should be
> > more flexible than just whitelisting the signers.  In particular, it
> > should be possible to make secure enclaves, open-source or otherwise,
> > that are reproducibly buildable.  This more or less requires that the
> > signing private key not be a secret, which means that no one would
> > want to whitelist the signing key.  The enclave would be trusted, and
> > would seal data, on the basis of its MRENCLAVE, and the policy, if
> > any, would want to whitelist the MRENCLAVE or perhaps the whole
> > SIGSTRUCT.
> >
> > But my overall point is that it should be possible to have a conherent
> > policy that allows any enclave whatsoever to run but that still
> > respects EXECMEM and such.
>
> So could kernel embed a fixed signing key that would be made available
> through sysfs for signing? Already have one for my selftest.
>

Do you mean a public and private key?  I was imagining that someone
would just create a key pair and publish it for the case of SGX
programs that don't depend on MRSIGNER.  There doesn't have to be just
one.

But I may be misunderstanding you.
Jarkko Sakkinen Aug. 7, 2019, 6:51 p.m. UTC | #14
On Mon, Aug 05, 2019 at 02:30:22PM -0700, Andy Lutomirski wrote:
> On Mon, Aug 5, 2019 at 1:51 PM Jarkko Sakkinen
> <jarkko.sakkinen@linux.intel.com> wrote:
> >
> > On Sun, Aug 04, 2019 at 03:20:24PM -0700, Andy Lutomirski wrote:
> > > On Thu, Aug 1, 2019 at 9:38 AM Jarkko Sakkinen
> > > <jarkko.sakkinen@linux.intel.com> wrote:
> > > >
> > > > On Mon, Jul 15, 2019 at 03:29:23PM -0700, Andy Lutomirski wrote:
> > > > > I would say it differently: regardless of exactly how /dev/sgx/enclave
> > > > > is wired up under the hood, we want a way that a process can be
> > > > > granted permission to usefully run enclaves without being granted
> > > > > permission to execute whatever bytes of code it wants.  Preferably
> > > > > without requiring LSMs to maintain some form of enclave signature
> > > > > whitelist.
> > > >
> > > > Would it be better to have a signer whitelist instead or some
> > > > combination? E.g. you could whiteliste either by signer or
> > > > enclave signature.
> > > >
> > >
> > > I'm not sure, and also don't really think we need to commit to an
> > > answer right now.  I do think that the eventual solution should be
> > > more flexible than just whitelisting the signers.  In particular, it
> > > should be possible to make secure enclaves, open-source or otherwise,
> > > that are reproducibly buildable.  This more or less requires that the
> > > signing private key not be a secret, which means that no one would
> > > want to whitelist the signing key.  The enclave would be trusted, and
> > > would seal data, on the basis of its MRENCLAVE, and the policy, if
> > > any, would want to whitelist the MRENCLAVE or perhaps the whole
> > > SIGSTRUCT.
> > >
> > > But my overall point is that it should be possible to have a conherent
> > > policy that allows any enclave whatsoever to run but that still
> > > respects EXECMEM and such.
> >
> > So could kernel embed a fixed signing key that would be made available
> > through sysfs for signing? Already have one for my selftest.
> >
> 
> Do you mean a public and private key?  I was imagining that someone
> would just create a key pair and publish it for the case of SGX
> programs that don't depend on MRSIGNER.  There doesn't have to be just
> one.
> 
> But I may be misunderstanding you.

Aa, OK, got you. I actually misunderstood you.

/Jarkko
diff mbox series

Patch

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 6dba9f282232..4144242ab690 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -35,15 +35,16 @@  struct sgx_enclave_create  {
  * @src:	address for the page data
  * @secinfo:	address for the SECINFO data
  * @mrmask:	bitmask for the measured 256 byte chunks
+ * @flags:	flags, e.g. PROT_{READ,WRITE,EXEC}
  */
 struct sgx_enclave_add_page {
 	__u64	addr;
 	__u64	src;
 	__u64	secinfo;
-	__u64	mrmask;
+	__u32	mrmask;
+	__u32	flags;
 };
 
-
 /**
  * struct sgx_enclave_init - parameter structure for the
  *                           %SGX_IOC_ENCLAVE_INIT ioctl
diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
index 3552d642b26f..8e95e45411f2 100644
--- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
@@ -2,6 +2,7 @@ 
 // Copyright(c) 2016-19 Intel Corporation.
 
 #include <asm/mman.h>
+#include <linux/mman.h>
 #include <linux/delay.h>
 #include <linux/file.h>
 #include <linux/hashtable.h>
@@ -235,7 +236,8 @@  static int sgx_validate_secs(const struct sgx_secs *secs,
 }
 
 static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
-						 unsigned long addr)
+						 unsigned long addr,
+						 unsigned long prot)
 {
 	struct sgx_encl_page *encl_page;
 	int ret;
@@ -247,6 +249,7 @@  static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 		return ERR_PTR(-ENOMEM);
 	encl_page->desc = addr;
 	encl_page->encl = encl;
+	encl_page->vm_prot_bits = calc_vm_prot_bits(prot, 0);
 	ret = radix_tree_insert(&encl->page_tree, PFN_DOWN(encl_page->desc),
 				encl_page);
 	if (ret) {
@@ -517,7 +520,7 @@  static int __sgx_encl_add_page(struct sgx_encl *encl,
 
 static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long addr,
 			     void *data, struct sgx_secinfo *secinfo,
-			     unsigned int mrmask)
+			     unsigned int mrmask, unsigned long prot)
 {
 	u64 page_type = secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK;
 	struct sgx_encl_page *encl_page;
@@ -543,7 +546,7 @@  static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long addr,
 		goto out;
 	}
 
-	encl_page = sgx_encl_page_alloc(encl, addr);
+	encl_page = sgx_encl_page_alloc(encl, addr, prot);
 	if (IS_ERR(encl_page)) {
 		ret = PTR_ERR(encl_page);
 		goto out;
@@ -584,6 +587,7 @@  static long sgx_ioc_enclave_add_page(struct file *filep, void __user *arg)
 	struct sgx_enclave_add_page addp;
 	struct sgx_secinfo secinfo;
 	struct page *data_page;
+	unsigned long prot;
 	void *data;
 	int ret;
 
@@ -605,7 +609,10 @@  static long sgx_ioc_enclave_add_page(struct file *filep, void __user *arg)
 		goto out;
 	}
 
-	ret = sgx_encl_add_page(encl, addp.addr, data, &secinfo, addp.mrmask);
+	prot = addp.flags & (PROT_READ | PROT_WRITE | PROT_EXEC);
+
+	ret = sgx_encl_add_page(encl, addp.addr, data, &secinfo, addp.mrmask,
+				prot);
 	if (ret)
 		goto out;
 
diff --git a/arch/x86/kernel/cpu/sgx/driver/main.c b/arch/x86/kernel/cpu/sgx/driver/main.c
index d7de4d9aea87..cfc348b44ffb 100644
--- a/arch/x86/kernel/cpu/sgx/driver/main.c
+++ b/arch/x86/kernel/cpu/sgx/driver/main.c
@@ -79,17 +79,66 @@  static int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 	return 0;
 }
 
+/*
+ * Returns the AND of VM_{READ,WRITE,EXEC} permissions across all pages
+ * covered by the specific VMA.  A non-existent (or yet to be added) enclave
+ * page is considered to have no RWX permissions, i.e. is inaccessible.
+ */
+static unsigned long sgx_allowed_rwx(struct sgx_encl *encl,
+				     struct vm_area_struct *vma)
+{
+	unsigned long allowed_rwx = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned long idx, idx_start, idx_end;
+	struct sgx_encl_page *page;
+
+	idx_start = PFN_DOWN(vma->vm_start);
+	idx_end = PFN_DOWN(vma->vm_end - 1);
+
+	for (idx = idx_start; idx <= idx_end; ++idx) {
+		/*
+		 * No need to take encl->lock, vm_prot_bits is set prior to
+		 * insertion and never changes, and racing with adding pages is
+		 * a userspace bug.
+		 */
+		rcu_read_lock();
+		page = radix_tree_lookup(&encl->page_tree, idx);
+		rcu_read_unlock();
+
+		/* Do not allow R|W|X to a non-existent page. */
+		if (!page)
+			allowed_rwx = 0;
+		else
+			allowed_rwx &= page->vm_prot_bits;
+		if (!allowed_rwx)
+			break;
+	}
+
+	return allowed_rwx;
+}
+
 static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sgx_encl *encl = file->private_data;
+	unsigned long allowed_rwx;
 	int ret;
 
+	allowed_rwx = sgx_allowed_rwx(encl, vma);
+	if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC) & ~allowed_rwx)
+		return -EACCES;
+
 	if (!sgx_encl_get_mm(encl, vma->vm_mm)) {
 		ret = sgx_encl_mm_add(encl, vma->vm_mm);
 		if (ret)
 			return ret;
 	}
 
+	if (!(allowed_rwx & VM_READ))
+		vma->vm_flags &= ~VM_MAYREAD;
+	if (!(allowed_rwx & VM_WRITE))
+		vma->vm_flags &= ~VM_MAYWRITE;
+	if (!(allowed_rwx & VM_EXEC))
+		vma->vm_flags &= ~VM_MAYEXEC;
+
 	vma->vm_ops = &sgx_vm_ops;
 	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
 	vma->vm_private_data = encl;
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 7b339334d875..5cb0f2653d4c 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -41,6 +41,7 @@  enum sgx_encl_page_desc {
 
 struct sgx_encl_page {
 	unsigned long desc;
+	unsigned long vm_prot_bits;
 	struct sgx_epc_page *epc_page;
 	struct sgx_va_page *va_page;
 	struct sgx_encl *encl;
diff --git a/tools/testing/selftests/x86/sgx/main.c b/tools/testing/selftests/x86/sgx/main.c
index e2265f841fb0..e14a34adc0e4 100644
--- a/tools/testing/selftests/x86/sgx/main.c
+++ b/tools/testing/selftests/x86/sgx/main.c
@@ -2,6 +2,7 @@ 
 // Copyright(c) 2016-18 Intel Corporation.
 
 #include <elf.h>
+#include <errno.h>
 #include <fcntl.h>
 #include <stdbool.h>
 #include <stdio.h>
@@ -18,6 +19,8 @@ 
 #include "../../../../../arch/x86/kernel/cpu/sgx/arch.h"
 #include "../../../../../arch/x86/include/uapi/asm/sgx.h"
 
+#define PAGE_SIZE  4096
+
 static const uint64_t MAGIC = 0x1122334455667788ULL;
 
 struct vdso_symtab {
@@ -135,8 +138,7 @@  static bool encl_create(int dev_fd, unsigned long bin_size,
 	for (secs->size = 4096; secs->size < bin_size; )
 		secs->size <<= 1;
 
-	base = mmap(NULL, secs->size, PROT_READ | PROT_WRITE | PROT_EXEC,
-		    MAP_SHARED, dev_fd, 0);
+	base = mmap(NULL, secs->size, PROT_NONE, MAP_SHARED, dev_fd, 0);
 	if (base == MAP_FAILED) {
 		perror("mmap");
 		return false;
@@ -147,7 +149,7 @@  static bool encl_create(int dev_fd, unsigned long bin_size,
 	ioc.src = (unsigned long)secs;
 	rc = ioctl(dev_fd, SGX_IOC_ENCLAVE_CREATE, &ioc);
 	if (rc) {
-		fprintf(stderr, "ECREATE failed rc=%d.\n", rc);
+		fprintf(stderr, "ECREATE failed rc=%d, err=%d.\n", rc, errno);
 		munmap(base, secs->size);
 		return false;
 	}
@@ -160,8 +162,14 @@  static bool encl_add_page(int dev_fd, unsigned long addr, void *data,
 {
 	struct sgx_enclave_add_page ioc;
 	struct sgx_secinfo secinfo;
+	unsigned long prot;
 	int rc;
 
+	if (flags == SGX_SECINFO_TCS)
+		prot = PROT_READ | PROT_WRITE;
+	else
+		prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+
 	memset(&secinfo, 0, sizeof(secinfo));
 	secinfo.flags = flags;
 
@@ -169,6 +177,7 @@  static bool encl_add_page(int dev_fd, unsigned long addr, void *data,
 	ioc.mrmask = 0xFFFF;
 	ioc.addr = addr;
 	ioc.src = (uint64_t)data;
+	ioc.flags = prot;
 
 	rc = ioctl(dev_fd, SGX_IOC_ENCLAVE_ADD_PAGE, &ioc);
 	if (rc) {
@@ -184,6 +193,7 @@  static bool encl_load(struct sgx_secs *secs, unsigned long bin_size)
 	struct sgx_enclave_init ioc;
 	uint64_t offset;
 	uint64_t flags;
+	void *addr;
 	int dev_fd;
 	int rc;
 
@@ -215,6 +225,22 @@  static bool encl_load(struct sgx_secs *secs, unsigned long bin_size)
 		goto out_map;
 	}
 
+	addr = mmap((void *)secs->base, PAGE_SIZE, PROT_READ | PROT_WRITE,
+		    MAP_SHARED | MAP_FIXED, dev_fd, 0);
+	if (addr == MAP_FAILED) {
+		fprintf(stderr, "mmap() failed on TCS, errno=%d.\n", errno);
+		return false;
+	}
+
+	addr = mmap((void *)(secs->base + PAGE_SIZE), bin_size - PAGE_SIZE,
+		    PROT_READ | PROT_WRITE | PROT_EXEC,
+		    MAP_SHARED | MAP_FIXED, dev_fd, 0);
+	if (addr == MAP_FAILED) {
+		fprintf(stderr, "mmap() failed, errno=%d.\n", errno);
+		return false;
+	}
+
+
 	close(dev_fd);
 	return true;
 out_map: