[RFC,v2,1/5] mm: Introduce vm_ops->may_mprotect()
diff mbox series

Message ID 20190606021145.12604-2-sean.j.christopherson@intel.com
State New
Headers show
Series
  • security: x86/sgx: SGX vs. LSM
Related show

Commit Message

Sean Christopherson June 6, 2019, 2:11 a.m. UTC
SGX will use the may_mprotect() hook to prevent userspace from
circumventing various security checks, e.g. Linux Security Modules.
Naming it may_mprotect() instead of simply mprotect() is intended to
reflect the hook's purpose as a way to gate mprotect() as opposed to
a wholesale replacement.

Enclaves are built by copying data from normal memory into the Enclave
Page Cache (EPC).  Due to the nature of SGX, the EPC is represented by a
single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
MAP_SHARED vm_file that references single file path.  Furthermore, all
enclaves will need read, write and execute pages in the EPC.

As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
that originated in a non-EXECUTE file (which is long gone by the time
mprotect() is called).

By hooking mprotect(), SGX can make explicit LSM upcalls while an
enclave is being built, i.e. when the kernel has a handle to origin of
each enclave page, and enforce the result of the LSM policy whenever
userspace maps the enclave page in the future.

Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
that approach is quite ugly, e.g. would require userspace to call an
SGX ioctl() prior to using mprotect() to extend a page's protections.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 include/linux/mm.h |  2 ++
 mm/mprotect.c      | 15 +++++++++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

Comments

Jarkko Sakkinen June 10, 2019, 3:06 p.m. UTC | #1
On Wed, Jun 05, 2019 at 07:11:41PM -0700, Sean Christopherson wrote:
> SGX will use the may_mprotect() hook to prevent userspace from
> circumventing various security checks, e.g. Linux Security Modules.
> Naming it may_mprotect() instead of simply mprotect() is intended to
> reflect the hook's purpose as a way to gate mprotect() as opposed to
> a wholesale replacement.

"This commit adds may_mprotect() to struct vm_operations_struct, which
can be used to ask from the owner of a VMA if mprotect() is allowed."

This would be more appropriate statement because that is what the code
change aims for precisely. I did not even understand what you meant by
gating in this context. I would leave SGX and LSM's (and especially
"various security checks", which means abssolutely nothing) out of the
first paragraph completely.

> Enclaves are built by copying data from normal memory into the Enclave
> Page Cache (EPC).  Due to the nature of SGX, the EPC is represented by a
> single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
> MAP_SHARED vm_file that references single file path.  Furthermore, all
> enclaves will need read, write and execute pages in the EPC.

I would just say that "Due to the fact that EPC is delivered as IO
memory from the preboot firmware, it can be only mapped as MAP_SHARED".
It is what it is.

> As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
> can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
> that originated in a non-EXECUTE file (which is long gone by the time
> mprotect() is called).

I have hard time following what is paragraph is trying to say.

> By hooking mprotect(), SGX can make explicit LSM upcalls while an
> enclave is being built, i.e. when the kernel has a handle to origin of
> each enclave page, and enforce the result of the LSM policy whenever
> userspace maps the enclave page in the future.

"LSM policy whenever calls mprotect()"? I'm no sure why you mean by
mapping here and if there is any need to talk about future. Isn't this
needed now?

> Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
> that approach is quite ugly, e.g. would require userspace to call an
> SGX ioctl() prior to using mprotect() to extend a page's protections.

Instead of talking "playing games" I would state what could be done with
VM_MAY{READ,WRITE,EXEC} and why it is bad. Leaves questions otherwise.

> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  include/linux/mm.h |  2 ++
>  mm/mprotect.c      | 15 +++++++++++----
>  2 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e8834ac32b7..a697996040ac 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -458,6 +458,8 @@ struct vm_operations_struct {
>  	void (*close)(struct vm_area_struct * area);
>  	int (*split)(struct vm_area_struct * area, unsigned long addr);
>  	int (*mremap)(struct vm_area_struct * area);
> +	int (*may_mprotect)(struct vm_area_struct * area, unsigned long start,
> +			    unsigned long end, unsigned long prot);

Could be just boolean.

/Jarkko
Sean Christopherson June 10, 2019, 3:55 p.m. UTC | #2
On Mon, Jun 10, 2019 at 06:06:00PM +0300, Jarkko Sakkinen wrote:
> On Wed, Jun 05, 2019 at 07:11:41PM -0700, Sean Christopherson wrote:
> > SGX will use the may_mprotect() hook to prevent userspace from
> > circumventing various security checks, e.g. Linux Security Modules.
> > Naming it may_mprotect() instead of simply mprotect() is intended to
> > reflect the hook's purpose as a way to gate mprotect() as opposed to
> > a wholesale replacement.
> 
> "This commit adds may_mprotect() to struct vm_operations_struct, which
> can be used to ask from the owner of a VMA if mprotect() is allowed."
> 
> This would be more appropriate statement because that is what the code
> change aims for precisely. I did not even understand what you meant by
> gating in this context. I would leave SGX and LSM's (and especially
> "various security checks", which means abssolutely nothing) out of the
> first paragraph completely.
> 
> > Enclaves are built by copying data from normal memory into the Enclave
> > Page Cache (EPC).  Due to the nature of SGX, the EPC is represented by a
> > single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
> > MAP_SHARED vm_file that references single file path.  Furthermore, all
> > enclaves will need read, write and execute pages in the EPC.
> 
> I would just say that "Due to the fact that EPC is delivered as IO
> memory from the preboot firmware, it can be only mapped as MAP_SHARED".
> It is what it is.

I was trying to convey that the nature of SGX itself requires that an
enclave's pages are shared between process.  E.g. {MAP,VM}_SHARED would be
required even if we modified the mmu to handle EPC memory in such a way
that it didn't have to be tagged with VM_PFNMAP.

> > As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
> > can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
> > that originated in a non-EXECUTE file (which is long gone by the time
> > mprotect() is called).
> 
> I have hard time following what is paragraph is trying to say.
> 
> > By hooking mprotect(), SGX can make explicit LSM upcalls while an
> > enclave is being built, i.e. when the kernel has a handle to origin of
> > each enclave page, and enforce the result of the LSM policy whenever
> > userspace maps the enclave page in the future.
> 
> "LSM policy whenever calls mprotect()"? I'm no sure why you mean by
> mapping here and if there is any need to talk about future. Isn't this
> needed now?

Future is referring to the timeline of a running kernel, not the future
of the kernel code.

Rather than trying to explain all of the above with words, I'll provide
code examples to show how ->may_protect() will be used by SGX and why it
is the preferred solution.

> > Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
> > that approach is quite ugly, e.g. would require userspace to call an
> > SGX ioctl() prior to using mprotect() to extend a page's protections.
> 
> Instead of talking "playing games" I would state what could be done with
> VM_MAY{READ,WRITE,EXEC} and why it is bad. Leaves questions otherwise.
> 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> >  include/linux/mm.h |  2 ++
> >  mm/mprotect.c      | 15 +++++++++++----
> >  2 files changed, 13 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0e8834ac32b7..a697996040ac 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -458,6 +458,8 @@ struct vm_operations_struct {
> >  	void (*close)(struct vm_area_struct * area);
> >  	int (*split)(struct vm_area_struct * area, unsigned long addr);
> >  	int (*mremap)(struct vm_area_struct * area);
> > +	int (*may_mprotect)(struct vm_area_struct * area, unsigned long start,
> > +			    unsigned long end, unsigned long prot);
> 
> Could be just boolean.
> 
> /Jarkko
Xing, Cedric June 10, 2019, 5:47 p.m. UTC | #3
> From: Christopherson, Sean J
> Sent: Monday, June 10, 2019 8:56 AM
> 
> > > As a result, LSM policies cannot be meaningfully applied, e.g. an
> > > LSM can deny access to the EPC as a whole, but can't deny PROT_EXEC
> > > on page that originated in a non-EXECUTE file (which is long gone by
> > > the time
> > > mprotect() is called).
> >
> > I have hard time following what is paragraph is trying to say.
> >
> > > By hooking mprotect(), SGX can make explicit LSM upcalls while an
> > > enclave is being built, i.e. when the kernel has a handle to origin
> > > of each enclave page, and enforce the result of the LSM policy
> > > whenever userspace maps the enclave page in the future.
> >
> > "LSM policy whenever calls mprotect()"? I'm no sure why you mean by
> > mapping here and if there is any need to talk about future. Isn't this
> > needed now?
> 
> Future is referring to the timeline of a running kernel, not the future
> of the kernel code.
> 
> Rather than trying to explain all of the above with words, I'll provide
> code examples to show how ->may_protect() will be used by SGX and why it
> is the preferred solution.

The LSM concept is to separate security policy enforcement from the rest of the kernel. For modules, the "official" way is to use VM_MAY* flags to limit allowable permissions, while LSM uses security_file_mprotect(). I guess that's why we didn't have .may_mprotect() in the first place. What you are doing is enforcing some security policy outside of LSM, which is dirty from architecture perspective.
Sean Christopherson June 10, 2019, 7:49 p.m. UTC | #4
On Mon, Jun 10, 2019 at 10:47:52AM -0700, Xing, Cedric wrote:
> > From: Christopherson, Sean J
> > Sent: Monday, June 10, 2019 8:56 AM
> > 
> > > > As a result, LSM policies cannot be meaningfully applied, e.g. an
> > > > LSM can deny access to the EPC as a whole, but can't deny PROT_EXEC
> > > > on page that originated in a non-EXECUTE file (which is long gone by
> > > > the time
> > > > mprotect() is called).
> > >
> > > I have hard time following what is paragraph is trying to say.
> > >
> > > > By hooking mprotect(), SGX can make explicit LSM upcalls while an
> > > > enclave is being built, i.e. when the kernel has a handle to origin
> > > > of each enclave page, and enforce the result of the LSM policy
> > > > whenever userspace maps the enclave page in the future.
> > >
> > > "LSM policy whenever calls mprotect()"? I'm no sure why you mean by
> > > mapping here and if there is any need to talk about future. Isn't this
> > > needed now?
> > 
> > Future is referring to the timeline of a running kernel, not the future
> > of the kernel code.
> > 
> > Rather than trying to explain all of the above with words, I'll provide
> > code examples to show how ->may_protect() will be used by SGX and why it
> > is the preferred solution.
> 
> The LSM concept is to separate security policy enforcement from the rest of
> the kernel. For modules, the "official" way is to use VM_MAY* flags to limit
> allowable permissions, while LSM uses security_file_mprotect().
> I guess that's why we didn't have .may_mprotect() in the first place.

Heh, so I've typed up about five different responses to this comment.  In
doing so, I think I've convinced myself that ->may_mprotect() is
unnecessary.  Rther than hook mprotect(), simply update the VM_MAY* flags
during mmap(), with all bits cleared if there isn't an associated enclave
page.  IIRC, the need to add ->may_protect() came about when we were
exploring more dynamic interplay between SGX and LSMs.

> What you are doing is enforcing some security policy outside of LSM, which
> is dirty from architecture perspective.

No, the enclave page protections are enforced regardless of LSM policy,
and in v2 those protections are immutable.  Yes, the explicit enclave
page protection bits are being added primarily for LSMs, but they don't
impact functionality other than at the security_enclave_load() touchpoint.
Xing, Cedric June 10, 2019, 10:06 p.m. UTC | #5
> From: Christopherson, Sean J
> Sent: Monday, June 10, 2019 12:50 PM
> 
> On Mon, Jun 10, 2019 at 10:47:52AM -0700, Xing, Cedric wrote:
> > > From: Christopherson, Sean J
> > > Sent: Monday, June 10, 2019 8:56 AM
> > >
> > > > > As a result, LSM policies cannot be meaningfully applied, e.g.
> > > > > an LSM can deny access to the EPC as a whole, but can't deny
> > > > > PROT_EXEC on page that originated in a non-EXECUTE file (which
> > > > > is long gone by the time
> > > > > mprotect() is called).
> > > >
> > > > I have hard time following what is paragraph is trying to say.
> > > >
> > > > > By hooking mprotect(), SGX can make explicit LSM upcalls while
> > > > > an enclave is being built, i.e. when the kernel has a handle to
> > > > > origin of each enclave page, and enforce the result of the LSM
> > > > > policy whenever userspace maps the enclave page in the future.
> > > >
> > > > "LSM policy whenever calls mprotect()"? I'm no sure why you mean
> > > > by mapping here and if there is any need to talk about future.
> > > > Isn't this needed now?
> > >
> > > Future is referring to the timeline of a running kernel, not the
> > > future of the kernel code.
> > >
> > > Rather than trying to explain all of the above with words, I'll
> > > provide code examples to show how ->may_protect() will be used by
> > > SGX and why it is the preferred solution.
> >
> > The LSM concept is to separate security policy enforcement from the
> > rest of the kernel. For modules, the "official" way is to use VM_MAY*
> > flags to limit allowable permissions, while LSM uses
> security_file_mprotect().
> > I guess that's why we didn't have .may_mprotect() in the first place.
> 
> Heh, so I've typed up about five different responses to this comment.
> In doing so, I think I've convinced myself that ->may_mprotect() is
> unnecessary.  Rther than hook mprotect(), simply update the VM_MAY*
> flags during mmap(), with all bits cleared if there isn't an associated
> enclave page.  IIRC, the need to add ->may_protect() came about when we
> were exploring more dynamic interplay between SGX and LSMs.
> 
> > What you are doing is enforcing some security policy outside of LSM,
> > which is dirty from architecture perspective.
> 
> No, the enclave page protections are enforced regardless of LSM policy,
> and in v2 those protections are immutable.  Yes, the explicit enclave
> page protection bits are being added primarily for LSMs, but they don't
> impact functionality other than at the security_enclave_load()
> touchpoint.

Disagreed.

You can say you want to enforce "something" without LSM. But what's the purpose of that "something" without LSM? Why doesn't the original mprotect() enforce that "something"?

It *does* affect functionality because user mode code has to figure out an "explicit protection" to make sure the enclave would work with *and also* without LSM. That said, the "explicit protection" can neither be too restrictive (or enclave wouldn't work) nor be too permissive (or LSM policies are violated). But what if the user mode code doesn't have appropriate "explicit protection" ahead of time as it is just going to mprotect() as the enclave requests at runtime?

And your restrictions on mmap()'ing non-existing pages also have great impacts to SGX2 support.

I think some reasonable answers are needed to the above questions before we can call this proposal viable.

Patch
diff mbox series

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e8834ac32b7..a697996040ac 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -458,6 +458,8 @@  struct vm_operations_struct {
 	void (*close)(struct vm_area_struct * area);
 	int (*split)(struct vm_area_struct * area, unsigned long addr);
 	int (*mremap)(struct vm_area_struct * area);
+	int (*may_mprotect)(struct vm_area_struct * area, unsigned long start,
+			    unsigned long end, unsigned long prot);
 	vm_fault_t (*fault)(struct vm_fault *vmf);
 	vm_fault_t (*huge_fault)(struct vm_fault *vmf,
 			enum page_entry_size pe_size);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index bf38dfbbb4b4..18732543b295 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -547,13 +547,20 @@  static int do_mprotect_pkey(unsigned long start, size_t len,
 			goto out;
 		}
 
-		error = security_file_mprotect(vma, reqprot, prot);
-		if (error)
-			goto out;
-
 		tmp = vma->vm_end;
 		if (tmp > end)
 			tmp = end;
+
+		if (vma->vm_ops && vma->vm_ops->may_mprotect) {
+			error = vma->vm_ops->may_mprotect(vma, nstart, tmp, prot);
+			if (error)
+				goto out;
+		}
+
+		error = security_file_mprotect(vma, reqprot, prot);
+		if (error)
+			goto out;
+
 		error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
 		if (error)
 			goto out;