[3/6] PKEY: Apply PKEY_ENFORCE_API to mprotect

Message ID 20230515130553.2311248-4-jeffxu@chromium.org (mailing list archive)
State Superseded
Series Memory Mapping (VMA) protection using PKU - set 1

Commit Message

Jeff Xu May 15, 2023, 1:05 p.m. UTC
From: Jeff Xu <jeffxu@google.com>

This patch enables PKEY_ENFORCE_API for the mprotect and
pkey_mprotect syscalls.

Signed-off-by: Jeff Xu <jeffxu@google.com>
---
 mm/mprotect.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

Kees Cook May 16, 2023, 8:07 p.m. UTC | #1
On Mon, May 15, 2023 at 01:05:49PM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@google.com>
> 
> This patch enables PKEY_ENFORCE_API for the mprotect and
> pkey_mprotect syscalls.

All callers are from userspace -- this change looks like a no-op?

-Kees
Jeff Xu May 16, 2023, 10:23 p.m. UTC | #2
On Tue, May 16, 2023 at 1:07 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, May 15, 2023 at 01:05:49PM +0000, jeffxu@chromium.org wrote:
> > From: Jeff Xu <jeffxu@google.com>
> >
> > This patch enables PKEY_ENFORCE_API for the mprotect and
> > pkey_mprotect syscalls.
>
> All callers are from userspace -- this change looks like a no-op?
>
Yes. All callers are from user space now.
I am thinking about the future when someone adds a caller in kernel
code and may miss the check.
This is also consistent with munmap and other syscalls I plan to change.
There are comments on do_mprotect_pkey() to describe how this flag is used.


> -Kees
>
> --
> Kees Cook
Dave Hansen May 16, 2023, 11:18 p.m. UTC | #3
On 5/15/23 06:05, jeffxu@chromium.org wrote:
>  /*
>   * pkey==-1 when doing a legacy mprotect()
> + * syscall==true if this is called by syscall from userspace.
> + * Note: this is always true for now, added as a reminder in case that
> + * do_mprotect_pkey is called directly by kernel in the future.
> + * Also it is consistent with __do_munmap().
>   */
>  static int do_mprotect_pkey(unsigned long start, size_t len,
> -		unsigned long prot, int pkey)
> +		unsigned long prot, int pkey, bool syscall)
>  {

The 'syscall' seems kinda silly (and a bit confusing).  It's easy to
check if the caller is a kthread or has a current->mm==NULL.  If you
*really* want a warning, I'd check for those rather than plumb an
apparently unused argument in here.

BTW, this warning is one of those things that will probably cause some
amount of angst.  I'd move it to the end of the series or just axe it
completely.
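
A minimal sketch of the kind of check suggested above, written as a userspace model so it can be compiled on its own. The names `task_model`, `mm_model`, `PF_KTHREAD_MODEL`, `called_from_kernel_context()` and `pkey_enforce_check()` are hypothetical stand-ins for the kernel's `current`, `struct mm_struct` and `PF_KTHREAD`; they are not from the patch, and the real kernel check may well look different.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct mm_model {
	int unused;
};

/* Hypothetical stand-in for the fields of 'current' that matter here. */
struct task_model {
	unsigned int flags;	/* models task_struct.flags */
	struct mm_model *mm;	/* models task_struct.mm */
};

/* Model flag playing the role of the kernel's PF_KTHREAD. */
#define PF_KTHREAD_MODEL 0x1u

/*
 * Infer "not a userspace caller" from the task itself instead of
 * passing a bool down the call chain.
 */
static bool called_from_kernel_context(const struct task_model *t)
{
	return (t->flags & PF_KTHREAD_MODEL) || t->mm == NULL;
}

/*
 * Apply the PKEY enforcement result only for userspace callers.
 * 'pkey_denied' models a failing arch_check_pkey_enforce_api().
 */
static int pkey_enforce_check(const struct task_model *t, bool pkey_denied)
{
	if (called_from_kernel_context(t))
		return 0;
	return pkey_denied ? -EACCES : 0;
}
```

With this shape, do_mprotect_pkey() would keep its original four-argument signature, and the decision about whether to enforce would depend only on the state of the calling task.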
Jeff Xu May 16, 2023, 11:36 p.m. UTC | #4
On Tue, May 16, 2023 at 4:19 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 5/15/23 06:05, jeffxu@chromium.org wrote:
> >  /*
> >   * pkey==-1 when doing a legacy mprotect()
> > + * syscall==true if this is called by syscall from userspace.
> > + * Note: this is always true for now, added as a reminder in case that
> > + * do_mprotect_pkey is called directly by kernel in the future.
> > + * Also it is consistent with __do_munmap().
> >   */
> >  static int do_mprotect_pkey(unsigned long start, size_t len,
> > -             unsigned long prot, int pkey)
> > +             unsigned long prot, int pkey, bool syscall)
> >  {
>
> The 'syscall' seems kinda silly (and a bit confusing).  It's easy to
> check if the caller is a kthread or has a current->mm==NULL.  If you
> *really* want a warning, I'd check for those rather than plumb an
> apparently unused argument in here.
>
> BTW, this warning is one of those things that will probably cause some
> amount of angst.  I'd move it to the end of the series or just axe it
> completely.

Agreed. "syscall" is not a good name here.
The intention is to do this check at the system call entry point.
For example, munmap can get called inside mremap(), but by that time
mremap() should have already checked that all the memory is writeable.

I will remove "syscall" from the do_mprotect_pkey signature; it seems to
have caused more confusion than it is worth.  I will keep the comment/note
in place as a reminder for future developers.
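
The entry-point idea described above can be sketched as a small userspace model. Everything here is hypothetical and only illustrates the layering: `region_model`, `check_enforce_model()`, `do_unmap_model()`, `sys_munmap_model()` and `sys_mremap_model()` are made-up names, not kernel functions, and the real mremap() path does much more than this.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a mapping protected by PKEY_ENFORCE_API. */
struct region_model {
	bool pkey_writable;	/* caller's PKEY permits modifying the mapping */
	bool mapped;
};

/* Models the permission check done once per system call. */
static int check_enforce_model(const struct region_model *r)
{
	return r->pkey_writable ? 0 : -EACCES;
}

/* Internal helper: does the work and performs no permission check. */
static int do_unmap_model(struct region_model *r)
{
	r->mapped = false;
	return 0;
}

/* Entry point for the munmap-like call: check first, then do the work. */
static int sys_munmap_model(struct region_model *r)
{
	int err = check_enforce_model(r);

	if (err)
		return err;
	return do_unmap_model(r);
}

/*
 * Entry point for the mremap-like call: it checks the old range itself,
 * so the internal unmap of that range needs no second check.
 */
static int sys_mremap_model(struct region_model *old_region)
{
	int err = check_enforce_model(old_region);

	if (err)
		return err;
	return do_unmap_model(old_region);
}
```

In this model a denied caller gets -EACCES from either entry point and the region stays mapped, while the shared helper stays free of policy.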
Jeff Xu May 17, 2023, 4:50 a.m. UTC | #5
On Tue, May 16, 2023 at 4:37 PM Jeff Xu <jeffxu@google.com> wrote:
>
> On Tue, May 16, 2023 at 4:19 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > On 5/15/23 06:05, jeffxu@chromium.org wrote:
> > >  /*
> > >   * pkey==-1 when doing a legacy mprotect()
> > > + * syscall==true if this is called by syscall from userspace.
> > > + * Note: this is always true for now, added as a reminder in case that
> > > + * do_mprotect_pkey is called directly by kernel in the future.
> > > + * Also it is consistent with __do_munmap().
> > >   */
> > >  static int do_mprotect_pkey(unsigned long start, size_t len,
> > > -             unsigned long prot, int pkey)
> > > +             unsigned long prot, int pkey, bool syscall)
> > >  {
> >
> > The 'syscall' seems kinda silly (and a bit confusing).  It's easy to
> > check if the caller is a kthread or has a current->mm==NULL.  If you
> > *really* want a warning, I'd check for those rather than plumb an
> > apparently unused argument in here.
> >
> > BTW, this warning is one of those things that will probably cause some
> > amount of angst.  I'd move it to the end of the series or just axe it
> > completely.
>
Okay, I will move the logging part to the end of the series.


> Agreed. syscall is not a good name here.
> The intention is to check this at the system call entry point
> For example, munmap can get called inside mremap(), but by that time
> mremap() should already check that all the memory is writeable.
>
> I will remove "syscall" from do_mprotect_pkey signature, it seems it caused
> more confusion than helpful.  I will keep the comments/note in place to remind
> future developer.
Patch

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 8a68fdca8487..1378be50567d 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -727,9 +727,13 @@  mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 
 /*
  * pkey==-1 when doing a legacy mprotect()
+ * syscall==true if this is called by syscall from userspace.
+ * Note: this is always true for now, added as a reminder in case that
+ * do_mprotect_pkey is called directly by kernel in the future.
+ * Also it is consistent with __do_munmap().
  */
 static int do_mprotect_pkey(unsigned long start, size_t len,
-		unsigned long prot, int pkey)
+		unsigned long prot, int pkey, bool syscall)
 {
 	unsigned long nstart, end, tmp, reqprot;
 	struct vm_area_struct *vma, *prev;
@@ -794,6 +798,21 @@  static int do_mprotect_pkey(unsigned long start, size_t len,
 		}
 	}
 
+	/*
+	 * When called by syscall from userspace, check if the calling
+	 * thread has the PKEY permission to modify the memory mapping.
+	 */
+	if (syscall &&
+	    arch_check_pkey_enforce_api(current->mm, start, end) < 0) {
+		char comm[TASK_COMM_LEN];
+
+		pr_warn_ratelimited(
+			"mprotect was denied on PKEY_ENFORCE_API memory, pid=%d '%s'\n",
+			task_pid_nr(current), get_task_comm(comm, current));
+		error = -EACCES;
+		goto out;
+	}
+
 	prev = vma_prev(&vmi);
 	if (start > vma->vm_start)
 		prev = vma;
@@ -878,7 +897,7 @@  static int do_mprotect_pkey(unsigned long start, size_t len,
 SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
 		unsigned long, prot)
 {
-	return do_mprotect_pkey(start, len, prot, -1);
+	return do_mprotect_pkey(start, len, prot, -1, true);
 }
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
@@ -886,7 +905,7 @@  SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
 SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len,
 		unsigned long, prot, int, pkey)
 {
-	return do_mprotect_pkey(start, len, prot, pkey);
+	return do_mprotect_pkey(start, len, prot, pkey, true);
 }
 
 SYSCALL_DEFINE2(pkey_alloc, unsigned long, flags, unsigned long, init_val)