diff mbox series

[v7,02/39] prctl: arch-agnostic prctl for shadow stack

Message ID 20231122-arm64-gcs-v7-2-201c483bd775@kernel.org (mailing list archive)
State New
Headers show
Series arm64/gcs: Provide support for GCS in userspace | expand

Commit Message

Mark Brown Nov. 22, 2023, 9:42 a.m. UTC
Three architectures (x86, aarch64, riscv) have announced support for
shadow stacks with fairly similar functionality.  While x86 is using
arch_prctl() to control the functionality neither arm64 nor riscv uses
that interface so this patch adds arch-agnostic prctl() support to
get and set status of shadow stacks and lock the current configuation to
prevent further changes, with support for turning on and off individual
subfeatures so applications can limit their exposure to features that
they do not need.  The features are:

  - PR_SHADOW_STACK_ENABLE: Tracking and enforcement of shadow stacks,
    including allocation of a shadow stack if one is not already
    allocated.
  - PR_SHADOW_STACK_WRITE: Writes to specific addresses in the shadow
    stack.
  - PR_SHADOW_STACK_PUSH: Push additional values onto the shadow stack.

These features are expected to be inherited by new threads and cleared
on exec(), unknown features should be rejected for enable but accepted
for locking (in order to allow for future proofing).

This is based on a patch originally written by Deepak Gupta but modified
fairly heavily, support for indirect landing pads is removed, additional
modes added and the locking interface reworked.  The set status prctl()
is also reworked to just set flags, if setting/reading the shadow stack
pointer is required this could be a separate prctl.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 include/linux/mm.h         |  4 ++++
 include/uapi/linux/prctl.h | 22 ++++++++++++++++++++++
 kernel/sys.c               | 30 ++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+)

Comments

Deepak Gupta Dec. 12, 2023, 7:17 p.m. UTC | #1
On Wed, Nov 22, 2023 at 1:43 AM Mark Brown <broonie@kernel.org> wrote:
>
> Three architectures (x86, aarch64, riscv) have announced support for
> shadow stacks with fairly similar functionality.  While x86 is using
> arch_prctl() to control the functionality neither arm64 nor riscv uses
> that interface so this patch adds arch-agnostic prctl() support to
> get and set status of shadow stacks and lock the current configuation to
> prevent further changes, with support for turning on and off individual
> subfeatures so applications can limit their exposure to features that
> they do not need.  The features are:
>
>   - PR_SHADOW_STACK_ENABLE: Tracking and enforcement of shadow stacks,
>     including allocation of a shadow stack if one is not already
>     allocated.
>   - PR_SHADOW_STACK_WRITE: Writes to specific addresses in the shadow
>     stack.
>   - PR_SHADOW_STACK_PUSH: Push additional values onto the shadow stack.
>
> These features are expected to be inherited by new threads and cleared
> on exec(), unknown features should be rejected for enable but accepted
> for locking (in order to allow for future proofing).
>
> This is based on a patch originally written by Deepak Gupta but modified
> fairly heavily, support for indirect landing pads is removed, additional
> modes added and the locking interface reworked.  The set status prctl()
> is also reworked to just set flags, if setting/reading the shadow stack
> pointer is required this could be a separate prctl.
>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  include/linux/mm.h         |  4 ++++
>  include/uapi/linux/prctl.h | 22 ++++++++++++++++++++++
>  kernel/sys.c               | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 56 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 10462f354614..8b28483b4afa 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4143,4 +4143,8 @@ static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
>         return range_contains_unaccepted_memory(paddr, paddr + PAGE_SIZE);
>  }
>
> +int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status);
> +int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
> +int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> +
>  #endif /* _LINUX_MM_H */
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 370ed14b1ae0..3c66ed8f46d8 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -306,4 +306,26 @@ struct prctl_mm_map {
>  # define PR_RISCV_V_VSTATE_CTRL_NEXT_MASK      0xc
>  # define PR_RISCV_V_VSTATE_CTRL_MASK           0x1f
>
> +/*
> + * Get the current shadow stack configuration for the current thread,
> + * this will be the value configured via PR_SET_SHADOW_STACK_STATUS.
> + */
> +#define PR_GET_SHADOW_STACK_STATUS      71
> +
> +/*
> + * Set the current shadow stack configuration.  Enabling the shadow
> + * stack will cause a shadow stack to be allocated for the thread.
> + */
> +#define PR_SET_SHADOW_STACK_STATUS      72
> +# define PR_SHADOW_STACK_ENABLE         (1UL << 0)

Other architecture may require disabling shadow stack if glibc
tunables is set to permissive mode.
In permissive mode, if glibc encounters `dlopen` on an object which
doesn't support shadow stack,
glibc should be able to issue PR_SHADOW_STACK_DISABLE.

Architectures can choose to implement or not but I think arch agnostic
code should enumerate this.

> +# define PR_SHADOW_STACK_WRITE         (1UL << 1)
> +# define PR_SHADOW_STACK_PUSH          (1UL << 2)
> +
> +/*
> + * Prevent further changes to the specified shadow stack
> + * configuration.  All bits may be locked via this call, including
> + * undefined bits.
> + */
Mark Brown Dec. 12, 2023, 7:22 p.m. UTC | #2
On Tue, Dec 12, 2023 at 11:17:11AM -0800, Deepak Gupta wrote:
> On Wed, Nov 22, 2023 at 1:43 AM Mark Brown <broonie@kernel.org> wrote:

> > +/*
> > + * Set the current shadow stack configuration.  Enabling the shadow
> > + * stack will cause a shadow stack to be allocated for the thread.
> > + */
> > +#define PR_SET_SHADOW_STACK_STATUS      72
> > +# define PR_SHADOW_STACK_ENABLE         (1UL << 0)

> Other architecture may require disabling shadow stack if glibc
> tunables is set to permissive mode.
> In permissive mode, if glibc encounters `dlopen` on an object which
> doesn't support shadow stack,
> glibc should be able to issue PR_SHADOW_STACK_DISABLE.

> Architectures can choose to implement or not but I think arch agnostic
> code should enumerate this.

The current implementation for arm64 and therefore API for the prctl()
is that whatever combination of flags is specified will be set, this
means that setting the status to something that does not include _ENABLE
will result in disabling and we don't need a separate flag for disable.
We have use cases that make active use of disabling at runtime.

Please delete unneeded context from replies, it makes it much easier to
find new content.
Edgecombe, Rick P Dec. 12, 2023, 8:17 p.m. UTC | #3
On Wed, 2023-11-22 at 09:42 +0000, Mark Brown wrote:
> These features are expected to be inherited by new threads and
> cleared
> on exec(), unknown features should be rejected for enable but
> accepted
> for locking (in order to allow for future proofing).

The reason why I stuck with arch_prctl when this came up is that CRIU
(and probably other ptracers) needs a way to unlock via ptrace. ptrace
arch_prctl() can do this. Did you have a plan for unlocking via ptrace?
Mark Brown Dec. 12, 2023, 8:26 p.m. UTC | #4
On Tue, Dec 12, 2023 at 08:17:09PM +0000, Edgecombe, Rick P wrote:
> On Wed, 2023-11-22 at 09:42 +0000, Mark Brown wrote:

> > These features are expected to be inherited by new threads and
> > cleared
> > on exec(), unknown features should be rejected for enable but
> > accepted
> > for locking (in order to allow for future proofing).

> The reason why I stuck with arch_prctl when this came up is that CRIU
> (and probably other ptracers) needs a way to unlock via ptrace. ptrace
> arch_prctl() can do this. Did you have a plan for unlocking via ptrace?

The set of locked features is read/write via ptrace in my arm64 series,
that's architecture specific unfortunately but that seems to be the way
with ptrace.

In general if things have a need to get at prctl()s via ptrace we should
just fix that, at least for arm64 there's things like the vector lengths
that are currently controlled via prctl(), but it shouldn't be a blocker
for the locking specifically.
Edgecombe, Rick P Dec. 12, 2023, 9:22 p.m. UTC | #5
+Mike, who did the CRIU work

Re:
https://lore.kernel.org/lkml/e1362732ba86990b7707d3f5b785358b77c5f896.camel@intel.com/

On Tue, 2023-12-12 at 20:26 +0000, Mark Brown wrote:
> The set of locked features is read/write via ptrace in my arm64
> series,
> that's architecture specific unfortunately but that seems to be the
> way
> with ptrace.

Ah, sorry I didn't see that later in the series. Makes sense. 

> 
> In general if things have a need to get at prctl()s via ptrace we
> should
> just fix that, at least for arm64 there's things like the vector
> lengths
> that are currently controlled via prctl(), but it shouldn't be a
> blocker
> for the locking specifically.

ptrace arch_prctl() is a bit odd. Not all values of 'option' are
supported because ptrace arch_prctl's have to operate cross task. Some
have extra code to support doing this, and some only know how to
operate on the current task, so return an error in the ptrace case.

I guess a benefit would be that there could be some arch agnostic
ptrace userspace code. And I'd also guess (really a guess) that most
ptracing code has some arch awareness already, but the other way is
probably non-zero. Same for shadow stack enabling code. Then on the
kernel side we'd have to add and support a ptrace prctl() solution.

Is it worth the effort? I don't have a strong opinion.
Deepak Gupta Dec. 13, 2023, 12:50 a.m. UTC | #6
On Tue, Dec 12, 2023 at 11:23 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Tue, Dec 12, 2023 at 11:17:11AM -0800, Deepak Gupta wrote:
> > On Wed, Nov 22, 2023 at 1:43 AM Mark Brown <broonie@kernel.org> wrote:
>
> > > +/*
> > > + * Set the current shadow stack configuration.  Enabling the shadow
> > > + * stack will cause a shadow stack to be allocated for the thread.
> > > + */
> > > +#define PR_SET_SHADOW_STACK_STATUS      72
> > > +# define PR_SHADOW_STACK_ENABLE         (1UL << 0)
>
> > Other architecture may require disabling shadow stack if glibc
> > tunables is set to permissive mode.
> > In permissive mode, if glibc encounters `dlopen` on an object which
> > doesn't support shadow stack,
> > glibc should be able to issue PR_SHADOW_STACK_DISABLE.
>
> > Architectures can choose to implement or not but I think arch agnostic
> > code should enumerate this.
>
> The current implementation for arm64 and therefore API for the prctl()
> is that whatever combination of flags is specified will be set, this
> means that setting the status to something that does not include _ENABLE
> will result in disabling and we don't need a separate flag for disable.
> We have use cases that make active use of disabling at runtime.

A theoretical scenario (no current workloads should've this case
because no shadow stack)

- User mode did _ENABLE on the main thread. Shadow stack was allocated
for the current
  thread.
- User mode created a bunch worker threads to run untrusted contained
code. They shadow
  stack too.
- main thread had to do dlopen and now need to disable shadow stack on
itself due to
  incompatibility of incoming object in address space.
- main thread controls worker threads and knows they're contained and
should still be running
  with a shadow stack. Although once in a while the main thread needs
to perform writes to a shadow
  stack of worker threads for some fixup (in the same addr space).
main thread doesn't want to delegate
  this responsibility of ss writes to worker threads because they're untrusted.

How will it do that (currently _ENABLE is married to _WRITE and _PUSH) ?

Please note that I am making up this scenario just for sake of discussion
And don't know if software would be using it in this manner.

>
> Please delete unneeded context from replies, it makes it much easier to
> find new content.

Sorry about that.
Noted.
Mark Brown Dec. 13, 2023, 1:37 p.m. UTC | #7
On Tue, Dec 12, 2023 at 04:50:38PM -0800, Deepak Gupta wrote:

> A theoretical scenario (no current workloads should've this case
> because no shadow stack)

> - User mode did _ENABLE on the main thread. Shadow stack was allocated
> for the current
>   thread.
> - User mode created a bunch worker threads to run untrusted contained
> code. They shadow
>   stack too.
> - main thread had to do dlopen and now need to disable shadow stack on
> itself due to
>   incompatibility of incoming object in address space.
> - main thread controls worker threads and knows they're contained and
> should still be running
>   with a shadow stack. Although once in a while the main thread needs
> to perform writes to a shadow
>   stack of worker threads for some fixup (in the same addr space).
> main thread doesn't want to delegate
>   this responsibility of ss writes to worker threads because they're untrusted.

> How will it do that (currently _ENABLE is married to _WRITE and _PUSH) ?

That's feeling moderately firmly into "don't do that" territory to be
honest, the problems of trying to modify the stack of another running
thread while it's active just don't seem worth it - if you're
coordinating enough to do the modifications it's probably possible to
just ask the thread who's stack is being modified to do the modification
itself and having an unprotected thread writing into shadow stack memory
doesn't feel great.

That said in terms of the API there would be nothing stopping us saying
that _WRITE by itself is a valid combination of flags, in which case the
thread would have permission to write to any shadow stack memory it
could get to.  For arm64 I think we can implement that, I'm not sure
about x86.  _PUSH without _ENABLE is a lot less clear, you would at the
very least at some point have had a stack enabled to have a stack
pointer.
Mark Brown Dec. 13, 2023, 1:49 p.m. UTC | #8
On Tue, Dec 12, 2023 at 09:22:59PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2023-12-12 at 20:26 +0000, Mark Brown wrote:

> > In general if things have a need to get at prctl()s via ptrace we
> > should
> > just fix that, at least for arm64 there's things like the vector
> > lengths
> > that are currently controlled via prctl(), but it shouldn't be a
> > blocker
> > for the locking specifically.

> ptrace arch_prctl() is a bit odd. Not all values of 'option' are
> supported because ptrace arch_prctl's have to operate cross task. Some
> have extra code to support doing this, and some only know how to
> operate on the current task, so return an error in the ptrace case.

It feels like x86 is doing some things via arch_prctl() rather than
implementing specific ptrace() interfaces for them, there's a lot of
stuff where ptrace isn't a great fit for due to it's concept that it's
going to work with an array of registers so that's understandable.

> I guess a benefit would be that there could be some arch agnostic
> ptrace userspace code. And I'd also guess (really a guess) that most
> ptracing code has some arch awareness already, but the other way is
> probably non-zero. Same for shadow stack enabling code. Then on the
> kernel side we'd have to add and support a ptrace prctl() solution.

> Is it worth the effort? I don't have a strong opinion.

I don't have a strong enough opinion on it to start working on it
immediately at any rate.
Deepak Gupta Dec. 13, 2023, 7:43 p.m. UTC | #9
On Wed, Dec 13, 2023 at 5:37 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Tue, Dec 12, 2023 at 04:50:38PM -0800, Deepak Gupta wrote:
>
> > A theoretical scenario (no current workloads should've this case
> > because no shadow stack)
>
> > - User mode did _ENABLE on the main thread. Shadow stack was allocated
> > for the current
> >   thread.
> > - User mode created a bunch worker threads to run untrusted contained
> > code. They shadow
> >   stack too.
> > - main thread had to do dlopen and now need to disable shadow stack on
> > itself due to
> >   incompatibility of incoming object in address space.
> > - main thread controls worker threads and knows they're contained and
> > should still be running
> >   with a shadow stack. Although once in a while the main thread needs
> > to perform writes to a shadow
> >   stack of worker threads for some fixup (in the same addr space).
> > main thread doesn't want to delegate
> >   this responsibility of ss writes to worker threads because they're untrusted.
>
> > How will it do that (currently _ENABLE is married to _WRITE and _PUSH) ?
>
> That's feeling moderately firmly into "don't do that" territory to be
> honest, the problems of trying to modify the stack of another running
> thread while it's active just don't seem worth it - if you're
> coordinating enough to do the modifications it's probably possible to
> just ask the thread who's stack is being modified to do the modification
> itself and having an unprotected thread writing into shadow stack memory
> doesn't feel great.
>

Yeah no leanings on my side. Just wanted to articulate this scenario.
Since this is new ground,
we can define what's appropriate. Let's keep it this way where a
thread can write to shadow
stack mappings only when it itself has shadow stack enabled.
Mark Brown Dec. 13, 2023, 7:48 p.m. UTC | #10
On Wed, Dec 13, 2023 at 11:43:49AM -0800, Deepak Gupta wrote:
> On Wed, Dec 13, 2023 at 5:37 AM Mark Brown <broonie@kernel.org> wrote:
> > On Tue, Dec 12, 2023 at 04:50:38PM -0800, Deepak Gupta wrote:

> > > How will it do that (currently _ENABLE is married to _WRITE and _PUSH) ?

> > That's feeling moderately firmly into "don't do that" territory to be
> > honest, the problems of trying to modify the stack of another running
> > thread while it's active just don't seem worth it - if you're
> > coordinating enough to do the modifications it's probably possible to
> > just ask the thread who's stack is being modified to do the modification
> > itself and having an unprotected thread writing into shadow stack memory
> > doesn't feel great.

> Yeah no leanings on my side. Just wanted to articulate this scenario.
> Since this is new ground,
> we can define what's appropriate. Let's keep it this way where a
> thread can write to shadow
> stack mappings only when it itself has shadow stack enabled.

Sounds good to me - it's much easier to relax permissions later than to
tighten them up.
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 10462f354614..8b28483b4afa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4143,4 +4143,8 @@  static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
 	return range_contains_unaccepted_memory(paddr, paddr + PAGE_SIZE);
 }
 
+int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status);
+int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
+int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
+
 #endif /* _LINUX_MM_H */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 370ed14b1ae0..3c66ed8f46d8 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -306,4 +306,26 @@  struct prctl_mm_map {
 # define PR_RISCV_V_VSTATE_CTRL_NEXT_MASK	0xc
 # define PR_RISCV_V_VSTATE_CTRL_MASK		0x1f
 
+/*
+ * Get the current shadow stack configuration for the current thread,
+ * this will be the value configured via PR_SET_SHADOW_STACK_STATUS.
+ */
+#define PR_GET_SHADOW_STACK_STATUS      71
+
+/*
+ * Set the current shadow stack configuration.  Enabling the shadow
+ * stack will cause a shadow stack to be allocated for the thread.
+ */
+#define PR_SET_SHADOW_STACK_STATUS      72
+# define PR_SHADOW_STACK_ENABLE         (1UL << 0)
+# define PR_SHADOW_STACK_WRITE		(1UL << 1)
+# define PR_SHADOW_STACK_PUSH		(1UL << 2)
+
+/*
+ * Prevent further changes to the specified shadow stack
+ * configuration.  All bits may be locked via this call, including
+ * undefined bits.
+ */
+#define PR_LOCK_SHADOW_STACK_STATUS      73
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index e219fcfa112d..96e8a6b5993a 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2301,6 +2301,21 @@  int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
 	return -EINVAL;
 }
 
+int __weak arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status)
+{
+	return -EINVAL;
+}
+
+int __weak arch_set_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+	return -EINVAL;
+}
+
+int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+	return -EINVAL;
+}
+
 #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
 
 #ifdef CONFIG_ANON_VMA_NAME
@@ -2743,6 +2758,21 @@  SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_RISCV_V_GET_CONTROL:
 		error = RISCV_V_GET_CONTROL();
 		break;
+	case PR_GET_SHADOW_STACK_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_get_shadow_stack_status(me, (unsigned long __user *) arg2);
+		break;
+	case PR_SET_SHADOW_STACK_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_set_shadow_stack_status(me, arg2);
+		break;
+	case PR_LOCK_SHADOW_STACK_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_lock_shadow_stack_status(me, arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;