diff mbox series

[v10,13/40] arm64/mm: Map pages for guarded control stack

Message ID 20240801-arm64-gcs-v10-13-699e2bd2190b@kernel.org (mailing list archive)
State New
Headers show
Series arm64/gcs: Provide support for GCS in userspace | expand

Commit Message

Mark Brown Aug. 1, 2024, 12:06 p.m. UTC
Map pages flagged as being part of a GCS as such rather than using the
full set of generic VM flags.

This is done using a conditional rather than extending the size of
protection_map since that would make for a very sparse array.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/mman.h |  9 +++++++++
 arch/arm64/mm/mmap.c          | 10 +++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

Comments

Catalin Marinas Aug. 19, 2024, 9:10 a.m. UTC | #1
On Thu, Aug 01, 2024 at 01:06:40PM +0100, Mark Brown wrote:
> Map pages flagged as being part of a GCS as such rather than using the
> full set of generic VM flags.
> 
> This is done using a conditional rather than extending the size of
> protection_map since that would make for a very sparse array.
> 
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  arch/arm64/include/asm/mman.h |  9 +++++++++
>  arch/arm64/mm/mmap.c          | 10 +++++++++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> index c21849ffdd88..6d3fe6433a62 100644
> --- a/arch/arm64/include/asm/mman.h
> +++ b/arch/arm64/include/asm/mman.h
> @@ -61,6 +61,15 @@ static inline bool arch_validate_flags(unsigned long vm_flags)
>  			return false;
>  	}
>  
> +	if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
> +		/*
> +		 * An executable GCS isn't a good idea, and the mm
> +		 * core can't cope with a shared GCS.
> +		 */
> +		if (vm_flags & (VM_EXEC | VM_ARM64_BTI | VM_SHARED))
> +			return false;
> +	}

I wonder whether we should clear VM_MAYEXEC early on during the vma
creation. This way the mprotect() case will be handled in the core code.
At a quick look, do_mmap() seems to always set VM_MAYEXEC but discard it
for non-executable file mmap. Last time I looked (when doing MTE) there
wasn't a way for the arch code to clear specific VM_* flags, only to
validate them. But I think we should just clear VM_MAYEXEC and also
return an error for VM_EXEC in the core do_mmap() if VM_SHADOW_STACK. It
would cover the other architectures doing shadow stacks.

Regarding VM_SHARED, how do we even end up with this via the
map_shadow_stack() syscall? I can't see how one can pass MAP_SHARED to
do_mmap() on this path. I'm fine with a VM_WARN_ON() if you want the
check (and there's no way a user can trigger it).

Is there any arch restriction with setting BTI and GCS? It doesn't make
sense but curious if it matters. We block the exec permission anyway
(unless the BTI pages moved to PIE as well, I don't remember).

> +
>  	return true;
>  
>  }
> diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
> index 642bdf908b22..3ed63fc8cd0a 100644
> --- a/arch/arm64/mm/mmap.c
> +++ b/arch/arm64/mm/mmap.c
> @@ -83,9 +83,17 @@ arch_initcall(adjust_protection_map);
>  
>  pgprot_t vm_get_page_prot(unsigned long vm_flags)
>  {
> -	pteval_t prot = pgprot_val(protection_map[vm_flags &
> +	pteval_t prot;
> +
> +	/* Short circuit GCS to avoid bloating the table. */
> +	if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
> +		prot = _PAGE_GCS_RO;
> +	} else {
> +		prot = pgprot_val(protection_map[vm_flags &
>  				   (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
> +	}

This looks fine to me. Such page will become proper GCS on first access.
Mark Brown Aug. 19, 2024, 4:33 p.m. UTC | #2
On Mon, Aug 19, 2024 at 10:10:36AM +0100, Catalin Marinas wrote:
> On Thu, Aug 01, 2024 at 01:06:40PM +0100, Mark Brown wrote:

> > +	if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
> > +		/*
> > +		 * An executable GCS isn't a good idea, and the mm
> > +		 * core can't cope with a shared GCS.
> > +		 */
> > +		if (vm_flags & (VM_EXEC | VM_ARM64_BTI | VM_SHARED))
> > +			return false;
> > +	}

> I wonder whether we should clear VM_MAYEXEC early on during the vma
> creation. This way the mprotect() case will be handled in the core code.
> At a quick look, do_mmap() seems to always set VM_MAYEXEC but discard it
> for non-executable file mmap. Last time I looked (when doing MTE) there
> wasn't a way for the arch code to clear specific VM_* flags, only to
> validate them. But I think we should just clear VM_MAYEXEC and also
> return an error for VM_EXEC in the core do_mmap() if VM_SHADOW_STACK. It
> would cover the other architectures doing shadow stacks.

Yes, I think adding something generic would make sense here.  That feels
like a cleanup which could be split out?

> Regarding VM_SHARED, how do we even end up with this via the
> map_shadow_stack() syscall? I can't see how one can pass MAP_SHARED to
> do_mmap() on this path. I'm fine with a VM_WARN_ON() if you want the
> check (and there's no way a user can trigger it).

It's just a defenesive programming thing, I'm not aware of any way in
which it should be possible to trigger this.

> Is there any arch restriction with setting BTI and GCS? It doesn't make
> sense but curious if it matters. We block the exec permission anyway
> (unless the BTI pages moved to PIE as well, I don't remember).

As you say BTI should be meaningless for a non-executable page like GCS,
I'm not aware of any way in which it matters.  BTI is separate to PIE.
Catalin Marinas Aug. 20, 2024, 2:59 p.m. UTC | #3
On Mon, Aug 19, 2024 at 05:33:24PM +0100, Mark Brown wrote:
> On Mon, Aug 19, 2024 at 10:10:36AM +0100, Catalin Marinas wrote:
> > On Thu, Aug 01, 2024 at 01:06:40PM +0100, Mark Brown wrote:
> > > +	if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
> > > +		/*
> > > +		 * An executable GCS isn't a good idea, and the mm
> > > +		 * core can't cope with a shared GCS.
> > > +		 */
> > > +		if (vm_flags & (VM_EXEC | VM_ARM64_BTI | VM_SHARED))
> > > +			return false;
> > > +	}
> 
> > I wonder whether we should clear VM_MAYEXEC early on during the vma
> > creation. This way the mprotect() case will be handled in the core code.
> > At a quick look, do_mmap() seems to always set VM_MAYEXEC but discard it
> > for non-executable file mmap. Last time I looked (when doing MTE) there
> > wasn't a way for the arch code to clear specific VM_* flags, only to
> > validate them. But I think we should just clear VM_MAYEXEC and also
> > return an error for VM_EXEC in the core do_mmap() if VM_SHADOW_STACK. It
> > would cover the other architectures doing shadow stacks.
> 
> Yes, I think adding something generic would make sense here.  That feels
> like a cleanup which could be split out?

It can be done separately. It doesn't look like x86 has such checks.
Adding it generically would be a slight ABI tightening but I doubt it
matters, no sane software would use an executable shadow stack.

> > Regarding VM_SHARED, how do we even end up with this via the
> > map_shadow_stack() syscall? I can't see how one can pass MAP_SHARED to
> > do_mmap() on this path. I'm fine with a VM_WARN_ON() if you want the
> > check (and there's no way a user can trigger it).
> 
> It's just a defenesive programming thing, I'm not aware of any way in
> which it should be possible to trigger this.
> 
> > Is there any arch restriction with setting BTI and GCS? It doesn't make
> > sense but curious if it matters. We block the exec permission anyway
> > (unless the BTI pages moved to PIE as well, I don't remember).
> 
> As you say BTI should be meaningless for a non-executable page like GCS,
> I'm not aware of any way in which it matters.  BTI is separate to PIE.

My thoughts were whether we can get rid of this hunk entirely by
handling it in the core code. We'd allow BTI if one wants such useless
combination but clear VM_MAYEXEC in the core code (and ignore VM_SHARED
since you can't set it anyway).
Mark Brown Aug. 20, 2024, 3:28 p.m. UTC | #4
On Tue, Aug 20, 2024 at 03:59:21PM +0100, Catalin Marinas wrote:
> On Mon, Aug 19, 2024 at 05:33:24PM +0100, Mark Brown wrote:
> > On Mon, Aug 19, 2024 at 10:10:36AM +0100, Catalin Marinas wrote:

> > > At a quick look, do_mmap() seems to always set VM_MAYEXEC but discard it
> > > for non-executable file mmap. Last time I looked (when doing MTE) there
> > > wasn't a way for the arch code to clear specific VM_* flags, only to
> > > validate them. But I think we should just clear VM_MAYEXEC and also
> > > return an error for VM_EXEC in the core do_mmap() if VM_SHADOW_STACK. It
> > > would cover the other architectures doing shadow stacks.

> > Yes, I think adding something generic would make sense here.  That feels
> > like a cleanup which could be split out?

> It can be done separately. It doesn't look like x86 has such checks.
> Adding it generically would be a slight ABI tightening but I doubt it
> matters, no sane software would use an executable shadow stack.

OK.

> > > Is there any arch restriction with setting BTI and GCS? It doesn't make
> > > sense but curious if it matters. We block the exec permission anyway
> > > (unless the BTI pages moved to PIE as well, I don't remember).

> > As you say BTI should be meaningless for a non-executable page like GCS,
> > I'm not aware of any way in which it matters.  BTI is separate to PIE.

> My thoughts were whether we can get rid of this hunk entirely by
> handling it in the core code. We'd allow BTI if one wants such useless
> combination but clear VM_MAYEXEC in the core code (and ignore VM_SHARED
> since you can't set it anyway).

I have to admit that the BTI because I was shoving _EXEC in there rather
than because it specifically needed to be blocked.  So change the check
for VM_SHARED to a VM_WARN_ON(), and leave the _EXEC check for now
pending the above core change?
Catalin Marinas Aug. 20, 2024, 5:30 p.m. UTC | #5
On Tue, Aug 20, 2024 at 04:28:21PM +0100, Mark Brown wrote:
> On Tue, Aug 20, 2024 at 03:59:21PM +0100, Catalin Marinas wrote:
> > On Mon, Aug 19, 2024 at 05:33:24PM +0100, Mark Brown wrote:
> > > On Mon, Aug 19, 2024 at 10:10:36AM +0100, Catalin Marinas wrote:
> > > > Is there any arch restriction with setting BTI and GCS? It doesn't make
> > > > sense but curious if it matters. We block the exec permission anyway
> > > > (unless the BTI pages moved to PIE as well, I don't remember).
> 
> > > As you say BTI should be meaningless for a non-executable page like GCS,
> > > I'm not aware of any way in which it matters.  BTI is separate to PIE.
> 
> > My thoughts were whether we can get rid of this hunk entirely by
> > handling it in the core code. We'd allow BTI if one wants such useless
> > combination but clear VM_MAYEXEC in the core code (and ignore VM_SHARED
> > since you can't set it anyway).
> 
> I have to admit that the BTI because I was shoving _EXEC in there rather
> than because it specifically needed to be blocked.  So change the check
> for VM_SHARED to a VM_WARN_ON(), and leave the _EXEC check for now
> pending the above core change?

Yes, sounds good.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index c21849ffdd88..6d3fe6433a62 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -61,6 +61,15 @@  static inline bool arch_validate_flags(unsigned long vm_flags)
 			return false;
 	}
 
+	if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
+		/*
+		 * An executable GCS isn't a good idea, and the mm
+		 * core can't cope with a shared GCS.
+		 */
+		if (vm_flags & (VM_EXEC | VM_ARM64_BTI | VM_SHARED))
+			return false;
+	}
+
 	return true;
 
 }
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 642bdf908b22..3ed63fc8cd0a 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -83,9 +83,17 @@  arch_initcall(adjust_protection_map);
 
 pgprot_t vm_get_page_prot(unsigned long vm_flags)
 {
-	pteval_t prot = pgprot_val(protection_map[vm_flags &
+	pteval_t prot;
+
+	/* Short circuit GCS to avoid bloating the table. */
+	if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
+		prot = _PAGE_GCS_RO;
+	} else {
+		prot = pgprot_val(protection_map[vm_flags &
 				   (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
+	}
 
+	/* VM_ARM64_BTI on a GCS is rejected in arch_validate_flags() */
 	if (vm_flags & VM_ARM64_BTI)
 		prot |= PTE_GP;