[v11,00/39] arm64/gcs: Provide support for GCS in userspace

Message ID 20240822-arm64-gcs-v11-0-41b81947ecb5@kernel.org (mailing list archive)

Message

Mark Brown Aug. 22, 2024, 1:15 a.m. UTC
The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling.

When GCS is active a secondary stack called the Guarded Control Stack is
maintained, protected with a memory attribute which means that it can
only be written with specific GCS operations.  The current GCS pointer
cannot be directly written to by userspace.  When a BL is executed the
value stored in LR is also pushed onto the GCS, and when a RET is
executed the top of the GCS is popped and compared to LR with a fault
being raised if the values do not match.  GCS operations may only be
performed on GCS pages; a data abort is generated if they are not.
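
The BL/RET behaviour described above can be illustrated with a small
software model.  This is only a sketch: the names gcs_model, gcs_model_bl()
and gcs_model_ret() are hypothetical, the array grows upwards for
simplicity, and a false return stands in for the fault that real hardware
would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GCS_MODEL_DEPTH 64

/* Hypothetical software stand-in for a Guarded Control Stack. */
struct gcs_model {
	uint64_t slots[GCS_MODEL_DEPTH];
	size_t depth;
};

/* Model of BL: the return address placed in LR is also pushed. */
bool gcs_model_bl(struct gcs_model *gcs, uint64_t lr)
{
	if (gcs->depth == GCS_MODEL_DEPTH)
		return false;	/* model limit reached, not a hardware rule */

	gcs->slots[gcs->depth++] = lr;
	return true;
}

/* Model of RET: pop the top entry and compare it with LR. */
bool gcs_model_ret(struct gcs_model *gcs, uint64_t lr)
{
	if (gcs->depth == 0)
		return false;	/* nothing to pop, treated as a fault */

	return gcs->slots[--gcs->depth] == lr;	/* false models the fault */
}

/* Small self-check showing a matched return and a corrupted one. */
void gcs_model_selfcheck(void)
{
	struct gcs_model gcs = { .depth = 0 };

	assert(gcs_model_bl(&gcs, 0x1000));
	assert(gcs_model_bl(&gcs, 0x2000));
	assert(gcs_model_ret(&gcs, 0x2000));	/* LR matches the GCS */
	assert(!gcs_model_ret(&gcs, 0xdead));	/* overwritten LR is caught */
}
```

The second return in the self-check fails because LR no longer matches
the value recorded at call time, which is the case ROP relies on.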

The combination of hardware enforcement and lack of extra instructions
in the function entry and exit paths should result in something which
has less overhead and is more difficult to attack than a purely software
implementation like clang's shadow stacks.

This series implements support for use of GCS by userspace, along with
support for use of GCS within KVM guests.  It does not enable use of GCS
by either EL1 or EL2; this will be implemented separately.  Executables
are started without GCS and must use a prctl() to enable it.  It is
expected that this will be done very early in application execution by
the dynamic linker or other startup code.  For dynamic linking the
decision will be made by checking that everything in the executable is
marked as GCS compatible.
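
As a rough illustration of the userspace side, the sketch below only
queries the current state rather than enabling GCS, since enabling is
expected to be done by startup code.  The PR_GET_SHADOW_STACK_STATUS
and PR_SHADOW_STACK_ENABLE names are assumed to be the ones added to
include/uapi/linux/prctl.h by this series; the fallback numeric values
are assumptions for building against older headers, and
shadow_stack_enabled() is a hypothetical helper.

```c
#include <sys/prctl.h>

/* Assumed fallback values, only used if the installed headers lack them. */
#ifndef PR_GET_SHADOW_STACK_STATUS
#define PR_GET_SHADOW_STACK_STATUS 74
#endif
#ifndef PR_SHADOW_STACK_ENABLE
#define PR_SHADOW_STACK_ENABLE (1UL << 0)
#endif

/*
 * Hypothetical helper: returns 1 if a shadow stack is enabled for the
 * calling thread, 0 if it is not, and -1 if the kernel rejected the
 * query (for example because the prctl() is not supported).
 */
int shadow_stack_enabled(void)
{
	unsigned long status = 0;

	if (prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0UL, 0UL, 0UL) != 0)
		return -1;

	return (status & PR_SHADOW_STACK_ENABLE) ? 1 : 0;
}
```

On a kernel or architecture without the generic prctl() the helper is
expected to return -1 rather than fail in any other way.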

x86 has an equivalent feature called shadow stacks.  This series depends
on the x86 patches for generic memory management support for the new
guarded/shadow stack page type and shares APIs as much as possible.  As
there has been extensive discussion with the wider community around the
ABI for shadow stacks I have as far as practical kept implementation
decisions close to those for x86, anticipating that review would lead to
similar conclusions in the absence of strong reasoning for divergence.

The main divergence I am conscious of is that x86 allows shadow stack to
be enabled and disabled repeatedly, freeing the shadow stack for the
thread whenever disabled, while this implementation keeps the GCS
allocated after disable but refuses to reenable it.  This is to avoid
races with things actively walking the GCS during a disable.  We do
anticipate that some systems will wish to disable GCS at runtime but are
not aware of any demand for subsequently reenabling it.
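
The enable-once behaviour can be sketched as a tiny state model.  All
names here (gcs_lifecycle, gcs_lifecycle_enable(),
gcs_lifecycle_disable()) are hypothetical, and treating a repeated
enable on an already enabled thread as success is an assumption of the
model rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state for the model. */
struct gcs_lifecycle {
	bool allocated;		/* a GCS exists for the thread */
	bool enabled;		/* GCS is currently active */
};

/* Enable succeeds only if no GCS has been disabled previously. */
bool gcs_lifecycle_enable(struct gcs_lifecycle *t)
{
	if (t->enabled)
		return true;	/* assumption: no-op on repeat enable */
	if (t->allocated)
		return false;	/* was disabled earlier, refuse reenable */

	t->allocated = true;
	t->enabled = true;
	return true;
}

/* Disable turns GCS off but deliberately keeps the allocation. */
void gcs_lifecycle_disable(struct gcs_lifecycle *t)
{
	t->enabled = false;
}

void gcs_lifecycle_selfcheck(void)
{
	struct gcs_lifecycle t = { false, false };

	assert(gcs_lifecycle_enable(&t));
	gcs_lifecycle_disable(&t);
	assert(t.allocated);			/* still walkable */
	assert(!gcs_lifecycle_enable(&t));	/* reenable refused */
}
```

Keeping the allocation after disable is what lets anything that was
walking the GCS finish safely.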

x86 uses an arch_prctl() to manage enable and disable.  Since only x86
and S/390 use arch_prctl(), a generic prctl() was proposed[1] as part of
a patch set for the equivalent RISC-V Zicfiss feature.  I initially
adopted this fairly directly, but it has been revised quite a bit
following review feedback.

We currently maintain the x86 pattern of implicitly allocating a shadow
stack for threads started with shadow stack enabled.  There has been some
discussion of removing this support and requiring the use of clone3()
with explicit allocation of shadow stacks instead.  I have no strong
feelings either way: implicit allocation is not really consistent with
anything else we do and creates the potential for errors around thread
exit, but on the other hand it is existing ABI on x86 and minimises the
changes needed in userspace code.

glibc and bionic changes using this ABI have been implemented and
tested.  Headless Android systems have been validated and Ross Burton
has used this code to bring up a Yocto system with GCS enabled as
standard.  A test implementation of V8 support has also been done.

There is an open issue with support for CRIU.  On x86 this required the
ability to set the GCS mode via ptrace.  This series supports
configuring mode bits other than enable/disable via ptrace but it needs
to be confirmed if this is sufficient.

It is likely that we could relax some of the barriers added here with
some more targeted placements; this is left for further study.

There is an in-progress series adding clone3() support for shadow stacks:

   https://lore.kernel.org/r/20240819-clone3-shadow-stack-v9-0-962d74f99464@kernel.org

Previous versions of this series depended on that; this dependency has
been removed in order to make merging easier.

[1] https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/

Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v11:
- Remove the dependency on the addition of clone3() support for shadow
  stacks, rebasing onto v6.11-rc3.
- Make ID_AA64PFR1_EL1.GCS writeable in KVM.
- Hide GCS registers when GCS is not enabled for KVM guests.
- Require HCRX_EL2.GCSEn if booting at EL1.
- Require that GCSCR_EL1 and GCSCRE0_EL1 be initialised regardless of
  if we boot at EL2 or EL1.
- Remove some stray use of bit 63 in signal cap tokens.
- Warn if we see a GCS with VM_SHARED.
- Remove redundant check for VM_WRITE in fault handling.
- Cleanups and clarifications in the ABI document.
- Clean up and improve documentation of some sync placement.
- Only set the EL0 GCS mode if it's actually changed.
- Various minor fixes and tweaks.
- Link to v10: https://lore.kernel.org/r/20240801-arm64-gcs-v10-0-699e2bd2190b@kernel.org

Changes in v10:
- Fix issues with THP.
- Tighten up requirements for initialising GCSCR*.
- Only generate GCS signal frames for threads using GCS.
- Only context switch EL1 GCS registers if S1PIE is enabled.
- Move context switch of GCSCRE0_EL1 to EL0 context switch.
- Make GCS registers unconditionally visible to userspace.
- Use FHU infrastructure.
- Don't change writability of ID_AA64PFR1_EL1 for KVM.
- Remove unused arguments from alloc_gcs().
- Typo fixes.
- Link to v9: https://lore.kernel.org/r/20240625-arm64-gcs-v9-0-0f634469b8f0@kernel.org

Changes in v9:
- Rebase onto v6.10-rc3.
- Restructure and clarify memory management fault handling.
- Fix up basic-gcs for the latest clone3() changes.
- Convert to newly merged KVM ID register based feature configuration.
- Fixes for NV traps.
- Link to v8: https://lore.kernel.org/r/20240203-arm64-gcs-v8-0-c9fec77673ef@kernel.org

Changes in v8:
- Invalidate signal cap token on stack when consuming.
- Typo and other trivial fixes.
- Don't try to use process_vm_write() on GCS, it intentionally does not
  work.
- Fix leak of thread GCSs.
- Rebase onto latest clone3() series.
- Link to v7: https://lore.kernel.org/r/20231122-arm64-gcs-v7-0-201c483bd775@kernel.org

Changes in v7:
- Rebase onto v6.7-rc2 via the clone3() patch series.
- Change the token used to cap the stack during signal handling to be
  compatible with GCSPOPM.
- Fix flags for new page types.
- Fold in support for clone3().
- Replace copy_to_user_gcs() with put_user_gcs().
- Link to v6: https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org

Changes in v6:
- Rebase onto v6.6-rc3.
- Add some more gcsb_dsync() barriers following spec clarifications.
- Due to ongoing discussion around clone()/clone3() I've not updated
  anything there, the behaviour is the same as on previous versions.
- Link to v5: https://lore.kernel.org/r/20230822-arm64-gcs-v5-0-9ef181dd6324@kernel.org

Changes in v5:
- Don't map any permissions for user GCSs, we always use EL0 accessors
  or use a separate mapping of the page.
- Reduce the standard size of the GCS to RLIMIT_STACK/2.
- Enforce a PAGE_SIZE alignment requirement on map_shadow_stack().
- Clarifications and fixes to documentation.
- More tests.
- Link to v4: https://lore.kernel.org/r/20230807-arm64-gcs-v4-0-68cfa37f9069@kernel.org

Changes in v4:
- Implement flags for map_shadow_stack() allowing the cap and end of
  stack marker to be enabled independently or not at all.
- Relax size and alignment requirements for map_shadow_stack().
- Add more blurb explaining the advantages of hardware enforcement.
- Link to v3: https://lore.kernel.org/r/20230731-arm64-gcs-v3-0-cddf9f980d98@kernel.org

Changes in v3:
- Rebase onto v6.5-rc4.
- Add a GCS barrier on context switch.
- Add a GCS stress test.
- Link to v2: https://lore.kernel.org/r/20230724-arm64-gcs-v2-0-dc2c1d44c2eb@kernel.org

Changes in v2:
- Rebase onto v6.5-rc3.
- Rework prctl() interface to allow each bit to be locked independently.
- map_shadow_stack() now places the cap token based on the size
  requested by the caller not the actual space allocated.
- Mode changes other than enable via ptrace are now supported.
- Expand test coverage.
- Various smaller fixes and adjustments.
- Link to v1: https://lore.kernel.org/r/20230716-arm64-gcs-v1-0-bf567f93bba6@kernel.org

---
Mark Brown (39):
      mm: Introduce ARCH_HAS_USER_SHADOW_STACK
      arm64/mm: Restructure arch_validate_flags() for extensibility
      prctl: arch-agnostic prctl for shadow stack
      mman: Add map_shadow_stack() flags
      arm64: Document boot requirements for Guarded Control Stacks
      arm64/gcs: Document the ABI for Guarded Control Stacks
      arm64/sysreg: Add definitions for architected GCS caps
      arm64/gcs: Add manual encodings of GCS instructions
      arm64/gcs: Provide put_user_gcs()
      arm64/gcs: Provide basic EL2 setup to allow GCS usage at EL0 and EL1
      arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)
      arm64/mm: Allocate PIE slots for EL0 guarded control stack
      mm: Define VM_SHADOW_STACK for arm64 when we support GCS
      arm64/mm: Map pages for guarded control stack
      KVM: arm64: Manage GCS access and registers for guests
      arm64/idreg: Add overrride for GCS
      arm64/hwcap: Add hwcap for GCS
      arm64/traps: Handle GCS exceptions
      arm64/mm: Handle GCS data aborts
      arm64/gcs: Context switch GCS state for EL0
      arm64/gcs: Ensure that new threads have a GCS
      arm64/gcs: Implement shadow stack prctl() interface
      arm64/mm: Implement map_shadow_stack()
      arm64/signal: Set up and restore the GCS context for signal handlers
      arm64/signal: Expose GCS state in signal frames
      arm64/ptrace: Expose GCS via ptrace and core files
      arm64: Add Kconfig for Guarded Control Stack (GCS)
      kselftest/arm64: Verify the GCS hwcap
      kselftest/arm64: Add GCS as a detected feature in the signal tests
      kselftest/arm64: Add framework support for GCS to signal handling tests
      kselftest/arm64: Allow signals tests to specify an expected si_code
      kselftest/arm64: Always run signals tests with GCS enabled
      kselftest/arm64: Add very basic GCS test program
      kselftest/arm64: Add a GCS test program built with the system libc
      kselftest/arm64: Add test coverage for GCS mode locking
      kselftest/arm64: Add GCS signal tests
      kselftest/arm64: Add a GCS stress test
      kselftest/arm64: Enable GCS for the FP stress tests
      KVM: selftests: arm64: Add GCS registers to get-reg-list

 Documentation/admin-guide/kernel-parameters.txt    |   3 +
 Documentation/arch/arm64/booting.rst               |  32 +
 Documentation/arch/arm64/elf_hwcaps.rst            |   2 +
 Documentation/arch/arm64/gcs.rst                   | 230 +++++++
 Documentation/arch/arm64/index.rst                 |   1 +
 Documentation/filesystems/proc.rst                 |   2 +-
 arch/arm64/Kconfig                                 |  20 +
 arch/arm64/include/asm/cpufeature.h                |   6 +
 arch/arm64/include/asm/el2_setup.h                 |  29 +
 arch/arm64/include/asm/esr.h                       |  28 +-
 arch/arm64/include/asm/exception.h                 |   2 +
 arch/arm64/include/asm/gcs.h                       | 107 +++
 arch/arm64/include/asm/hwcap.h                     |   1 +
 arch/arm64/include/asm/kvm_host.h                  |  12 +
 arch/arm64/include/asm/mman.h                      |  23 +-
 arch/arm64/include/asm/pgtable-prot.h              |  14 +-
 arch/arm64/include/asm/processor.h                 |   7 +
 arch/arm64/include/asm/sysreg.h                    |  20 +
 arch/arm64/include/asm/uaccess.h                   |  40 ++
 arch/arm64/include/asm/vncr_mapping.h              |   2 +
 arch/arm64/include/uapi/asm/hwcap.h                |   1 +
 arch/arm64/include/uapi/asm/ptrace.h               |   8 +
 arch/arm64/include/uapi/asm/sigcontext.h           |   9 +
 arch/arm64/kernel/cpufeature.c                     |  12 +
 arch/arm64/kernel/cpuinfo.c                        |   1 +
 arch/arm64/kernel/entry-common.c                   |  23 +
 arch/arm64/kernel/pi/idreg-override.c              |   2 +
 arch/arm64/kernel/process.c                        |  88 +++
 arch/arm64/kernel/ptrace.c                         |  54 ++
 arch/arm64/kernel/signal.c                         | 225 ++++++-
 arch/arm64/kernel/traps.c                          |  11 +
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h         |  49 +-
 arch/arm64/kvm/sys_regs.c                          |  27 +-
 arch/arm64/mm/Makefile                             |   1 +
 arch/arm64/mm/fault.c                              |  40 ++
 arch/arm64/mm/gcs.c                                | 252 +++++++
 arch/arm64/mm/mmap.c                               |  10 +-
 arch/arm64/tools/cpucaps                           |   1 +
 arch/x86/Kconfig                                   |   1 +
 arch/x86/include/uapi/asm/mman.h                   |   3 -
 fs/proc/task_mmu.c                                 |   2 +-
 include/linux/mm.h                                 |  18 +-
 include/uapi/asm-generic/mman.h                    |   4 +
 include/uapi/linux/elf.h                           |   1 +
 include/uapi/linux/prctl.h                         |  22 +
 kernel/sys.c                                       |  30 +
 mm/Kconfig                                         |   6 +
 tools/testing/selftests/arm64/Makefile             |   2 +-
 tools/testing/selftests/arm64/abi/hwcap.c          |  19 +
 tools/testing/selftests/arm64/fp/assembler.h       |  15 +
 tools/testing/selftests/arm64/fp/fpsimd-test.S     |   2 +
 tools/testing/selftests/arm64/fp/sve-test.S        |   2 +
 tools/testing/selftests/arm64/fp/za-test.S         |   2 +
 tools/testing/selftests/arm64/fp/zt-test.S         |   2 +
 tools/testing/selftests/arm64/gcs/.gitignore       |   5 +
 tools/testing/selftests/arm64/gcs/Makefile         |  24 +
 tools/testing/selftests/arm64/gcs/asm-offsets.h    |   0
 tools/testing/selftests/arm64/gcs/basic-gcs.c      | 357 ++++++++++
 tools/testing/selftests/arm64/gcs/gcs-locking.c    | 200 ++++++
 .../selftests/arm64/gcs/gcs-stress-thread.S        | 311 +++++++++
 tools/testing/selftests/arm64/gcs/gcs-stress.c     | 530 +++++++++++++++
 tools/testing/selftests/arm64/gcs/gcs-util.h       | 100 +++
 tools/testing/selftests/arm64/gcs/libc-gcs.c       | 728 +++++++++++++++++++++
 tools/testing/selftests/arm64/signal/.gitignore    |   1 +
 .../testing/selftests/arm64/signal/test_signals.c  |  17 +-
 .../testing/selftests/arm64/signal/test_signals.h  |   6 +
 .../selftests/arm64/signal/test_signals_utils.c    |  32 +-
 .../selftests/arm64/signal/test_signals_utils.h    |  39 ++
 .../arm64/signal/testcases/gcs_exception_fault.c   |  62 ++
 .../selftests/arm64/signal/testcases/gcs_frame.c   |  88 +++
 .../arm64/signal/testcases/gcs_write_fault.c       |  67 ++
 .../selftests/arm64/signal/testcases/testcases.c   |   7 +
 .../selftests/arm64/signal/testcases/testcases.h   |   1 +
 tools/testing/selftests/kvm/aarch64/get-reg-list.c |  28 +
 74 files changed, 4086 insertions(+), 43 deletions(-)
---
base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
change-id: 20230303-arm64-gcs-e311ab0d8729

Best regards,

Comments

Catalin Marinas Aug. 22, 2024, 8:58 a.m. UTC | #1
On Thu, Aug 22, 2024 at 02:15:08AM +0100, Mark Brown wrote:
> FEAT_GCS introduces a number of new system registers, we require that
> access to these registers is not trapped when we identify that the feature
> is present.  There is also a HCRX_EL2 control to make GCS operations
> functional.
> 
> Since if GCS is enabled any function call instruction will cause a fault
> we also require that the feature be specifically disabled, existing
> kernels implicitly have this requirement and especially given that the
> MMU must be disabled it is difficult to see a situation where leaving
> GCS enabled would be reasonable.
> 
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas Aug. 22, 2024, 10:14 a.m. UTC | #2
On Thu, Aug 22, 2024 at 02:15:16AM +0100, Mark Brown wrote:
> Use VM_HIGH_ARCH_5 for guarded control stack pages.
> 
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas Aug. 22, 2024, 11:31 a.m. UTC | #3
On Thu, Aug 22, 2024 at 02:15:20AM +0100, Mark Brown wrote:
> Provide a hwcap to enable userspace to detect support for GCS.
> 
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas Aug. 22, 2024, 11:31 a.m. UTC | #4
On Thu, Aug 22, 2024 at 02:15:21AM +0100, Mark Brown wrote:
> A new exception code is defined for GCS specific faults other than
> standard load/store faults, for example GCS token validation failures,
> add handling for this. These faults are reported to userspace as
> segfaults with code SEGV_CPERR (protection error), mirroring the
> reporting for x86 shadow stack errors.
> 
> GCS faults due to memory load/store operations generate data aborts with
> a flag set, these will be handled separately as part of the data abort
> handling.
> 
> Since we do not currently enable GCS for EL1 we should not get any faults
> there but while we're at it we wire things up there, treating any GCS
> fault as fatal.
> 
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas Aug. 22, 2024, 4:12 p.m. UTC | #5
On Thu, Aug 22, 2024 at 02:15:22AM +0100, Mark Brown wrote:
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 451ba7cbd5ad..3ada31c2ac12 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -486,6 +486,14 @@ static void do_bad_area(unsigned long far, unsigned long esr,
>  	}
>  }
>  
> +static bool is_gcs_fault(unsigned long esr)
> +{
> +	if (!esr_is_data_abort(esr))
> +		return false;
> +
> +	return ESR_ELx_ISS2(esr) & ESR_ELx_GCS;
> +}
> +
>  static bool is_el0_instruction_abort(unsigned long esr)
>  {
>  	return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
> @@ -500,6 +508,23 @@ static bool is_write_abort(unsigned long esr)
>  	return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM);
>  }
>  
> +static bool is_invalid_gcs_access(struct vm_area_struct *vma, u64 esr)
> +{
> +	if (!system_supports_gcs())
> +		return false;
> +
> +	if (unlikely(is_gcs_fault(esr))) {
> +		/* GCS accesses must be performed on a GCS page */
> +		if (!(vma->vm_flags & VM_SHADOW_STACK))
> +			return true;

This first check covers the GCSPOPM/RET etc. permission faults on
non-GCS vmas. It looks correct.

> +	} else if (unlikely(vma->vm_flags & VM_SHADOW_STACK)) {
> +		/* Only GCS operations can write to a GCS page */
> +		return is_write_abort(esr);
> +	}

I don't think that's right. The ESR on this path may not even indicate a
data abort and ESR.WnR bit check wouldn't make sense.

I presume we want to avoid an infinite loop on a (writeable) GCS page
when the user does a normal STR but the CPU raises a permission fault. I
think this function needs to just return false if !esr_is_data_abort().

> +
> +	return false;
> +}
> +
>  static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  				   struct pt_regs *regs)
>  {
> @@ -535,6 +560,14 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		/* It was exec fault */
>  		vm_flags = VM_EXEC;
>  		mm_flags |= FAULT_FLAG_INSTRUCTION;
> +	} else if (is_gcs_fault(esr)) {
> +		/*
> +		 * The GCS permission on a page implies both read and
> +		 * write so always handle any GCS fault as a write fault,
> +		 * we need to trigger CoW even for GCS reads.
> +		 */
> +		vm_flags = VM_WRITE;
> +		mm_flags |= FAULT_FLAG_WRITE;
>  	} else if (is_write_abort(esr)) {
>  		/* It was write fault */
>  		vm_flags = VM_WRITE;
> @@ -568,6 +601,13 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	if (!vma)
>  		goto lock_mmap;
>  
> +	if (is_invalid_gcs_access(vma, esr)) {
> +		vma_end_read(vma);
> +		fault = 0;
> +		si_code = SEGV_ACCERR;
> +		goto bad_area;
> +	}

Here there's a risk that the above function returns true for some
unrelated fault that happens to have bit 6 in ESR set.
Catalin Marinas Aug. 22, 2024, 4:15 p.m. UTC | #6
On Thu, Aug 22, 2024 at 02:15:23AM +0100, Mark Brown wrote:
> There are two registers controlling the GCS state of EL0, GCSPR_EL0 which
> is the current GCS pointer and GCSCRE0_EL1 which has enable bits for the
> specific GCS functionality enabled for EL0. Manage these on context switch
> and process lifetime events, GCS is reset on exec().  Also ensure that
> any changes to the GCS memory are visible to other PEs and that changes
> from other PEs are visible on this one by issuing a GCSB DSYNC when
> moving to or from a thread with GCS.
> 
> Since the current GCS configuration of a thread will be visible to
> userspace we store the configuration in the format used with userspace
> and provide a helper which configures the system register as needed.
> 
> On systems that support GCS we always allow access to GCSPR_EL0, this
> facilitates reporting of GCS faults if userspace implements disabling of
> GCS on error - the GCS can still be discovered and examined even if GCS
> has been disabled.
> 
> Signed-off-by: Mark Brown <broonie@kernel.org>

We could do with a bit more code comments around GCSB DSYNC but
otherwise it looks fine now.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Mark Brown Aug. 22, 2024, 4:31 p.m. UTC | #7
On Thu, Aug 22, 2024 at 04:44:12PM +0100, Catalin Marinas wrote:
> On Thu, Aug 22, 2024 at 02:15:21AM +0100, Mark Brown wrote:

> > +void do_el0_gcs(struct pt_regs *regs, unsigned long esr)
> > +{
> > +	force_signal_inject(SIGSEGV, SEGV_CPERR, regs->pc, 0);
> > +}

> Just double checking: a GCSPOPM (for example, it can be a RET) from a
> non-GCS page would generate a classic permission fault with ISS2.GCS set
> rather than a GCS exception. That's my reading from the Arm ARM
> pseudocode, the text isn't clear to me.

Yes, we only generate GCS exceptions on checking values that have
successfully been loaded from memory or other GCS logic errors - memory
accesses generate data aborts.
Mark Brown Aug. 22, 2024, 4:44 p.m. UTC | #8
On Thu, Aug 22, 2024 at 05:12:30PM +0100, Catalin Marinas wrote:
> On Thu, Aug 22, 2024 at 02:15:22AM +0100, Mark Brown wrote:

> > +static bool is_invalid_gcs_access(struct vm_area_struct *vma, u64 esr)

> > +	} else if (unlikely(vma->vm_flags & VM_SHADOW_STACK)) {
> > +		/* Only GCS operations can write to a GCS page */
> > +		return is_write_abort(esr);
> > +	}

> I don't think that's right. The ESR on this path may not even indicate a
> data abort and ESR.WnR bit check wouldn't make sense.

> I presume we want to avoid an infinite loop on a (writeable) GCS page
> when the user does a normal STR but the CPU raises a permission fault. I
> think this function needs to just return false if !esr_is_data_abort().

Yes, that should check for a data abort.  I think I'd formed the
impression that is_write_abort() included that check somehow.  As you
say it's to avoid spinning trying to resolve a permission fault for a
write (non-GCS reads to a GCS page are valid), I do think we need the 
is_write_abort() since non-GCS reads are valid so something like:

	if (!esr_is_data_abort(esr))
		return false;

	return is_write_abort(esr);
Catalin Marinas Aug. 22, 2024, 5:19 p.m. UTC | #9
On Thu, Aug 22, 2024 at 05:44:19PM +0100, Mark Brown wrote:
> On Thu, Aug 22, 2024 at 05:12:30PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 22, 2024 at 02:15:22AM +0100, Mark Brown wrote:
> 
> > > +static bool is_invalid_gcs_access(struct vm_area_struct *vma, u64 esr)
> 
> > > +	} else if (unlikely(vma->vm_flags & VM_SHADOW_STACK)) {
> > > +		/* Only GCS operations can write to a GCS page */
> > > +		return is_write_abort(esr);
> > > +	}
> 
> > I don't think that's right. The ESR on this path may not even indicate a
> > data abort and ESR.WnR bit check wouldn't make sense.
> 
> > I presume we want to avoid an infinite loop on a (writeable) GCS page
> > when the user does a normal STR but the CPU raises a permission fault. I
> > think this function needs to just return false if !esr_is_data_abort().
> 
> Yes, that should check for a data abort.  I think I'd formed the
> impression that is_write_abort() included that check somehow.  As you
> say it's to avoid spinning trying to resolve a permission fault for a
> write (non-GCS reads to a GCS page are valid), I do think we need the 
> is_write_abort() since non-GCS reads are valid so something like:
> 
> 	if (!esr_is_data_abort(esr))
> 		return false;
> 
> 	return is_write_abort(esr);

We do need the write abort check but not unconditionally, only if to a
GCS page (you can have other genuine write aborts).
Mark Brown Aug. 22, 2024, 5:30 p.m. UTC | #10
On Thu, Aug 22, 2024 at 06:19:38PM +0100, Catalin Marinas wrote:
> On Thu, Aug 22, 2024 at 05:44:19PM +0100, Mark Brown wrote:
> > On Thu, Aug 22, 2024 at 05:12:30PM +0100, Catalin Marinas wrote:
> > > On Thu, Aug 22, 2024 at 02:15:22AM +0100, Mark Brown wrote:
> > 
> > > > +static bool is_invalid_gcs_access(struct vm_area_struct *vma, u64 esr)
> > 
> > > > +	} else if (unlikely(vma->vm_flags & VM_SHADOW_STACK)) {
> > > > +		/* Only GCS operations can write to a GCS page */
> > > > +		return is_write_abort(esr);
> > > > +	}

> > Yes, that should check for a data abort.  I think I'd formed the
> > impression that is_write_abort() included that check somehow.  As you
> > say it's to avoid spinning trying to resolve a permission fault for a
> > write (non-GCS reads to a GCS page are valid), I do think we need the 
> > is_write_abort() since non-GCS reads are valid so something like:
> > 
> > 	if (!esr_is_data_abort(esr))
> > 		return false;
> > 
> > 	return is_write_abort(esr);
> 
> We do need the write abort check but not unconditionally, only if to a
> GCS page (you can have other genuine write aborts).

That was to replace the checks in the above case, not the function as a
whole.
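
For reference, the logic agreed in this subthread can be modelled
outside the kernel as below.  This is a sketch, not the code that will
be posted: struct fault_model and model_is_invalid_gcs_access() are
hypothetical stand-ins for the ESR decoding helpers and for
is_invalid_gcs_access(), and the system_supports_gcs() check is left
out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the ESR fields discussed. */
struct fault_model {
	bool data_abort;	/* esr_is_data_abort() would be true */
	bool gcs_op;		/* GCS flag set in ISS2 */
	bool write;		/* is_write_abort() would be true */
};

bool model_is_invalid_gcs_access(bool vma_is_gcs,
				 const struct fault_model *f)
{
	/* Only data aborts carry meaningful GCS and WnR information. */
	if (!f->data_abort)
		return false;

	/* GCS accesses must be performed on a GCS page. */
	if (f->gcs_op)
		return !vma_is_gcs;

	/* Only GCS operations can write to a GCS page; reads are fine. */
	if (vma_is_gcs)
		return f->write;

	/* Ordinary access to an ordinary page: handled elsewhere. */
	return false;
}

void model_selfcheck(void)
{
	struct fault_model str = { .data_abort = true, .write = true };
	struct fault_model ldr = { .data_abort = true };
	struct fault_model insn = { .write = true };	/* not a data abort */

	assert(model_is_invalid_gcs_access(true, &str));   /* STR to GCS */
	assert(!model_is_invalid_gcs_access(true, &ldr));  /* LDR from GCS */
	assert(!model_is_invalid_gcs_access(false, &str)); /* genuine write */
	assert(!model_is_invalid_gcs_access(true, &insn)); /* stray bit 6 */
}
```

The last case is the one raised in review: without the data abort
check, an unrelated fault with the WnR bit position set would be
misclassified on a GCS page.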
Catalin Marinas Aug. 23, 2024, 9:41 a.m. UTC | #11
On Thu, Aug 22, 2024 at 02:15:29AM +0100, Mark Brown wrote:
> Provide a new register type NT_ARM_GCS reporting the current GCS mode
> and pointer for EL0.  Due to the interactions with allocation and
> deallocation of Guarded Control Stacks we do not permit any changes to
> the GCS mode via ptrace, only GCSPR_EL0 may be changed.
> 
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>