mbox series

[00/14] arm64: Preparatory FPSIMD/SVE/SME fixes

Message ID 20250404174435.3288106-1-mark.rutland@arm.com (mailing list archive)
Headers show
Series arm64: Preparatory FPSIMD/SVE/SME fixes | expand

Message

Mark Rutland April 4, 2025, 5:44 p.m. UTC
Hi,

These patches fix a number of problems in the FPSIMD/SVE/SME code, as a
step toeards re-enabling SME support. Additional fixes/changes will be
necessary before we can re-enable SME support. I intend to follow up
with more patches in the near future.

I'm hoping these patches as-is are largely uncontroversial, though I'm
afraid they've only seen light testing so far, so any testing would be
much appreciated.

The series is based on v6.14, and I've pushed the series to the
'arm64-fpsimd-fixes-20250404' branch of my kernel.org git repo:

  git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/
  
As an advance warning regarding future patches, there are a few issues
in particular that'll require some more work, and several ABI concerns
that I'll need some opinions on. In particular:

* The user/kernel ABI for signal entry has never been compatible with
  the AAPCS64 lazy ZA saving scheme [1] as deployed by GCC, LLVM, and
  GLIBC. Currently signal handlers can be entered with PSTATE.ZA==0 and
  TPIDR2_EL0 being non-null, which is not a valid state per AAPCS64, and
  TPIDR2_EL0 being non-null implies that a lazy ZA save needs to occur.
  This ends up resulting in unexpected SIGILLs or data corruption during
  an attempts to lazily save ZA, which happen in some standard C library
  functions.

  It is very likely that we'll need to make *some* changes to the signal
  entry ABI here. The obvious/simple fix would be to clear TPIDR2_EL0 at
  signal entry, but this poses other problems for userspace (e.g. where
  longjmp() jumps out of a signal handler), and toolchain folk have
  asked whether it'd be possible to leave PSTATE.ZA enabled when
  entering signal handlers.

* The ptrace ABI for SME is written around the assumption that
  SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are separate bit flags, whereas
  in reality these are different values (0 and 1 respectively) for
  bit 0 of the user_sve_header flags.

  I haven't yet worked through all of the implications of this, but
  AFAICT we can't reliably indicate the value of PSTATE.SM, and
  userspace cannot reliably save/restore a task's state in all cases.

[1] https://github.com/ARM-software/abi-aa/blob/a82eef0433556b30539c0d4463768d9feb8cfd0b/aapcs64/aapcs64.rst#the-za-lazy-saving-scheme

Mark.

Mark Brown (2):
  arm64/fpsimd: Discard stale CPU state when handling SME traps
  arm64/fpsimd: Don't corrupt FPMR when streaming mode changes

Mark Rutland (12):
  arm64/fpsimd: Avoid RES0 bits in the SME trap handler
  arm64/fpsimd: Remove unused fpsimd_force_sync_to_sve()
  arm64/fpsimd: Remove redundant clearing of TIF_SVE
  arm64/fpsimd: Remove redundant SVE trap manipulation
  arm64/fpsimd: Remove opportunistic freeing of SME state
  arm64/fpsimd: Avoid clobbering kernel FPSIMD state with SMSTOP
  arm64/fpsimd: Reset FPMR upon exec()
  arm64/fpsimd: Fix merging of FPSIMD state during signal return
  arm64/fpsimd: Add fpsimd_save_and_flush_current_state()
  arm64/fpsimd: signal32: Always save+flush state early
  arm64/fpsimd: signal: Always save+flush state early
  arm64/fpsimd: signal: Simplify preserve_tpidr2_context()

 arch/arm64/include/asm/esr.h    | 14 ++---
 arch/arm64/include/asm/fpsimd.h |  2 +-
 arch/arm64/kernel/fpsimd.c      | 95 ++++++++++-----------------------
 arch/arm64/kernel/ptrace.c      |  2 -
 arch/arm64/kernel/signal.c      | 75 ++++++--------------------
 arch/arm64/kernel/signal32.c    | 11 ++--
 6 files changed, 60 insertions(+), 139 deletions(-)

Comments

Mark Brown April 4, 2025, 6:06 p.m. UTC | #1
On Fri, Apr 04, 2025 at 06:44:21PM +0100, Mark Rutland wrote:

> * The ptrace ABI for SME is written around the assumption that
>   SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are separate bit flags, whereas
>   in reality these are different values (0 and 1 respectively) for
>   bit 0 of the user_sve_header flags.

The documentation definitely doesn't reflect what actually got
implemented, IIRC the errors are more around returning the header only
rather than the assumption that they're separate flags but at this point
it doesn't really matter given that the documentation is clearly wrong.

>   I haven't yet worked through all of the implications of this, but
>   AFAICT we can't reliably indicate the value of PSTATE.SM, and
>   userspace cannot reliably save/restore a task's state in all cases.

Userspace should be able to read by looking the SSVE registers, if you
get SVE format register state back you're in streaming mode.  Streaming
mode state will always be saved in SVE format so a read of streaming SVE
vector state will come out with the SVE bit set, if we're not in
streaming mode it won't be set.

For writes writing to the SVE register set will disable streaming mode,
writes to the streaming SVE register set will enable it.