mbox series

[00/14] arm64: Preparatory FPSIMD/SVE/SME fixes

Message ID 20250404174435.3288106-1-mark.rutland@arm.com (mailing list archive)
Headers show
Series arm64: Preparatory FPSIMD/SVE/SME fixes | expand

Message

Mark Rutland April 4, 2025, 5:44 p.m. UTC
Hi,

These patches fix a number of problems in the FPSIMD/SVE/SME code, as a
step toeards re-enabling SME support. Additional fixes/changes will be
necessary before we can re-enable SME support. I intend to follow up
with more patches in the near future.

I'm hoping these patches as-is are largely uncontroversial, though I'm
afraid they've only seen light testing so far, so any testing would be
much appreciated.

The series is based on v6.14, and I've pushed the series to the
'arm64-fpsimd-fixes-20250404' branch of my kernel.org git repo:

  git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/
  
As an advance warning regarding future patches, there are a few issues
in particular that'll require some more work, and several ABI concerns
that I'll need some opinions on. In particular:

* The user/kernel ABI for signal entry has never been compatible with
  the AAPCS64 lazy ZA saving scheme [1] as deployed by GCC, LLVM, and
  GLIBC. Currently signal handlers can be entered with PSTATE.ZA==0 and
  TPIDR2_EL0 being non-null, which is not a valid state per AAPCS64, and
  TPIDR2_EL0 being non-null implies that a lazy ZA save needs to occur.
  This ends up resulting in unexpected SIGILLs or data corruption during
  an attempts to lazily save ZA, which happen in some standard C library
  functions.

  It is very likely that we'll need to make *some* changes to the signal
  entry ABI here. The obvious/simple fix would be to clear TPIDR2_EL0 at
  signal entry, but this poses other problems for userspace (e.g. where
  longjmp() jumps out of a signal handler), and toolchain folk have
  asked whether it'd be possible to leave PSTATE.ZA enabled when
  entering signal handlers.

* The ptrace ABI for SME is written around the assumption that
  SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are separate bit flags, whereas
  in reality these are different values (0 and 1 respectively) for
  bit 0 of the user_sve_header flags.

  I haven't yet worked through all of the implications of this, but
  AFAICT we can't reliably indicate the value of PSTATE.SM, and
  userspace cannot reliably save/restore a task's state in all cases.

[1] https://github.com/ARM-software/abi-aa/blob/a82eef0433556b30539c0d4463768d9feb8cfd0b/aapcs64/aapcs64.rst#the-za-lazy-saving-scheme

Mark.

Mark Brown (2):
  arm64/fpsimd: Discard stale CPU state when handling SME traps
  arm64/fpsimd: Don't corrupt FPMR when streaming mode changes

Mark Rutland (12):
  arm64/fpsimd: Avoid RES0 bits in the SME trap handler
  arm64/fpsimd: Remove unused fpsimd_force_sync_to_sve()
  arm64/fpsimd: Remove redundant clearing of TIF_SVE
  arm64/fpsimd: Remove redundant SVE trap manipulation
  arm64/fpsimd: Remove opportunistic freeing of SME state
  arm64/fpsimd: Avoid clobbering kernel FPSIMD state with SMSTOP
  arm64/fpsimd: Reset FPMR upon exec()
  arm64/fpsimd: Fix merging of FPSIMD state during signal return
  arm64/fpsimd: Add fpsimd_save_and_flush_current_state()
  arm64/fpsimd: signal32: Always save+flush state early
  arm64/fpsimd: signal: Always save+flush state early
  arm64/fpsimd: signal: Simplify preserve_tpidr2_context()

 arch/arm64/include/asm/esr.h    | 14 ++---
 arch/arm64/include/asm/fpsimd.h |  2 +-
 arch/arm64/kernel/fpsimd.c      | 95 ++++++++++-----------------------
 arch/arm64/kernel/ptrace.c      |  2 -
 arch/arm64/kernel/signal.c      | 75 ++++++--------------------
 arch/arm64/kernel/signal32.c    | 11 ++--
 6 files changed, 60 insertions(+), 139 deletions(-)

Comments

Mark Brown April 4, 2025, 6:06 p.m. UTC | #1
On Fri, Apr 04, 2025 at 06:44:21PM +0100, Mark Rutland wrote:

> * The ptrace ABI for SME is written around the assumption that
>   SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are separate bit flags, whereas
>   in reality these are different values (0 and 1 respectively) for
>   bit 0 of the user_sve_header flags.

The documentation definitely doesn't reflect what actually got
implemented, IIRC the errors are more around returning the header only
rather than the assumption that they're separate flags but at this point
it doesn't really matter given that the documentation is clearly wrong.

>   I haven't yet worked through all of the implications of this, but
>   AFAICT we can't reliably indicate the value of PSTATE.SM, and
>   userspace cannot reliably save/restore a task's state in all cases.

Userspace should be able to read by looking the SSVE registers, if you
get SVE format register state back you're in streaming mode.  Streaming
mode state will always be saved in SVE format so a read of streaming SVE
vector state will come out with the SVE bit set, if we're not in
streaming mode it won't be set.

For writes writing to the SVE register set will disable streaming mode,
writes to the streaming SVE register set will enable it.
Mark Rutland April 8, 2025, 4:46 p.m. UTC | #2
On Fri, Apr 04, 2025 at 07:06:50PM +0100, Mark Brown wrote:
> On Fri, Apr 04, 2025 at 06:44:21PM +0100, Mark Rutland wrote:
> 
> > * The ptrace ABI for SME is written around the assumption that
> >   SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are separate bit flags, whereas
> >   in reality these are different values (0 and 1 respectively) for
> >   bit 0 of the user_sve_header flags.
> 
> The documentation definitely doesn't reflect what actually got
> implemented, IIRC the errors are more around returning the header only
> rather than the assumption that they're separate flags but at this point
> it doesn't really matter given that the documentation is clearly wrong.

I think it matters because the kernel code was also written to that
incorrect assumption, and the reason the effective behaviour differs
from the documented behaviour is because the assumption is incorrect.
This means portions of the code are wildly misleading, and this caused
secondary problems such as providing register data when we didn't intend
to, which I assume is what you mean by "the errors are more around
returning the header only".

From a quick scan, both GDB and LLVM determine whether streaming mode is
active based on whether the NT_ARM_SSVE header indicates
SVE_PT_REGS_SVE. We can make that work, but it doesn't work in all cases
today (see below).

> >   I haven't yet worked through all of the implications of this, but
> >   AFAICT we can't reliably indicate the value of PSTATE.SM, and
> >   userspace cannot reliably save/restore a task's state in all cases.
> 
> Userspace should be able to read by looking the SSVE registers, if you
> get SVE format register state back you're in streaming mode. Streaming
> mode state will always be saved in SVE format so a read of streaming SVE
> vector state will come out with the SVE bit set, if we're not in
> streaming mode it won't be set.

Unfortunately, that's not always true today.

While fpsimd_save_user_state() will save streaming mode state into the
FP_STATE_SVE format, other code can manipulate the task into a state
where thread_sm_enabled() is true, but the task's registers are saved in
the FPSIMD format. That combination is broken for a number of reasons,
but you seemed to permit this deliberately in commit:

  bbc6172eefdb276b ("arm64/fpsimd: SME no longer requires SVE register state")

... which says:

| Now that we track the type of the stored register state separately
| to what is active in the task, it is valid to have the FPSIMD
| register state stored while in streaming mode. 

I will assume that commit message is misleading, and that we do not want
to permit that case. It's possible to get a task into a bad state
regardless of that commit.

We can fix that in a number of ways, but we need to think of the shape
of ABI we actually want, e.g. whether writes to NT_ARM_SSVE with
SVE_PT_REGS_FPSIMD should be rejected outright.

> For writes writing to the SVE register set will disable streaming mode,
> writes to the streaming SVE register set will enable it.

Yep.

Mark.
Mark Brown April 8, 2025, 8:16 p.m. UTC | #3
On Tue, Apr 08, 2025 at 05:46:31PM +0100, Mark Rutland wrote:
> On Fri, Apr 04, 2025 at 07:06:50PM +0100, Mark Brown wrote:
> > On Fri, Apr 04, 2025 at 06:44:21PM +0100, Mark Rutland wrote:

> > > * The ptrace ABI for SME is written around the assumption that
> > >   SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are separate bit flags, whereas
> > >   in reality these are different values (0 and 1 respectively) for
> > >   bit 0 of the user_sve_header flags.

> > The documentation definitely doesn't reflect what actually got
> > implemented, IIRC the errors are more around returning the header only
> > rather than the assumption that they're separate flags but at this point
> > it doesn't really matter given that the documentation is clearly wrong.

....

> This means portions of the code are wildly misleading, and this caused
> secondary problems such as providing register data when we didn't intend
> to, which I assume is what you mean by "the errors are more around
> returning the header only".

Yes, exactly.

> > >   I haven't yet worked through all of the implications of this, but
> > >   AFAICT we can't reliably indicate the value of PSTATE.SM, and
> > >   userspace cannot reliably save/restore a task's state in all cases.

> > Userspace should be able to read by looking the SSVE registers, if you
> > get SVE format register state back you're in streaming mode. Streaming
> > mode state will always be saved in SVE format so a read of streaming SVE
> > vector state will come out with the SVE bit set, if we're not in
> > streaming mode it won't be set.

> Unfortunately, that's not always true today.

> While fpsimd_save_user_state() will save streaming mode state into the
> FP_STATE_SVE format, other code can manipulate the task into a state
> where thread_sm_enabled() is true, but the task's registers are saved in
> the FPSIMD format. That combination is broken for a number of reasons,
> but you seemed to permit this deliberately in commit:

>   bbc6172eefdb276b ("arm64/fpsimd: SME no longer requires SVE register state")

> ... which says:

> | Now that we track the type of the stored register state separately
> | to what is active in the task, it is valid to have the FPSIMD
> | register state stored while in streaming mode. 

> I will assume that commit message is misleading, and that we do not want
> to permit that case. It's possible to get a task into a bad state
> regardless of that commit.

Ah, yes - that case is indeed going to break.

> We can fix that in a number of ways, but we need to think of the shape
> of ABI we actually want, e.g. whether writes to NT_ARM_SSVE with
> SVE_PT_REGS_FPSIMD should be rejected outright.

It's hard to see a use case for it so it does seem reasonable to
disallow that, it would really be nice if SVE_PT_REGS_FPSIMD weren't
part of the ABI at all.  OTOH it will be implementable with any read
side ABI and it *is* there.

I think the intended ABI where returning SVE register state from
NT_ARM_SSVE indicates streaming mode is probably a good choice given
that it is compatible with the existing userspace code you've surveyed,
will be the behaviour seen in most actual usage with the current code
and is fairly close to the documentation.

If we do that then when streaming mode is disabled the obvious options
are that we can either keep the current behaviour and return FPSIMD
register state, or change to only return the header.  The former is less
change and therefore safer, the latter is probably a nicer ABI.  GDB
appears to do the right thing if it only gets the header back.  LLDB
appears to only attempt to read the header when determining if it's in
streaming mode so shouldn't see any difference at all, though that code
is a bit less clear to me.