Message ID | 20240130-arm64-sme-resume-v1-2-0e60ebba18df@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64/sme: Fix handling of traps on resume | expand |
On Tue, Jan 30, 2024 at 12:02:49AM +0000, Mark Brown wrote: > The fields in SMCR_EL1 reset to an architecturally UNKNOWN value. Since we > do not otherwise manage the traps configured in this register at runtime we > need to reconfigure them after a suspend in case nothing else was kind > enough to preserve them for us. Do so for SMCR_EL1.EZT0. > > Fixes: d4913eee152d (arm64/sme: Add basic enumeration for SME2) > Reported-by: Jackson Cooper-Driver <Jackson.Cooper-Driver@arm.com> > Signed-off-by: Mark Brown <broonie@kernel.org> > --- > arch/arm64/kernel/fpsimd.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > index 69201208bb13..329782fe39c5 100644 > --- a/arch/arm64/kernel/fpsimd.c > +++ b/arch/arm64/kernel/fpsimd.c > @@ -1320,6 +1320,8 @@ void sme_suspend_exit(void) > > if (system_supports_fa64()) > smcr |= SMCR_ELx_FA64; Ditto comments on patch 1. Unrelated to this patch, it is worth having a prctl for this? The architecture seems to discourage software written for the SME ISA to implicitly rely on FA64, so it would useful to be able to run with it disabled even on hardware that supports it. > + if (system_supports_sme2()) > + smcr |= SMCR_ELx_EZT0; Side question: since ZT0 is likely to be sporadically used, maybe it is worth having separate lazy restore for it versus the main SME state? (Not relevant for this series though, and probably best deferred until there is hardware to benchmark on. Also, ZT0 is small compared with the SME state proper...) [...] Cheers ---Dave
On Tue, Jan 30, 2024 at 10:54:06AM +0000, Dave Martin wrote: > On Tue, Jan 30, 2024 at 12:02:49AM +0000, Mark Brown wrote: > > + if (system_supports_sme2()) > > + smcr |= SMCR_ELx_EZT0; > Side question: since ZT0 is likely to be sporadically used, maybe it > is worth having separate lazy restore for it versus the main SME state? > (Not relevant for this series though, and probably best deferred until > there is hardware to benchmark on. Also, ZT0 is small compared with > the SME state proper...) One of the advantages SME has here is that we've got a clear indication if userspace is actively using the registers through SMSTART and SMSTOP. We only restore ZT0 at all whenever PSTATE.ZA is set and the strong recommendation is that should only be set when either ZA or ZT0 are in active use for power and performance reasons. While it is likely that there will be code that uses ZA but doesn't touch ZT0 I would expect that the overhead of entering the kernel to do a lazy restore will be sufficiently high for it to be an unreasonable penalty on code that does touch it, as you say it's not *that* big compared to likely ZA sizes.
On Tue, Jan 30, 2024 at 02:34:23PM +0000, Mark Brown wrote: > On Tue, Jan 30, 2024 at 10:54:06AM +0000, Dave Martin wrote: > > On Tue, Jan 30, 2024 at 12:02:49AM +0000, Mark Brown wrote: > > > > + if (system_supports_sme2()) > > > + smcr |= SMCR_ELx_EZT0; > > > Side question: since ZT0 is likely to be sporadically used, maybe it > > is worth having separate lazy restore for it versus the main SME state? > > (Not relevant for this series though, and probably best deferred until > > there is hardware to benchmark on. Also, ZT0 is small compared with > > the SME state proper...) > > One of the advantages SME has here is that we've got a clear indication > if userspace is actively using the registers through SMSTART and SMSTOP. > We only restore ZT0 at all whenever PSTATE.ZA is set and the strong Good point. I was still thinking in SVE mode there. > recommendation is that should only be set when either ZA or ZT0 are in > active use for power and performance reasons. While it is likely that > there will be code that uses ZA but doesn't touch ZT0 I would expect > that the overhead of entering the kernel to do a lazy restore will be > sufficiently high for it to be an unreasonable penalty on code that does > touch it, as you say it's not *that* big compared to likely ZA sizes. Agreed. Cheers ---Dave
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 69201208bb13..329782fe39c5 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1320,6 +1320,8 @@ void sme_suspend_exit(void) if (system_supports_fa64()) smcr |= SMCR_ELx_FA64; + if (system_supports_sme2()) + smcr |= SMCR_ELx_EZT0; write_sysreg_s(smcr, SYS_SMCR_EL1); }
The fields in SMCR_EL1 reset to an architecturally UNKNOWN value. Since we do not otherwise manage the traps configured in this register at runtime we need to reconfigure them after a suspend in case nothing else was kind enough to preserve them for us. Do so for SMCR_EL1.EZT0. Fixes: d4913eee152d (arm64/sme: Add basic enumeration for SME2) Reported-by: Jackson Cooper-Driver <Jackson.Cooper-Driver@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> --- arch/arm64/kernel/fpsimd.c | 2 ++ 1 file changed, 2 insertions(+)