Message ID | 20240406002610.37202-2-ebiggers@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Herbert Xu |
Series | crypto: x86 - add missing vzeroupper instructions |

On Fri, 2024-04-05 at 20:26 -0400, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Since nh_avx2() uses ymm registers, execute vzeroupper before returning
> > from it. This is necessary to avoid reducing the performance of SSE
> > code.
> >
> > Fixes: 0f961f9f670e ("crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305")
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> > ---
> >  arch/x86/crypto/nh-avx2-x86_64.S | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/crypto/nh-avx2-x86_64.S b/arch/x86/crypto/nh-avx2-x86_64.S
> > index ef73a3ab8726..791386d9a83a 100644
> > --- a/arch/x86/crypto/nh-avx2-x86_64.S
> > +++ b/arch/x86/crypto/nh-avx2-x86_64.S
> > @@ -152,7 +152,8 @@ SYM_TYPED_FUNC_START(nh_avx2)
> >
> >      vpaddq      T5, T4, T4
> >      vpaddq      T1, T0, T0
> >      vpaddq      T4, T0, T0
> >      vmovdqu     T0, (HASH)
> > +    vzeroupper
> >      RET
> >  SYM_FUNC_END(nh_avx2)

Acked-by: Tim Chen <tim.c.chen@linux.intel.com>

diff --git a/arch/x86/crypto/nh-avx2-x86_64.S b/arch/x86/crypto/nh-avx2-x86_64.S
index ef73a3ab8726..791386d9a83a 100644
--- a/arch/x86/crypto/nh-avx2-x86_64.S
+++ b/arch/x86/crypto/nh-avx2-x86_64.S
@@ -152,7 +152,8 @@ SYM_TYPED_FUNC_START(nh_avx2)

 	vpaddq		T5, T4, T4
 	vpaddq		T1, T0, T0
 	vpaddq		T4, T0, T0
 	vmovdqu		T0, (HASH)
+	vzeroupper
 	RET
 SYM_FUNC_END(nh_avx2)