
[v2] crypto: arm/aes-ce - work around Cortex-A57/A72 silicon errata

Message ID 20201126074907.18965-1-ardb@kernel.org (mailing list archive)
State Accepted
Delegated to: Herbert Xu
Series [v2] crypto: arm/aes-ce - work around Cortex-A57/A72 silicon errata

Commit Message

Ard Biesheuvel Nov. 26, 2020, 7:49 a.m. UTC
ARM Cortex-A57 and Cortex-A72 cores running in 32-bit mode are affected
by silicon errata #1742098 and #1655431, respectively, where the second
instruction of an AES instruction pair may execute twice if an interrupt
is taken right after the first instruction consumes an input register of
which a single 32-bit lane has been updated the last time it was modified.

This is not as rare an occurrence as it may seem: in counter mode, only
the least significant 32-bit word is incremented in the absence of a
carry, which makes our counter mode implementation susceptible to these
errata.
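
The single-lane nature of this update is easy to see in a minimal,
stand-alone C model (not part of the patch; the struct and function names
below are made up purely for illustration): with the 32-bit big-endian
counter living in the last word of the 128-bit block, an increment that
does not carry rewrites only bytes 12..15, i.e. a single 32-bit lane.

#include <stdint.h>

/*
 * Illustration only, not kernel code: a 128-bit CTR block whose last
 * word is a 32-bit big-endian counter, incremented without carrying
 * into the upper 96 bits.
 */
struct ctr_block {
	uint8_t b[16];
};

static void ctr_inc_low32(struct ctr_block *blk)
{
	uint32_t ctr = ((uint32_t)blk->b[12] << 24) |
		       ((uint32_t)blk->b[13] << 16) |
		       ((uint32_t)blk->b[14] <<  8) |
			(uint32_t)blk->b[15];

	ctr++;				/* assume no carry out of bits 0..31 */

	/* only bytes 12..15 change: one 32-bit lane of the block */
	blk->b[12] = ctr >> 24;
	blk->b[13] = ctr >> 16;
	blk->b[14] = ctr >>  8;
	blk->b[15] = ctr;
}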

So let's shuffle the counter assignments around a bit so that, by the time
the AES instruction pair executes, the most recent updates to its input
registers are 128 bits wide.

[0] ARM-EPM-049219 v23 Cortex-A57 MPCore Software Developers Errata Notice
[1] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice
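
The reworked ordering in the patch below can be modelled in C along these
lines (a sketch only; q0-q3, q7 and r6 are stand-ins for the NEON and core
registers used in the real assembly, and the helper names are invented).
The running counter block stays in q7: each output block is produced by a
whole-register copy of q7, so the last write to any register the AES
instructions actually consume is always 128 bits wide, and the single-lane
updates are confined to q7.

#include <stdint.h>

/* one 128-bit NEON register modelled as four 32-bit lanes */
struct q_reg {
	uint32_t lane[4];
};

/* byte-reverse a 32-bit word, like the ARM 'rev' instruction */
static inline uint32_t rev32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00) |
	       ((x << 8) & 0x00ff0000) | (x << 24);
}

/*
 * Prepare four consecutive counter blocks in q0..q3. On entry, q7 holds
 * the block for counter N, i.e. q7->lane[3] == rev32(*r6).
 */
static void prepare_ctr_blocks(struct q_reg *q0, struct q_reg *q1,
			       struct q_reg *q2, struct q_reg *q3,
			       struct q_reg *q7, uint32_t *r6)
{
	*q0 = *q7;			/* 128-bit copy: counter N     */
	q7->lane[3] = rev32(*r6 + 1);	/* single-lane update, q7 only */
	*q1 = *q7;			/* 128-bit copy: counter N + 1 */
	q7->lane[3] = rev32(*r6 + 2);
	*q2 = *q7;			/* 128-bit copy: counter N + 2 */
	q7->lane[3] = rev32(*r6 + 3);
	*q3 = *q7;			/* 128-bit copy: counter N + 3 */
	*r6 += 4;
}

Contrast this with the old sequence, where the last write to q1, q2 and q3
before the AES rounds was a single-lane vmov into s7, s11 and s15.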

Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
v2: - add comment block describing the erratum and how it is being worked
      around
    - mention A57 as well as A72, as both are affected

 arch/arm/crypto/aes-ce-core.S | 32 ++++++++++++++------
 1 file changed, 22 insertions(+), 10 deletions(-)

Comments

Herbert Xu Dec. 4, 2020, 7:15 a.m. UTC | #1
On Thu, Nov 26, 2020 at 08:49:07AM +0100, Ard Biesheuvel wrote:
> ARM Cortex-A57 and Cortex-A72 cores running in 32-bit mode are affected
> by silicon errata #1742098 and #1655431, respectively, where the second
> instruction of an AES instruction pair may execute twice if an interrupt
> is taken right after the first instruction consumes an input register of
> which a single 32-bit lane has been updated the last time it was modified.
> 
> This is not as rare an occurrence as it may seem: in counter mode, only
> the least significant 32-bit word is incremented in the absence of a
> carry, which makes our counter mode implementation susceptible to these
> errata.
> 
> So let's shuffle the counter assignments around a bit so that, by the time
> the AES instruction pair executes, the most recent updates to its input
> registers are 128 bits wide.
> 
> [0] ARM-EPM-049219 v23 Cortex-A57 MPCore Software Developers Errata Notice
> [1] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice
> 
> Cc: <stable@vger.kernel.org> # v5.4+
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> v2: - add comment block describing the erratum and how it is being worked
>       around
>     - mention A57 as well as A72, as both are affected
> 
>  arch/arm/crypto/aes-ce-core.S | 32 ++++++++++++++------
>  1 file changed, 22 insertions(+), 10 deletions(-)

Patch applied.  Thanks.

Patch

diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S
index 4d1707388d94..312428d83eed 100644
--- a/arch/arm/crypto/aes-ce-core.S
+++ b/arch/arm/crypto/aes-ce-core.S
@@ -386,20 +386,32 @@  ENTRY(ce_aes_ctr_encrypt)
 .Lctrloop4x:
 	subs		r4, r4, #4
 	bmi		.Lctr1x
-	add		r6, r6, #1
+
+	/*
+	 * NOTE: the sequence below has been carefully tweaked to avoid
+	 * a silicon erratum that exists in Cortex-A57 (#1742098) and
+	 * Cortex-A72 (#1655431) cores, where AESE/AESMC instruction pairs
+	 * may produce an incorrect result if they take their input from a
+	 * register of which a single 32-bit lane has been updated the last
+	 * time it was modified. To work around this, the lanes of registers
+	 * q0-q3 below are not manipulated individually, and the different
+	 * counter values are prepared by successive manipulations of q7.
+	 */
+	add		ip, r6, #1
 	vmov		q0, q7
+	rev		ip, ip
+	add		lr, r6, #2
+	vmov		s31, ip			@ set lane 3 of q1 via q7
+	add		ip, r6, #3
+	rev		lr, lr
 	vmov		q1, q7
-	rev		ip, r6
-	add		r6, r6, #1
+	vmov		s31, lr			@ set lane 3 of q2 via q7
+	rev		ip, ip
 	vmov		q2, q7
-	vmov		s7, ip
-	rev		ip, r6
-	add		r6, r6, #1
+	vmov		s31, ip			@ set lane 3 of q3 via q7
+	add		r6, r6, #4
 	vmov		q3, q7
-	vmov		s11, ip
-	rev		ip, r6
-	add		r6, r6, #1
-	vmov		s15, ip
+
 	vld1.8		{q4-q5}, [r1]!
 	vld1.8		{q6}, [r1]!
 	vld1.8		{q15}, [r1]!