diff mbox

ARM: crypto: update NEON AES module to latest OpenSSL version

Message ID 1424935325-14437-1-git-send-email-ard.biesheuvel@linaro.org (mailing list archive)
State Accepted
Delegated to: Herbert Xu
Headers show

Commit Message

Ard Biesheuvel Feb. 26, 2015, 7:22 a.m. UTC
This updates the bit sliced AES module to the latest version in the
upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a
bug in the XTS decryption path, where data chunked in a certain way
could trigger the ciphertext stealing code, which is not supposed to
be active in the kernel build (The kernel implementation of XTS only
supports round multiples of the AES block size of 16 bytes, whereas
the conformant OpenSSL implementation of XTS supports inputs of
arbitrary size by applying ciphertext stealing). This is fixed in
the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK
around the offending instructions.

The upstream code also contains the change applied by Russell to
build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7,
but implemented slightly differently.

Fixes: e4e7f10bfc40 ("ARM: add support for bit sliced AES using NEON instructions")
Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---

This was found using the tcrypt test code, to which I recently added additional
chunking modes. However, XTS typically operates on pages or at least on sectors,
so this bug is unlikely to affect anyone in real life.

Still, please add cc stable when applying,

Thanks,
Ard.


 arch/arm/crypto/aesbs-core.S_shipped | 12 ++++++++----
 arch/arm/crypto/bsaes-armv7.pl       | 12 ++++++++----
 2 files changed, 16 insertions(+), 8 deletions(-)

Comments

Milan Broz Feb. 28, 2015, 10:30 p.m. UTC | #1
On 02/26/2015 08:22 AM, Ard Biesheuvel wrote:
> This updates the bit sliced AES module to the latest version in the
> upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a
> bug in the XTS decryption path, where data chunked in a certain way
> could trigger the ciphertext stealing code, which is not supposed to
> be active in the kernel build (The kernel implementation of XTS only
> supports round multiples of the AES block size of 16 bytes, whereas
> the conformant OpenSSL implementation of XTS supports inputs of
> arbitrary size by applying ciphertext stealing). This is fixed in
> the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK
> around the offending instructions.
> 
> The upstream code also contains the change applied by Russell to
> build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7,
> but implemented slightly differently.
> 
> Fixes: e4e7f10bfc40 ("ARM: add support for bit sliced AES using NEON instructions")
> Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
> 
> This was found using the tcrypt test code, to which I recently added additional
> chunking modes. However, XTS typically operates on pages or at least on sectors,
> so this bug is unlikely to affect anyone in real life.
> 
> Still, please add cc stable when applying,

Please apply it for stable.

For me, this patch fixed bug reported here
  http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/7966
  http://www.mail-archive.com/linux-crypto@vger.kernel.org/msg13102.html

Cryptsetup uses AF_ALG userspace crypto SKCIPHER interface and on
Raspberry Pi2 with Neon AES implementation it fails to unlock LUKS device
(if XTS and aes-arm-bs module is used).

After applying this patch it works as expected.

I did not have time to check it more in detail (we are always encrypting
the whole sector so this should not happen... but apparently it does.)

Tested-by: Milan Broz <gmazyland@gmail.com>

Thanks,
Milan

> 
> Thanks,
> Ard.
> 
> 
>  arch/arm/crypto/aesbs-core.S_shipped | 12 ++++++++----
>  arch/arm/crypto/bsaes-armv7.pl       | 12 ++++++++----
>  2 files changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/crypto/aesbs-core.S_shipped b/arch/arm/crypto/aesbs-core.S_shipped
> index 71e5fc7cfb18..1d1800f71c5b 100644
> --- a/arch/arm/crypto/aesbs-core.S_shipped
> +++ b/arch/arm/crypto/aesbs-core.S_shipped
> @@ -58,14 +58,18 @@
>  # define VFP_ABI_FRAME	0
>  # define BSAES_ASM_EXTENDED_KEY
>  # define XTS_CHAIN_TWEAK
> -# define __ARM_ARCH__	7
> +# define __ARM_ARCH__ __LINUX_ARM_ARCH__
> +# define __ARM_MAX_ARCH__ 7
>  #endif
>  
>  #ifdef __thumb__
>  # define adrl adr
>  #endif
>  
> -#if __ARM_ARCH__>=7
> +#if __ARM_MAX_ARCH__>=7
> +.arch	armv7-a
> +.fpu	neon
> +
>  .text
>  .syntax	unified 	@ ARMv7-capable assembler is expected to handle this
>  #ifdef __thumb2__
> @@ -74,8 +78,6 @@
>  .code   32
>  #endif
>  
> -.fpu	neon
> -
>  .type	_bsaes_decrypt8,%function
>  .align	4
>  _bsaes_decrypt8:
> @@ -2095,9 +2097,11 @@ bsaes_xts_decrypt:
>  	vld1.8	{q8}, [r0]			@ initial tweak
>  	adr	r2, .Lxts_magic
>  
> +#ifndef	XTS_CHAIN_TWEAK
>  	tst	r9, #0xf			@ if not multiple of 16
>  	it	ne				@ Thumb2 thing, sanity check in ARM
>  	subne	r9, #0x10			@ subtract another 16 bytes
> +#endif
>  	subs	r9, #0x80
>  
>  	blo	.Lxts_dec_short
> diff --git a/arch/arm/crypto/bsaes-armv7.pl b/arch/arm/crypto/bsaes-armv7.pl
> index be068db960ee..a4d3856e7d24 100644
> --- a/arch/arm/crypto/bsaes-armv7.pl
> +++ b/arch/arm/crypto/bsaes-armv7.pl
> @@ -701,14 +701,18 @@ $code.=<<___;
>  # define VFP_ABI_FRAME	0
>  # define BSAES_ASM_EXTENDED_KEY
>  # define XTS_CHAIN_TWEAK
> -# define __ARM_ARCH__	7
> +# define __ARM_ARCH__ __LINUX_ARM_ARCH__
> +# define __ARM_MAX_ARCH__ 7
>  #endif
>  
>  #ifdef __thumb__
>  # define adrl adr
>  #endif
>  
> -#if __ARM_ARCH__>=7
> +#if __ARM_MAX_ARCH__>=7
> +.arch	armv7-a
> +.fpu	neon
> +
>  .text
>  .syntax	unified 	@ ARMv7-capable assembler is expected to handle this
>  #ifdef __thumb2__
> @@ -717,8 +721,6 @@ $code.=<<___;
>  .code   32
>  #endif
>  
> -.fpu	neon
> -
>  .type	_bsaes_decrypt8,%function
>  .align	4
>  _bsaes_decrypt8:
> @@ -2076,9 +2078,11 @@ bsaes_xts_decrypt:
>  	vld1.8	{@XMM[8]}, [r0]			@ initial tweak
>  	adr	$magic, .Lxts_magic
>  
> +#ifndef	XTS_CHAIN_TWEAK
>  	tst	$len, #0xf			@ if not multiple of 16
>  	it	ne				@ Thumb2 thing, sanity check in ARM
>  	subne	$len, #0x10			@ subtract another 16 bytes
> +#endif
>  	subs	$len, #0x80
>  
>  	blo	.Lxts_dec_short
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ard Biesheuvel March 2, 2015, 8:26 a.m. UTC | #2
On 28 February 2015 at 22:30, Milan Broz <gmazyland@gmail.com> wrote:
> On 02/26/2015 08:22 AM, Ard Biesheuvel wrote:
>> This updates the bit sliced AES module to the latest version in the
>> upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a
>> bug in the XTS decryption path, where data chunked in a certain way
>> could trigger the ciphertext stealing code, which is not supposed to
>> be active in the kernel build (The kernel implementation of XTS only
>> supports round multiples of the AES block size of 16 bytes, whereas
>> the conformant OpenSSL implementation of XTS supports inputs of
>> arbitrary size by applying ciphertext stealing). This is fixed in
>> the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK
>> around the offending instructions.
>>
>> The upstream code also contains the change applied by Russell to
>> build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7,
>> but implemented slightly differently.
>>
>> Fixes: e4e7f10bfc40 ("ARM: add support for bit sliced AES using NEON instructions")
>> Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>
>> This was found using the tcrypt test code, to which I recently added additional
>> chunking modes. However, XTS typically operates on pages or at least on sectors,
>> so this bug is unlikely to affect anyone in real life.
>>
>> Still, please add cc stable when applying,
>
> Please apply it for stable.
>
> For me, this patch fixed bug reported here
>   http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/7966
>   http://www.mail-archive.com/linux-crypto@vger.kernel.org/msg13102.html
>
> Cryptsetup uses AF_ALG userspace crypto SKCIPHER interface and on
> Raspberry Pi2 with Neon AES implementation it fails to unlock LUKS device
> (if XTS and aes-arm-bs module is used).
>
> After applying this patch it works as expected.
>
> I did not have time to check it more in detail (we are always encrypting
> the whole sector so this should not happen... but apparently it does.)
>
> Tested-by: Milan Broz <gmazyland@gmail.com>
>

OK, thanks for pointing that out, and for confirming that this fixes the issue.

Actually, I did not consider the userspace use case, but it is not
surprising that the data ends up being pushed into the crypto layer in
odd size chunks.

@Herbert: are you ok to take this as a bug fix for stable?

Thanks,
Ard.


>>  arch/arm/crypto/aesbs-core.S_shipped | 12 ++++++++----
>>  arch/arm/crypto/bsaes-armv7.pl       | 12 ++++++++----
>>  2 files changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/crypto/aesbs-core.S_shipped b/arch/arm/crypto/aesbs-core.S_shipped
>> index 71e5fc7cfb18..1d1800f71c5b 100644
>> --- a/arch/arm/crypto/aesbs-core.S_shipped
>> +++ b/arch/arm/crypto/aesbs-core.S_shipped
>> @@ -58,14 +58,18 @@
>>  # define VFP_ABI_FRAME       0
>>  # define BSAES_ASM_EXTENDED_KEY
>>  # define XTS_CHAIN_TWEAK
>> -# define __ARM_ARCH__        7
>> +# define __ARM_ARCH__ __LINUX_ARM_ARCH__
>> +# define __ARM_MAX_ARCH__ 7
>>  #endif
>>
>>  #ifdef __thumb__
>>  # define adrl adr
>>  #endif
>>
>> -#if __ARM_ARCH__>=7
>> +#if __ARM_MAX_ARCH__>=7
>> +.arch        armv7-a
>> +.fpu neon
>> +
>>  .text
>>  .syntax      unified         @ ARMv7-capable assembler is expected to handle this
>>  #ifdef __thumb2__
>> @@ -74,8 +78,6 @@
>>  .code   32
>>  #endif
>>
>> -.fpu neon
>> -
>>  .type        _bsaes_decrypt8,%function
>>  .align       4
>>  _bsaes_decrypt8:
>> @@ -2095,9 +2097,11 @@ bsaes_xts_decrypt:
>>       vld1.8  {q8}, [r0]                      @ initial tweak
>>       adr     r2, .Lxts_magic
>>
>> +#ifndef      XTS_CHAIN_TWEAK
>>       tst     r9, #0xf                        @ if not multiple of 16
>>       it      ne                              @ Thumb2 thing, sanity check in ARM
>>       subne   r9, #0x10                       @ subtract another 16 bytes
>> +#endif
>>       subs    r9, #0x80
>>
>>       blo     .Lxts_dec_short
>> diff --git a/arch/arm/crypto/bsaes-armv7.pl b/arch/arm/crypto/bsaes-armv7.pl
>> index be068db960ee..a4d3856e7d24 100644
>> --- a/arch/arm/crypto/bsaes-armv7.pl
>> +++ b/arch/arm/crypto/bsaes-armv7.pl
>> @@ -701,14 +701,18 @@ $code.=<<___;
>>  # define VFP_ABI_FRAME       0
>>  # define BSAES_ASM_EXTENDED_KEY
>>  # define XTS_CHAIN_TWEAK
>> -# define __ARM_ARCH__        7
>> +# define __ARM_ARCH__ __LINUX_ARM_ARCH__
>> +# define __ARM_MAX_ARCH__ 7
>>  #endif
>>
>>  #ifdef __thumb__
>>  # define adrl adr
>>  #endif
>>
>> -#if __ARM_ARCH__>=7
>> +#if __ARM_MAX_ARCH__>=7
>> +.arch        armv7-a
>> +.fpu neon
>> +
>>  .text
>>  .syntax      unified         @ ARMv7-capable assembler is expected to handle this
>>  #ifdef __thumb2__
>> @@ -717,8 +721,6 @@ $code.=<<___;
>>  .code   32
>>  #endif
>>
>> -.fpu neon
>> -
>>  .type        _bsaes_decrypt8,%function
>>  .align       4
>>  _bsaes_decrypt8:
>> @@ -2076,9 +2078,11 @@ bsaes_xts_decrypt:
>>       vld1.8  {@XMM[8]}, [r0]                 @ initial tweak
>>       adr     $magic, .Lxts_magic
>>
>> +#ifndef      XTS_CHAIN_TWEAK
>>       tst     $len, #0xf                      @ if not multiple of 16
>>       it      ne                              @ Thumb2 thing, sanity check in ARM
>>       subne   $len, #0x10                     @ subtract another 16 bytes
>> +#endif
>>       subs    $len, #0x80
>>
>>       blo     .Lxts_dec_short
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu March 2, 2015, 10:19 a.m. UTC | #3
On Thu, Feb 26, 2015 at 07:22:05AM +0000, Ard Biesheuvel wrote:
> This updates the bit sliced AES module to the latest version in the
> upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a
> bug in the XTS decryption path, where data chunked in a certain way
> could trigger the ciphertext stealing code, which is not supposed to
> be active in the kernel build (The kernel implementation of XTS only
> supports round multiples of the AES block size of 16 bytes, whereas
> the conformant OpenSSL implementation of XTS supports inputs of
> arbitrary size by applying ciphertext stealing). This is fixed in
> the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK
> around the offending instructions.
> 
> The upstream code also contains the change applied by Russell to
> build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7,
> but implemented slightly differently.
> 
> Fixes: e4e7f10bfc40 ("ARM: add support for bit sliced AES using NEON instructions")
> Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Patch applied with cc to stable.
diff mbox

Patch

diff --git a/arch/arm/crypto/aesbs-core.S_shipped b/arch/arm/crypto/aesbs-core.S_shipped
index 71e5fc7cfb18..1d1800f71c5b 100644
--- a/arch/arm/crypto/aesbs-core.S_shipped
+++ b/arch/arm/crypto/aesbs-core.S_shipped
@@ -58,14 +58,18 @@ 
 # define VFP_ABI_FRAME	0
 # define BSAES_ASM_EXTENDED_KEY
 # define XTS_CHAIN_TWEAK
-# define __ARM_ARCH__	7
+# define __ARM_ARCH__ __LINUX_ARM_ARCH__
+# define __ARM_MAX_ARCH__ 7
 #endif
 
 #ifdef __thumb__
 # define adrl adr
 #endif
 
-#if __ARM_ARCH__>=7
+#if __ARM_MAX_ARCH__>=7
+.arch	armv7-a
+.fpu	neon
+
 .text
 .syntax	unified 	@ ARMv7-capable assembler is expected to handle this
 #ifdef __thumb2__
@@ -74,8 +78,6 @@ 
 .code   32
 #endif
 
-.fpu	neon
-
 .type	_bsaes_decrypt8,%function
 .align	4
 _bsaes_decrypt8:
@@ -2095,9 +2097,11 @@  bsaes_xts_decrypt:
 	vld1.8	{q8}, [r0]			@ initial tweak
 	adr	r2, .Lxts_magic
 
+#ifndef	XTS_CHAIN_TWEAK
 	tst	r9, #0xf			@ if not multiple of 16
 	it	ne				@ Thumb2 thing, sanity check in ARM
 	subne	r9, #0x10			@ subtract another 16 bytes
+#endif
 	subs	r9, #0x80
 
 	blo	.Lxts_dec_short
diff --git a/arch/arm/crypto/bsaes-armv7.pl b/arch/arm/crypto/bsaes-armv7.pl
index be068db960ee..a4d3856e7d24 100644
--- a/arch/arm/crypto/bsaes-armv7.pl
+++ b/arch/arm/crypto/bsaes-armv7.pl
@@ -701,14 +701,18 @@  $code.=<<___;
 # define VFP_ABI_FRAME	0
 # define BSAES_ASM_EXTENDED_KEY
 # define XTS_CHAIN_TWEAK
-# define __ARM_ARCH__	7
+# define __ARM_ARCH__ __LINUX_ARM_ARCH__
+# define __ARM_MAX_ARCH__ 7
 #endif
 
 #ifdef __thumb__
 # define adrl adr
 #endif
 
-#if __ARM_ARCH__>=7
+#if __ARM_MAX_ARCH__>=7
+.arch	armv7-a
+.fpu	neon
+
 .text
 .syntax	unified 	@ ARMv7-capable assembler is expected to handle this
 #ifdef __thumb2__
@@ -717,8 +721,6 @@  $code.=<<___;
 .code   32
 #endif
 
-.fpu	neon
-
 .type	_bsaes_decrypt8,%function
 .align	4
 _bsaes_decrypt8:
@@ -2076,9 +2078,11 @@  bsaes_xts_decrypt:
 	vld1.8	{@XMM[8]}, [r0]			@ initial tweak
 	adr	$magic, .Lxts_magic
 
+#ifndef	XTS_CHAIN_TWEAK
 	tst	$len, #0xf			@ if not multiple of 16
 	it	ne				@ Thumb2 thing, sanity check in ARM
 	subne	$len, #0x10			@ subtract another 16 bytes
+#endif
 	subs	$len, #0x80
 
 	blo	.Lxts_dec_short