[0/2] crypto: arm64/crct10dif - refactor and implement non-Crypto Extension version

Message ID 20180827153812.6763-1-ard.biesheuvel@linaro.org (mailing list archive)

Message

Ard Biesheuvel Aug. 27, 2018, 3:38 p.m. UTC
The current arm64 CRC-T10DIF code only runs on cores that implement the
64x64-bit PMULL instructions that are part of the optional Crypto
Extensions, and falls back to the highly inefficient C code otherwise.

Let's provide a SIMD version that is twice as fast as the C code even on
a low-end core like the Cortex-A53, and is time-invariant and much easier
on the D-cache.
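
For readers unfamiliar with the checksum itself, the following is a minimal
sketch of CRC-T10DIF in portable C. It is not code from this series and it
is not the PMULL-based SIMD algorithm; it only illustrates the D-cache and
time-invariance remarks above. The function names are hypothetical. The
sketch assumes the standard CRC-16/T10-DIF parameters (polynomial 0x8BB7,
initial value 0, no bit reflection, no final XOR). The first routine works
bit by bit without data-dependent branches or memory accesses; the second
is a table-driven variant, presumably similar in shape to a generic C
fallback, which performs one lookup per input byte into a 512-byte table
at an index that depends on the data.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/T10-DIF generator polynomial (x^16 term implicit). */
#define T10DIF_POLY 0x8BB7u

/*
 * Bit-at-a-time CRC-T10DIF (hypothetical helper name).
 * No lookup table and no data-dependent branch: the polynomial is
 * applied through a mask derived from the top bit of the CRC.
 */
static uint16_t crc_t10dif_bitwise(uint16_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            /* mask is 0xFFFF if the top bit is set, 0 otherwise */
            uint16_t mask = (uint16_t)-(crc >> 15);
            crc = (uint16_t)((crc << 1) ^ (T10DIF_POLY & mask));
        }
    }
    return crc;
}

/* 256 entries x 2 bytes = 512 bytes of table kept hot in the D-cache. */
static uint16_t t10dif_table[256];

/* Fill the table: entry i is the CRC of the single byte i. */
static void crc_t10dif_init_table(void)
{
    for (unsigned int i = 0; i < 256; i++) {
        uint8_t byte = (uint8_t)i;
        t10dif_table[i] = crc_t10dif_bitwise(0, &byte, 1);
    }
}

/*
 * Byte-at-a-time, table-driven CRC-T10DIF (hypothetical helper name).
 * Each step loads from an address that depends on the input data.
 * Call crc_t10dif_init_table() once before using it.
 */
static uint16_t crc_t10dif_table_driven(uint16_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)((crc << 8) ^
                         t10dif_table[((crc >> 8) ^ buf[i]) & 0xFF]);
    return crc;
}
```

Both routines compute the same value for any input, so the bitwise one can
serve as a reference when checking a faster implementation. Passing the
result of one call as the crc argument of the next allows a buffer to be
processed in chunks.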

Some performance numbers at the bottom.

Ard Biesheuvel (2):
  crypto: arm64/crct10dif - preparatory refactor for 8x8 PMULL version
  crypto: arm64/crct10dif - implement non-Crypto Extensions alternative

 arch/arm64/crypto/crct10dif-ce-core.S | 314 +++++++++++++++-----
 arch/arm64/crypto/crct10dif-ce-glue.c |  14 +-
 2 files changed, 251 insertions(+), 77 deletions(-)

Comments

Herbert Xu Sept. 4, 2018, 5:21 a.m. UTC | #1
On Mon, Aug 27, 2018 at 05:38:10PM +0200, Ard Biesheuvel wrote:
> The current arm64 CRC-T10DIF code only runs on cores that implement the
> 64x64-bit PMULL instructions that are part of the optional Crypto
> Extensions, and falls back to the highly inefficient C code otherwise.
> 
> Let's provide a SIMD version that is twice as fast as the C code even on
> a low-end core like the Cortex-A53, and is time-invariant and much easier
> on the D-cache.
> 
> Some performance numbers at the bottom.
> 
> Ard Biesheuvel (2):
>   crypto: arm64/crct10dif - preparatory refactor for 8x8 PMULL version
>   crypto: arm64/crct10dif - implement non-Crypto Extensions alternative
> 
>  arch/arm64/crypto/crct10dif-ce-core.S | 314 +++++++++++++++-----
>  arch/arm64/crypto/crct10dif-ce-glue.c |  14 +-
>  2 files changed, 251 insertions(+), 77 deletions(-)

All applied.  Thanks.