[v4,0/7] crypto: CAESAR final portfolio follow-up
mbox series

Message ID 20190703085512.13915-1-ard.biesheuvel@linaro.org
Headers show
Series
  • crypto: CAESAR final portfolio follow-up
Related show

Message

Ard Biesheuvel July 3, 2019, 8:55 a.m. UTC
This v2/v3/v4 is a follow-up to both 'crypto: aegis128 - add NEON intrinsics
version for ARM/arm64' [0] and 'crypto: morus - remove generic and x86
implementations' [1]. Since there is some overlap, it makes sense to merge
them and avoid merge conflicts.

Now that aegis128 has been announced as one of the winners of the CAESAR
competition, it's time to provide some better support for it on arm64 (and
32-bit ARM *)

This time, instead of cloning the generic driver twice and rewriting half
of it in arm64 and ARM assembly, add hooks for an accelerated SIMD path to
the generic driver, and populate it with a C version using NEON intrinsics
that can be built for both ARM and arm64. This results in a speedup of ~11x,
resulting in a performance of 2.2 cycles per byte on Cortex-A53.

Patches #3 and #4 are fixes/improvements for the generic code. Patch #5
adds the plumbing for using a SIMD accelerated implementation. Patch #6
adds the ARM and arm64 code, and patch #7 adds a speed test.

Since aegis128l and aegis256 were not selected, and nor where any of the
morus contestants (which are in fact found to be cryptographically broken),
patches #1 and #2 remove these entirely.

Changes since v3:
- drop always_inline annotation from aegis_aes_round(), which triggers a
  warning on ARM
- add Ondrej's R-b to the series

Changes since v2:
- drop AEGIS128L/256 Kconfig symbols from crypto/Kconfig
- ensure that aese/aesmc are issued in pairs

Changes since v1s:
- add reference to research paper (#1)
- drop hunks against m68k defconfigs - these get regenerated automatically
  anyway, and so it is better to avoid the potential merge conflicts.
- drop patch to use unaligned accessors where it isn't needed
- drop hunks against aegis variants that are being removed (#3)
- add acks from Ondrej

* 32-bit ARM today rarely provides the special AES instruction that the
  implementation in this series relies on, but this may change in the future,
  and the NEON intrinsics code can be compiled for both ISAs.

Cc: Eric Biggers <ebiggers@google.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Milan Broz <gmazyland@gmail.com>

[0] https://lore.kernel.org/linux-crypto/20190624073818.29296-1-ard.biesheuvel@linaro.org/
[1] https://lore.kernel.org/linux-crypto/20190625145254.28510-1-ard.biesheuvel@linaro.org/

Ard Biesheuvel (7):
  crypto: morus - remove generic and x86 implementations
  crypto: aegis128l/aegis256 - remove x86 and generic implementations
  crypto: aegis128 - drop empty TFM init/exit routines
  crypto: aegis - avoid prerotated AES tables
  crypto: aegis128 - add support for SIMD acceleration
  crypto: aegis128 - provide a SIMD implementation based on NEON
    intrinsics
  crypto: tcrypt - add a speed test for AEGIS128

 arch/x86/crypto/Makefile               |   17 -
 arch/x86/crypto/aegis128l-aesni-asm.S  |  826 ------
 arch/x86/crypto/aegis128l-aesni-glue.c |  297 ---
 arch/x86/crypto/aegis256-aesni-asm.S   |  703 -----
 arch/x86/crypto/aegis256-aesni-glue.c  |  297 ---
 arch/x86/crypto/morus1280-avx2-asm.S   |  622 -----
 arch/x86/crypto/morus1280-avx2-glue.c  |   66 -
 arch/x86/crypto/morus1280-sse2-asm.S   |  896 -------
 arch/x86/crypto/morus1280-sse2-glue.c  |   65 -
 arch/x86/crypto/morus1280_glue.c       |  209 --
 arch/x86/crypto/morus640-sse2-asm.S    |  615 -----
 arch/x86/crypto/morus640-sse2-glue.c   |   65 -
 arch/x86/crypto/morus640_glue.c        |  204 --
 crypto/Kconfig                         |   89 +-
 crypto/Makefile                        |   16 +-
 crypto/aegis.h                         |   28 +-
 crypto/{aegis128.c => aegis128-core.c} |   53 +-
 crypto/aegis128-neon-inner.c           |  149 ++
 crypto/aegis128-neon.c                 |   43 +
 crypto/aegis128l.c                     |  522 ----
 crypto/aegis256.c                      |  473 ----
 crypto/morus1280.c                     |  542 ----
 crypto/morus640.c                      |  533 ----
 crypto/tcrypt.c                        |    7 +
 crypto/testmgr.c                       |   24 -
 crypto/testmgr.h                       | 2691 --------------------
 include/crypto/morus1280_glue.h        |   97 -
 include/crypto/morus640_glue.h         |   97 -
 include/crypto/morus_common.h          |   18 -
 29 files changed, 266 insertions(+), 9998 deletions(-)
 delete mode 100644 arch/x86/crypto/aegis128l-aesni-asm.S
 delete mode 100644 arch/x86/crypto/aegis128l-aesni-glue.c
 delete mode 100644 arch/x86/crypto/aegis256-aesni-asm.S
 delete mode 100644 arch/x86/crypto/aegis256-aesni-glue.c
 delete mode 100644 arch/x86/crypto/morus1280-avx2-asm.S
 delete mode 100644 arch/x86/crypto/morus1280-avx2-glue.c
 delete mode 100644 arch/x86/crypto/morus1280-sse2-asm.S
 delete mode 100644 arch/x86/crypto/morus1280-sse2-glue.c
 delete mode 100644 arch/x86/crypto/morus1280_glue.c
 delete mode 100644 arch/x86/crypto/morus640-sse2-asm.S
 delete mode 100644 arch/x86/crypto/morus640-sse2-glue.c
 delete mode 100644 arch/x86/crypto/morus640_glue.c
 rename crypto/{aegis128.c => aegis128-core.c} (89%)
 create mode 100644 crypto/aegis128-neon-inner.c
 create mode 100644 crypto/aegis128-neon.c
 delete mode 100644 crypto/aegis128l.c
 delete mode 100644 crypto/aegis256.c
 delete mode 100644 crypto/morus1280.c
 delete mode 100644 crypto/morus640.c
 delete mode 100644 include/crypto/morus1280_glue.h
 delete mode 100644 include/crypto/morus640_glue.h
 delete mode 100644 include/crypto/morus_common.h

Comments

Herbert Xu July 26, 2019, 12:31 p.m. UTC | #1
Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> This v2/v3/v4 is a follow-up to both 'crypto: aegis128 - add NEON intrinsics
> version for ARM/arm64' [0] and 'crypto: morus - remove generic and x86
> implementations' [1]. Since there is some overlap, it makes sense to merge
> them and avoid merge conflicts.
> 
> Now that aegis128 has been announced as one of the winners of the CAESAR
> competition, it's time to provide some better support for it on arm64 (and
> 32-bit ARM *)
> 
> This time, instead of cloning the generic driver twice and rewriting half
> of it in arm64 and ARM assembly, add hooks for an accelerated SIMD path to
> the generic driver, and populate it with a C version using NEON intrinsics
> that can be built for both ARM and arm64. This results in a speedup of ~11x,
> resulting in a performance of 2.2 cycles per byte on Cortex-A53.
> 
> Patches #3 and #4 are fixes/improvements for the generic code. Patch #5
> adds the plumbing for using a SIMD accelerated implementation. Patch #6
> adds the ARM and arm64 code, and patch #7 adds a speed test.
> 
> Since aegis128l and aegis256 were not selected, and nor where any of the
> morus contestants (which are in fact found to be cryptographically broken),
> patches #1 and #2 remove these entirely.
> 
> Changes since v3:
> - drop always_inline annotation from aegis_aes_round(), which triggers a
>  warning on ARM
> - add Ondrej's R-b to the series
> 
> Changes since v2:
> - drop AEGIS128L/256 Kconfig symbols from crypto/Kconfig
> - ensure that aese/aesmc are issued in pairs
> 
> Changes since v1s:
> - add reference to research paper (#1)
> - drop hunks against m68k defconfigs - these get regenerated automatically
>  anyway, and so it is better to avoid the potential merge conflicts.
> - drop patch to use unaligned accessors where it isn't needed
> - drop hunks against aegis variants that are being removed (#3)
> - add acks from Ondrej
> 
> * 32-bit ARM today rarely provides the special AES instruction that the
>  implementation in this series relies on, but this may change in the future,
>  and the NEON intrinsics code can be compiled for both ISAs.
> 
> Cc: Eric Biggers <ebiggers@google.com>
> Cc: Ondrej Mosnacek <omosnace@redhat.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Milan Broz <gmazyland@gmail.com>
> 
> [0] https://lore.kernel.org/linux-crypto/20190624073818.29296-1-ard.biesheuvel@linaro.org/
> [1] https://lore.kernel.org/linux-crypto/20190625145254.28510-1-ard.biesheuvel@linaro.org/
> 
> Ard Biesheuvel (7):
>  crypto: morus - remove generic and x86 implementations
>  crypto: aegis128l/aegis256 - remove x86 and generic implementations
>  crypto: aegis128 - drop empty TFM init/exit routines
>  crypto: aegis - avoid prerotated AES tables
>  crypto: aegis128 - add support for SIMD acceleration
>  crypto: aegis128 - provide a SIMD implementation based on NEON
>    intrinsics
>  crypto: tcrypt - add a speed test for AEGIS128
> 
> arch/x86/crypto/Makefile               |   17 -
> arch/x86/crypto/aegis128l-aesni-asm.S  |  826 ------
> arch/x86/crypto/aegis128l-aesni-glue.c |  297 ---
> arch/x86/crypto/aegis256-aesni-asm.S   |  703 -----
> arch/x86/crypto/aegis256-aesni-glue.c  |  297 ---
> arch/x86/crypto/morus1280-avx2-asm.S   |  622 -----
> arch/x86/crypto/morus1280-avx2-glue.c  |   66 -
> arch/x86/crypto/morus1280-sse2-asm.S   |  896 -------
> arch/x86/crypto/morus1280-sse2-glue.c  |   65 -
> arch/x86/crypto/morus1280_glue.c       |  209 --
> arch/x86/crypto/morus640-sse2-asm.S    |  615 -----
> arch/x86/crypto/morus640-sse2-glue.c   |   65 -
> arch/x86/crypto/morus640_glue.c        |  204 --
> crypto/Kconfig                         |   89 +-
> crypto/Makefile                        |   16 +-
> crypto/aegis.h                         |   28 +-
> crypto/{aegis128.c => aegis128-core.c} |   53 +-
> crypto/aegis128-neon-inner.c           |  149 ++
> crypto/aegis128-neon.c                 |   43 +
> crypto/aegis128l.c                     |  522 ----
> crypto/aegis256.c                      |  473 ----
> crypto/morus1280.c                     |  542 ----
> crypto/morus640.c                      |  533 ----
> crypto/tcrypt.c                        |    7 +
> crypto/testmgr.c                       |   24 -
> crypto/testmgr.h                       | 2691 --------------------
> include/crypto/morus1280_glue.h        |   97 -
> include/crypto/morus640_glue.h         |   97 -
> include/crypto/morus_common.h          |   18 -
> 29 files changed, 266 insertions(+), 9998 deletions(-)
> delete mode 100644 arch/x86/crypto/aegis128l-aesni-asm.S
> delete mode 100644 arch/x86/crypto/aegis128l-aesni-glue.c
> delete mode 100644 arch/x86/crypto/aegis256-aesni-asm.S
> delete mode 100644 arch/x86/crypto/aegis256-aesni-glue.c
> delete mode 100644 arch/x86/crypto/morus1280-avx2-asm.S
> delete mode 100644 arch/x86/crypto/morus1280-avx2-glue.c
> delete mode 100644 arch/x86/crypto/morus1280-sse2-asm.S
> delete mode 100644 arch/x86/crypto/morus1280-sse2-glue.c
> delete mode 100644 arch/x86/crypto/morus1280_glue.c
> delete mode 100644 arch/x86/crypto/morus640-sse2-asm.S
> delete mode 100644 arch/x86/crypto/morus640-sse2-glue.c
> delete mode 100644 arch/x86/crypto/morus640_glue.c
> rename crypto/{aegis128.c => aegis128-core.c} (89%)
> create mode 100644 crypto/aegis128-neon-inner.c
> create mode 100644 crypto/aegis128-neon.c
> delete mode 100644 crypto/aegis128l.c
> delete mode 100644 crypto/aegis256.c
> delete mode 100644 crypto/morus1280.c
> delete mode 100644 crypto/morus640.c
> delete mode 100644 include/crypto/morus1280_glue.h
> delete mode 100644 include/crypto/morus640_glue.h
> delete mode 100644 include/crypto/morus_common.h

All applied.  Thanks.