mbox series

[RFC,00/10] crypto: x86 - remove XTS and CTR glue helper code

Message ID 20201223223841.11311-1-ardb@kernel.org (mailing list archive)
Headers show
Series crypto: x86 - remove XTS and CTR glue helper code | expand

Message

Ard Biesheuvel Dec. 23, 2020, 10:38 p.m. UTC
After applying my performance fixes for AES-NI in XTS mode, the only
remaining users of the x86 glue helper module are the niche algorithms
camellia, cast6, serpent and twofish.

It is not clear from the history why all these different versions of these
algorithms in XTS and CTR modes were added in the first place: the only
in-kernel references that seem to exist are to cbc(serpent), cbc(camellia)
and cbc(twofish) in the IPsec stack. The XTS spec only mentions AES, and
CTR modes don't seem to be widely used either.

Since the glue helper code relies heavily on indirect calls for small chunks
of in/output, it needs some work to recover from the performance hit caused
by the retpoline changes. However, it makes sense to only expend the effort
for algorithms that are being used in the first place, and this does not
seem to be the case for XTS and CTR.

CTR mode can simply be removed: it is not used in the kernel, and it is
highly unlikely that it is being relied upon via algif_skcipher. And even
if it was, the generic CTR mode driver can still provide the CTR transforms
if necessary.

XTS mode may actually be in use by dm-crypt users, so we cannot simply drop
this code entirely. However, as it turns out, the XTS template wrapped
around the ECB mode skciphers perform roughly on par *, and so there is no
need to retain all the complicated XTS helper logic. In the unlikely case
that dm-crypt users are relying on xts(camellia) or xts(serpent) in the
field, they should not be impacted by these changes at all.

As a follow-up, it makes sense to rework the ECB and CBC mode implementations
to get rid of the indirect calls. Or perhaps we could drop [some of] these
algorithms entirely ...

* tcrypt results for various XTS implementations below, captured on a
  Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz

Cc: Megha Dey <megha.dey@intel.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Milan Broz <gmazyland@gmail.com>
Cc: Mike Snitzer <snitzer@redhat.com>

Ard Biesheuvel (10):
  crypto: x86/camellia - switch to XTS template
  crypto: x86/cast6 - switch to XTS template
  crypto: x86/serpent- switch to XTS template
  crypto: x86/twofish - switch to XTS template
  crypto: x86/glue-helper - drop XTS helper routines
  crypto: x86/camellia - drop CTR mode implementation
  crypto: x86/cast6 - drop CTR mode implementation
  crypto: x86/serpent - drop CTR mode implementation
  crypto: x86/twofish - drop CTR mode implementation
  crypto: x86/glue-helper - drop CTR helper routines

 arch/x86/crypto/camellia-aesni-avx-asm_64.S  | 297 ----------------
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 350 -------------------
 arch/x86/crypto/camellia_aesni_avx2_glue.c   | 111 ------
 arch/x86/crypto/camellia_aesni_avx_glue.c    | 141 +-------
 arch/x86/crypto/camellia_glue.c              |  68 ----
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |  84 -----
 arch/x86/crypto/cast6_avx_glue.c             | 146 --------
 arch/x86/crypto/glue_helper-asm-avx.S        | 104 ------
 arch/x86/crypto/glue_helper-asm-avx2.S       | 136 -------
 arch/x86/crypto/glue_helper.c                | 226 ------------
 arch/x86/crypto/serpent-avx-x86_64-asm_64.S  |  68 ----
 arch/x86/crypto/serpent-avx2-asm_64.S        |  87 -----
 arch/x86/crypto/serpent_avx2_glue.c          | 110 ------
 arch/x86/crypto/serpent_avx_glue.c           | 152 --------
 arch/x86/crypto/serpent_sse2_glue.c          |  67 ----
 arch/x86/crypto/twofish-avx-x86_64-asm_64.S  |  80 -----
 arch/x86/crypto/twofish_avx_glue.c           | 136 -------
 arch/x86/crypto/twofish_glue_3way.c          |  72 ----
 arch/x86/include/asm/crypto/camellia.h       |  24 --
 arch/x86/include/asm/crypto/glue_helper.h    |  44 ---
 arch/x86/include/asm/crypto/serpent-avx.h    |  21 --
 arch/x86/include/asm/crypto/twofish.h        |   4 -
 22 files changed, 1 insertion(+), 2527 deletions(-)

Comments

Milan Broz Dec. 24, 2020, 9:33 a.m. UTC | #1
On 23/12/2020 23:38, Ard Biesheuvel wrote:
> After applying my performance fixes for AES-NI in XTS mode, the only
> remaining users of the x86 glue helper module are the niche algorithms
> camellia, cast6, serpent and twofish.
> 
> It is not clear from the history why all these different versions of these
> algorithms in XTS and CTR modes were added in the first place: the only
> in-kernel references that seem to exist are to cbc(serpent), cbc(camellia)
> and cbc(twofish) in the IPsec stack. The XTS spec only mentions AES, and
> CTR modes don't seem to be widely used either.

FYI: Serpent, Camellia and Twofish are used in TrueCrypt/VeraCrypt implementation;
cryptsetup and I perhaps even VeraCrypt itself tries to use native dm-crypt mapping.
(They also added Russian GOST Kuznyechik with XTS, but this is not in mainline,
but Debian packages it as gost-crypto-dkms).

Serpent and Twofish can be also used with LRW and CBC modes (for old containers only).

Cryptsetup uses crypto userspace API to decrypt the key from header, then it configures
dm-crypt mapping for data. We need both use and in-kernel API here.

For reference, see this table (my independent implementation of TrueCrypt/VeraCrypt modes,
it should be complete history though):
https://gitlab.com/cryptsetup/cryptsetup/-/blob/master/lib/tcrypt/tcrypt.c#L77

If the above still works (I would really like to have way to open old containers)
it is ok to do whatever you want to change here :-)

I have no info that CTR is used anywhere related to dm-crypt
(IIRC it can be tricked to be used there but it does not make any sense).

Thanks,
Milan
Ard Biesheuvel Dec. 24, 2020, 9:56 a.m. UTC | #2
On Thu, 24 Dec 2020 at 10:33, Milan Broz <gmazyland@gmail.com> wrote:
>
> On 23/12/2020 23:38, Ard Biesheuvel wrote:
> > After applying my performance fixes for AES-NI in XTS mode, the only
> > remaining users of the x86 glue helper module are the niche algorithms
> > camellia, cast6, serpent and twofish.
> >
> > It is not clear from the history why all these different versions of these
> > algorithms in XTS and CTR modes were added in the first place: the only
> > in-kernel references that seem to exist are to cbc(serpent), cbc(camellia)
> > and cbc(twofish) in the IPsec stack. The XTS spec only mentions AES, and
> > CTR modes don't seem to be widely used either.
>
> FYI: Serpent, Camellia and Twofish are used in TrueCrypt/VeraCrypt implementation;
> cryptsetup and I perhaps even VeraCrypt itself tries to use native dm-crypt mapping.
> (They also added Russian GOST Kuznyechik with XTS, but this is not in mainline,
> but Debian packages it as gost-crypto-dkms).
>
> Serpent and Twofish can be also used with LRW and CBC modes (for old containers only).
>
> Cryptsetup uses crypto userspace API to decrypt the key from header, then it configures
> dm-crypt mapping for data. We need both use and in-kernel API here.
>
> For reference, see this table (my independent implementation of TrueCrypt/VeraCrypt modes,
> it should be complete history though):
> https://gitlab.com/cryptsetup/cryptsetup/-/blob/master/lib/tcrypt/tcrypt.c#L77
>
> If the above still works (I would really like to have way to open old containers)
> it is ok to do whatever you want to change here :-)
>

Thanks Milan.

With the XTS code removed from these drivers, the XTS template will be
used, which relies on the ECB mode helpers instead. So once we fix
those to get rid of the indirect calls, I'd expect XTS to actually
improve in performance for these algorithms.

> I have no info that CTR is used anywhere related to dm-crypt
> (IIRC it can be tricked to be used there but it does not make any sense).
>

Yes, that was my assumption. Thanks for confirming.
Eric Biggers Dec. 25, 2020, 7:20 p.m. UTC | #3
On Wed, Dec 23, 2020 at 11:38:31PM +0100, Ard Biesheuvel wrote:
> After applying my performance fixes for AES-NI in XTS mode, the only
> remaining users of the x86 glue helper module are the niche algorithms
> camellia, cast6, serpent and twofish.
> 
> It is not clear from the history why all these different versions of these
> algorithms in XTS and CTR modes were added in the first place: the only
> in-kernel references that seem to exist are to cbc(serpent), cbc(camellia)
> and cbc(twofish) in the IPsec stack. The XTS spec only mentions AES, and
> CTR modes don't seem to be widely used either.
> 
> Since the glue helper code relies heavily on indirect calls for small chunks
> of in/output, it needs some work to recover from the performance hit caused
> by the retpoline changes. However, it makes sense to only expend the effort
> for algorithms that are being used in the first place, and this does not
> seem to be the case for XTS and CTR.
> 
> CTR mode can simply be removed: it is not used in the kernel, and it is
> highly unlikely that it is being relied upon via algif_skcipher. And even
> if it was, the generic CTR mode driver can still provide the CTR transforms
> if necessary.
> 
> XTS mode may actually be in use by dm-crypt users, so we cannot simply drop
> this code entirely. However, as it turns out, the XTS template wrapped
> around the ECB mode skciphers perform roughly on par *, and so there is no
> need to retain all the complicated XTS helper logic. In the unlikely case
> that dm-crypt users are relying on xts(camellia) or xts(serpent) in the
> field, they should not be impacted by these changes at all.
> 
> As a follow-up, it makes sense to rework the ECB and CBC mode implementations
> to get rid of the indirect calls. Or perhaps we could drop [some of] these
> algorithms entirely ...
> 
> * tcrypt results for various XTS implementations below, captured on a
>   Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
> 
> Cc: Megha Dey <megha.dey@intel.com>
> Cc: Eric Biggers <ebiggers@kernel.org>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Milan Broz <gmazyland@gmail.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> 
> Ard Biesheuvel (10):
>   crypto: x86/camellia - switch to XTS template
>   crypto: x86/cast6 - switch to XTS template
>   crypto: x86/serpent- switch to XTS template
>   crypto: x86/twofish - switch to XTS template
>   crypto: x86/glue-helper - drop XTS helper routines
>   crypto: x86/camellia - drop CTR mode implementation
>   crypto: x86/cast6 - drop CTR mode implementation
>   crypto: x86/serpent - drop CTR mode implementation
>   crypto: x86/twofish - drop CTR mode implementation
>   crypto: x86/glue-helper - drop CTR helper routines

Acked-by: Eric Biggers <ebiggers@google.com>

- Eric