Message ID | 20230410225936.8940-12-chang.seok.bae@intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Herbert Xu |
Headers | show |
Series | x86: Support Key Locker | expand |
On Mon, Apr 10, 2023 at 03:59:35PM -0700, Chang S. Bae wrote: > [PATCH v6 11/12] crypto: x86/aes-kl - Support AES algorithm using Key Locker instructions Thanks for dropping the unnecessary modes of operation (CBC, ECB, CTR). It simplified the patchset quite a bit! Now that only AES-XTS is included, can you please also merge this patch with the following patch? As-is, this patch is misleading since it doesn't actually add "support" for anything at all. It actually just adds an unfinished AES-XTS implementation, which patch 12 then finishes. I assume that the current patchset organization is left over from when you were trying to support multiple modes of operation. IMO, it would be much easier to review if patches 11-12 were merged into one patch that adds the new AES-XTS implementation. > For disk encryption, storage bandwidth may be the bottleneck > before encryption bandwidth, but the potential performance difference is > why AES-KL is advertised as a distinct cipher in /proc/crypto rather than > the kernel transparently replacing AES-NI usage with AES-KL. This does not correctly describe what is going on. Actually, this patchset registers the AES-KL XTS algorithm with the usual name "xts(aes)". So, it can potentially be used by any AES-XTS user. It seems that you're actually relying on the algorithm priorities to prioritize AES-NI, as you've assigned priority 200 to AES-KL, whereas AES-NI has priority 401. Is that what you intend, and if so can you please update your explanation to properly explain this? The alternative would be to use a unique algorithm name, such as "keylocker-xts(aes)". I'm not sure that would be better, given that the algorithms are compatible. However, that actually would seem to match the explanation you gave more closely, so perhaps that's what you actually intended? > diff --git a/arch/x86/crypto/aeskl-intel_asm.S b/arch/x86/crypto/aeskl-intel_asm.S [...] > +#ifdef __x86_64__ > +#define AREG %rax > +#define HANDLEP %rdi > +#define OUTP %rsi > +#define KLEN %r9d > +#define INP %rdx > +#define T1 %r10 > +#define LEN %rcx > +#define IVP %r8 > +#else > +#define AREG %eax > +#define HANDLEP %edi > +#define OUTP AREG > +#define KLEN %ebx > +#define INP %edx > +#define T1 %ecx > +#define LEN %esi > +#define IVP %ebp > +#endif I strongly recommend skipping the 32-bit support, as it's unlikely to be worth the effort. And actually, aeskl-intel_glue.c only registers the algorithm for 64-bit anyway... So the 32-bit code paths are untested dead code. > +static inline int aeskl_enc(const void *ctx, u8 *out, const u8 *in) > +{ > + if (unlikely(keylength(ctx) == AES_KEYSIZE_192)) > + return -EINVAL; > + else if (!valid_keylocker()) > + return -ENODEV; > + else if (_aeskl_enc(ctx, out, in)) > + return -EINVAL; > + else > + return 0; > +} > + > +static inline int aeskl_dec(const void *ctx, u8 *out, const u8 *in) > +{ > + if (unlikely(keylength(ctx) == AES_KEYSIZE_192)) > + return -EINVAL; > + else if (!valid_keylocker()) > + return -ENODEV; > + else if (_aeskl_dec(ctx, out, in)) > + return -EINVAL; > + else > + return 0; > +} I don't think the above two functions should exist. keylength() and valid_keylocker() should be checked before calling xts_crypt_common(), and the assembly functions should just return the correct error code (-EINVAL, apparently) instead of an unspecified nonzero value. That would leave nothing for a wrapper function to do. Note: if you take this suggestion, the assembly functions would need to use SYM_TYPED_FUNC_START instead of SYM_FUNC_START. - Eric
On 5/5/2023 5:01 PM, Eric Biggers wrote: > On Mon, Apr 10, 2023 at 03:59:35PM -0700, Chang S. Bae wrote: >> [PATCH v6 11/12] crypto: x86/aes-kl - Support AES algorithm using Key Locker instructions > > Thanks for dropping the unnecessary modes of operation (CBC, ECB, CTR). It > simplified the patchset quite a bit! Yeah. But, there are more things to go away here as you pointed out here. I thought some generic establishment (patch10) then introduce some mode-specific code (patch11). Considerably, this incremental change was expected to help reviewers. Now I realize this introduces dead code on its hindsight. And this approach seems not helping that much. > Now that only AES-XTS is included, can you please also merge this patch with the > following patch? As-is, this patch is misleading since it doesn't actually add > "support" for anything at all. It actually just adds an unfinished AES-XTS > implementation, which patch 12 then finishes. I assume that the current > patchset organization is left over from when you were trying to support multiple > modes of operation. IMO, it would be much easier to review if patches 11-12 > were merged into one patch that adds the new AES-XTS implementation. Yes, I agree to merge them. >> For disk encryption, storage bandwidth may be the bottleneck >> before encryption bandwidth, but the potential performance difference is >> why AES-KL is advertised as a distinct cipher in /proc/crypto rather than >> the kernel transparently replacing AES-NI usage with AES-KL. > > This does not correctly describe what is going on. Actually, this patchset > registers the AES-KL XTS algorithm with the usual name "xts(aes)". So, it can > potentially be used by any AES-XTS user. It seems that you're actually relying > on the algorithm priorities to prioritize AES-NI, as you've assigned priority > 200 to AES-KL, whereas AES-NI has priority 401. Is that what you intend, and if > so can you please update your explanation to properly explain this? I think AES-KL could be a drop-in replacement for AES-NI IF it performs well -- something on par with AES-NI or better, AND it also supports all the key sizes. But, it can't be the default because that's not the case (at least for now). > The alternative would be to use a unique algorithm name, such as > "keylocker-xts(aes)". I'm not sure that would be better, given that the > algorithms are compatible. However, that actually would seem to match the > explanation you gave more closely, so perhaps that's what you actually intended? I think those AES implementations are functionally the same to end users. The key envelopment is not something user-visible to my understanding. So, I thought that same name makes sense. Now looking at the changelog, this text in the 'performance' section appears to be relevant: > the potential performance difference is why AES-KL is advertised as > a distinct cipher in /proc/crypto rather than the kernel > transparently replacing AES-NI usage with AES-KL. But, this does not seem to be clear enough. Maybe, this exposition story can go under a new section. The changelog is already tl;dr... > I strongly recommend skipping the 32-bit support, as it's unlikely to be worth > the effort. > > And actually, aeskl-intel_glue.c only registers the algorithm for 64-bit > anyway... So the 32-bit code paths are untested dead code. Yeah, will do. Also, I'd make the module available only with X86-64. Then, a bit of comments for the reason should come along. >> +static inline int aeskl_enc(const void *ctx, u8 *out, const u8 *in) >> +{ >> + if (unlikely(keylength(ctx) == AES_KEYSIZE_192)) >> + return -EINVAL; >> + else if (!valid_keylocker()) >> + return -ENODEV; >> + else if (_aeskl_enc(ctx, out, in)) >> + return -EINVAL; >> + else >> + return 0; >> +} >> + >> +static inline int aeskl_dec(const void *ctx, u8 *out, const u8 *in) >> +{ >> + if (unlikely(keylength(ctx) == AES_KEYSIZE_192)) >> + return -EINVAL; >> + else if (!valid_keylocker()) >> + return -ENODEV; >> + else if (_aeskl_dec(ctx, out, in)) >> + return -EINVAL; >> + else >> + return 0; >> +} > > I don't think the above two functions should exist. keylength() and > valid_keylocker() should be checked before calling xts_crypt_common(), and the > assembly functions should just return the correct error code (-EINVAL, > apparently) instead of an unspecified nonzero value. That would leave nothing > for a wrapper function to do. > > Note: if you take this suggestion, the assembly functions would need to use > SYM_TYPED_FUNC_START instead of SYM_FUNC_START. I thought this is something benign to stay here. But, yes, I agree that it is better to simplify the code. Thanks, Chang
> diff --git a/crypto/Kconfig b/crypto/Kconfig ... > +config AS_HAS_KEYLOCKER > + def_bool $(as-instr,encodekey256 %eax$(comma)%eax) > + help > + Supported by binutils >= 2.36 and LLVM integrated assembler >= V12 > + > +config CRYPTO_AES_KL > + tristate "AES cipher algorithms (AES-KL)" > + depends on AS_HAS_KEYLOCKER > + depends on CRYPTO_AES_NI_INTEL > + select X86_KEYLOCKER This material belongs in arch/x86/Kconfig now (which didn't exist when this patch series began).
> -----Original Message----- > From: Elliott, Robert (Servers) <elliott@hpe.com> ... > This material belongs in arch/x86/Kconfig now (which didn't exist when > this patch series began). Sorry, omitted one level: arch/x86/crypto/Kconfig
On 5/8/2023 12:24 PM, Elliott, Robert (Servers) wrote: > ... >> This material belongs in arch/x86/Kconfig now (which didn't exist when >> this patch series began). > > > Sorry, omitted one level: > arch/x86/crypto/Kconfig Thanks for pointing this out! Otherwise, it might happen to be based on the old mess before 28a936ef44e1 ("crypto: Kconfig - move x86 entries to a submenu") Thanks, Chang
On 5/6/23 02:01, Eric Biggers wrote: ... > This does not correctly describe what is going on. Actually, this patchset > registers the AES-KL XTS algorithm with the usual name "xts(aes)". So, it can > potentially be used by any AES-XTS user. It seems that you're actually relying > on the algorithm priorities to prioritize AES-NI, as you've assigned priority > 200 to AES-KL, whereas AES-NI has priority 401. Is that what you intend, and if > so can you please update your explanation to properly explain this? > > The alternative would be to use a unique algorithm name, such as > "keylocker-xts(aes)". I'm not sure that would be better, given that the > algorithms are compatible. However, that actually would seem to match the > explanation you gave more closely, so perhaps that's what you actually intended? Sorry to be late in-game, but as this is intended for LUKS/dm-crypt use, I have a comment here: LUKS2 will no longer support algorithms with the dash in the name for dm-crypt (iow "aes-generic" or something like that will no longer work, and I am afraid you will need aes-kl/keylocker-xts here to force to use AES-KL for dm-crypt). One reason is described in https://gitlab.com/cryptsetup/cryptsetup/-/issues/809, but the major problem is that cryptsetup used CIPHER-MODE-IV syntax (that mixes badly with the dash in algorithm names). And we still rely on internal conversions of common modes to that syntax (currently it worked only by a luck). When I added the "capi" format for dm-crypt for algorithms specification, I made a mistake in that it allows everything, including crypto driver platform-specific names. The intention was to keep the kernel to decide which crypto driver will be used. So, this is perhaps fine for dm-crypt now but LUKS is a portable format, and a generic algorithm (like AES) should not depend on a specific driver or CPU feature. IOW, implement xts(aes) and let the user prioritize the driver (no changes needed for LUKS header then, AES-KL is loaded automatically) or/and create a wrapper (similar to paes, that we already support) that will force to use AES-KL (...but without the dash in the name, please :) If there is a problem with it, please create an issue for cryptsetup upstream to discuss it there (before the kernel part is merged!), so we can find some solution - I would like to avoid incompatibilities later. Thanks, Milan
On 5/8/2023 11:18 AM, Chang S. Bae wrote: > > I thought this is something benign to stay here. But, yes, I agree that > it is better to simplify the code. After staring at this a bit, I realized that at least the feature flag check needs to stay there. This can populate a proper error code when the feature abruptly gets disabled (most likely due to the backup failure). Thanks, Chang
diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile index 8469b0a09cb5..ab9bd2b102dd 100644 --- a/arch/x86/crypto/Makefile +++ b/arch/x86/crypto/Makefile @@ -50,6 +50,9 @@ obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o aes-intel_glue.o aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o aes_ctrby8_avx-x86_64.o +obj-$(CONFIG_CRYPTO_AES_KL) += aeskl-intel.o +aeskl-intel-y := aeskl-intel_asm.o aeskl-intel_glue.o + obj-$(CONFIG_CRYPTO_SHA1_SSSE3) += sha1-ssse3.o sha1-ssse3-y := sha1_avx2_x86_64_asm.o sha1_ssse3_asm.o sha1_ssse3_glue.o sha1-ssse3-$(CONFIG_AS_SHA1_NI) += sha1_ni_asm.o diff --git a/arch/x86/crypto/aeskl-intel_asm.S b/arch/x86/crypto/aeskl-intel_asm.S new file mode 100644 index 000000000000..17c7b306a766 --- /dev/null +++ b/arch/x86/crypto/aeskl-intel_asm.S @@ -0,0 +1,184 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Implement AES algorithm using Intel AES Key Locker instructions. + * + * Most code is based from the AES-NI implementation, aesni-intel_asm.S + * + */ + +#include <linux/linkage.h> +#include <asm/inst.h> +#include <asm/frame.h> +#include "aes-intel_asm.S" + +.text + +#define STATE1 %xmm0 +#define STATE2 %xmm1 +#define STATE3 %xmm2 +#define STATE4 %xmm3 +#define STATE5 %xmm4 +#define STATE6 %xmm5 +#define STATE7 %xmm6 +#define STATE8 %xmm7 +#define STATE STATE1 + +#define IV %xmm9 +#define KEY %xmm10 +#define BSWAP_MASK %xmm11 +#define CTR %xmm12 +#define INC %xmm13 + +#ifdef __x86_64__ +#define IN1 %xmm8 +#define IN2 %xmm9 +#define IN3 %xmm10 +#define IN4 %xmm11 +#define IN5 %xmm12 +#define IN6 %xmm13 +#define IN7 %xmm14 +#define IN8 %xmm15 +#define IN IN1 +#define TCTR_LOW %r11 +#else +#define IN %xmm1 +#endif + +#ifdef __x86_64__ +#define AREG %rax +#define HANDLEP %rdi +#define OUTP %rsi +#define KLEN %r9d +#define INP %rdx +#define T1 %r10 +#define LEN %rcx +#define IVP %r8 +#else +#define AREG %eax +#define HANDLEP %edi +#define OUTP AREG +#define KLEN %ebx +#define INP %edx +#define T1 %ecx +#define LEN %esi +#define IVP %ebp +#endif + +#define UKEYP OUTP +#define GF128MUL_MASK %xmm11 + +/* + * int aeskl_setkey(struct crypto_aes_ctx *ctx, const u8 *in_key, unsigned int key_len) + */ +SYM_FUNC_START(aeskl_setkey) + FRAME_BEGIN +#ifndef __x86_64__ + push HANDLEP + movl (FRAME_OFFSET+8)(%esp), HANDLEP # ctx + movl (FRAME_OFFSET+12)(%esp), UKEYP # in_key + movl (FRAME_OFFSET+16)(%esp), %edx # key_len +#endif + movl %edx, 480(HANDLEP) + movdqu (UKEYP), STATE1 + mov $1, %eax + cmp $16, %dl + je .Lsetkey_128 + + movdqu 0x10(UKEYP), STATE2 + encodekey256 %eax, %eax + movdqu STATE4, 0x30(HANDLEP) + jmp .Lsetkey_end +.Lsetkey_128: + encodekey128 %eax, %eax + +.Lsetkey_end: + movdqu STATE1, (HANDLEP) + movdqu STATE2, 0x10(HANDLEP) + movdqu STATE3, 0x20(HANDLEP) + + xor AREG, AREG +#ifndef __x86_64__ + popl HANDLEP +#endif + FRAME_END + RET +SYM_FUNC_END(aeskl_setkey) + +/* + * int _aeskl_enc(const void *ctx, u8 *dst, const u8 *src) + */ +SYM_FUNC_START(_aeskl_enc) + FRAME_BEGIN +#ifndef __x86_64__ + pushl HANDLEP + pushl KLEN + movl (FRAME_OFFSET+12)(%esp), HANDLEP # ctx + movl (FRAME_OFFSET+16)(%esp), OUTP # dst + movl (FRAME_OFFSET+20)(%esp), INP # src +#endif + movdqu (INP), STATE + movl 480(HANDLEP), KLEN + + cmp $16, KLEN + je .Lenc_128 + aesenc256kl (HANDLEP), STATE + jz .Lenc_err + jmp .Lenc_noerr +.Lenc_128: + aesenc128kl (HANDLEP), STATE + jz .Lenc_err + +.Lenc_noerr: + xor AREG, AREG + jmp .Lenc_end +.Lenc_err: + mov $1, AREG +.Lenc_end: + movdqu STATE, (OUTP) +#ifndef __x86_64__ + popl KLEN + popl HANDLEP +#endif + FRAME_END + RET +SYM_FUNC_END(_aeskl_enc) + +/* + * int _aeskl_dec(const void *ctx, u8 *dst, const u8 *src) + */ +SYM_FUNC_START(_aeskl_dec) + FRAME_BEGIN +#ifndef __x86_64__ + pushl HANDLEP + pushl KLEN + movl (FRAME_OFFSET+12)(%esp), HANDLEP # ctx + movl (FRAME_OFFSET+16)(%esp), OUTP # dst + movl (FRAME_OFFSET+20)(%esp), INP # src +#endif + movdqu (INP), STATE + mov 480(HANDLEP), KLEN + + cmp $16, KLEN + je .Ldec_128 + aesdec256kl (HANDLEP), STATE + jz .Ldec_err + jmp .Ldec_noerr +.Ldec_128: + aesdec128kl (HANDLEP), STATE + jz .Ldec_err + +.Ldec_noerr: + xor AREG, AREG + jmp .Ldec_end +.Ldec_err: + mov $1, AREG +.Ldec_end: + movdqu STATE, (OUTP) +#ifndef __x86_64__ + popl KLEN + popl HANDLEP +#endif + FRAME_END + RET +SYM_FUNC_END(_aeskl_dec) + diff --git a/arch/x86/crypto/aeskl-intel_glue.c b/arch/x86/crypto/aeskl-intel_glue.c new file mode 100644 index 000000000000..0062baaaf7b2 --- /dev/null +++ b/arch/x86/crypto/aeskl-intel_glue.c @@ -0,0 +1,115 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Support for AES Key Locker instructions. This file contains glue + * code and the real AES implementation is in aeskl-intel_asm.S. + * + * Most code is based on AES-NI glue code, aesni-intel_glue.c + */ + +#include <linux/types.h> +#include <linux/module.h> +#include <linux/err.h> +#include <crypto/algapi.h> +#include <crypto/aes.h> +#include <crypto/xts.h> +#include <crypto/internal/skcipher.h> +#include <crypto/internal/simd.h> +#include <asm/simd.h> +#include <asm/cpu_device_id.h> +#include <asm/fpu/api.h> +#include <asm/keylocker.h> + +#include "aes-intel_glue.h" +#include "aesni-intel_glue.h" + +asmlinkage int aeskl_setkey(struct crypto_aes_ctx *ctx, const u8 *in_key, unsigned int key_len); + +asmlinkage int _aeskl_enc(const void *ctx, u8 *out, const u8 *in); +asmlinkage int _aeskl_dec(const void *ctx, u8 *out, const u8 *in); + +static int __maybe_unused aeskl_setkey_common(struct crypto_tfm *tfm, void *raw_ctx, + const u8 *in_key, unsigned int key_len) +{ + struct crypto_aes_ctx *ctx = aes_ctx(raw_ctx); + int err; + + if (!crypto_simd_usable()) + return -EBUSY; + + if (key_len != AES_KEYSIZE_128 && key_len != AES_KEYSIZE_192 && + key_len != AES_KEYSIZE_256) + return -EINVAL; + + kernel_fpu_begin(); + if (unlikely(key_len == AES_KEYSIZE_192)) { + pr_warn_once("AES-KL does not support 192-bit key. Use AES-NI.\n"); + err = aesni_set_key(ctx, in_key, key_len); + } else { + if (!valid_keylocker()) + err = -ENODEV; + else + err = aeskl_setkey(ctx, in_key, key_len); + } + kernel_fpu_end(); + + return err; +} + +static inline u32 keylength(const void *raw_ctx) +{ + struct crypto_aes_ctx *ctx = aes_ctx((void *)raw_ctx); + + return ctx->key_length; +} + +static inline int aeskl_enc(const void *ctx, u8 *out, const u8 *in) +{ + if (unlikely(keylength(ctx) == AES_KEYSIZE_192)) + return -EINVAL; + else if (!valid_keylocker()) + return -ENODEV; + else if (_aeskl_enc(ctx, out, in)) + return -EINVAL; + else + return 0; +} + +static inline int aeskl_dec(const void *ctx, u8 *out, const u8 *in) +{ + if (unlikely(keylength(ctx) == AES_KEYSIZE_192)) + return -EINVAL; + else if (!valid_keylocker()) + return -ENODEV; + else if (_aeskl_dec(ctx, out, in)) + return -EINVAL; + else + return 0; +} + +static int __init aeskl_init(void) +{ + if (!valid_keylocker()) + return -ENODEV; + + /* + * AES-KL itself does not depend on AES-NI. But AES-KL does not + * support 192-bit keys. To make itself AES-compliant, it falls + * back to AES-NI. + */ + if (!boot_cpu_has(X86_FEATURE_AES)) + return -ENODEV; + + return 0; +} + +static void __exit aeskl_exit(void) +{ + return; +} + +late_initcall(aeskl_init); +module_exit(aeskl_exit); + +MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm, AES Key Locker implementation"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("aes"); diff --git a/crypto/Kconfig b/crypto/Kconfig index 9c86f7045157..e432d1ded391 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -573,6 +573,42 @@ config CRYPTO_TWOFISH See https://www.schneier.com/twofish.html for further information. +config AS_HAS_KEYLOCKER + def_bool $(as-instr,encodekey256 %eax$(comma)%eax) + help + Supported by binutils >= 2.36 and LLVM integrated assembler >= V12 + +config CRYPTO_AES_KL + tristate "AES cipher algorithms (AES-KL)" + depends on AS_HAS_KEYLOCKER + depends on CRYPTO_AES_NI_INTEL + select X86_KEYLOCKER + + help + Key Locker provides AES SIMD instructions (AES-KL) for secure + data encryption and decryption. While this new instruction + set is analogous to AES-NI, AES-KL supports to encode an AES + key to an encoded form ('key handle') and uses it to transform + data instead of accessing the AES key. + + The setkey() transforms an AES key to a key handle, then the AES + key is no longer needed for data transformation. A user may + displace their keys from possible exposition. + + This key encryption is done by the CPU-internal wrapping key. + This wrapping key support is provided with X86_KEYLOCKER. + + AES-KL supports 128-/256-bit keys only. While giving a 192-bit + key does not return an error, as the AES-NI driver is chosen to + process it, the claimed security property is not available with + that. + + Bare metal disk encryption is the preferred use case. Key Locker + usage requires explicit opt-in at setup time. So select it if + unsure. + + See Documentation/x86/keylocker.rst for more details. + config CRYPTO_TWOFISH_COMMON tristate help