From patchwork Tue Sep 3 16:43:23 2019
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 11128443
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel
Subject: [PATCH v2 01/17] crypto: arm/aes - fix round key prototypes
Date: Tue, 3 Sep 2019 09:43:23 -0700
Message-Id: <20190903164339.27984-2-ard.biesheuvel@linaro.org>
In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org>
References: <20190903164339.27984-1-ard.biesheuvel@linaro.org>

The AES round keys are arrays of u32s in native endianness now, so update
the function prototypes
accordingly. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 18 ++++----- arch/arm/crypto/aes-ce-glue.c | 40 ++++++++++---------- 2 files changed, 29 insertions(+), 29 deletions(-) diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index 425000232d49..1e0d45183590 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -154,9 +154,9 @@ ENDPROC(aes_decrypt_3x) .endm /* - * aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, + * aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, * int blocks) - * aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, + * aes_ecb_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, * int blocks) */ ENTRY(ce_aes_ecb_encrypt) @@ -212,9 +212,9 @@ ENTRY(ce_aes_ecb_decrypt) ENDPROC(ce_aes_ecb_decrypt) /* - * aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, + * aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, * int blocks, u8 iv[]) - * aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, + * aes_cbc_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, * int blocks, u8 iv[]) */ ENTRY(ce_aes_cbc_encrypt) @@ -272,7 +272,7 @@ ENTRY(ce_aes_cbc_decrypt) ENDPROC(ce_aes_cbc_decrypt) /* - * aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, + * aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, * int blocks, u8 ctr[]) */ ENTRY(ce_aes_ctr_encrypt) @@ -349,10 +349,10 @@ ENTRY(ce_aes_ctr_encrypt) ENDPROC(ce_aes_ctr_encrypt) /* - * aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds, - * int blocks, u8 iv[], u8 const rk2[], int first) - * aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds, - * int blocks, u8 iv[], u8 const rk2[], int first) + * aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds, + * int blocks, u8 iv[], u32 const rk2[], int first) + * aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds, + * int blocks, u8 iv[], u32 const rk2[], int first) */ .macro next_tweak, out, in, const, tmp diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index a7265d0a7063..75d2ff03a63e 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -25,25 +25,25 @@ MODULE_LICENSE("GPL v2"); asmlinkage u32 ce_aes_sub(u32 input); asmlinkage void ce_aes_invert(void *dst, void *src); -asmlinkage void ce_aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], +asmlinkage void ce_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks); -asmlinkage void ce_aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], +asmlinkage void ce_aes_ecb_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks); -asmlinkage void ce_aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], +asmlinkage void ce_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 iv[]); -asmlinkage void ce_aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], +asmlinkage void ce_aes_cbc_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 iv[]); -asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], +asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 ctr[]); -asmlinkage void ce_aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], +asmlinkage void ce_aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds, int blocks, u8 
iv[], - u8 const rk2[], int first); -asmlinkage void ce_aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], + u32 const rk2[], int first); +asmlinkage void ce_aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds, int blocks, u8 iv[], - u8 const rk2[], int first); + u32 const rk2[], int first); struct aes_block { u8 b[AES_BLOCK_SIZE]; @@ -182,7 +182,7 @@ static int ecb_encrypt(struct skcipher_request *req) kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { ce_aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_enc, num_rounds(ctx), blocks); + ctx->key_enc, num_rounds(ctx), blocks); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } kernel_neon_end(); @@ -202,7 +202,7 @@ static int ecb_decrypt(struct skcipher_request *req) kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { ce_aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_dec, num_rounds(ctx), blocks); + ctx->key_dec, num_rounds(ctx), blocks); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } kernel_neon_end(); @@ -222,7 +222,7 @@ static int cbc_encrypt(struct skcipher_request *req) kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { ce_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_enc, num_rounds(ctx), blocks, + ctx->key_enc, num_rounds(ctx), blocks, walk.iv); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } @@ -243,7 +243,7 @@ static int cbc_decrypt(struct skcipher_request *req) kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { ce_aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_dec, num_rounds(ctx), blocks, + ctx->key_dec, num_rounds(ctx), blocks, walk.iv); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } @@ -263,7 +263,7 @@ static int ctr_encrypt(struct skcipher_request *req) kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { ce_aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_enc, num_rounds(ctx), blocks, + ctx->key_enc, num_rounds(ctx), blocks, walk.iv); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } @@ -278,8 +278,8 @@ static int ctr_encrypt(struct skcipher_request *req) */ blocks = -1; - ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, - num_rounds(ctx), blocks, walk.iv); + ce_aes_ctr_encrypt(tail, NULL, ctx->key_enc, num_rounds(ctx), + blocks, walk.iv); crypto_xor_cpy(tdst, tsrc, tail, nbytes); err = skcipher_walk_done(&walk, 0); } @@ -324,8 +324,8 @@ static int xts_encrypt(struct skcipher_request *req) kernel_neon_begin(); for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key1.key_enc, rounds, blocks, - walk.iv, (u8 *)ctx->key2.key_enc, first); + ctx->key1.key_enc, rounds, blocks, walk.iv, + ctx->key2.key_enc, first); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } kernel_neon_end(); @@ -346,8 +346,8 @@ static int xts_decrypt(struct skcipher_request *req) kernel_neon_begin(); for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key1.key_dec, rounds, blocks, - walk.iv, (u8 *)ctx->key2.key_enc, first); + ctx->key1.key_dec, rounds, blocks, walk.iv, + ctx->key2.key_enc, first); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } kernel_neon_end(); From patchwork Tue Sep 3 16:43:24 2019 
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 11128445
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel
Subject: [PATCH v2 02/17] crypto: arm/aes-ce - yield the SIMD unit between scatterwalk steps
Date: Tue, 3 Sep 2019 09:43:24 -0700
Message-Id: <20190903164339.27984-3-ard.biesheuvel@linaro.org>
In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org>
References: <20190903164339.27984-1-ard.biesheuvel@linaro.org>

Reduce the scope of the kernel_neon_begin/end regions so that the SIMD
unit is released (and thus preemption
re-enabled) if the crypto operation cannot be completed in a single scatterwalk step. This avoids scheduling blackouts due to preemption being enabled for unbounded periods, resulting in a more responsive system. After this change, we can also permit the cipher_walk infrastructure to sleep, so set the 'atomic' parameter to skcipher_walk_virt() to false as well. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-glue.c | 47 ++++++++++---------- arch/arm/crypto/aes-neonbs-glue.c | 22 ++++----- 2 files changed, 34 insertions(+), 35 deletions(-) diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index 75d2ff03a63e..486e862ae34a 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -177,15 +177,15 @@ static int ecb_encrypt(struct skcipher_request *req) unsigned int blocks; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); ce_aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key_enc, num_rounds(ctx), blocks); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -197,15 +197,15 @@ static int ecb_decrypt(struct skcipher_request *req) unsigned int blocks; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); ce_aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key_dec, num_rounds(ctx), blocks); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -217,16 +217,16 @@ static int cbc_encrypt(struct skcipher_request *req) unsigned int blocks; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); ce_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key_enc, num_rounds(ctx), blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -238,16 +238,16 @@ static int cbc_decrypt(struct skcipher_request *req) unsigned int blocks; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); ce_aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key_dec, num_rounds(ctx), blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -258,13 +258,14 @@ static int ctr_encrypt(struct skcipher_request *req) struct skcipher_walk walk; int err, blocks; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); ce_aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key_enc, num_rounds(ctx), blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } if (walk.nbytes) { @@ -278,13 +279,13 @@ static int ctr_encrypt(struct skcipher_request *req) */ blocks = -1; + kernel_neon_begin(); ce_aes_ctr_encrypt(tail, NULL, ctx->key_enc, num_rounds(ctx), blocks, walk.iv); + 
kernel_neon_end(); crypto_xor_cpy(tdst, tsrc, tail, nbytes); err = skcipher_walk_done(&walk, 0); } - kernel_neon_end(); - return err; } @@ -319,17 +320,16 @@ static int xts_encrypt(struct skcipher_request *req) struct skcipher_walk walk; unsigned int blocks; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + kernel_neon_begin(); ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key1.key_enc, rounds, blocks, walk.iv, ctx->key2.key_enc, first); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); - return err; } @@ -341,17 +341,16 @@ static int xts_decrypt(struct skcipher_request *req) struct skcipher_walk walk; unsigned int blocks; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + kernel_neon_begin(); ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key1.key_dec, rounds, blocks, walk.iv, ctx->key2.key_enc, first); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); - return err; } diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c index 45cd9818791e..9000d0796d5e 100644 --- a/arch/arm/crypto/aes-neonbs-glue.c +++ b/arch/arm/crypto/aes-neonbs-glue.c @@ -90,9 +90,8 @@ static int __ecb_crypt(struct skcipher_request *req, struct skcipher_walk walk; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; @@ -100,12 +99,13 @@ static int __ecb_crypt(struct skcipher_request *req, blocks = round_down(blocks, walk.stride / AES_BLOCK_SIZE); + kernel_neon_begin(); fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk, ctx->rounds, blocks); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes - blocks * AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -159,9 +159,8 @@ static int cbc_decrypt(struct skcipher_request *req) struct skcipher_walk walk; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; @@ -169,13 +168,14 @@ static int cbc_decrypt(struct skcipher_request *req) blocks = round_down(blocks, walk.stride / AES_BLOCK_SIZE); + kernel_neon_begin(); aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk, ctx->key.rounds, blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes - blocks * AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -223,9 +223,8 @@ static int ctr_encrypt(struct skcipher_request *req) u8 buf[AES_BLOCK_SIZE]; int err; - err = skcipher_walk_virt(&walk, req, true); + err = skcipher_walk_virt(&walk, req, false); - kernel_neon_begin(); while (walk.nbytes > 0) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; u8 *final = (walk.total % AES_BLOCK_SIZE) ? 
buf : NULL; @@ -236,8 +235,10 @@ static int ctr_encrypt(struct skcipher_request *req) final = NULL; } + kernel_neon_begin(); aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk, ctx->rounds, blocks, walk.iv, final); + kernel_neon_end(); if (final) { u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE; @@ -252,7 +253,6 @@ static int ctr_encrypt(struct skcipher_request *req) err = skcipher_walk_done(&walk, walk.nbytes - blocks * AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -329,7 +329,6 @@ static int __xts_crypt(struct skcipher_request *req, crypto_cipher_encrypt_one(ctx->tweak_tfm, walk.iv, walk.iv); - kernel_neon_begin(); while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; @@ -337,12 +336,13 @@ static int __xts_crypt(struct skcipher_request *req, blocks = round_down(blocks, walk.stride / AES_BLOCK_SIZE); + kernel_neon_begin(); fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk, ctx->key.rounds, blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes - blocks * AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } From patchwork Tue Sep 3 16:43:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128447 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9110015E9 for ; Tue, 3 Sep 2019 16:43:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 541CE233A1 for ; Tue, 3 Sep 2019 16:43:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="LruIlOMV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730078AbfICQns (ORCPT ); Tue, 3 Sep 2019 12:43:48 -0400 Received: from mail-pg1-f195.google.com ([209.85.215.195]:44404 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQns (ORCPT ); Tue, 3 Sep 2019 12:43:48 -0400 Received: by mail-pg1-f195.google.com with SMTP id i18so9438249pgl.11 for ; Tue, 03 Sep 2019 09:43:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=u8c7yjt2nGuO09VOjn5Nf28VuKqSVZEi1e4/jRpqUrQ=; b=LruIlOMVykumLE+Kn6REj/Kf6l4eAtnBxB2V4x2mog96vRfdjUQBvf7OHYCdaHyENd kX9IQKJ+BO11dvgIhjAfniDZPoip7cZWQ+fHmPH2U3zXA6FZhv91+VXKDNORi5da2cCQ VtwYMkv1peXXk1AgLqkhEjfR2eYFCVOqIG+L9K1H7m8U4YvsWXUWf6knPZig3SYpHRUc gNxRZQlFH7h6ThgktlnF+0hbBJhItD3dpa58XjB5X0Z77senmUP66bUw+SJFhiqSKv9B 0H+yiNuNBi0W5bYK5lSOPIHCTi1rhUxHgJRKXhVwhtA2Ofh4taOHACKM2nv3ox0518dz 2GnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=u8c7yjt2nGuO09VOjn5Nf28VuKqSVZEi1e4/jRpqUrQ=; b=XUDqgV9R+8Y/Nh9jbokGXeXXC7C/79RW05GXgyRPeRJdYL8AOa1+wk+HOG8osCsdqN Fi10aAfOZiteOgqYwMW/04TdNzj2ilV3Ala8PG5IDPkeiEsVr0jYqGW5FW6VpHf6vvxN ZI8k1nAEGftiZCs1xjUB6UVXA8kW5TxnWeFnB1KoxTiJjtSVi9STCuUCs15WdZ9E8+rH /xO6cn0yxHDMt6myk1OmVcqMmMDzNAu3gC8V/IqYaxdUxgbSs5L3V5UehFfFW2u41JKD mQE34Se8F2yIROO18nf7temUcbmIjt7Y61a0QFUpCA8uk2IBVWnHBIpVA8Dxt/7nIxo2 ASxA== X-Gm-Message-State: APjAAAWk9AEdX9psrQ7Ujmjnb/FiNE0YQE8riniglqx1hDl2Recm0jRo 
OE5kvB6eWUZYceXJX+9N1jyJZUH3hRIpCeeQ X-Google-Smtp-Source: APXvYqysJFRuAdws0qiUqgeqgn7SCWoWwHP3sXrqYYbgv3Hoj0krpMbegYUR4zn6gBFNanysHyKIFA== X-Received: by 2002:a63:1d0e:: with SMTP id d14mr31231670pgd.324.1567529027447; Tue, 03 Sep 2019 09:43:47 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:46 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 03/17] crypto: arm/aes-ce - switch to 4x interleave Date: Tue, 3 Sep 2019 09:43:25 -0700 Message-Id: <20190903164339.27984-4-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org When the ARM AES instruction based crypto driver was introduced, there were no known implementations that could benefit from a 4-way interleave, and so a 3-way interleave was used instead. Since we have sufficient space in the SIMD register file, let's switch to a 4-way interleave to align with the 64-bit driver, and to ensure that we can reach optimum performance when running under emulation on high end 64-bit cores. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 263 +++++++++++--------- 1 file changed, 144 insertions(+), 119 deletions(-) diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index 1e0d45183590..a3ca4ac2d7bb 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -44,46 +44,56 @@ veor q0, q0, \key3 .endm - .macro enc_dround_3x, key1, key2 + .macro enc_dround_4x, key1, key2 enc_round q0, \key1 enc_round q1, \key1 enc_round q2, \key1 + enc_round q3, \key1 enc_round q0, \key2 enc_round q1, \key2 enc_round q2, \key2 + enc_round q3, \key2 .endm - .macro dec_dround_3x, key1, key2 + .macro dec_dround_4x, key1, key2 dec_round q0, \key1 dec_round q1, \key1 dec_round q2, \key1 + dec_round q3, \key1 dec_round q0, \key2 dec_round q1, \key2 dec_round q2, \key2 + dec_round q3, \key2 .endm - .macro enc_fround_3x, key1, key2, key3 + .macro enc_fround_4x, key1, key2, key3 enc_round q0, \key1 enc_round q1, \key1 enc_round q2, \key1 + enc_round q3, \key1 aese.8 q0, \key2 aese.8 q1, \key2 aese.8 q2, \key2 + aese.8 q3, \key2 veor q0, q0, \key3 veor q1, q1, \key3 veor q2, q2, \key3 + veor q3, q3, \key3 .endm - .macro dec_fround_3x, key1, key2, key3 + .macro dec_fround_4x, key1, key2, key3 dec_round q0, \key1 dec_round q1, \key1 dec_round q2, \key1 + dec_round q3, \key1 aesd.8 q0, \key2 aesd.8 q1, \key2 aesd.8 q2, \key2 + aesd.8 q3, \key2 veor q0, q0, \key3 veor q1, q1, \key3 veor q2, q2, \key3 + veor q3, q3, \key3 .endm .macro do_block, dround, fround @@ -114,8 +124,9 @@ * transforms. 
These should preserve all registers except q0 - q2 and ip * Arguments: * q0 : first in/output block - * q1 : second in/output block (_3x version only) - * q2 : third in/output block (_3x version only) + * q1 : second in/output block (_4x version only) + * q2 : third in/output block (_4x version only) + * q3 : fourth in/output block (_4x version only) * q8 : first round key * q9 : secound round key * q14 : final round key @@ -136,16 +147,16 @@ aes_decrypt: ENDPROC(aes_decrypt) .align 6 -aes_encrypt_3x: +aes_encrypt_4x: add ip, r2, #32 @ 3rd round key - do_block enc_dround_3x, enc_fround_3x -ENDPROC(aes_encrypt_3x) + do_block enc_dround_4x, enc_fround_4x +ENDPROC(aes_encrypt_4x) .align 6 -aes_decrypt_3x: +aes_decrypt_4x: add ip, r2, #32 @ 3rd round key - do_block dec_dround_3x, dec_fround_3x -ENDPROC(aes_decrypt_3x) + do_block dec_dround_4x, dec_fround_4x +ENDPROC(aes_decrypt_4x) .macro prepare_key, rk, rounds add ip, \rk, \rounds, lsl #4 @@ -163,17 +174,17 @@ ENTRY(ce_aes_ecb_encrypt) push {r4, lr} ldr r4, [sp, #8] prepare_key r2, r3 -.Lecbencloop3x: - subs r4, r4, #3 +.Lecbencloop4x: + subs r4, r4, #4 bmi .Lecbenc1x vld1.8 {q0-q1}, [r1]! - vld1.8 {q2}, [r1]! - bl aes_encrypt_3x + vld1.8 {q2-q3}, [r1]! + bl aes_encrypt_4x vst1.8 {q0-q1}, [r0]! - vst1.8 {q2}, [r0]! - b .Lecbencloop3x + vst1.8 {q2-q3}, [r0]! + b .Lecbencloop4x .Lecbenc1x: - adds r4, r4, #3 + adds r4, r4, #4 beq .Lecbencout .Lecbencloop: vld1.8 {q0}, [r1]! @@ -189,17 +200,17 @@ ENTRY(ce_aes_ecb_decrypt) push {r4, lr} ldr r4, [sp, #8] prepare_key r2, r3 -.Lecbdecloop3x: - subs r4, r4, #3 +.Lecbdecloop4x: + subs r4, r4, #4 bmi .Lecbdec1x vld1.8 {q0-q1}, [r1]! - vld1.8 {q2}, [r1]! - bl aes_decrypt_3x + vld1.8 {q2-q3}, [r1]! + bl aes_decrypt_4x vst1.8 {q0-q1}, [r0]! - vst1.8 {q2}, [r0]! - b .Lecbdecloop3x + vst1.8 {q2-q3}, [r0]! + b .Lecbdecloop4x .Lecbdec1x: - adds r4, r4, #3 + adds r4, r4, #4 beq .Lecbdecout .Lecbdecloop: vld1.8 {q0}, [r1]! @@ -236,38 +247,40 @@ ENDPROC(ce_aes_cbc_encrypt) ENTRY(ce_aes_cbc_decrypt) push {r4-r6, lr} ldrd r4, r5, [sp, #16] - vld1.8 {q6}, [r5] @ keep iv in q6 + vld1.8 {q15}, [r5] @ keep iv in q15 prepare_key r2, r3 -.Lcbcdecloop3x: - subs r4, r4, #3 +.Lcbcdecloop4x: + subs r4, r4, #4 bmi .Lcbcdec1x vld1.8 {q0-q1}, [r1]! - vld1.8 {q2}, [r1]! - vmov q3, q0 - vmov q4, q1 - vmov q5, q2 - bl aes_decrypt_3x - veor q0, q0, q6 - veor q1, q1, q3 - veor q2, q2, q4 - vmov q6, q5 + vld1.8 {q2-q3}, [r1]! + vmov q4, q0 + vmov q5, q1 + vmov q6, q2 + vmov q7, q3 + bl aes_decrypt_4x + veor q0, q0, q15 + veor q1, q1, q4 + veor q2, q2, q5 + veor q3, q3, q6 + vmov q15, q7 vst1.8 {q0-q1}, [r0]! - vst1.8 {q2}, [r0]! - b .Lcbcdecloop3x + vst1.8 {q2-q3}, [r0]! + b .Lcbcdecloop4x .Lcbcdec1x: - adds r4, r4, #3 + adds r4, r4, #4 beq .Lcbcdecout - vmov q15, q14 @ preserve last round key + vmov q6, q14 @ preserve last round key .Lcbcdecloop: vld1.8 {q0}, [r1]! @ get next ct block veor q14, q15, q6 @ combine prev ct with last key - vmov q6, q0 + vmov q15, q0 bl aes_decrypt vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcdecloop .Lcbcdecout: - vst1.8 {q6}, [r5] @ keep iv in q6 + vst1.8 {q15}, [r5] @ keep iv in q15 pop {r4-r6, pc} ENDPROC(ce_aes_cbc_decrypt) @@ -278,46 +291,52 @@ ENDPROC(ce_aes_cbc_decrypt) ENTRY(ce_aes_ctr_encrypt) push {r4-r6, lr} ldrd r4, r5, [sp, #16] - vld1.8 {q6}, [r5] @ load ctr + vld1.8 {q7}, [r5] @ load ctr prepare_key r2, r3 - vmov r6, s27 @ keep swabbed ctr in r6 + vmov r6, s31 @ keep swabbed ctr in r6 rev r6, r6 cmn r6, r4 @ 32 bit overflow? 
bcs .Lctrloop -.Lctrloop3x: - subs r4, r4, #3 +.Lctrloop4x: + subs r4, r4, #4 bmi .Lctr1x add r6, r6, #1 - vmov q0, q6 - vmov q1, q6 + vmov q0, q7 + vmov q1, q7 rev ip, r6 add r6, r6, #1 - vmov q2, q6 + vmov q2, q7 vmov s7, ip rev ip, r6 add r6, r6, #1 + vmov q3, q7 vmov s11, ip - vld1.8 {q3-q4}, [r1]! - vld1.8 {q5}, [r1]! - bl aes_encrypt_3x - veor q0, q0, q3 - veor q1, q1, q4 - veor q2, q2, q5 + rev ip, r6 + add r6, r6, #1 + vmov s15, ip + vld1.8 {q4-q5}, [r1]! + vld1.8 {q6}, [r1]! + vld1.8 {q15}, [r1]! + bl aes_encrypt_4x + veor q0, q0, q4 + veor q1, q1, q5 + veor q2, q2, q6 + veor q3, q3, q15 rev ip, r6 vst1.8 {q0-q1}, [r0]! - vst1.8 {q2}, [r0]! - vmov s27, ip - b .Lctrloop3x + vst1.8 {q2-q3}, [r0]! + vmov s31, ip + b .Lctrloop4x .Lctr1x: - adds r4, r4, #3 + adds r4, r4, #4 beq .Lctrout .Lctrloop: - vmov q0, q6 + vmov q0, q7 bl aes_encrypt adds r6, r6, #1 @ increment BE ctr rev ip, r6 - vmov s27, ip + vmov s31, ip bcs .Lctrcarry .Lctrcarrydone: @@ -329,7 +348,7 @@ ENTRY(ce_aes_ctr_encrypt) bne .Lctrloop .Lctrout: - vst1.8 {q6}, [r5] @ return next CTR value + vst1.8 {q7}, [r5] @ return next CTR value pop {r4-r6, pc} .Lctrtailblock: @@ -337,7 +356,7 @@ ENTRY(ce_aes_ctr_encrypt) b .Lctrout .Lctrcarry: - .irp sreg, s26, s25, s24 + .irp sreg, s30, s29, s28 vmov ip, \sreg @ load next word of ctr rev ip, ip @ ... to handle the carry adds ip, ip, #1 @@ -368,8 +387,8 @@ ENDPROC(ce_aes_ctr_encrypt) .quad 1, 0x87 ce_aes_xts_init: - vldr d14, .Lxts_mul_x - vldr d15, .Lxts_mul_x + 8 + vldr d30, .Lxts_mul_x + vldr d31, .Lxts_mul_x + 8 ldrd r4, r5, [sp, #16] @ load args ldr r6, [sp, #28] @@ -390,48 +409,51 @@ ENTRY(ce_aes_xts_encrypt) bl ce_aes_xts_init @ run shared prologue prepare_key r2, r3 - vmov q3, q0 + vmov q4, q0 teq r6, #0 @ start of a block? - bne .Lxtsenc3x + bne .Lxtsenc4x -.Lxtsencloop3x: - next_tweak q3, q3, q7, q6 -.Lxtsenc3x: - subs r4, r4, #3 +.Lxtsencloop4x: + next_tweak q4, q4, q15, q10 +.Lxtsenc4x: + subs r4, r4, #4 bmi .Lxtsenc1x - vld1.8 {q0-q1}, [r1]! @ get 3 pt blocks - vld1.8 {q2}, [r1]! - next_tweak q4, q3, q7, q6 - veor q0, q0, q3 - next_tweak q5, q4, q7, q6 - veor q1, q1, q4 - veor q2, q2, q5 - bl aes_encrypt_3x - veor q0, q0, q3 - veor q1, q1, q4 - veor q2, q2, q5 - vst1.8 {q0-q1}, [r0]! @ write 3 ct blocks - vst1.8 {q2}, [r0]! - vmov q3, q5 + vld1.8 {q0-q1}, [r1]! @ get 4 pt blocks + vld1.8 {q2-q3}, [r1]! + next_tweak q5, q4, q15, q10 + veor q0, q0, q4 + next_tweak q6, q5, q15, q10 + veor q1, q1, q5 + next_tweak q7, q6, q15, q10 + veor q2, q2, q6 + veor q3, q3, q7 + bl aes_encrypt_4x + veor q0, q0, q4 + veor q1, q1, q5 + veor q2, q2, q6 + veor q3, q3, q7 + vst1.8 {q0-q1}, [r0]! @ write 4 ct blocks + vst1.8 {q2-q3}, [r0]! + vmov q4, q7 teq r4, #0 beq .Lxtsencout - b .Lxtsencloop3x + b .Lxtsencloop4x .Lxtsenc1x: - adds r4, r4, #3 + adds r4, r4, #4 beq .Lxtsencout .Lxtsencloop: vld1.8 {q0}, [r1]! - veor q0, q0, q3 + veor q0, q0, q4 bl aes_encrypt - veor q0, q0, q3 + veor q0, q0, q4 vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsencout - next_tweak q3, q3, q7, q6 + next_tweak q4, q4, q15, q6 b .Lxtsencloop .Lxtsencout: - vst1.8 {q3}, [r5] + vst1.8 {q4}, [r5] pop {r4-r6, pc} ENDPROC(ce_aes_xts_encrypt) @@ -441,49 +463,52 @@ ENTRY(ce_aes_xts_decrypt) bl ce_aes_xts_init @ run shared prologue prepare_key r2, r3 - vmov q3, q0 + vmov q4, q0 teq r6, #0 @ start of a block? 
- bne .Lxtsdec3x + bne .Lxtsdec4x -.Lxtsdecloop3x: - next_tweak q3, q3, q7, q6 -.Lxtsdec3x: - subs r4, r4, #3 +.Lxtsdecloop4x: + next_tweak q4, q4, q15, q10 +.Lxtsdec4x: + subs r4, r4, #4 bmi .Lxtsdec1x - vld1.8 {q0-q1}, [r1]! @ get 3 ct blocks - vld1.8 {q2}, [r1]! - next_tweak q4, q3, q7, q6 - veor q0, q0, q3 - next_tweak q5, q4, q7, q6 - veor q1, q1, q4 - veor q2, q2, q5 - bl aes_decrypt_3x - veor q0, q0, q3 - veor q1, q1, q4 - veor q2, q2, q5 - vst1.8 {q0-q1}, [r0]! @ write 3 pt blocks - vst1.8 {q2}, [r0]! - vmov q3, q5 + vld1.8 {q0-q1}, [r1]! @ get 4 ct blocks + vld1.8 {q2-q3}, [r1]! + next_tweak q5, q4, q15, q10 + veor q0, q0, q4 + next_tweak q6, q5, q15, q10 + veor q1, q1, q5 + next_tweak q7, q6, q15, q10 + veor q2, q2, q6 + veor q3, q3, q7 + bl aes_decrypt_4x + veor q0, q0, q4 + veor q1, q1, q5 + veor q2, q2, q6 + veor q3, q3, q7 + vst1.8 {q0-q1}, [r0]! @ write 4 pt blocks + vst1.8 {q2-q3}, [r0]! + vmov q4, q7 teq r4, #0 beq .Lxtsdecout - b .Lxtsdecloop3x + b .Lxtsdecloop4x .Lxtsdec1x: - adds r4, r4, #3 + adds r4, r4, #4 beq .Lxtsdecout .Lxtsdecloop: vld1.8 {q0}, [r1]! - veor q0, q0, q3 + veor q0, q0, q4 add ip, r2, #32 @ 3rd round key bl aes_decrypt - veor q0, q0, q3 + veor q0, q0, q4 vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsdecout - next_tweak q3, q3, q7, q6 + next_tweak q4, q4, q15, q6 b .Lxtsdecloop .Lxtsdecout: - vst1.8 {q3}, [r5] + vst1.8 {q4}, [r5] pop {r4-r6, pc} ENDPROC(ce_aes_xts_decrypt) From patchwork Tue Sep 3 16:43:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128449 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B2FF1112C for ; Tue, 3 Sep 2019 16:43:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 94C012339D for ; Tue, 3 Sep 2019 16:43:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Seh3SnO7" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730087AbfICQnu (ORCPT ); Tue, 3 Sep 2019 12:43:50 -0400 Received: from mail-pl1-f195.google.com ([209.85.214.195]:40400 "EHLO mail-pl1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730080AbfICQnt (ORCPT ); Tue, 3 Sep 2019 12:43:49 -0400 Received: by mail-pl1-f195.google.com with SMTP id y10so2212266pll.7 for ; Tue, 03 Sep 2019 09:43:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=aI11eOfvUHWceFRi+8iF4u67GyVnjyFicxzOIJeHmqs=; b=Seh3SnO7oiwdT24476RxHEDvAApEq55bf0yH1t/nmi12tYD+r2wdiyZUIw5lLTgxw5 nImaN4v64wqOcsGJjjuJLbYLCiLLd9G+viS+G8F9QVD/cnn9odNSCzhicQwuRbrvCPqN 5Z+mdw1MUcw2+c4AUQcubUfDMAfv2EqqBkhDzKz3Z/q7/GqmJSB2xtkea7HgtvHx5q2R ftc8C5c6EKIFwqyPL+OUD8frAZmld6nc9L7ojMJ6hd0LO0YxTLdaMnrDEQKslwBwZlmB yMYPQ3prX685iew4wa4l4X4PKRsv9vub9L/JHNKTUHd+boTHyYjIFs4VWbejkNf5xSPB jxEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=aI11eOfvUHWceFRi+8iF4u67GyVnjyFicxzOIJeHmqs=; b=H1md8dSpqbeVObkxBI0HoCyd/quSXKMcLd2mPpKBOLovqyJULNM+DtqGNEYNsaYtYB uFH0ZFfKnd1sCukoHgroXXY6GnZiN5NZmMQQJyL/lR3S1aBDRslB/M3gtla0E6atAj8w 
R9cLwnmXi+wJdw+eSErUyiX3+FBFGik7/cBUljOO7VnD7t/kuZ9AcWh6RjvZxmY9JWMz ewNotF79TENmwz0rz7yWVoxzgQL/WE6SJYWw53A4Ytolg9oMTVQk7fewVWxuiXcShwys UBtRvffq+E85y90vQY4IAMA3ymCXT9V0PeTiHWcxBeGYoIZC5MvzTtU+6jP20mBI605f Za7g== X-Gm-Message-State: APjAAAXQIhfMYcY6/Qh65C7n1wJ7DUouWIRXmJWhwd5COjtVMFlZt3Ek n50VXv3SGKvsu/LwudRj0MdD9ek6H2ifK/pl X-Google-Smtp-Source: APXvYqytI0Rq7eo6Do1aUaFKBp32v0geiDf21mb7gIn2mUKf1M2wCeebK4Ys46jxum8y4Gal33n4NQ== X-Received: by 2002:a17:902:166:: with SMTP id 93mr19789458plb.320.1567529028847; Tue, 03 Sep 2019 09:43:48 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:48 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 04/17] crypto: arm/aes-ce - replace tweak mask literal with composition Date: Tue, 3 Sep 2019 09:43:26 -0700 Message-Id: <20190903164339.27984-5-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Replace the vector load from memory sequence with a simple instruction sequence to compose the tweak vector directly. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index a3ca4ac2d7bb..bb6ec1844370 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -382,13 +382,10 @@ ENDPROC(ce_aes_ctr_encrypt) veor \out, \out, \tmp .endm - .align 3 -.Lxts_mul_x: - .quad 1, 0x87 - ce_aes_xts_init: - vldr d30, .Lxts_mul_x - vldr d31, .Lxts_mul_x + 8 + vmov.i32 d30, #0x87 @ compose tweak mask vector + vmovl.u32 q15, d30 + vshr.u64 d30, d31, #7 ldrd r4, r5, [sp, #16] @ load args ldr r6, [sp, #28] From patchwork Tue Sep 3 16:43:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128451 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0D89A112C for ; Tue, 3 Sep 2019 16:43:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E28EC21881 for ; Tue, 3 Sep 2019 16:43:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="GIRqgq3l" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730080AbfICQnv (ORCPT ); Tue, 3 Sep 2019 12:43:51 -0400 Received: from mail-pg1-f194.google.com ([209.85.215.194]:46148 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQnv (ORCPT ); Tue, 3 Sep 2019 12:43:51 -0400 Received: by mail-pg1-f194.google.com with SMTP id m3so9422597pgv.13 for ; Tue, 03 Sep 2019 09:43:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=vb3yeqqCuXjboCdKiVDdUqWWVOWb7cOxAkDE/WzQmrQ=; 
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel
Subject: [PATCH v2 05/17] crypto: arm/aes-neonbs - replace tweak mask literal with composition
Date: Tue, 3 Sep 2019 09:43:27 -0700
Message-Id: <20190903164339.27984-6-ard.biesheuvel@linaro.org>
In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org>
References: <20190903164339.27984-1-ard.biesheuvel@linaro.org>

Replace the vector load from memory sequence with a simple instruction
sequence to compose the tweak vector directly.
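[Editorial note, not part of the patch: the three instructions used below compose
the same { 1, 0x87 } tweak mask that the removed .Lxts_mul_x literal held, 0x87
being the GF(2^128) reduction constant used when doubling the XTS tweak. The lane
arithmetic can be checked with the following userspace NEON-intrinsics sketch,
which is illustrative only and mirrors the assembly sequence step by step.]

#include <arm_neon.h>
#include <stdio.h>

/*
 * Illustrative sketch: reproduce the assembly sequence
 *   vmov.i32  d30, #0x87
 *   vmovl.u32 q15, d30
 *   vshr.u64  d30, d31, #7
 * with intrinsics and confirm the result is { 1, 0x87 }.
 */
int main(void)
{
	uint32x2_t seed = vdup_n_u32(0x87);        /* both 32-bit lanes = 0x87 */
	uint64x2_t mask = vmovl_u32(seed);         /* widen -> { 0x87, 0x87 } as 64-bit lanes */
	uint64x1_t lo   = vshr_n_u64(vget_high_u64(mask), 7);  /* 0x87 >> 7 = 1 */

	mask = vcombine_u64(lo, vget_high_u64(mask));          /* { 1, 0x87 } */
	printf("%llx %llx\n",
	       (unsigned long long)vgetq_lane_u64(mask, 0),
	       (unsigned long long)vgetq_lane_u64(mask, 1));   /* prints "1 87" */
	return 0;
}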
Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-neonbs-core.S | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/arm/crypto/aes-neonbs-core.S b/arch/arm/crypto/aes-neonbs-core.S index d3eab76b6e1b..bb75918e4984 100644 --- a/arch/arm/crypto/aes-neonbs-core.S +++ b/arch/arm/crypto/aes-neonbs-core.S @@ -887,10 +887,6 @@ ENDPROC(aesbs_ctr_encrypt) veor \out, \out, \tmp .endm - .align 4 -.Lxts_mul_x: - .quad 1, 0x87 - /* * aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, * int blocks, u8 iv[]) @@ -899,7 +895,9 @@ ENDPROC(aesbs_ctr_encrypt) */ __xts_prepare8: vld1.8 {q14}, [r7] // load iv - __ldr q15, .Lxts_mul_x // load tweak mask + vmov.i32 d30, #0x87 // compose tweak mask vector + vmovl.u32 q15, d30 + vshr.u64 d30, d31, #7 vmov q12, q14 __adr ip, 0f From patchwork Tue Sep 3 16:43:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128453 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 72B6D15E9 for ; Tue, 3 Sep 2019 16:43:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 554C92339D for ; Tue, 3 Sep 2019 16:43:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="SrCoLNQG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730108AbfICQnw (ORCPT ); Tue, 3 Sep 2019 12:43:52 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:45437 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQnw (ORCPT ); Tue, 3 Sep 2019 12:43:52 -0400 Received: by mail-pf1-f196.google.com with SMTP id y72so3543392pfb.12 for ; Tue, 03 Sep 2019 09:43:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=u8SkUbJP7rMNf0OKuHQI1V2U/tlAU0T6PVWrsyU6ges=; b=SrCoLNQGWEWY+LiuLHoDSzH1HJI+hFhrVVXSwTV0trNjF96FVGW+hcA3jkLGreECtK 4nJ5yQAwAasjm375uzxkAngqjrDI+9C2Rw6VVUJ4SIBX/PEkPHfz7o/8/z4R0DAtc/0W OsQ5CRsRNR8jJfFe8ynCpi2Nbzx5+BJYSMsybD9jSSwTM7Up0Lhy97w3zuDRXcqlG0RH DsndUrof0s4Z1ff6s7ZcooT9D6CNtKzGXritmi0XU/rdi/lpXFatO3kvD9/lYMRtknNx pPfdZQajGErznYfYRK/KNmS0h/Boe5cgU+xwW91/flT5ug3n5IM1BfH1DlahE+eE3q5l Lu5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=u8SkUbJP7rMNf0OKuHQI1V2U/tlAU0T6PVWrsyU6ges=; b=ErIIVBi0TVri2SSvau8XT4G2FEqfV0eQiB0nqugxoEyz39iLgr+fot3nrbIMAMioOr HvVGRU3oGBntesqzDwcKKxCW9uHXBKRrQcQmjgO/OnjNE7xwTzx6CRAleyyyTibAgVnD DiyvUTDhCkd8AV4grVESjD1qxaifloi+QV4/vCUfhBvOxNUGtF+aMJX1FXxzzdiV+Nfv 7cw+HK2Ysk1z2NYF8fbLaqyWgL/0olufcuDTQfrPcvjDp5qo0qJnzIaL92FUH+e/cyi/ pi1H9KVPsmuCiaV+H+B9faMc95tB1yN3VGBME0cuxaufRKYGj4ApJSz8gCgBgwXC6XCw XdnQ== X-Gm-Message-State: APjAAAVWm+pQjlkM0z9S0qftfGVFs0XO3G4RyZ2OpGsiymtViyo16/mF 49Kx5R5cGwPo8CNA6a7274mXOFk63Ge7P0C0 X-Google-Smtp-Source: APXvYqz6Oh2niv/mU+yBwq33LT9lNf7XnsKUZryseKGXH6Q+UjqEWklspP5zItXHfmunohdGulcqAw== X-Received: by 2002:a63:4823:: with SMTP id v35mr31163401pga.138.1567529031665; Tue, 03 Sep 2019 09:43:51 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with 
ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:50 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 06/17] crypto: arm64/aes-neonbs - replace tweak mask literal with composition Date: Tue, 3 Sep 2019 09:43:28 -0700 Message-Id: <20190903164339.27984-7-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Replace the vector load from memory sequence with a simple instruction sequence to compose the tweak vector directly. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-neonbs-core.S | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/arch/arm64/crypto/aes-neonbs-core.S b/arch/arm64/crypto/aes-neonbs-core.S index cf10ff8878a3..65982039fa36 100644 --- a/arch/arm64/crypto/aes-neonbs-core.S +++ b/arch/arm64/crypto/aes-neonbs-core.S @@ -730,11 +730,6 @@ ENDPROC(aesbs_cbc_decrypt) eor \out\().16b, \out\().16b, \tmp\().16b .endm - .align 4 -.Lxts_mul_x: -CPU_LE( .quad 1, 0x87 ) -CPU_BE( .quad 0x87, 1 ) - /* * aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, * int blocks, u8 iv[]) @@ -806,7 +801,9 @@ ENDPROC(__xts_crypt8) mov x23, x4 mov x24, x5 -0: ldr q30, .Lxts_mul_x +0: movi v30.2s, #0x1 + movi v25.2s, #0x87 + uzp1 v30.4s, v30.4s, v25.4s ld1 {v25.16b}, [x24] 99: adr x7, \do8 From patchwork Tue Sep 3 16:43:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128455 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5418A15E9 for ; Tue, 3 Sep 2019 16:43:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2DD412339D for ; Tue, 3 Sep 2019 16:43:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="D3RNs20W" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730115AbfICQny (ORCPT ); Tue, 3 Sep 2019 12:43:54 -0400 Received: from mail-pf1-f195.google.com ([209.85.210.195]:40582 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730103AbfICQny (ORCPT ); Tue, 3 Sep 2019 12:43:54 -0400 Received: by mail-pf1-f195.google.com with SMTP id w16so11129957pfn.7 for ; Tue, 03 Sep 2019 09:43:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=LWz788qVRg8VvOU54DLOD0vDhiGV6LnBrRgMkVFGK/8=; b=D3RNs20Wt553q4uqsUfaIXMBAouuA117acC9B1c7MNKBeEkY8xvicBU3+KyEqJhlvk DkflVENLENouzdrln4fZ/gsIQSeU9anVTAsUicLncCwK/SiJIa1XByamr2tS95YWEIWZ EvXSJR6B7O8lRjzBCIwDK7KxdpcnbztC6RFlg3rTVUL7mIA/G8xgRm6/Y1I4/fACyyZz XQLDlhxaFExRyflpLR/pQmDBcbkieHz3YO9hdffL4X3/TvNLMyj22DtvTPc8N05fkeFC 1NgYvv3xRsTD5o/Q4MfCvdI6/aCSFWWimg9qTxZXhNloL97CU98XfMrGMEaiCPfeaNqj aNxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel
Subject: [PATCH v2 07/17] crypto: arm64/aes-neon - limit exposed routines if faster driver is enabled
Date: Tue, 3 Sep 2019 09:43:29 -0700
Message-Id: <20190903164339.27984-8-ard.biesheuvel@linaro.org>
In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org>
References: <20190903164339.27984-1-ard.biesheuvel@linaro.org>

The pure NEON AES implementation predates the bit-slicing one, and is
generally slower, unless the algorithm in question can only execute
sequentially. So advertising the skciphers that the bit-slicing driver
implements as well serves no real purpose, and we can just disable them.
Note that the bit-slicing driver also has a link time dependency on the
pure NEON driver, for CBC encryption and for XTS tweak calculation, so
we still need both drivers on systems that do not implement the Crypto
Extensions.

At the same time, expose those modaliases for the AES instruction based
driver. This is necessary since otherwise, we may end up loading the
wrong driver when any of the skciphers are instantiated before the CPU
capability based module loading has completed.

Finally, add the missing modalias for cts(cbc(aes)) so requests for this
algorithm will autoload the correct module.
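[Editorial note, not part of the patch: the autoloading described above relies on
the crypto API's modalias scheme. MODULE_ALIAS_CRYPTO("cts(cbc(aes))") also
registers a "crypto-cts(cbc(aes))" module alias, and an unsatisfied lookup for
that algorithm name triggers request_module() on it. A minimal sketch of such a
request from another kernel module follows; the function name is invented and the
snippet is illustrative only.]

#include <linux/err.h>
#include <crypto/skcipher.h>

/*
 * Illustrative only: requesting "cts(cbc(aes))" by name.  If no provider
 * is registered yet, the crypto API asks userspace to load a module for
 * the "crypto-cts(cbc(aes))" alias, which is why the glue driver must
 * advertise MODULE_ALIAS_CRYPTO() for every algorithm it should serve.
 */
static struct crypto_skcipher *example_get_cts_aes(void)
{
	struct crypto_skcipher *tfm;

	tfm = crypto_alloc_skcipher("cts(cbc(aes))", 0, 0);
	if (IS_ERR(tfm))
		return NULL;	/* no driver could be found or loaded */

	return tfm;
}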
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-glue.c | 112 +++++++++++--------- 1 file changed, 59 insertions(+), 53 deletions(-) diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index ca0c84d56cba..4154bb93a85b 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -54,15 +54,18 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions"); #define aes_xts_decrypt neon_aes_xts_decrypt #define aes_mac_update neon_aes_mac_update MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 NEON"); +#endif +#if defined(USE_V8_CRYPTO_EXTENSIONS) || !defined(CONFIG_CRYPTO_AES_ARM64_BS) MODULE_ALIAS_CRYPTO("ecb(aes)"); MODULE_ALIAS_CRYPTO("cbc(aes)"); -MODULE_ALIAS_CRYPTO("essiv(cbc(aes),sha256)"); MODULE_ALIAS_CRYPTO("ctr(aes)"); MODULE_ALIAS_CRYPTO("xts(aes)"); +#endif +MODULE_ALIAS_CRYPTO("cts(cbc(aes))"); +MODULE_ALIAS_CRYPTO("essiv(cbc(aes),sha256)"); MODULE_ALIAS_CRYPTO("cmac(aes)"); MODULE_ALIAS_CRYPTO("xcbc(aes)"); MODULE_ALIAS_CRYPTO("cbcmac(aes)"); -#endif MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); @@ -144,8 +147,8 @@ static int skcipher_aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key, return ret; } -static int xts_set_key(struct crypto_skcipher *tfm, const u8 *in_key, - unsigned int key_len) +static int __maybe_unused xts_set_key(struct crypto_skcipher *tfm, + const u8 *in_key, unsigned int key_len) { struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); int ret; @@ -165,8 +168,9 @@ static int xts_set_key(struct crypto_skcipher *tfm, const u8 *in_key, return -EINVAL; } -static int essiv_cbc_set_key(struct crypto_skcipher *tfm, const u8 *in_key, - unsigned int key_len) +static int __maybe_unused essiv_cbc_set_key(struct crypto_skcipher *tfm, + const u8 *in_key, + unsigned int key_len) { struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); SHASH_DESC_ON_STACK(desc, ctx->hash); @@ -190,7 +194,7 @@ static int essiv_cbc_set_key(struct crypto_skcipher *tfm, const u8 *in_key, return -EINVAL; } -static int ecb_encrypt(struct skcipher_request *req) +static int __maybe_unused ecb_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -210,7 +214,7 @@ static int ecb_encrypt(struct skcipher_request *req) return err; } -static int ecb_decrypt(struct skcipher_request *req) +static int __maybe_unused ecb_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -248,7 +252,7 @@ static int cbc_encrypt_walk(struct skcipher_request *req, return err; } -static int cbc_encrypt(struct skcipher_request *req) +static int __maybe_unused cbc_encrypt(struct skcipher_request *req) { struct skcipher_walk walk; int err; @@ -277,7 +281,7 @@ static int cbc_decrypt_walk(struct skcipher_request *req, return err; } -static int cbc_decrypt(struct skcipher_request *req) +static int __maybe_unused cbc_decrypt(struct skcipher_request *req) { struct skcipher_walk walk; int err; @@ -404,7 +408,7 @@ static int cts_cbc_decrypt(struct skcipher_request *req) return skcipher_walk_done(&walk, 0); } -static int essiv_cbc_init_tfm(struct crypto_skcipher *tfm) +static int __maybe_unused essiv_cbc_init_tfm(struct crypto_skcipher *tfm) { struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -415,14 +419,14 @@ static int essiv_cbc_init_tfm(struct crypto_skcipher *tfm) return 0; } -static void essiv_cbc_exit_tfm(struct 
crypto_skcipher *tfm) +static void __maybe_unused essiv_cbc_exit_tfm(struct crypto_skcipher *tfm) { struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); crypto_free_shash(ctx->hash); } -static int essiv_cbc_encrypt(struct skcipher_request *req) +static int __maybe_unused essiv_cbc_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -444,7 +448,7 @@ static int essiv_cbc_encrypt(struct skcipher_request *req) return err ?: cbc_encrypt_walk(req, &walk); } -static int essiv_cbc_decrypt(struct skcipher_request *req) +static int __maybe_unused essiv_cbc_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -520,7 +524,7 @@ static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst) local_irq_restore(flags); } -static int ctr_encrypt_sync(struct skcipher_request *req) +static int __maybe_unused ctr_encrypt_sync(struct skcipher_request *req) { if (!crypto_simd_usable()) return crypto_ctr_encrypt_walk(req, ctr_encrypt_one); @@ -528,7 +532,7 @@ static int ctr_encrypt_sync(struct skcipher_request *req) return ctr_encrypt(req); } -static int xts_encrypt(struct skcipher_request *req) +static int __maybe_unused xts_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -550,7 +554,7 @@ static int xts_encrypt(struct skcipher_request *req) return err; } -static int xts_decrypt(struct skcipher_request *req) +static int __maybe_unused xts_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -573,6 +577,7 @@ static int xts_decrypt(struct skcipher_request *req) } static struct skcipher_alg aes_algs[] = { { +#if defined(USE_V8_CRYPTO_EXTENSIONS) || !defined(CONFIG_CRYPTO_AES_ARM64_BS) .base = { .cra_name = "__ecb(aes)", .cra_driver_name = "__ecb-aes-" MODE, @@ -603,42 +608,6 @@ static struct skcipher_alg aes_algs[] = { { .setkey = skcipher_aes_setkey, .encrypt = cbc_encrypt, .decrypt = cbc_decrypt, -}, { - .base = { - .cra_name = "__cts(cbc(aes))", - .cra_driver_name = "__cts-cbc-aes-" MODE, - .cra_priority = PRIO, - .cra_flags = CRYPTO_ALG_INTERNAL, - .cra_blocksize = AES_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_module = THIS_MODULE, - }, - .min_keysize = AES_MIN_KEY_SIZE, - .max_keysize = AES_MAX_KEY_SIZE, - .ivsize = AES_BLOCK_SIZE, - .walksize = 2 * AES_BLOCK_SIZE, - .setkey = skcipher_aes_setkey, - .encrypt = cts_cbc_encrypt, - .decrypt = cts_cbc_decrypt, - .init = cts_cbc_init_tfm, -}, { - .base = { - .cra_name = "__essiv(cbc(aes),sha256)", - .cra_driver_name = "__essiv-cbc-aes-sha256-" MODE, - .cra_priority = PRIO + 1, - .cra_flags = CRYPTO_ALG_INTERNAL, - .cra_blocksize = AES_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct crypto_aes_essiv_cbc_ctx), - .cra_module = THIS_MODULE, - }, - .min_keysize = AES_MIN_KEY_SIZE, - .max_keysize = AES_MAX_KEY_SIZE, - .ivsize = AES_BLOCK_SIZE, - .setkey = essiv_cbc_set_key, - .encrypt = essiv_cbc_encrypt, - .decrypt = essiv_cbc_decrypt, - .init = essiv_cbc_init_tfm, - .exit = essiv_cbc_exit_tfm, }, { .base = { .cra_name = "__ctr(aes)", @@ -688,6 +657,43 @@ static struct skcipher_alg aes_algs[] = { { .setkey = xts_set_key, .encrypt = xts_encrypt, .decrypt = xts_decrypt, +}, { 
+#endif + .base = { + .cra_name = "__cts(cbc(aes))", + .cra_driver_name = "__cts-cbc-aes-" MODE, + .cra_priority = PRIO, + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_module = THIS_MODULE, + }, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .walksize = 2 * AES_BLOCK_SIZE, + .setkey = skcipher_aes_setkey, + .encrypt = cts_cbc_encrypt, + .decrypt = cts_cbc_decrypt, + .init = cts_cbc_init_tfm, +}, { + .base = { + .cra_name = "__essiv(cbc(aes),sha256)", + .cra_driver_name = "__essiv-cbc-aes-sha256-" MODE, + .cra_priority = PRIO + 1, + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_essiv_cbc_ctx), + .cra_module = THIS_MODULE, + }, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .setkey = essiv_cbc_set_key, + .encrypt = essiv_cbc_encrypt, + .decrypt = essiv_cbc_decrypt, + .init = essiv_cbc_init_tfm, + .exit = essiv_cbc_exit_tfm, } }; static int cbcmac_setkey(struct crypto_shash *tfm, const u8 *in_key, From patchwork Tue Sep 3 16:43:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128459 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A3B7B15E9 for ; Tue, 3 Sep 2019 16:43:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 867A822CED for ; Tue, 3 Sep 2019 16:43:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="VwfQIv3n" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730103AbfICQn4 (ORCPT ); Tue, 3 Sep 2019 12:43:56 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:35233 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQnz (ORCPT ); Tue, 3 Sep 2019 12:43:55 -0400 Received: by mail-pg1-f196.google.com with SMTP id n4so9462501pgv.2 for ; Tue, 03 Sep 2019 09:43:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pASuIHos0NrmENDZ2qsaTQ78g3KIF/aki90Q37gkm4s=; b=VwfQIv3n1+0mlkY5Im5NOS6y6VwYlW+pVg6UmlywFKZtFItpUJeig3EXyYYJ4gYNAf fGyVMFVtTKJs0FtpW71ypblZTRG8FEJKujnOxMjVhzZpWSO6CB3umOfNAj4jiHFat4Kx 6hIBRYPEusst4yKt8vFjlddE97wTbdbaoCbQrjsm9hwX6RD7l+nMOXGJgwZuH0Rz18xP QyyJ33rpF1NNnPaVjal3uowUw7ULy1D5PPW+XIHK7nGVL8kJBbyUgKKkJYBcPX5kclGH SS82P3CeF2QJtWcC3j914qeel4g+GA4zOeq49wwnhn3FvibDM1LNVqdniCr66ha/7BFM FPHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=pASuIHos0NrmENDZ2qsaTQ78g3KIF/aki90Q37gkm4s=; b=g57fd3Bnii2P/5SWPHnejKvOH6oIm+xuDd4VbAzH7acQgwP5q50VYCNA+0aUJODBOh Uyp2Y8acCoRPiKDPY1QT1ko1fPi3zVkZoVnnHBAV9gfrCr0AMwAzel4Q1LCWPKUHkp34 KiIslBaUMUOq89ljHg+O9V5G5WLa1OLfgtYkd1pnMF0W0ZF69KbghcUuodiZjtCfdDs0 6K7Iz66tdvvqlyvy8ZpzA1eqSxNqLkzr6Up8JcnfeBa9yPNe1C9PLp8lxghzYaps49y9 BcVHQDlc1XG9IM7kqVwnkcvRziU+QhPd6b2SnQtHlWH0aR6sZA1lMu0VPthZFh8kXWj2 a+kw== X-Gm-Message-State: APjAAAW5O3N4lZ5UrDqHb0jVdsPiESpD91K1dfw4iOgJZbfm+9AouHcr 
H8rcr4FDrOMqFXPaY4Bf0/Zm3uxZkKv+itfF X-Google-Smtp-Source: APXvYqyraC7iIvrlISC8JuBbbumPQ2tSAMDDqc5yKhWrG8C8w2KoHgEeyb4C/Pc0CN7cnKEMXs4h2Q== X-Received: by 2002:a63:6ec1:: with SMTP id j184mr31086603pgc.232.1567529034711; Tue, 03 Sep 2019 09:43:54 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:53 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 08/17] crypto: skcipher - add the ability to abort a skcipher walk Date: Tue, 3 Sep 2019 09:43:30 -0700 Message-Id: <20190903164339.27984-9-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org After starting a skcipher walk, the only way to ensure that all resources it has tied up are released is to complete it. In some cases, it will be useful to be able to abort a walk cleanly after it has started, so add this ability to the skcipher walk API. Signed-off-by: Ard Biesheuvel --- include/crypto/internal/skcipher.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h index d68faa5759ad..734b6f7081b8 100644 --- a/include/crypto/internal/skcipher.h +++ b/include/crypto/internal/skcipher.h @@ -148,6 +148,11 @@ int skcipher_walk_aead_decrypt(struct skcipher_walk *walk, struct aead_request *req, bool atomic); void skcipher_walk_complete(struct skcipher_walk *walk, int err); +static inline void skcipher_walk_abort(struct skcipher_walk *walk) +{ + skcipher_walk_done(walk, -ECANCELED); +} + static inline void ablkcipher_request_complete(struct ablkcipher_request *req, int err) { From patchwork Tue Sep 3 16:43:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128461 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD85B15E9 for ; Tue, 3 Sep 2019 16:43:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BF58D22CED for ; Tue, 3 Sep 2019 16:43:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Y1N3tMki" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730118AbfICQn5 (ORCPT ); Tue, 3 Sep 2019 12:43:57 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:37925 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQn5 (ORCPT ); Tue, 3 Sep 2019 12:43:57 -0400 Received: by mail-pf1-f196.google.com with SMTP id h195so4576583pfe.5 for ; Tue, 03 Sep 2019 09:43:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=gkbLgubyXxiJMMRMzAmpKCmMYVcmKbpOZX5ob6E8mqs=; b=Y1N3tMkiZScTCyMI8i+q2vzi1sOGzQSoqt/D/zN5IwCEIXpRElOP7pBpBB3JMeiYin 
nnNv8+gg5JvB5xmZ0L2mqda6onQg97YTutIlPOhE5w6OSlbJOggzrDGjdePWjre1SG2k MKrEC/9BMHmgvQMqytJF5EzuZ8qwI69iG7YSoUIlgUdJu0xrKVXIvjoaqqpdpsCHZbox zwgSF9ihqMP8c0ampc2sLHfl/58O3zeMa//4/qJBjeSIoGIxWQ5fIBEbev1fHqnYIFl0 fiKssFaNc6y2spi2cg1Hl9yYgq1T76u8jhjTRnN1XWg+h/ZsqoCfw9S/78qOZpbX56eO 9K2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gkbLgubyXxiJMMRMzAmpKCmMYVcmKbpOZX5ob6E8mqs=; b=NYB3SylRkeA9egQ6tBXCPhwOJBIMIPtvs9OnFmlZ+yv99MmD+8ZCgV1ZmoxN01FiJ7 MbLmNI4iyT9lDqK48u2+l6T1Uoe/qJ8Z0sHYybu/rU+hzRSCQUbHjU9I8XErS8kQ2IIW 1vRSosBlUGjJzr0YDQsrlWd22XLqyNTf5FKo9YkyEUGt9w8y8+9odvh6WWNpJflby29p HANJ0xqw6/4bEBuGWgUSoJtik3PyPXpw5CdD1tZD07qNd+Jt4E0I/N2MBjT2vmFvMHvd HLO7sqt96LBhJiY3SqpwRQWYK88Q8luTYklrKypf08KPJPOx5PzjpfUfQ9U0n69VUOCn 1mOw== X-Gm-Message-State: APjAAAXHssrx5dyd6IY/Ir8Bp5WcjVZXY2K2tnNq2c3QsGNKektqyWkQ O9LdeopJ/k0jotfce3h2pHMkTE8nNqf5daN0 X-Google-Smtp-Source: APXvYqxLN8jPnb4M9qzlorDiYEPu3SkvH5l5etJUT4+eZ1j0a1Bn/o3KiRzyyn5dl+o7/Lpo4ilsfA== X-Received: by 2002:aa7:8559:: with SMTP id y25mr40695542pfn.260.1567529036208; Tue, 03 Sep 2019 09:43:56 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:55 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 09/17] crypto: arm64/aes-cts-cbc-ce - performance tweak Date: Tue, 3 Sep 2019 09:43:31 -0700 Message-Id: <20190903164339.27984-10-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Optimize away one of the tbl instructions in the decryption path, which turns out to be unnecessary. 
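For readers unfamiliar with what the permutes in this routine are doing: it implements CBC ciphertext stealing (the swapped CS3 layout used by the kernel's cts code) for the final two blocks, and the tbl/tbx instructions are byte shuffles against a permute table. Below is a rough, byte-wise C model of that tail handling; the helper names and calling convention are invented for illustration and do not correspond to symbols in the kernel, so treat it as a conceptual sketch rather than the actual code.

/*
 * Conceptual model only: byte-wise CBC ciphertext stealing decryption of
 * the last full block plus a partial tail of `tail` bytes (1..16).
 * Helper names are hypothetical.
 */
#include <stdint.h>
#include <string.h>

#define AES_BLOCK_SIZE 16

/* assumed single-block AES decryption primitive */
void aes_decrypt_block(uint8_t out[16], const uint8_t in[16], const void *key);

static void cbc_cts_decrypt_tail(uint8_t *out, const uint8_t c_last_full[16],
				 const uint8_t *c_partial, int tail,
				 const uint8_t iv[16], const void *key)
{
	uint8_t d[AES_BLOCK_SIZE], prev[AES_BLOCK_SIZE];
	int i;

	/* decrypt the last full ciphertext block first */
	aes_decrypt_block(d, c_last_full, key);

	/* final partial plaintext = d XOR the stolen ciphertext bytes */
	for (i = 0; i < tail; i++)
		out[AES_BLOCK_SIZE + i] = d[i] ^ c_partial[i];

	/* rebuild the penultimate ciphertext block from the stolen bytes */
	memcpy(prev, c_partial, tail);
	memcpy(prev + tail, d + tail, AES_BLOCK_SIZE - tail);

	/* second decryption, then XOR with the IV as in plain CBC */
	aes_decrypt_block(d, prev, key);
	for (i = 0; i < AES_BLOCK_SIZE; i++)
		out[i] = d[i] ^ iv[i];
}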
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-modes.S | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 2879f030a749..38cd5a2091a8 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -293,12 +293,11 @@ AES_ENTRY(aes_cbc_cts_decrypt) ld1 {v5.16b}, [x5] /* get iv */ dec_prepare w3, x2, x6 - tbl v2.16b, {v1.16b}, v4.16b decrypt_block v0, w3, x2, x6, w7 - eor v2.16b, v2.16b, v0.16b + tbl v2.16b, {v0.16b}, v3.16b + eor v2.16b, v2.16b, v1.16b tbx v0.16b, {v1.16b}, v4.16b - tbl v2.16b, {v2.16b}, v3.16b decrypt_block v0, w3, x2, x6, w7 eor v0.16b, v0.16b, v5.16b /* xor with iv */ From patchwork Tue Sep 3 16:43:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128463 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 596B0112C for ; Tue, 3 Sep 2019 16:43:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3B4A422CED for ; Tue, 3 Sep 2019 16:43:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="XNUIp7gh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730120AbfICQn6 (ORCPT ); Tue, 3 Sep 2019 12:43:58 -0400 Received: from mail-pf1-f194.google.com ([209.85.210.194]:42700 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQn6 (ORCPT ); Tue, 3 Sep 2019 12:43:58 -0400 Received: by mail-pf1-f194.google.com with SMTP id w22so2679695pfi.9 for ; Tue, 03 Sep 2019 09:43:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=4rOqC3sDxYEvPZjvQzEBSOK6QwtrxcRA7eQQZQ1weWg=; b=XNUIp7ghp+lOhuBLKnuLx1NkTFRe1oGt4Q58jSOkokHLV5Q2YA7rJwpvGFzgdfaugh Kbk0bVKcfqmC22mpp+Ie7M3ZE2x8sBShhMbFQXUTu/nMbHZGdI7fLJz8C38R4NFloT3W h68neaRl1rjFTgnpK7jSzic1Wz0KF6VS3DjvNZY1GyijSs0GisLf6kSvC4yzIKBd4Jev b063wN7T3n+B7wWvUGttJpAmjcpInqFMLqfqKv3IOh7WVigJK+gSSgm+dpLU1seRqSaI eBkV6uLa4rPnXSjliWM1e+sCruyrDoWJrctMd7FS152QFarxXhwrRD7g512pjNKaDzcZ Wp/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=4rOqC3sDxYEvPZjvQzEBSOK6QwtrxcRA7eQQZQ1weWg=; b=Fgb8Bn7aufpBXd2hMZudMyevLyxy1COu3XKRUR4a3PJdtalDWqdFthoadrC5zx5gTk R7IN+xeFUoMAIdwoTcv9wzrtFyEXYM9/1CaJfJ9CTqPC1W5yIgLG1KjFj12kpGM95dO+ HGsFuFgUyFvHHReyanU05uy/X6/8naLoazzSp1iKJsxwVxj1sTeYuKqWdCDhpN0q88Ug HUfnkMQ5h4m5H1MogshO9pYcCcOOv1Ihmu9wCyDt2dCladZ5kTEnRB5/WZigRblQ3Vqr 6fd3vYvOwqQ83MRZIAJfCA8Fb+EEQOlm2Ksy8lllIRmdFrARAY06SP3GfcRRi31GFwcy FmFQ== X-Gm-Message-State: APjAAAX4TeuH1KYggb5wMaJpXDS47dbvgOw4f2GTZ+RaW/OCbLR9T6RS 09M6m2a4+1KaNMnCc7KGfjcMm4pCPSsixmrz X-Google-Smtp-Source: APXvYqxU/KRk0+60VC4NN8cqMA9dh7QO+0hwWQqErttNWTW5WK311lEqoAxd+fTjAKt/EaLgbnTutQ== X-Received: by 2002:a65:47c1:: with SMTP id f1mr30749430pgs.169.1567529037456; Tue, 03 Sep 2019 09:43:57 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 
bits=256/256); Tue, 03 Sep 2019 09:43:56 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 10/17] crypto: arm64/aes-cts-cbc - move request context data to the stack Date: Tue, 3 Sep 2019 09:43:32 -0700 Message-Id: <20190903164339.27984-11-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Since the CTS-CBC code completes synchronously, there is no point in keeping part of the scratch data it uses in the request context, so move it to the stack instead. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-glue.c | 61 +++++++++----------- 1 file changed, 26 insertions(+), 35 deletions(-) diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index 4154bb93a85b..5ee980c5a5c2 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -107,12 +107,6 @@ asmlinkage void aes_mac_update(u8 const in[], u32 const rk[], int rounds, int blocks, u8 dg[], int enc_before, int enc_after); -struct cts_cbc_req_ctx { - struct scatterlist sg_src[2]; - struct scatterlist sg_dst[2]; - struct skcipher_request subreq; -}; - struct crypto_aes_xts_ctx { struct crypto_aes_ctx key1; struct crypto_aes_ctx __aligned(8) key2; @@ -292,23 +286,20 @@ static int __maybe_unused cbc_decrypt(struct skcipher_request *req) return cbc_decrypt_walk(req, &walk); } -static int cts_cbc_init_tfm(struct crypto_skcipher *tfm) -{ - crypto_skcipher_set_reqsize(tfm, sizeof(struct cts_cbc_req_ctx)); - return 0; -} - static int cts_cbc_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - struct cts_cbc_req_ctx *rctx = skcipher_request_ctx(req); int err, rounds = 6 + ctx->key_length / 4; int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2; struct scatterlist *src = req->src, *dst = req->dst; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; struct skcipher_walk walk; - skcipher_request_set_tfm(&rctx->subreq, tfm); + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, skcipher_request_flags(req), + NULL, NULL); if (req->cryptlen <= AES_BLOCK_SIZE) { if (req->cryptlen < AES_BLOCK_SIZE) @@ -317,31 +308,30 @@ static int cts_cbc_encrypt(struct skcipher_request *req) } if (cbc_blocks > 0) { - skcipher_request_set_crypt(&rctx->subreq, req->src, req->dst, + skcipher_request_set_crypt(&subreq, req->src, req->dst, cbc_blocks * AES_BLOCK_SIZE, req->iv); - err = skcipher_walk_virt(&walk, &rctx->subreq, false) ?: - cbc_encrypt_walk(&rctx->subreq, &walk); + err = skcipher_walk_virt(&walk, &subreq, false) ?: + cbc_encrypt_walk(&subreq, &walk); if (err) return err; if (req->cryptlen == AES_BLOCK_SIZE) return 0; - dst = src = scatterwalk_ffwd(rctx->sg_src, req->src, - rctx->subreq.cryptlen); + dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen); if (req->dst != req->src) - dst = scatterwalk_ffwd(rctx->sg_dst, req->dst, - rctx->subreq.cryptlen); + dst = scatterwalk_ffwd(sg_dst, req->dst, + subreq.cryptlen); } /* handle ciphertext stealing */ - skcipher_request_set_crypt(&rctx->subreq, src, dst, + skcipher_request_set_crypt(&subreq, src, dst, req->cryptlen - cbc_blocks * AES_BLOCK_SIZE, 
req->iv); - err = skcipher_walk_virt(&walk, &rctx->subreq, false); + err = skcipher_walk_virt(&walk, &subreq, false); if (err) return err; @@ -357,13 +347,16 @@ static int cts_cbc_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - struct cts_cbc_req_ctx *rctx = skcipher_request_ctx(req); int err, rounds = 6 + ctx->key_length / 4; int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2; struct scatterlist *src = req->src, *dst = req->dst; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; struct skcipher_walk walk; - skcipher_request_set_tfm(&rctx->subreq, tfm); + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, skcipher_request_flags(req), + NULL, NULL); if (req->cryptlen <= AES_BLOCK_SIZE) { if (req->cryptlen < AES_BLOCK_SIZE) @@ -372,31 +365,30 @@ static int cts_cbc_decrypt(struct skcipher_request *req) } if (cbc_blocks > 0) { - skcipher_request_set_crypt(&rctx->subreq, req->src, req->dst, + skcipher_request_set_crypt(&subreq, req->src, req->dst, cbc_blocks * AES_BLOCK_SIZE, req->iv); - err = skcipher_walk_virt(&walk, &rctx->subreq, false) ?: - cbc_decrypt_walk(&rctx->subreq, &walk); + err = skcipher_walk_virt(&walk, &subreq, false) ?: + cbc_decrypt_walk(&subreq, &walk); if (err) return err; if (req->cryptlen == AES_BLOCK_SIZE) return 0; - dst = src = scatterwalk_ffwd(rctx->sg_src, req->src, - rctx->subreq.cryptlen); + dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen); if (req->dst != req->src) - dst = scatterwalk_ffwd(rctx->sg_dst, req->dst, - rctx->subreq.cryptlen); + dst = scatterwalk_ffwd(sg_dst, req->dst, + subreq.cryptlen); } /* handle ciphertext stealing */ - skcipher_request_set_crypt(&rctx->subreq, src, dst, + skcipher_request_set_crypt(&subreq, src, dst, req->cryptlen - cbc_blocks * AES_BLOCK_SIZE, req->iv); - err = skcipher_walk_virt(&walk, &rctx->subreq, false); + err = skcipher_walk_virt(&walk, &subreq, false); if (err) return err; @@ -675,7 +667,6 @@ static struct skcipher_alg aes_algs[] = { { .setkey = skcipher_aes_setkey, .encrypt = cts_cbc_encrypt, .decrypt = cts_cbc_decrypt, - .init = cts_cbc_init_tfm, }, { .base = { .cra_name = "__essiv(cbc(aes),sha256)", From patchwork Tue Sep 3 16:43:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128465 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0BA5215E9 for ; Tue, 3 Sep 2019 16:44:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D8ED92339D for ; Tue, 3 Sep 2019 16:44:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="bneG9CkJ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730127AbfICQoA (ORCPT ); Tue, 3 Sep 2019 12:44:00 -0400 Received: from mail-pf1-f193.google.com ([209.85.210.193]:34120 "EHLO mail-pf1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQoA (ORCPT ); Tue, 3 Sep 2019 12:44:00 -0400 Received: by mail-pf1-f193.google.com with SMTP id b24so11162799pfp.1 for ; Tue, 03 Sep 2019 09:43:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=lztbyRYUuu5cMp563Zlls3XdyH0KZSCJz0DwSh7fXts=; b=bneG9CkJ0iqWUkiAeHTNWE+/vDlISrWRvFAZrZx5Fv1pVFJXzXOsmkJ3rAbQrELRQ4 rAkIwzTvItPqL3oC6MxxWF8sRS1iBwWe+Uj7zDsh3otzDj5qMPMo7H3wPn4nS8PyqGU0 jN4yF/l6tDRErMWniDUAKCcCukJ2gILHpSNB08AEjGKnGDkhT383W9yEQjcP0SrQfcYF JIhx0tgDqCfAqQk3n3np7V1/4TxQCZg8ptZ8A1zurPGHhlTwnxvxnUy0RJ5RtFzSIOvz H1hBWQytOTv50O0i3GJs/Zi9QV4REnHCCv7xgNoeavEhQQ4TNvicKGL4ZnTkZ+gwEb2d I1yA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=lztbyRYUuu5cMp563Zlls3XdyH0KZSCJz0DwSh7fXts=; b=a7D0fqIsW5uFFAN85de+srqeabVUt7rbLTvyGWIqCuXeH8khjuwLFoBBen89H6rUiN sgFJX5Z0gIm8URwyp/VBFt13KVyg/ZRqmHymdr9Cu7O5PpRtHhYj2HHgxYqpGUOgXJH2 xAARPXwLZ/OrDyhWdITOeVKvEPaF0ONcSRVhAZYw+j/Y756cNMtCJAbpevnpl+paLTU3 vX8SZYNSY4Z5WZTDFa0ndxjRDfdhOorFPTdDxVTHrDV9rn3mBUXUwEWU/d/Zowx/Zn4F P1CTbMS5QKGWb/0jgUymKsxANfgqD5kscX5CjLUYP7kWX/eZsaZZZonOn3DG79DRM/kU fuUA== X-Gm-Message-State: APjAAAUIqqFA6YHaZ7+LYLdUyRpp8qUC+m9Pr61M58N6ul8tRUUjVnZ3 oz9Uy5tfBfGrnmypgeTsGqpq1hcbcnf7ORIP X-Google-Smtp-Source: APXvYqxK1n4syUK28eu9Q0hbMJNAgS6hmXwyZWk74F2t/lf6nKJIWl6H8okkCDazWRvNP3nwDUpzQw== X-Received: by 2002:a17:90a:7f06:: with SMTP id k6mr188039pjl.87.1567529038878; Tue, 03 Sep 2019 09:43:58 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:58 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 11/17] crypto: arm64/aes - implement support for XTS ciphertext stealing Date: Tue, 3 Sep 2019 09:43:33 -0700 Message-Id: <20190903164339.27984-12-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Add the missing support for ciphertext stealing in the implementation of AES-XTS, which is part of the XTS specification but was omitted up until now due to lack of a need for it. The asm helpers are updated so they can deal with any input size, as long as the last full block and the final partial block are presented at the same time. The glue code is updated so that the common case of operating on a sector or page is mostly as before. When CTS is needed, the walk is split up into two pieces, unless the entire input is covered by a single step. 
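In condensed form, the glue-level control flow added by this patch looks roughly as follows (encryption shown; decryption is symmetric with the decryption round keys). Error handling is trimmed, the kernel_neon_begin()/kernel_neon_end() bracketing and the subrequest/tail setup are reduced to comments, and the _sketch suffix is not in the patch, so read this as the shape of the code rather than a drop-in copy.

static int xts_encrypt_sketch(struct skcipher_request *req)
{
	int tail = req->cryptlen % AES_BLOCK_SIZE;
	struct skcipher_request subreq;
	struct skcipher_walk walk;
	int err, first;

	if (req->cryptlen < AES_BLOCK_SIZE)
		return -EINVAL;		/* XTS needs at least one full block */

	err = skcipher_walk_virt(&walk, req, false);

	if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
		/*
		 * CTS is needed and the input spans more than one walk step:
		 * abort this walk and redo it over everything except the
		 * last full block plus the partial tail.
		 */
		skcipher_walk_abort(&walk);
		/* ... point subreq at the first (nr_blocks - 2) blocks ... */
		req = &subreq;
		err = skcipher_walk_virt(&walk, req, false);
	} else {
		tail = 0;		/* a single step covers everything */
	}

	/* bulk of the data: full blocks only, much as before this patch */
	for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) {
		int nbytes = walk.nbytes;

		if (walk.nbytes < walk.total)
			nbytes &= ~(AES_BLOCK_SIZE - 1);
		/* kernel_neon_begin(); aes_xts_encrypt(..., nbytes, ..., first); kernel_neon_end(); */
		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
	}

	if (err || likely(!tail))
		return err;

	/*
	 * CTS tail: scatterwalk_ffwd() to the last full block plus the
	 * partial tail, start one more walk over just that span, and issue
	 * a single aes_xts_encrypt() call - the updated asm performs the
	 * actual ciphertext stealing.
	 */
	return err;
}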
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-glue.c | 126 ++++++++++++++++++-- arch/arm64/crypto/aes-modes.S | 99 ++++++++++++--- 2 files changed, 195 insertions(+), 30 deletions(-) diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index 5ee980c5a5c2..eecb74fd2f61 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -90,10 +90,10 @@ asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 ctr[]); asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], - int rounds, int blocks, u32 const rk2[], u8 iv[], + int rounds, int bytes, u32 const rk2[], u8 iv[], int first); asmlinkage void aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[], - int rounds, int blocks, u32 const rk2[], u8 iv[], + int rounds, int bytes, u32 const rk2[], u8 iv[], int first); asmlinkage void aes_essiv_cbc_encrypt(u8 out[], u8 const in[], u32 const rk1[], @@ -529,21 +529,71 @@ static int __maybe_unused xts_encrypt(struct skcipher_request *req) struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); int err, first, rounds = 6 + ctx->key1.key_length / 4; + int tail = req->cryptlen % AES_BLOCK_SIZE; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct scatterlist *src, *dst; struct skcipher_walk walk; - unsigned int blocks; + + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; err = skcipher_walk_virt(&walk, req, false); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + if (unlikely(tail > 0 && walk.nbytes < walk.total)) { + int xts_blocks = DIV_ROUND_UP(req->cryptlen, + AES_BLOCK_SIZE) - 2; + + skcipher_walk_abort(&walk); + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, + skcipher_request_flags(req), + NULL, NULL); + skcipher_request_set_crypt(&subreq, req->src, req->dst, + xts_blocks * AES_BLOCK_SIZE, + req->iv); + req = &subreq; + err = skcipher_walk_virt(&walk, req, false); + } else { + tail = 0; + } + + for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) { + int nbytes = walk.nbytes; + + if (walk.nbytes < walk.total) + nbytes &= ~(AES_BLOCK_SIZE - 1); + kernel_neon_begin(); aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - ctx->key1.key_enc, rounds, blocks, + ctx->key1.key_enc, rounds, nbytes, ctx->key2.key_enc, walk.iv, first); kernel_neon_end(); - err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } - return err; + if (err || likely(!tail)) + return err; + + dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); + + skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail, + req->iv); + + err = skcipher_walk_virt(&walk, &subreq, false); + if (err) + return err; + + kernel_neon_begin(); + aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->key1.key_enc, rounds, walk.nbytes, + ctx->key2.key_enc, walk.iv, first); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); } static int __maybe_unused xts_decrypt(struct skcipher_request *req) @@ -551,21 +601,72 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req) struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); int err, first, rounds = 6 + ctx->key1.key_length / 4; + int tail = req->cryptlen % AES_BLOCK_SIZE; + 
struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct scatterlist *src, *dst; struct skcipher_walk walk; - unsigned int blocks; + + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; err = skcipher_walk_virt(&walk, req, false); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + if (unlikely(tail > 0 && walk.nbytes < walk.total)) { + int xts_blocks = DIV_ROUND_UP(req->cryptlen, + AES_BLOCK_SIZE) - 2; + + skcipher_walk_abort(&walk); + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, + skcipher_request_flags(req), + NULL, NULL); + skcipher_request_set_crypt(&subreq, req->src, req->dst, + xts_blocks * AES_BLOCK_SIZE, + req->iv); + req = &subreq; + err = skcipher_walk_virt(&walk, req, false); + } else { + tail = 0; + } + + for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) { + int nbytes = walk.nbytes; + + if (walk.nbytes < walk.total) + nbytes &= ~(AES_BLOCK_SIZE - 1); + kernel_neon_begin(); aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - ctx->key1.key_dec, rounds, blocks, + ctx->key1.key_dec, rounds, nbytes, ctx->key2.key_enc, walk.iv, first); kernel_neon_end(); - err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } - return err; + if (err || likely(!tail)) + return err; + + dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); + + skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail, + req->iv); + + err = skcipher_walk_virt(&walk, &subreq, false); + if (err) + return err; + + + kernel_neon_begin(); + aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->key1.key_dec, rounds, walk.nbytes, + ctx->key2.key_enc, walk.iv, first); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); } static struct skcipher_alg aes_algs[] = { { @@ -646,6 +747,7 @@ static struct skcipher_alg aes_algs[] = { { .min_keysize = 2 * AES_MIN_KEY_SIZE, .max_keysize = 2 * AES_MAX_KEY_SIZE, .ivsize = AES_BLOCK_SIZE, + .walksize = 2 * AES_BLOCK_SIZE, .setkey = xts_set_key, .encrypt = xts_encrypt, .decrypt = xts_decrypt, diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 38cd5a2091a8..f2c2ba739f36 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -413,10 +413,10 @@ AES_ENDPROC(aes_ctr_encrypt) /* + * aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds, + * int bytes, u8 const rk2[], u8 iv[], int first) * aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds, - * int blocks, u8 const rk2[], u8 iv[], int first) - * aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds, - * int blocks, u8 const rk2[], u8 iv[], int first) + * int bytes, u8 const rk2[], u8 iv[], int first) */ .macro next_tweak, out, in, tmp @@ -451,7 +451,7 @@ AES_ENTRY(aes_xts_encrypt) .LxtsencloopNx: next_tweak v4, v4, v8 .LxtsencNx: - subs w4, w4, #4 + subs w4, w4, #64 bmi .Lxtsenc1x ld1 {v0.16b-v3.16b}, [x1], #64 /* get 4 pt blocks */ next_tweak v5, v4, v8 @@ -468,33 +468,66 @@ AES_ENTRY(aes_xts_encrypt) eor v2.16b, v2.16b, v6.16b st1 {v0.16b-v3.16b}, [x0], #64 mov v4.16b, v7.16b - cbz w4, .Lxtsencout + cbz w4, .Lxtsencret xts_reload_mask v8 b .LxtsencloopNx .Lxtsenc1x: - adds w4, w4, #4 + adds w4, w4, #64 beq .Lxtsencout + subs w4, w4, #16 + bmi .LxtsencctsNx .Lxtsencloop: - ld1 {v1.16b}, [x1], #16 - eor v0.16b, v1.16b, v4.16b + ld1 {v0.16b}, [x1], #16 
+.Lxtsencctsout: + eor v0.16b, v0.16b, v4.16b encrypt_block v0, w3, x2, x8, w7 eor v0.16b, v0.16b, v4.16b - st1 {v0.16b}, [x0], #16 - subs w4, w4, #1 - beq .Lxtsencout + cbz w4, .Lxtsencout + subs w4, w4, #16 next_tweak v4, v4, v8 + bmi .Lxtsenccts + st1 {v0.16b}, [x0], #16 b .Lxtsencloop .Lxtsencout: + st1 {v0.16b}, [x0] +.Lxtsencret: st1 {v4.16b}, [x6] ldp x29, x30, [sp], #16 ret -AES_ENDPROC(aes_xts_encrypt) +.LxtsencctsNx: + mov v0.16b, v3.16b + sub x0, x0, #16 +.Lxtsenccts: + adr_l x8, .Lcts_permute_table + + add x1, x1, w4, sxtw /* rewind input pointer */ + add w4, w4, #16 /* # bytes in final block */ + add x9, x8, #32 + add x8, x8, x4 + sub x9, x9, x4 + add x4, x0, x4 /* output address of final block */ + + ld1 {v1.16b}, [x1] /* load final block */ + ld1 {v2.16b}, [x8] + ld1 {v3.16b}, [x9] + + tbl v2.16b, {v0.16b}, v2.16b + tbx v0.16b, {v1.16b}, v3.16b + st1 {v2.16b}, [x4] /* overlapping stores */ + mov w4, wzr + b .Lxtsencctsout +AES_ENDPROC(aes_xts_encrypt) AES_ENTRY(aes_xts_decrypt) stp x29, x30, [sp, #-16]! mov x29, sp + /* subtract 16 bytes if we are doing CTS */ + sub w8, w4, #0x10 + tst w4, #0xf + csel w4, w4, w8, eq + ld1 {v4.16b}, [x6] xts_load_mask v8 cbz w7, .Lxtsdecnotfirst @@ -509,7 +542,7 @@ AES_ENTRY(aes_xts_decrypt) .LxtsdecloopNx: next_tweak v4, v4, v8 .LxtsdecNx: - subs w4, w4, #4 + subs w4, w4, #64 bmi .Lxtsdec1x ld1 {v0.16b-v3.16b}, [x1], #64 /* get 4 ct blocks */ next_tweak v5, v4, v8 @@ -530,22 +563,52 @@ AES_ENTRY(aes_xts_decrypt) xts_reload_mask v8 b .LxtsdecloopNx .Lxtsdec1x: - adds w4, w4, #4 + adds w4, w4, #64 beq .Lxtsdecout + subs w4, w4, #16 .Lxtsdecloop: - ld1 {v1.16b}, [x1], #16 - eor v0.16b, v1.16b, v4.16b + ld1 {v0.16b}, [x1], #16 + bmi .Lxtsdeccts +.Lxtsdecctsout: + eor v0.16b, v0.16b, v4.16b decrypt_block v0, w3, x2, x8, w7 eor v0.16b, v0.16b, v4.16b st1 {v0.16b}, [x0], #16 - subs w4, w4, #1 - beq .Lxtsdecout + cbz w4, .Lxtsdecout + subs w4, w4, #16 next_tweak v4, v4, v8 b .Lxtsdecloop .Lxtsdecout: st1 {v4.16b}, [x6] ldp x29, x30, [sp], #16 ret + +.Lxtsdeccts: + adr_l x8, .Lcts_permute_table + + add x1, x1, w4, sxtw /* rewind input pointer */ + add w4, w4, #16 /* # bytes in final block */ + add x9, x8, #32 + add x8, x8, x4 + sub x9, x9, x4 + add x4, x0, x4 /* output address of final block */ + + next_tweak v5, v4, v8 + + ld1 {v1.16b}, [x1] /* load final block */ + ld1 {v2.16b}, [x8] + ld1 {v3.16b}, [x9] + + eor v0.16b, v0.16b, v5.16b + decrypt_block v0, w3, x2, x8, w7 + eor v0.16b, v0.16b, v5.16b + + tbl v2.16b, {v0.16b}, v2.16b + tbx v0.16b, {v1.16b}, v3.16b + + st1 {v2.16b}, [x4] /* overlapping stores */ + mov w4, wzr + b .Lxtsdecctsout AES_ENDPROC(aes_xts_decrypt) /* From patchwork Tue Sep 3 16:43:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128467 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3629F15E9 for ; Tue, 3 Sep 2019 16:44:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0E61C22CED for ; Tue, 3 Sep 2019 16:44:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="W9f1qQsT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730129AbfICQoB (ORCPT ); Tue, 3 Sep 2019 12:44:01 -0400 Received: 
from mail-pg1-f194.google.com ([209.85.215.194]:38814 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQoB (ORCPT ); Tue, 3 Sep 2019 12:44:01 -0400 Received: by mail-pg1-f194.google.com with SMTP id d10so4892269pgo.5 for ; Tue, 03 Sep 2019 09:44:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=inENxCG3FHp1A7FxE00V5o6CRhf+tePDkDNu7R3bkwg=; b=W9f1qQsTkJP3G6z7uAc5Pjz9DpJz5vYtdCzjM+Y78LMW3mIpOtfWNYjJO9NesY9JYX 8WwuPWLTuCBddkeFx9ir06sDmf48slBKsrCpMqB+l27wry2z/oj14RuXOQXhFqsZmnUp Z0PSAqdTudKleBuqbGV2z6L/tzF7B/hhPJwlPi8J+kc1yEy1NYVdjm4uk7s5gb0/gXXD r//7I05uBbAYTH+44aFTUXcmsNznVBd5jT3cf3z4KceOVQlNQ+I6857gs7v05eIOCir2 8Bbq50MzSLMFJRxMnKuywf45heHDvqjYFCY1xo/hEaghhk7svnuSqchdHiPrgvwXI9sv 6n+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=inENxCG3FHp1A7FxE00V5o6CRhf+tePDkDNu7R3bkwg=; b=Suqt6xQ7nt2Vyx58pDX+QKZLnEpZbH3WrsP8z5mXRh/ErOKPaTXi0UsTIJOAZpqWJI q9zN/Ua4YQHmT6A5kXmsW03UX+HLSx1rt7VplLWNsf6sfFUyR4MLrb/moGAX2ZBheMTC hxdIaU/I1y74FHOkoneBTXjqq0FC28JCPIBIEamwXwObFUg9Gp3MPQTyt/x5BdZdPpV0 mOj8D+LMpWQw6NszFWKpEq2FaWppu9c2Q0vSPwf1oX1NqBlEBxYrNE2rInJFVaiJ7dos o4oqm35H8YyaYEy63MCajllAn/N4w5MJihlEeVeHF2Rb5Wf2uCqUn8wT93KieO0+hnmH DNlg== X-Gm-Message-State: APjAAAWhYFo1biIne/1LppLB+QLXHolcmJzy2WLZWiyqsEKHiB2iMlnz gC4WwzgEKcDWIRxxx2D0ukqozQo2Pil96Of6 X-Google-Smtp-Source: APXvYqzavf7+7S8TtPdXisg14IAb4u/loO5kAfm7AF5p13+jiBzkNj7p8WSj8/2zKwlgmqL53liiMg== X-Received: by 2002:a17:90a:25a9:: with SMTP id k38mr261400pje.12.1567529040277; Tue, 03 Sep 2019 09:44:00 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.43.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:43:59 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 12/17] crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS Date: Tue, 3 Sep 2019 09:43:34 -0700 Message-Id: <20190903164339.27984-13-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Update the AES-XTS implementation based on NEON instructions so that it can deal with inputs whose size is not a multiple of the cipher block size. This is part of the original XTS specification, but was never implemented before in the Linux kernel. Since the bit slicing driver is only faster if it can operate on at least 7 blocks of input at the same time, let's reuse the alternate path we are adding for CTS to process any data tail whose size is not a multiple of 128 bytes. 
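Concretely, the walk loop ends up dispatching as sketched below. The fragment is condensed from the new __xts_crypt() loop (setup, error handling and the tail re-walk are omitted), so read it as an outline rather than the literal code: batches of at least seven blocks go to the bit-sliced routine, and whatever remains at the very end, including any ciphertext stealing tail, falls through to the new neon_aes_xts_{en,de}crypt() helpers.

	while (walk.nbytes >= AES_BLOCK_SIZE) {
		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
		u8 *out = walk.dst.virt.addr;
		u8 *in = walk.src.virt.addr;
		int nbytes = walk.nbytes;

		if (walk.nbytes < walk.total || walk.nbytes % AES_BLOCK_SIZE)
			blocks = round_down(blocks,
					    walk.stride / AES_BLOCK_SIZE);

		kernel_neon_begin();
		if (likely(blocks > 6)) {	/* plain NEON is faster otherwise */
			if (first)		/* encrypt the first tweak once */
				neon_aes_ecb_encrypt(walk.iv, walk.iv,
						     ctx->twkey,
						     ctx->key.rounds, 1);
			first = 0;

			fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
			   walk.iv);

			out += blocks * AES_BLOCK_SIZE;
			in += blocks * AES_BLOCK_SIZE;
			nbytes -= blocks * AES_BLOCK_SIZE;
		}

		/* any leftover bytes at the very end are the CTS tail */
		if (walk.nbytes == walk.total && nbytes > 0)
			goto xts_tail;	/* single neon_aes_xts_*() call */

		kernel_neon_end();
		skcipher_walk_done(&walk, nbytes);
	}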
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-ce.S | 3 + arch/arm64/crypto/aes-glue.c | 2 + arch/arm64/crypto/aes-modes.S | 3 + arch/arm64/crypto/aes-neon.S | 5 + arch/arm64/crypto/aes-neonbs-glue.c | 111 +++++++++++++++++--- 5 files changed, 110 insertions(+), 14 deletions(-) diff --git a/arch/arm64/crypto/aes-ce.S b/arch/arm64/crypto/aes-ce.S index 00bd2885feaa..c132c49c89a8 100644 --- a/arch/arm64/crypto/aes-ce.S +++ b/arch/arm64/crypto/aes-ce.S @@ -21,6 +21,9 @@ .macro xts_reload_mask, tmp .endm + .macro xts_cts_skip_tw, reg, lbl + .endm + /* preload all round keys */ .macro load_round_keys, rounds, rk cmp \rounds, #12 diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index eecb74fd2f61..327ac8d1489e 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -1073,5 +1073,7 @@ module_cpu_feature_match(AES, aes_init); module_init(aes_init); EXPORT_SYMBOL(neon_aes_ecb_encrypt); EXPORT_SYMBOL(neon_aes_cbc_encrypt); +EXPORT_SYMBOL(neon_aes_xts_encrypt); +EXPORT_SYMBOL(neon_aes_xts_decrypt); #endif module_exit(aes_exit); diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index f2c2ba739f36..131618389f1f 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -442,6 +442,7 @@ AES_ENTRY(aes_xts_encrypt) cbz w7, .Lxtsencnotfirst enc_prepare w3, x5, x8 + xts_cts_skip_tw w7, .LxtsencNx encrypt_block v4, w3, x5, x8, w7 /* first tweak */ enc_switch_key w3, x2, x8 b .LxtsencNx @@ -530,10 +531,12 @@ AES_ENTRY(aes_xts_decrypt) ld1 {v4.16b}, [x6] xts_load_mask v8 + xts_cts_skip_tw w7, .Lxtsdecskiptw cbz w7, .Lxtsdecnotfirst enc_prepare w3, x5, x8 encrypt_block v4, w3, x5, x8, w7 /* first tweak */ +.Lxtsdecskiptw: dec_prepare w3, x2, x8 b .LxtsdecNx diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S index 0cac5df6c901..22d9b110cf78 100644 --- a/arch/arm64/crypto/aes-neon.S +++ b/arch/arm64/crypto/aes-neon.S @@ -19,6 +19,11 @@ xts_load_mask \tmp .endm + /* special case for the neon-bs driver calling into this one for CTS */ + .macro xts_cts_skip_tw, reg, lbl + tbnz \reg, #1, \lbl + .endm + /* multiply by polynomial 'x' in GF(2^8) */ .macro mul_by_x, out, in, temp, const sshr \temp, \in, #7 diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c index bafd2ebef8f1..ea873b8904c4 100644 --- a/arch/arm64/crypto/aes-neonbs-glue.c +++ b/arch/arm64/crypto/aes-neonbs-glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -45,6 +46,12 @@ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks); asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 iv[]); +asmlinkage void neon_aes_xts_encrypt(u8 out[], u8 const in[], + u32 const rk1[], int rounds, int bytes, + u32 const rk2[], u8 iv[], int first); +asmlinkage void neon_aes_xts_decrypt(u8 out[], u8 const in[], + u32 const rk1[], int rounds, int bytes, + u32 const rk2[], u8 iv[], int first); struct aesbs_ctx { u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32]; @@ -64,6 +71,7 @@ struct aesbs_ctr_ctx { struct aesbs_xts_ctx { struct aesbs_ctx key; u32 twkey[AES_MAX_KEYLENGTH_U32]; + struct crypto_aes_ctx cts; }; static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key, @@ -270,6 +278,10 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key, return err; key_len /= 2; + err = aes_expandkey(&ctx->cts, in_key, key_len); + if (err) + return err; + err = aes_expandkey(&rk, 
in_key + key_len, key_len); if (err) return err; @@ -302,48 +314,119 @@ static int ctr_encrypt_sync(struct skcipher_request *req) return ctr_encrypt(req); } -static int __xts_crypt(struct skcipher_request *req, +static int __xts_crypt(struct skcipher_request *req, bool encrypt, void (*fn)(u8 out[], u8 const in[], u8 const rk[], int rounds, int blocks, u8 iv[])) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm); + int tail = req->cryptlen % (8 * AES_BLOCK_SIZE); + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct scatterlist *src, *dst; struct skcipher_walk walk; - int err; + int nbytes, err; + int first = 1; + u8 *out, *in; + + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; + + /* ensure that the cts tail is covered by a single step */ + if (unlikely(tail > 0 && tail < AES_BLOCK_SIZE)) { + int xts_blocks = DIV_ROUND_UP(req->cryptlen, + AES_BLOCK_SIZE) - 2; + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, + skcipher_request_flags(req), + NULL, NULL); + skcipher_request_set_crypt(&subreq, req->src, req->dst, + xts_blocks * AES_BLOCK_SIZE, + req->iv); + req = &subreq; + } else { + tail = 0; + } err = skcipher_walk_virt(&walk, req, false); if (err) return err; - kernel_neon_begin(); - neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, ctx->key.rounds, 1); - kernel_neon_end(); - while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; - if (walk.nbytes < walk.total) + if (walk.nbytes < walk.total || walk.nbytes % AES_BLOCK_SIZE) blocks = round_down(blocks, walk.stride / AES_BLOCK_SIZE); + out = walk.dst.virt.addr; + in = walk.src.virt.addr; + nbytes = walk.nbytes; + kernel_neon_begin(); - fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk, - ctx->key.rounds, blocks, walk.iv); + if (likely(blocks > 6)) { /* plain NEON is faster otherwise */ + if (first) + neon_aes_ecb_encrypt(walk.iv, walk.iv, + ctx->twkey, + ctx->key.rounds, 1); + first = 0; + + fn(out, in, ctx->key.rk, ctx->key.rounds, blocks, + walk.iv); + + out += blocks * AES_BLOCK_SIZE; + in += blocks * AES_BLOCK_SIZE; + nbytes -= blocks * AES_BLOCK_SIZE; + } + + if (walk.nbytes == walk.total && nbytes > 0) + goto xts_tail; + kernel_neon_end(); - err = skcipher_walk_done(&walk, - walk.nbytes - blocks * AES_BLOCK_SIZE); + skcipher_walk_done(&walk, nbytes); } - return err; + + if (err || likely(!tail)) + return err; + + /* handle ciphertext stealing */ + dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); + + skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail, + req->iv); + + err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + + out = walk.dst.virt.addr; + in = walk.src.virt.addr; + nbytes = walk.nbytes; + + kernel_neon_begin(); +xts_tail: + if (encrypt) + neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds, + nbytes, ctx->twkey, walk.iv, first ?: 2); + else + neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds, + nbytes, ctx->twkey, walk.iv, first ?: 2); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); } static int xts_encrypt(struct skcipher_request *req) { - return __xts_crypt(req, aesbs_xts_encrypt); + return __xts_crypt(req, true, aesbs_xts_encrypt); } static int xts_decrypt(struct skcipher_request *req) { - return __xts_crypt(req, aesbs_xts_decrypt); + return __xts_crypt(req, false, 
aesbs_xts_decrypt); } static struct skcipher_alg aes_algs[] = { { From patchwork Tue Sep 3 16:43:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128469 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF053112C for ; Tue, 3 Sep 2019 16:44:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 884722339D for ; Tue, 3 Sep 2019 16:44:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="qT7FKLfO" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729692AbfICQoD (ORCPT ); Tue, 3 Sep 2019 12:44:03 -0400 Received: from mail-pg1-f193.google.com ([209.85.215.193]:33375 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQoD (ORCPT ); Tue, 3 Sep 2019 12:44:03 -0400 Received: by mail-pg1-f193.google.com with SMTP id n190so9465224pgn.0 for ; Tue, 03 Sep 2019 09:44:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ss0hT+6FcmRtVnmYdaRt5cn9RrJ1dCb8Db0Ibbel/hM=; b=qT7FKLfOPYRHLYOnoAcjsz6UTVdRycrjXPObzeviWggNsvsetYNOqR+isKEFyR1RLS KV3wWpsfu1aWC4ezhzxuE9kTtCY8x9mnx5c2aIk/iLhD2MPWNCCiwJnmgjt4XPN8ymEh JhICdej3GKCMwR4eF8ni6ydFiMR4vQ806K3+Z3+4fzGrcfy/dgRtiyg9WWZfEs2c/y24 6B2r5heX03jtWHdYZn9iAy+veE7OJWJdA4NfvUpIpNoia52IGIYPGDWrRtOIP0drh3f+ lCmsOAvCQgTZVembdOmRpvwTbGsaTlZl4DgNJTcJ1rfyVa1mCqfyzzlFoz/8pTjngGwe vZ1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ss0hT+6FcmRtVnmYdaRt5cn9RrJ1dCb8Db0Ibbel/hM=; b=QtQimRUF421rVFQXhZ6fclFw6sXf+VliWkF1NgGkBUfZAr7wTl3wpad066Q/Z/SJCN HG2qw3B8HLLPGDIAvz4VTcq0OmviN3nA0J7EigVFO3Wpj0iCzFQDClECk7fgh0qCZlne iNO0+lNLWtb7bA/ZCILsX5fnhfbBj5waUPUsHWPgOu2lIGZxlxq3DIc/Sz6KF/Q31iFf x97A4Xxp24UWbkjXbSkZjYwJhEEUhd5XOonZrNuSVWpxpEhZ4s4C4vGegRRQoBsnm3QX gQ1kwMnjmDXJQ3zYw2xy+1wyoE64I2BtA76uYcyhNTp9ZcjZd/dqaPDiVbyBRmp9aQIy jhDQ== X-Gm-Message-State: APjAAAUjcLA/xKQkdPIT4cYhMNjqXrfmoi+qqCnJW0+iniNakbEK8qTU xqt3C0IlMWh/F6SpcHttVbZ4wB3puc35j+LA X-Google-Smtp-Source: APXvYqzcmzskeJt5jrcMuOAWF9kneZ7u7WfiSttZoJiTtnCtA9R4KCkBssoI/wT13jL/Qmbd28U4Mg== X-Received: by 2002:a17:90a:b393:: with SMTP id e19mr230798pjr.118.1567529041664; Tue, 03 Sep 2019 09:44:01 -0700 (PDT) Received: from e111045-lin.nice.arm.com ([104.133.8.102]) by smtp.gmail.com with ESMTPSA id b126sm20311847pfb.110.2019.09.03.09.44.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Sep 2019 09:44:00 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 13/17] crypto: arm/aes-ce - implement ciphertext stealing for XTS Date: Tue, 3 Sep 2019 09:43:35 -0700 Message-Id: <20190903164339.27984-14-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Update the 
AES-XTS implementation based on AES instructions so that it can deal with inputs whose size is not a multiple of the cipher block size. This is part of the original XTS specification, but was never implemented before in the Linux kernel. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 103 ++++++++++++++-- arch/arm/crypto/aes-ce-glue.c | 128 ++++++++++++++++++-- 2 files changed, 208 insertions(+), 23 deletions(-) diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index bb6ec1844370..763e51604ab6 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -369,9 +369,9 @@ ENDPROC(ce_aes_ctr_encrypt) /* * aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds, - * int blocks, u8 iv[], u32 const rk2[], int first) + * int bytes, u8 iv[], u32 const rk2[], int first) * aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds, - * int blocks, u8 iv[], u32 const rk2[], int first) + * int bytes, u8 iv[], u32 const rk2[], int first) */ .macro next_tweak, out, in, const, tmp @@ -414,7 +414,7 @@ ENTRY(ce_aes_xts_encrypt) .Lxtsencloop4x: next_tweak q4, q4, q15, q10 .Lxtsenc4x: - subs r4, r4, #4 + subs r4, r4, #64 bmi .Lxtsenc1x vld1.8 {q0-q1}, [r1]! @ get 4 pt blocks vld1.8 {q2-q3}, [r1]! @@ -434,24 +434,58 @@ ENTRY(ce_aes_xts_encrypt) vst1.8 {q2-q3}, [r0]! vmov q4, q7 teq r4, #0 - beq .Lxtsencout + beq .Lxtsencret b .Lxtsencloop4x .Lxtsenc1x: - adds r4, r4, #4 + adds r4, r4, #64 beq .Lxtsencout + subs r4, r4, #16 + bmi .LxtsencctsNx .Lxtsencloop: vld1.8 {q0}, [r1]! +.Lxtsencctsout: veor q0, q0, q4 bl aes_encrypt veor q0, q0, q4 - vst1.8 {q0}, [r0]! - subs r4, r4, #1 + teq r4, #0 beq .Lxtsencout + subs r4, r4, #16 next_tweak q4, q4, q15, q6 + bmi .Lxtsenccts + vst1.8 {q0}, [r0]! b .Lxtsencloop .Lxtsencout: + vst1.8 {q0}, [r0] +.Lxtsencret: vst1.8 {q4}, [r5] pop {r4-r6, pc} + +.LxtsencctsNx: + vmov q0, q3 + sub r0, r0, #16 +.Lxtsenccts: + movw ip, :lower16:.Lcts_permute_table + movt ip, :upper16:.Lcts_permute_table + + add r1, r1, r4 @ rewind input pointer + add r4, r4, #16 @ # bytes in final block + add lr, ip, #32 + add ip, ip, r4 + sub lr, lr, r4 + add r4, r0, r4 @ output address of final block + + vld1.8 {q1}, [r1] @ load final partial block + vld1.8 {q2}, [ip] + vld1.8 {q3}, [lr] + + vtbl.8 d4, {d0-d1}, d4 + vtbl.8 d5, {d0-d1}, d5 + vtbx.8 d0, {d2-d3}, d6 + vtbx.8 d1, {d2-d3}, d7 + + vst1.8 {q2}, [r4] @ overlapping stores + mov r4, #0 + b .Lxtsencctsout ENDPROC(ce_aes_xts_encrypt) @@ -462,13 +496,17 @@ ENTRY(ce_aes_xts_decrypt) prepare_key r2, r3 vmov q4, q0 + /* subtract 16 bytes if we are doing CTS */ + tst r4, #0xf + subne r4, r4, #0x10 + teq r6, #0 @ start of a block? bne .Lxtsdec4x .Lxtsdecloop4x: next_tweak q4, q4, q15, q10 .Lxtsdec4x: - subs r4, r4, #4 + subs r4, r4, #64 bmi .Lxtsdec1x vld1.8 {q0-q1}, [r1]! @ get 4 ct blocks vld1.8 {q2-q3}, [r1]! @@ -491,22 +529,55 @@ ENTRY(ce_aes_xts_decrypt) beq .Lxtsdecout b .Lxtsdecloop4x .Lxtsdec1x: - adds r4, r4, #4 + adds r4, r4, #64 beq .Lxtsdecout + subs r4, r4, #16 .Lxtsdecloop: vld1.8 {q0}, [r1]! + bmi .Lxtsdeccts +.Lxtsdecctsout: veor q0, q0, q4 - add ip, r2, #32 @ 3rd round key bl aes_decrypt veor q0, q0, q4 vst1.8 {q0}, [r0]! 
- subs r4, r4, #1 + teq r4, #0 beq .Lxtsdecout + subs r4, r4, #16 next_tweak q4, q4, q15, q6 b .Lxtsdecloop .Lxtsdecout: vst1.8 {q4}, [r5] pop {r4-r6, pc} + +.Lxtsdeccts: + movw ip, :lower16:.Lcts_permute_table + movt ip, :upper16:.Lcts_permute_table + + add r1, r1, r4 @ rewind input pointer + add r4, r4, #16 @ # bytes in final block + add lr, ip, #32 + add ip, ip, r4 + sub lr, lr, r4 + add r4, r0, r4 @ output address of final block + + next_tweak q5, q4, q15, q6 + + vld1.8 {q1}, [r1] @ load final partial block + vld1.8 {q2}, [ip] + vld1.8 {q3}, [lr] + + veor q0, q0, q5 + bl aes_decrypt + veor q0, q0, q5 + + vtbl.8 d4, {d0-d1}, d4 + vtbl.8 d5, {d0-d1}, d5 + vtbx.8 d0, {d2-d3}, d6 + vtbx.8 d1, {d2-d3}, d7 + + vst1.8 {q2}, [r4] @ overlapping stores + mov r4, #0 + b .Lxtsdecctsout ENDPROC(ce_aes_xts_decrypt) /* @@ -532,3 +603,13 @@ ENTRY(ce_aes_invert) vst1.32 {q0}, [r0] bx lr ENDPROC(ce_aes_invert) + + .section ".rodata", "a" + .align 6 +.Lcts_permute_table: + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff + .byte 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7 + .byte 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index 486e862ae34a..c215792a2494 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -39,10 +40,10 @@ asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 ctr[]); asmlinkage void ce_aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], - int rounds, int blocks, u8 iv[], + int rounds, int bytes, u8 iv[], u32 const rk2[], int first); asmlinkage void ce_aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[], - int rounds, int blocks, u8 iv[], + int rounds, int bytes, u8 iv[], u32 const rk2[], int first); struct aes_block { @@ -317,20 +318,71 @@ static int xts_encrypt(struct skcipher_request *req) struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); int err, first, rounds = num_rounds(&ctx->key1); + int tail = req->cryptlen % AES_BLOCK_SIZE; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct scatterlist *src, *dst; struct skcipher_walk walk; - unsigned int blocks; + + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; err = skcipher_walk_virt(&walk, req, false); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + if (unlikely(tail > 0 && walk.nbytes < walk.total)) { + int xts_blocks = DIV_ROUND_UP(req->cryptlen, + AES_BLOCK_SIZE) - 2; + + skcipher_walk_abort(&walk); + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, + skcipher_request_flags(req), + NULL, NULL); + skcipher_request_set_crypt(&subreq, req->src, req->dst, + xts_blocks * AES_BLOCK_SIZE, + req->iv); + req = &subreq; + err = skcipher_walk_virt(&walk, req, false); + } else { + tail = 0; + } + + for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) { + int nbytes = walk.nbytes; + + if (walk.nbytes < walk.total) + nbytes &= ~(AES_BLOCK_SIZE - 1); + kernel_neon_begin(); ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - ctx->key1.key_enc, rounds, blocks, walk.iv, + ctx->key1.key_enc, rounds, nbytes, walk.iv, ctx->key2.key_enc, first); kernel_neon_end(); - err = 
skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } - return err; + + if (err || likely(!tail)) + return err; + + dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); + + skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail, + req->iv); + + err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + + kernel_neon_begin(); + ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->key1.key_enc, rounds, walk.nbytes, walk.iv, + ctx->key2.key_enc, first); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); } static int xts_decrypt(struct skcipher_request *req) @@ -338,20 +390,71 @@ static int xts_decrypt(struct skcipher_request *req) struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); int err, first, rounds = num_rounds(&ctx->key1); + int tail = req->cryptlen % AES_BLOCK_SIZE; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct scatterlist *src, *dst; struct skcipher_walk walk; - unsigned int blocks; + + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; err = skcipher_walk_virt(&walk, req, false); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + if (unlikely(tail > 0 && walk.nbytes < walk.total)) { + int xts_blocks = DIV_ROUND_UP(req->cryptlen, + AES_BLOCK_SIZE) - 2; + + skcipher_walk_abort(&walk); + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, + skcipher_request_flags(req), + NULL, NULL); + skcipher_request_set_crypt(&subreq, req->src, req->dst, + xts_blocks * AES_BLOCK_SIZE, + req->iv); + req = &subreq; + err = skcipher_walk_virt(&walk, req, false); + } else { + tail = 0; + } + + for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) { + int nbytes = walk.nbytes; + + if (walk.nbytes < walk.total) + nbytes &= ~(AES_BLOCK_SIZE - 1); + kernel_neon_begin(); ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - ctx->key1.key_dec, rounds, blocks, walk.iv, + ctx->key1.key_dec, rounds, nbytes, walk.iv, ctx->key2.key_enc, first); kernel_neon_end(); - err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } - return err; + + if (err || likely(!tail)) + return err; + + dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); + + skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail, + req->iv); + + err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + + kernel_neon_begin(); + ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->key1.key_dec, rounds, walk.nbytes, walk.iv, + ctx->key2.key_enc, first); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); } static struct skcipher_alg aes_algs[] = { { @@ -426,6 +529,7 @@ static struct skcipher_alg aes_algs[] = { { .min_keysize = 2 * AES_MIN_KEY_SIZE, .max_keysize = 2 * AES_MAX_KEY_SIZE, .ivsize = AES_BLOCK_SIZE, + .walksize = 2 * AES_BLOCK_SIZE, .setkey = xts_set_key, .encrypt = xts_encrypt, .decrypt = xts_decrypt, From patchwork Tue Sep 3 16:43:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128471 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: 
From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 14/17] crypto: arm/aes-neonbs - implement ciphertext stealing for XTS Date: Tue, 3 Sep 2019 09:43:36 -0700 Message-Id: <20190903164339.27984-15-ard.biesheuvel@linaro.org> In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Update the AES-XTS implementation based on NEON instructions so that it can deal with inputs whose size is not a multiple of the cipher block size. This is part of the original XTS specification, but was never implemented before in the Linux kernel.
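
For reference, the data movement that ciphertext stealing adds on the encrypt path can be written as a minimal standalone C sketch. It assumes a single-block helper xts_encrypt_block-style callback that already applies the XEX tweak masking internally; that callback and all names below are illustrative placeholders, not kernel API. The last full plaintext block is encrypted with its own tweak, the partial tail is padded with the bytes stolen from that result, and the padded block is encrypted with the next tweak; on decryption the two tweaks are consumed in the opposite order, which is what the reorder_last_tweak argument added by this patch arranges.

#include <stddef.h>
#include <string.h>

#define BLK 16

/* Placeholder: one XTS block operation, tweak masking done inside (assumption). */
typedef void (*xts_block_fn)(unsigned char out[BLK],
			     const unsigned char in[BLK],
			     const unsigned char tweak[BLK]);

/*
 * src holds the last full plaintext block followed by 'tail' extra bytes
 * (1..15); t_last and t_final are the tweaks for those two positions.
 * dst receives one full ciphertext block followed by 'tail' bytes.
 */
static void xts_cts_encrypt_tail(unsigned char *dst, const unsigned char *src,
				 size_t tail,
				 const unsigned char t_last[BLK],
				 const unsigned char t_final[BLK],
				 xts_block_fn encrypt)
{
	unsigned char cc[BLK], in[BLK];

	encrypt(cc, src, t_last);			/* encrypt the last full block */

	memcpy(in, src + BLK, tail);			/* partial plaintext tail ...  */
	memcpy(in + tail, cc + tail, BLK - tail);	/* ... padded with stolen bytes */

	encrypt(dst, in, t_final);			/* full next-to-last output block */
	memcpy(dst + BLK, cc, tail);			/* truncated final output block  */
}
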
Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-neonbs-core.S | 16 +++-- arch/arm/crypto/aes-neonbs-glue.c | 69 +++++++++++++++++--- 2 files changed, 72 insertions(+), 13 deletions(-) diff --git a/arch/arm/crypto/aes-neonbs-core.S b/arch/arm/crypto/aes-neonbs-core.S index bb75918e4984..cfaed4e67535 100644 --- a/arch/arm/crypto/aes-neonbs-core.S +++ b/arch/arm/crypto/aes-neonbs-core.S @@ -889,9 +889,9 @@ ENDPROC(aesbs_ctr_encrypt) /* * aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, u8 iv[]) + * int blocks, u8 iv[], int reorder_last_tweak) * aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, u8 iv[]) + * int blocks, u8 iv[], int reorder_last_tweak) */ __xts_prepare8: vld1.8 {q14}, [r7] // load iv @@ -944,17 +944,25 @@ __xts_prepare8: vld1.8 {q7}, [r1]! next_tweak q14, q12, q15, q13 - veor q7, q7, q12 +THUMB( itt le ) + W(cmple) r8, #0 + ble 1f +0: veor q7, q7, q12 vst1.8 {q12}, [r4, :128] -0: vst1.8 {q14}, [r7] // store next iv + vst1.8 {q14}, [r7] // store next iv bx lr + +1: vswp q12, q14 + b 0b ENDPROC(__xts_prepare8) .macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7 push {r4-r8, lr} mov r5, sp // preserve sp ldrd r6, r7, [sp, #24] // get blocks and iv args + ldr r8, [sp, #32] // reorder final tweak? + rsb r8, r8, #1 sub ip, sp, #128 // make room for 8x tweak bic ip, ip, #0xf // align sp to 16 bytes mov sp, ip diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c index 9000d0796d5e..e85839a8aaeb 100644 --- a/arch/arm/crypto/aes-neonbs-glue.c +++ b/arch/arm/crypto/aes-neonbs-glue.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -37,9 +38,9 @@ asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, int blocks, u8 ctr[], u8 final[]); asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, u8 iv[]); + int rounds, int blocks, u8 iv[], int); asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, u8 iv[]); + int rounds, int blocks, u8 iv[], int); struct aesbs_ctx { int rounds; @@ -53,6 +54,7 @@ struct aesbs_cbc_ctx { struct aesbs_xts_ctx { struct aesbs_ctx key; + struct crypto_cipher *cts_tfm; struct crypto_cipher *tweak_tfm; }; @@ -291,6 +293,9 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key, return err; key_len /= 2; + err = crypto_cipher_setkey(ctx->cts_tfm, in_key, key_len); + if (err) + return err; err = crypto_cipher_setkey(ctx->tweak_tfm, in_key + key_len, key_len); if (err) return err; @@ -302,7 +307,13 @@ static int xts_init(struct crypto_tfm *tfm) { struct aesbs_xts_ctx *ctx = crypto_tfm_ctx(tfm); + ctx->cts_tfm = crypto_alloc_cipher("aes", 0, 0); + if (IS_ERR(ctx->cts_tfm)) + return PTR_ERR(ctx->cts_tfm); + ctx->tweak_tfm = crypto_alloc_cipher("aes", 0, 0); + if (IS_ERR(ctx->tweak_tfm)) + crypto_free_cipher(ctx->cts_tfm); return PTR_ERR_OR_ZERO(ctx->tweak_tfm); } @@ -312,17 +323,34 @@ static void xts_exit(struct crypto_tfm *tfm) struct aesbs_xts_ctx *ctx = crypto_tfm_ctx(tfm); crypto_free_cipher(ctx->tweak_tfm); + crypto_free_cipher(ctx->cts_tfm); } -static int __xts_crypt(struct skcipher_request *req, +static int __xts_crypt(struct skcipher_request *req, bool encrypt, void (*fn)(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, u8 iv[])) + int rounds, int blocks, u8 iv[], int)) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct aesbs_xts_ctx *ctx = 
crypto_skcipher_ctx(tfm); + int tail = req->cryptlen % AES_BLOCK_SIZE; + struct skcipher_request subreq; + u8 buf[2 * AES_BLOCK_SIZE]; struct skcipher_walk walk; int err; + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; + + if (unlikely(tail)) { + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, + skcipher_request_flags(req), + NULL, NULL); + skcipher_request_set_crypt(&subreq, req->src, req->dst, + req->cryptlen - tail, req->iv); + req = &subreq; + } + err = skcipher_walk_virt(&walk, req, true); if (err) return err; @@ -331,30 +359,53 @@ static int __xts_crypt(struct skcipher_request *req, while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; + int reorder_last_tweak = !encrypt && tail > 0; - if (walk.nbytes < walk.total) + if (walk.nbytes < walk.total) { blocks = round_down(blocks, walk.stride / AES_BLOCK_SIZE); + reorder_last_tweak = 0; + } kernel_neon_begin(); fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk, - ctx->key.rounds, blocks, walk.iv); + ctx->key.rounds, blocks, walk.iv, reorder_last_tweak); kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes - blocks * AES_BLOCK_SIZE); } - return err; + if (err || likely(!tail)) + return err; + + /* handle ciphertext stealing */ + scatterwalk_map_and_copy(buf, req->dst, req->cryptlen - AES_BLOCK_SIZE, + AES_BLOCK_SIZE, 0); + memcpy(buf + AES_BLOCK_SIZE, buf, tail); + scatterwalk_map_and_copy(buf, req->src, req->cryptlen, tail, 0); + + crypto_xor(buf, req->iv, AES_BLOCK_SIZE); + + if (encrypt) + crypto_cipher_encrypt_one(ctx->cts_tfm, buf, buf); + else + crypto_cipher_decrypt_one(ctx->cts_tfm, buf, buf); + + crypto_xor(buf, req->iv, AES_BLOCK_SIZE); + + scatterwalk_map_and_copy(buf, req->dst, req->cryptlen - AES_BLOCK_SIZE, + AES_BLOCK_SIZE + tail, 1); + return 0; } static int xts_encrypt(struct skcipher_request *req) { - return __xts_crypt(req, aesbs_xts_encrypt); + return __xts_crypt(req, true, aesbs_xts_encrypt); } static int xts_decrypt(struct skcipher_request *req) { - return __xts_crypt(req, aesbs_xts_decrypt); + return __xts_crypt(req, false, aesbs_xts_decrypt); } static struct skcipher_alg aes_algs[] = { { From patchwork Tue Sep 3 16:43:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128473 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B015015E9 for ; Tue, 3 Sep 2019 16:44:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 89EB12339D for ; Tue, 3 Sep 2019 16:44:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="M3sINiCN" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730140AbfICQoG (ORCPT ); Tue, 3 Sep 2019 12:44:06 -0400 Received: from mail-pf1-f195.google.com ([209.85.210.195]:34131 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728854AbfICQoF (ORCPT ); Tue, 3 Sep 2019 12:44:05 -0400 Received: by mail-pf1-f195.google.com with SMTP id b24so11162954pfp.1 for ; Tue, 03 Sep 2019 09:44:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; 
From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 15/17] crypto: arm/aes-ce - implement ciphertext stealing for CBC Date: Tue, 3 Sep 2019 09:43:37 -0700 Message-Id: <20190903164339.27984-16-ard.biesheuvel@linaro.org> In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Instead of relying on the CTS template to wrap the accelerated CBC skcipher, implement the ciphertext stealing part directly.
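
The CBC counterpart follows the same pattern. The sketch below is standalone and uses a placeholder single-block AES callback rather than any kernel symbol; it shows only the CS3 block ordering used by cts(cbc(aes)), where the full block produced from the zero-padded tail is emitted before the truncated block stolen from the previous ciphertext.

#include <stddef.h>
#include <string.h>

#define BLK 16

/* Placeholder: one raw AES block encryption (assumption, not kernel API). */
typedef void (*aes_block_fn)(unsigned char out[BLK], const unsigned char in[BLK]);

static void xor_bytes(unsigned char *d, const unsigned char *s, size_t n)
{
	for (size_t i = 0; i < n; i++)
		d[i] ^= s[i];
}

/*
 * src holds the last full plaintext block followed by 'tail' extra bytes
 * (1..15); iv is the CBC chaining value left by the preceding blocks.
 * dst receives one full ciphertext block followed by 'tail' bytes.
 */
static void cbc_cts_encrypt_tail(unsigned char *dst, const unsigned char *src,
				 size_t tail, const unsigned char iv[BLK],
				 aes_block_fn encrypt)
{
	unsigned char clast[BLK], in[BLK];

	memcpy(in, src, BLK);			/* ordinary CBC step for the   */
	xor_bytes(in, iv, BLK);			/* last full plaintext block   */
	encrypt(clast, in);

	memset(in, 0, BLK);			/* zero-pad the partial tail,  */
	memcpy(in, src + BLK, tail);		/* then chain it with clast    */
	xor_bytes(in, clast, BLK);
	encrypt(dst, in);			/* full next-to-last output block */

	memcpy(dst + BLK, clast, tail);		/* stolen head of clast becomes
						   the truncated final block   */
}

On decryption the order reverses: the full next-to-last ciphertext block has to be decrypted first to recover the stolen bytes and the plaintext tail, which is why the ce_aes_cbc_cts_decrypt routine below performs two aes_decrypt calls around the permute-table shuffle.
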
Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 85 +++++++++ arch/arm/crypto/aes-ce-glue.c | 188 ++++++++++++++++++-- 2 files changed, 256 insertions(+), 17 deletions(-) diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index 763e51604ab6..b978cdf133af 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -284,6 +284,91 @@ ENTRY(ce_aes_cbc_decrypt) pop {r4-r6, pc} ENDPROC(ce_aes_cbc_decrypt) + + /* + * ce_aes_cbc_cts_encrypt(u8 out[], u8 const in[], u32 const rk[], + * int rounds, int bytes, u8 const iv[]) + * ce_aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[], + * int rounds, int bytes, u8 const iv[]) + */ + +ENTRY(ce_aes_cbc_cts_encrypt) + push {r4-r6, lr} + ldrd r4, r5, [sp, #16] + + movw ip, :lower16:.Lcts_permute_table + movt ip, :upper16:.Lcts_permute_table + sub r4, r4, #16 + add lr, ip, #32 + add ip, ip, r4 + sub lr, lr, r4 + vld1.8 {q5}, [ip] + vld1.8 {q6}, [lr] + + add ip, r1, r4 + vld1.8 {q0}, [r1] @ overlapping loads + vld1.8 {q3}, [ip] + + vld1.8 {q1}, [r5] @ get iv + prepare_key r2, r3 + + veor q0, q0, q1 @ xor with iv + bl aes_encrypt + + vtbl.8 d4, {d0-d1}, d10 + vtbl.8 d5, {d0-d1}, d11 + vtbl.8 d2, {d6-d7}, d12 + vtbl.8 d3, {d6-d7}, d13 + + veor q0, q0, q1 + bl aes_encrypt + + add r4, r0, r4 + vst1.8 {q2}, [r4] @ overlapping stores + vst1.8 {q0}, [r0] + + pop {r4-r6, pc} +ENDPROC(ce_aes_cbc_cts_encrypt) + +ENTRY(ce_aes_cbc_cts_decrypt) + push {r4-r6, lr} + ldrd r4, r5, [sp, #16] + + movw ip, :lower16:.Lcts_permute_table + movt ip, :upper16:.Lcts_permute_table + sub r4, r4, #16 + add lr, ip, #32 + add ip, ip, r4 + sub lr, lr, r4 + vld1.8 {q5}, [ip] + vld1.8 {q6}, [lr] + + add ip, r1, r4 + vld1.8 {q0}, [r1] @ overlapping loads + vld1.8 {q1}, [ip] + + vld1.8 {q3}, [r5] @ get iv + prepare_key r2, r3 + + bl aes_decrypt + + vtbl.8 d4, {d0-d1}, d10 + vtbl.8 d5, {d0-d1}, d11 + vtbx.8 d0, {d2-d3}, d12 + vtbx.8 d1, {d2-d3}, d13 + + veor q1, q1, q2 + bl aes_decrypt + veor q0, q0, q3 @ xor with iv + + add r4, r0, r4 + vst1.8 {q1}, [r4] @ overlapping stores + vst1.8 {q0}, [r0] + + pop {r4-r6, pc} +ENDPROC(ce_aes_cbc_cts_decrypt) + + /* * aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, * int blocks, u8 ctr[]) diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index c215792a2494..cdb1a07e7ad0 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -35,6 +35,10 @@ asmlinkage void ce_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 iv[]); asmlinkage void ce_aes_cbc_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 iv[]); +asmlinkage void ce_aes_cbc_cts_encrypt(u8 out[], u8 const in[], u32 const rk[], + int rounds, int bytes, u8 const iv[]); +asmlinkage void ce_aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[], + int rounds, int bytes, u8 const iv[]); asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds, int blocks, u8 ctr[]); @@ -210,48 +214,182 @@ static int ecb_decrypt(struct skcipher_request *req) return err; } -static int cbc_encrypt(struct skcipher_request *req) +static int cbc_encrypt_walk(struct skcipher_request *req, + struct skcipher_walk *walk) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - struct skcipher_walk walk; unsigned int blocks; - int err; + int err = 0; - err = skcipher_walk_virt(&walk, req, false); - - while ((blocks = (walk.nbytes / 
AES_BLOCK_SIZE))) { + while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) { kernel_neon_begin(); - ce_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, + ce_aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr, ctx->key_enc, num_rounds(ctx), blocks, - walk.iv); + walk->iv); kernel_neon_end(); - err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE); } return err; } -static int cbc_decrypt(struct skcipher_request *req) +static int cbc_encrypt(struct skcipher_request *req) { - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); struct skcipher_walk walk; - unsigned int blocks; int err; err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + return cbc_encrypt_walk(req, &walk); +} - while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { +static int cbc_decrypt_walk(struct skcipher_request *req, + struct skcipher_walk *walk) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + unsigned int blocks; + int err = 0; + + while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) { kernel_neon_begin(); - ce_aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, + ce_aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr, ctx->key_dec, num_rounds(ctx), blocks, - walk.iv); + walk->iv); kernel_neon_end(); - err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE); } return err; } +static int cbc_decrypt(struct skcipher_request *req) +{ + struct skcipher_walk walk; + int err; + + err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + return cbc_decrypt_walk(req, &walk); +} + +static int cts_cbc_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2; + struct scatterlist *src = req->src, *dst = req->dst; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct skcipher_walk walk; + int err; + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, skcipher_request_flags(req), + NULL, NULL); + + if (req->cryptlen <= AES_BLOCK_SIZE) { + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; + cbc_blocks = 1; + } + + if (cbc_blocks > 0) { + skcipher_request_set_crypt(&subreq, req->src, req->dst, + cbc_blocks * AES_BLOCK_SIZE, + req->iv); + + err = skcipher_walk_virt(&walk, &subreq, false) ?: + cbc_encrypt_walk(&subreq, &walk); + if (err) + return err; + + if (req->cryptlen == AES_BLOCK_SIZE) + return 0; + + dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, + subreq.cryptlen); + } + + /* handle ciphertext stealing */ + skcipher_request_set_crypt(&subreq, src, dst, + req->cryptlen - cbc_blocks * AES_BLOCK_SIZE, + req->iv); + + err = skcipher_walk_virt(&walk, &subreq, false); + if (err) + return err; + + kernel_neon_begin(); + ce_aes_cbc_cts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->key_enc, num_rounds(ctx), walk.nbytes, + walk.iv); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); +} + +static int cts_cbc_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + int 
cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2; + struct scatterlist *src = req->src, *dst = req->dst; + struct scatterlist sg_src[2], sg_dst[2]; + struct skcipher_request subreq; + struct skcipher_walk walk; + int err; + + skcipher_request_set_tfm(&subreq, tfm); + skcipher_request_set_callback(&subreq, skcipher_request_flags(req), + NULL, NULL); + + if (req->cryptlen <= AES_BLOCK_SIZE) { + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; + cbc_blocks = 1; + } + + if (cbc_blocks > 0) { + skcipher_request_set_crypt(&subreq, req->src, req->dst, + cbc_blocks * AES_BLOCK_SIZE, + req->iv); + + err = skcipher_walk_virt(&walk, &subreq, false) ?: + cbc_decrypt_walk(&subreq, &walk); + if (err) + return err; + + if (req->cryptlen == AES_BLOCK_SIZE) + return 0; + + dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, + subreq.cryptlen); + } + + /* handle ciphertext stealing */ + skcipher_request_set_crypt(&subreq, src, dst, + req->cryptlen - cbc_blocks * AES_BLOCK_SIZE, + req->iv); + + err = skcipher_walk_virt(&walk, &subreq, false); + if (err) + return err; + + kernel_neon_begin(); + ce_aes_cbc_cts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->key_dec, num_rounds(ctx), walk.nbytes, + walk.iv); + kernel_neon_end(); + + return skcipher_walk_done(&walk, 0); +} + static int ctr_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); @@ -486,6 +624,22 @@ static struct skcipher_alg aes_algs[] = { { .setkey = ce_aes_setkey, .encrypt = cbc_encrypt, .decrypt = cbc_decrypt, +}, { + .base.cra_name = "__cts(cbc(aes))", + .base.cra_driver_name = "__cts-cbc-aes-ce", + .base.cra_priority = 300, + .base.cra_flags = CRYPTO_ALG_INTERNAL, + .base.cra_blocksize = AES_BLOCK_SIZE, + .base.cra_ctxsize = sizeof(struct crypto_aes_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .walksize = 2 * AES_BLOCK_SIZE, + .setkey = ce_aes_setkey, + .encrypt = cts_cbc_encrypt, + .decrypt = cts_cbc_decrypt, }, { .base.cra_name = "__ctr(aes)", .base.cra_driver_name = "__ctr-aes-ce", From patchwork Tue Sep 3 16:43:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128475 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EC23715E9 for ; Tue, 3 Sep 2019 16:44:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CD57A21881 for ; Tue, 3 Sep 2019 16:44:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="o/cSwbn5" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730134AbfICQoH (ORCPT ); Tue, 3 Sep 2019 12:44:07 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:44556 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730133AbfICQoH (ORCPT ); Tue, 3 Sep 2019 12:44:07 -0400 Received: by mail-pf1-f196.google.com with SMTP id q21so6089929pfn.11 for ; Tue, 03 Sep 2019 09:44:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel Subject: [PATCH v2 16/17] crypto: testmgr - add test vectors for XTS ciphertext stealing Date: Tue, 3 Sep 2019 09:43:38 -0700 Message-Id: <20190903164339.27984-17-ard.biesheuvel@linaro.org> In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> Import the AES-XTS test vectors from IEEE publication P1619/D16 that exercise the ciphertext stealing part of the XTS algorithm, which we haven't supported in the Linux kernel implementation up till now.
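
Once a CTS-capable xts(aes) implementation is registered, odd-length requests like these can also be issued from user space through the AF_ALG socket interface. The sketch below is only illustrative and is not part of the patch: error handling is omitted, and the key, tweak and plaintext are zero placeholders rather than the actual vector data. It shows the shape of a 17-byte request such as the XTS-AES 15 case added below, on a kernel with AF_ALG enabled.

#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

int main(void)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type   = "skcipher",
		.salg_name   = "xts(aes)",
	};
	unsigned char key[32] = { 0 };	/* two AES-128 key halves (placeholder) */
	unsigned char iv[16]  = { 0 };	/* XTS tweak (placeholder) */
	unsigned char pt[17]  = { 0 };	/* one block plus one stolen byte */
	unsigned char ct[17];
	unsigned int op = ALG_OP_ENCRYPT;

	int tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));
	setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
	int opfd = accept(tfmfd, NULL, 0);

	/* control messages carry the operation and the IV/tweak */
	char cbuf[CMSG_SPACE(sizeof(op)) +
		  CMSG_SPACE(offsetof(struct af_alg_iv, iv) + sizeof(iv))] = { 0 };
	struct iovec iov = { .iov_base = pt, .iov_len = sizeof(pt) };
	struct msghdr msg = {
		.msg_control	= cbuf,
		.msg_controllen	= sizeof(cbuf),
		.msg_iov	= &iov,
		.msg_iovlen	= 1,
	};

	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type	 = ALG_SET_OP;
	cmsg->cmsg_len	 = CMSG_LEN(sizeof(op));
	memcpy(CMSG_DATA(cmsg), &op, sizeof(op));

	cmsg = CMSG_NXTHDR(&msg, cmsg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type	 = ALG_SET_IV;
	cmsg->cmsg_len	 = CMSG_LEN(offsetof(struct af_alg_iv, iv) + sizeof(iv));
	struct af_alg_iv *alg_iv = (struct af_alg_iv *)CMSG_DATA(cmsg);
	alg_iv->ivlen = sizeof(iv);
	memcpy(alg_iv->iv, iv, sizeof(iv));

	sendmsg(opfd, &msg, 0);		/* submit the 17-byte request */
	read(opfd, ct, sizeof(ct));	/* ciphertext is also 17 bytes */

	close(opfd);
	close(tfmfd);
	return 0;
}
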
Tested-by: Pascal van Leeuwen Signed-off-by: Ard Biesheuvel --- crypto/testmgr.h | 60 ++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/crypto/testmgr.h b/crypto/testmgr.h index ef7d21f39d4a..86f2db6ae571 100644 --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -15291,6 +15291,66 @@ static const struct cipher_testvec aes_xts_tv_template[] = { "\xc4\xf3\x6f\xfd\xa9\xfc\xea\x70" "\xb9\xc6\xe6\x93\xe1\x48\xc1\x51", .len = 512, + }, { /* XTS-AES 15 */ + .key = "\xff\xfe\xfd\xfc\xfb\xfa\xf9\xf8" + "\xf7\xf6\xf5\xf4\xf3\xf2\xf1\xf0" + "\xbf\xbe\xbd\xbc\xbb\xba\xb9\xb8" + "\xb7\xb6\xb5\xb4\xb3\xb2\xb1\xb0", + .klen = 32, + .iv = "\x9a\x78\x56\x34\x12\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00", + .ptext = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" + "\x10", + .ctext = "\x6c\x16\x25\xdb\x46\x71\x52\x2d" + "\x3d\x75\x99\x60\x1d\xe7\xca\x09" + "\xed", + .len = 17, + }, { /* XTS-AES 16 */ + .key = "\xff\xfe\xfd\xfc\xfb\xfa\xf9\xf8" + "\xf7\xf6\xf5\xf4\xf3\xf2\xf1\xf0" + "\xbf\xbe\xbd\xbc\xbb\xba\xb9\xb8" + "\xb7\xb6\xb5\xb4\xb3\xb2\xb1\xb0", + .klen = 32, + .iv = "\x9a\x78\x56\x34\x12\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00", + .ptext = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" + "\x10\x11", + .ctext = "\xd0\x69\x44\x4b\x7a\x7e\x0c\xab" + "\x09\xe2\x44\x47\xd2\x4d\xeb\x1f" + "\xed\xbf", + .len = 18, + }, { /* XTS-AES 17 */ + .key = "\xff\xfe\xfd\xfc\xfb\xfa\xf9\xf8" + "\xf7\xf6\xf5\xf4\xf3\xf2\xf1\xf0" + "\xbf\xbe\xbd\xbc\xbb\xba\xb9\xb8" + "\xb7\xb6\xb5\xb4\xb3\xb2\xb1\xb0", + .klen = 32, + .iv = "\x9a\x78\x56\x34\x12\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00", + .ptext = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" + "\x10\x11\x12", + .ctext = "\xe5\xdf\x13\x51\xc0\x54\x4b\xa1" + "\x35\x0b\x33\x63\xcd\x8e\xf4\xbe" + "\xed\xbf\x9d", + .len = 19, + }, { /* XTS-AES 18 */ + .key = "\xff\xfe\xfd\xfc\xfb\xfa\xf9\xf8" + "\xf7\xf6\xf5\xf4\xf3\xf2\xf1\xf0" + "\xbf\xbe\xbd\xbc\xbb\xba\xb9\xb8" + "\xb7\xb6\xb5\xb4\xb3\xb2\xb1\xb0", + .klen = 32, + .iv = "\x9a\x78\x56\x34\x12\x00\x00\x00" + "\x00\x00\x00\x00\x00\x00\x00\x00", + .ptext = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" + "\x10\x11\x12\x13", + .ctext = "\x9d\x84\xc8\x13\xf7\x19\xaa\x2c" + "\x7b\xe3\xf6\x61\x71\xc7\xc5\xc2" + "\xed\xbf\x9d\xac", + .len = 20, } }; From patchwork Tue Sep 3 16:43:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 11128477 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 499AA112C for ; Tue, 3 Sep 2019 16:44:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1A6E022CED for ; Tue, 3 Sep 2019 16:44:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="WMTpMN5S" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730142AbfICQoJ (ORCPT ); Tue, 3 Sep 2019 12:44:09 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:35252 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730133AbfICQoJ (ORCPT ); Tue, 3 Sep 2019 12:44:09 -0400 Received: by mail-pg1-f196.google.com with SMTP id 
From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Pascal van Leeuwen , Ard Biesheuvel Subject: [PATCH v2 17/17] crypto: testmgr - Add additional AES-XTS vectors for covering CTS Date: Tue, 3 Sep 2019 09:43:39 -0700 Message-Id: <20190903164339.27984-18-ard.biesheuvel@linaro.org> In-Reply-To: <20190903164339.27984-1-ard.biesheuvel@linaro.org> References: <20190903164339.27984-1-ard.biesheuvel@linaro.org> From: Pascal van Leeuwen This patch adds test vectors for AES-XTS that cover data inputs that are not a multiple of 16 bytes and therefore require cipher text stealing (CTS) to be applied. Vectors were added to cover all possible alignments combined with various interesting (i.e. for vector implementations working on 3,4,5 or 8 AES blocks in parallel) lengths. This code was kindly donated to the public domain by the author.
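
The "N blocks + M bytes" labels in the comments below match how the glue code earlier in the series splits a request: everything up to the last two blocks goes through the bulk path, while the final full block plus the partial tail (17 to 31 bytes in total) is handled by the ciphertext stealing path. A throwaway helper, purely illustrative and not part of the patch, makes that decomposition explicit for the .len values added here.

#include <stdio.h>

int main(void)
{
	/* .len values of the vectors added below; all have a partial tail */
	static const unsigned int lens[] = {
		37, 70, 87, 104, 153, 26, 27, 28, 29, 30, 31,
	};

	for (unsigned int i = 0; i < sizeof(lens) / sizeof(lens[0]); i++) {
		unsigned int tail = lens[i] % 16;		/* stolen bytes        */
		unsigned int cts  = 16 + tail;			/* block + tail (CTS)  */
		unsigned int bulk = (lens[i] - cts) / 16;	/* full bulk blocks    */

		printf("len %3u = %u block(s) + %u bytes\n", lens[i], bulk, cts);
	}
	return 0;
}
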
Link: https://lore.kernel.org/linux-crypto/MN2PR20MB29739591E1A3E54E7A8A8E18CAC00@MN2PR20MB2973.namprd20.prod.outlook.com/ Signed-off-by: Ard Biesheuvel --- crypto/testmgr.h | 308 ++++++++++++++++++++ 1 file changed, 308 insertions(+) diff --git a/crypto/testmgr.h b/crypto/testmgr.h index 86f2db6ae571..b3296441f9c7 100644 --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -15351,6 +15351,314 @@ static const struct cipher_testvec aes_xts_tv_template[] = { "\x7b\xe3\xf6\x61\x71\xc7\xc5\xc2" "\xed\xbf\x9d\xac", .len = 20, + /* Additional vectors to increase CTS coverage */ + }, { /* 1 block + 21 bytes */ + .key = "\xa1\x34\x0e\x49\x38\xfd\x8b\xf6" + "\x45\x60\x67\x07\x0f\x50\xa8\x2b" + "\xa8\xf1\xfe\x7e\xf4\xf0\x47\xcd" + "\xfd\x91\x78\xf9\x14\x8b\x7d\x27" + "\x0e\xdc\xca\xe6\xf4\xfc\xd7\x4f" + "\x19\x8c\xd0\xe6\x9e\x2f\xf8\x75" + "\xb5\xe2\x48\x00\x4f\x07\xd9\xa1" + "\x42\xbc\x9d\xfc\x17\x98\x00\x48", + .klen = 64, + .iv = "\xcb\x35\x47\x5a\x7a\x06\x28\xb9" + "\x80\xf5\xa7\xe6\x8a\x23\x42\xf8", + .ptext = "\x04\x52\xc8\x7f\xb0\x5a\x12\xc5" + "\x96\x47\x6b\xf4\xbc\x2e\xdb\x74" + "\xd2\x20\x24\x32\xe5\x84\xb6\x25" + "\x4c\x2f\x96\xc7\x55\x9c\x90\x6f" + "\x0e\x96\x94\x68\xf4", + .ctext = "\x6a\x2d\x57\xb8\x72\x49\x10\x6b" + "\x5b\x5a\xc9\x92\xab\x59\x79\x36" + "\x7a\x01\x95\xf7\xdd\xcb\x3f\xbf" + "\xb2\xe3\x7e\x35\xe3\x11\x04\x68" + "\x28\xc3\x70\x6a\xe1", + .len = 37, + }, { /* 3 blocks + 22 bytes */ + .key = "\xf7\x87\x75\xdf\x36\x20\xe7\xcb" + "\x20\x5d\x49\x96\x81\x3d\x1d\x80" + "\xc7\x18\x7e\xbf\x2a\x0f\x79\xba" + "\x06\xb5\x4b\x63\x03\xfb\xb8\x49" + "\x93\x2d\x85\x5b\x95\x1f\x78\xea" + "\x7c\x1e\xf5\x5d\x02\xc6\xec\xb0" + "\xf0\xaa\x3d\x0a\x04\xe1\x67\x80" + "\x2a\xbe\x4e\x73\xc9\x11\xcc\x6c", + .klen = 64, + .iv = "\xeb\xba\x55\x24\xfc\x8f\x25\x7c" + "\x66\xf9\x04\x03\xcc\xb1\xf4\x84", + .ptext = "\x40\x75\x1b\x72\x2a\xc8\xbf\xef" + "\x0c\x92\x3e\x19\xc5\x09\x07\x38" + "\x4d\x87\x5c\xb8\xd6\x4f\x1a\x39" + "\x8c\xee\xa5\x22\x41\x12\xe1\x22" + "\xb5\x4b\xd7\xeb\x02\xfa\xaa\xf8" + "\x94\x47\x04\x5d\x8a\xb5\x40\x12" + "\x04\x62\x3d\xe4\x19\x8a\xeb\xb3" + "\xf9\xa3\x7d\xb6\xeb\x57\xf9\xb8" + "\x7f\xa8\xfa\x2d\x75\x2d", + .ctext = "\x46\x6d\xe5\x35\x5d\x22\x42\x33" + "\xf7\xb8\xfb\xc0\xcb\x18\xad\xa4" + "\x75\x6c\xc6\x38\xbb\xd4\xa1\x32" + "\x00\x05\x06\xd9\xc9\x17\xd9\x4f" + "\x1a\xf6\x24\x64\x27\x8a\x4a\xad" + "\x88\xa0\x86\xb7\xf9\x33\xaf\xa8" + "\x0e\x83\xd8\x0e\x88\xa2\x81\x79" + "\x65\x2e\x3e\x84\xaf\xa1\x46\x7d" + "\xa9\x91\xf8\x17\x82\x8d", + .len = 70, + }, { /* 4 blocks + 23 bytes */ + .key = "\x48\x09\xab\x48\xd6\xca\x7d\xb1" + "\x90\xa0\x00\xd8\x33\x8a\x20\x79" + "\x7c\xbc\x0c\x0c\x5f\x41\xbc\xbc" + "\x82\xaf\x41\x81\x23\x93\xcb\xc7" + "\x61\x7b\x83\x13\x16\xb1\x3e\x7c" + "\xcc\xae\xda\xca\x78\xc7\xab\x18" + "\x69\xb6\x58\x3e\x5c\x19\x5f\xed" + "\x7b\xcf\x70\xb9\x76\x00\xd8\xc9", + .klen = 64, + .iv = "\x2e\x20\x36\xf4\xa3\x22\x5d\xd8" + "\x38\x49\x82\xbf\x6c\x56\xd9\x3b", + .ptext = "\x79\x3c\x73\x99\x65\x21\xe1\xb9" + "\xa0\xfd\x22\xb2\x57\xc0\x7f\xf4" + "\x7f\x97\x36\xaf\xf8\x8d\x73\xe1" + "\x0d\x85\xe9\xd5\x3d\x82\xb3\x49" + "\x89\x25\x30\x1f\x0d\xca\x5c\x95" + "\x64\x31\x02\x17\x11\x08\x8f\x32" + "\xbc\x37\x23\x4f\x03\x98\x91\x4a" + "\x50\xe2\x58\xa8\x9b\x64\x09\xe0" + "\xce\x99\xc9\xb0\xa8\x21\x73\xb7" + "\x2d\x4b\x19\xba\x81\x83\x99\xce" + "\xa0\x7a\xd0\x9f\x27\xf6\x8a", + .ctext = "\xf9\x12\x76\x21\x06\x1e\xe4\x4b" + "\xf9\x94\x38\x29\x0f\xee\xcb\x13" + "\xa3\xc3\x50\xe3\xc6\x29\x9d\xcf" + "\x6f\x6a\x0a\x25\xab\x44\xf6\xe4" + "\x71\x29\x75\x3b\x07\x1c\xfc\x1a" + 
"\x75\xd4\x84\x58\x7f\xc4\xf3\xf7" + "\x8f\x7c\x7a\xdc\xa2\xa3\x95\x38" + "\x15\xdf\x3b\x9c\xdd\x24\xb4\x0b" + "\xa8\x97\xfa\x5f\xee\x58\x00\x0d" + "\x23\xc9\x8d\xee\xc2\x3f\x27\xd8" + "\xd4\x43\xa5\xf8\x25\x71\x3f", + .len = 87, + }, { /* 5 blocks + 24 bytes */ + .key = "\x8c\xf4\x4c\xe5\x91\x8f\x72\xe9" + "\x2f\xf8\xc0\x3c\x87\x76\x16\xa4" + "\x20\xab\x66\x39\x34\x10\xd6\x91" + "\xf1\x99\x2c\xf1\xd6\xc3\xda\x38" + "\xed\x2a\x4c\x80\xf4\xa5\x56\x28" + "\x1a\x1c\x79\x72\x6c\x93\x08\x86" + "\x8f\x8a\xaa\xcd\xf1\x8c\xca\xe7" + "\x0a\xe8\xee\x0c\x1c\xc2\xa8\xea", + .klen = 64, + .iv = "\x9a\x9e\xbc\xe4\xc9\xf3\xef\x9f" + "\xff\x82\x0e\x22\x8f\x80\x42\x76", + .ptext = "\xc1\xde\x66\x1a\x7e\x60\xd3\x3b" + "\x66\xd6\x29\x86\x99\xc6\xd7\xc8" + "\x29\xbf\x00\x57\xab\x21\x06\x24" + "\xd0\x92\xef\xe6\xb5\x1e\x20\xb9" + "\xb7\x7b\xd7\x18\x88\xf8\xd7\xe3" + "\x90\x61\xcd\x73\x2b\xa1\xb5\xc7" + "\x33\xef\xb5\xf2\x45\xf6\x92\x53" + "\x91\x98\xf8\x5a\x20\x75\x4c\xa8" + "\xf1\xf6\x01\x26\xbc\xba\x4c\xac" + "\xcb\xc2\x6d\xb6\x2c\x3c\x38\x61" + "\xe3\x98\x7f\x3e\x98\xbd\xec\xce" + "\xc0\xb5\x74\x23\x43\x24\x7b\x7e" + "\x3f\xed\xcb\xda\x88\x67\x6f\x9a", + .ctext = "\xeb\xdc\x6a\xb7\xd9\x5f\xa7\xfc" + "\x48\x75\x10\xef\xca\x65\xdc\x88" + "\xd0\x23\xde\x17\x5f\x3b\x61\xa2" + "\x15\x13\x81\x81\xf8\x57\x8b\x2a" + "\xe2\xc8\x49\xd1\xba\xed\xd6\xcb" + "\xed\x6f\x26\x69\x9b\xd2\xd2\x91" + "\x4e\xd7\x81\x20\x66\x38\x0c\x62" + "\x60\xcd\x01\x36\x97\x22\xf0\x5c" + "\xcf\x53\xc6\x58\xf5\x8b\x48\x0c" + "\xa5\x50\xc2\x73\xf9\x70\x60\x09" + "\x22\x69\xf3\x71\x74\x5d\xc9\xa0" + "\x9c\x79\xf9\xc4\x87\xac\xd7\x4b" + "\xac\x3c\xc6\xda\x81\x7a\xdd\x14", + .len = 104, + }, { /* 8 blocks + 25 bytes */ + .key = "\x70\x18\x09\x93\x10\x3a\x0c\xa9" + "\x02\x0b\x11\x10\xae\x34\x98\xdb" + "\x10\xb5\xee\x8c\x49\xbc\x52\x8e" + "\x4b\xf7\x0a\x36\x16\x8a\xf7\x06" + "\xb5\x94\x52\x54\xb9\xc1\x4d\x20" + "\xa2\xf0\x6e\x19\x7f\x67\x1e\xaa" + "\x94\x6c\xee\x54\x19\xfc\x96\x95" + "\x04\x85\x00\x53\x7c\x39\x5f\xeb", + .klen = 64, + .iv = "\x36\x87\x8f\x9d\x74\xe9\x52\xfb" + "\xe1\x76\x16\x99\x61\x86\xec\x8f", + .ptext = "\x95\x08\xee\xfe\x87\xb2\x4f\x93" + "\x01\xee\xf3\x77\x0d\xbb\xfb\x26" + "\x3e\xb3\x34\x20\xee\x51\xd6\x40" + "\xb1\x64\xae\xd9\xfd\x71\x8f\x93" + "\xa5\x85\xff\x74\xcc\xd3\xfd\x5e" + "\xc2\xfc\x49\xda\xa8\x3a\x94\x29" + "\xa2\x59\x90\x34\x26\xbb\xa0\x34" + "\x5d\x47\x33\xf2\xa8\x77\x90\x98" + "\x8d\xfd\x38\x60\x23\x1e\x50\xa1" + "\x67\x4d\x8d\x09\xe0\x7d\x30\xe3" + "\xdd\x39\x91\xd4\x70\x68\xbb\x06" + "\x4e\x11\xb2\x26\x0a\x85\x73\xf6" + "\x37\xb6\x15\xd0\x77\xee\x43\x7b" + "\x77\x13\xe9\xb9\x84\x2b\x34\xab" + "\x49\xc1\x27\x91\x2e\xa3\xca\xe5" + "\xa7\x79\x45\xba\x36\x97\x49\x44" + "\xf7\x57\x9b\xd7\xac\xb3\xfd\x6a" + "\x1c\xd1\xfc\x1c\xdf\x6f\x94\xac" + "\x95\xf4\x50\x7a\xc8\xc3\x8c\x60" + "\x3c", + .ctext = "\xb6\xc8\xf9\x5d\x35\x5a\x0a\x33" + "\x2b\xd3\x5a\x18\x09\x1c\x1b\x0b" + "\x2a\x0e\xde\xf6\x0d\x04\xa6\xb3" + "\xa8\xe8\x1b\x86\x29\x58\x75\x56" + "\xab\xab\xbf\xbe\x1f\xb4\xc4\xf3" + "\xde\x1a\xb0\x87\x69\xac\x5b\x0c" + "\x1b\xb7\xc7\x24\xa4\x47\xe7\x81" + "\x2c\x0a\x82\xf9\x18\x5d\xe6\x09" + "\xe3\x65\x36\x54\x3d\x8a\x3a\x64" + "\x34\xf4\x34\x7f\x26\x3c\x1e\x3b" + "\x5a\x13\xdf\x7f\xa8\x2d\x81\xce" + "\xfa\xad\xd0\xb1\xca\xfa\xc3\x55" + "\x94\xc8\xb8\x16\x7e\xff\x44\x88" + "\xb4\x47\x4b\xfe\xda\x60\x68\x2e" + "\xfc\x70\xb5\xe3\xf3\xe9\x46\x22" + "\x1d\x98\x66\x09\x0f\xed\xbb\x20" + "\x7b\x8c\x2a\xff\x45\x62\xde\x9b" + "\x20\x2e\x6c\xb4\xe4\x26\x03\x72" + "\x8a\xb4\x19\xc9\xb1\xcf\x9d\x86" + "\xa3", + .len = 153, + }, { 
/* 0 blocks + 26 bytes */ + .key = "\x5a\x38\x3f\x9c\x0c\x53\x17\x6c" + "\x60\x72\x23\x26\xba\xfe\xa1\xb7" + "\x03\xa8\xfe\xa0\x7c\xff\x78\x4c" + "\x7d\x84\x2f\x24\x84\x77\xec\x6f" + "\x88\xc8\x36\xe2\xcb\x52\x3c\xb4" + "\x39\xac\x37\xfa\x41\x8b\xc4\x59" + "\x24\x03\xe1\x51\xc9\x54\x7d\xb7" + "\xa3\xde\x91\x44\x8d\x16\x97\x22", + .klen = 64, + .iv = "\xfb\x7f\x3d\x60\x26\x0a\x3a\x3d" + "\xa5\xa3\x45\xf2\x24\x67\xfa\x6e", + .ptext = "\xfb\x56\x97\x65\x7c\xd8\x6c\x3c" + "\x5d\xd3\xea\xa6\xa4\x83\xf7\x9d" + "\x9d\x89\x2c\x85\xb8\xd9\xd4\xf0" + "\x1a\xad", + .ctext = "\xc9\x9b\x4b\xf2\xf7\x0f\x23\xfe" + "\xc3\x93\x88\xa1\xb3\x88\xab\xd6" + "\x26\x78\x82\xa6\x6b\x0b\x76\xad" + "\x21\x5e", + .len = 26, + }, { /* 0 blocks + 27 bytes */ + .key = "\xc0\xcf\x57\xa2\x3c\xa2\x4b\xf6" + "\x5d\x36\x7b\xd7\x1d\x16\xc3\x2f" + "\x50\xc6\x0a\xb2\xfd\xe8\x24\xfc" + "\x33\xcf\x73\xfd\xe0\xe9\xa5\xd1" + "\x98\xfc\xd6\x16\xdd\xfd\x6d\xab" + "\x44\xbc\x37\x9d\xab\x5b\x1d\xf2" + "\x6f\x5d\xbe\x6b\x14\x14\xc7\x74" + "\xbb\x91\x24\x4b\x52\xcb\x78\x31", + .klen = 64, + .iv = "\x5c\xc1\x3d\xb6\xa1\x6a\x2d\x1f" + "\xee\x75\x19\x4b\x04\xfa\xe1\x7e", + .ptext = "\x02\x95\x3a\xab\xac\x3b\xcd\xcd" + "\x63\xc7\x4c\x7c\xe5\x75\xee\x03" + "\x94\xc7\xff\xe8\xe0\xe9\x86\x2a" + "\xd3\xc7\xe4", + .ctext = "\x8e\x84\x76\x8b\xc1\x47\x55\x15" + "\x5e\x51\xb3\xe2\x3f\x72\x4d\x20" + "\x09\x3f\x4f\xb1\xce\xf4\xb0\x14" + "\xf6\xa7\xb3", + .len = 27, + }, { /* 0 blocks + 28 bytes */ + .key = "\x0b\x5b\x1d\xc8\xb1\x3f\x8f\xcd" + "\x87\xd2\x58\x28\x36\xc6\x34\xfb" + "\x04\xe8\xf1\xb7\x91\x30\xda\x75" + "\x66\x4a\x72\x90\x09\x39\x02\x19" + "\x62\x2d\xe9\x24\x95\x0e\x87\x43" + "\x4c\xc7\x96\xe4\xc9\x31\x6a\x13" + "\x16\x10\xef\x34\x9b\x98\x19\xf1" + "\x8b\x14\x38\x3f\xf8\x75\xcc\x76", + .klen = 64, + .iv = "\x0c\x2c\x55\x2c\xda\x40\xe1\xab" + "\xa6\x34\x66\x7a\xa4\xa3\xda\x90", + .ptext = "\xbe\x84\xd3\xfe\xe6\xb4\x29\x67" + "\xfd\x29\x78\x41\x3d\xe9\x81\x4e" + "\x3c\xf9\xf4\xf5\x3f\xd8\x0e\xcd" + "\x63\x73\x65\xf3", + .ctext = "\xd0\xa0\x16\x5f\xf9\x85\xd0\x63" + "\x9b\x81\xa1\x15\x93\xb3\x62\x36" + "\xec\x93\x0e\x14\x07\xf2\xa9\x38" + "\x80\x33\xc0\x20", + .len = 28, + }, { /* 0 blocks + 29 bytes */ + .key = "\xdc\x4c\xdc\x20\xb1\x34\x89\xa4" + "\xd0\xb6\x77\x05\xea\x0c\xcc\x68" + "\xb1\xd6\xf7\xfd\xa7\x0a\x5b\x81" + "\x2d\x4d\xa3\x65\xd0\xab\xa1\x02" + "\x85\x4b\x33\xea\x51\x16\x50\x12" + "\x3b\x25\xba\x13\xba\x7c\xbb\x3a" + "\xe4\xfd\xb3\x9c\x88\x8b\xb8\x30" + "\x7a\x97\xcf\x95\x5d\x69\x7b\x1d", + .klen = 64, + .iv = "\xe7\x69\xed\xd2\x54\x5d\x4a\x29" + "\xb2\xd7\x60\x90\xa0\x0b\x0d\x3a", + .ptext = "\x37\x22\x11\x62\xa0\x74\x92\x62" + "\x40\x4e\x2b\x0a\x8b\xab\xd8\x28" + "\x8a\xd2\xeb\xa5\x8e\xe1\x42\xc8" + "\x49\xef\x9a\xec\x1b", + .ctext = "\x7c\x66\x72\x6b\xe3\xc3\x57\x71" + "\x37\x13\xce\x1f\x6b\xff\x13\x87" + "\x65\xa7\xa1\xc5\x23\x7f\xca\x40" + "\x82\xbf\x2f\xc0\x2a", + .len = 29, + }, { /* 0 blocks + 30 bytes */ + .key = "\x72\x9a\xf5\x53\x55\xdd\x0f\xef" + "\xfc\x75\x6f\x03\x88\xc8\xba\x88" + "\xb7\x65\x89\x5d\x03\x86\x21\x22" + "\xb8\x42\x87\xd9\xa9\x83\x9e\x9c" + "\xca\x28\xa1\xd2\xb6\xd0\xa6\x6c" + "\xf8\x57\x42\x7c\x73\xfc\x7b\x0a" + "\xbc\x3c\x57\x7b\x5a\x39\x61\x55" + "\xb7\x25\xe9\xf1\xc4\xbb\x04\x28", + .klen = 64, + .iv = "\x8a\x38\x22\xba\xea\x5e\x1d\xa4" + "\x31\x18\x12\x5c\x56\x0c\x12\x50", + .ptext = "\x06\xfd\xbb\xa9\x2e\x56\x05\x5f" + "\xf2\xa7\x36\x76\x26\xd3\xb3\x49" + "\x7c\xe2\xe3\xbe\x1f\x65\xd2\x17" + "\x65\xe2\xb3\x0e\xb1\x93", + .ctext = "\xae\x1f\x19\x7e\x3b\xb3\x65\xcb" + 
"\x14\x70\x6b\x3c\xa0\x63\x95\x94" + "\x56\x52\xe1\xb4\x14\xca\x21\x13" + "\xb5\x03\x3f\xfe\xc9\x9f", + .len = 30, + }, { /* 0 blocks + 31 bytes */ + .key = "\xce\x06\x45\x53\x25\x81\xd2\xb2" + "\xdd\xc9\x57\xfe\xbb\xf6\x83\x07" + "\x28\xd8\x2a\xff\x53\xf8\x57\xc6" + "\x63\x50\xd4\x3e\x2a\x54\x37\x51" + "\x07\x3b\x23\x63\x3c\x31\x57\x0d" + "\xd3\x59\x20\xf2\xd0\x85\xac\xc5" + "\x3f\xa1\x74\x90\x0a\x3f\xf4\x10" + "\x12\xf0\x1b\x2b\xef\xcb\x86\x74", + .klen = 64, + .iv = "\x6d\x3e\x62\x94\x75\x43\x74\xea" + "\xed\x4a\xa6\xde\xba\x55\x83\x38", + .ptext = "\x6a\xe6\xa3\x66\x7e\x78\xef\x42" + "\x8b\x28\x08\x24\xda\xd4\xd6\x42" + "\x3d\xb6\x48\x7e\x51\xa6\x92\x65" + "\x98\x86\x26\x98\x37\x42\xa5", + .ctext = "\x64\xc6\xfc\x60\x21\x87\x7a\xf5" + "\xc3\x1d\xba\x41\x3c\x9c\x8c\xe8" + "\x2d\x93\xf0\x02\x95\x6d\xfe\x8d" + "\x68\x17\x05\x75\xc0\xd3\xa8", + .len = 31, } };