From patchwork Tue Dec 4 03:52:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 10711097 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 39B2F109C for ; Tue, 4 Dec 2018 03:57:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26D752A8D3 for ; Tue, 4 Dec 2018 03:57:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 15BD82A93A; Tue, 4 Dec 2018 03:57:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 803982A8D3 for ; Tue, 4 Dec 2018 03:56:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=0fUEnVfLi/0fGwZ7S2uQMCIKZlu/4IspoKMsOJ3ICh8=; b=WxaGyaAu0OBCcd Ir71oYiI26TE+q1S1Y1gR1WbkxzWZxDQtrfiIMV8h9UcsTq+MEwjd+fJTXk18Q0eXidC0vuBDFhl4 9FxFrxSiyDZ4/z3K8wTMev1NY/xZZk1NNW5evbaQ7FBxuHtw96DaQAgAWjuj0SLikXj6wLFeLQzwB urVJzWtJNuDXVb+vTe9bbeor6zSeu1B0p/llZ1NeL2PXsDGGjXoQauXZbBtr73GbOAIZ6rnVVOzea KacaSy91lFJV3zUYBGaCD5ryjbV+LY5xWLZRhSlPDv9AYNVIdvmneh5ShmLlYvW2XvdJjpG1wj5S8 pDazbfZLIHeO1pxTgI7g==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1pI-0005Vv-4M; Tue, 04 Dec 2018 03:56:48 +0000 Received: from mail.kernel.org ([198.145.29.99]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1oo-00054q-Hb for linux-arm-kernel@lists.infradead.org; Tue, 04 Dec 2018 03:56:21 +0000 Received: from sol.localdomain (c-24-23-142-8.hsd1.ca.comcast.net [24.23.142.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D60E92146D; Tue, 4 Dec 2018 03:56:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1543895768; bh=MkiIKGnHamsrgq/XB7PRzEtpXIcUMFSweATIEJCLIFk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AHT7IXehp+aQTY0ozlsjl+CiUDyYp7bMEztNB3zvcvYjvKMklRbdUTbcOEGYWl5cb p+keS7eVx6KUjEjIKiFBFQ2SUv6KUKwMbFdjjiahIKjDfGc+BeTspKfrRTgrR1nNl3 mti/qDWSVTxIZfTlYs2REa7YTNtafbsxWi0T8ol8= From: Eric Biggers To: linux-crypto@vger.kernel.org Subject: [PATCH v2 1/4] crypto: arm64/nhpoly1305 - add NEON-accelerated NHPoly1305 Date: Mon, 3 Dec 2018 19:52:49 -0800 Message-Id: <20181204035252.14853-2-ebiggers@kernel.org> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20181204035252.14853-1-ebiggers@kernel.org> References: <20181204035252.14853-1-ebiggers@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20181203_195618_712036_DF0DE2BB X-CRM114-Status: GOOD ( 15.72 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A . Donenfeld" , Ard Biesheuvel , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Paul Crowley Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP From: Eric Biggers Add an ARM64 NEON implementation of NHPoly1305, an ε-almost-∆-universal hash function used in the Adiantum encryption mode. For now, only the NH portion is actually NEON-accelerated; the Poly1305 part is less performance-critical so is just implemented in C. Reviewed-by: Ard Biesheuvel Tested-by: Ard Biesheuvel # big-endian Signed-off-by: Eric Biggers --- arch/arm64/crypto/Kconfig | 5 ++ arch/arm64/crypto/Makefile | 3 + arch/arm64/crypto/nh-neon-core.S | 103 +++++++++++++++++++++++ arch/arm64/crypto/nhpoly1305-neon-glue.c | 77 +++++++++++++++++ 4 files changed, 188 insertions(+) create mode 100644 arch/arm64/crypto/nh-neon-core.S create mode 100644 arch/arm64/crypto/nhpoly1305-neon-glue.c diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index a5606823ed4d..3f5aeb786192 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -106,6 +106,11 @@ config CRYPTO_CHACHA20_NEON select CRYPTO_BLKCIPHER select CRYPTO_CHACHA20 +config CRYPTO_NHPOLY1305_NEON + tristate "NHPoly1305 hash function using NEON instructions (for Adiantum)" + depends on KERNEL_MODE_NEON + select CRYPTO_NHPOLY1305 + config CRYPTO_AES_ARM64_BS tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm" depends on KERNEL_MODE_NEON diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index f476fede09ba..125dbb10a93e 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -53,6 +53,9 @@ sha512-arm64-y := sha512-glue.o sha512-core.o obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o +obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o +nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o + obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o diff --git a/arch/arm64/crypto/nh-neon-core.S b/arch/arm64/crypto/nh-neon-core.S new file mode 100644 index 000000000000..e05570c38de7 --- /dev/null +++ b/arch/arm64/crypto/nh-neon-core.S @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * NH - ε-almost-universal hash function, ARM64 NEON accelerated version + * + * Copyright 2018 Google LLC + * + * Author: Eric Biggers + */ + +#include + + KEY .req x0 + MESSAGE .req x1 + MESSAGE_LEN .req x2 + HASH .req x3 + + PASS0_SUMS .req v0 + PASS1_SUMS .req v1 + PASS2_SUMS .req v2 + PASS3_SUMS .req v3 + K0 .req v4 + K1 .req v5 + K2 .req v6 + K3 .req v7 + T0 .req v8 + T1 .req v9 + T2 .req v10 + T3 .req v11 + T4 .req v12 + T5 .req v13 + T6 .req v14 + T7 .req v15 + +.macro _nh_stride k0, k1, k2, k3 + + // Load next message stride + ld1 {T3.16b}, [MESSAGE], #16 + + // Load next key stride + ld1 {\k3\().4s}, [KEY], #16 + + // Add message words to key words + add T0.4s, T3.4s, \k0\().4s + add T1.4s, T3.4s, \k1\().4s + add T2.4s, T3.4s, \k2\().4s + add T3.4s, T3.4s, \k3\().4s + + // Multiply 32x32 => 64 and accumulate + mov T4.d[0], T0.d[1] + mov T5.d[0], T1.d[1] + mov T6.d[0], T2.d[1] + mov T7.d[0], T3.d[1] + umlal PASS0_SUMS.2d, T0.2s, T4.2s + umlal PASS1_SUMS.2d, T1.2s, T5.2s + umlal PASS2_SUMS.2d, T2.2s, T6.2s + umlal PASS3_SUMS.2d, T3.2s, T7.2s +.endm + +/* + * void nh_neon(const u32 *key, const u8 *message, size_t message_len, + * u8 hash[NH_HASH_BYTES]) + * + * It's guaranteed that message_len % 16 == 0. + */ +ENTRY(nh_neon) + + ld1 {K0.4s,K1.4s}, [KEY], #32 + movi PASS0_SUMS.2d, #0 + movi PASS1_SUMS.2d, #0 + ld1 {K2.4s}, [KEY], #16 + movi PASS2_SUMS.2d, #0 + movi PASS3_SUMS.2d, #0 + + subs MESSAGE_LEN, MESSAGE_LEN, #64 + blt .Lloop4_done +.Lloop4: + _nh_stride K0, K1, K2, K3 + _nh_stride K1, K2, K3, K0 + _nh_stride K2, K3, K0, K1 + _nh_stride K3, K0, K1, K2 + subs MESSAGE_LEN, MESSAGE_LEN, #64 + bge .Lloop4 + +.Lloop4_done: + ands MESSAGE_LEN, MESSAGE_LEN, #63 + beq .Ldone + _nh_stride K0, K1, K2, K3 + + subs MESSAGE_LEN, MESSAGE_LEN, #16 + beq .Ldone + _nh_stride K1, K2, K3, K0 + + subs MESSAGE_LEN, MESSAGE_LEN, #16 + beq .Ldone + _nh_stride K2, K3, K0, K1 + +.Ldone: + // Sum the accumulators for each pass, then store the sums to 'hash' + addp T0.2d, PASS0_SUMS.2d, PASS1_SUMS.2d + addp T1.2d, PASS2_SUMS.2d, PASS3_SUMS.2d + st1 {T0.16b,T1.16b}, [HASH] + ret +ENDPROC(nh_neon) diff --git a/arch/arm64/crypto/nhpoly1305-neon-glue.c b/arch/arm64/crypto/nhpoly1305-neon-glue.c new file mode 100644 index 000000000000..22cc32ac9448 --- /dev/null +++ b/arch/arm64/crypto/nhpoly1305-neon-glue.c @@ -0,0 +1,77 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum + * (ARM64 NEON accelerated version) + * + * Copyright 2018 Google LLC + */ + +#include +#include +#include +#include +#include + +asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len, + u8 hash[NH_HASH_BYTES]); + +/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */ +static void _nh_neon(const u32 *key, const u8 *message, size_t message_len, + __le64 hash[NH_NUM_PASSES]) +{ + nh_neon(key, message, message_len, (u8 *)hash); +} + +static int nhpoly1305_neon_update(struct shash_desc *desc, + const u8 *src, unsigned int srclen) +{ + if (srclen < 64 || !may_use_simd()) + return crypto_nhpoly1305_update(desc, src, srclen); + + do { + unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE); + + kernel_neon_begin(); + crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon); + kernel_neon_end(); + src += n; + srclen -= n; + } while (srclen); + return 0; +} + +static struct shash_alg nhpoly1305_alg = { + .base.cra_name = "nhpoly1305", + .base.cra_driver_name = "nhpoly1305-neon", + .base.cra_priority = 200, + .base.cra_ctxsize = sizeof(struct nhpoly1305_key), + .base.cra_module = THIS_MODULE, + .digestsize = POLY1305_DIGEST_SIZE, + .init = crypto_nhpoly1305_init, + .update = nhpoly1305_neon_update, + .final = crypto_nhpoly1305_final, + .setkey = crypto_nhpoly1305_setkey, + .descsize = sizeof(struct nhpoly1305_state), +}; + +static int __init nhpoly1305_mod_init(void) +{ + if (!(elf_hwcap & HWCAP_ASIMD)) + return -ENODEV; + + return crypto_register_shash(&nhpoly1305_alg); +} + +static void __exit nhpoly1305_mod_exit(void) +{ + crypto_unregister_shash(&nhpoly1305_alg); +} + +module_init(nhpoly1305_mod_init); +module_exit(nhpoly1305_mod_exit); + +MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (NEON-accelerated)"); +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Eric Biggers "); +MODULE_ALIAS_CRYPTO("nhpoly1305"); +MODULE_ALIAS_CRYPTO("nhpoly1305-neon"); From patchwork Tue Dec 4 03:52:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 10711099 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C9073109C for ; Tue, 4 Dec 2018 03:57:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B7E362A8D3 for ; Tue, 4 Dec 2018 03:57:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AB7F32A93A; Tue, 4 Dec 2018 03:57:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E9FEA2A8D3 for ; Tue, 4 Dec 2018 03:57:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=w/zvD6kv/bdaPL48W9aPx43W2Q86alzdHgUA7My0b1g=; b=tLcVgCLHsVqEsD IlvJtYzaG7fjwo7yxiZT0Q42lMsEhaNjuLAXgYrcgZy9kaO1lRHZK9rhY4/arW6NMLI5cjbD/r5FA iesz8pTD85WTSh1JVi5R0k4UFT0QS0II9yKREF6itLOzSUjJWagA6mKI3qMQUDcgt4moahFz+svdu Q2WH5l8X6pTmUx21PgcorZ63kWqus2+tGfV/d8djYcBw1axE4j+0WjmGvqFsSF8+lSuc0oZBlN2VK 1y0lVxnfqNbYlYFAdvpLqyw1WtHltUxtCinDnHaPu7OMfZeDaRy21U/JAkILdYp9pw0kIx/U49qB4 laPdejbxV6FkGbaICl0g==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1pW-0005p9-6r; Tue, 04 Dec 2018 03:57:02 +0000 Received: from mail.kernel.org ([198.145.29.99]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1oo-00054r-Hk for linux-arm-kernel@lists.infradead.org; Tue, 04 Dec 2018 03:56:21 +0000 Received: from sol.localdomain (c-24-23-142-8.hsd1.ca.comcast.net [24.23.142.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 37CF7214C1; Tue, 4 Dec 2018 03:56:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1543895768; bh=4q5pDYK7J1eWYD984RIQP1w41XRnjIsevrsv2GyPqoo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=EvCX2AysJiIB3lPVva3kcIGTJyRgX4PztTON5xo2IapDvNspfkvIk6L+7bv8mohJP RbR0Tcp7KzKJBeCnx6CsJarkXs33JbhexLuS0CyczLKQaFz2caT0AsiIRoTAm0IQjl XVHjJyjKD224welZLFAzMUVJkjHayN6P5S8HPp1E= From: Eric Biggers To: linux-crypto@vger.kernel.org Subject: [PATCH v2 2/4] crypto: arm64/chacha20 - add XChaCha20 support Date: Mon, 3 Dec 2018 19:52:50 -0800 Message-Id: <20181204035252.14853-3-ebiggers@kernel.org> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20181204035252.14853-1-ebiggers@kernel.org> References: <20181204035252.14853-1-ebiggers@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20181203_195618_708847_F1159DA2 X-CRM114-Status: GOOD ( 16.60 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A . Donenfeld" , Ard Biesheuvel , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Paul Crowley Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP From: Eric Biggers Add an XChaCha20 implementation that is hooked up to the ARM64 NEON implementation of ChaCha20. This can be used by Adiantum. A NEON implementation of single-block HChaCha20 is also added so that XChaCha20 can use it rather than the generic implementation. This required refactoring the ChaCha20 permutation into its own function. Signed-off-by: Eric Biggers Reviewed-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig | 2 +- arch/arm64/crypto/chacha20-neon-core.S | 65 +++++++++++----- arch/arm64/crypto/chacha20-neon-glue.c | 101 +++++++++++++++++++------ 3 files changed, 125 insertions(+), 43 deletions(-) diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index 3f5aeb786192..d54ddb8468ef 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -101,7 +101,7 @@ config CRYPTO_AES_ARM64_NEON_BLK select CRYPTO_SIMD config CRYPTO_CHACHA20_NEON - tristate "NEON accelerated ChaCha20 symmetric cipher" + tristate "ChaCha20 and XChaCha20 stream ciphers using NEON instructions" depends on KERNEL_MODE_NEON select CRYPTO_BLKCIPHER select CRYPTO_CHACHA20 diff --git a/arch/arm64/crypto/chacha20-neon-core.S b/arch/arm64/crypto/chacha20-neon-core.S index 13c85e272c2a..0571e45a1a0a 100644 --- a/arch/arm64/crypto/chacha20-neon-core.S +++ b/arch/arm64/crypto/chacha20-neon-core.S @@ -23,25 +23,20 @@ .text .align 6 -ENTRY(chacha20_block_xor_neon) - // x0: Input state matrix, s - // x1: 1 data block output, o - // x2: 1 data block input, i - - // - // This function encrypts one ChaCha20 block by loading the state matrix - // in four NEON registers. It performs matrix operation on four words in - // parallel, but requires shuffling to rearrange the words after each - // round. - // - - // x0..3 = s0..3 - adr x3, ROT8 - ld1 {v0.4s-v3.4s}, [x0] - ld1 {v8.4s-v11.4s}, [x0] - ld1 {v12.4s}, [x3] +/* + * chacha20_permute - permute one block + * + * Permute one 64-byte block where the state matrix is stored in the four NEON + * registers v0-v3. It performs matrix operations on four words in parallel, + * but requires shuffling to rearrange the words after each round. + * + * Clobbers: x3, x10, v4, v12 + */ +chacha20_permute: mov x3, #10 + adr x10, ROT8 + ld1 {v12.4s}, [x10] .Ldoubleround: // x0 += x1, x3 = rotl32(x3 ^ x0, 16) @@ -105,6 +100,23 @@ ENTRY(chacha20_block_xor_neon) subs x3, x3, #1 b.ne .Ldoubleround + ret +ENDPROC(chacha20_permute) + +ENTRY(chacha20_block_xor_neon) + // x0: Input state matrix, s + // x1: 1 data block output, o + // x2: 1 data block input, i + + stp x29, x30, [sp, #-16]! + mov x29, sp + + // x0..3 = s0..3 + ld1 {v0.4s-v3.4s}, [x0] + ld1 {v8.4s-v11.4s}, [x0] + + bl chacha20_permute + ld1 {v4.16b-v7.16b}, [x2] // o0 = i0 ^ (x0 + s0) @@ -125,9 +137,28 @@ ENTRY(chacha20_block_xor_neon) st1 {v0.16b-v3.16b}, [x1] + ldp x29, x30, [sp], #16 ret ENDPROC(chacha20_block_xor_neon) +ENTRY(hchacha20_block_neon) + // x0: Input state matrix, s + // x1: output (8 32-bit words) + + stp x29, x30, [sp, #-16]! + mov x29, sp + + ld1 {v0.4s-v3.4s}, [x0] + + bl chacha20_permute + + st1 {v0.16b}, [x1], #16 + st1 {v3.16b}, [x1] + + ldp x29, x30, [sp], #16 + ret +ENDPROC(hchacha20_block_neon) + .align 6 ENTRY(chacha20_4block_xor_neon) // x0: Input state matrix, s diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c index 96e0cfb8c3f5..a5b9cbc0c4de 100644 --- a/arch/arm64/crypto/chacha20-neon-glue.c +++ b/arch/arm64/crypto/chacha20-neon-glue.c @@ -30,6 +30,7 @@ asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src); asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src); +asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out); static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, unsigned int bytes) @@ -65,20 +66,16 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, kernel_neon_end(); } -static int chacha20_neon(struct skcipher_request *req) +static int chacha20_neon_stream_xor(struct skcipher_request *req, + struct chacha_ctx *ctx, u8 *iv) { - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); struct skcipher_walk walk; u32 state[16]; int err; - if (!may_use_simd() || req->cryptlen <= CHACHA_BLOCK_SIZE) - return crypto_chacha_crypt(req); - err = skcipher_walk_virt(&walk, req, false); - crypto_chacha_init(state, ctx, walk.iv); + crypto_chacha_init(state, ctx, iv); while (walk.nbytes > 0) { unsigned int nbytes = walk.nbytes; @@ -94,22 +91,73 @@ static int chacha20_neon(struct skcipher_request *req) return err; } -static struct skcipher_alg alg = { - .base.cra_name = "chacha20", - .base.cra_driver_name = "chacha20-neon", - .base.cra_priority = 300, - .base.cra_blocksize = 1, - .base.cra_ctxsize = sizeof(struct chacha_ctx), - .base.cra_module = THIS_MODULE, - - .min_keysize = CHACHA_KEY_SIZE, - .max_keysize = CHACHA_KEY_SIZE, - .ivsize = CHACHA_IV_SIZE, - .chunksize = CHACHA_BLOCK_SIZE, - .walksize = 4 * CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, - .encrypt = chacha20_neon, - .decrypt = chacha20_neon, +static int chacha20_neon(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + + if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd()) + return crypto_chacha_crypt(req); + + return chacha20_neon_stream_xor(req, ctx, req->iv); +} + +static int xchacha20_neon(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + struct chacha_ctx subctx; + u32 state[16]; + u8 real_iv[16]; + + if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd()) + return crypto_xchacha_crypt(req); + + crypto_chacha_init(state, ctx, req->iv); + + kernel_neon_begin(); + hchacha20_block_neon(state, subctx.key); + kernel_neon_end(); + + memcpy(&real_iv[0], req->iv + 24, 8); + memcpy(&real_iv[8], req->iv + 16, 8); + return chacha20_neon_stream_xor(req, &subctx, real_iv); +} + +static struct skcipher_alg algs[] = { + { + .base.cra_name = "chacha20", + .base.cra_driver_name = "chacha20-neon", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = 4 * CHACHA_BLOCK_SIZE, + .setkey = crypto_chacha20_setkey, + .encrypt = chacha20_neon, + .decrypt = chacha20_neon, + }, { + .base.cra_name = "xchacha20", + .base.cra_driver_name = "xchacha20-neon", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = 4 * CHACHA_BLOCK_SIZE, + .setkey = crypto_chacha20_setkey, + .encrypt = xchacha20_neon, + .decrypt = xchacha20_neon, + } }; static int __init chacha20_simd_mod_init(void) @@ -117,12 +165,12 @@ static int __init chacha20_simd_mod_init(void) if (!(elf_hwcap & HWCAP_ASIMD)) return -ENODEV; - return crypto_register_skcipher(&alg); + return crypto_register_skciphers(algs, ARRAY_SIZE(algs)); } static void __exit chacha20_simd_mod_fini(void) { - crypto_unregister_skcipher(&alg); + crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); } module_init(chacha20_simd_mod_init); @@ -131,3 +179,6 @@ module_exit(chacha20_simd_mod_fini); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); MODULE_ALIAS_CRYPTO("chacha20"); +MODULE_ALIAS_CRYPTO("chacha20-neon"); +MODULE_ALIAS_CRYPTO("xchacha20"); +MODULE_ALIAS_CRYPTO("xchacha20-neon"); From patchwork Tue Dec 4 03:52:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 10711101 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B10013AF for ; Tue, 4 Dec 2018 03:57:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A2692A947 for ; Tue, 4 Dec 2018 03:57:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4B56B2A93A; Tue, 4 Dec 2018 03:57:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6D0DE2A8D3 for ; Tue, 4 Dec 2018 03:57:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=9wLgznwpF1GDA62479HOrB6Bijg/RT5LTbqPdT9zVLA=; b=LB8TjKzk0IIP/u c8zO5M5EpXjBFuekZVJbWQn1ph4pEXzbIVARhv3XIQgKrslc2VigeSOQAJdw493rjJLmthQmnnWe4 77OzQUkyKWy417i6qTeDcDF7pCmiHvVGjYOklOK57csjLbx1SyLYYTUVSd00FWD10wqEdJNt0kI1x fKpPzlkQVvFYsuk92NFw3z2ijy7LgkkiVjBNuBd4XHbF4zeq5GDl8vOCp1a9Zquq+DuXMSlD+tqWM DXmpif4qCsnGGOuU/CTZHmR8x11KkjyoIiSNbd+7+uTnz7ouLFA4nzNOdo2XmbPN/yHR4TxZUyEuY farOaxZB/ILWZ3jEzZ0g==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1pk-00064a-88; Tue, 04 Dec 2018 03:57:16 +0000 Received: from mail.kernel.org ([198.145.29.99]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1oo-00054s-Hd for linux-arm-kernel@lists.infradead.org; Tue, 04 Dec 2018 03:56:22 +0000 Received: from sol.localdomain (c-24-23-142-8.hsd1.ca.comcast.net [24.23.142.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 8F5AF214D9; Tue, 4 Dec 2018 03:56:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1543895768; bh=ftfaoVGnKNW85A6Y/PiL+hSDjrfYf41jJJODQsjjrEc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HpDBI+T6Pm33OIHHqfP+kCELDD7FsJhgufgoEQeu/PEm0ol83AiyezcXnbv/I0KGy qXJfdcUkQwZeQTvK7zlUuCyoy/owDM1tL2OV4zx+Mx4/xw8Fh3UQYuPr94B5ilOHxq J7q4LHDBCfXRXDIJaMQFYYMRlF0cD0I8M2MHAnYM= From: Eric Biggers To: linux-crypto@vger.kernel.org Subject: [PATCH v2 3/4] crypto: arm64/chacha20 - refactor to allow varying number of rounds Date: Mon, 3 Dec 2018 19:52:51 -0800 Message-Id: <20181204035252.14853-4-ebiggers@kernel.org> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20181204035252.14853-1-ebiggers@kernel.org> References: <20181204035252.14853-1-ebiggers@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20181203_195618_847638_B669A97B X-CRM114-Status: GOOD ( 13.33 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A . Donenfeld" , Ard Biesheuvel , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Paul Crowley Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP From: Eric Biggers In preparation for adding XChaCha12 support, rename/refactor the ARM64 NEON implementation of ChaCha20 to support different numbers of rounds. Reviewed-by: Ard Biesheuvel Signed-off-by: Eric Biggers --- arch/arm64/crypto/Makefile | 4 +- ...hacha20-neon-core.S => chacha-neon-core.S} | 45 ++++++++------- ...hacha20-neon-glue.c => chacha-neon-glue.c} | 57 ++++++++++--------- 3 files changed, 57 insertions(+), 49 deletions(-) rename arch/arm64/crypto/{chacha20-neon-core.S => chacha-neon-core.S} (94%) rename arch/arm64/crypto/{chacha20-neon-glue.c => chacha-neon-glue.c} (71%) diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index 125dbb10a93e..a4ffd9fe3265 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -50,8 +50,8 @@ sha256-arm64-y := sha256-glue.o sha256-core.o obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o sha512-arm64-y := sha512-glue.o sha512-core.o -obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o -chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o +obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o +chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o diff --git a/arch/arm64/crypto/chacha20-neon-core.S b/arch/arm64/crypto/chacha-neon-core.S similarity index 94% rename from arch/arm64/crypto/chacha20-neon-core.S rename to arch/arm64/crypto/chacha-neon-core.S index 0571e45a1a0a..3d3a12db5204 100644 --- a/arch/arm64/crypto/chacha20-neon-core.S +++ b/arch/arm64/crypto/chacha-neon-core.S @@ -1,5 +1,5 @@ /* - * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions + * ChaCha/XChaCha NEON helper functions * * Copyright (C) 2016 Linaro, Ltd. * @@ -24,17 +24,18 @@ .align 6 /* - * chacha20_permute - permute one block + * chacha_permute - permute one block * * Permute one 64-byte block where the state matrix is stored in the four NEON * registers v0-v3. It performs matrix operations on four words in parallel, * but requires shuffling to rearrange the words after each round. * - * Clobbers: x3, x10, v4, v12 + * The round count is given in w3. + * + * Clobbers: w3, x10, v4, v12 */ -chacha20_permute: +chacha_permute: - mov x3, #10 adr x10, ROT8 ld1 {v12.4s}, [x10] @@ -97,16 +98,17 @@ chacha20_permute: // x3 = shuffle32(x3, MASK(0, 3, 2, 1)) ext v3.16b, v3.16b, v3.16b, #4 - subs x3, x3, #1 + subs w3, w3, #2 b.ne .Ldoubleround ret -ENDPROC(chacha20_permute) +ENDPROC(chacha_permute) -ENTRY(chacha20_block_xor_neon) +ENTRY(chacha_block_xor_neon) // x0: Input state matrix, s // x1: 1 data block output, o // x2: 1 data block input, i + // w3: nrounds stp x29, x30, [sp, #-16]! mov x29, sp @@ -115,7 +117,7 @@ ENTRY(chacha20_block_xor_neon) ld1 {v0.4s-v3.4s}, [x0] ld1 {v8.4s-v11.4s}, [x0] - bl chacha20_permute + bl chacha_permute ld1 {v4.16b-v7.16b}, [x2] @@ -139,42 +141,45 @@ ENTRY(chacha20_block_xor_neon) ldp x29, x30, [sp], #16 ret -ENDPROC(chacha20_block_xor_neon) +ENDPROC(chacha_block_xor_neon) -ENTRY(hchacha20_block_neon) +ENTRY(hchacha_block_neon) // x0: Input state matrix, s // x1: output (8 32-bit words) + // w2: nrounds stp x29, x30, [sp, #-16]! mov x29, sp ld1 {v0.4s-v3.4s}, [x0] - bl chacha20_permute + mov w3, w2 + bl chacha_permute st1 {v0.16b}, [x1], #16 st1 {v3.16b}, [x1] ldp x29, x30, [sp], #16 ret -ENDPROC(hchacha20_block_neon) +ENDPROC(hchacha_block_neon) .align 6 -ENTRY(chacha20_4block_xor_neon) +ENTRY(chacha_4block_xor_neon) // x0: Input state matrix, s // x1: 4 data blocks output, o // x2: 4 data blocks input, i + // w3: nrounds // - // This function encrypts four consecutive ChaCha20 blocks by loading + // This function encrypts four consecutive ChaCha blocks by loading // the state matrix in NEON registers four times. The algorithm performs // each operation on the corresponding word of each state matrix, hence // requires no word shuffling. For final XORing step we transpose the // matrix by interleaving 32- and then 64-bit words, which allows us to // do XOR in NEON registers. // - adr x3, CTRINC // ... and ROT8 - ld1 {v30.4s-v31.4s}, [x3] + adr x9, CTRINC // ... and ROT8 + ld1 {v30.4s-v31.4s}, [x9] // x0..15[0-3] = s0..3[0..3] mov x4, x0 @@ -186,8 +191,6 @@ ENTRY(chacha20_4block_xor_neon) // x12 += counter values 0-3 add v12.4s, v12.4s, v30.4s - mov x3, #10 - .Ldoubleround4: // x0 += x4, x12 = rotl32(x12 ^ x0, 16) // x1 += x5, x13 = rotl32(x13 ^ x1, 16) @@ -361,7 +364,7 @@ ENTRY(chacha20_4block_xor_neon) sri v7.4s, v18.4s, #25 sri v4.4s, v19.4s, #25 - subs x3, x3, #1 + subs w3, w3, #2 b.ne .Ldoubleround4 ld4r {v16.4s-v19.4s}, [x0], #16 @@ -475,7 +478,7 @@ ENTRY(chacha20_4block_xor_neon) st1 {v28.16b-v31.16b}, [x1] ret -ENDPROC(chacha20_4block_xor_neon) +ENDPROC(chacha_4block_xor_neon) CTRINC: .word 0, 1, 2, 3 ROT8: .word 0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha-neon-glue.c similarity index 71% rename from arch/arm64/crypto/chacha20-neon-glue.c rename to arch/arm64/crypto/chacha-neon-glue.c index a5b9cbc0c4de..4d992029b912 100644 --- a/arch/arm64/crypto/chacha20-neon-glue.c +++ b/arch/arm64/crypto/chacha-neon-glue.c @@ -1,5 +1,6 @@ /* - * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions + * ARM NEON accelerated ChaCha and XChaCha stream ciphers, + * including ChaCha20 (RFC7539) * * Copyright (C) 2016 - 2017 Linaro, Ltd. * @@ -28,18 +29,20 @@ #include #include -asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src); -asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src); -asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out); +asmlinkage void chacha_block_xor_neon(u32 *state, u8 *dst, const u8 *src, + int nrounds); +asmlinkage void chacha_4block_xor_neon(u32 *state, u8 *dst, const u8 *src, + int nrounds); +asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds); -static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, - unsigned int bytes) +static void chacha_doneon(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds) { u8 buf[CHACHA_BLOCK_SIZE]; while (bytes >= CHACHA_BLOCK_SIZE * 4) { kernel_neon_begin(); - chacha20_4block_xor_neon(state, dst, src); + chacha_4block_xor_neon(state, dst, src, nrounds); kernel_neon_end(); bytes -= CHACHA_BLOCK_SIZE * 4; src += CHACHA_BLOCK_SIZE * 4; @@ -52,7 +55,7 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, kernel_neon_begin(); while (bytes >= CHACHA_BLOCK_SIZE) { - chacha20_block_xor_neon(state, dst, src); + chacha_block_xor_neon(state, dst, src, nrounds); bytes -= CHACHA_BLOCK_SIZE; src += CHACHA_BLOCK_SIZE; dst += CHACHA_BLOCK_SIZE; @@ -60,14 +63,14 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, } if (bytes) { memcpy(buf, src, bytes); - chacha20_block_xor_neon(state, buf, buf); + chacha_block_xor_neon(state, buf, buf, nrounds); memcpy(dst, buf, bytes); } kernel_neon_end(); } -static int chacha20_neon_stream_xor(struct skcipher_request *req, - struct chacha_ctx *ctx, u8 *iv) +static int chacha_neon_stream_xor(struct skcipher_request *req, + struct chacha_ctx *ctx, u8 *iv) { struct skcipher_walk walk; u32 state[16]; @@ -83,15 +86,15 @@ static int chacha20_neon_stream_xor(struct skcipher_request *req, if (nbytes < walk.total) nbytes = round_down(nbytes, walk.stride); - chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr, - nbytes); + chacha_doneon(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes, ctx->nrounds); err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } return err; } -static int chacha20_neon(struct skcipher_request *req) +static int chacha_neon(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -99,10 +102,10 @@ static int chacha20_neon(struct skcipher_request *req) if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd()) return crypto_chacha_crypt(req); - return chacha20_neon_stream_xor(req, ctx, req->iv); + return chacha_neon_stream_xor(req, ctx, req->iv); } -static int xchacha20_neon(struct skcipher_request *req) +static int xchacha_neon(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -116,12 +119,13 @@ static int xchacha20_neon(struct skcipher_request *req) crypto_chacha_init(state, ctx, req->iv); kernel_neon_begin(); - hchacha20_block_neon(state, subctx.key); + hchacha_block_neon(state, subctx.key, ctx->nrounds); kernel_neon_end(); + subctx.nrounds = ctx->nrounds; memcpy(&real_iv[0], req->iv + 24, 8); memcpy(&real_iv[8], req->iv + 16, 8); - return chacha20_neon_stream_xor(req, &subctx, real_iv); + return chacha_neon_stream_xor(req, &subctx, real_iv); } static struct skcipher_alg algs[] = { @@ -139,8 +143,8 @@ static struct skcipher_alg algs[] = { .chunksize = CHACHA_BLOCK_SIZE, .walksize = 4 * CHACHA_BLOCK_SIZE, .setkey = crypto_chacha20_setkey, - .encrypt = chacha20_neon, - .decrypt = chacha20_neon, + .encrypt = chacha_neon, + .decrypt = chacha_neon, }, { .base.cra_name = "xchacha20", .base.cra_driver_name = "xchacha20-neon", @@ -155,12 +159,12 @@ static struct skcipher_alg algs[] = { .chunksize = CHACHA_BLOCK_SIZE, .walksize = 4 * CHACHA_BLOCK_SIZE, .setkey = crypto_chacha20_setkey, - .encrypt = xchacha20_neon, - .decrypt = xchacha20_neon, + .encrypt = xchacha_neon, + .decrypt = xchacha_neon, } }; -static int __init chacha20_simd_mod_init(void) +static int __init chacha_simd_mod_init(void) { if (!(elf_hwcap & HWCAP_ASIMD)) return -ENODEV; @@ -168,14 +172,15 @@ static int __init chacha20_simd_mod_init(void) return crypto_register_skciphers(algs, ARRAY_SIZE(algs)); } -static void __exit chacha20_simd_mod_fini(void) +static void __exit chacha_simd_mod_fini(void) { crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); } -module_init(chacha20_simd_mod_init); -module_exit(chacha20_simd_mod_fini); +module_init(chacha_simd_mod_init); +module_exit(chacha_simd_mod_fini); +MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (NEON accelerated)"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); MODULE_ALIAS_CRYPTO("chacha20"); From patchwork Tue Dec 4 03:52:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 10711091 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0238D109C for ; Tue, 4 Dec 2018 03:56:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E5CE02A8D3 for ; Tue, 4 Dec 2018 03:56:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D87D92A93A; Tue, 4 Dec 2018 03:56:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8BC332A8D3 for ; Tue, 4 Dec 2018 03:56:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=ZReLyOyxWGn+bLi01ieFBDMvvQZ4d8PoC82OiEn1kL4=; b=QjSJRmAXfQ4vH2 qESJ2/ns2MvTz/i3JIT/10BUFLIWiRHvlQ0yoweB6K4M1sFbIBWRntlZMUynuB7ybHhT22TajRqDh O+j7qHX6hXUA/7ypaQS/JTTcsIgdst0QVJ8v5ZHuPnHl83Z0GJLDx53oJ7l2uzw1879g4st5znotT bmKflzLQBugMy142CpejtGV+InFH4intAFRqsijyU6prIAYEJKqMvG7ByyoY48MRX8gAA20bJBnHN wmdHSTAP7KzCdxk2m6OI8dcTIHxVisVQRglpcBt9pUpnj4b4oLMvJG8kYHv3yC2XuZ799LLXrau/D BZgSCgL1lQrw2r+/NoUg==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1os-00056H-7z; Tue, 04 Dec 2018 03:56:22 +0000 Received: from mail.kernel.org ([198.145.29.99]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1gU1oo-00054t-Ha for linux-arm-kernel@lists.infradead.org; Tue, 04 Dec 2018 03:56:20 +0000 Received: from sol.localdomain (c-24-23-142-8.hsd1.ca.comcast.net [24.23.142.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id ED30E214DA; Tue, 4 Dec 2018 03:56:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1543895769; bh=9sqEpCOjIyWA2Z9z1Qrp9WtcLxh4o/dxsqsNsRv+aLY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cU1Dv1/qRyL4eZgfrDygHyWN/4hjxW/HvodmSXDhxFCAbUCcXPJrrQSnElIiIA84g 7t0Fu/1yEHI3Ky4t/hTxqr3nb/7Tyz8URpRlCAURkcjLSNCF+C0U9HsEmxGiP9a44z muiw9cAiYJH05AmByrKVVmQvioy9ExR2I5DDMxH4= From: Eric Biggers To: linux-crypto@vger.kernel.org Subject: [PATCH v2 4/4] crypto: arm64/chacha - add XChaCha12 support Date: Mon, 3 Dec 2018 19:52:52 -0800 Message-Id: <20181204035252.14853-5-ebiggers@kernel.org> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20181204035252.14853-1-ebiggers@kernel.org> References: <20181204035252.14853-1-ebiggers@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20181203_195618_603980_420920AF X-CRM114-Status: GOOD ( 14.17 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A . Donenfeld" , Ard Biesheuvel , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Paul Crowley Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP From: Eric Biggers Now that the ARM64 NEON implementation of ChaCha20 and XChaCha20 has been refactored to support varying the number of rounds, add support for XChaCha12. This is identical to XChaCha20 except for the number of rounds, which is 12 instead of 20. This can be used by Adiantum. Reviewed-by: Ard Biesheuvel Signed-off-by: Eric Biggers --- arch/arm64/crypto/Kconfig | 2 +- arch/arm64/crypto/chacha-neon-glue.c | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index d54ddb8468ef..d9a523ecdd83 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -101,7 +101,7 @@ config CRYPTO_AES_ARM64_NEON_BLK select CRYPTO_SIMD config CRYPTO_CHACHA20_NEON - tristate "ChaCha20 and XChaCha20 stream ciphers using NEON instructions" + tristate "ChaCha20, XChaCha20, and XChaCha12 stream ciphers using NEON instructions" depends on KERNEL_MODE_NEON select CRYPTO_BLKCIPHER select CRYPTO_CHACHA20 diff --git a/arch/arm64/crypto/chacha-neon-glue.c b/arch/arm64/crypto/chacha-neon-glue.c index 4d992029b912..346eb85498a1 100644 --- a/arch/arm64/crypto/chacha-neon-glue.c +++ b/arch/arm64/crypto/chacha-neon-glue.c @@ -161,6 +161,22 @@ static struct skcipher_alg algs[] = { .setkey = crypto_chacha20_setkey, .encrypt = xchacha_neon, .decrypt = xchacha_neon, + }, { + .base.cra_name = "xchacha12", + .base.cra_driver_name = "xchacha12-neon", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = 4 * CHACHA_BLOCK_SIZE, + .setkey = crypto_chacha12_setkey, + .encrypt = xchacha_neon, + .decrypt = xchacha_neon, } }; @@ -187,3 +203,5 @@ MODULE_ALIAS_CRYPTO("chacha20"); MODULE_ALIAS_CRYPTO("chacha20-neon"); MODULE_ALIAS_CRYPTO("xchacha20"); MODULE_ALIAS_CRYPTO("xchacha20-neon"); +MODULE_ALIAS_CRYPTO("xchacha12"); +MODULE_ALIAS_CRYPTO("xchacha12-neon");