From patchwork Tue Mar 15 09:44:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "tianjia.zhang"
X-Patchwork-Id: 12781231
From: Tianjia Zhang
To: Herbert Xu, "David S. Miller", Catalin Marinas, Will Deacon,
 "Markku-Juhani O . Saarinen", Jussi Kivilinna, Ard Biesheuvel,
 Gilad Ben-Yossef, linux-crypto@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 Jia Zhang, zhuolong.lq@antfin.com
Cc: Tianjia Zhang
Subject: [PATCH 0/4] Add ARMv8 NEON and Crypto Extensions implementation of
 SM4-ECB/CBC/CFB/CTR
Date: Tue, 15 Mar 2022 17:44:50 +0800
Message-Id: <20220315094454.45269-1-tianjia.zhang@linux.alibaba.com>

This series of patches adds ARMv8 implementations of SM4 in ECB, CBC,
CFB and CTR modes, both for ARMv8 with Crypto Extensions and for plain
ARMv8 NEON.

NEON is the general-purpose SIMD instruction set of ARMv8, while the
SM4 acceleration instructions are an optional part of the Crypto
Extensions.

Patch 1 exports the constant arrays of the SM4 software implementation:
sm4_sbox is used as the lookup table for the tbl/tbx instructions in the
NEON implementation, and sm4_fk and sm4_ck are used for key expansion in
the CE implementation.

Patch 2 renames the existing sm4-ce to sm4-ce-cipher, which is a
single-block CE implementation of the cipher with no support for block
cipher modes.
This naming follows the convention used by the AES implementations in
the same directory.

Patch 3 introduces an accelerated SM4 implementation using plain NEON.

Patch 4 introduces the SM4 implementation using the CE instructions.

To make the comparison concrete, the tables below show the performance
of the four implementations sm4-generic, sm4-neon, sm4-ce-cipher and
sm4-ce, listed in order of increasing performance.

The benchmark was run on a T-Head Yitian-710 at 2.75 GHz, and the data
comes from mode 218 of tcrypt. The columns are block sizes in bytes and
the unit is Mb/s:

  sm4-generic |      16      64     128     256    1024    1420    4096
      ECB enc |   80.05   91.42   93.66   94.77   95.69   95.77   95.86
      ECB dec |   79.98   91.41   93.64   94.76   95.66   95.77   95.85
      CBC enc |   78.55   86.50   88.02   88.77   89.36   89.42   89.48
      CBC dec |   76.82   89.06   91.52   92.77   93.75   93.83   93.96
      CFB enc |   77.64   86.13   87.62   88.42   89.08   88.83   89.18
      CFB dec |   77.57   88.34   90.36   91.45   92.34   92.00   92.44
      CTR enc |   77.80   88.28   90.23   91.22   92.11   91.81   92.25
      CTR dec |   77.83   88.22   90.22   91.22   92.04   91.82   92.28

     sm4-neon |      16      64     128     256    1024    1420    4096
      ECB enc |   28.31  112.77  203.03  209.89  215.49  202.11  210.59
      ECB dec |   28.36  113.45  203.23  210.00  215.52  202.13  210.65
      CBC enc |   79.32   87.02   88.51   89.28   89.85   89.89   89.97
      CBC dec |   28.29  112.20  203.30  209.82  214.99  201.51  209.95
      CFB enc |   79.59   87.16   88.54   89.30   89.83   89.62   89.92
      CFB dec |   28.12  111.05  202.47  209.02  214.21  210.90  209.12
      CTR enc |   28.04  108.81  200.62  206.65  211.78  208.78  206.74
      CTR dec |   28.02  108.82  200.45  206.62  211.78  208.74  206.70

sm4-ce-cipher |      16      64     128     256    1024    1420    4096
      ECB enc |  336.79  587.13  682.70  747.37  803.75  811.52  818.06
      ECB dec |  339.18  584.52  679.72  743.68  798.82  803.83  811.54
      CBC enc |  316.63  521.47  597.00  647.14  690.82  695.21  700.55
      CBC dec |  291.80  503.79  585.66  640.82  689.86  695.16  701.72
      CFB enc |  294.79  482.31  552.13  594.71  631.60  628.91  638.92
      CFB dec |  293.09  466.44  526.56  563.17  594.41  592.26  601.97
      CTR enc |  309.61  506.13  576.86  620.47  656.38  654.51  665.10
      CTR dec |  306.69  505.57  576.84  620.18  657.09  654.52  665.32

       sm4-ce |      16      64     128     256    1024    1420    4096
      ECB enc |  366.96 1329.81 2024.29 2755.50 3790.07 3861.91 4051.40
      ECB dec |  367.30 1323.93 2018.72 2747.43 3787.39 3862.55 4052.62
      CBC enc |  358.09  682.68  807.24  885.35  958.29  963.60  973.73
      CBC dec |  366.51 1303.63 1978.64 2667.93 3624.53 3683.41 3856.08
      CFB enc |  351.51  681.26  807.81  893.10  968.54  969.17  985.83
      CFB dec |  354.98 1266.61 1929.63 2634.81 3614.23 3611.59 3841.68
      CTR enc |  324.23 1121.25 1689.44 2256.70 2981.90 3007.79 3060.74
      CTR dec |  324.18 1120.44 1694.31 2258.32 2982.01 3010.09 3060.99

Tianjia Zhang (4):
  crypto: lib/sm4 - export sm4 constant arrays
  crypto: arm64/sm4-ce - rename to sm4-ce-cipher
  crypto: arm64/sm4 - add ARMv8 NEON implementation
  crypto: arm64/sm4 - add ARMv8 Crypto Extensions implementation

 arch/arm64/crypto/Kconfig              |  12 +
 arch/arm64/crypto/Makefile             |   8 +-
 arch/arm64/crypto/sm4-ce-cipher-core.S |  36 ++
 arch/arm64/crypto/sm4-ce-cipher-glue.c |  82 +++
 arch/arm64/crypto/sm4-ce-core.S        | 688 +++++++++++++++++++++++--
 arch/arm64/crypto/sm4-ce-glue.c        | 386 ++++++++++++--
 arch/arm64/crypto/sm4-neon-core.S      | 487 +++++++++++++++++
 arch/arm64/crypto/sm4-neon-glue.c      | 442 ++++++++++++++++
 include/crypto/sm4.h                   |   4 +
 lib/crypto/sm4.c                       |  10 +-
 10 files changed, 2073 insertions(+), 82 deletions(-)
 create mode 100644 arch/arm64/crypto/sm4-ce-cipher-core.S
 create mode 100644 arch/arm64/crypto/sm4-ce-cipher-glue.c
 create mode 100644 arch/arm64/crypto/sm4-neon-core.S
 create mode 100644 arch/arm64/crypto/sm4-neon-glue.c
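
For readers who want the benchmark figures as ratios rather than raw
throughput, the following is a minimal post-processing sketch. It is not
part of the series: the table name THROUGHPUT_4096 and the helper
speedup() are hypothetical names chosen for illustration, and the
numbers are copied from the 4096-byte column of the cover letter for
three of the eight rows only.

```python
# Throughput at 4096-byte blocks, copied from the cover letter's tables
# (unit as reported there). Only three rows are included for brevity.
THROUGHPUT_4096 = {
    "sm4-generic":   {"ECB enc": 95.86,   "CBC enc": 89.48,  "CBC dec": 93.96},
    "sm4-neon":      {"ECB enc": 210.59,  "CBC enc": 89.97,  "CBC dec": 209.95},
    "sm4-ce-cipher": {"ECB enc": 818.06,  "CBC enc": 700.55, "CBC dec": 701.72},
    "sm4-ce":        {"ECB enc": 4051.40, "CBC enc": 973.73, "CBC dec": 3856.08},
}


def speedup(impl, mode, baseline="sm4-generic"):
    """Return the throughput of `impl` relative to `baseline` for one mode."""
    return THROUGHPUT_4096[impl][mode] / THROUGHPUT_4096[baseline][mode]


if __name__ == "__main__":
    # Print one line per implementation with its ratio to sm4-generic.
    for impl in THROUGHPUT_4096:
        ratios = ", ".join(
            f"{mode}: {speedup(impl, mode):.2f}x"
            for mode in ("ECB enc", "CBC enc", "CBC dec")
        )
        print(f"{impl:>13} | {ratios}")
```

With these inputs, sm4-ce comes out at roughly 42x sm4-generic for ECB
encryption but only about 11x for CBC encryption, and sm4-neon shows
essentially no gain for CBC encryption. That pattern is consistent with
CBC encryption being sequential (each block depends on the previous
ciphertext block), although the cover letter itself does not discuss the
cause.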