From patchwork Sat Mar 10 15:21:52 2018
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 10273667
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Subject: [PATCH v5 07/23] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path
Date: Sat, 10 Mar 2018 15:21:52 +0000
Message-Id: <20180310152208.10369-8-ard.biesheuvel@linaro.org>
In-Reply-To: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
References: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
Cc: Mark Rutland, herbert@gondor.apana.org.au, Ard Biesheuvel,
 Peter Zijlstra, Catalin Marinas, Sebastian Andrzej Siewior, Will Deacon,
 Russell King - ARM Linux, Steven Rostedt, Thomas Gleixner, Dave Martin,
 linux-arm-kernel@lists.infradead.org, linux-rt-users@vger.kernel.org

CBC encryption is strictly sequential, and so the current AES code
simply processes the input one block at a time. However, we are about
to add yield support, which adds a bit of overhead, and which we prefer
to align with other modes in terms of granularity (i.e., it is better
to have all routines yield every 64 bytes and not have an exception for
CBC encrypt, which yields every 16 bytes).

So unroll the loop by 4. We still cannot perform the AES algorithm in
parallel, but we can at least merge the loads and stores.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-modes.S | 31 ++++++++++++++++----
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 27a235b2ddee..e86535a1329d 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -94,17 +94,36 @@ AES_ENDPROC(aes_ecb_decrypt)
 	 */
 
 AES_ENTRY(aes_cbc_encrypt)
-	ld1		{v0.16b}, [x5]			/* get iv */
+	ld1		{v4.16b}, [x5]			/* get iv */
 	enc_prepare	w3, x2, x6
 
-.Lcbcencloop:
-	ld1		{v1.16b}, [x1], #16		/* get next pt block */
-	eor		v0.16b, v0.16b, v1.16b		/* ..and xor with iv */
+.Lcbcencloop4x:
+	subs		w4, w4, #4
+	bmi		.Lcbcenc1x
+	ld1		{v0.16b-v3.16b}, [x1], #64	/* get 4 pt blocks */
+	eor		v0.16b, v0.16b, v4.16b		/* ..and xor with iv */
 	encrypt_block	v0, w3, x2, x6, w7
-	st1		{v0.16b}, [x0], #16
+	eor		v1.16b, v1.16b, v0.16b
+	encrypt_block	v1, w3, x2, x6, w7
+	eor		v2.16b, v2.16b, v1.16b
+	encrypt_block	v2, w3, x2, x6, w7
+	eor		v3.16b, v3.16b, v2.16b
+	encrypt_block	v3, w3, x2, x6, w7
+	st1		{v0.16b-v3.16b}, [x0], #64
+	mov		v4.16b, v3.16b
+	b		.Lcbcencloop4x
+.Lcbcenc1x:
+	adds		w4, w4, #4
+	beq		.Lcbcencout
+.Lcbcencloop:
+	ld1		{v0.16b}, [x1], #16		/* get next pt block */
+	eor		v4.16b, v4.16b, v0.16b		/* ..and xor with iv */
+	encrypt_block	v4, w3, x2, x6, w7
+	st1		{v4.16b}, [x0], #16
 	subs		w4, w4, #1
 	bne		.Lcbcencloop
-	st1		{v0.16b}, [x5]			/* return iv */
+.Lcbcencout:
+	st1		{v4.16b}, [x5]			/* return iv */
 	ret
 AES_ENDPROC(aes_cbc_encrypt)
 
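
For readers less familiar with A64 assembly, the control flow of the
unrolled routine can be sketched in C. This is only an illustration under
stated assumptions, not the kernel's code: toy_encrypt_block is a
hypothetical stand-in for encrypt_block (it is not AES), and the names
cbc_encrypt_1x, cbc_encrypt_4x and cbc_variants_agree are invented for this
sketch. It shows that the four encryptions stay chained while the loads and
stores are merged into one 64-byte copy each, and that a 1x loop handles the
remaining 0..3 blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK  16   /* AES block size in bytes */
#define MAXB 10   /* largest block count used by the self-check below */

/* Hypothetical stand-in for encrypt_block: NOT AES, just a keyed byte mix. */
static void toy_encrypt_block(uint8_t b[BLK], const uint8_t key[BLK])
{
    for (int i = 0; i < BLK; i++)
        b[i] = (uint8_t)((b[i] ^ key[i]) * 167u + 13u);
}

static void xor_block(uint8_t dst[BLK], const uint8_t src[BLK])
{
    for (int i = 0; i < BLK; i++)
        dst[i] ^= src[i];
}

/* One block per iteration; iv carries the chaining value in and out. */
void cbc_encrypt_1x(uint8_t *out, const uint8_t *in, size_t blocks,
                    const uint8_t key[BLK], uint8_t iv[BLK])
{
    assert(out && in);
    while (blocks--) {
        xor_block(iv, in);              /* iv ^= plaintext block */
        toy_encrypt_block(iv, key);     /* iv now holds the ciphertext */
        memcpy(out, iv, BLK);
        in += BLK;
        out += BLK;
    }
}

/* Four blocks per iteration, then a 1x tail for the remaining 0..3. */
void cbc_encrypt_4x(uint8_t *out, const uint8_t *in, size_t blocks,
                    const uint8_t key[BLK], uint8_t iv[BLK])
{
    uint8_t v[4][BLK];

    while (blocks >= 4) {
        memcpy(v, in, 4 * BLK);         /* merged 64-byte load */
        xor_block(v[0], iv);
        toy_encrypt_block(v[0], key);
        for (int i = 1; i < 4; i++) {   /* each block chains on the previous */
            xor_block(v[i], v[i - 1]);
            toy_encrypt_block(v[i], key);
        }
        memcpy(out, v, 4 * BLK);        /* merged 64-byte store */
        memcpy(iv, v[3], BLK);          /* last ciphertext is the next iv */
        in += 4 * BLK;
        out += 4 * BLK;
        blocks -= 4;
    }
    cbc_encrypt_1x(out, in, blocks, key, iv);
}

/* Returns 1 if both variants give the same ciphertext and final iv. */
int cbc_variants_agree(size_t blocks)
{
    uint8_t in[MAXB * BLK], a[MAXB * BLK] = {0}, b[MAXB * BLK] = {0};
    uint8_t key[BLK], iv_a[BLK], iv_b[BLK];

    assert(blocks <= MAXB);
    for (size_t i = 0; i < sizeof(in); i++)
        in[i] = (uint8_t)(i * 7u + 1u);
    for (int i = 0; i < BLK; i++) {
        key[i] = (uint8_t)(i * 31 + 5);
        iv_a[i] = iv_b[i] = (uint8_t)i;
    }
    cbc_encrypt_1x(a, in, blocks, key, iv_a);
    cbc_encrypt_4x(b, in, blocks, key, iv_b);
    return memcmp(a, b, blocks * BLK) == 0 && memcmp(iv_a, iv_b, BLK) == 0;
}
```

Calling cbc_variants_agree() with block counts on either side of a multiple
of 4 (for example 3, 4 and 7) exercises both the unrolled loop and the tail.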