From patchwork Wed Dec 6 19:43:33 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 10096945 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 8E2C3602BF for ; Wed, 6 Dec 2017 19:46:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7F06D28DED for ; Wed, 6 Dec 2017 19:46:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 73D3D28E2E; Wed, 6 Dec 2017 19:46:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 10E2528DFC for ; Wed, 6 Dec 2017 19:46:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=sXqQu3UbITzynLOhHUwXa8YMcc+dDKVAN+SXPwuy8BQ=; b=pfCBAk6U2Iu0uBsoQEPTFQmPKH JcKelgv/ofyWJRyVAviC4XmCvRyRPWE4Ni2AaZniCRBPlFs9/7xBSX7C9r1WKFhAyUppUggKlgb1c PV5kVeUqi4eOfyndlxR/jDjJIO7UOkp2hOA6v3s0cgjSlYrRNXWgg1GxbR7Qa2rCKjWbmwFR39/pr NdOHFVfYYB1Gwbw2S2o/BacZR2KWYhwtCvAFOfXFs/v89kUBblyECIDTfFLl+qQaO6bG2GkXSl7b+ hS3d/uCi8luKQ0w+wQb4OozBhi1Mo08ul0dBAir511UB0LHm2dWWoFo3PBxndVNdUJHxIl5uoYwf9 56uPnPvg==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1eMfeF-0001An-HD; Wed, 06 Dec 2017 19:46:27 +0000 Received: from mail-wr0-x244.google.com ([2a00:1450:400c:c0c::244]) by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux)) id 1eMfcQ-0006Nb-73 for linux-arm-kernel@lists.infradead.org; Wed, 06 Dec 2017 19:44:48 +0000 Received: by mail-wr0-x244.google.com with SMTP id o2so5119274wro.5 for ; Wed, 06 Dec 2017 11:44:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=I6Gnvs3Ig+XMQwO7jGoODHjYQvJyRj6RsM7VfBN0Rns=; b=gPnc/IaLPhCdaYOcGr4J+2RoP5OwvmoAXTKq9ICW09VxYkYF54b42E1vjjqzmKGLoM T7GLu3KWgqOXwI0iRN8boNafiqAXMmo+78Twwjs6gs3A3YmjVaKpmmOMOmWT0+WjGoUb R0uCMHgmXAQVpei9kx6u1t9YmKaE1qwLbasfo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=I6Gnvs3Ig+XMQwO7jGoODHjYQvJyRj6RsM7VfBN0Rns=; b=mVvH4qs7dgtJzsQislfo7g8gIb72vL4an4EzTm6XeC3ynNn4/B3ubwOcvDRaUdedeJ 7KmpZln5AAM+nBaV/E/dPrr7Ckh4dww/fb+K5SYfgcEsdjjAB8vuZ14aFEoDnDFttKVt Vr+xmvRK1CvxOmTKnxJkxVbjbOs7kqQA1c/o2A5Z24yrNxyS01CVgwCSNX+W151ebnrt d2zum4wqldqMO9X2cR/ihW2zUmHjR9EUBxKw9IszNkDBtv0Uw0KN3/U7L52KY3nkEzKU LqfxE61T6xxhQNf9606AZUN4wrCJIZSB1tm8k/391l0BG6tbZGj/WFaDNFUb5Xbss23P Sv5w== X-Gm-Message-State: AJaThX5kUqzoxKrJRKidcw/okxA5ot3Sr8LXmcrczo2h0Sz+PxkxYIr6 aqJdcnsrrnRtLZuoHCb3zhicJA== X-Google-Smtp-Source: AGs4zMaChjpxA64JWdrosbyLzKpwMP0o2OHejqGUGcOVS1KSyE1l4AdfewArHZ1505m8x3TUaqUAMg== X-Received: by 10.223.195.103 with SMTP id e36mr21193552wrg.10.1512589454918; Wed, 06 Dec 2017 11:44:14 -0800 (PST) Received: from localhost.localdomain ([105.150.171.234]) by smtp.gmail.com with ESMTPSA id b66sm3596594wmh.32.2017.12.06.11.44.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 Dec 2017 11:44:13 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Subject: [PATCH v3 07/20] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path Date: Wed, 6 Dec 2017 19:43:33 +0000 Message-Id: <20171206194346.24393-8-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20171206194346.24393-1-ard.biesheuvel@linaro.org> References: <20171206194346.24393-1-ard.biesheuvel@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20171206_114435_125267_2B385D84 X-CRM114-Status: GOOD ( 12.14 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , herbert@gondor.apana.org.au, Ard Biesheuvel , Peter Zijlstra , Catalin Marinas , Sebastian Andrzej Siewior , Will Deacon , Russell King - ARM Linux , Steven Rostedt , Thomas Gleixner , Dave Martin , linux-arm-kernel@lists.infradead.org, linux-rt-users@vger.kernel.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP CBC encryption is strictly sequential, and so the current AES code simply processes the input one block at a time. However, we are about to add yield support, which adds a bit of overhead, and which we prefer to align with other modes in terms of granularity (i.e., it is better to have all routines yield every 64 bytes and not have an exception for CBC encrypt which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-modes.S | 31 ++++++++++++++++---- 1 file changed, 25 insertions(+), 6 deletions(-) diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 27a235b2ddee..e86535a1329d 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -94,17 +94,36 @@ AES_ENDPROC(aes_ecb_decrypt) */ AES_ENTRY(aes_cbc_encrypt) - ld1 {v0.16b}, [x5] /* get iv */ + ld1 {v4.16b}, [x5] /* get iv */ enc_prepare w3, x2, x6 -.Lcbcencloop: - ld1 {v1.16b}, [x1], #16 /* get next pt block */ - eor v0.16b, v0.16b, v1.16b /* ..and xor with iv */ +.Lcbcencloop4x: + subs w4, w4, #4 + bmi .Lcbcenc1x + ld1 {v0.16b-v3.16b}, [x1], #64 /* get 4 pt blocks */ + eor v0.16b, v0.16b, v4.16b /* ..and xor with iv */ encrypt_block v0, w3, x2, x6, w7 - st1 {v0.16b}, [x0], #16 + eor v1.16b, v1.16b, v0.16b + encrypt_block v1, w3, x2, x6, w7 + eor v2.16b, v2.16b, v1.16b + encrypt_block v2, w3, x2, x6, w7 + eor v3.16b, v3.16b, v2.16b + encrypt_block v3, w3, x2, x6, w7 + st1 {v0.16b-v3.16b}, [x0], #64 + mov v4.16b, v3.16b + b .Lcbcencloop4x +.Lcbcenc1x: + adds w4, w4, #4 + beq .Lcbcencout +.Lcbcencloop: + ld1 {v0.16b}, [x1], #16 /* get next pt block */ + eor v4.16b, v4.16b, v0.16b /* ..and xor with iv */ + encrypt_block v4, w3, x2, x6, w7 + st1 {v4.16b}, [x0], #16 subs w4, w4, #1 bne .Lcbcencloop - st1 {v0.16b}, [x5] /* return iv */ +.Lcbcencout: + st1 {v4.16b}, [x5] /* return iv */ ret AES_ENDPROC(aes_cbc_encrypt)