From patchwork Tue Aug 21 16:46:14 2018
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 10572035
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: Nick Desaulniers, will.deacon@arm.com, herbert@gondor.apana.org.au,
    linux-arm-kernel@lists.infradead.org, Ard Biesheuvel
Subject: [PATCH] crypto: arm64/aes-modes - get rid of literal load of addend vector
Date: Tue, 21 Aug 2018 18:46:14 +0200
Message-Id: <20180821164614.31513-1-ard.biesheuvel@linaro.org>

Replace the literal load of the addend vector with a sequence that
composes it using immediates. While at it, tweak the code that refers
to it so it does not clobber the register, which also allows the load
to be hoisted out of the loop. This results in generally better code,
and also works around an issue in Clang, whose integrated assembler
does not implement the GNU ARM asm syntax completely, and in
particular does not support the =literal notation for FP registers.
Cc: Nick Desaulniers
Signed-off-by: Ard Biesheuvel
Reviewed-by: Nick Desaulniers
---
 arch/arm64/crypto/aes-modes.S | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 483a7130cf0e..e966620ee230 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -225,6 +225,14 @@ AES_ENTRY(aes_ctr_encrypt)
 		enc_prepare	w22, x21, x6
 		ld1		{v4.16b}, [x24]
 
+	/* compose addend vector { 1, 2, 3, 0 } in v8.4s */
+		movi		v7.4h, #1
+		movi		v8.4h, #2
+		uaddl		v6.4s, v7.4h, v8.4h
+		zip1		v8.8h, v7.8h, v8.8h
+		zip1		v8.4s, v8.4s, v6.4s
+		zip2		v8.8h, v8.8h, v7.8h
+
 		umov		x6, v4.d[1]	/* keep swabbed ctr in reg */
 		rev		x6, x6
 .LctrloopNx:
@@ -232,17 +240,16 @@ AES_ENTRY(aes_ctr_encrypt)
 		bmi		.Lctr1x
 		cmn		w6, #4			/* 32 bit overflow? */
 		bcs		.Lctr1x
-		ldr		q8, =0x30000000200000001	/* addends 1,2,3[,0] */
 		dup		v7.4s, w6
 		mov		v0.16b, v4.16b
 		add		v7.4s, v7.4s, v8.4s
 		mov		v1.16b, v4.16b
-		rev32		v8.16b, v7.16b
+		rev32		v7.16b, v7.16b
 		mov		v2.16b, v4.16b
 		mov		v3.16b, v4.16b
-		mov		v1.s[3], v8.s[0]
-		mov		v2.s[3], v8.s[1]
-		mov		v3.s[3], v8.s[2]
+		mov		v1.s[3], v7.s[0]
+		mov		v2.s[3], v7.s[1]
+		mov		v3.s[3], v7.s[2]
 		ld1		{v5.16b-v7.16b}, [x20], #48	/* get 3 input blocks */
 		bl		aes_encrypt_block4x
 		eor		v0.16b, v5.16b, v0.16b
@@ -296,7 +303,6 @@ AES_ENTRY(aes_ctr_encrypt)
 		ins		v4.d[0], x7
 		b		.Lctrcarrydone
 AES_ENDPROC(aes_ctr_encrypt)
-	.ltorg
 
 	/*
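Not part of the patch, but for anyone tracing the lane arithmetic: the movi/uaddl/zip sequence can be modeled in Python to confirm it really composes { 1, 2, 3, 0 } in the 32-bit lanes of v8. The helper names below are illustrative stand-ins for the NEON instructions, and lane ordering follows little-endian AArch64 conventions.

```python
def movi_4h(imm):
    # movi vN.4h, #imm: four 16-bit lanes set to imm, upper half cleared
    return [imm] * 4 + [0] * 4          # 8 halfword lanes per 128-bit register

def uaddl_4s(a8h, b8h):
    # uaddl vD.4s, vA.4h, vB.4h: widening add of the low four 16-bit lanes
    return [a8h[i] + b8h[i] for i in range(4)]   # four 32-bit lanes

def zip1(a, b):
    # zip1: interleave the low halves of the two source vectors
    n = len(a) // 2
    return [v for i in range(n) for v in (a[i], b[i])]

def zip2(a, b):
    # zip2: interleave the high halves of the two source vectors
    n = len(a) // 2
    return [v for i in range(n) for v in (a[n + i], b[n + i])]

def h_to_s(h8):
    # reinterpret eight 16-bit lanes as four 32-bit lanes (little endian)
    return [h8[2 * i] | (h8[2 * i + 1] << 16) for i in range(4)]

def s_to_h(s4):
    # reinterpret four 32-bit lanes as eight 16-bit lanes (little endian)
    return [x for s in s4 for x in (s & 0xffff, (s >> 16) & 0xffff)]

v7 = movi_4h(1)                  # movi  v7.4h, #1       -> {1,1,1,1}
v8 = movi_4h(2)                  # movi  v8.4h, #2       -> {2,2,2,2}
v6 = uaddl_4s(v7, v8)            # uaddl v6.4s, v7.4h, v8.4h -> {3,3,3,3}
v8 = zip1(v7, v8)                # zip1  v8.8h, v7.8h, v8.8h
v8 = zip1(h_to_s(v8), v6)        # zip1  v8.4s, v8.4s, v6.4s
v8 = zip2(s_to_h(v8), v7)        # zip2  v8.8h, v8.8h, v7.8h

print(h_to_s(v8))                # -> [1, 2, 3, 0]
```

The same value used to be materialized as the 128-bit literal 0x30000000200000001; composing it from immediates avoids the literal pool entirely, which is what lets the final hunk drop the .ltorg directive.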