From patchwork Sun Oct 25 14:31:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arvind Sankar X-Patchwork-Id: 11855275 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F005C4363A for ; Sun, 25 Oct 2020 14:31:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 24182208E4 for ; Sun, 25 Oct 2020 14:31:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1414761AbgJYObY (ORCPT ); Sun, 25 Oct 2020 10:31:24 -0400 Received: from mail-qt1-f196.google.com ([209.85.160.196]:34851 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1414677AbgJYObY (ORCPT ); Sun, 25 Oct 2020 10:31:24 -0400 Received: by mail-qt1-f196.google.com with SMTP id c15so4921116qtc.2; Sun, 25 Oct 2020 07:31:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MN0s8vlWrwWpnF9v8V5eNRd1d4ATTd49nrUqABX5hio=; b=ABdQs9FVB8/FzH8dKAILE4TkU/0WO2NHBuHP9zeq4lYz1kyJ1asNrBgxWVMh3bsIUA AGWqBw/D8j0CWprixxXJBrfZn8QYylaPBG7j5JXYE1xztO5/jP7cuyVTCaJcYI7JMFet ym9MPhVf2aSjXQxLDSVp/zbA6OmTWs09RCgsd/fHdThVOn8Wk/TfeArCL3yq7DUeLGAA 54iSTNyKhgttz/geEpm2deBV/KaYlKB0zzSO8LlXDbSkAj+8uwmb/T6i1iYAXSk7Tpep gHkd4elqXb8n6A1OXl/6X0xDJzvEdITCAJ3HNCWDLn/8iHZu9arKrirb3D+wxoQwm1jV 6BRw== X-Gm-Message-State: AOAM532lF+KE9aJ4fJKsjf2EHbOqj8UbFy5WyumVkF4m0ddCexbIT5hD LG1SBDNbJW3nsw5kwcP9kpk= X-Google-Smtp-Source: ABdhPJyOKIf4i9PAJrrokcQidlOn76hVGHHE4GJVfoadKRDzionc4xHJHuexCc+Z/QteZtRrMAURNg== X-Received: by 2002:aed:3a86:: with SMTP id o6mr11818147qte.333.1603636283458; Sun, 25 Oct 2020 07:31:23 -0700 (PDT) Received: from rani.riverdale.lan ([2001:470:1f07:5f3::b55f]) by smtp.gmail.com with ESMTPSA id s73sm4740898qke.71.2020.10.25.07.31.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Oct 2020 07:31:22 -0700 (PDT) From: Arvind Sankar To: Herbert Xu , "David S. Miller" , "linux-crypto@vger.kernel.org" , Eric Biggers , David Laight Cc: linux-kernel@vger.kernel.org, Eric Biggers Subject: [PATCH v4 1/6] crypto: lib/sha256 - Use memzero_explicit() for clearing state Date: Sun, 25 Oct 2020 10:31:14 -0400 Message-Id: <20201025143119.1054168-2-nivedita@alum.mit.edu> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201025143119.1054168-1-nivedita@alum.mit.edu> References: <20201025143119.1054168-1-nivedita@alum.mit.edu> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Without the barrier_data() inside memzero_explicit(), the compiler may optimize away the state-clearing if it can tell that the state is not used afterwards. At least in lib/crypto/sha256.c:__sha256_final(), the function can get inlined into sha256(), in which case the memset is optimized away. Signed-off-by: Arvind Sankar Reviewed-by: Eric Biggers Acked-by: Ard Biesheuvel --- lib/crypto/sha256.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c index 2321f6cb322f..d43bc39ab05e 100644 --- a/lib/crypto/sha256.c +++ b/lib/crypto/sha256.c @@ -265,7 +265,7 @@ static void __sha256_final(struct sha256_state *sctx, u8 *out, int digest_words) put_unaligned_be32(sctx->state[i], &dst[i]); /* Zeroize sensitive information. */ - memset(sctx, 0, sizeof(*sctx)); + memzero_explicit(sctx, sizeof(*sctx)); } void sha256_final(struct sha256_state *sctx, u8 *out) From patchwork Sun Oct 25 14:31:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arvind Sankar X-Patchwork-Id: 11855287 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 430A5C388F7 for ; Sun, 25 Oct 2020 14:31:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1AB642085B for ; Sun, 25 Oct 2020 14:31:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1415244AbgJYObr (ORCPT ); Sun, 25 Oct 2020 10:31:47 -0400 Received: from mail-qk1-f193.google.com ([209.85.222.193]:33379 "EHLO mail-qk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1416786AbgJYObo (ORCPT ); Sun, 25 Oct 2020 10:31:44 -0400 Received: by mail-qk1-f193.google.com with SMTP id t128so1007642qke.0; Sun, 25 Oct 2020 07:31:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8jTZMyqTbp+NmIDuVNdFO2SHgdZdzfvocT/pY+Bhan4=; b=I1lU0UcNCibq/bpPZxnxgncNdkHpIv46x/n3HD4OD/4q+/j6swlFc8R7I95GOEVByQ 59+43eM5mOHbX+W0ugiZk2Q/slXlgzJ5F0Jyzc8+0yT5nGHbRnBfsNsAjqVpIsXWH8Ev HDuL3rMKm9Y8ZiHnbJbtlkti9hXYfx9JOzIm51a9J14sedn0PTwJXCdyDexw/TW8WY7i dVMHG/sxaFSKSC41JTtSq7qYzkRmxDPsRQpHQBSmOPeBAVFM4IEuYQD8x/4SaGMv4J95 f5mmjPjVc2Qjl5XjczynHULL+TlO4fH7SOHY/rcQe06PcNKXcPp9o+qy1rnLiGuoiQn+ MOaA== X-Gm-Message-State: AOAM532OPB7c9uC20tJaPG38BZY+wtzwLGOknfd3zqIGjYw39ffhLQLD lbWURlDS1hVpAiFbRoQj9yvcDGQit1drKw== X-Google-Smtp-Source: ABdhPJzuueTo+fZMtEu39Pyuk4FNYAdp/A/Y4MPBKvVhsMOn3q+XhrP00SYwbPbus0zzxA0Bd70FaA== X-Received: by 2002:a37:a045:: with SMTP id j66mr11744247qke.305.1603636284600; Sun, 25 Oct 2020 07:31:24 -0700 (PDT) Received: from rani.riverdale.lan ([2001:470:1f07:5f3::b55f]) by smtp.gmail.com with ESMTPSA id s73sm4740898qke.71.2020.10.25.07.31.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Oct 2020 07:31:23 -0700 (PDT) From: Arvind Sankar To: Herbert Xu , "David S. Miller" , "linux-crypto@vger.kernel.org" , Eric Biggers , David Laight Cc: linux-kernel@vger.kernel.org Subject: [PATCH v4 2/6] crypto: Use memzero_explicit() for clearing state Date: Sun, 25 Oct 2020 10:31:15 -0400 Message-Id: <20201025143119.1054168-3-nivedita@alum.mit.edu> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201025143119.1054168-1-nivedita@alum.mit.edu> References: <20201025143119.1054168-1-nivedita@alum.mit.edu> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Without the barrier_data() inside memzero_explicit(), the compiler may optimize away the state-clearing if it can tell that the state is not used afterwards. Signed-off-by: Arvind Sankar Acked-by: Ard Biesheuvel --- arch/arm64/crypto/ghash-ce-glue.c | 2 +- arch/arm64/crypto/poly1305-glue.c | 2 +- arch/arm64/crypto/sha3-ce-glue.c | 2 +- arch/x86/crypto/poly1305_glue.c | 2 +- include/crypto/sha1_base.h | 3 ++- include/crypto/sha256_base.h | 3 ++- include/crypto/sha512_base.h | 3 ++- include/crypto/sm3_base.h | 3 ++- 8 files changed, 12 insertions(+), 8 deletions(-) diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c index 8536008e3e35..2427e2f3a9a1 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -168,7 +168,7 @@ static int ghash_final(struct shash_desc *desc, u8 *dst) put_unaligned_be64(ctx->digest[1], dst); put_unaligned_be64(ctx->digest[0], dst + 8); - *ctx = (struct ghash_desc_ctx){}; + memzero_explicit(ctx, sizeof(*ctx)); return 0; } diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c index f33ada70c4ed..683de671741a 100644 --- a/arch/arm64/crypto/poly1305-glue.c +++ b/arch/arm64/crypto/poly1305-glue.c @@ -177,7 +177,7 @@ void poly1305_final_arch(struct poly1305_desc_ctx *dctx, u8 *dst) } poly1305_emit(&dctx->h, dst, dctx->s); - *dctx = (struct poly1305_desc_ctx){}; + memzero_explicit(dctx, sizeof(*dctx)); } EXPORT_SYMBOL(poly1305_final_arch); diff --git a/arch/arm64/crypto/sha3-ce-glue.c b/arch/arm64/crypto/sha3-ce-glue.c index 9a4bbfc45f40..e5a2936f0886 100644 --- a/arch/arm64/crypto/sha3-ce-glue.c +++ b/arch/arm64/crypto/sha3-ce-glue.c @@ -94,7 +94,7 @@ static int sha3_final(struct shash_desc *desc, u8 *out) if (digest_size & 4) put_unaligned_le32(sctx->st[i], (__le32 *)digest); - *sctx = (struct sha3_state){}; + memzero_explicit(sctx, sizeof(*sctx)); return 0; } diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index e508dbd91813..64d09520d279 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -209,7 +209,7 @@ void poly1305_final_arch(struct poly1305_desc_ctx *dctx, u8 *dst) } poly1305_simd_emit(&dctx->h, dst, dctx->s); - *dctx = (struct poly1305_desc_ctx){}; + memzero_explicit(dctx, sizeof(*dctx)); } EXPORT_SYMBOL(poly1305_final_arch); diff --git a/include/crypto/sha1_base.h b/include/crypto/sha1_base.h index 20fd1f7468af..a5d6033efef7 100644 --- a/include/crypto/sha1_base.h +++ b/include/crypto/sha1_base.h @@ -12,6 +12,7 @@ #include #include #include +#include #include @@ -101,7 +102,7 @@ static inline int sha1_base_finish(struct shash_desc *desc, u8 *out) for (i = 0; i < SHA1_DIGEST_SIZE / sizeof(__be32); i++) put_unaligned_be32(sctx->state[i], digest++); - *sctx = (struct sha1_state){}; + memzero_explicit(sctx, sizeof(*sctx)); return 0; } diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h index 6ded110783ae..93f9fd21cc06 100644 --- a/include/crypto/sha256_base.h +++ b/include/crypto/sha256_base.h @@ -12,6 +12,7 @@ #include #include #include +#include #include @@ -105,7 +106,7 @@ static inline int sha256_base_finish(struct shash_desc *desc, u8 *out) for (i = 0; digest_size > 0; i++, digest_size -= sizeof(__be32)) put_unaligned_be32(sctx->state[i], digest++); - *sctx = (struct sha256_state){}; + memzero_explicit(sctx, sizeof(*sctx)); return 0; } diff --git a/include/crypto/sha512_base.h b/include/crypto/sha512_base.h index fb19c77494dc..93ab73baa38e 100644 --- a/include/crypto/sha512_base.h +++ b/include/crypto/sha512_base.h @@ -12,6 +12,7 @@ #include #include #include +#include #include @@ -126,7 +127,7 @@ static inline int sha512_base_finish(struct shash_desc *desc, u8 *out) for (i = 0; digest_size > 0; i++, digest_size -= sizeof(__be64)) put_unaligned_be64(sctx->state[i], digest++); - *sctx = (struct sha512_state){}; + memzero_explicit(sctx, sizeof(*sctx)); return 0; } diff --git a/include/crypto/sm3_base.h b/include/crypto/sm3_base.h index 1cbf9aa1fe52..2f3a32ab97bb 100644 --- a/include/crypto/sm3_base.h +++ b/include/crypto/sm3_base.h @@ -13,6 +13,7 @@ #include #include #include +#include #include typedef void (sm3_block_fn)(struct sm3_state *sst, u8 const *src, int blocks); @@ -104,7 +105,7 @@ static inline int sm3_base_finish(struct shash_desc *desc, u8 *out) for (i = 0; i < SM3_DIGEST_SIZE / sizeof(__be32); i++) put_unaligned_be32(sctx->state[i], digest++); - *sctx = (struct sm3_state){}; + memzero_explicit(sctx, sizeof(*sctx)); return 0; } From patchwork Sun Oct 25 14:31:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arvind Sankar X-Patchwork-Id: 11855283 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A587EC55178 for ; Sun, 25 Oct 2020 14:31:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6BE472085B for ; Sun, 25 Oct 2020 14:31:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1415098AbgJYObj (ORCPT ); Sun, 25 Oct 2020 10:31:39 -0400 Received: from mail-qt1-f193.google.com ([209.85.160.193]:40116 "EHLO mail-qt1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1414982AbgJYObj (ORCPT ); Sun, 25 Oct 2020 10:31:39 -0400 Received: by mail-qt1-f193.google.com with SMTP id m9so4899067qth.7; Sun, 25 Oct 2020 07:31:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=i4eX+361CNFGQN2UQLtgO0wdjlQPWSbW5hz94r5g1po=; b=tScp24scpYi2zAVXN+6cWi7UoQWz5yBBWFgZ5tgQ076hhFabZKGCc825tuzy2TNdz8 m3Lo/aDSMI6cWWTsIbNciemzSqHdZUcXRGBXZjqCFmNt+FlVQzqU/KeavkNjvQz+bepY ZyuC6lHp6yBc9mPbK6s/mE6jA/oRFWK/JsLL+Vb2IxL87srq0EV+t/kGYpIDL4svzKax eam5Z8LOGBVB9MF0CGT3ymIS/uj7CiCJ9gSczeoBjO85TTwXb7h06CIAADs9YTl1VW1a hwjFa2zrxgoAzNv9Dp+hQrn5IM5HW0VYaKhJxWoruxlmJj8x1TS5X3fEwl+1Y8TRkupS Amyg== X-Gm-Message-State: AOAM53350OucaiVPtJ1tYsA+NaaiePa8CgRCjE2OYSb8fCVZfWF1ggF5 u00/PyWcq8pA1L3ynFV3QTqLoJhfXuxTEQ== X-Google-Smtp-Source: ABdhPJzsyfCxly9OaP5gTUypi9SJ3XskYCAgFs2n8MAOXaQkSc0VJXtdrA9GC6ZXK/ovTZ5RkX0y+A== X-Received: by 2002:ac8:4e0d:: with SMTP id c13mr11383519qtw.382.1603636285799; Sun, 25 Oct 2020 07:31:25 -0700 (PDT) Received: from rani.riverdale.lan ([2001:470:1f07:5f3::b55f]) by smtp.gmail.com with ESMTPSA id s73sm4740898qke.71.2020.10.25.07.31.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Oct 2020 07:31:24 -0700 (PDT) From: Arvind Sankar To: Herbert Xu , "David S. Miller" , "linux-crypto@vger.kernel.org" , Eric Biggers , David Laight Cc: linux-kernel@vger.kernel.org, Eric Biggers Subject: [PATCH v4 3/6] crypto: lib/sha256 - Don't clear temporary variables Date: Sun, 25 Oct 2020 10:31:16 -0400 Message-Id: <20201025143119.1054168-4-nivedita@alum.mit.edu> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201025143119.1054168-1-nivedita@alum.mit.edu> References: <20201025143119.1054168-1-nivedita@alum.mit.edu> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org The assignments to clear a through h and t1/t2 are optimized out by the compiler because they are unused after the assignments. Clearing individual scalar variables is unlikely to be useful, as they may have been assigned to registers, and even if stack spilling was required, there may be compiler-generated temporaries that are impossible to clear in any case. So drop the clearing of a through h and t1/t2. Signed-off-by: Arvind Sankar Reviewed-by: Eric Biggers Acked-by: Ard Biesheuvel --- lib/crypto/sha256.c | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c index d43bc39ab05e..099cd11f83c1 100644 --- a/lib/crypto/sha256.c +++ b/lib/crypto/sha256.c @@ -202,7 +202,6 @@ static void sha256_transform(u32 *state, const u8 *input) state[4] += e; state[5] += f; state[6] += g; state[7] += h; /* clear any sensitive info... */ - a = b = c = d = e = f = g = h = t1 = t2 = 0; memzero_explicit(W, 64 * sizeof(u32)); } From patchwork Sun Oct 25 14:31:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arvind Sankar X-Patchwork-Id: 11855277 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C392BC4363A for ; Sun, 25 Oct 2020 14:31:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 916B5208E4 for ; Sun, 25 Oct 2020 14:31:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1416635AbgJYOb2 (ORCPT ); Sun, 25 Oct 2020 10:31:28 -0400 Received: from mail-qt1-f196.google.com ([209.85.160.196]:40120 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1414677AbgJYOb2 (ORCPT ); Sun, 25 Oct 2020 10:31:28 -0400 Received: by mail-qt1-f196.google.com with SMTP id m9so4899089qth.7; Sun, 25 Oct 2020 07:31:27 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bi7rG0H94VO76zPLa4Hh0/sN75asB3cy5CcRjg5R6zI=; b=nyp0VgarkLXREJjtBpcWE0kZDqvPQ0R2yuPDrRgGg2lEClPv/Slonlc94thftyEBHV pwHUqAjSix+ap5kdAfn1YGOeI19gBHm2vru48ThN9OfE8RqbzbcQBO9D05qdGXf3jptG 1HQGULOuHNGncGYDyMEuxFTmFqa2nDcVfOaNv7NhyS9p4SnaA1d2+gZZNHsR50iNZv5x w3tRu6XRVaVOaHdChrKovEl5n/THltSuE28ZBBO3tmPJ2oOju2RTsKXvvrS8eRPLV1fJ lLnD8T/a/ZQlVJwwAHsjjdkRk8PYEx6eCHFd2KDb2Yqq6usLwZuztS1jYgv+uhcaShRk lgOw== X-Gm-Message-State: AOAM531tKJupFq/8CXE+PD/M+n4tLdrMnYmgFyOMwOFs93Hrrdi9B1Yt u4Kh4Y4GEMQWtbYfnFNKX1Q= X-Google-Smtp-Source: ABdhPJy6TJ7OnnPMnOVBY7QpOsC5dVA//baSaIV9SoClXC96369xl8GnFioUvT3jfKCGp6Xr8IyUnw== X-Received: by 2002:a05:622a:242:: with SMTP id c2mr12600299qtx.230.1603636287091; Sun, 25 Oct 2020 07:31:27 -0700 (PDT) Received: from rani.riverdale.lan ([2001:470:1f07:5f3::b55f]) by smtp.gmail.com with ESMTPSA id s73sm4740898qke.71.2020.10.25.07.31.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Oct 2020 07:31:26 -0700 (PDT) From: Arvind Sankar To: Herbert Xu , "David S. Miller" , "linux-crypto@vger.kernel.org" , Eric Biggers , David Laight Cc: linux-kernel@vger.kernel.org, Eric Biggers Subject: [PATCH v4 4/6] crypto: lib/sha256 - Clear W[] in sha256_update() instead of sha256_transform() Date: Sun, 25 Oct 2020 10:31:17 -0400 Message-Id: <20201025143119.1054168-5-nivedita@alum.mit.edu> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201025143119.1054168-1-nivedita@alum.mit.edu> References: <20201025143119.1054168-1-nivedita@alum.mit.edu> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org The temporary W[] array is currently zeroed out once every call to sha256_transform(), i.e. once every 64 bytes of input data. Moving it to sha256_update() instead so that it is cleared only once per update can save about 2-3% of the total time taken to compute the digest, with a reasonable memset() implementation, and considerably more (~20%) with a bad one (eg the x86 purgatory currently uses a memset() coded in C). Signed-off-by: Arvind Sankar Reviewed-by: Eric Biggers Acked-by: Ard Biesheuvel --- lib/crypto/sha256.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c index 099cd11f83c1..c6bfeacc5b81 100644 --- a/lib/crypto/sha256.c +++ b/lib/crypto/sha256.c @@ -43,10 +43,9 @@ static inline void BLEND_OP(int I, u32 *W) W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16]; } -static void sha256_transform(u32 *state, const u8 *input) +static void sha256_transform(u32 *state, const u8 *input, u32 *W) { u32 a, b, c, d, e, f, g, h, t1, t2; - u32 W[64]; int i; /* load the input */ @@ -200,15 +199,13 @@ static void sha256_transform(u32 *state, const u8 *input) state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e; state[5] += f; state[6] += g; state[7] += h; - - /* clear any sensitive info... */ - memzero_explicit(W, 64 * sizeof(u32)); } void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len) { unsigned int partial, done; const u8 *src; + u32 W[64]; partial = sctx->count & 0x3f; sctx->count += len; @@ -223,11 +220,13 @@ void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len) } do { - sha256_transform(sctx->state, src); + sha256_transform(sctx->state, src, W); done += 64; src = data + done; } while (done + 63 < len); + memzero_explicit(W, sizeof(W)); + partial = 0; } memcpy(sctx->buf + partial, src, len - done); From patchwork Sun Oct 25 14:31:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arvind Sankar X-Patchwork-Id: 11855285 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D542EC388F7 for ; Sun, 25 Oct 2020 14:31:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 952302085B for ; Sun, 25 Oct 2020 14:31:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1416757AbgJYObb (ORCPT ); Sun, 25 Oct 2020 10:31:31 -0400 Received: from mail-qv1-f66.google.com ([209.85.219.66]:33992 "EHLO mail-qv1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1416634AbgJYOba (ORCPT ); Sun, 25 Oct 2020 10:31:30 -0400 Received: by mail-qv1-f66.google.com with SMTP id g13so3134885qvu.1; Sun, 25 Oct 2020 07:31:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9NO4O2bt1CaoBiX1vuLjpgxnMXq9A7GoXHAtZwYww04=; b=kB4gPWbeYfyDnwF3AZhkPIuvBFlrLcG+WrG88okQs0S+aj0Dnmr5HN7t0w9xvkWQJ2 8G6sgf4YJy/ZydMnKIXl/zUTXdCOS8g7McKDTWqgbjjvHPzxaNs/15S0pnlUEZbtfsLz D6/3gTORcl4bDT9LtlMCppaSkDUXI5ZO8dx7FwxZaWeCPZcKii8dAN3xup5PnwN117do uudbI1bmds6EC4ZTYb6+tN8SWE8aIJlCNwIqNLrUgEvK4JVtxLonxyZJaBnEVnG49ZpZ JgSEQu1Sgc/LybYJHgqZJcBjUKPud77iQVjJEbK5YGkxwRx4N0Tta0BoboMxaW8nmMFD NfSQ== X-Gm-Message-State: AOAM533MCk+mjEFvu0mSIAGuPY14efYQ87SBLQzfPZYepaR/rrEBmPuO PiDo5zVla6vuDV+PDtFbitw= X-Google-Smtp-Source: ABdhPJyA1CMm5TMc6hS5T3Hn0bt5JhdwS2oIStPI9TZMoKPGRdWrCvRawEGtCDoKm2NZ0YVRG0+qug== X-Received: by 2002:a0c:b5e1:: with SMTP id o33mr9191231qvf.17.1603636288086; Sun, 25 Oct 2020 07:31:28 -0700 (PDT) Received: from rani.riverdale.lan ([2001:470:1f07:5f3::b55f]) by smtp.gmail.com with ESMTPSA id s73sm4740898qke.71.2020.10.25.07.31.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Oct 2020 07:31:27 -0700 (PDT) From: Arvind Sankar To: Herbert Xu , "David S. Miller" , "linux-crypto@vger.kernel.org" , Eric Biggers , David Laight Cc: linux-kernel@vger.kernel.org, Eric Biggers Subject: [PATCH v4 5/6] crypto: lib/sha256 - Unroll SHA256 loop 8 times intead of 64 Date: Sun, 25 Oct 2020 10:31:18 -0400 Message-Id: <20201025143119.1054168-6-nivedita@alum.mit.edu> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201025143119.1054168-1-nivedita@alum.mit.edu> References: <20201025143119.1054168-1-nivedita@alum.mit.edu> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org This reduces code size substantially (on x86_64 with gcc-10 the size of sha256_update() goes from 7593 bytes to 1952 bytes including the new SHA256_K array), and on x86 is slightly faster than the full unroll (tested on Broadwell Xeon). Signed-off-by: Arvind Sankar Reviewed-by: Eric Biggers Acked-by: Ard Biesheuvel --- lib/crypto/sha256.c | 174 ++++++++++---------------------------------- 1 file changed, 38 insertions(+), 136 deletions(-) diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c index c6bfeacc5b81..e2e29d9b0ccd 100644 --- a/lib/crypto/sha256.c +++ b/lib/crypto/sha256.c @@ -18,6 +18,25 @@ #include #include +static const u32 SHA256_K[] = { + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, + 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, + 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, + 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, + 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, + 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, + 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, + 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, + 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, + 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, + 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, + 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, + 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, + 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2, +}; + static inline u32 Ch(u32 x, u32 y, u32 z) { return z ^ (x & (y ^ z)); @@ -43,9 +62,17 @@ static inline void BLEND_OP(int I, u32 *W) W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16]; } +#define SHA256_ROUND(i, a, b, c, d, e, f, g, h) do { \ + u32 t1, t2; \ + t1 = h + e1(e) + Ch(e, f, g) + SHA256_K[i] + W[i]; \ + t2 = e0(a) + Maj(a, b, c); \ + d += t1; \ + h = t1 + t2; \ +} while (0) + static void sha256_transform(u32 *state, const u8 *input, u32 *W) { - u32 a, b, c, d, e, f, g, h, t1, t2; + u32 a, b, c, d, e, f, g, h; int i; /* load the input */ @@ -61,141 +88,16 @@ static void sha256_transform(u32 *state, const u8 *input, u32 *W) e = state[4]; f = state[5]; g = state[6]; h = state[7]; /* now iterate */ - t1 = h + e1(e) + Ch(e, f, g) + 0x428a2f98 + W[0]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0x71374491 + W[1]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0xb5c0fbcf + W[2]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0xe9b5dba5 + W[3]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0x3956c25b + W[4]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0x59f111f1 + W[5]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0x923f82a4 + W[6]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0xab1c5ed5 + W[7]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0xd807aa98 + W[8]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0x12835b01 + W[9]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0x243185be + W[10]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0x550c7dc3 + W[11]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0x72be5d74 + W[12]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0x80deb1fe + W[13]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0x9bdc06a7 + W[14]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0xc19bf174 + W[15]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0xe49b69c1 + W[16]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0xefbe4786 + W[17]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0x0fc19dc6 + W[18]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0x240ca1cc + W[19]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0x2de92c6f + W[20]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0x4a7484aa + W[21]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0x5cb0a9dc + W[22]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0x76f988da + W[23]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0x983e5152 + W[24]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0xa831c66d + W[25]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0xb00327c8 + W[26]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0xbf597fc7 + W[27]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0xc6e00bf3 + W[28]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0xd5a79147 + W[29]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0x06ca6351 + W[30]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0x14292967 + W[31]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0x27b70a85 + W[32]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0x2e1b2138 + W[33]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0x4d2c6dfc + W[34]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0x53380d13 + W[35]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0x650a7354 + W[36]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0x766a0abb + W[37]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0x81c2c92e + W[38]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0x92722c85 + W[39]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0xa2bfe8a1 + W[40]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0xa81a664b + W[41]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0xc24b8b70 + W[42]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0xc76c51a3 + W[43]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0xd192e819 + W[44]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0xd6990624 + W[45]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0xf40e3585 + W[46]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0x106aa070 + W[47]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0x19a4c116 + W[48]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0x1e376c08 + W[49]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0x2748774c + W[50]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0x34b0bcb5 + W[51]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0x391c0cb3 + W[52]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0x4ed8aa4a + W[53]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0x5b9cca4f + W[54]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0x682e6ff3 + W[55]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; - - t1 = h + e1(e) + Ch(e, f, g) + 0x748f82ee + W[56]; - t2 = e0(a) + Maj(a, b, c); d += t1; h = t1 + t2; - t1 = g + e1(d) + Ch(d, e, f) + 0x78a5636f + W[57]; - t2 = e0(h) + Maj(h, a, b); c += t1; g = t1 + t2; - t1 = f + e1(c) + Ch(c, d, e) + 0x84c87814 + W[58]; - t2 = e0(g) + Maj(g, h, a); b += t1; f = t1 + t2; - t1 = e + e1(b) + Ch(b, c, d) + 0x8cc70208 + W[59]; - t2 = e0(f) + Maj(f, g, h); a += t1; e = t1 + t2; - t1 = d + e1(a) + Ch(a, b, c) + 0x90befffa + W[60]; - t2 = e0(e) + Maj(e, f, g); h += t1; d = t1 + t2; - t1 = c + e1(h) + Ch(h, a, b) + 0xa4506ceb + W[61]; - t2 = e0(d) + Maj(d, e, f); g += t1; c = t1 + t2; - t1 = b + e1(g) + Ch(g, h, a) + 0xbef9a3f7 + W[62]; - t2 = e0(c) + Maj(c, d, e); f += t1; b = t1 + t2; - t1 = a + e1(f) + Ch(f, g, h) + 0xc67178f2 + W[63]; - t2 = e0(b) + Maj(b, c, d); e += t1; a = t1 + t2; + for (i = 0; i < 64; i += 8) { + SHA256_ROUND(i + 0, a, b, c, d, e, f, g, h); + SHA256_ROUND(i + 1, h, a, b, c, d, e, f, g); + SHA256_ROUND(i + 2, g, h, a, b, c, d, e, f); + SHA256_ROUND(i + 3, f, g, h, a, b, c, d, e); + SHA256_ROUND(i + 4, e, f, g, h, a, b, c, d); + SHA256_ROUND(i + 5, d, e, f, g, h, a, b, c); + SHA256_ROUND(i + 6, c, d, e, f, g, h, a, b); + SHA256_ROUND(i + 7, b, c, d, e, f, g, h, a); + } state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e; state[5] += f; state[6] += g; state[7] += h; From patchwork Sun Oct 25 14:31:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arvind Sankar X-Patchwork-Id: 11855281 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 248F5C4363A for ; Sun, 25 Oct 2020 14:31:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F2A242085B for ; Sun, 25 Oct 2020 14:31:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1416781AbgJYObn (ORCPT ); Sun, 25 Oct 2020 10:31:43 -0400 Received: from mail-qv1-f68.google.com ([209.85.219.68]:35466 "EHLO mail-qv1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1415244AbgJYObm (ORCPT ); Sun, 25 Oct 2020 10:31:42 -0400 Received: by mail-qv1-f68.google.com with SMTP id cv1so3134923qvb.2; Sun, 25 Oct 2020 07:31:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rybpjupXWyCo0yLZ9NEgjQq2QiRjNeLKs2uYGAtk9sY=; b=jIi59V8fg766tRd0f4HbdmPRyVIJf/KnKAebRnoQUaZybUE/lMGlo/Sgp00BG6CKXM dyo5Yw6YNoVfoaQbnMymg7EbBX9KL23hxNNEvPbMgK6PXpz1vxqxcJcJczzp15e+G7Gk FX1iMpCLEPQMv0YlZd4ht/HutMhQVp2+akp5lWSW/rSVdxyMgCfVeJ6LV3cUsShaXYVX 4wtCdgfDynElEv1jNPS4s7rKUXxtK1Rn3zV0mVSig/qMYOMGFN4To4Ek8R4SnStdb4ZH cY1Av+RK6wrqQk+nuJeGxz+U5dn263Xykr1Ah4pp2qBWHW8xtWq10Dz9AuzV1jQGf4Uf q2BQ== X-Gm-Message-State: AOAM531d4AILjWNP/Tl4EA3fX7B5Cxl+kIx4wDE0daH6o2WKmajDg9ub OjNdRv2oAb693lPzykq9yK4= X-Google-Smtp-Source: ABdhPJxsUNDbzN+EXbhacs1WTTLV7SYvwCSmhyoEEWNK5Iat+TXSXLYY98tgMkwvLMmZzQfTIZWOcA== X-Received: by 2002:a05:6214:174f:: with SMTP id dc15mr8374974qvb.26.1603636289075; Sun, 25 Oct 2020 07:31:29 -0700 (PDT) Received: from rani.riverdale.lan ([2001:470:1f07:5f3::b55f]) by smtp.gmail.com with ESMTPSA id s73sm4740898qke.71.2020.10.25.07.31.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Oct 2020 07:31:28 -0700 (PDT) From: Arvind Sankar To: Herbert Xu , "David S. Miller" , "linux-crypto@vger.kernel.org" , Eric Biggers , David Laight Cc: linux-kernel@vger.kernel.org, Eric Biggers Subject: [PATCH v4 6/6] crypto: lib/sha256 - Unroll LOAD and BLEND loops Date: Sun, 25 Oct 2020 10:31:19 -0400 Message-Id: <20201025143119.1054168-7-nivedita@alum.mit.edu> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201025143119.1054168-1-nivedita@alum.mit.edu> References: <20201025143119.1054168-1-nivedita@alum.mit.edu> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Unrolling the LOAD and BLEND loops improves performance by ~8% on x86_64 (tested on Broadwell Xeon) while not increasing code size too much. Signed-off-by: Arvind Sankar Reviewed-by: Eric Biggers Acked-by: Ard Biesheuvel --- lib/crypto/sha256.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c index e2e29d9b0ccd..cdef37c05972 100644 --- a/lib/crypto/sha256.c +++ b/lib/crypto/sha256.c @@ -76,12 +76,28 @@ static void sha256_transform(u32 *state, const u8 *input, u32 *W) int i; /* load the input */ - for (i = 0; i < 16; i++) - LOAD_OP(i, W, input); + for (i = 0; i < 16; i += 8) { + LOAD_OP(i + 0, W, input); + LOAD_OP(i + 1, W, input); + LOAD_OP(i + 2, W, input); + LOAD_OP(i + 3, W, input); + LOAD_OP(i + 4, W, input); + LOAD_OP(i + 5, W, input); + LOAD_OP(i + 6, W, input); + LOAD_OP(i + 7, W, input); + } /* now blend */ - for (i = 16; i < 64; i++) - BLEND_OP(i, W); + for (i = 16; i < 64; i += 8) { + BLEND_OP(i + 0, W); + BLEND_OP(i + 1, W); + BLEND_OP(i + 2, W); + BLEND_OP(i + 3, W); + BLEND_OP(i + 4, W); + BLEND_OP(i + 5, W); + BLEND_OP(i + 6, W); + BLEND_OP(i + 7, W); + } /* load the state into our registers */ a = state[0]; b = state[1]; c = state[2]; d = state[3];