
crypto: don't optimize keccakf()

Message ID: 20180608095341.92777-1-dvyukov@google.com
State: Accepted
Delegated to: Herbert Xu

Commit Message

Dmitry Vyukov June 8, 2018, 9:53 a.m. UTC
keccakf() is the only function in the kernel that uses the __optimize() macro.
__optimize() breaks the frame pointer unwinder because the optimized code
uses RBP, and amusingly it always leads to degraded performance, as gcc does
not inline across different optimization levels: keccakf() wasn't inlined
into its callers and keccakf_round() wasn't inlined into keccakf().

Drop __optimize() to resolve both problems.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 83dee2ce1ae7 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
Reported-by: syzbot+37035ccfa9a0a017ffcf@syzkaller.appspotmail.com
Reported-by: syzbot+e073e4740cfbb3ae200b@syzkaller.appspotmail.com
Cc: linux-crypto@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 crypto/sha3_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
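
To make the inlining point in the commit message concrete, here is a minimal sketch rather than the kernel code. It assumes that __optimize(level) expands to GCC's per-function optimize attribute; DEMO_OPTIMIZE, demo_rotl64(), demo_round(), demo_permute_o3() and demo_permute_plain() are hypothetical names, and the round function is a toy stand-in, not Keccak. The two permute functions differ only in the attribute, mirroring the one-line change in the patch.

```c
#include <stdint.h>

/*
 * Assumption: the kernel's __optimize(level) is taken to be a wrapper
 * around GCC's per-function "optimize" attribute. Clang does not implement
 * that attribute, so the hypothetical DEMO_OPTIMIZE macro is empty there.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define DEMO_OPTIMIZE(level) __attribute__((__optimize__(level)))
#else
#define DEMO_OPTIMIZE(level)
#endif

/* Rotate left by n bits; callers must pass 1 <= n <= 63. */
static inline uint64_t demo_rotl64(uint64_t x, unsigned int n)
{
	return (x << n) | (x >> (64 - n));
}

/* Toy stand-in for keccakf_round(): a small helper meant to be inlined. */
static inline void demo_round(uint64_t st[4])
{
	uint64_t t = st[0] ^ st[1] ^ st[2] ^ st[3];
	unsigned int i;

	for (i = 0; i < 4; i++)
		st[i] = demo_rotl64(st[i] ^ t, i + 1);
}

/*
 * Shape of keccakf() before the patch: the function carries its own
 * optimization level, which differs from that of demo_round() and of
 * any caller built at the file's default level.
 */
static void DEMO_OPTIMIZE("O3") demo_permute_o3(uint64_t st[4])
{
	int round;

	for (round = 0; round < 24; round++)
		demo_round(st);
}

/* Shape of keccakf() after the patch: no per-function override. */
static void demo_permute_plain(uint64_t st[4])
{
	int round;

	for (round = 0; round < 24; round++)
		demo_round(st);
}
```

Both variants compute the same result; only code generation can differ. One way to observe the effect described above is to call both from a small main(), build with gcc -O2 -S, and check whether a call to demo_round remains in demo_permute_o3. The outcome may vary with compiler version and flags.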

Comments

Ard Biesheuvel June 8, 2018, 9:54 a.m. UTC | #1
On 8 June 2018 at 11:53, Dmitry Vyukov <dvyukov@google.com> wrote:
> keccakf() is the only function in the kernel that uses the __optimize() macro.
> __optimize() breaks the frame pointer unwinder because the optimized code
> uses RBP, and amusingly it always leads to degraded performance, as gcc does
> not inline across different optimization levels: keccakf() wasn't inlined
> into its callers and keccakf_round() wasn't inlined into keccakf().
>
> Drop __optimize() to resolve both problems.
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> Fixes: 83dee2ce1ae7 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
> Reported-by: syzbot+37035ccfa9a0a017ffcf@syzkaller.appspotmail.com
> Reported-by: syzbot+e073e4740cfbb3ae200b@syzkaller.appspotmail.com
> Cc: linux-crypto@vger.kernel.org
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>  crypto/sha3_generic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/crypto/sha3_generic.c b/crypto/sha3_generic.c
> index 264ec12c0b9c..7f6735d9003f 100644
> --- a/crypto/sha3_generic.c
> +++ b/crypto/sha3_generic.c
> @@ -152,7 +152,7 @@ static SHA3_INLINE void keccakf_round(u64 st[25])
>         st[24] ^= bc[ 4];
>  }
>
> -static void __optimize("O3") keccakf(u64 st[25])
> +static void keccakf(u64 st[25])
>  {
>         int round;
>
> --
> 2.18.0.rc1.242.g61856ae69a-goog
>
Herbert Xu June 15, 2018, 3:14 p.m. UTC | #2
On Fri, Jun 08, 2018 at 11:53:41AM +0200, Dmitry Vyukov wrote:
> keccakf() is the only function in the kernel that uses the __optimize() macro.
> __optimize() breaks the frame pointer unwinder because the optimized code
> uses RBP, and amusingly it always leads to degraded performance, as gcc does
> not inline across different optimization levels: keccakf() wasn't inlined
> into its callers and keccakf_round() wasn't inlined into keccakf().
> 
> Drop __optimize() to resolve both problems.
> 
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> Fixes: 83dee2ce1ae7 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
> Reported-by: syzbot+37035ccfa9a0a017ffcf@syzkaller.appspotmail.com
> Reported-by: syzbot+e073e4740cfbb3ae200b@syzkaller.appspotmail.com
> Cc: linux-crypto@vger.kernel.org
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Patch applied.  Thanks.

Patch

diff --git a/crypto/sha3_generic.c b/crypto/sha3_generic.c
index 264ec12c0b9c..7f6735d9003f 100644
--- a/crypto/sha3_generic.c
+++ b/crypto/sha3_generic.c
@@ -152,7 +152,7 @@  static SHA3_INLINE void keccakf_round(u64 st[25])
 	st[24] ^= bc[ 4];
 }
 
-static void __optimize("O3") keccakf(u64 st[25])
+static void keccakf(u64 st[25])
 {
 	int round;