From patchwork Sat Aug 4 18:46:24 2018
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 10555855
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: Ard Biesheuvel, jerome.forissier@linaro.org, herbert@gondor.apana.org.au,
	linux-arm-kernel@lists.infradead.org, jens.wiklander@linaro.org
Subject: [PATCH 1/2] crypto: arm64/ghash-ce - replace NEON yield check with block limit
Date: Sat, 4 Aug 2018 20:46:24 +0200
Message-Id: <20180804184625.28523-2-ard.biesheuvel@linaro.org>
In-Reply-To: <20180804184625.28523-1-ard.biesheuvel@linaro.org>
References: <20180804184625.28523-1-ard.biesheuvel@linaro.org>

Checking the TIF_NEED_RESCHED flag is disproportionately costly on cores
with fast crypto instructions and comparatively slow memory accesses.

For algorithms such as GHASH, which executes at ~1 cycle per byte on cores
that implement support for 64-bit polynomial multiplication, there is
really no need to check the TIF_NEED_RESCHED flag particularly often, and
so we can remove the NEON yield check from the assembler routines.

However, unlike the AEAD or skcipher APIs, the shash/ahash APIs take
arbitrary input lengths, so some sanity check is still needed to ensure
that we don't hog the CPU for excessive amounts of time. So let's simply
cap the maximum input size that is processed in one go to 64 KB.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/ghash-ce-core.S | 39 ++++++--------------
 arch/arm64/crypto/ghash-ce-glue.c | 16 ++++++--
 2 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S
index 913e49932ae6..344811c6a0ca 100644
--- a/arch/arm64/crypto/ghash-ce-core.S
+++ b/arch/arm64/crypto/ghash-ce-core.S
@@ -213,31 +213,23 @@
 	.endm

 	.macro		__pmull_ghash, pn
-	frame_push	5
-
-	mov		x19, x0
-	mov		x20, x1
-	mov		x21, x2
-	mov		x22, x3
-	mov		x23, x4
-
-0:	ld1		{SHASH.2d}, [x22]
-	ld1		{XL.2d}, [x20]
+	ld1		{SHASH.2d}, [x3]
+	ld1		{XL.2d}, [x1]
 	ext		SHASH2.16b, SHASH.16b, SHASH.16b, #8
 	eor		SHASH2.16b, SHASH2.16b, SHASH.16b

 	__pmull_pre_\pn

 	/* do the head block first, if supplied */
-	cbz		x23, 1f
-	ld1		{T1.2d}, [x23]
-	mov		x23, xzr
-	b		2f
+	cbz		x4, 0f
+	ld1		{T1.2d}, [x4]
+	mov		x4, xzr
+	b		1f

-1:	ld1		{T1.2d}, [x21], #16
-	sub		w19, w19, #1
+0:	ld1		{T1.2d}, [x2], #16
+	sub		w0, w0, #1

-2:	/* multiply XL by SHASH in GF(2^128) */
+1:	/* multiply XL by SHASH in GF(2^128) */
 CPU_LE(	rev64		T1.16b, T1.16b	)

 	ext		T2.16b, XL.16b, XL.16b, #8
@@ -259,18 +251,9 @@ CPU_LE(	rev64		T1.16b, T1.16b	)
 	eor		T2.16b, T2.16b, XH.16b
 	eor		XL.16b, XL.16b, T2.16b

-	cbz		w19, 3f
-
-	if_will_cond_yield_neon
-	st1		{XL.2d}, [x20]
-	do_cond_yield_neon
-	b		0b
-	endif_yield_neon
-
-	b		1b
+	cbnz		w0, 0b

-3:	st1		{XL.2d}, [x20]
-	frame_pop
+	st1		{XL.2d}, [x1]
 	ret
 	.endm

diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 88e3d93fa7c7..03ce71ea81a2 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -113,6 +113,9 @@ static void ghash_do_update(int blocks, u64 dg[], const char *src,
 	}
 }

+/* avoid hogging the CPU for too long */
+#define MAX_BLOCKS	(SZ_64K / GHASH_BLOCK_SIZE)
+
 static int ghash_update(struct shash_desc *desc, const u8 *src,
 			unsigned int len)
 {
@@ -136,11 +139,16 @@ static int ghash_update(struct shash_desc *desc, const u8 *src,
 		blocks = len / GHASH_BLOCK_SIZE;
 		len %= GHASH_BLOCK_SIZE;

-		ghash_do_update(blocks, ctx->digest, src, key,
-				partial ? ctx->buf : NULL);
+		do {
+			int chunk = min(blocks, MAX_BLOCKS);
+
+			ghash_do_update(chunk, ctx->digest, src, key,
+					partial ? ctx->buf : NULL);

-		src += blocks * GHASH_BLOCK_SIZE;
-		partial = 0;
+			blocks -= chunk;
+			src += chunk * GHASH_BLOCK_SIZE;
+			partial = 0;
+		} while (unlikely(blocks > 0));
 	}
 	if (len)
 		memcpy(ctx->buf + partial, src, len);
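
For reference, the glue-code change boils down to the bounded-chunk pattern
in the standalone C sketch below. It is illustrative only: process_blocks(),
update(), the printf reporting and the buffer size are hypothetical stand-ins
invented for this example, while the real driver additionally carries the
GHASH digest state and the optional head block shown in the diff.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE	16			/* GHASH block size in bytes */
#define MAX_BLOCKS	(65536 / BLOCK_SIZE)	/* cap one pass at 64 KB */

/* Hypothetical stand-in for the per-chunk NEON routine (ghash_do_update). */
static void process_blocks(int blocks, const uint8_t *src)
{
	printf("processing %d blocks (%d bytes)\n", blocks, blocks * BLOCK_SIZE);
}

/* Feed an arbitrarily long buffer to the worker in chunks of at most 64 KB. */
static void update(const uint8_t *src, size_t len)
{
	int blocks = (int)(len / BLOCK_SIZE);	/* remainder handling omitted */

	do {
		int chunk = blocks < MAX_BLOCKS ? blocks : MAX_BLOCKS;

		process_blocks(chunk, src);

		blocks -= chunk;
		src += (size_t)chunk * BLOCK_SIZE;
	} while (blocks > 0);
}

int main(void)
{
	static uint8_t buf[3 * 65536 + 4096];	/* 196 KB of input */

	update(buf, sizeof(buf));		/* chunks of 4096, 4096, 4096, 256 blocks */
	return 0;
}

In the common case (inputs of up to 64 KB) the loop makes exactly one call
into the underlying routine, so the cap only affects callers that pass in
very large buffers.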