From patchwork Thu Oct 6 22:31:45 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13000790
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 1/7] rcu: correct CONFIG_RCU_EXP_CPU_STALL_TIMEOUT descriptions
Date: Thu, 6 Oct 2022 17:31:45 -0500
Message-Id: <20221006223151.22159-2-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Make the descriptions of CONFIG_RCU_EXP_CPU_STALL_TIMEOUT match
the code:
- there is no longer a default of 20 ms for Android since commit 1045a06724f3
  ("remove CONFIG_ANDROID"),
- the code includes a maximum of 21 seconds, evident when specifying 0,
  which means to use the CONFIG_RCU_CPU_STALL_TIMEOUT value
  (60 seconds in the example below).

Example .config:
  CONFIG_RCU_CPU_STALL_TIMEOUT=60
  CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0

leads to:
  /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout:60
  /sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout:21000

Fixes: 1045a06724f3 ("remove CONFIG_ANDROID")
Signed-off-by: Robert Elliott
---
 Documentation/RCU/stallwarn.rst | 9 +++++----
 kernel/rcu/Kconfig.debug        | 2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index e38c587067fc..d86a8b47504f 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -168,10 +168,11 @@ CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
 	Same as the CONFIG_RCU_CPU_STALL_TIMEOUT parameter but only for
 	the expedited grace period. This parameter defines the period
 	of time that RCU will wait from the beginning of an expedited
-	grace period until it issues an RCU CPU stall warning. This time
-	period is normally 20 milliseconds on Android devices. A zero
-	value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be used,
-	after conversion to milliseconds.
+	grace period until it issues an RCU CPU stall warning.
+
+	A zero value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be
+	used, after conversion to milliseconds, limited to a maximum of
+	21 seconds.
 
 	This configuration parameter may be changed at runtime via the
 	/sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout, however
diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 1b0c41d490f0..4477eeb8a54f 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -93,7 +93,7 @@ config RCU_EXP_CPU_STALL_TIMEOUT
 	  If the RCU grace period persists, additional CPU stall warnings
 	  are printed at more widely spaced intervals.
 	  A value of zero says to use the RCU_CPU_STALL_TIMEOUT value converted from
-	  seconds to milliseconds.
+	  seconds to milliseconds, limited to a maximum of 21 seconds.
 
 config RCU_TRACE
 	bool "Enable tracing for RCU"

From patchwork Thu Oct 6 22:31:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13000784
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 2/7] crypto: x86/sha - limit FPU preemption
Date: Thu, 6 Oct 2022 17:31:46 -0500
Message-Id: <20221006223151.22159-3-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so
the CPU core is unavailable for scheduling while the code between
them runs.

This leads to "rcu_preempt detected expedited stalls" with stack dumps
pointing to the optimized hash function if this module is loaded and
used a lot:
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {12-... } 22 jiffies s: 277 root: 0x1/.

For example, that can occur during boot, with the stack trace pointing
to the sha512-x86 function, if the system is set to use SHA-512 for
module signing. The call trace includes:
    module_sig_check
    mod_verify_sig
    pkcs7_verify
    pkcs7_digest
    sha512_finup
    sha512_base_do_update

Fixes: 66be89515888 ("crypto: sha1 - SSSE3 based SHA1 implementation for x86-64")
Fixes: 8275d1aa6422 ("crypto: sha256 - Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: 87de4579f92d ("crypto: sha512 - Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: aa031b8f702e ("crypto: x86/sha512 - load based on CPU features")
Suggested-by: Herbert Xu
Reviewed-by: Tim Chen
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 34 +++++++++++++++++++++++-----
 arch/x86/crypto/sha256_ssse3_glue.c | 35 ++++++++++++++++++++++++-----
 arch/x86/crypto/sha512_ssse3_glue.c | 35 ++++++++++++++++++++++++-----
 3 files changed, 89 insertions(+), 15 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 4430463dee62..033812989476 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -27,10 +27,13 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 static int sha1_update(struct shash_desc *desc, const u8 *data,
 		       unsigned int len, sha1_block_fn *sha1_xform)
 {
 	struct sha1_state *sctx = shash_desc_ctx(desc);
+	unsigned int chunk;
 
 	if (!crypto_simd_usable() ||
 	    (sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE)
@@ -42,9 +45,18 @@ static int sha1_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha1_base_do_update(desc, data, len, sha1_xform);
-	kernel_fpu_end();
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha1_base_do_update(desc, data, chunk, sha1_xform);
+			kernel_fpu_end();
+		}
+
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -52,12 +64,24 @@ static int sha1_update(struct shash_desc *desc, const u8 *data,
 static int sha1_finup(struct shash_desc *desc, const u8 *data,
 		      unsigned int len, u8 *out, sha1_block_fn *sha1_xform)
 {
+	unsigned int chunk;
+
 	if (!crypto_simd_usable())
 		return crypto_sha1_finup(desc, data, len, out);
 
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha1_base_do_update(desc, data, chunk, sha1_xform);
+			kernel_fpu_end();
+		}
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha1_base_do_update(desc, data, len, sha1_xform);
 	sha1_base_do_finalize(desc, sha1_xform);
 	kernel_fpu_end();
 
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index e437fba0299b..99a25c238f40 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -40,6 +40,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
 				       const u8 *data, int blocks);
 
@@ -47,6 +49,7 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
 			  unsigned int len, sha256_block_fn *sha256_xform)
 {
 	struct sha256_state *sctx = shash_desc_ctx(desc);
+	unsigned int chunk;
 
 	if (!crypto_simd_usable() ||
 	    (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
@@ -58,9 +61,18 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha256_base_do_update(desc, data, len, sha256_xform);
-	kernel_fpu_end();
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha256_base_do_update(desc, data, chunk, sha256_xform);
+			kernel_fpu_end();
+		}
+
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -68,12 +80,25 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
 static int sha256_finup(struct shash_desc *desc, const u8 *data,
 	      unsigned int len, u8 *out, sha256_block_fn *sha256_xform)
 {
+	unsigned int chunk;
+
 	if (!crypto_simd_usable())
 		return crypto_sha256_finup(desc, data, len, out);
 
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha256_base_do_update(desc, data, chunk, sha256_xform);
+			kernel_fpu_end();
+		}
+
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha256_base_do_update(desc, data, len, sha256_xform);
 	sha256_base_do_finalize(desc, sha256_xform);
 	kernel_fpu_end();
 
diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index 3c19f803f288..72eee03448dc 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -39,6 +39,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sha512_transform_ssse3(struct sha512_state *state,
 				       const u8 *data, int blocks);
 
@@ -46,6 +48,7 @@ static int sha512_update(struct shash_desc *desc, const u8 *data,
 		       unsigned int len, sha512_block_fn *sha512_xform)
 {
 	struct sha512_state *sctx = shash_desc_ctx(desc);
+	unsigned int chunk;
 
 	if (!crypto_simd_usable() ||
 	    (sctx->count[0] % SHA512_BLOCK_SIZE) + len < SHA512_BLOCK_SIZE)
@@ -57,9 +60,18 @@ static int sha512_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha512_base_do_update(desc, data, len, sha512_xform);
-	kernel_fpu_end();
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha512_base_do_update(desc, data, chunk, sha512_xform);
+			kernel_fpu_end();
+		}
+
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -67,12 +79,25 @@ static int sha512_update(struct shash_desc *desc, const u8 *data,
 static int sha512_finup(struct shash_desc *desc, const u8 *data,
 	      unsigned int len, u8 *out, sha512_block_fn *sha512_xform)
 {
+	unsigned int chunk;
+
 	if (!crypto_simd_usable())
 		return crypto_sha512_finup(desc, data, len, out);
 
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha512_base_do_update(desc, data, chunk, sha512_xform);
+			kernel_fpu_end();
+		}
+
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha512_base_do_update(desc, data, len, sha512_xform);
 	sha512_base_do_finalize(desc, sha512_xform);
 	kernel_fpu_end();
 

From patchwork Thu Oct 6 22:31:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13000791
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 3/7] crypto: x86/crc - limit FPU preemption
Date: Thu, 6 Oct 2022 17:31:47 -0500
Message-Id: <20221006223151.22159-4-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so
the CPU core is unavailable for scheduling while the code between
them runs, leading to:
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {12-... } 22 jiffies s: 277 root: 0x1/.
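
The chunking described above relies on a CRC being a running state, so
feeding the data in pieces gives the same result as one call. A minimal
user-space sketch of that property follows. It is not the kernel code:
fake_fpu_begin()/fake_fpu_end() are hypothetical stand-ins for
kernel_fpu_begin()/kernel_fpu_end(), and crc32_raw() is a plain bitwise
CRC-32 assumed here in place of the PCLMULQDQ routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FPU_BYTES 4096U /* same per-section byte limit as the patch */

/* Hypothetical stand-ins for kernel_fpu_begin()/kernel_fpu_end(). */
static unsigned int fpu_sections;
static void fake_fpu_begin(void) { fpu_sections++; }
static void fake_fpu_end(void) { }

/*
 * Bitwise CRC-32 (reflected polynomial 0xEDB88320) standing in for the
 * accelerated routine. It takes and returns the raw running state, with
 * no pre- or post-inversion, so that calls can be chained.
 */
static uint32_t crc32_raw(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return crc;
}

/* Same loop shape as the patch: at most FPU_BYTES per begin/end pair. */
static uint32_t crc32_chunked(uint32_t crc, const uint8_t *p, size_t len)
{
	size_t chunk;

	assert(p || !len);

	do {
		chunk = len < FPU_BYTES ? len : FPU_BYTES;
		len -= chunk;

		fake_fpu_begin();
		crc = crc32_raw(crc, p, chunk);
		fake_fpu_end();

		p += chunk;
	} while (len);

	return crc;
}
```

With a 10000-byte buffer, crc32_chunked() enters three sections (4096,
4096 and 1808 bytes) and returns the same value as a single
crc32_raw() call over the whole buffer.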

Fixes: 78c37d191dd6 ("crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation")
Fixes: 6a8ce1ef3940 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction")
Fixes: 0b95a7f85718 ("crypto: crct10dif - Glue code to cast accelerated CRCT10DIF assembly as a crypto transform")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/crc32-pclmul_glue.c     | 18 ++++++++++----
 arch/x86/crypto/crc32c-intel_glue.c     | 32 ++++++++++++++++++++-----
 arch/x86/crypto/crct10dif-pclmul_glue.c | 32 ++++++++++++++++++++-----
 3 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 288200fe7b4e..7cf65dc726c4 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -49,6 +49,8 @@
 #define SCALE_F	16L	/* size of xmm register */
 #define SCALE_F_MASK	(SCALE_F - 1)
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 u32 crc32_pclmul_le_16(unsigned char const *buffer, size_t len, u32 crc32);
 
 static u32 __attribute__((pure))
@@ -57,6 +59,7 @@ static u32 __attribute__((pure))
 	unsigned int iquotient;
 	unsigned int iremainder;
 	unsigned int prealign;
+	unsigned int chunk;
 
 	if (len < PCLMUL_MIN_LEN + SCALE_F_MASK || !crypto_simd_usable())
 		return crc32_le(crc, p, len);
@@ -73,12 +76,19 @@ static u32 __attribute__((pure))
 	iquotient = len & (~SCALE_F_MASK);
 	iremainder = len & SCALE_F_MASK;
 
-	kernel_fpu_begin();
-	crc = crc32_pclmul_le_16(p, iquotient, crc);
-	kernel_fpu_end();
+	do {
+		chunk = min(iquotient, FPU_BYTES);
+		iquotient -= chunk;
+
+		kernel_fpu_begin();
+		crc = crc32_pclmul_le_16(p, chunk, crc);
+		kernel_fpu_end();
+
+		p += chunk;
+	} while (iquotient);
 
 	if (iremainder)
-		crc = crc32_le(crc, p + iquotient, iremainder);
+		crc = crc32_le(crc, p, iremainder);
 
 	return crc;
 }
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index c5c965b694c6..b277c215f0fb 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -44,6 +44,8 @@
  */
 #define CRC32C_PCL_BREAKEVEN	512
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage unsigned int crc_pcl(const u8 *buffer, int len,
 				unsigned int crc_init);
 #endif /* CONFIG_X86_64 */
@@ -155,15 +157,23 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
 			       unsigned int len)
 {
 	u32 *crcp = shash_desc_ctx(desc);
+	unsigned int chunk;
 
 	/*
 	 * use faster PCL version if datasize is large enough to
 	 * overcome kernel fpu state save/restore overhead
 	 */
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*crcp = crc_pcl(data, len, *crcp);
-		kernel_fpu_end();
+		do {
+			chunk = min(len, FPU_BYTES);
+			len -= chunk;
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			data += chunk;
+		} while (len);
 	} else
 		*crcp = crc32c_intel_le_hw(*crcp, data, len);
 	return 0;
@@ -172,10 +182,20 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
 static int __crc32c_pcl_intel_finup(u32 *crcp, const u8 *data, unsigned int len,
 				u8 *out)
 {
+	unsigned int chunk;
+
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__le32 *)out = ~cpu_to_le32(crc_pcl(data, len, *crcp));
-		kernel_fpu_end();
+		do {
+			chunk = min(len, FPU_BYTES);
+			len -= chunk;
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			data += chunk;
+		} while (len);
+		*(__le32 *)out = ~cpu_to_le32(*crcp);
 	} else
 		*(__le32 *)out =
 			~cpu_to_le32(crc32c_intel_le_hw(*crcp, data, len));
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 7c5a32282d51..bcd362df6b62 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -36,6 +36,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage u16 crc_t10dif_pcl(u16 init_crc, const u8 *buf, size_t len);
 
 struct chksum_desc_ctx {
@@ -55,11 +57,19 @@ static int chksum_update(struct shash_desc *desc, const u8 *data,
 			 unsigned int length)
 {
 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
+	unsigned int chunk;
 
 	if (length >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		ctx->crc = crc_t10dif_pcl(ctx->crc, data, length);
-		kernel_fpu_end();
+		do {
+			chunk = min(length, FPU_BYTES);
+			length -= chunk;
+
+			kernel_fpu_begin();
+			ctx->crc = crc_t10dif_pcl(ctx->crc, data, chunk);
+			kernel_fpu_end();
+
+			data += chunk;
+		} while (length);
 	} else
 		ctx->crc = crc_t10dif_generic(ctx->crc, data, length);
 	return 0;
@@ -75,10 +85,20 @@ static int chksum_final(struct shash_desc *desc, u8 *out)
 
 static int __chksum_finup(__u16 crc, const u8 *data, unsigned int len, u8 *out)
 {
+	unsigned int chunk;
+
 	if (len >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__u16 *)out = crc_t10dif_pcl(crc, data, len);
-		kernel_fpu_end();
+		do {
+			chunk = min(len, FPU_BYTES);
+			len -= chunk;
+
+			kernel_fpu_begin();
+			crc = crc_t10dif_pcl(crc, data, chunk);
+			kernel_fpu_end();
+
+			data += chunk;
+		} while (len);
+		*(__u16 *)out = crc;
 	} else
 		*(__u16 *)out = crc_t10dif_generic(crc, data, len);
 	return 0;

From patchwork Thu Oct 6 22:31:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13000785
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 4/7] crypto: x86/sm3 - limit FPU preemption
Date: Thu, 6 Oct 2022 17:31:48 -0500
Message-Id: <20221006223151.22159-5-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so
the CPU core is unavailable for scheduling while the code between
them runs, causing:
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {12-... } 22 jiffies s: 277 root: 0x1/.
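
For a hash update path, the byte limit described above bounds both the
size of each preemption-disabled section and the number of sections.
The user-space sketch below illustrates this under stated assumptions:
update_fn, struct fpu_stats and sum_update() are hypothetical names
introduced only for illustration, and the comments mark where
kernel_fpu_begin()/kernel_fpu_end() would sit. The "if (chunk)" guard
matters on a finup-style path, where the length may be zero.

```c
#include <assert.h>
#include <stddef.h>

#define FPU_BYTES 4096U /* same per-section byte limit as the patch */

/* Hypothetical bookkeeping, used only to observe the loop's behaviour. */
struct fpu_stats {
	unsigned int sections;	/* number of begin/end pairs entered */
	size_t max_bytes;	/* largest chunk handled in one section */
};

/* Hypothetical callback standing in for a *_base_do_update() helper. */
typedef void (*update_fn)(void *ctx, const unsigned char *data, size_t len);

static void update_chunked(void *ctx, const unsigned char *data, size_t len,
			   update_fn update, struct fpu_stats *st)
{
	size_t chunk;

	assert(update && st);

	do {
		chunk = len < FPU_BYTES ? len : FPU_BYTES;
		len -= chunk;

		if (chunk) {
			/* kernel_fpu_begin() would go here */
			st->sections++;
			if (chunk > st->max_bytes)
				st->max_bytes = chunk;
			update(ctx, data, chunk);
			/* kernel_fpu_end() would go here */
		}

		data += chunk;
	} while (len);
}

/* Toy "hash": a running byte sum, so chunked and one-shot results match. */
static void sum_update(void *ctx, const unsigned char *data, size_t len)
{
	unsigned long *sum = ctx;

	while (len--)
		*sum += *data++;
}
```

For a 9000-byte input this enters three sections of at most 4096 bytes
each; for a zero-length input it enters none, which is why the guard is
kept before the finalize step.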

Fixes: 930ab34d906d ("crypto: x86/sm3 - add AVX assembly implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sm3_avx_glue.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index bfd11da956a4..adb9e55e2a16 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -18,6 +18,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sm3_transform_avx(struct sm3_state *state,
 			const u8 *data, int nblocks);
 
@@ -25,6 +27,7 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 			 unsigned int len)
 {
 	struct sm3_state *sctx = shash_desc_ctx(desc);
+	unsigned int chunk;
 
 	if (!crypto_simd_usable() ||
 			(sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
@@ -38,9 +41,14 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
 
-	kernel_fpu_begin();
-	sm3_base_do_update(desc, data, len, sm3_transform_avx);
-	kernel_fpu_end();
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+		kernel_fpu_begin();
+		sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+		kernel_fpu_end();
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -48,6 +56,8 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 		      unsigned int len, u8 *out)
 {
+	unsigned int chunk;
+
 	if (!crypto_simd_usable()) {
 		struct sm3_state *sctx = shash_desc_ctx(desc);
 
@@ -58,9 +68,17 @@ static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 		return 0;
 	}
 
+	do {
+		chunk = min(len, FPU_BYTES);
+		len -= chunk;
+		if (chunk) {
+			kernel_fpu_begin();
+			sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+			kernel_fpu_end();
+		}
+		data += chunk;
+	} while (len);
 	kernel_fpu_begin();
-	if (len)
-		sm3_base_do_update(desc, data, len, sm3_transform_avx);
 	sm3_base_do_finalize(desc, sm3_transform_avx);
 	kernel_fpu_end();
 

From patchwork Thu Oct 6 22:31:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13000786
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 5/7] crypto: x86/ghash - restructure FPU context saving
Date: Thu, 6 Oct 2022 17:31:49 -0500
Message-Id: <20221006223151.22159-6-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Wrap each of the calls to clmul_hash_update and
clmul_ghash__mul in its own set of kernel_fpu_begin and kernel_fpu_end calls, preparing to limit the amount of data processed by each _update call to avoid RCU stalls. This is more like how polyval-clmulni_glue is structured. Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation") Suggested-by: Herbert Xu Signed-off-by: Robert Elliott --- arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index 3a96c167d78d..b25730c5b267 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -82,7 +82,6 @@ static int ghash_update(struct shash_desc *desc, struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm); u8 *dst = dctx->buffer; - kernel_fpu_begin(); if (dctx->bytes) { int n = min(srclen, dctx->bytes); u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes); @@ -93,10 +92,14 @@ static int ghash_update(struct shash_desc *desc, while (n--) *pos++ ^= *src++; - if (!dctx->bytes) + if (!dctx->bytes) { + kernel_fpu_begin(); clmul_ghash_mul(dst, &ctx->shash); + kernel_fpu_end(); + } } + kernel_fpu_begin(); clmul_ghash_update(dst, src, srclen, &ctx->shash); kernel_fpu_end(); From patchwork Thu Oct 6 22:31:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 13000787 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0711C433F5 for ; Thu, 6 Oct 2022 22:32:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232193AbiJFWcn (ORCPT ); Thu, 6 Oct 2022 18:32:43 -0400 Received: from 
lindbergh.monkeyblade.net ([23.128.96.19]:46830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232064AbiJFWcc (ORCPT ); Thu, 6 Oct 2022 18:32:32 -0400 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 251FFF1937; Thu, 6 Oct 2022 15:32:32 -0700 (PDT) Received: from pps.filterd (m0150242.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 296HWkvY000808; Thu, 6 Oct 2022 22:32:29 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=t163V27LVije1BDnig0VkhCY7/r4gwdWvV8VjbL84sE=; b=ORfCnLE5VzKy/OqP5HWtp+gHNQEl1WAdJR2ymgtBN9G0y4EA4rhp51XrhDbWQsn1j9q5 cOXi9wnbGY5r5DElGxknqnZ26hO6X4fvt6FqKZbPnBpJvT3ifGqW8mr17e5UuHhawPa4 ecutpGh5oXqPYZzZua4NIHnEpnpqzzb87FJqCVEcbQCWPgdt7S9hej04k+JL5vZicPs5 ge1l9ffS/F2S1jxUEZQzaNKQKUzRoLPTztJIb/xhiCELk/tVGgwmJRxF5/cw4dscw04n dqQuAmOymQqyCbGePeegx1jE3aqsBq301qI4jSr4ml5bbm/tQR767xp5zijJDsn/jNNW ng== Received: from p1lg14880.it.hpe.com (p1lg14880.it.hpe.com [16.230.97.201]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3k23gtudem-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Oct 2022 22:32:29 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14880.it.hpe.com (Postfix) with ESMTPS id D3BB0806B41; Thu, 6 Oct 2022 22:32:28 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 80EBD8038CA; Thu, 6 Oct 2022 22:32:28 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, 
davem@davemloft.net, tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [RFC PATCH 6/7] crypto: x86/ghash - limit FPU preemption Date: Thu, 6 Oct 2022 17:31:50 -0500 Message-Id: <20221006223151.22159-7-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221006223151.22159-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: oUq8sLBHiCTDU3XfxziFUd5z6bCvmE37 X-Proofpoint-ORIG-GUID: oUq8sLBHiCTDU3XfxziFUd5z6bCvmE37 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-10-06_05,2022-10-06_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 mlxlogscore=999 suspectscore=0 clxscore=1015 lowpriorityscore=0 spamscore=0 adultscore=0 impostorscore=0 phishscore=0 priorityscore=1501 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210060133 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org As done by the ECB and CBC helpers in arch/x86/crypt/ecb_cbc_helpers.h, limit the number of bytes processed between kernel_fpu_begin() and kernel_fpu_end() calls. Those functions call preempt_disable() and preempt_enable(), so the CPU core is unavailable for scheduling while running, leading to: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {12-... } 22 jiffies s: 277 root: 0x1/. 
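
The chunking idea in this description can be tried outside the kernel. What follows is a minimal user-space sketch, not code from this series: fake_fpu_begin(), fake_fpu_end(), fake_simd_update() and chunked_update() are hypothetical stand-ins, and only the 4096-byte FPU_BYTES limit is taken from the patch. It shows a buffer being processed in sections of at most FPU_BYTES bytes, with the simulated FPU section closed after every chunk so that a scheduler would get a chance to run in between.

```c
#include <assert.h>

#define FPU_BYTES 4096U	/* same per-section limit the patch uses */

/* Hypothetical stand-ins for kernel_fpu_begin()/kernel_fpu_end(). */
static unsigned int fpu_sections;	/* number of sections opened so far */
static unsigned int fpu_depth;		/* non-zero while a section is open */

static void fake_fpu_begin(void)
{
	fpu_depth++;
	fpu_sections++;
}

static void fake_fpu_end(void)
{
	fpu_depth--;
}

/* Hypothetical stand-in for a SIMD routine: a byte-wise XOR checksum. */
static unsigned char fake_simd_update(unsigned char acc,
				      const unsigned char *src,
				      unsigned int len)
{
	while (len--)
		acc ^= *src++;
	return acc;
}

/* Process len bytes, never more than FPU_BYTES inside one section. */
static unsigned char chunked_update(const unsigned char *src, unsigned int len)
{
	unsigned char acc = 0;

	while (len) {
		unsigned int chunk = len < FPU_BYTES ? len : FPU_BYTES;

		fake_fpu_begin();
		acc = fake_simd_update(acc, src, chunk);
		fake_fpu_end();	/* in the kernel, preemption is possible again here */

		src += chunk;
		len -= chunk;
	}

	assert(fpu_depth == 0);	/* every section that was opened is closed */
	return acc;
}
```

With a 10000-byte buffer, the loop opens three sections (4096 + 4096 + 1808 bytes) and returns the same checksum as a single pass over the whole buffer.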

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 27 ++++++++++++++++------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index b25730c5b267..b9a3e2187f5b 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -25,6 +25,8 @@
 #define GHASH_BLOCK_SIZE	16
 #define GHASH_DIGEST_SIZE	16
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 void clmul_ghash_mul(char *dst, const u128 *shash);
 
 void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
@@ -81,10 +83,11 @@ static int ghash_update(struct shash_desc *desc,
 	struct ghash_desc_ctx *dctx = shash_desc_ctx(desc);
 	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
 	u8 *dst = dctx->buffer;
+	unsigned int fpulen;
 
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
-		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
+		u8 *pos = dst + GHASH_BLOCK_SIZE - dctx->bytes;
 
 		dctx->bytes -= n;
 		srclen -= n;
@@ -99,13 +102,23 @@ static int ghash_update(struct shash_desc *desc,
 		}
 	}
 
-	kernel_fpu_begin();
-	clmul_ghash_update(dst, src, srclen, &ctx->shash);
-	kernel_fpu_end();
+	while (srclen >= GHASH_BLOCK_SIZE) {
+		fpulen = min(srclen, FPU_BYTES);
+
+		kernel_fpu_begin();
+		while (fpulen >= GHASH_BLOCK_SIZE) {
+			int n = min_t(unsigned int, fpulen, GHASH_BLOCK_SIZE);
+
+			clmul_ghash_update(dst, src, n, &ctx->shash);
+
+			srclen -= n;
+			fpulen -= n;
+			src += n;
+		}
+		kernel_fpu_end();
+	}
 
-	if (srclen & 0xf) {
-		src += srclen - (srclen & 0xf);
-		srclen &= 0xf;
+	if (srclen) {
 		dctx->bytes = GHASH_BLOCK_SIZE - srclen;
 		while (srclen--)
 			*dst++ ^= *src++;

From patchwork Thu Oct 6 22:31:51 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13000788
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [RFC PATCH 7/7] crypto: x86 - use common macro for FPU limit
Date: Thu, 6 Oct 2022 17:31:51 -0500
Message-Id: <20221006223151.22159-8-elliott@hpe.com>
In-Reply-To: <20221006223151.22159-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>

Use a common macro name (FPU_BYTES) for the limit of the number of bytes
processed within kernel_fpu_begin and kernel_fpu_end rather than
using SZ_4K (which is a signed value), or a magic value of 4096U.

Use unsigned int rather than size_t for some of the arguments to
avoid typecasting for the min() macro.
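
The type issue behind this change can be illustrated in user space. The sketch below is a simplification under stated assumptions and is not the kernel's actual min() implementation: strict_min() and cast_min() are hypothetical macros that rely on GCC/Clang extensions (statement expressions, __typeof__, __builtin_types_compatible_p), and SIGNED_4K merely stands in for a plain int constant such as SZ_4K. It shows why an unsigned limit can be compared against an unsigned int length directly, while a signed constant needs a min_t()-style cast.

```c
#include <assert.h>

#define FPU_BYTES 4096U	/* unsigned, so it pairs with unsigned int lengths */
#define SIGNED_4K 0x1000	/* plain int constant, standing in for SZ_4K */

/*
 * Hypothetical, simplified type-strict min(): compilation fails when the
 * two arguments do not have compatible types.
 */
#define strict_min(a, b) ({						\
	_Static_assert(__builtin_types_compatible_p(__typeof__(a),	\
						    __typeof__(b)),	\
		       "strict_min(): argument types differ");		\
	__typeof__(a) _a = (a);						\
	__typeof__(b) _b = (b);						\
	_a < _b ? _a : _b; })

/* Hypothetical counterpart of min_t(): cast both sides to one type first. */
#define cast_min(type, a, b) strict_min((type)(a), (type)(b))

static unsigned int clamp_with_unsigned_limit(unsigned int len)
{
	/* Both operands are unsigned int, so no cast is needed. */
	return strict_min(len, FPU_BYTES);
}

static unsigned int clamp_with_signed_limit(unsigned int len)
{
	/* strict_min(len, SIGNED_4K) would not compile: unsigned int vs int. */
	return cast_min(unsigned int, len, SIGNED_4K);
}
```

Both helpers clamp a length to 4096, but only the second needs the explicit type argument, which is the cast this patch removes from the callers.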

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/blake2s-glue.c         |  7 ++++---
 arch/x86/crypto/chacha_glue.c          |  4 +++-
 arch/x86/crypto/nhpoly1305-avx2-glue.c |  3 ++-
 arch/x86/crypto/nhpoly1305-sse2-glue.c |  4 +++-
 arch/x86/crypto/poly1305_glue.c        | 25 ++++++++++++++-----------
 arch/x86/crypto/polyval-clmulni_glue.c |  5 +++--
 6 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index a88522e4d0f8..02b72d29dc9b 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -18,6 +18,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state,
 				       const u8 *block, const size_t nblocks,
 				       const u32 inc);
@@ -31,8 +33,7 @@ static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512);
 void blake2s_compress(struct blake2s_state *state, const u8 *block,
 		      size_t nblocks, const u32 inc)
 {
-	/* SIMD disables preemption, so relax after processing each page. */
-	BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
+	BUILD_BUG_ON(FPU_BYTES / BLAKE2S_BLOCK_SIZE < 8);
 
 	if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
 		blake2s_compress_generic(state, block, nblocks, inc);
@@ -41,7 +42,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
 
 	do {
 		const size_t blocks = min_t(size_t, nblocks,
-					    SZ_4K / BLAKE2S_BLOCK_SIZE);
+					    FPU_BYTES / BLAKE2S_BLOCK_SIZE);
 
 		kernel_fpu_begin();
 		if (IS_ENABLED(CONFIG_AS_AVX512) &&
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index feb53e90f0e3..40ddd0ce50d6 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -18,6 +18,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
 				       unsigned int len, int nrounds);
 asmlinkage void chacha_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
@@ -150,7 +152,7 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
 		return chacha_crypt_generic(state, dst, src, bytes, nrounds);
 
 	do {
-		unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+		unsigned int todo = min(bytes, FPU_BYTES);
 
 		kernel_fpu_begin();
 		chacha_dosimd(state, dst, src, todo, nrounds);
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 68cf24213e1c..7e65ccd86f75 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -16,6 +16,7 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
 asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
 			u8 hash[NH_HASH_BYTES]);
 
@@ -34,7 +35,7 @@ static int nhpoly1305_avx2_update(struct shash_desc *desc,
 		return crypto_nhpoly1305_update(desc, src, srclen);
 
 	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+		unsigned int n = min(srclen, FPU_BYTES);
 
 		kernel_fpu_begin();
 		crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2);
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index 75c324253b37..4f35b52e21f0 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -16,6 +16,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
 			u8 hash[NH_HASH_BYTES]);
 
@@ -33,7 +35,7 @@ static int nhpoly1305_sse2_update(struct shash_desc *desc,
 		return crypto_nhpoly1305_update(desc, src, srclen);
 
 	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+		unsigned int n = min(srclen, FPU_BYTES);
 
 		kernel_fpu_begin();
 		crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2);
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 59d0b01b3389..c036315dbd39 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -18,20 +18,24 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void poly1305_init_x86_64(void *ctx,
 				     const u8 key[POLY1305_BLOCK_SIZE]);
 asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp,
-				       const size_t len, const u32 padbit);
+				       const unsigned int len,
+				       const u32 padbit);
 asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
 				     const u32 nonce[4]);
 asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
 				  const u32 nonce[4]);
-asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len,
-				    const u32 padbit);
-asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len,
-				     const u32 padbit);
+asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp,
+				    const unsigned int len, const u32 padbit);
+asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp,
+				     const unsigned int len, const u32 padbit);
 asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp,
-				       const size_t len, const u32 padbit);
+				       const unsigned int len,
+				       const u32 padbit);
 
 static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx);
 static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2);
@@ -89,14 +93,13 @@ static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_BLOCK_SIZE])
 	poly1305_init_x86_64(ctx, key);
 }
 
-static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,
+static void poly1305_simd_blocks(void *ctx, const u8 *inp, unsigned int len,
 				 const u32 padbit)
 {
 	struct poly1305_arch_internal *state = ctx;
 
-	/* SIMD disables preemption, so relax after processing each page. */
-	BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
-		     SZ_4K % POLY1305_BLOCK_SIZE);
+	BUILD_BUG_ON(FPU_BYTES < POLY1305_BLOCK_SIZE ||
+		     FPU_BYTES % POLY1305_BLOCK_SIZE);
 
 	if (!static_branch_likely(&poly1305_use_avx) ||
 	    (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) ||
@@ -107,7 +110,7 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,
 	}
 
 	do {
-		const size_t bytes = min_t(size_t, len, SZ_4K);
+		const unsigned int bytes = min(len, FPU_BYTES);
 
 		kernel_fpu_begin();
 		if (IS_ENABLED(CONFIG_AS_AVX512) && static_branch_likely(&poly1305_use_avx512))
diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c
index b7664d018851..2502964afef6 100644
--- a/arch/x86/crypto/polyval-clmulni_glue.c
+++ b/arch/x86/crypto/polyval-clmulni_glue.c
@@ -29,6 +29,8 @@
 
 #define NUM_KEY_POWERS	8
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 struct polyval_tfm_ctx {
 	/*
 	 * These powers must be in the order h^8, ..., h^1.
@@ -123,8 +125,7 @@ static int polyval_x86_update(struct shash_desc *desc,
 	}
 
 	while (srclen >= POLYVAL_BLOCK_SIZE) {
-		/* Allow rescheduling every 4K bytes. */
-		nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
+		nblocks = min(srclen, FPU_BYTES) / POLYVAL_BLOCK_SIZE;
 		internal_polyval_update(tctx, src, nblocks, dctx->buffer);
 		srclen -= nblocks * POLYVAL_BLOCK_SIZE;
 		src += nblocks * POLYVAL_BLOCK_SIZE;