From patchwork Wed Oct 12 21:59:13 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005469
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 01/19] crypto: tcrypt - test crc32
Date: Wed, 12 Oct 2022 16:59:13 -0500
Message-Id: <20221012215931.3896-2-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Add self-test and speed tests for crc32, paralleling those offered
for crc32c and crct10dif.

Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 crypto/tcrypt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index a82679b576bb..4426386dfb42 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1711,6 +1711,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("gcm(aria)");
 		break;
 
+	case 59:
+		ret += tcrypt_test("crc32");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
@@ -2317,6 +2321,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 			       generic_hash_speed_template);
 		if (mode > 300 && mode < 400) break;
 		fallthrough;
+	case 329:
+		test_hash_speed("crc32", sec, generic_hash_speed_template);
+		if (mode > 300 && mode < 400) break;
+		fallthrough;
 	case 399:
 		break;
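For reference, a tcrypt mode is exercised by loading the module with
the matching mode number, e.g. "modprobe tcrypt mode=59" for the new
crc32 self-test, or "modprobe tcrypt mode=329 sec=1" for the speed
test timed in seconds rather than cycles. tcrypt intentionally returns
an error from module init after the tests complete, so the modprobe
failure itself is expected; the results appear in the kernel log.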
From patchwork Wed Oct 12 21:59:14 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005465
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 02/19] crypto: tcrypt - test nhpoly1305
Date: Wed, 12 Oct 2022 16:59:14 -0500
Message-Id: <20221012215931.3896-3-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Add a self-test mode for nhpoly1305.

Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 crypto/tcrypt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 4426386dfb42..7a6a56751043 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1715,6 +1715,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("crc32");
 		break;
 
+	case 60:
+		ret += tcrypt_test("nhpoly1305");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
From patchwork Wed Oct 12 21:59:15 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005466
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 03/19] crypto: tcrypt - reschedule during cycles speed tests
Date: Wed, 12 Oct 2022 16:59:15 -0500
Message-Id: <20221012215931.3896-4-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

commit 2af632996b89 ("crypto: tcrypt - reschedule during speed tests")
added cond_resched() calls to "Avoid RCU stalls in the case of
non-preemptible kernel and lengthy speed tests by rescheduling when
advancing from one block size to another." However, those calls are
only made when the sec module parameter is used (i.e., when the speed
test runs for a set number of seconds), not in the default "cycles"
mode.

Expand them to also run in "cycles" mode to reduce the rate of RCU
stall warnings:
  rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:

Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 crypto/tcrypt.c | 44 ++++++++++++++++++--------------------------
 1 file changed, 18 insertions(+), 26 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 7a6a56751043..c025ba26b663 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -408,14 +408,13 @@ static void test_mb_aead_speed(const char *algo, int enc, int secs,
 			}
 
-			if (secs) {
+			if (secs)
 				ret = test_mb_aead_jiffies(data, enc, bs,
 							   secs, num_mb);
-				cond_resched();
-			} else {
+			else
 				ret = test_mb_aead_cycles(data, enc, bs,
 							  num_mb);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed return code=%d\n", e, ret);
@@ -661,13 +660,11 @@ static void test_aead_speed(const char *algo, int enc, unsigned int secs,
 					       bs + (enc ? 0 : authsize),
 					       iv);
 
-			if (secs) {
-				ret = test_aead_jiffies(req, enc, bs,
-							secs);
-				cond_resched();
-			} else {
+			if (secs)
+				ret = test_aead_jiffies(req, enc, bs, secs);
+			else
 				ret = test_aead_cycles(req, enc, bs);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed return code=%d\n", e, ret);
@@ -917,14 +914,13 @@ static void test_ahash_speed_common(const char *algo, unsigned int secs,
 
 		ahash_request_set_crypt(req, sg, output, speed[i].plen);
 
-		if (secs) {
+		if (secs)
 			ret = test_ahash_jiffies(req, speed[i].blen,
 						 speed[i].plen, output, secs);
-			cond_resched();
-		} else {
+		else
 			ret = test_ahash_cycles(req, speed[i].blen,
 						speed[i].plen, output);
-		}
+		cond_resched();
 
 		if (ret) {
 			pr_err("hashing failed ret=%d\n", ret);
@@ -1184,15 +1180,14 @@ static void test_mb_skcipher_speed(const char *algo, int enc, int secs,
 						       cur->sg, bs, iv);
 			}
 
-			if (secs) {
+			if (secs)
 				ret = test_mb_acipher_jiffies(data, enc,
 							      bs, secs,
 							      num_mb);
-				cond_resched();
-			} else {
+			else
 				ret = test_mb_acipher_cycles(data, enc,
 							     bs, num_mb);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed flags=%x\n", e,
@@ -1401,14 +1396,11 @@ static void test_skcipher_speed(const char *algo, int enc, unsigned int secs,
 
 			skcipher_request_set_crypt(req, sg, sg, bs, iv);
 
-			if (secs) {
-				ret = test_acipher_jiffies(req, enc,
-							   bs, secs);
-				cond_resched();
-			} else {
-				ret = test_acipher_cycles(req, enc,
-							  bs);
-			}
+			if (secs)
+				ret = test_acipher_jiffies(req, enc, bs, secs);
+			else
+				ret = test_acipher_cycles(req, enc, bs);
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed flags=%x\n", e,
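The underlying mechanism deserves a note: in a non-preemptible kernel,
a CPU stuck in a long measurement loop never reaches a scheduling
point, so RCU eventually reports a stall. A minimal sketch of the idea,
where run_one_test() and block_size[] are hypothetical stand-ins for
the tcrypt helpers:

	/* Hypothetical long-running measurement loop. Without the
	 * cond_resched(), a non-preemptible kernel cannot schedule on
	 * this CPU until the loop finishes, which is what triggers the
	 * "rcu_preempt detected expedited stalls" warnings.
	 */
	for (i = 0; i < num_block_sizes; i++) {
		run_one_test(block_size[i]);	/* hypothetical helper */
		cond_resched();			/* safe point to reschedule */
	}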
From patchwork Wed Oct 12 21:59:16 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005483
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 04/19] crypto: x86/sha - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:16 -0500
Message-Id: <20221012215931.3896-5-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls. Those functions call preempt_disable() and
preempt_enable(), so the CPU core is unavailable for scheduling while
running. This leads to "rcu_preempt detected expedited stalls" with
stack dumps pointing to the optimized hash function if the module is
loaded and used heavily:
  rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

For example, that can occur during boot with the stack trace pointing
to the sha512-x86 function if the system is set to use SHA-512 for
module signing. The call trace includes:
  module_sig_check
  mod_verify_sig
  pkcs7_verify
  pkcs7_digest
  sha512_finup
  sha512_base_do_update

Fixes: 66be89515888 ("crypto: sha1 - SSSE3 based SHA1 implementation for x86-64")
Fixes: 8275d1aa6422 ("crypto: sha256 - Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: 87de4579f92d ("crypto: sha512 - Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: aa031b8f702e ("crypto: x86/sha512 - load based on CPU features")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 32 ++++++++++++++++++++++++-----
 arch/x86/crypto/sha256_ssse3_glue.c | 32 ++++++++++++++++++++++++-----
 arch/x86/crypto/sha512_ssse3_glue.c | 32 ++++++++++++++++++++++++-----
 3 files changed, 81 insertions(+), 15 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 44340a1139e0..a9f5779b41ca 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -26,6 +26,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 static int sha1_update(struct shash_desc *desc, const u8 *data,
 		       unsigned int len, sha1_block_fn *sha1_xform)
 {
@@ -41,9 +43,18 @@ static int sha1_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha1_base_do_update(desc, data, len, sha1_xform);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha1_base_do_update(desc, data, chunk, sha1_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -54,9 +65,20 @@ static int sha1_finup(struct shash_desc *desc, const u8 *data,
 	if (!crypto_simd_usable())
 		return crypto_sha1_finup(desc, data, len, out);
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha1_base_do_update(desc, data, chunk, sha1_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha1_base_do_update(desc, data, len, sha1_xform);
 	sha1_base_do_finalize(desc, sha1_xform);
 	kernel_fpu_end();

diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 3a5f6be7dbba..322c8aa907af 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -40,6 +40,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
 				       const u8 *data, int blocks);
 
@@ -58,9 +60,18 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha256_base_do_update(desc, data, len, sha256_xform);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha256_base_do_update(desc, data, chunk, sha256_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -71,9 +82,20 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data,
 	if (!crypto_simd_usable())
 		return crypto_sha256_finup(desc, data, len, out);
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha256_base_do_update(desc, data, chunk, sha256_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha256_base_do_update(desc, data, len, sha256_xform);
 	sha256_base_do_finalize(desc, sha256_xform);
 	kernel_fpu_end();

diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index 6d3b85e53d0e..fd5075a32613 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -39,6 +39,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sha512_transform_ssse3(struct sha512_state *state,
 				       const u8 *data, int blocks);
 
@@ -57,9 +59,18 @@ static int sha512_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha512_base_do_update(desc, data, len, sha512_xform);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha512_base_do_update(desc, data, chunk, sha512_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -70,9 +81,20 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data,
 	if (!crypto_simd_usable())
 		return crypto_sha512_finup(desc, data, len, out);
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha512_base_do_update(desc, data, chunk, sha512_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha512_base_do_update(desc, data, len, sha512_xform);
 	sha512_base_do_finalize(desc, sha512_xform);
 	kernel_fpu_end();
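All of the "limit FPU preemption" patches in this series share the
shape used above; distilled into a sketch, with process() standing in
for the arch-specific *_base_do_update() helper:

	do {
		unsigned int chunk = min(len, FPU_BYTES);

		if (chunk) {
			kernel_fpu_begin();
			process(desc, data, chunk);	/* stand-in */
			kernel_fpu_end();
		}

		len -= chunk;
		data += chunk;
	} while (len);

With FPU_BYTES set to 4096, each preemption-off window covers at most
one page of input; assuming (purely as an illustration) a hash rate on
the order of 1 GB/s, that bounds each window to a few microseconds.
The if (chunk) guard lets a zero-length update skip the FPU section
entirely while still terminating the loop after one pass.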
From patchwork Wed Oct 12 21:59:17 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005471
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 05/19] crypto: x86/crc - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:17 -0500
Message-Id: <20221012215931.3896-6-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls. Those functions call preempt_disable() and
preempt_enable(), so the CPU core is unavailable for scheduling while
running, leading to:
  rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

Fixes: 78c37d191dd6 ("crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation")
Fixes: 6a8ce1ef3940 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction")
Fixes: 0b95a7f85718 ("crypto: crct10dif - Glue code to cast accelerated CRCT10DIF assembly as a crypto transform")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/crc32-pclmul_asm.S      |  6 ++--
 arch/x86/crypto/crc32-pclmul_glue.c     | 19 ++++++++----
 arch/x86/crypto/crc32c-intel_glue.c     | 29 ++++++++++++++----
 arch/x86/crypto/crct10dif-pclmul_glue.c | 39 ++++++++++++++++++++-----
 4 files changed, 71 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/crc32-pclmul_asm.S b/arch/x86/crypto/crc32-pclmul_asm.S
index ca53e96996ac..9abd861636c3 100644
--- a/arch/x86/crypto/crc32-pclmul_asm.S
+++ b/arch/x86/crypto/crc32-pclmul_asm.S
@@ -72,15 +72,15 @@
 .text
 /**
  *      Calculate crc32
- *      BUF - buffer (16 bytes aligned)
- *      LEN - sizeof buffer (16 bytes aligned), LEN should be grater than 63
+ *      BUF - buffer - must be 16 bytes aligned
+ *      LEN - sizeof buffer - must be multiple of 16 bytes and greater than 63
 *      CRC - initial crc32
 *      return %eax crc32
 *      uint crc32_pclmul_le_16(unsigned char const *buffer,
 *                              size_t len, uint crc32)
 */
-SYM_FUNC_START(crc32_pclmul_le_16) /* buffer and buffer size are 16 bytes aligned */
+SYM_FUNC_START(crc32_pclmul_le_16)
 	movdqa  (BUF), %xmm1
 	movdqa  0x10(BUF), %xmm2
 	movdqa  0x20(BUF), %xmm3

diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 98cf3b4e4c9f..38539c6edfe5 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -46,6 +46,8 @@
 #define SCALE_F			16L	/* size of xmm register */
 #define SCALE_F_MASK		(SCALE_F - 1)
 
+#define FPU_BYTES		4096U	/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 u32 crc32_pclmul_le_16(unsigned char const *buffer, size_t len, u32 crc32);
 
 static u32 __attribute__((pure))
@@ -70,12 +72,19 @@ static u32 __attribute__((pure))
 	iquotient = len & (~SCALE_F_MASK);
 	iremainder = len & SCALE_F_MASK;
 
-	kernel_fpu_begin();
-	crc = crc32_pclmul_le_16(p, iquotient, crc);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(iquotient, FPU_BYTES);
+
+		kernel_fpu_begin();
+		crc = crc32_pclmul_le_16(p, chunk, crc);
+		kernel_fpu_end();
+
+		iquotient -= chunk;
+		p += chunk;
+	} while (iquotient >= PCLMUL_MIN_LEN);
 
-	if (iremainder)
-		crc = crc32_le(crc, p + iquotient, iremainder);
+	if (iquotient || iremainder)
+		crc = crc32_le(crc, p, iquotient + iremainder);
 
 	return crc;
 }

diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index feccb5254c7e..ece620227057 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -41,6 +41,8 @@
  */
 #define CRC32C_PCL_BREAKEVEN	512
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage unsigned int crc_pcl(const u8 *buffer, int len,
 				unsigned int crc_init);
 #endif /* CONFIG_X86_64 */
@@ -158,9 +160,16 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
 	 * overcome kernel fpu state save/restore overhead
 	 */
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*crcp = crc_pcl(data, len, *crcp);
-		kernel_fpu_end();
+		do {
+			unsigned int chunk = min(len, FPU_BYTES);
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		} while (len);
 	} else
 		*crcp = crc32c_intel_le_hw(*crcp, data, len);
 	return 0;
@@ -170,9 +179,17 @@ static int __crc32c_pcl_intel_finup(u32 *crcp, const u8 *data,
 				    unsigned int len, u8 *out)
 {
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__le32 *)out = ~cpu_to_le32(crc_pcl(data, len, *crcp));
-		kernel_fpu_end();
+		do {
+			unsigned int chunk = min(len, FPU_BYTES);
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		} while (len);
+		*(__le32 *)out = ~cpu_to_le32(*crcp);
 	} else
 		*(__le32 *)out = ~cpu_to_le32(crc32c_intel_le_hw(*crcp, data, len));

diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 71291d5af9f4..54a537fc88ee 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -34,6 +34,10 @@
 #include
 #include
 
+#define PCLMUL_MIN_LEN 16U /* minimum size of buffer for crc_t10dif_pcl */
+
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage u16 crc_t10dif_pcl(u16 init_crc, const u8 *buf, size_t len);
 
 struct chksum_desc_ctx {
@@ -54,10 +58,19 @@ static int chksum_update(struct shash_desc *desc, const u8 *data,
 {
 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
 
-	if (length >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		ctx->crc = crc_t10dif_pcl(ctx->crc, data, length);
-		kernel_fpu_end();
+	if (length >= PCLMUL_MIN_LEN && crypto_simd_usable()) {
+		do {
+			unsigned int chunk = min(length, FPU_BYTES);
+
+			kernel_fpu_begin();
+			ctx->crc = crc_t10dif_pcl(ctx->crc, data, chunk);
+			kernel_fpu_end();
+
+			length -= chunk;
+			data += chunk;
+		} while (length >= PCLMUL_MIN_LEN);
+		if (length)
+			ctx->crc = crc_t10dif_generic(ctx->crc, data, length);
 	} else
 		ctx->crc = crc_t10dif_generic(ctx->crc, data, length);
 	return 0;
@@ -73,10 +86,20 @@ static int chksum_final(struct shash_desc *desc, u8 *out)
 static int __chksum_finup(__u16 crc, const u8 *data, unsigned int len,
 			  u8 *out)
 {
-	if (len >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__u16 *)out = crc_t10dif_pcl(crc, data, len);
-		kernel_fpu_end();
+	if (len >= PCLMUL_MIN_LEN && crypto_simd_usable()) {
+		do {
+			unsigned int chunk = min(len, FPU_BYTES);
+
+			kernel_fpu_begin();
+			crc = crc_t10dif_pcl(crc, data, chunk);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		} while (len >= PCLMUL_MIN_LEN);
+		if (len)
+			crc = crc_t10dif_generic(crc, data, len);
+		*(__u16 *)out = crc;
 	} else
 		*(__u16 *)out = crc_t10dif_generic(crc, data, len);
 	return 0;
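A worked example of the reworked crc32 tail handling (assuming
PCLMUL_MIN_LEN keeps its existing value of 64 in crc32-pclmul_glue.c):
for len = 4100, iquotient starts at 4096 and iremainder at 4. The first
pass consumes all 4096 aligned bytes under the FPU, the loop exits
because 0 < PCLMUL_MIN_LEN, and the final 4 bytes fall through to the
table-driven crc32_le(). Since FPU_BYTES is a multiple of SCALE_F, the
loop can also exit with 16 to 48 aligned bytes left over; those are now
folded into the same crc32_le() call, which is why the tail test became
"if (iquotient || iremainder)".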
From patchwork Wed Oct 12 21:59:18 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005468
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 06/19] crypto: x86/sm3 - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:18 -0500
Message-Id: <20221012215931.3896-7-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls. Those functions call preempt_disable() and
preempt_enable(), so the CPU core is unavailable for scheduling while
running, causing:
  rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

Fixes: 930ab34d906d ("crypto: x86/sm3 - add AVX assembly implementation")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/sm3_avx_glue.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index 661b6f22ffcd..ffb6d2f409ef 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -17,6 +17,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sm3_transform_avx(struct sm3_state *state,
 				  const u8 *data, int nblocks);
 
@@ -37,9 +39,16 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
 
-	kernel_fpu_begin();
-	sm3_base_do_update(desc, data, len, sm3_transform_avx);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		kernel_fpu_begin();
+		sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -57,9 +66,19 @@ static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 		return 0;
 	}
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	kernel_fpu_begin();
-	if (len)
-		sm3_base_do_update(desc, data, len, sm3_transform_avx);
 	sm3_base_do_finalize(desc, sm3_transform_avx);
 	kernel_fpu_end();
From patchwork Wed Oct 12 21:59:19 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005467
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 07/19] crypto: x86/ghash - restructure FPU context saving
Date: Wed, 12 Oct 2022 16:59:19 -0500
Message-Id: <20221012215931.3896-8-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Wrap each of the calls to clmul_ghash_update and clmul_ghash_mul in
its own set of kernel_fpu_begin and kernel_fpu_end calls, preparing
to limit the amount of data processed by each _update call to avoid
RCU stalls.

This is more like how polyval-clmulni_glue is structured.

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 1f1a95f3dd0c..53aa286ec27f 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -80,7 +80,6 @@ static int ghash_update(struct shash_desc *desc,
 	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
 	u8 *dst = dctx->buffer;
 
-	kernel_fpu_begin();
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
 		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
@@ -91,10 +90,14 @@ static int ghash_update(struct shash_desc *desc,
 		while (n--)
 			*pos++ ^= *src++;
 
-		if (!dctx->bytes)
+		if (!dctx->bytes) {
+			kernel_fpu_begin();
 			clmul_ghash_mul(dst, &ctx->shash);
+			kernel_fpu_end();
+		}
 	}
 
+	kernel_fpu_begin();
 	clmul_ghash_update(dst, src, srclen, &ctx->shash);
 	kernel_fpu_end();
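The split matters because the two halves of ghash_update() have
different needs: topping up a partially filled dctx->buffer is a plain
byte-wise XOR in scalar C and needs no FPU protection at all, so only
the two PCLMUL calls (clmul_ghash_mul() for a completed block,
clmul_ghash_update() for the bulk data) need to sit inside
kernel_fpu_begin()/kernel_fpu_end() pairs. That is what lets the next
patch bound the bulk call without touching the buffering logic.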
From patchwork Wed Oct 12 21:59:20 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 13005472
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Robert Elliott <elliott@hpe.com>
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott <elliott@hpe.com>
Subject: [PATCH v2 08/19] crypto: x86/ghash - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:20 -0500
Message-Id: <20221012215931.3896-9-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls. Those functions call preempt_disable() and
preempt_enable(), so the CPU core is unavailable for scheduling while
running, leading to:
  rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: ...

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 26 ++++++++++++++++------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 53aa286ec27f..a39fc405c7cf 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -23,6 +23,8 @@
 #define GHASH_BLOCK_SIZE	16
 #define GHASH_DIGEST_SIZE	16
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 void clmul_ghash_mul(char *dst, const u128 *shash);
 
 void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
@@ -82,7 +84,7 @@ static int ghash_update(struct shash_desc *desc,
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
-		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
+		u8 *pos = dst + GHASH_BLOCK_SIZE - dctx->bytes;
 
 		dctx->bytes -= n;
 		srclen -= n;
@@ -97,13 +99,23 @@ static int ghash_update(struct shash_desc *desc,
 		}
 	}
 
-	kernel_fpu_begin();
-	clmul_ghash_update(dst, src, srclen, &ctx->shash);
-	kernel_fpu_end();
+	while (srclen >= GHASH_BLOCK_SIZE) {
+		unsigned int fpulen = min(srclen, FPU_BYTES);
+
+		kernel_fpu_begin();
+		while (fpulen >= GHASH_BLOCK_SIZE) {
+			int n = min_t(unsigned int, fpulen, GHASH_BLOCK_SIZE);
+
+			clmul_ghash_update(dst, src, n, &ctx->shash);
+
+			srclen -= n;
+			fpulen -= n;
+			src += n;
+		}
+		kernel_fpu_end();
+	}
 
-	if (srclen & 0xf) {
-		src += srclen - (srclen & 0xf);
-		srclen &= 0xf;
+	if (srclen) {
 		dctx->bytes = GHASH_BLOCK_SIZE - srclen;
 		while (srclen--)
 			*dst++ ^= *src++;
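Unlike the hash glue code, ghash_update() must hand
clmul_ghash_update() whole 16-byte blocks, so the new code walks the
data at two levels: an outer loop that bounds each FPU section to
FPU_BYTES, and an inner loop that feeds one GHASH_BLOCK_SIZE block per
call within that section. Any sub-block tail is stashed in dctx->buffer
for the next call. For example, with srclen = 4100 the first FPU
section covers 4096 bytes and the final 4 bytes are buffered.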
p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657j8a4y-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 21:59:59 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 81D031396A; Wed, 12 Oct 2022 21:59:58 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 28796805032; Wed, 12 Oct 2022 21:59:58 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 09/19] crypto: x86 - use common macro for FPU limit Date: Wed, 12 Oct 2022 16:59:21 -0500 Message-Id: <20221012215931.3896-10-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: cvBzHnz1Jrp4-w5puDxizPr6_DNBD6L8 X-Proofpoint-GUID: cvBzHnz1Jrp4-w5puDxizPr6_DNBD6L8 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 suspectscore=0 malwarescore=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 impostorscore=0 priorityscore=1501 mlxscore=0 clxscore=1015 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Use a common macro name (FPU_BYTES) for the limit of the number of bytes processed within kernel_fpu_begin and kernel_fpu_end rather than using SZ_4K (which is a signed value), or a magic value of 4096U. Use unsigned int rather than size_t for some of the arguments to avoid typecasting for the min() macro. Signed-off-by: Robert Elliott --- arch/x86/crypto/blake2s-glue.c | 7 ++++--- arch/x86/crypto/chacha_glue.c | 4 +++- arch/x86/crypto/nhpoly1305-avx2-glue.c | 4 +++- arch/x86/crypto/nhpoly1305-sse2-glue.c | 4 +++- arch/x86/crypto/poly1305_glue.c | 25 ++++++++++++++----------- arch/x86/crypto/polyval-clmulni_glue.c | 5 +++-- 6 files changed, 30 insertions(+), 19 deletions(-) diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index aaba21230528..3054ee7fa219 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -16,6 +16,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state, const u8 *block, const size_t nblocks, const u32 inc); @@ -29,8 +31,7 @@ static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512); void blake2s_compress(struct blake2s_state *state, const u8 *block, size_t nblocks, const u32 inc) { - /* SIMD disables preemption, so relax after processing each page. 

diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index aaba21230528..3054ee7fa219 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -16,6 +16,8 @@
 #include
 #include

+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state,
				       const u8 *block, const size_t nblocks,
				       const u32 inc);
@@ -29,8 +31,7 @@ static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512);
 void blake2s_compress(struct blake2s_state *state, const u8 *block,
		      size_t nblocks, const u32 inc)
 {
-	/* SIMD disables preemption, so relax after processing each page. */
-	BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
+	BUILD_BUG_ON(FPU_BYTES / BLAKE2S_BLOCK_SIZE < 8);

 	if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
 		blake2s_compress_generic(state, block, nblocks, inc);
@@ -39,7 +40,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,

 	do {
 		const size_t blocks = min_t(size_t, nblocks,
-					    SZ_4K / BLAKE2S_BLOCK_SIZE);
+					    FPU_BYTES / BLAKE2S_BLOCK_SIZE);

 		kernel_fpu_begin();
 		if (IS_ENABLED(CONFIG_AS_AVX512) &&
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index 7b3a1cf0984b..0d7e172862db 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -15,6 +15,8 @@
 #include
 #include

+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
				       unsigned int len, int nrounds);
 asmlinkage void chacha_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
@@ -147,7 +149,7 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
 		return chacha_crypt_generic(state, dst, src, bytes, nrounds);

 	do {
-		unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+		unsigned int todo = min(bytes, FPU_BYTES);

 		kernel_fpu_begin();
 		chacha_dosimd(state, dst, src, todo, nrounds);
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 8ea5ab0f1ca7..59615ae95e86 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -13,6 +13,8 @@
 #include
 #include

+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
			u8 hash[NH_HASH_BYTES]);

@@ -30,7 +32,7 @@ static int nhpoly1305_avx2_update(struct shash_desc *desc,
 		return crypto_nhpoly1305_update(desc, src, srclen);

 	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+		unsigned int n = min(srclen, FPU_BYTES);

 		kernel_fpu_begin();
 		crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2);
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index 2b353d42ed13..bf91c375821a 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -13,6 +13,8 @@
 #include
 #include

+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
			u8 hash[NH_HASH_BYTES]);

@@ -30,7 +32,7 @@ static int nhpoly1305_sse2_update(struct shash_desc *desc,
 		return crypto_nhpoly1305_update(desc, src, srclen);

 	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+		unsigned int n = min(srclen, FPU_BYTES);

 		kernel_fpu_begin();
 		crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2);
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 1dfb8af48a3c..3764301bdf1b 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -15,20 +15,24 @@
 #include
 #include

+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void poly1305_init_x86_64(void *ctx,
				     const u8 key[POLY1305_BLOCK_SIZE]);
 asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp,
-				       const size_t len, const u32 padbit);
+				       const unsigned int len,
+				       const u32 padbit);
 asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
				     const u32 nonce[4]);
 asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE],
				  const u32 nonce[4]);
-asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len,
-				    const u32 padbit);
-asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len,
-				     const u32 padbit);
+asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp,
+				    const unsigned int len, const u32 padbit);
+asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp,
+				     const unsigned int len, const u32 padbit);
 asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp,
-				       const size_t len, const u32 padbit);
+				       const unsigned int len,
+				       const u32 padbit);

 static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx);
 static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2);
@@ -86,14 +90,13 @@ static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_BLOCK_SIZE])
 		poly1305_init_x86_64(ctx, key);
 }

-static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,
+static void poly1305_simd_blocks(void *ctx, const u8 *inp, unsigned int len,
				 const u32 padbit)
 {
 	struct poly1305_arch_internal *state = ctx;

-	/* SIMD disables preemption, so relax after processing each page. */
-	BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
-		     SZ_4K % POLY1305_BLOCK_SIZE);
+	BUILD_BUG_ON(FPU_BYTES < POLY1305_BLOCK_SIZE ||
+		     FPU_BYTES % POLY1305_BLOCK_SIZE);

 	if (!static_branch_likely(&poly1305_use_avx) ||
 	    (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) ||
@@ -104,7 +107,7 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len,
 	}

 	do {
-		const size_t bytes = min_t(size_t, len, SZ_4K);
+		const unsigned int bytes = min(len, FPU_BYTES);

 		kernel_fpu_begin();
 		if (IS_ENABLED(CONFIG_AS_AVX512) &&
		    static_branch_likely(&poly1305_use_avx512))
diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c
index b7664d018851..2502964afef6 100644
--- a/arch/x86/crypto/polyval-clmulni_glue.c
+++ b/arch/x86/crypto/polyval-clmulni_glue.c
@@ -29,6 +29,8 @@

 #define NUM_KEY_POWERS	8

+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 struct polyval_tfm_ctx {
 	/*
 	 * These powers must be in the order h^8, ..., h^1.
@@ -123,8 +125,7 @@ static int polyval_x86_update(struct shash_desc *desc,
 	}

 	while (srclen >= POLYVAL_BLOCK_SIZE) {
-		/* Allow rescheduling every 4K bytes. */
-		nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
+		nblocks = min(srclen, FPU_BYTES) / POLYVAL_BLOCK_SIZE;
 		internal_polyval_update(tctx, src, nblocks, dctx->buffer);
 		srclen -= nblocks * POLYVAL_BLOCK_SIZE;
 		src += nblocks * POLYVAL_BLOCK_SIZE;
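
The BUILD_BUG_ON() checks in the blake2s and poly1305 hunks above turn the
required relationship between FPU_BYTES and the algorithm's block size into
a compile-time assertion. A minimal sketch of the idea, with an illustrative
block size rather than any real algorithm's:

#include <linux/build_bug.h>

#define FPU_BYTES  4096U
#define BLOCK_SIZE 16	/* hypothetical block size, for illustration */

static inline void check_fpu_limit(void)
{
	/* Fail the build if an FPU section cannot hold a whole number
	 * of blocks (at least one). */
	BUILD_BUG_ON(FPU_BYTES < BLOCK_SIZE || FPU_BYTES % BLOCK_SIZE);
}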

From patchwork Wed Oct 12 21:59:22 2022
From: Robert Elliott
Subject: [PATCH v2 10/19] crypto: x86/sha1, sha256 - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:22 -0500
Message-Id: <20221012215931.3896-11-elliott@hpe.com>

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), add module aliases based on CPU feature bits for the
x86-optimized crypto modules sha1 and sha256, so udev gets a chance to
load them later in the boot process, once the filesystems are all
available.

This commit covers modules that created rcu stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 13 +++++++++++++
 arch/x86/crypto/sha256_ssse3_glue.c | 13 +++++++++++++
 2 files changed, 26 insertions(+)
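
The mechanism behind these aliases: MODULE_DEVICE_TABLE(x86cpu, ...) emits a
modalias string per listed CPU feature, and udev loads any module whose alias
matches the running CPU. A minimal sketch of the alias-plus-gate pattern this
series applies (example_mod_init and the feature bits chosen here are
illustrative, not copied from any one patch):

#include <linux/module.h>
#include <asm/cpu_device_id.h>

/* Features advertised for autoloading. */
static const struct x86_cpu_id module_cpu_ids[] = {
	X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL),
	X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
	{}
};
/* Emits "alias x86cpu:..." entries so udev can autoload the module. */
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

static int __init example_mod_init(void)
{
	/* Refuse to load on CPUs with none of the listed features. */
	if (!x86_match_cpu(module_cpu_ids))
		return -ENODEV;
	return 0;
}
module_init(example_mod_init);

MODULE_LICENSE("GPL");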

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index a9f5779b41ca..edffc33bd12e 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include

 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -310,6 +311,15 @@ static int register_sha1_ni(void)
 	return 0;
 }

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static void unregister_sha1_ni(void)
 {
 	if (boot_cpu_has(X86_FEATURE_SHA_NI))
@@ -326,6 +336,9 @@ static int __init sha1_ssse3_mod_init(void)
 	if (register_sha1_ssse3())
 		goto fail;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (register_sha1_avx()) {
 		unregister_sha1_ssse3();
 		goto fail;
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 322c8aa907af..42e8cb1a6708 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include

 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -366,6 +367,15 @@ static struct shash_alg sha256_ni_algs[] = { {
 	}
 } };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int register_sha256_ni(void)
 {
 	if (boot_cpu_has(X86_FEATURE_SHA_NI))
@@ -388,6 +398,9 @@ static inline void unregister_sha256_ni(void) { }

 static int __init sha256_ssse3_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (register_sha256_ssse3())
 		goto fail;

From patchwork Wed Oct 12 21:59:23 2022
From: Robert Elliott
Subject: [PATCH v2 11/19] crypto: x86/crc - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:23 -0500
Message-Id: <20221012215931.3896-12-elliott@hpe.com>

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), these x86-optimized crypto modules already have module
aliases based on CPU feature bits: crc32, crc32c, and crct10dif.

Rename each unique device table data structure to a generic name so the
code has the same pattern in all the modules.

Remove the print on a device table mismatch from crc32, which is not
present in the other modules. Modules are not supposed to print unless
they are active.

This commit covers modules that created rcu stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/crc32-pclmul_glue.c     | 9 +++------
 arch/x86/crypto/crc32c-intel_glue.c     | 6 +++---
 arch/x86/crypto/crct10dif-pclmul_glue.c | 6 +++---
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 38539c6edfe5..d49a19dcee37 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -178,20 +178,17 @@ static struct shash_alg alg = {
 	}
 };

-static const struct x86_cpu_id crc32pclmul_cpu_id[] = {
+static const struct x86_cpu_id module_cpu_ids[] = {
 	X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL),
 	{}
 };
-MODULE_DEVICE_TABLE(x86cpu, crc32pclmul_cpu_id);
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static int __init crc32_pclmul_mod_init(void)
 {
-
-	if (!x86_match_cpu(crc32pclmul_cpu_id)) {
-		pr_info("PCLMULQDQ-NI instructions are not detected.\n");
+	if (!x86_match_cpu(module_cpu_ids))
 		return -ENODEV;
-	}
 	return crypto_register_shash(&alg);
 }
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index ece620227057..980c62929256 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -231,15 +231,15 @@ static struct shash_alg alg = {
 	}
 };

-static const struct x86_cpu_id crc32c_cpu_id[] = {
+static const struct x86_cpu_id module_cpu_ids[] = {
 	X86_MATCH_FEATURE(X86_FEATURE_XMM4_2, NULL),
 	{}
 };
-MODULE_DEVICE_TABLE(x86cpu, crc32c_cpu_id);
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static int __init crc32c_intel_mod_init(void)
 {
-	if (!x86_match_cpu(crc32c_cpu_id))
+	if (!x86_match_cpu(module_cpu_ids))
 		return -ENODEV;
 #ifdef CONFIG_X86_64
 	if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) {
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 54a537fc88ee..3b8e9394c40d 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -136,15 +136,15 @@ static struct shash_alg alg = {
 	}
 };

-static const struct x86_cpu_id crct10dif_cpu_id[] = {
+static const struct x86_cpu_id module_cpu_ids[] = {
 	X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL),
 	{}
 };
-MODULE_DEVICE_TABLE(x86cpu, crct10dif_cpu_id);
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static int __init crct10dif_intel_mod_init(void)
 {
-	if (!x86_match_cpu(crct10dif_cpu_id))
+	if (!x86_match_cpu(module_cpu_ids))
 		return -ENODEV;

 	return crypto_register_shash(&alg);

From patchwork Wed Oct 12 21:59:24 2022
From: Robert Elliott
Subject: [PATCH v2 12/19] crypto: x86/sm3 - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:24 -0500
Message-Id: <20221012215931.3896-13-elliott@hpe.com>

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), add a module alias based on CPU feature bits for the
x86-optimized crypto module sm3, so udev gets a chance to load it later
in the boot process, once the filesystems are all available.

This commit covers a module that created rcu stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sm3_avx_glue.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index ffb6d2f409ef..475b9637a06d 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include

 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -115,10 +116,19 @@ static struct shash_alg sm3_avx_alg = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init sm3_avx_mod_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX)) {
 		pr_info("AVX instruction are not detected.\n");
 		return -ENODEV;

From patchwork Wed Oct 12 21:59:25 2022
From: Robert Elliott
Subject: [PATCH v2 13/19] crypto: x86/ghash - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:25 -0500
Message-Id: <20221012215931.3896-14-elliott@hpe.com>

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), this x86-optimized crypto module already has a module alias
based on CPU feature bits: ghash.

Rename the unique device table data structure to a generic name so the
code has the same pattern in all the modules.

This commit covers modules that created rcu stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index a39fc405c7cf..69945e41bc41 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -327,17 +327,17 @@ static struct ahash_alg ghash_async_alg = {
 	},
 };

-static const struct x86_cpu_id pcmul_cpu_id[] = {
+static const struct x86_cpu_id module_cpu_ids[] = {
 	X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), /* Pickle-Mickle-Duck */
 	{}
 };
-MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id);
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static int __init ghash_pclmulqdqni_mod_init(void)
 {
 	int err;

-	if (!x86_match_cpu(pcmul_cpu_id))
+	if (!x86_match_cpu(module_cpu_ids))
 		return -ENODEV;

 	err = crypto_register_shash(&ghash_alg);

From patchwork Wed Oct 12 21:59:26 2022
From: Robert Elliott
Subject: [PATCH v2 14/19] crypto: x86 - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:26 -0500
Message-Id: <20221012215931.3896-15-elliott@hpe.com>

x86-optimized crypto modules built as loadable modules rather than
built into the kernel end up as .ko files in the filesystem, e.g., in
/usr/lib/modules. If the filesystem driver itself is a module, these
might not be available when the crypto API is initialized, resulting in
the generic implementation being used (e.g., sha512_transform rather
than sha512_transform_avx2).

In one test case, CPU utilization in the sha512 function dropped from
15.34% to 7.18% after forcing the optimized module to load.

Set module aliases for x86-optimized crypto modules based on CPU
feature bits so udev gets a chance to load them later in the boot
process, when the filesystems are all running.

For example, with sha256, sha512, aesni_intel, and blake2s configured
as built-in and the rest configured as modules:

[   13.749145] sha256_ssse3: CPU-optimized crypto module loaded (SSSE3=no, AVX=no, AVX2=yes, SHA-NI=no)
[   13.758502] sha512_ssse3: CPU-optimized crypto module loaded (SSSE3=no, AVX=no, AVX2=yes)
[   13.766939] libblake2s_x86_64: CPU-optimized crypto module loaded (SSSE3=yes, AVX512=yes)
[   16.794502] aesni_intel: CPU-optimized crypto module loaded (GCM SSE=no, AVX=yes, AVX2=yes)(CTR AVX=yes)
...
[   18.160648] Run /init as init process
...
[   20.073484] twofish_x86_64: CPU-optimized crypto module loaded
[   23.974029] serpent_sse2_x86_64: CPU-optimized crypto module loaded
[   24.080749] serpent_avx_x86_64: CPU-optimized crypto module loaded
[   24.187148] serpent_avx2: CPU-optimized crypto module loaded
[   24.358980] des3_ede_x86_64: CPU-optimized crypto module loaded
[   24.459257] camellia_x86_64: CPU-optimized crypto module loaded
[   24.548487] camellia_aesni_avx_x86_64: CPU-optimized crypto module loaded
[   24.630777] camellia_aesni_avx2: CPU-optimized crypto module loaded
[   24.957134] blowfish_x86_64: CPU-optimized crypto module loaded
[   25.063537] aegis128_aesni: CPU-optimized crypto module loaded
[   25.174560] chacha_x86_64: CPU-optimized crypto module loaded (AVX2=yes, AVX512=yes)
[   25.270084] sha1_ssse3: CPU-optimized crypto module loaded (SSSE3=no, AVX=no, AVX2=yes, SHA-NI=no)
[   25.531724] ghash_clmulni_intel: CPU-optimized crypto module loaded
[   25.596316] crc32c_intel: CPU-optimized crypto module loaded (PCLMULQDQ=yes)
[   25.661693] crc32_pclmul: CPU-optimized crypto module loaded
[   25.696388] crct10dif_pclmul: CPU-optimized crypto module loaded
[   25.742040] poly1305_x86_64: CPU-optimized crypto module loaded (AVX=yes, AVX2=yes, AVX512=no)
[   25.841364] nhpoly1305_avx2: CPU-optimized crypto module loaded
[   25.856401] curve25519_x86_64: CPU-optimized crypto module loaded (ADX=yes)
[   25.866615] sm3_avx_x86_64: CPU-optimized crypto module loaded

This commit covers modules that did not create rcu stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/aegis128-aesni-glue.c      |  9 +++++++++
 arch/x86/crypto/aesni-intel_glue.c         |  7 +++----
 arch/x86/crypto/blake2s-glue.c             | 11 ++++++++++-
 arch/x86/crypto/blowfish_glue.c            | 10 ++++++++++
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 12 ++++++++++++
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 11 +++++++++++
 arch/x86/crypto/camellia_glue.c            |  9 +++++++++
 arch/x86/crypto/cast5_avx_glue.c           | 10 ++++++++++
 arch/x86/crypto/cast6_avx_glue.c           | 10 ++++++++++
 arch/x86/crypto/chacha_glue.c              | 12 ++++++++++--
 arch/x86/crypto/curve25519-x86_64.c        | 12 +++++++++++-
 arch/x86/crypto/des3_ede_glue.c            | 10 ++++++++++
 arch/x86/crypto/nhpoly1305-avx2-glue.c     | 10 ++++++++++
 arch/x86/crypto/nhpoly1305-sse2-glue.c     | 10 ++++++++++
 arch/x86/crypto/poly1305_glue.c            | 12 ++++++++++++
 arch/x86/crypto/serpent_avx2_glue.c        | 10 ++++++++++
 arch/x86/crypto/serpent_avx_glue.c         | 10 ++++++++++
 arch/x86/crypto/serpent_sse2_glue.c        | 10 ++++++++++
 arch/x86/crypto/sm4_aesni_avx2_glue.c      | 12 ++++++++++++
 arch/x86/crypto/sm4_aesni_avx_glue.c       | 11 +++++++++++
 arch/x86/crypto/twofish_avx_glue.c         | 10 ++++++++++
 arch/x86/crypto/twofish_glue.c             | 10 ++++++++++
 arch/x86/crypto/twofish_glue_3way.c        | 10 ++++++++++
 23 files changed, 230 insertions(+), 8 deletions(-)

diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c
index 4623189000d8..9e4ba031704d 100644
--- a/arch/x86/crypto/aegis128-aesni-glue.c
+++ b/arch/x86/crypto/aegis128-aesni-glue.c
@@ -263,10 +263,19 @@ static struct aead_alg crypto_aegis128_aesni_alg = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_aead_alg *simd_alg;

 static int __init crypto_aegis128_aesni_module_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_XMM2) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
 	    !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL))
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index a5b0cb3efeba..4a530a558436 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -36,7 +36,6 @@
 #include
 #include

-
 #define AESNI_ALIGN	16
 #define AESNI_ALIGN_ATTR __attribute__ ((__aligned__(AESNI_ALIGN)))
 #define AES_BLOCK_MASK	(~(AES_BLOCK_SIZE - 1))
@@ -1228,17 +1227,17 @@ static struct aead_alg aesni_aeads[0];

 static struct simd_aead_alg *aesni_simd_aeads[ARRAY_SIZE(aesni_aeads)];

-static const struct x86_cpu_id aesni_cpu_id[] = {
+static const struct x86_cpu_id module_cpu_ids[] = {
 	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
 	{}
 };
-MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id);
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static int __init aesni_init(void)
 {
 	int err;

-	if (!x86_match_cpu(aesni_cpu_id))
+	if (!x86_match_cpu(module_cpu_ids))
 		return -ENODEV;
 #ifdef CONFIG_X86_64
 	if (boot_cpu_has(X86_FEATURE_AVX2)) {
diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index 3054ee7fa219..5153bb423dbe 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -10,7 +10,7 @@
 #include
 #include
 #include
-
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -56,8 +56,17 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
 }
 EXPORT_SYMBOL(blake2s_compress);

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init blake2s_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (boot_cpu_has(X86_FEATURE_SSSE3))
 		static_branch_enable(&blake2s_use_ssse3);
diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c
index 019c64c1340a..4c0ead71b198 100644
--- a/arch/x86/crypto/blowfish_glue.c
+++ b/arch/x86/crypto/blowfish_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 /* regular block cipher functions */
 asmlinkage void __blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src,
@@ -303,10 +304,19 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init blowfish_init(void)
 {
 	int err;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!force && is_blacklisted_cpu()) {
 		printk(KERN_INFO
 		       "blowfish-x86_64: performance on this CPU "
diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c
index e7e4d64e9577..8e3ac5be7cf6 100644
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "camellia.h"
 #include "ecb_cbc_helpers.h"
@@ -98,12 +99,23 @@ static struct skcipher_alg camellia_algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)];

 static int __init camellia_aesni_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX) ||
 	    !boot_cpu_has(X86_FEATURE_AVX2) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c
index c7ccf63e741e..54fcd86160ff 100644
--- a/arch/x86/crypto/camellia_aesni_avx_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "camellia.h"
 #include "ecb_cbc_helpers.h"
@@ -98,12 +99,22 @@ static struct skcipher_alg camellia_algs[] = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)];

 static int __init camellia_aesni_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
 	    !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c
index d45e9c0c42ac..e21d2d5b68f9 100644
--- a/arch/x86/crypto/camellia_glue.c
+++ b/arch/x86/crypto/camellia_glue.c
@@ -1377,10 +1377,19 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init camellia_init(void)
 {
 	int err;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!force && is_blacklisted_cpu()) {
 		printk(KERN_INFO
 		       "camellia-x86_64: performance on this CPU "
diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c
index 3976a87f92ad..bdc3c763334c 100644
--- a/arch/x86/crypto/cast5_avx_glue.c
+++ b/arch/x86/crypto/cast5_avx_glue.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "ecb_cbc_helpers.h"

@@ -93,12 +94,21 @@ static struct skcipher_alg cast5_algs[] = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *cast5_simd_algs[ARRAY_SIZE(cast5_algs)];

 static int __init cast5_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
			       &feature_name)) {
 		pr_info("CPU feature '%s' is not supported.\n", feature_name);
diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index 7e2aea372349..addca34b3511 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "ecb_cbc_helpers.h"

@@ -93,12 +94,21 @@ static struct skcipher_alg cast6_algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *cast6_simd_algs[ARRAY_SIZE(cast6_algs)];

 static int __init cast6_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
			       &feature_name)) {
 		pr_info("CPU feature '%s' is not supported.\n", feature_name);
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index 0d7e172862db..7275cae3380d 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include

 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -278,10 +279,17 @@ static struct skcipher_alg algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init chacha_simd_mod_init(void)
 {
-	if (!boot_cpu_has(X86_FEATURE_SSSE3))
-		return 0;
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	static_branch_enable(&chacha_use_simd);
diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c
index d55fa9e9b9e6..7fe395dfa79d 100644
--- a/arch/x86/crypto/curve25519-x86_64.c
+++ b/arch/x86/crypto/curve25519-x86_64.c
@@ -12,7 +12,7 @@
 #include
 #include
 #include
-
+#include <asm/cpu_device_id.h>
 #include
 #include
@@ -1697,9 +1697,19 @@ static struct kpp_alg curve25519_alg = {
 	.max_size = curve25519_max_size,
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ADX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

 static int __init curve25519_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX))
 		static_branch_enable(&curve25519_use_bmi2_adx);
 	else
diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c
index abb8b1fe123b..168cac5c6ca6 100644
--- a/arch/x86/crypto/des3_ede_glue.c
+++ b/arch/x86/crypto/des3_ede_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 struct des3_ede_x86_ctx {
 	struct des3_ede_ctx enc;
@@ -354,10 +355,19 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init des3_ede_x86_init(void)
 {
 	int err;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!force && is_blacklisted_cpu()) {
 		pr_info("des3_ede-x86_64: performance on this CPU would be suboptimal: disabling des3_ede-x86_64.\n");
 		return -ENODEV;
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 59615ae95e86..a8046334ddca 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include

 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -57,8 +58,17 @@ static struct shash_alg nhpoly1305_alg = {
 	.descsize		= sizeof(struct nhpoly1305_state),
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init nhpoly1305_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX2) ||
 	    !boot_cpu_has(X86_FEATURE_OSXSAVE))
 		return -ENODEV;
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index bf91c375821a..cdbe5df00927 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include

 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -57,8 +58,17 @@ static struct shash_alg nhpoly1305_alg = {
 	.descsize		= sizeof(struct nhpoly1305_state),
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init nhpoly1305_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_XMM2))
 		return -ENODEV;
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 3764301bdf1b..3e6ff505cd26 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include

@@ -260,8 +261,19 @@ static struct shash_alg alg = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX512F, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init poly1305_simd_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (boot_cpu_has(X86_FEATURE_AVX) &&
 	    cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL))
 		static_branch_enable(&poly1305_use_avx);
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index 347e97f4b713..24741d33edaf 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "serpent-avx.h"
 #include "ecb_cbc_helpers.h"
@@ -94,12 +95,21 @@ static struct skcipher_alg serpent_algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];

 static int __init serpent_avx2_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 		pr_info("AVX2 instructions are not detected.\n");
 		return -ENODEV;
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 6c248e1ea4ef..0db18d99da50 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "serpent-avx.h"
 #include "ecb_cbc_helpers.h"
@@ -100,12 +101,21 @@ static struct skcipher_alg serpent_algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];

 static int __init serpent_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
			       &feature_name)) {
 		pr_info("CPU feature '%s' is not supported.\n", feature_name);
diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c
index d78f37e9b2cf..5288441cc223 100644
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "serpent-sse2.h"
 #include "ecb_cbc_helpers.h"
@@ -103,10 +104,19 @@ static struct skcipher_alg serpent_algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];

 static int __init serpent_sse2_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_XMM2)) {
 		printk(KERN_INFO "SSE2 instructions are not detected.\n");
 		return -ENODEV;
diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c
index 84bc718f49a3..2e9fe76056b8 100644
--- a/arch/x86/crypto/sm4_aesni_avx2_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -126,6 +127,14 @@ static struct skcipher_alg sm4_aesni_avx2_skciphers[] = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)];

@@ -133,6 +142,9 @@ static int __init sm4_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX) ||
 	    !boot_cpu_has(X86_FEATURE_AVX2) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c
index 7800f77d68ad..f730822f203a 100644
--- a/arch/x86/crypto/sm4_aesni_avx_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -445,6 +446,13 @@ static struct skcipher_alg sm4_aesni_avx_skciphers[] = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)];

@@ -452,6 +460,9 @@ static int __init sm4_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
 	    !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index 3eb3440b477a..4657e6efc35d 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "twofish.h"
 #include "ecb_cbc_helpers.h"
@@ -103,12 +104,21 @@ static struct skcipher_alg twofish_algs[] = {
 	},
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *twofish_simd_algs[ARRAY_SIZE(twofish_algs)];

 static int __init twofish_init(void)
 {
 	const char *feature_name;

+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) {
 		pr_info("CPU feature '%s' is not supported.\n", feature_name);
 		return -ENODEV;
diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c
index f9c4adc27404..ade98aef3402 100644
--- a/arch/x86/crypto/twofish_glue.c
+++ b/arch/x86/crypto/twofish_glue.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 asmlinkage void twofish_enc_blk(struct twofish_ctx *ctx, u8 *dst,
				const u8 *src);
@@ -81,8 +82,17 @@ static struct crypto_alg alg = {
 	}
 };

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init twofish_glue_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	return crypto_register_alg(&alg);
 }
diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c
index 90454cf18e0d..790e5a59a9a7 100644
--- a/arch/x86/crypto/twofish_glue_3way.c
+++ b/arch/x86/crypto/twofish_glue_3way.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>

 #include "twofish.h"
 #include "ecb_cbc_helpers.h"
@@ -140,8 +141,17 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");

+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init twofish_3way_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!force && is_blacklisted_cpu()) {
 		printk(KERN_INFO
 		       "twofish-x86_64-3way: performance on this CPU "

From patchwork Wed Oct 12 21:59:27 2022
2022 22:00:06 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 15/19] crypto: x86 - add pr_fmt to all modules Date: Wed, 12 Oct 2022 16:59:27 -0500 Message-Id: <20221012215931.3896-16-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: Dpqcuqi8bN92J1TWWdtx_xSR_XapDlNC X-Proofpoint-ORIG-GUID: Dpqcuqi8bN92J1TWWdtx_xSR_XapDlNC X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 priorityscore=1501 suspectscore=0 lowpriorityscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 malwarescore=0 clxscore=1015 spamscore=0 bulkscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Add pr_fmt to all the modules so prints are prefixed by the Kconfig system module name (which is usually similar to, but not an exact match, for the runtime module name). Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 2 ++ arch/x86/crypto/aesni-intel_glue.c | 2 ++ arch/x86/crypto/aria_aesni_avx_glue.c | 2 ++ arch/x86/crypto/blake2s-glue.c | 2 ++ arch/x86/crypto/blowfish_glue.c | 2 ++ arch/x86/crypto/camellia_aesni_avx2_glue.c | 2 ++ arch/x86/crypto/camellia_aesni_avx_glue.c | 2 ++ arch/x86/crypto/camellia_glue.c | 3 +++ arch/x86/crypto/cast5_avx_glue.c | 2 ++ arch/x86/crypto/cast6_avx_glue.c | 2 ++ arch/x86/crypto/chacha_glue.c | 2 ++ arch/x86/crypto/crc32-pclmul_glue.c | 3 +++ arch/x86/crypto/crc32c-intel_glue.c | 3 +++ arch/x86/crypto/crct10dif-pclmul_glue.c | 2 ++ arch/x86/crypto/curve25519-x86_64.c | 2 ++ arch/x86/crypto/des3_ede_glue.c | 2 ++ arch/x86/crypto/ghash-clmulni-intel_glue.c | 2 ++ arch/x86/crypto/nhpoly1305-avx2-glue.c | 2 ++ arch/x86/crypto/nhpoly1305-sse2-glue.c | 2 ++ arch/x86/crypto/poly1305_glue.c | 2 ++ arch/x86/crypto/polyval-clmulni_glue.c | 2 ++ arch/x86/crypto/serpent_avx2_glue.c | 2 ++ arch/x86/crypto/serpent_avx_glue.c | 2 ++ arch/x86/crypto/serpent_sse2_glue.c | 2 ++ arch/x86/crypto/sha256_ssse3_glue.c | 1 - arch/x86/crypto/sm4_aesni_avx2_glue.c | 2 ++ arch/x86/crypto/sm4_aesni_avx_glue.c | 2 ++ arch/x86/crypto/twofish_avx_glue.c | 2 ++ arch/x86/crypto/twofish_glue.c | 2 ++ arch/x86/crypto/twofish_glue_3way.c | 2 ++ 30 files changed, 61 insertions(+), 1 deletion(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 9e4ba031704d..122bfd04ee47 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -7,6 +7,8 @@ * Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved. */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c index 4a530a558436..df93cb44b4eb 100644 --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -15,6 +15,8 @@ * Copyright (c) 2010, Intel Corporation. 
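The pr_fmt() define is purely textual: pr_info() and friends expand to printk(pr_fmt(fmt), ...), so every message in the file picks up a "modname: " prefix at compile time. A short sketch of the effect, assuming a hypothetical module named foo_glue:

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/printk.h>

	/* emits "foo_glue: CPU-optimized crypto module loaded" */
	pr_info("CPU-optimized crypto module loaded\n");

The define has to appear before the first #include that pulls in <linux/printk.h>, because printk.h supplies a default pr_fmt() only if none is defined yet; that is why each hunk places it ahead of the include block.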
*/ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index c561ea4fefa5..589097728bd1 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -5,6 +5,8 @@ * Copyright (c) 2022 Taehee Yoo */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index 5153bb423dbe..ac7fb7a9922b 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -3,6 +3,8 @@ * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 4c0ead71b198..5cfcbb91c4ca 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -8,6 +8,8 @@ * Copyright (c) 2006 Herbert Xu */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c index 8e3ac5be7cf6..851f2a29963c 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -5,6 +5,8 @@ * Copyright © 2013 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index 54fcd86160ff..8846493c92fb 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -5,6 +5,8 @@ * Copyright © 2012-2013 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index e21d2d5b68f9..3c14a904af00 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -8,6 +8,9 @@ * Copyright (C) 2006 NTT (Nippon Telegraph and Telephone Corporation) */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include #include #include #include diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index bdc3c763334c..fdeec0849ab5 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -6,6 +6,8 @@ * */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index addca34b3511..9258082408eb 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -8,6 +8,8 @@ * Copyright © 2013 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 7275cae3380d..8e5cadc808b4 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -6,6 +6,8 @@ * Copyright (C) 2015 Martin Willi */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index d49a19dcee37..bc2b31b04e05 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -26,6 +26,9 @@ * * Wrappers for kernel crypto shash api to pclmulqdq crc32 implementation. 
*/ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c index 980c62929256..ebf530934a3e 100644 --- a/arch/x86/crypto/crc32c-intel_glue.c +++ b/arch/x86/crypto/crc32c-intel_glue.c @@ -11,6 +11,9 @@ * Authors: Austin Zhang * Kent Liu */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c index 3b8e9394c40d..03e35a1b7677 100644 --- a/arch/x86/crypto/crct10dif-pclmul_glue.c +++ b/arch/x86/crypto/crct10dif-pclmul_glue.c @@ -22,6 +22,8 @@ * */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c index 7fe395dfa79d..f9a1adb0c183 100644 --- a/arch/x86/crypto/curve25519-x86_64.c +++ b/arch/x86/crypto/curve25519-x86_64.c @@ -4,6 +4,8 @@ * Copyright (c) 2016-2020 INRIA, CMU and Microsoft Corporation */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index 168cac5c6ca6..83e686a6c2f3 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -8,6 +8,8 @@ * Copyright (c) 2006 Herbert Xu */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index 69945e41bc41..3ad55144da48 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -7,6 +7,8 @@ * Author: Huang Ying */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index a8046334ddca..40f49107e5a9 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -6,6 +6,8 @@ * Copyright 2018 Google LLC */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c index cdbe5df00927..bb40fed92c92 100644 --- a/arch/x86/crypto/nhpoly1305-sse2-glue.c +++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c @@ -6,6 +6,8 @@ * Copyright 2018 Google LLC */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index 3e6ff505cd26..a2a7cb39cdec 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -3,6 +3,8 @@ * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index 2502964afef6..5a345db20ca9 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -16,6 +16,8 @@ * operations. 
*/ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c index 24741d33edaf..5944bf5ead2e 100644 --- a/arch/x86/crypto/serpent_avx2_glue.c +++ b/arch/x86/crypto/serpent_avx2_glue.c @@ -5,6 +5,8 @@ * Copyright © 2012-2013 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c index 0db18d99da50..45713c7a4cb9 100644 --- a/arch/x86/crypto/serpent_avx_glue.c +++ b/arch/x86/crypto/serpent_avx_glue.c @@ -8,6 +8,8 @@ * Copyright © 2011-2013 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c index 5288441cc223..d8aa0d3fbf15 100644 --- a/arch/x86/crypto/serpent_sse2_glue.c +++ b/arch/x86/crypto/serpent_sse2_glue.c @@ -12,6 +12,8 @@ * Copyright (c) 2006 Herbert Xu */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 42e8cb1a6708..8a0fb308fbba 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -26,7 +26,6 @@ * SOFTWARE. */ - #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c index 2e9fe76056b8..3fe9e170b880 100644 --- a/arch/x86/crypto/sm4_aesni_avx2_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c @@ -8,6 +8,8 @@ * Copyright (c) 2021 Tianjia Zhang */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c index f730822f203a..14ae012948ae 100644 --- a/arch/x86/crypto/sm4_aesni_avx_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx_glue.c @@ -8,6 +8,8 @@ * Copyright (c) 2021 Tianjia Zhang */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c index 4657e6efc35d..044e4f92e2c0 100644 --- a/arch/x86/crypto/twofish_avx_glue.c +++ b/arch/x86/crypto/twofish_avx_glue.c @@ -8,6 +8,8 @@ * Copyright © 2013 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c index ade98aef3402..031ed290c755 100644 --- a/arch/x86/crypto/twofish_glue.c +++ b/arch/x86/crypto/twofish_glue.c @@ -38,6 +38,8 @@ * Third Edition. 
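KBUILD_MODNAME is derived from the object name on the Makefile line that builds the module, with dashes mapped to underscores, not from the source file name. For example, a rule along the lines of "obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o" yields the prefix "twofish_x86_64" even though the glue source is twofish_glue.c, which is why the prefix is usually similar to, but not an exact match for, the runtime module name.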
*/ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c index 790e5a59a9a7..7e2a18e3abe7 100644 --- a/arch/x86/crypto/twofish_glue_3way.c +++ b/arch/x86/crypto/twofish_glue_3way.c @@ -5,6 +5,8 @@ * Copyright (c) 2011 Jussi Kivilinna */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include From patchwork Wed Oct 12 21:59:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 13005480 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BCEA4C433FE for ; Wed, 12 Oct 2022 22:02:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230085AbiJLWCZ (ORCPT ); Wed, 12 Oct 2022 18:02:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229693AbiJLWBT (ORCPT ); Wed, 12 Oct 2022 18:01:19 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 435E275FFB; Wed, 12 Oct 2022 15:00:22 -0700 (PDT) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CKWT9v015442; Wed, 12 Oct 2022 22:00:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=ntW9Ji65PFoN+eegclvP9j8XWOomEfeAptNcfNSx+Ww=; b=c2Af7gH+FDARfJhl2E3aelLxoxX8RDfySYFsKUjxQ67YSpfZRxYlw4YkQQdalIkINgq7 JjfBwQ+NUfhWUJWfTC7XL44LYuSYoqX+qy3QZ2Nt+lDRsS+oiL+eh9QYkkfjDuY94VZy oMlS6m1SdydQmyvQWHEj4p4quyKOzNrh+FhEBCyHkN1hK9sOS7oGDWGfwaafz2xPE2PB qjVouTBQEaYmSG7t6xIpP7VjSAPRqZzXlZOLbbhJ57xNAhm5pY4nVnI6DsufWx4A3DFL d5Y3kiBfBKCB618kXWqVTkyB2c/VqJ+bXCpUwYBGKHwEP0Wzuo5XTE+ORggCoZhokaMm +Q== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k615t2esc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:09 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 4962E29582; Wed, 12 Oct 2022 22:00:08 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id E2DA18003BA; Wed, 12 Oct 2022 22:00:07 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 16/19] crypto: x86 - print CPU optimized loaded messages Date: Wed, 12 Oct 2022 16:59:28 -0500 Message-Id: <20221012215931.3896-17-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: 
<20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: _Qi2zLBrbtmT0s8gvCC9eeuThCRMqzid X-Proofpoint-ORIG-GUID: _Qi2zLBrbtmT0s8gvCC9eeuThCRMqzid X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 phishscore=0 mlxlogscore=999 lowpriorityscore=0 suspectscore=0 adultscore=0 impostorscore=0 mlxscore=0 malwarescore=0 bulkscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Print a positive message at the info level when the CPU-optimized module loads successfully, for all modules except the sha modules. Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 8 +++++-- arch/x86/crypto/aesni-intel_glue.c | 22 +++++++++++++------ arch/x86/crypto/aria_aesni_avx_glue.c | 13 ++++++++--- arch/x86/crypto/blake2s-glue.c | 14 ++++++++++-- arch/x86/crypto/blowfish_glue.c | 2 ++ arch/x86/crypto/camellia_aesni_avx2_glue.c | 6 +++++- arch/x86/crypto/camellia_aesni_avx_glue.c | 6 +++++- arch/x86/crypto/camellia_glue.c | 3 +++ arch/x86/crypto/cast5_avx_glue.c | 6 +++++- arch/x86/crypto/cast6_avx_glue.c | 6 +++++- arch/x86/crypto/chacha_glue.c | 17 +++++++++++++-- arch/x86/crypto/crc32-pclmul_glue.c | 8 ++++++- arch/x86/crypto/crc32c-intel_glue.c | 15 +++++++++++-- arch/x86/crypto/crct10dif-pclmul_glue.c | 7 +++++- arch/x86/crypto/curve25519-x86_64.c | 13 +++++++++-- arch/x86/crypto/des3_ede_glue.c | 2 ++ arch/x86/crypto/ghash-clmulni-intel_glue.c | 1 + arch/x86/crypto/nhpoly1305-avx2-glue.c | 7 +++++- arch/x86/crypto/nhpoly1305-sse2-glue.c | 7 +++++- arch/x86/crypto/poly1305_glue.c | 25 ++++++++++++++++++---- arch/x86/crypto/polyval-clmulni_glue.c | 7 +++++- arch/x86/crypto/serpent_avx2_glue.c | 7 ++++-- arch/x86/crypto/serpent_avx_glue.c | 6 +++++- arch/x86/crypto/serpent_sse2_glue.c | 7 +++++- arch/x86/crypto/sm3_avx_glue.c | 6 +++++- arch/x86/crypto/sm4_aesni_avx2_glue.c | 6 +++++- arch/x86/crypto/sm4_aesni_avx_glue.c | 7 ++++-- arch/x86/crypto/twofish_avx_glue.c | 10 ++++++--- arch/x86/crypto/twofish_glue.c | 7 +++++- arch/x86/crypto/twofish_glue_3way.c | 9 ++++++-- 30 files changed, 213 insertions(+), 47 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 122bfd04ee47..e8eaf79ef220 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -275,7 +275,9 @@ static struct simd_aead_alg *simd_alg; static int __init crypto_aegis128_aesni_module_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_XMM2) || @@ -283,8 +285,11 @@ static int __init crypto_aegis128_aesni_module_init(void) !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) return -ENODEV; - return simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1, + ret = simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1, &simd_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit crypto_aegis128_aesni_module_exit(void) diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c index df93cb44b4eb..56023ba70049 100644 --- a/arch/x86/crypto/aesni-intel_glue.c +++ 
b/arch/x86/crypto/aesni-intel_glue.c @@ -1238,25 +1238,28 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init aesni_init(void) { int err; + int enabled_gcm_sse = 0; + int enabled_gcm_avx = 0; + int enabled_gcm_avx2 = 0; + int enabled_ctr_avx = 0; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_AVX2)) { - pr_info("AVX2 version of gcm_enc/dec engaged.\n"); + enabled_gcm_avx = 1; + enabled_gcm_avx2 = 1; static_branch_enable(&gcm_use_avx); static_branch_enable(&gcm_use_avx2); - } else - if (boot_cpu_has(X86_FEATURE_AVX)) { - pr_info("AVX version of gcm_enc/dec engaged.\n"); + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + enabled_gcm_avx = 1; static_branch_enable(&gcm_use_avx); } else { - pr_info("SSE version of gcm_enc/dec engaged.\n"); + enabled_gcm_sse = 1; } if (boot_cpu_has(X86_FEATURE_AVX)) { - /* optimize performance of ctr mode encryption transform */ + enabled_ctr_avx = 1; static_call_update(aesni_ctr_enc_tfm, aesni_ctr_enc_avx_tfm); - pr_info("AES CTR mode by8 optimization enabled\n"); } #endif /* CONFIG_X86_64 */ @@ -1283,6 +1286,11 @@ static int __init aesni_init(void) goto unregister_aeads; #endif /* CONFIG_X86_64 */ + pr_info("CPU-optimized crypto module loaded (GCM SSE=%s, AVX=%s, AVX2=%s)(CTR AVX=%s)\n", + enabled_gcm_sse ? "yes" : "no", + enabled_gcm_avx ? "yes" : "no", + enabled_gcm_avx2 ? "yes" : "no", + enabled_ctr_avx ? "yes" : "no"); return 0; #ifdef CONFIG_X86_64 diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index 589097728bd1..d58fb995a266 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -170,6 +170,8 @@ static struct simd_skcipher_alg *aria_simd_algs[ARRAY_SIZE(aria_algs)]; static int __init aria_avx_init(void) { const char *feature_name; + int ret; + int enabled_gfni = 0; if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || @@ -188,15 +190,20 @@ static int __init aria_avx_init(void) aria_ops.aria_encrypt_16way = aria_aesni_avx_gfni_encrypt_16way; aria_ops.aria_decrypt_16way = aria_aesni_avx_gfni_decrypt_16way; aria_ops.aria_ctr_crypt_16way = aria_aesni_avx_gfni_ctr_crypt_16way; + enabled_gfni = 1; } else { aria_ops.aria_encrypt_16way = aria_aesni_avx_encrypt_16way; aria_ops.aria_decrypt_16way = aria_aesni_avx_decrypt_16way; aria_ops.aria_ctr_crypt_16way = aria_aesni_avx_ctr_crypt_16way; } - return simd_register_skciphers_compat(aria_algs, - ARRAY_SIZE(aria_algs), - aria_simd_algs); + ret = simd_register_skciphers_compat(aria_algs, + ARRAY_SIZE(aria_algs), + aria_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded (GFNI=%s)\n", + enabled_gfni ? 
"yes" : "no"); + return ret; } static void __exit aria_avx_exit(void) diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index ac7fb7a9922b..4f2f385f6674 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -66,11 +66,16 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init blake2s_mod_init(void) { + int enabled_ssse3 = 0; + int enabled_avx512 = 0; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (boot_cpu_has(X86_FEATURE_SSSE3)) { + enabled_ssse3 = 1; static_branch_enable(&blake2s_use_ssse3); + } if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX) && @@ -78,9 +83,14 @@ static int __init blake2s_mod_init(void) boot_cpu_has(X86_FEATURE_AVX512F) && boot_cpu_has(X86_FEATURE_AVX512VL) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | - XFEATURE_MASK_AVX512, NULL)) + XFEATURE_MASK_AVX512, NULL)) { + enabled_avx512 = 1; static_branch_enable(&blake2s_use_avx512); + } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX512=%s)\n", + enabled_ssse3 ? "yes" : "no", + enabled_avx512 ? "yes" : "no"); return 0; } diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 5cfcbb91c4ca..27b7aed9a488 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -336,6 +336,8 @@ static int __init blowfish_init(void) if (err) crypto_unregister_alg(&bf_cipher_alg); + if (!err) + pr_info("CPU-optimized crypto module loaded\n"); return err; } diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c index 851f2a29963c..e6c4ed1e40d2 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -114,6 +114,7 @@ static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -132,9 +133,12 @@ static int __init camellia_aesni_init(void) return -ENODEV; } - return simd_register_skciphers_compat(camellia_algs, + ret = simd_register_skciphers_compat(camellia_algs, ARRAY_SIZE(camellia_algs), camellia_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit camellia_aesni_fini(void) diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index 8846493c92fb..6a9eadf0fe90 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -113,6 +113,7 @@ static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -130,9 +131,12 @@ static int __init camellia_aesni_init(void) return -ENODEV; } - return simd_register_skciphers_compat(camellia_algs, + ret = simd_register_skciphers_compat(camellia_algs, ARRAY_SIZE(camellia_algs), camellia_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit camellia_aesni_fini(void) diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index 3c14a904af00..94dd2973bb47 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -1410,6 +1410,9 @@ static int __init camellia_init(void) if (err) crypto_unregister_alg(&camellia_cipher_alg); + if (!err) + 
pr_info("CPU-optimized crypto module loaded\n"); + return err; } diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index fdeec0849ab5..b5ae17c3ac53 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -107,6 +107,7 @@ static struct simd_skcipher_alg *cast5_simd_algs[ARRAY_SIZE(cast5_algs)]; static int __init cast5_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -117,9 +118,12 @@ static int __init cast5_init(void) return -ENODEV; } - return simd_register_skciphers_compat(cast5_algs, + ret = simd_register_skciphers_compat(cast5_algs, ARRAY_SIZE(cast5_algs), cast5_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit cast5_exit(void) diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index 9258082408eb..d1c14a5f80d7 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -107,6 +107,7 @@ static struct simd_skcipher_alg *cast6_simd_algs[ARRAY_SIZE(cast6_algs)]; static int __init cast6_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -117,9 +118,12 @@ static int __init cast6_init(void) return -ENODEV; } - return simd_register_skciphers_compat(cast6_algs, + ret = simd_register_skciphers_compat(cast6_algs, ARRAY_SIZE(cast6_algs), cast6_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit cast6_exit(void) diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 8e5cadc808b4..de424fbe9f0e 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -289,6 +289,9 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init chacha_simd_mod_init(void) { + int ret; + int enabled_avx2 = 0; + int enabled_avx512 = 0; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -298,15 +301,25 @@ static int __init chacha_simd_mod_init(void) if (boot_cpu_has(X86_FEATURE_AVX) && boot_cpu_has(X86_FEATURE_AVX2) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + enabled_avx2 = 1; static_branch_enable(&chacha_use_avx2); if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX512VL) && - boot_cpu_has(X86_FEATURE_AVX512BW)) /* kmovq */ + boot_cpu_has(X86_FEATURE_AVX512BW)) { /* kmovq */ + enabled_avx512 = 1; static_branch_enable(&chacha_use_avx512vl); + } } - return IS_REACHABLE(CONFIG_CRYPTO_SKCIPHER) ? + ret = IS_REACHABLE(CONFIG_CRYPTO_SKCIPHER) ? crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0; + if (!ret) + pr_info("CPU-optimized crypto module loaded (AVX2=%s, AVX512=%s)\n", + enabled_avx2 ? "yes" : "no", + enabled_avx512 ? 
"yes" : "no"); + else + pr_info("CPU-optimized crypto module not loaded"); + return ret; } static void __exit chacha_simd_mod_fini(void) diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index bc2b31b04e05..c56d3d3ab0a0 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -190,9 +190,15 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32_pclmul_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - return crypto_register_shash(&alg); + + ret = crypto_register_shash(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit crc32_pclmul_mod_fini(void) diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c index ebf530934a3e..c633d303f19b 100644 --- a/arch/x86/crypto/crc32c-intel_glue.c +++ b/arch/x86/crypto/crc32c-intel_glue.c @@ -242,16 +242,27 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32c_intel_mod_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + int ret; + int pcl_enabled = 0; + + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU feature (SSE4.2) not supported\n"); return -ENODEV; + } + #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) { + pcl_enabled = 1; alg.update = crc32c_pcl_intel_update; alg.finup = crc32c_pcl_intel_finup; alg.digest = crc32c_pcl_intel_digest; } #endif - return crypto_register_shash(&alg); + ret = crypto_register_shash(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded (PCLMULQDQ=%s)\n", + pcl_enabled ? "yes" : "no"); + return ret; } static void __exit crc32c_intel_mod_fini(void) diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c index 03e35a1b7677..4476b9af1e61 100644 --- a/arch/x86/crypto/crct10dif-pclmul_glue.c +++ b/arch/x86/crypto/crct10dif-pclmul_glue.c @@ -146,10 +146,15 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crct10dif_intel_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - return crypto_register_shash(&alg); + ret = crypto_register_shash(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit crct10dif_intel_mod_fini(void) diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c index f9a1adb0c183..b9289feef375 100644 --- a/arch/x86/crypto/curve25519-x86_64.c +++ b/arch/x86/crypto/curve25519-x86_64.c @@ -1709,15 +1709,24 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init curve25519_mod_init(void) { + int ret; + int enabled_adx = 0; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) + if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) { + enabled_adx = 1; static_branch_enable(&curve25519_use_bmi2_adx); + } else return 0; - return IS_REACHABLE(CONFIG_CRYPTO_KPP) ? + ret = IS_REACHABLE(CONFIG_CRYPTO_KPP) ? crypto_register_kpp(&curve25519_alg) : 0; + if (!ret) + pr_info("CPU-optimized crypto module loaded (ADX=%s)\n", + enabled_adx ? 
"yes" : "no"); + return ret; } static void __exit curve25519_mod_exit(void) diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index 83e686a6c2f3..7b4dd02007ed 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -384,6 +384,8 @@ static int __init des3_ede_x86_init(void) if (err) crypto_unregister_alg(&des3_ede_cipher); + if (!err) + pr_info("CPU-optimized crypto module loaded\n"); return err; } diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index 3ad55144da48..496a410eaff7 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -349,6 +349,7 @@ static int __init ghash_pclmulqdqni_mod_init(void) if (err) goto err_shash; + pr_info("CPU-optimized crypto module loaded\n"); return 0; err_shash: diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index 40f49107e5a9..2dc7b618771f 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -68,6 +68,8 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init nhpoly1305_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -75,7 +77,10 @@ static int __init nhpoly1305_mod_init(void) !boot_cpu_has(X86_FEATURE_OSXSAVE)) return -ENODEV; - return crypto_register_shash(&nhpoly1305_alg); + ret = crypto_register_shash(&nhpoly1305_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit nhpoly1305_mod_exit(void) diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c index bb40fed92c92..bf0f8ac7afd6 100644 --- a/arch/x86/crypto/nhpoly1305-sse2-glue.c +++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c @@ -68,13 +68,18 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init nhpoly1305_mod_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_XMM2)) return -ENODEV; - return crypto_register_shash(&nhpoly1305_alg); + ret = crypto_register_shash(&nhpoly1305_alg); + if (ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit nhpoly1305_mod_exit(void) diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index a2a7cb39cdec..c9ebb6b90d1f 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -273,22 +273,39 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init poly1305_simd_mod_init(void) { + int ret; + int enabled_avx = 0; + int enabled_avx2 = 0; + int enabled_avx512 = 0; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (boot_cpu_has(X86_FEATURE_AVX) && - cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + enabled_avx = 1; static_branch_enable(&poly1305_use_avx); + } if (boot_cpu_has(X86_FEATURE_AVX) && boot_cpu_has(X86_FEATURE_AVX2) && - cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + enabled_avx2 = 1; static_branch_enable(&poly1305_use_avx2); + } if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX) && boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX512F) && cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | XFEATURE_MASK_AVX512, NULL) && /* Skylake downclocks unacceptably much when using zmm, but later generations are fast. 
- boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X) + boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X) { + enabled_avx512 = 1; static_branch_enable(&poly1305_use_avx512); - return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? crypto_register_shash(&alg) : 0; + } + ret = IS_REACHABLE(CONFIG_CRYPTO_HASH) ? crypto_register_shash(&alg) : 0; + if (!ret) + pr_info("CPU-optimized crypto module loaded (AVX=%s, AVX2=%s, AVX512=%s)\n", + enabled_avx ? "yes" : "no", + enabled_avx2 ? "yes" : "no", + enabled_avx512 ? "yes" : "no"); + return ret; } static void __exit poly1305_simd_mod_exit(void) diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index 5a345db20ca9..7a3a80085c90 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -183,13 +183,18 @@ MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); static int __init polyval_clmulni_mod_init(void) { + int ret; + if (!x86_match_cpu(pcmul_cpu_id)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX)) return -ENODEV; - return crypto_register_shash(&polyval_alg); + ret = crypto_register_shash(&polyval_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit polyval_clmulni_mod_exit(void) diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c index 24741d33edaf..bf59addaf804 100644 --- a/arch/x86/crypto/serpent_avx2_glue.c +++ b/arch/x86/crypto/serpent_avx2_glue.c @@ -108,8 +108,9 @@ static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_avx2_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { @@ -122,9 +123,12 @@ static int __init serpent_avx2_init(void) return -ENODEV; } - return simd_register_skciphers_compat(serpent_algs, + ret = simd_register_skciphers_compat(serpent_algs, ARRAY_SIZE(serpent_algs), serpent_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit serpent_avx2_fini(void) diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c index 45713c7a4cb9..7b0c02a61552 100644 --- a/arch/x86/crypto/serpent_avx_glue.c +++ b/arch/x86/crypto/serpent_avx_glue.c @@ -114,6 +114,7 @@ static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -124,9 +125,12 @@ static int __init serpent_init(void) return -ENODEV; } - return simd_register_skciphers_compat(serpent_algs, + ret = simd_register_skciphers_compat(serpent_algs, ARRAY_SIZE(serpent_algs), serpent_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit serpent_exit(void) diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c index d8aa0d3fbf15..f82880ef6f10 100644 --- a/arch/x86/crypto/serpent_sse2_glue.c +++ b/arch/x86/crypto/serpent_sse2_glue.c @@ -116,6 +116,8 @@ static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)]; static int __init serpent_sse2_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -124,9 +126,12 @@ static int __init serpent_sse2_init(void) return -ENODEV; } - return simd_register_skciphers_compat(serpent_algs, + ret = simd_register_skciphers_compat(serpent_algs, ARRAY_SIZE(serpent_algs), 
serpent_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit serpent_sse2_exit(void) diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c index 475b9637a06d..532f07b05745 100644 --- a/arch/x86/crypto/sm3_avx_glue.c +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -125,6 +125,7 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init sm3_avx_mod_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -145,7 +146,10 @@ static int __init sm3_avx_mod_init(void) return -ENODEV; } - return crypto_register_shash(&sm3_avx_alg); + ret = crypto_register_shash(&sm3_avx_alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit sm3_avx_mod_exit(void) diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c index 3fe9e170b880..42819ee5d36d 100644 --- a/arch/x86/crypto/sm4_aesni_avx2_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c @@ -143,6 +143,7 @@ simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)]; static int __init sm4_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -161,9 +162,12 @@ static int __init sm4_init(void) return -ENODEV; } - return simd_register_skciphers_compat(sm4_aesni_avx2_skciphers, + ret = simd_register_skciphers_compat(sm4_aesni_avx2_skciphers, ARRAY_SIZE(sm4_aesni_avx2_skciphers), simd_sm4_aesni_avx2_skciphers); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit sm4_exit(void) diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c index 14ae012948ae..8a25376d341f 100644 --- a/arch/x86/crypto/sm4_aesni_avx_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx_glue.c @@ -461,8 +461,9 @@ simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)]; static int __init sm4_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX) || @@ -478,9 +479,12 @@ static int __init sm4_init(void) return -ENODEV; } - return simd_register_skciphers_compat(sm4_aesni_avx_skciphers, + ret = simd_register_skciphers_compat(sm4_aesni_avx_skciphers, ARRAY_SIZE(sm4_aesni_avx_skciphers), simd_sm4_aesni_avx_skciphers); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit sm4_exit(void) diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c index 044e4f92e2c0..ccf016bf6ef2 100644 --- a/arch/x86/crypto/twofish_avx_glue.c +++ b/arch/x86/crypto/twofish_avx_glue.c @@ -117,6 +117,7 @@ static struct simd_skcipher_alg *twofish_simd_algs[ARRAY_SIZE(twofish_algs)]; static int __init twofish_init(void) { const char *feature_name; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -126,9 +127,12 @@ static int __init twofish_init(void) return -ENODEV; } - return simd_register_skciphers_compat(twofish_algs, - ARRAY_SIZE(twofish_algs), - twofish_simd_algs); + ret = simd_register_skciphers_compat(twofish_algs, + ARRAY_SIZE(twofish_algs), + twofish_simd_algs); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit twofish_exit(void) diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c index 031ed290c755..5756b9cab982 100644 --- a/arch/x86/crypto/twofish_glue.c +++ b/arch/x86/crypto/twofish_glue.c @@ -92,10 +92,15 @@
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init twofish_glue_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - return crypto_register_alg(&alg); + ret = crypto_register_alg(&alg); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit twofish_glue_fini(void) diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c index 7e2a18e3abe7..2fde637b40c8 100644 --- a/arch/x86/crypto/twofish_glue_3way.c +++ b/arch/x86/crypto/twofish_glue_3way.c @@ -151,6 +151,8 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init twofish_3way_init(void) { + int ret; + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; @@ -162,8 +164,11 @@ static int __init twofish_3way_init(void) return -ENODEV; } - return crypto_register_skciphers(tf_skciphers, - ARRAY_SIZE(tf_skciphers)); + ret = crypto_register_skciphers(tf_skciphers, + ARRAY_SIZE(tf_skciphers)); + if (!ret) + pr_info("CPU-optimized crypto module loaded\n"); + return ret; } static void __exit twofish_3way_fini(void) From patchwork Wed Oct 12 21:59:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 13005477 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B789C43217 for ; Wed, 12 Oct 2022 22:01:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229769AbiJLWBu (ORCPT ); Wed, 12 Oct 2022 18:01:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229908AbiJLWAn (ORCPT ); Wed, 12 Oct 2022 18:00:43 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F58C7654B; Wed, 12 Oct 2022 15:00:22 -0700 (PDT) Received: from pps.filterd (m0150245.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CL88dt016394; Wed, 12 Oct 2022 22:00:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=cA+jiT1Ykbspsq4IfGAHBDwf5ujiMklaMuWz0ZTfqho=; b=JlnPlMbr7D+Jfwij5Q+naBSj+gztLGJQaolsbe0LmM+o9uGUjg4ry5uzr5m8YzyFwqCf qx07bS3/uudkyA1gfyVaZR9/evrdk/TEibHa9bHD7Oe/eyY3zxcOdxTqe7d1iFnfwxhf /KQJixQ7fDJY+XGT243GGKAMBGtI5DIaLFOF7sLcp63FNp24x1urqOoksGnmJKTMKlWa 2O1RNDHJk0hMqr+HwI2DfOpx+y5TgTjJrv8Wfgncvlqdu5AyKGWdH2eDULjQHS5ZaS2v GYt29VIX/q2dCrO0c8qRwNh9v6a7Tth+z53OmFhSzST/E5vzV8kdGrLcSk0EZTn72VQx 0g== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657j8a7s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:10 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 90B3713970; Wed, 12 Oct 2022 22:00:09 +0000 (UTC) Received: from 
adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 1E63D807DA5; Wed, 12 Oct 2022 22:00:09 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 17/19] crypto: x86 - standardize suboptimal prints Date: Wed, 12 Oct 2022 16:59:29 -0500 Message-Id: <20221012215931.3896-18-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: XKLvG6nOYC17q4IoVPFq8vnXPAREF7v6 X-Proofpoint-GUID: XKLvG6nOYC17q4IoVPFq8vnXPAREF7v6 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 suspectscore=0 malwarescore=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 impostorscore=0 priorityscore=1501 mlxscore=0 clxscore=1015 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Reword prints that the module is not being loaded (although it otherwise qualifies) because performance would be suboptimal on the particular CPU model. Although modules are not supposed to print unless they're loaded and active, this is an existing exception. Signed-off-by: Robert Elliott --- arch/x86/crypto/blowfish_glue.c | 5 +---- arch/x86/crypto/camellia_glue.c | 5 +---- arch/x86/crypto/des3_ede_glue.c | 2 +- arch/x86/crypto/twofish_glue_3way.c | 5 +---- 4 files changed, 4 insertions(+), 13 deletions(-) diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 27b7aed9a488..8d4ecf406dee 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -320,10 +320,7 @@ static int __init blowfish_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - printk(KERN_INFO - "blowfish-x86_64: performance on this CPU " - "would be suboptimal: disabling " - "blowfish-x86_64.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index 94dd2973bb47..002a1e84b277 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -1394,10 +1394,7 @@ static int __init camellia_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - printk(KERN_INFO - "camellia-x86_64: performance on this CPU " - "would be suboptimal: disabling " - "camellia-x86_64.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index 7b4dd02007ed..b38ad3ec38e2 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -371,7 +371,7 @@ static int __init des3_ede_x86_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - pr_info("des3_ede-x86_64: performance on this CPU would be suboptimal: disabling des3_ede-x86_64.\n"); + 
pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c index 2fde637b40c8..29c35d2aaeba 100644 --- a/arch/x86/crypto/twofish_glue_3way.c +++ b/arch/x86/crypto/twofish_glue_3way.c @@ -157,10 +157,7 @@ static int __init twofish_3way_init(void) return -ENODEV; if (!force && is_blacklisted_cpu()) { - printk(KERN_INFO - "twofish-x86_64-3way: performance on this CPU " - "would be suboptimal: disabling " - "twofish-x86_64-3way.\n"); + pr_info("CPU-optimized crypto module not loaded, crypto optimization performance on this CPU would be suboptimal\n"); return -ENODEV; } From patchwork Wed Oct 12 21:59:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 13005484 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECFEAC433FE for ; Wed, 12 Oct 2022 22:04:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229972AbiJLWEt (ORCPT ); Wed, 12 Oct 2022 18:04:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229978AbiJLWEB (ORCPT ); Wed, 12 Oct 2022 18:04:01 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B116A13954B; Wed, 12 Oct 2022 15:01:28 -0700 (PDT) Received: from pps.filterd (m0134425.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CL7Wtg016423; Wed, 12 Oct 2022 22:00:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=X7Go4viamzwOn3WUTwlo1LbW+Q0tptmA7a1g3x+mSls=; b=P89Qz77nTHr377CTaQ/w4cfXZnv9P2GTFzorUYCdAaLW32Qi5hAMB4PfaApAqOM+RWxF 8nI7zrA1czGGQBc3YGt2cA6JnciH+Gbw1GHXRFLLupGSzxZxweyF7CPlwRJkv7XvV8mB 8R0w32EgX5W8nusFomefeeoeWXOMzh6S5y3EwWzlliueSEDXbc1KsoVwXSI9ObAJispM U1I0Zx0SbTwNj+jh387WY1LA7nvqHRpUjOqIYINF2JnZYDV6t2E9xiDJncFAw/clHFX+ 53fDY8vxloTFZayxAiee7eslHFFtuyuCUeuvRhHI/QcM6F3AZwBxxhU1mPzNdLd5gjRE jw== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657c8adb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:11 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id C626129585; Wed, 12 Oct 2022 22:00:10 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 6DADB807DA5; Wed, 12 Oct 2022 22:00:10 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, 
linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 18/19] crypto: x86 - standardize not loaded prints Date: Wed, 12 Oct 2022 16:59:30 -0500 Message-Id: <20221012215931.3896-19-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: 82XBGjj4A8ErvKSGv2QYlO_RI8X_yhn1 X-Proofpoint-GUID: 82XBGjj4A8ErvKSGv2QYlO_RI8X_yhn1 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 clxscore=1015 impostorscore=0 suspectscore=0 bulkscore=0 adultscore=0 phishscore=0 spamscore=0 malwarescore=0 lowpriorityscore=0 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Standardize the prints that additional required CPU features are not present along with the main CPU features (e.g., OSXSAVE is not present along with AVX). Although modules are not supposed to print unless loaded and active, these are existing exceptions. Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 4 +++- arch/x86/crypto/aria_aesni_avx_glue.c | 4 ++-- arch/x86/crypto/camellia_aesni_avx2_glue.c | 5 +++-- arch/x86/crypto/camellia_aesni_avx_glue.c | 5 +++-- arch/x86/crypto/cast5_avx_glue.c | 3 ++- arch/x86/crypto/cast6_avx_glue.c | 3 ++- arch/x86/crypto/crc32-pclmul_glue.c | 4 +++- arch/x86/crypto/nhpoly1305-avx2-glue.c | 4 +++- arch/x86/crypto/serpent_avx2_glue.c | 8 +++++--- arch/x86/crypto/serpent_avx_glue.c | 3 ++- arch/x86/crypto/sm3_avx_glue.c | 7 ++++--- arch/x86/crypto/sm4_aesni_avx2_glue.c | 5 +++-- arch/x86/crypto/sm4_aesni_avx_glue.c | 5 +++-- arch/x86/crypto/twofish_avx_glue.c | 3 ++- 14 files changed, 40 insertions(+), 23 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index e8eaf79ef220..aa94b9f8703c 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -281,8 +281,10 @@ static int __init crypto_aegis128_aesni_module_init(void) if (!boot_cpu_has(X86_FEATURE_XMM2) || !boot_cpu_has(X86_FEATURE_AES) || - !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) + !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) { + pr_info("CPU-optimized crypto module not loaded, all required CPU features (SSE2, AESNI) not supported\n"); return -ENODEV; + } ret = simd_register_aeads_compat(&crypto_aegis128_aesni_alg, 1, &simd_alg); diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index d58fb995a266..24982450a125 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -176,13 +176,13 @@ static int __init aria_avx_init(void) if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AES-NI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU extended feature '%s' is 
not supported\n", feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c index e6c4ed1e40d2..bc6862077984 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -123,13 +123,14 @@ static int __init camellia_aesni_init(void) !boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AVX2, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index 6a9eadf0fe90..96e7e1accb6c 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -121,13 +121,14 @@ static int __init camellia_aesni_init(void) if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index b5ae17c3ac53..89650fffb550 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -114,7 +114,8 @@ static int __init cast5_init(void) if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index d1c14a5f80d7..d69f62ac9553 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -114,7 +114,8 @@ static int __init cast6_init(void) if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index c56d3d3ab0a0..4cf86f8f9428 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -192,8 +192,10 @@ static int __init crc32_pclmul_mod_init(void) { int ret; - if (!x86_match_cpu(module_cpu_ids)) + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU feature (PCLMULQDQ) not supported\n"); return -ENODEV; + } ret = crypto_register_shash(&alg); if (!ret) diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c index 
2dc7b618771f..834bf64bb160 100644 --- a/arch/x86/crypto/nhpoly1305-avx2-glue.c +++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c @@ -74,8 +74,10 @@ static int __init nhpoly1305_mod_init(void) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX2) || - !boot_cpu_has(X86_FEATURE_OSXSAVE)) + !boot_cpu_has(X86_FEATURE_OSXSAVE)) { + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX2, OSXSAVE) not supported\n"); return -ENODEV; + } ret = crypto_register_shash(&nhpoly1305_alg); if (!ret) diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c index bf59addaf804..4bd59ccea69a 100644 --- a/arch/x86/crypto/serpent_avx2_glue.c +++ b/arch/x86/crypto/serpent_avx2_glue.c @@ -112,13 +112,15 @@ static int __init serpent_avx2_init(void) return -ENODEV; - if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 instructions are not detected.\n"); + if (!boot_cpu_has(X86_FEATURE_AVX2) || + !boot_cpu_has(X86_FEATURE_OSXSAVE)) { + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX2, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c index 7b0c02a61552..853b48677d2b 100644 --- a/arch/x86/crypto/serpent_avx_glue.c +++ b/arch/x86/crypto/serpent_avx_glue.c @@ -121,7 +121,8 @@ static int __init serpent_init(void) if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c index 532f07b05745..5250fee79147 100644 --- a/arch/x86/crypto/sm3_avx_glue.c +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -131,18 +131,19 @@ static int __init sm3_avx_mod_init(void) return -ENODEV; if (!boot_cpu_has(X86_FEATURE_AVX)) { - pr_info("AVX instruction are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, required CPU feature (AVX) not supported\n"); return -ENODEV; } if (!boot_cpu_has(X86_FEATURE_BMI2)) { - pr_info("BMI2 instruction are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, required CPU feature (BMI2) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c index 42819ee5d36d..cdd7ca92ca61 100644 --- a/arch/x86/crypto/sm4_aesni_avx2_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c @@ -152,13 +152,14 @@ static int __init sm4_init(void) !boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX2 or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AVX2, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | 
XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c index 8a25376d341f..a2ae3d1e0a4a 100644 --- a/arch/x86/crypto/sm4_aesni_avx_glue.c +++ b/arch/x86/crypto/sm4_aesni_avx_glue.c @@ -468,13 +468,14 @@ static int __init sm4_init(void) if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { - pr_info("AVX or AES-NI instructions are not detected.\n"); + pr_info("CPU-optimized crypto module not loaded, all required CPU features (AVX, AESNI, OSXSAVE) not supported\n"); return -ENODEV; } if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c index ccf016bf6ef2..70167dd01816 100644 --- a/arch/x86/crypto/twofish_avx_glue.c +++ b/arch/x86/crypto/twofish_avx_glue.c @@ -123,7 +123,8 @@ static int __init twofish_init(void) return -ENODEV; if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { - pr_info("CPU feature '%s' is not supported.\n", feature_name); + pr_info("CPU-optimized crypto module not loaded, CPU extended feature '%s' is not supported\n", + feature_name); return -ENODEV; } From patchwork Wed Oct 12 21:59:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 13005482 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7831AC433FE for ; Wed, 12 Oct 2022 22:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230087AbiJLWDF (ORCPT ); Wed, 12 Oct 2022 18:03:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39094 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230094AbiJLWBj (ORCPT ); Wed, 12 Oct 2022 18:01:39 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8860610324C; Wed, 12 Oct 2022 15:00:27 -0700 (PDT) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CKWT9x015442; Wed, 12 Oct 2022 22:00:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=K0yyN1Z9ykifAfRg9N8WShvUGMpa8FIhYC0CUsfvRDw=; b=mVhez82LOF2jhuRvGSI27LUkKOAUZ1b+Edj+9zvfQvsFVE3ngyNXmhmblWCmkUoyVzXx P4Q8i6D3de3Hw9zHBntr2L1AoqeBKOMtukz5H+kYMbH/6Fnog9chP1WlXbydGN/tiLhv e/Yy/WdEbxii64ByH/EoxRuYQXLjHYzASqr2bx1q12AsL6z8wFOGDZdvumKL9GAnQFUb uXkO98EWWUSOgjlzsSqJVqiZftJdj6UXObapdVRgGkHh4XnIBp5ewl7GY3s1Pk5adaYW wA2bGojpRB/fXeN0ES2aO3wdXEjMdDxPO8qLsekzkH5tUHcGhI5ZpV/amfBi4QJPpP04 tQ== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com 
[16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k615t2eth-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:12 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 013CBD268; Wed, 12 Oct 2022 22:00:11 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 9B9058038C2; Wed, 12 Oct 2022 22:00:11 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 19/19] crypto: x86/sha - register only the best function Date: Wed, 12 Oct 2022 16:59:31 -0500 Message-Id: <20221012215931.3896-20-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: MtKhtP7F9kP6sMI-XKAtztT-zs2OFsZc X-Proofpoint-ORIG-GUID: MtKhtP7F9kP6sMI-XKAtztT-zs2OFsZc X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 phishscore=0 mlxlogscore=999 lowpriorityscore=0 suspectscore=0 adultscore=0 impostorscore=0 mlxscore=0 malwarescore=0 bulkscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Don't register and unregister each implementation from least- to most-optimized (SSSE3, then AVX, then AVX2); instead, determine the most-optimized usable implementation and register only that one.
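
For illustration, the selection pattern reduces to the following minimal sketch (sha_ni_alg, sha_avx2_alg, sha_avx_alg, and sha_ssse3_alg are hypothetical stand-ins for the per-file shash_alg structures; the actual patch additionally tracks per-variant *_registered flags and prints a one-line summary):

	static struct shash_alg *registered_alg;

	static int __init sha_glue_init(void)
	{
		/* Probe from most- to least-optimized; keep only the winner. */
		if (boot_cpu_has(X86_FEATURE_SHA_NI))
			registered_alg = &sha_ni_alg;
		else if (boot_cpu_has(X86_FEATURE_AVX2) &&
			 boot_cpu_has(X86_FEATURE_BMI1) &&
			 boot_cpu_has(X86_FEATURE_BMI2))
			registered_alg = &sha_avx2_alg;
		else if (boot_cpu_has(X86_FEATURE_AVX) &&
			 cpu_has_xfeatures(XFEATURE_MASK_SSE |
					   XFEATURE_MASK_YMM, NULL))
			registered_alg = &sha_avx_alg;
		else if (boot_cpu_has(X86_FEATURE_SSSE3))
			registered_alg = &sha_ssse3_alg;
		else
			return -ENODEV;

		return crypto_register_shash(registered_alg);
	}

	static void __exit sha_glue_exit(void)
	{
		/* Exactly one implementation was registered above. */
		crypto_unregister_shash(registered_alg);
	}

Because at most one implementation is ever registered, module exit no longer needs the cascading unregister calls that the old register_*/unregister_* helpers required.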
Suggested-by: Tim Chen Signed-off-by: Robert Elliott --- arch/x86/crypto/sha1_ssse3_glue.c | 139 ++++++++++++------------- arch/x86/crypto/sha256_ssse3_glue.c | 154 ++++++++++++++-------------- arch/x86/crypto/sha512_ssse3_glue.c | 120 ++++++++++++---------- 3 files changed, 210 insertions(+), 203 deletions(-) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index edffc33bd12e..90a86d737bcf 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -123,17 +123,16 @@ static struct shash_alg sha1_ssse3_alg = { } }; -static int register_sha1_ssse3(void) -{ - if (boot_cpu_has(X86_FEATURE_SSSE3)) - return crypto_register_shash(&sha1_ssse3_alg); - return 0; -} - +static bool sha1_ssse3_registered; +static bool sha1_avx_registered; +static bool sha1_avx2_registered; +static bool sha1_ni_registered; static void unregister_sha1_ssse3(void) { - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (sha1_ssse3_registered) { crypto_unregister_shash(&sha1_ssse3_alg); + sha1_ssse3_registered = 0; + } } asmlinkage void sha1_transform_avx(struct sha1_state *state, @@ -172,28 +171,12 @@ static struct shash_alg sha1_avx_alg = { } }; -static bool avx_usable(void) -{ - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { - if (boot_cpu_has(X86_FEATURE_AVX)) - pr_info("AVX detected but unusable.\n"); - return false; - } - - return true; -} - -static int register_sha1_avx(void) -{ - if (avx_usable()) - return crypto_register_shash(&sha1_avx_alg); - return 0; -} - static void unregister_sha1_avx(void) { - if (avx_usable()) + if (sha1_avx_registered) { crypto_unregister_shash(&sha1_avx_alg); + sha1_avx_registered = 0; + } } #define SHA1_AVX2_BLOCK_OPTSIZE 4 /* optimal 4*64 bytes of SHA1 blocks */ @@ -201,16 +184,6 @@ static void unregister_sha1_avx(void) asmlinkage void sha1_transform_avx2(struct sha1_state *state, const u8 *data, int blocks); -static bool avx2_usable(void) -{ - if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) - && boot_cpu_has(X86_FEATURE_BMI1) - && boot_cpu_has(X86_FEATURE_BMI2)) - return true; - - return false; -} - static void sha1_apply_transform_avx2(struct sha1_state *state, const u8 *data, int blocks) { @@ -254,17 +227,13 @@ static struct shash_alg sha1_avx2_alg = { } }; -static int register_sha1_avx2(void) -{ - if (avx2_usable()) - return crypto_register_shash(&sha1_avx2_alg); - return 0; -} static void unregister_sha1_avx2(void) { - if (avx2_usable()) + if (sha1_avx2_registered) { crypto_unregister_shash(&sha1_avx2_alg); + sha1_avx2_registered = 0; + } } #ifdef CONFIG_AS_SHA1_NI @@ -304,13 +273,6 @@ static struct shash_alg sha1_ni_alg = { } }; -static int register_sha1_ni(void) -{ - if (boot_cpu_has(X86_FEATURE_SHA_NI)) - return crypto_register_shash(&sha1_ni_alg); - return 0; -} - static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), @@ -322,44 +284,79 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static void unregister_sha1_ni(void) { - if (boot_cpu_has(X86_FEATURE_SHA_NI)) + if (sha1_ni_registered) { crypto_unregister_shash(&sha1_ni_alg); + sha1_ni_registered = 0; + } } #else -static inline int register_sha1_ni(void) { return 0; } static inline void unregister_sha1_ni(void) { } #endif static int __init sha1_ssse3_mod_init(void) { - if (register_sha1_ssse3()) - goto fail; + const char *feature_name; + const char *driver_name = NULL; + int ret; if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - if (register_sha1_avx()) { - 
unregister_sha1_ssse3(); - goto fail; - } + /* SHA-NI */ + if (boot_cpu_has(X86_FEATURE_SHA_NI)) { - if (register_sha1_avx2()) { - unregister_sha1_avx(); - unregister_sha1_ssse3(); - goto fail; - } + ret = crypto_register_shash(&sha1_ni_alg); + if (!ret) { + sha1_ni_registered = 1; + driver_name = sha1_ni_alg.base.cra_driver_name; + } - if (register_sha1_ni()) { - unregister_sha1_avx2(); - unregister_sha1_avx(); - unregister_sha1_ssse3(); - goto fail; + /* AVX2 */ + } else if (boot_cpu_has(X86_FEATURE_AVX2)) { + + if (boot_cpu_has(X86_FEATURE_BMI1) && + boot_cpu_has(X86_FEATURE_BMI2)) { + + ret = crypto_register_shash(&sha1_avx2_alg); + if (!ret) { + sha1_avx2_registered = 1; + driver_name = sha1_avx2_alg.base.cra_driver_name; + } + } else { + pr_info("AVX2-optimized version not engaged, all required CPU features (AVX2, BMI1, BMI2) not supported\n"); + } + + /* AVX */ + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + + if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + + ret = crypto_register_shash(&sha1_avx_alg); + if (!ret) { + sha1_avx_registered = 1; + driver_name = sha1_avx_alg.base.cra_driver_name; + } + } else { + pr_info("AVX-optimized version not engaged, CPU extended feature '%s' is not supported\n", + feature_name); + } + + /* SSSE3 */ + } else if (boot_cpu_has(X86_FEATURE_SSSE3)) { + ret = crypto_register_shash(&sha1_ssse3_alg); + if (!ret) { + sha1_ssse3_registered = 1; + driver_name = sha1_ssse3_alg.base.cra_driver_name; + } } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX=%s, AVX2=%s, SHA-NI=%s): driver=%s\n", + sha1_ssse3_registered ? "yes" : "no", + sha1_avx_registered ? "yes" : "no", + sha1_avx2_registered ? "yes" : "no", + sha1_ni_registered ? "yes" : "no", + driver_name); return 0; -fail: - return -ENODEV; } static void __exit sha1_ssse3_mod_fini(void) diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 8a0fb308fbba..cd7bf2b48f3d 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -150,19 +150,18 @@ static struct shash_alg sha256_ssse3_algs[] = { { } } }; -static int register_sha256_ssse3(void) -{ - if (boot_cpu_has(X86_FEATURE_SSSE3)) - return crypto_register_shashes(sha256_ssse3_algs, - ARRAY_SIZE(sha256_ssse3_algs)); - return 0; -} +static bool sha256_ssse3_registered; +static bool sha256_avx_registered; +static bool sha256_avx2_registered; +static bool sha256_ni_registered; static void unregister_sha256_ssse3(void) { - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (sha256_ssse3_registered) { crypto_unregister_shashes(sha256_ssse3_algs, ARRAY_SIZE(sha256_ssse3_algs)); + sha256_ssse3_registered = 0; + } } asmlinkage void sha256_transform_avx(struct sha256_state *state, @@ -215,30 +214,13 @@ static struct shash_alg sha256_avx_algs[] = { { } } }; -static bool avx_usable(void) -{ - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { - if (boot_cpu_has(X86_FEATURE_AVX)) - pr_info("AVX detected but unusable.\n"); - return false; - } - - return true; -} - -static int register_sha256_avx(void) -{ - if (avx_usable()) - return crypto_register_shashes(sha256_avx_algs, - ARRAY_SIZE(sha256_avx_algs)); - return 0; -} - static void unregister_sha256_avx(void) { - if (avx_usable()) + if (sha256_avx_registered) { crypto_unregister_shashes(sha256_avx_algs, ARRAY_SIZE(sha256_avx_algs)); + sha256_avx_registered = 0; + } } asmlinkage void sha256_transform_rorx(struct sha256_state *state, @@ -291,28 +273,13 @@ static struct shash_alg sha256_avx2_algs[] = { { } } }; -static bool avx2_usable(void) -{ - if 
(avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) && - boot_cpu_has(X86_FEATURE_BMI2)) - return true; - - return false; -} - -static int register_sha256_avx2(void) -{ - if (avx2_usable()) - return crypto_register_shashes(sha256_avx2_algs, - ARRAY_SIZE(sha256_avx2_algs)); - return 0; -} - static void unregister_sha256_avx2(void) { - if (avx2_usable()) + if (sha256_avx2_registered) { crypto_unregister_shashes(sha256_avx2_algs, ARRAY_SIZE(sha256_avx2_algs)); + sha256_avx2_registered = 0; + } } #ifdef CONFIG_AS_SHA256_NI @@ -375,55 +342,92 @@ static const struct x86_cpu_id module_cpu_ids[] = { }; MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); -static int register_sha256_ni(void) -{ - if (boot_cpu_has(X86_FEATURE_SHA_NI)) - return crypto_register_shashes(sha256_ni_algs, - ARRAY_SIZE(sha256_ni_algs)); - return 0; -} - static void unregister_sha256_ni(void) { - if (boot_cpu_has(X86_FEATURE_SHA_NI)) + if (sha256_ni_registered) { crypto_unregister_shashes(sha256_ni_algs, ARRAY_SIZE(sha256_ni_algs)); + sha256_ni_registered = 0; + } } #else -static inline int register_sha256_ni(void) { return 0; } static inline void unregister_sha256_ni(void) { } #endif static int __init sha256_ssse3_mod_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + const char *feature_name; + const char *driver_name = NULL; + const char *driver_name2 = NULL; + int ret; + + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU features (SSSE3, AVX, AVX2, or SHA-NI) not supported\n"); return -ENODEV; + } - if (register_sha256_ssse3()) - goto fail; + /* SHA-NI */ + if (boot_cpu_has(X86_FEATURE_SHA_NI)) { - if (register_sha256_avx()) { - unregister_sha256_ssse3(); - goto fail; - } + ret = crypto_register_shashes(sha256_ni_algs, + ARRAY_SIZE(sha256_ni_algs)); + if (!ret) { + sha256_ni_registered = 1; + driver_name = sha256_ni_algs[0].base.cra_driver_name; + driver_name2 = sha256_ni_algs[1].base.cra_driver_name; + } - if (register_sha256_avx2()) { - unregister_sha256_avx(); - unregister_sha256_ssse3(); - goto fail; - } + /* AVX2 */ + } else if (boot_cpu_has(X86_FEATURE_AVX2)) { + + if (boot_cpu_has(X86_FEATURE_BMI2)) { + ret = crypto_register_shashes(sha256_avx2_algs, + ARRAY_SIZE(sha256_avx2_algs)); + if (!ret) { + sha256_avx2_registered = 1; + driver_name = sha256_avx2_algs[0].base.cra_driver_name; + driver_name2 = sha256_avx2_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX2-optimized version not engaged, all required CPU features (AVX2, BMI2) not supported\n"); + } - if (register_sha256_ni()) { - unregister_sha256_avx2(); - unregister_sha256_avx(); - unregister_sha256_ssse3(); - goto fail; + /* AVX */ + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + + if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + ret = crypto_register_shashes(sha256_avx_algs, + ARRAY_SIZE(sha256_avx_algs)); + if (!ret) { + sha256_avx_registered = 1; + driver_name = sha256_avx_algs[0].base.cra_driver_name; + driver_name2 = sha256_avx_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX-optimized version not engaged, CPU extended feature '%s' is not supported\n", + feature_name); + } + + /* SSSE3 */ + } else if (boot_cpu_has(X86_FEATURE_SSSE3)) { + ret = crypto_register_shashes(sha256_ssse3_algs, + ARRAY_SIZE(sha256_ssse3_algs)); + if (!ret) { + sha256_ssse3_registered = 1; + driver_name = sha256_ssse3_algs[0].base.cra_driver_name; + driver_name2 = sha256_ssse3_algs[1].base.cra_driver_name; + } } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX=%s, 
AVX2=%s, SHA-NI=%s): drivers=%s, %s\n", + sha256_ssse3_registered ? "yes" : "no", + sha256_avx_registered ? "yes" : "no", + sha256_avx2_registered ? "yes" : "no", + sha256_ni_registered ? "yes" : "no", + driver_name, driver_name2); return 0; -fail: - return -ENODEV; } static void __exit sha256_ssse3_mod_fini(void) diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c index fd5075a32613..df9f8207cc79 100644 --- a/arch/x86/crypto/sha512_ssse3_glue.c +++ b/arch/x86/crypto/sha512_ssse3_glue.c @@ -149,33 +149,21 @@ static struct shash_alg sha512_ssse3_algs[] = { { } } }; -static int register_sha512_ssse3(void) -{ - if (boot_cpu_has(X86_FEATURE_SSSE3)) - return crypto_register_shashes(sha512_ssse3_algs, - ARRAY_SIZE(sha512_ssse3_algs)); - return 0; -} +static bool sha512_ssse3_registered; +static bool sha512_avx_registered; +static bool sha512_avx2_registered; static void unregister_sha512_ssse3(void) { - if (boot_cpu_has(X86_FEATURE_SSSE3)) + if (sha512_ssse3_registered) { crypto_unregister_shashes(sha512_ssse3_algs, ARRAY_SIZE(sha512_ssse3_algs)); + sha512_ssse3_registered = 0; + } } asmlinkage void sha512_transform_avx(struct sha512_state *state, const u8 *data, int blocks); -static bool avx_usable(void) -{ - if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { - if (boot_cpu_has(X86_FEATURE_AVX)) - pr_info("AVX detected but unusable.\n"); - return false; - } - - return true; -} static int sha512_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) @@ -225,19 +213,13 @@ static struct shash_alg sha512_avx_algs[] = { { } } }; -static int register_sha512_avx(void) -{ - if (avx_usable()) - return crypto_register_shashes(sha512_avx_algs, - ARRAY_SIZE(sha512_avx_algs)); - return 0; -} - static void unregister_sha512_avx(void) { - if (avx_usable()) + if (sha512_avx_registered) { crypto_unregister_shashes(sha512_avx_algs, ARRAY_SIZE(sha512_avx_algs)); + sha512_avx_registered = 0; + } } asmlinkage void sha512_transform_rorx(struct sha512_state *state, @@ -291,22 +273,6 @@ static struct shash_alg sha512_avx2_algs[] = { { } } }; -static bool avx2_usable(void) -{ - if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2) && - boot_cpu_has(X86_FEATURE_BMI2)) - return true; - - return false; -} - -static int register_sha512_avx2(void) -{ - if (avx2_usable()) - return crypto_register_shashes(sha512_avx2_algs, - ARRAY_SIZE(sha512_avx2_algs)); - return 0; -} static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), @@ -317,33 +283,73 @@ MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static void unregister_sha512_avx2(void) { - if (avx2_usable()) + if (sha512_avx2_registered) { crypto_unregister_shashes(sha512_avx2_algs, ARRAY_SIZE(sha512_avx2_algs)); + sha512_avx2_registered = 0; + } } static int __init sha512_ssse3_mod_init(void) { - if (!x86_match_cpu(module_cpu_ids)) + const char *feature_name; + const char *driver_name = NULL; + const char *driver_name2 = NULL; + int ret; + + if (!x86_match_cpu(module_cpu_ids)) { + pr_info("CPU-optimized crypto module not loaded, required CPU features (SSSE3, AVX, or AVX2) not supported\n"); return -ENODEV; + } - if (register_sha512_ssse3()) - goto fail; + /* AVX2 */ + if (boot_cpu_has(X86_FEATURE_AVX2)) { + if (boot_cpu_has(X86_FEATURE_BMI2)) { + ret = crypto_register_shashes(sha512_avx2_algs, + ARRAY_SIZE(sha512_avx2_algs)); + if (!ret) { + sha512_avx2_registered = 1; + driver_name = sha512_avx2_algs[0].base.cra_driver_name; + 
driver_name2 = sha512_avx2_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX2-optimized version not engaged, all required CPU features (AVX2, BMI2) not supported\n"); + } - if (register_sha512_avx()) { - unregister_sha512_ssse3(); - goto fail; - } + /* AVX */ + } else if (boot_cpu_has(X86_FEATURE_AVX)) { + + if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + ret = crypto_register_shashes(sha512_avx_algs, + ARRAY_SIZE(sha512_avx_algs)); + if (!ret) { + sha512_avx_registered = 1; + driver_name = sha512_avx_algs[0].base.cra_driver_name; + driver_name2 = sha512_avx_algs[1].base.cra_driver_name; + } + } else { + pr_info("AVX-optimized version not engaged, CPU extended feature '%s' is not supported\n", + feature_name); + } - if (register_sha512_avx2()) { - unregister_sha512_avx(); - unregister_sha512_ssse3(); - goto fail; + /* SSSE3 */ + } else if (boot_cpu_has(X86_FEATURE_SSSE3)) { + ret = crypto_register_shashes(sha512_ssse3_algs, + ARRAY_SIZE(sha512_ssse3_algs)); + if (!ret) { + sha512_ssse3_registered = 1; + driver_name = sha512_ssse3_algs[0].base.cra_driver_name; + driver_name2 = sha512_ssse3_algs[1].base.cra_driver_name; + } } + pr_info("CPU-optimized crypto module loaded (SSSE3=%s, AVX=%s, AVX2=%s): drivers=%s, %s\n", + sha512_ssse3_registered ? "yes" : "no", + sha512_avx_registered ? "yes" : "no", + sha512_avx2_registered ? "yes" : "no", + driver_name, driver_name2); return 0; -fail: - return -ENODEV; } static void __exit sha512_ssse3_mod_fini(void)