From patchwork Thu Feb 6 07:20:56 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Sridhar, Kanchana P"
X-Patchwork-Id: 13962459
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
	yosry.ahmed@linux.dev, nphamcs@gmail.com, chengming.zhou@linux.dev,
	usamaarif642@gmail.com, ryan.roberts@arm.com, 21cnbao@gmail.com,
	akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
	herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
	ardb@kernel.org, ebiggers@google.com, surenb@google.com,
	kristen.c.accardi@intel.com
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
	kanchana.p.sridhar@intel.com
Subject: [PATCH v6 10/16] crypto: iaa - Descriptor allocation timeouts with mitigations in iaa_crypto.
Date: Wed, 5 Feb 2025 23:20:56 -0800
Message-Id: <20250206072102.29045-11-kanchana.p.sridhar@intel.com>
In-Reply-To: <20250206072102.29045-1-kanchana.p.sridhar@intel.com>
References: <20250206072102.29045-1-kanchana.p.sridhar@intel.com>
MIME-Version: 1.0

This patch modifies the descriptor allocation from blocking to
non-blocking with bounded retries or "timeouts". This is necessary to
prevent task blocked errors with higher compress/decompress batch-sizes
in high contention scenarios, for instance, when the platform has only
1 IAA device enabled. With 1 IAA device enabled per package on a
dual-package SPR with 56 cores/package, there are 112 logical cores
mapped to this single IAA device. In this scenario, the task blocked
errors can occur because idxd_alloc_desc() is called with IDXD_OP_BLOCK.
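The bounded-retry pattern the patch switches to can be sketched in
plain user-space C. This is a minimal sketch, not driver code:
alloc_desc_nonblock() and alloc_desc_with_retries() are hypothetical
stand-ins for idxd_alloc_desc(wq, IDXD_OP_NONBLOCK) and the retry loop
added to iaa_compress()/iaa_decompress(), with the contended pool
simulated by a counter.

```c
#include <stddef.h>

#define IAA_ALLOC_DESC_COMP_TIMEOUT 1000	/* retry bound, as in the patch */

static int busy_calls = 3;	/* stub: pretend the pool is contended 3 times */

/* Hypothetical stand-in for idxd_alloc_desc(wq, IDXD_OP_NONBLOCK):
 * returns NULL while the pool is busy (the kernel code would see
 * ERR_PTR(-EAGAIN)), and a valid pointer once a descriptor frees up. */
static void *alloc_desc_nonblock(void)
{
	static int token;

	if (busy_calls > 0) {
		busy_calls--;
		return NULL;
	}
	return &token;
}

/*
 * Bounded retry loop mirroring the patch: poll the non-blocking
 * allocator instead of blocking, and give up after max_retries tries.
 * Returns the number of attempts used, or 0 on timeout.
 */
static int alloc_desc_with_retries(int max_retries, void **out)
{
	void *desc = NULL;
	int attempts = 0;

	while (!desc && attempts < max_retries) {
		attempts++;
		desc = alloc_desc_nonblock();
		/* cpu_relax() would go here in kernel context */
	}
	*out = desc;
	return desc ? attempts : 0;
}
```

With three simulated busy polls, the allocation succeeds on the fourth
attempt; with a budget smaller than the contention window it would
return 0 and the caller would fail the request instead of hanging.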
Any process that is able to obtain IAA_CRYPTO_MAX_BATCH_SIZE (8UL)
descriptors will cause contention for allocating descriptors for all
other processes. Under IDXD_OP_BLOCK, this can cause
compress/decompress jobs to stall in stress test scenarios
(e.g. zswap_store() of 2M folios).

In order to make the iaa_crypto driver more fail-safe, this commit
implements the following:

1) Change compress/decompress descriptor allocations to be non-blocking
   with retries ("timeouts").
2) Return a compress error to zswap if descriptor allocation with
   timeouts fails during compress ops. zswap_store() will return an
   error and the folio gets stored in the backing swap device.
3) Fall back to software decompress if descriptor allocation with
   timeouts fails during decompress ops.
4) Bug fixes for freeing the descriptor consistently in all error cases.

With these fixes, there are no task blocked errors seen under stress
testing conditions, and no performance degradation observed.

Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto.h      |  3 +
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 81 +++++++++++++---------
 2 files changed, 51 insertions(+), 33 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto.h b/drivers/crypto/intel/iaa/iaa_crypto.h
index c46c70ecf355..b175e9065025 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto.h
+++ b/drivers/crypto/intel/iaa/iaa_crypto.h
@@ -21,6 +21,9 @@
 
 #define IAA_COMPLETION_TIMEOUT		1000000
 
+#define IAA_ALLOC_DESC_COMP_TIMEOUT	1000
+#define IAA_ALLOC_DESC_DECOMP_TIMEOUT	500
+
 #define IAA_ANALYTICS_ERROR	0x0a
 #define IAA_ERROR_DECOMP_BUF_OVERFLOW	0x0b
 #define IAA_ERROR_COMP_BUF_OVERFLOW	0x19
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 4ca9028d6050..5292d8f7ebd6 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -1406,6 +1406,7 @@ static int deflate_generic_decompress(struct acomp_req *req)
 	void *src, *dst;
 	int ret;
 
+	req->dlen = PAGE_SIZE;
 	src = kmap_local_page(sg_page(req->src)) + req->src->offset;
 	dst = kmap_local_page(sg_page(req->dst)) + req->dst->offset;
 
@@ -1469,7 +1470,8 @@ static int iaa_compress_verify(struct crypto_tfm *tfm, struct acomp_req *req,
 	struct iaa_device_compression_mode *active_compression_mode;
 	struct iaa_compression_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct iaa_device *iaa_device;
-	struct idxd_desc *idxd_desc;
+	struct idxd_desc *idxd_desc = ERR_PTR(-EAGAIN);
+	int alloc_desc_retries = 0;
 	struct iax_hw_desc *desc;
 	struct idxd_device *idxd;
 	struct iaa_wq *iaa_wq;
@@ -1485,7 +1487,11 @@ static int iaa_compress_verify(struct crypto_tfm *tfm, struct acomp_req *req,
 
 	active_compression_mode = get_iaa_device_compression_mode(iaa_device, ctx->mode);
 
-	idxd_desc = idxd_alloc_desc(wq, IDXD_OP_BLOCK);
+	while ((idxd_desc == ERR_PTR(-EAGAIN)) && (alloc_desc_retries++ < IAA_ALLOC_DESC_DECOMP_TIMEOUT)) {
+		idxd_desc = idxd_alloc_desc(wq, IDXD_OP_NONBLOCK);
+		cpu_relax();
+	}
+
 	if (IS_ERR(idxd_desc)) {
 		dev_dbg(dev, "idxd descriptor allocation failed\n");
 		dev_dbg(dev, "iaa compress failed: ret=%ld\n",
@@ -1661,7 +1667,8 @@ static int iaa_compress(struct crypto_tfm *tfm, struct acomp_req *req,
 	struct iaa_device_compression_mode *active_compression_mode;
 	struct iaa_compression_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct iaa_device *iaa_device;
-	struct idxd_desc *idxd_desc;
+	struct idxd_desc *idxd_desc = ERR_PTR(-EAGAIN);
+	int alloc_desc_retries = 0;
 	struct iax_hw_desc *desc;
 	struct idxd_device *idxd;
 	struct iaa_wq *iaa_wq;
@@ -1677,7 +1684,11 @@ static int iaa_compress(struct crypto_tfm *tfm, struct acomp_req *req,
 
 	active_compression_mode = get_iaa_device_compression_mode(iaa_device, ctx->mode);
 
-	idxd_desc = idxd_alloc_desc(wq, IDXD_OP_BLOCK);
+	while ((idxd_desc == ERR_PTR(-EAGAIN)) && (alloc_desc_retries++ < IAA_ALLOC_DESC_COMP_TIMEOUT)) {
+		idxd_desc = idxd_alloc_desc(wq, IDXD_OP_NONBLOCK);
+		cpu_relax();
+	}
+
 	if (IS_ERR(idxd_desc)) {
 		dev_dbg(dev, "idxd descriptor allocation failed\n");
 		dev_dbg(dev, "iaa compress failed: ret=%ld\n",
 			PTR_ERR(idxd_desc));
@@ -1753,15 +1764,10 @@ static int iaa_compress(struct crypto_tfm *tfm, struct acomp_req *req,
 
 	*compression_crc = idxd_desc->iax_completion->crc;
 
-	if (!ctx->async_mode || disable_async)
-		idxd_free_desc(wq, idxd_desc);
-out:
-	return ret;
 err:
 	idxd_free_desc(wq, idxd_desc);
-	dev_dbg(dev, "iaa compress failed: ret=%d\n", ret);
-
-	goto out;
+out:
+	return ret;
 }
 
 static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
@@ -1773,7 +1779,8 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 	struct iaa_device_compression_mode *active_compression_mode;
 	struct iaa_compression_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct iaa_device *iaa_device;
-	struct idxd_desc *idxd_desc;
+	struct idxd_desc *idxd_desc = ERR_PTR(-EAGAIN);
+	int alloc_desc_retries = 0;
 	struct iax_hw_desc *desc;
 	struct idxd_device *idxd;
 	struct iaa_wq *iaa_wq;
@@ -1789,12 +1796,18 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 
 	active_compression_mode = get_iaa_device_compression_mode(iaa_device, ctx->mode);
 
-	idxd_desc = idxd_alloc_desc(wq, IDXD_OP_BLOCK);
+	while ((idxd_desc == ERR_PTR(-EAGAIN)) && (alloc_desc_retries++ < IAA_ALLOC_DESC_DECOMP_TIMEOUT)) {
+		idxd_desc = idxd_alloc_desc(wq, IDXD_OP_NONBLOCK);
+		cpu_relax();
+	}
+
 	if (IS_ERR(idxd_desc)) {
 		dev_dbg(dev, "idxd descriptor allocation failed\n");
 		dev_dbg(dev, "iaa decompress failed: ret=%ld\n",
 			PTR_ERR(idxd_desc));
-		return PTR_ERR(idxd_desc);
+		ret = PTR_ERR(idxd_desc);
+		idxd_desc = NULL;
+		goto fallback_software_decomp;
 	}
 
 	desc = idxd_desc->iax_hw;
@@ -1837,7 +1850,7 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 	ret = idxd_submit_desc(wq, idxd_desc);
 	if (ret) {
 		dev_dbg(dev, "submit_desc failed ret=%d\n", ret);
-		goto err;
+		goto fallback_software_decomp;
 	}
 
 	/* Update stats */
@@ -1851,19 +1864,20 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 	}
 
 	ret = check_completion(dev, idxd_desc->iax_completion, false, false);
+
+fallback_software_decomp:
 	if (ret) {
-		dev_dbg(dev, "%s: check_completion failed ret=%d\n", __func__, ret);
-		if (idxd_desc->iax_completion->status == IAA_ANALYTICS_ERROR) {
+		dev_dbg(dev, "%s: desc allocation/submission/check_completion failed ret=%d\n", __func__, ret);
+		if (idxd_desc && idxd_desc->iax_completion->status == IAA_ANALYTICS_ERROR) {
 			pr_warn("%s: falling back to deflate-generic decompress, "
 				"analytics error code %x\n", __func__,
 				idxd_desc->iax_completion->error_code);
-			ret = deflate_generic_decompress(req);
-			if (ret) {
-				dev_dbg(dev, "%s: deflate-generic failed ret=%d\n",
-					__func__, ret);
-				goto err;
-			}
-		} else {
+		}
+
+		ret = deflate_generic_decompress(req);
+
+		if (ret) {
+			pr_err("%s: iaa decompress failed: fallback to deflate-generic software decompress error ret=%d\n", __func__, ret);
 			goto err;
 		}
 	} else {
@@ -1872,19 +1886,15 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 	}
 
 	*dlen = req->dlen;
 
-	if (!ctx->async_mode || disable_async)
-		idxd_free_desc(wq, idxd_desc);
-
 	/* Update stats */
 	update_total_decomp_bytes_in(slen);
 	update_wq_decomp_bytes(wq, slen);
+
+err:
+	if (idxd_desc)
+		idxd_free_desc(wq, idxd_desc);
 out:
 	return ret;
-err:
-	idxd_free_desc(wq, idxd_desc);
-	dev_dbg(dev, "iaa decompress failed: ret=%d\n", ret);
-
-	goto out;
 }
 
 static int iaa_comp_acompress(struct acomp_req *req)
@@ -2375,6 +2385,7 @@ static bool iaa_comp_acompress_batch(
 			if (errors[i] != -EINPROGRESS) {
 				errors[i] = -EINVAL;
 				err = -EINVAL;
+				pr_debug("%s desc was not submitted\n", __func__);
 			} else {
 				errors[i] = -EAGAIN;
 			}
@@ -2521,9 +2532,13 @@ static bool iaa_comp_adecompress_batch(
 		} else {
 			errors[i] = iaa_comp_adecompress(reqs[i]);
 
+			/*
+			 * If it failed desc allocation/submission, errors[i] can
+			 * be 0 or error value from software decompress.
+			 */
 			if (errors[i] != -EINPROGRESS) {
-				errors[i] = -EINVAL;
 				err = -EINVAL;
+				pr_debug("%s desc was not submitted\n", __func__);
 			} else {
 				errors[i] = -EAGAIN;
 			}
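
The decompress error-handling shape the patch introduces (descriptor
allocation, submission, and completion failures all funnel through one
software-fallback point before any error reaches the caller) can be
sketched outside the kernel. This is a minimal user-space sketch:
hw_decompress() and sw_decompress() are hypothetical stubs, with
sw_decompress() standing in for deflate_generic_decompress(), and hw_ok
simulating any hardware-path failure.

```c
#include <string.h>

/* Hypothetical stub state: hw_ok = 0 simulates a descriptor-allocation,
 * submission, or check_completion failure on the hardware path. */
static int hw_ok = 0;

/* Hypothetical stand-in for the IAA hardware decompress path. */
static int hw_decompress(const char *src, char *dst, size_t n)
{
	if (!hw_ok)
		return -1;	/* any hardware-path failure */
	memcpy(dst, src, n);
	return 0;
}

/* Stand-in for the deflate_generic_decompress() software fallback. */
static int sw_decompress(const char *src, char *dst, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Mirrors the patch's control flow: every hardware-path failure jumps
 * to a single fallback point, so the caller only sees an error if the
 * software fallback also fails.
 */
static int decompress(const char *src, char *dst, size_t n)
{
	if (hw_decompress(src, dst, n) == 0)
		return 0;
	/* fallback_software_decomp: */
	return sw_decompress(src, dst, n);
}
```

The single fallback label keeps descriptor cleanup in one place as
well, which is what lets the patch free the descriptor consistently in
all error cases.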