From patchwork Sat Dec 21 06:31:18 2024
X-Patchwork-Submitter: Kanchana P Sridhar
X-Patchwork-Id: 13917686
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
 usamaarif642@gmail.com, ryan.roberts@arm.com, 21cnbao@gmail.com,
 akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
 kanchana.p.sridhar@intel.com
Subject: [PATCH v5 11/12] mm: zswap: Restructure & simplify zswap_store() to
 make it amenable for batching.
Date: Fri, 20 Dec 2024 22:31:18 -0800
Message-Id: <20241221063119.29140-12-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241221063119.29140-1-kanchana.p.sridhar@intel.com>
References: <20241221063119.29140-1-kanchana.p.sridhar@intel.com>

This patch introduces zswap_store_folio(), which performs for every page
in a folio the work that zswap_store_page() previously did for a single
page. This allows us to move the loop over the folio's pages from
zswap_store() to zswap_store_folio().

A distinct zswap_compress_folio() is also added; it simply calls
zswap_compress() for each page in the folio it is called with.

zswap_store_folio() starts by allocating all zswap entries required to
store the folio. Next, it calls zswap_compress_folio(), and finally it
adds the entries to the xarray and LRU.

The error handling and cleanup required for all failure scenarios that can
occur while storing a folio in zswap are now consolidated under a
"store_folio_failed" label in zswap_store_folio().

These changes facilitate developing support for compress batching in
zswap_store_folio().

Signed-off-by: Kanchana P Sridhar
---
 mm/zswap.c | 183 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 116 insertions(+), 67 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 99cd78891fd0..1be0f1807bfc 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1467,77 +1467,129 @@ static void shrink_worker(struct work_struct *w)
 * main API
 **********************************/
 
-static ssize_t zswap_store_page(struct page *page,
-				struct obj_cgroup *objcg,
-				struct zswap_pool *pool)
+static bool zswap_compress_folio(struct folio *folio,
+				 struct zswap_entry *entries[],
+				 struct zswap_pool *pool)
 {
-	swp_entry_t page_swpentry = page_swap_entry(page);
-	struct zswap_entry *entry, *old;
+	long index, nr_pages = folio_nr_pages(folio);
 
-	/* allocate entry */
-	entry = zswap_entry_cache_alloc(GFP_KERNEL, page_to_nid(page));
-	if (!entry) {
-		zswap_reject_kmemcache_fail++;
-		return -EINVAL;
+	for (index = 0; index < nr_pages; ++index) {
+		struct page *page = folio_page(folio, index);
+
+		if (!zswap_compress(page, entries[index], pool))
+			return false;
 	}
 
-	if (!zswap_compress(page, entry, pool))
-		goto compress_failed;
+	return true;
+}
 
-	old = xa_store(swap_zswap_tree(page_swpentry),
-		       swp_offset(page_swpentry),
-		       entry, GFP_KERNEL);
-	if (xa_is_err(old)) {
-		int err = xa_err(old);
+/*
+ * Store all pages in a folio.
+ *
+ * The error handling from all failure points is consolidated to the
+ * "store_folio_failed" label, based on the initialization of the zswap entries'
+ * handles to ERR_PTR(-EINVAL) at allocation time, and the fact that the
+ * entry's handle is subsequently modified only upon a successful zpool_malloc()
+ * after the page is compressed.
+ */
+static ssize_t zswap_store_folio(struct folio *folio,
+				 struct obj_cgroup *objcg,
+				 struct zswap_pool *pool)
+{
+	long index, nr_pages = folio_nr_pages(folio);
+	struct zswap_entry **entries = NULL;
+	int node_id = folio_nid(folio);
+	size_t compressed_bytes = 0;
 
-		WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
-		zswap_reject_alloc_fail++;
-		goto store_failed;
+	entries = kmalloc(nr_pages * sizeof(*entries), GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	/* allocate entries */
+	for (index = 0; index < nr_pages; ++index) {
+		entries[index] = zswap_entry_cache_alloc(GFP_KERNEL, node_id);
+
+		if (!entries[index]) {
+			zswap_reject_kmemcache_fail++;
+			nr_pages = index;
+			goto store_folio_failed;
+		}
+
+		entries[index]->handle = (unsigned long)ERR_PTR(-EINVAL);
 	}
 
-	/*
-	 * We may have had an existing entry that became stale when
-	 * the folio was redirtied and now the new version is being
-	 * swapped out. Get rid of the old.
-	 */
-	if (old)
-		zswap_entry_free(old);
+	if (!zswap_compress_folio(folio, entries, pool))
+		goto store_folio_failed;
 
-	/*
-	 * The entry is successfully compressed and stored in the tree, there is
-	 * no further possibility of failure. Grab refs to the pool and objcg.
-	 * These refs will be dropped by zswap_entry_free() when the entry is
-	 * removed from the tree.
-	 */
-	zswap_pool_get(pool);
-	if (objcg)
-		obj_cgroup_get(objcg);
+	for (index = 0; index < nr_pages; ++index) {
+		swp_entry_t page_swpentry = page_swap_entry(folio_page(folio, index));
+		struct zswap_entry *old, *entry = entries[index];
+
+		old = xa_store(swap_zswap_tree(page_swpentry),
+			       swp_offset(page_swpentry),
+			       entry, GFP_KERNEL);
+		if (xa_is_err(old)) {
+			int err = xa_err(old);
+
+			WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
+			zswap_reject_alloc_fail++;
+			goto store_folio_failed;
+		}
 
-	/*
-	 * We finish initializing the entry while it's already in xarray.
-	 * This is safe because:
-	 *
-	 * 1. Concurrent stores and invalidations are excluded by folio lock.
-	 *
-	 * 2. Writeback is excluded by the entry not being on the LRU yet.
-	 *    The publishing order matters to prevent writeback from seeing
-	 *    an incoherent entry.
-	 */
-	entry->pool = pool;
-	entry->swpentry = page_swpentry;
-	entry->objcg = objcg;
-	entry->referenced = true;
-	if (entry->length) {
-		INIT_LIST_HEAD(&entry->lru);
-		zswap_lru_add(&zswap_list_lru, entry);
+		/*
+		 * We may have had an existing entry that became stale when
+		 * the folio was redirtied and now the new version is being
+		 * swapped out. Get rid of the old.
+		 */
+		if (old)
+			zswap_entry_free(old);
+
+		/*
+		 * The entry is successfully compressed and stored in the tree, there is
+		 * no further possibility of failure. Grab refs to the pool and objcg.
+		 * These refs will be dropped by zswap_entry_free() when the entry is
+		 * removed from the tree.
+		 */
+		zswap_pool_get(pool);
+		if (objcg)
+			obj_cgroup_get(objcg);
+
+		/*
+		 * We finish initializing the entry while it's already in xarray.
+		 * This is safe because:
+		 *
+		 * 1. Concurrent stores and invalidations are excluded by folio lock.
+		 *
+		 * 2. Writeback is excluded by the entry not being on the LRU yet.
+		 *    The publishing order matters to prevent writeback from seeing
+		 *    an incoherent entry.
+		 */
+		entry->pool = pool;
+		entry->swpentry = page_swpentry;
+		entry->objcg = objcg;
+		entry->referenced = true;
+		if (entry->length) {
+			INIT_LIST_HEAD(&entry->lru);
+			zswap_lru_add(&zswap_list_lru, entry);
+		}
+
+		compressed_bytes += entry->length;
 	}
 
-	return entry->length;
+	kfree(entries);
+
+	return compressed_bytes;
+
+store_folio_failed:
+	for (index = 0; index < nr_pages; ++index) {
+		if (!IS_ERR_VALUE(entries[index]->handle))
+			zpool_free(pool->zpool, entries[index]->handle);
+
+		zswap_entry_cache_free(entries[index]);
+	}
+
+	kfree(entries);
 
-store_failed:
-	zpool_free(pool->zpool, entry->handle);
-compress_failed:
-	zswap_entry_cache_free(entry);
 	return -EINVAL;
 }
 
@@ -1549,8 +1601,8 @@ bool zswap_store(struct folio *folio)
 	struct mem_cgroup *memcg = NULL;
 	struct zswap_pool *pool;
 	size_t compressed_bytes = 0;
+	ssize_t bytes;
 	bool ret = false;
-	long index;
 
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
 	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
@@ -1584,15 +1636,11 @@ bool zswap_store(struct folio *folio)
 		mem_cgroup_put(memcg);
 	}
 
-	for (index = 0; index < nr_pages; ++index) {
-		struct page *page = folio_page(folio, index);
-		ssize_t bytes;
+	bytes = zswap_store_folio(folio, objcg, pool);
+	if (bytes < 0)
+		goto put_pool;
 
-		bytes = zswap_store_page(page, objcg, pool);
-		if (bytes < 0)
-			goto put_pool;
-		compressed_bytes += bytes;
-	}
+	compressed_bytes = bytes;
 
 	if (objcg) {
 		obj_cgroup_charge_zswap(objcg, compressed_bytes);
@@ -1622,6 +1670,7 @@ bool zswap_store(struct folio *folio)
 		pgoff_t offset = swp_offset(swp);
 		struct zswap_entry *entry;
 		struct xarray *tree;
+		long index;
 
 		for (index = 0; index < nr_pages; ++index) {
 			tree = swap_zswap_tree(swp_entry(type, offset + index));
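
For readers who want to try the cleanup idea outside the kernel, here is a
minimal userspace sketch of the pattern the comment above zswap_store_folio()
describes: allocate every entry up front, mark each handle with a sentinel,
and let a single failure label free only the handles that were actually set.
This is an illustration under stated assumptions, not code from mm/zswap.c.
All names in it (struct item, HANDLE_UNSET, item_fill, store_batch,
release_batch, live_items, live_buffers) are hypothetical, malloc()/free()
stand in for the entry cache and zpool, and a negative input length stands in
for a compression failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sentinel; plays the role of ERR_PTR(-EINVAL) in the patch. */
#define HANDLE_UNSET UINTPTR_MAX

struct item {
	uintptr_t handle;	/* backing buffer, or HANDLE_UNSET */
	size_t length;
};

/* Counters that let a caller check that cleanup was balanced. */
int live_items;
int live_buffers;

/* Stand-in for compression: fails on a negative length. */
static int item_fill(struct item *it, long len)
{
	void *buf;

	if (len < 0)
		return -1;
	buf = malloc(len ? (size_t)len : 1);
	if (!buf)
		return -1;
	live_buffers++;
	/* The handle changes only after the allocation has succeeded. */
	it->handle = (uintptr_t)buf;
	it->length = (size_t)len;
	return 0;
}

/* Free n items, releasing a buffer only where the handle was set. */
void release_batch(struct item **items, long n)
{
	for (long i = 0; i < n; i++) {
		if (items[i]->handle != HANDLE_UNSET) {
			free((void *)items[i]->handle);
			live_buffers--;
		}
		free(items[i]);
		live_items--;
	}
}

/* Returns the summed length, or -1 with everything already released. */
long store_batch(const long *lens, long n, struct item **items)
{
	long i, total = 0;

	assert(n > 0);

	/* Phase 1: allocate all items and mark their handles as unset. */
	for (i = 0; i < n; i++) {
		items[i] = malloc(sizeof(*items[i]));
		if (!items[i]) {
			n = i;	/* clean up only what was allocated */
			goto failed;
		}
		live_items++;
		items[i]->handle = HANDLE_UNSET;
		items[i]->length = 0;
	}

	/* Phase 2: fill each item; any failure takes the single exit. */
	for (i = 0; i < n; i++) {
		if (item_fill(items[i], lens[i]))
			goto failed;
		total += (long)items[i]->length;
	}
	return total;

failed:
	release_batch(items, n);
	return -1;
}
```

On success the caller owns the items and releases them with release_batch();
on failure nothing is left allocated. The sketch deliberately omits a publish
step (the xarray and LRU insertion in the patch), so it says nothing about
how entries that are already published should be unwound.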