From: Chengming Zhou
Date: Fri, 21 Jun 2024 15:15:10 +0800
Subject: [PATCH v2 2/2] mm/zswap: use only one pool in zswap
Message-Id: <20240621-zsmalloc-lock-mm-everything-v2-2-d30e9cd2b793@linux.dev>
References: <20240621-zsmalloc-lock-mm-everything-v2-0-d30e9cd2b793@linux.dev>
In-Reply-To: <20240621-zsmalloc-lock-mm-everything-v2-0-d30e9cd2b793@linux.dev>
To: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Johannes Weiner,
    Yosry Ahmed, Nhat Pham
Cc: Yu Zhao, Takero Funaki, Chengming Zhou, Dan Carpenter,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chengming Zhou
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13706935

Zswap uses 32 pools to work around the locking scalability problem in
zswap backends (mainly zsmalloc nowadays), which brings its own problems
such as memory waste and increased memory fragmentation.

Testing shows that zswap achieves nearly the same performance with only
one pool once zsmalloc is changed to use a per-size_class lock instead
of the pool spinlock.

Test: kernel build (make bzImage -j32) on tmpfs with memory.max=1GB and
the zswap shrinker enabled, with a 10GB swapfile on ext4.

                                real     user       sys
6.10.0-rc3                      138.18   1241.38    1452.73
6.10.0-rc3-onepool              149.45   1240.45    1844.69
6.10.0-rc3-onepool-perclass     138.23   1242.37    1469.71

The same test with zbud shows slightly worse performance, as expected,
since we don't do any locking optimization for zbud. I think this is
acceptable since zsmalloc has become much more popular than the other
backends, and we may want to support only zsmalloc in the future.

                                real     user       sys
6.10.0-rc3-zbud                 138.23   1239.58    1430.09
6.10.0-rc3-onepool-zbud         139.64   1241.37    1516.59

Reviewed-by: Nhat Pham
Signed-off-by: Chengming Zhou
---
 mm/zswap.c | 60 +++++++++++++++++++-----------------------------------------
 1 file changed, 19 insertions(+), 41 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index e25a6808c2ed..7925a3d0903e 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -122,9 +122,6 @@ static unsigned int zswap_accept_thr_percent = 90; /* of max pool size */
 module_param_named(accept_threshold_percent, zswap_accept_thr_percent,
 		   uint, 0644);
 
-/* Number of zpools in zswap_pool (empirically determined for scalability) */
-#define ZSWAP_NR_ZPOOLS 32
-
 /* Enable/disable memory pressure-based shrinker. */
 static bool zswap_shrinker_enabled = IS_ENABLED(
 		CONFIG_ZSWAP_SHRINKER_DEFAULT_ON);
@@ -160,7 +157,7 @@ struct crypto_acomp_ctx {
  * needs to be verified that it's still valid in the tree.
  */
 struct zswap_pool {
-	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
+	struct zpool *zpool;
 	struct crypto_acomp_ctx __percpu *acomp_ctx;
 	struct percpu_ref ref;
 	struct list_head list;
@@ -237,7 +234,7 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 
 #define zswap_pool_debug(msg, p)				\
 	pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,		\
-		 zpool_get_type((p)->zpools[0]))
+		 zpool_get_type((p)->zpool))
 
 /*********************************
 * pool functions
@@ -246,7 +243,6 @@ static void __zswap_pool_empty(struct percpu_ref *ref);
 
 static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 {
-	int i;
 	struct zswap_pool *pool;
 	char name[38]; /* 'zswap' + 32 char (max) num + \0 */
 	gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
@@ -267,18 +263,14 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 	if (!pool)
 		return NULL;
 
-	for (i = 0; i < ZSWAP_NR_ZPOOLS; i++) {
-		/* unique name for each pool specifically required by zsmalloc */
-		snprintf(name, 38, "zswap%x",
-			 atomic_inc_return(&zswap_pools_count));
-
-		pool->zpools[i] = zpool_create_pool(type, name, gfp);
-		if (!pool->zpools[i]) {
-			pr_err("%s zpool not available\n", type);
-			goto error;
-		}
+	/* unique name for each pool specifically required by zsmalloc */
+	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
+	pool->zpool = zpool_create_pool(type, name, gfp);
+	if (!pool->zpool) {
+		pr_err("%s zpool not available\n", type);
+		return NULL;
 	}
-	pr_debug("using %s zpool\n", zpool_get_type(pool->zpools[0]));
+	pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
 
 	strscpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
 
@@ -311,8 +303,7 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 error:
 	if (pool->acomp_ctx)
 		free_percpu(pool->acomp_ctx);
-	while (i--)
-		zpool_destroy_pool(pool->zpools[i]);
+	zpool_destroy_pool(pool->zpool);
 	kfree(pool);
 	return NULL;
 }
@@ -361,15 +352,12 @@ static struct zswap_pool *__zswap_pool_create_fallback(void)
 
 static void zswap_pool_destroy(struct zswap_pool *pool)
 {
-	int i;
-
 	zswap_pool_debug("destroying", pool);
 
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
 	free_percpu(pool->acomp_ctx);
 
-	for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
-		zpool_destroy_pool(pool->zpools[i]);
+	zpool_destroy_pool(pool->zpool);
 	kfree(pool);
 }
 
@@ -464,8 +452,7 @@ static struct zswap_pool *zswap_pool_find_get(char *type, char *compressor)
 	list_for_each_entry_rcu(pool, &zswap_pools, list) {
 		if (strcmp(pool->tfm_name, compressor))
 			continue;
-		/* all zpools share the same type */
-		if (strcmp(zpool_get_type(pool->zpools[0]), type))
+		if (strcmp(zpool_get_type(pool->zpool), type))
 			continue;
 		/* if we can't get it, it's about to be destroyed */
 		if (!zswap_pool_get(pool))
@@ -492,12 +479,8 @@ unsigned long zswap_total_pages(void)
 	unsigned long total = 0;
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(pool, &zswap_pools, list) {
-		int i;
-
-		for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
-			total += zpool_get_total_pages(pool->zpools[i]);
-	}
+	list_for_each_entry_rcu(pool, &zswap_pools, list)
+		total += zpool_get_total_pages(pool->zpool);
 	rcu_read_unlock();
 
 	return total;
@@ -802,11 +785,6 @@ static void zswap_entry_cache_free(struct zswap_entry *entry)
 	kmem_cache_free(zswap_entry_cache, entry);
 }
 
-static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
-{
-	return entry->pool->zpools[hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))];
-}
-
 /*
  * Carries out the common pattern of freeing and entry's zpool allocation,
  * freeing the entry itself, and decrementing the number of stored pages.
@@ -814,7 +792,7 @@ static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
 static void zswap_entry_free(struct zswap_entry *entry)
 {
 	zswap_lru_del(&zswap_list_lru, entry);
-	zpool_free(zswap_find_zpool(entry), entry->handle);
+	zpool_free(entry->pool->zpool, entry->handle);
 	zswap_pool_put(entry->pool);
 	if (entry->objcg) {
 		obj_cgroup_uncharge_zswap(entry->objcg, entry->length);
@@ -939,7 +917,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
 	if (comp_ret)
 		goto unlock;
 
-	zpool = zswap_find_zpool(entry);
+	zpool = entry->pool->zpool;
 	gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
 	if (zpool_malloc_support_movable(zpool))
 		gfp |= __GFP_HIGHMEM | __GFP_MOVABLE;
@@ -968,7 +946,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
 
 static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 {
-	struct zpool *zpool = zswap_find_zpool(entry);
+	struct zpool *zpool = entry->pool->zpool;
 	struct scatterlist input, output;
 	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src;
@@ -1467,7 +1445,7 @@ bool zswap_store(struct folio *folio)
 	return true;
 
 store_failed:
-	zpool_free(zswap_find_zpool(entry), entry->handle);
+	zpool_free(entry->pool->zpool, entry->handle);
 put_pool:
 	zswap_pool_put(entry->pool);
 freepage:
@@ -1683,7 +1661,7 @@ static int zswap_setup(void)
 	pool = __zswap_pool_create_fallback();
 	if (pool) {
 		pr_info("loaded using pool %s/%s\n", pool->tfm_name,
-			zpool_get_type(pool->zpools[0]));
+			zpool_get_type(pool->zpool));
 		list_add(&pool->list, &zswap_pools);
 		zswap_has_pool = true;
 		static_branch_enable(&zswap_ever_enabled);
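
For readers unfamiliar with the removed zswap_find_zpool() helper, the
following userspace sketch illustrates roughly how an entry's address
selected one of the 32 zpools, and what the lookup reduces to with a
single zpool. It is not kernel code: all demo_/DEMO_ names are
hypothetical, and the hash assumes a 64-bit build where hash_ptr()
multiplies by GOLDEN_RATIO_64 and keeps the top bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the removed ZSWAP_NR_ZPOOLS; the DEMO_ names are hypothetical. */
#define DEMO_NR_ZPOOLS		32
#define DEMO_NR_ZPOOLS_BITS	5	/* ilog2(32) */

/* Multiplier assumed to match the kernel's GOLDEN_RATIO_64. */
#define DEMO_GOLDEN_RATIO_64	0x61C8864680B583EBull

/*
 * Userspace stand-in for hash_ptr() on a 64-bit build:
 * multiply the address, then keep the top 'bits' bits.
 */
static inline unsigned int demo_hash_ptr(const void *ptr, unsigned int bits)
{
	uint64_t val = (uint64_t)(uintptr_t)ptr;

	assert(bits > 0 && bits < 64);
	return (unsigned int)((val * DEMO_GOLDEN_RATIO_64) >> (64 - bits));
}

/* Old scheme: the entry's address picks one of DEMO_NR_ZPOOLS zpools. */
static inline unsigned int demo_old_zpool_index(const void *entry)
{
	return demo_hash_ptr(entry, DEMO_NR_ZPOOLS_BITS);
}

/* New scheme: every entry uses the pool's single zpool. */
static inline unsigned int demo_new_zpool_index(const void *entry)
{
	(void)entry;
	return 0;
}
```

Because the old index depended only on the entry's address, store, load
and free all reached the same zpool without storing the index in the
entry; with one zpool that indirection is no longer needed.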