From patchwork Sat Jun 8 15:53:10 2024
X-Patchwork-Submitter: Takero Funaki
X-Patchwork-Id: 13691032
From: Takero Funaki
To: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, Jonathan Corbet, Andrew Morton, Domenico Cerasuolo
Cc: Takero Funaki, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v1 3/3] mm: zswap: proactive shrinking before pool size limit is hit
Date: Sat, 8 Jun 2024 15:53:10 +0000
Message-ID: <20240608155316.451600-4-flintglass@gmail.com>
In-Reply-To: <20240608155316.451600-1-flintglass@gmail.com>
References: <20240608155316.451600-1-flintglass@gmail.com>

This patch implements proactive shrinking of the zswap pool before the
max pool size limit is reached. It also changes zswap to accept new
pages while the shrinker is running.

To prevent zswap from rejecting new pages and incurring latency when
zswap is full, this patch queues the global shrinker at a pool usage
threshold between accept_thr_percent and 100%, instead of at the max
pool size. With the default accept_thr_percent=90, the pool size is
kept between 90% and 91% of the max pool size. Since the current global
shrinker continues to shrink until accept_thr_percent, we no longer
need to maintain the hysteresis variable tracking the pool limit
overage in zswap_store().

Before this patch, zswap rejected pages while the shrinker was running
without incrementing the zswap_pool_limit_hit counter. This could be
one reason why zswap writes through new pages before writing back old
pages. With this patch, zswap accepts new pages while shrinking, and
increments the counter when and only when it rejects pages because of
the max pool size.

Now, reclaims smaller than the proactive shrinking amount finish
instantly and trigger background shrinking. Admins can check whether
new pages are being buffered by zswap by monitoring the pool_limit_hit
counter.

The name of the sysfs tunable accept_threshold_percent is unchanged, as
it is still the stop condition of the shrinker.
The respective documentation is updated to describe the new behavior.

Signed-off-by: Takero Funaki
---
 Documentation/admin-guide/mm/zswap.rst | 17 ++++----
 mm/zswap.c                             | 54 ++++++++++++++++----------
 2 files changed, 42 insertions(+), 29 deletions(-)

diff --git a/Documentation/admin-guide/mm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst
index 3598dcd7dbe7..a1d8f167a27a 100644
--- a/Documentation/admin-guide/mm/zswap.rst
+++ b/Documentation/admin-guide/mm/zswap.rst
@@ -111,18 +111,17 @@ checked if it is a same-value filled page before compressing it. If true, the
 compressed length of the page is set to zero and the pattern or same-filled
 value is stored.
 
-To prevent zswap from shrinking pool when zswap is full and there's a high
-pressure on swap (this will result in flipping pages in and out zswap pool
-without any real benefit but with a performance drop for the system), a
-special parameter has been introduced to implement a sort of hysteresis to
-refuse taking pages into zswap pool until it has sufficient space if the limit
-has been hit. To set the threshold at which zswap would start accepting pages
-again after it became full, use the sysfs ``accept_threshold_percent``
-attribute, e. g.::
+To prevent zswap from rejecting new pages and incurring latency when zswap is
+full, zswap initiates a worker called global shrinker that proactively evicts
+some pages from the pool to swap devices while the pool is reaching the limit.
+The global shrinker continues to evict pages until there is sufficient space to
+accept new pages. To control how many pages should remain in the pool, use the
+sysfs ``accept_threshold_percent`` attribute as a percentage of the max pool
+size, e. g.::
 
 	echo 80 > /sys/module/zswap/parameters/accept_threshold_percent
 
-Setting this parameter to 100 will disable the hysteresis.
+Setting this parameter to 100 will disable the proactive shrinking.
 
 Some users cannot tolerate the swapping that comes with zswap store failures
 and zswap writebacks. Swapping can be disabled entirely (without disabling
diff --git a/mm/zswap.c b/mm/zswap.c
index 1a90f434f247..e957bfdeaf70 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -71,8 +71,6 @@ static u64 zswap_reject_kmemcache_fail;
 
 /* Shrinker work queue */
 static struct workqueue_struct *shrink_wq;
-/* Pool limit was hit, we need to calm down */
-static bool zswap_pool_reached_full;
 
 /*********************************
 * tunables
@@ -118,7 +116,10 @@ module_param_cb(zpool, &zswap_zpool_param_ops, &zswap_zpool_type, 0644);
 static unsigned int zswap_max_pool_percent = 20;
 module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
 
-/* The threshold for accepting new pages after the max_pool_percent was hit */
+/*
+ * The percentage of pool size that the global shrinker keeps in memory.
+ * It does not protect old pages from the dynamic shrinker.
+ */
 static unsigned int zswap_accept_thr_percent = 90; /* of max pool size */
 module_param_named(accept_threshold_percent, zswap_accept_thr_percent,
 		   uint, 0644);
@@ -539,6 +540,20 @@ static unsigned long zswap_accept_thr_pages(void)
 	return zswap_max_pages() * zswap_accept_thr_percent / 100;
 }
 
+/*
+ * Returns threshold to start proactive global shrinking.
+ */
+static inline unsigned long zswap_shrink_start_pages(void)
+{
+	/*
+	 * Shrinker will evict pages to the accept threshold.
+	 * We add 1% to not schedule shrinker too frequently
+	 * for small swapout.
+	 */
+	return zswap_max_pages() *
+		min(100, zswap_accept_thr_percent + 1) / 100;
+}
+
 unsigned long zswap_total_pages(void)
 {
 	struct zswap_pool *pool;
@@ -556,21 +571,6 @@ unsigned long zswap_total_pages(void)
 	return total;
 }
 
-static bool zswap_check_limits(void)
-{
-	unsigned long cur_pages = zswap_total_pages();
-	unsigned long max_pages = zswap_max_pages();
-
-	if (cur_pages >= max_pages) {
-		zswap_pool_limit_hit++;
-		zswap_pool_reached_full = true;
-	} else if (zswap_pool_reached_full &&
-		   cur_pages <= zswap_accept_thr_pages()) {
-		zswap_pool_reached_full = false;
-	}
-	return zswap_pool_reached_full;
-}
-
 /*********************************
 * param callbacks
 **********************************/
@@ -1577,6 +1577,8 @@ bool zswap_store(struct folio *folio)
 	struct obj_cgroup *objcg = NULL;
 	struct mem_cgroup *memcg = NULL;
 	unsigned long value;
+	unsigned long cur_pages;
+	bool need_global_shrink = false;
 
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
 	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
@@ -1599,8 +1601,17 @@ bool zswap_store(struct folio *folio)
 		mem_cgroup_put(memcg);
 	}
 
-	if (zswap_check_limits())
+	cur_pages = zswap_total_pages();
+
+	if (cur_pages >= zswap_max_pages()) {
+		zswap_pool_limit_hit++;
+		need_global_shrink = true;
 		goto reject;
+	}
+
+	/* schedule shrink for incoming pages */
+	if (cur_pages >= zswap_shrink_start_pages())
+		queue_work(shrink_wq, &zswap_shrink_work);
 
 	/* allocate entry */
 	entry = zswap_entry_cache_alloc(GFP_KERNEL, folio_nid(folio));
@@ -1643,6 +1654,9 @@ bool zswap_store(struct folio *folio)
 
 		WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
 		zswap_reject_alloc_fail++;
+
+		/* reduce entry in array */
+		need_global_shrink = true;
 		goto store_failed;
 	}
 
@@ -1692,7 +1706,7 @@ bool zswap_store(struct folio *folio)
 	zswap_entry_cache_free(entry);
 reject:
 	obj_cgroup_put(objcg);
-	if (need_global_shrink)
+	if (need_global_shrink)
 		queue_work(shrink_wq, &zswap_shrink_work);
 check_old:
 	/*
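
Note for readers: the store-time decision this patch introduces can be modeled in user space to check the threshold arithmetic. The sketch below is not kernel code and is not part of the patch. `struct zswap_model`, `enum store_decision`, and `model_store()` are hypothetical names invented for illustration. The two helpers mirror zswap_accept_thr_pages() and zswap_shrink_start_pages() from the diff, with max_pages standing in for zswap_max_pages().

```c
/* Hypothetical user-space model of the thresholds; not kernel code. */
struct zswap_model {
	unsigned long max_pages;         /* stands in for zswap_max_pages() */
	unsigned int accept_thr_percent; /* stands in for zswap_accept_thr_percent */
	unsigned long pool_limit_hit;    /* stands in for zswap_pool_limit_hit */
};

enum store_decision {
	STORE_ACCEPT,            /* store the page, no shrinker queued */
	STORE_ACCEPT_AND_SHRINK, /* store the page and queue the global shrinker */
	STORE_REJECT,            /* pool is full: count the hit, queue the shrinker */
};

/* Stop condition of the global shrinker, as in zswap_accept_thr_pages(). */
static unsigned long accept_thr_pages(const struct zswap_model *m)
{
	return m->max_pages * m->accept_thr_percent / 100;
}

/* Start condition: accept threshold plus 1%, capped at 100%. */
static unsigned long shrink_start_pages(const struct zswap_model *m)
{
	unsigned int pct = m->accept_thr_percent + 1;

	if (pct > 100)
		pct = 100;
	return m->max_pages * pct / 100;
}

/* Models only the limit checks at the top of zswap_store() after the patch. */
static enum store_decision model_store(struct zswap_model *m,
				       unsigned long cur_pages)
{
	if (cur_pages >= m->max_pages) {
		m->pool_limit_hit++;
		return STORE_REJECT;
	}
	if (cur_pages >= shrink_start_pages(m))
		return STORE_ACCEPT_AND_SHRINK;
	return STORE_ACCEPT;
}
```

Assuming max_pages = 1000 and accept_thr_percent = 90, the model accepts silently below 910 pages, accepts and queues the shrinker from 910 to 999 pages, and rejects (incrementing the counter) at 1000 pages or more. With accept_thr_percent = 100 the start threshold equals the max pool size, so the middle band is empty, which matches the documentation's statement that 100 disables proactive shrinking.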