From patchwork Wed Oct 11 05:11:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhongkun He
X-Patchwork-Id: 13416569
From: Zhongkun He
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Zhongkun He
Subject: [RFC PATCH] zswap: add writeback_time_threshold interface to shrink
 zswap pool
Date: Wed, 11 Oct 2023 13:11:17 +0800
Message-Id: <20231011051117.2289518-1-hezhongkun.hzk@bytedance.com>
X-Mailer: git-send-email 2.25.1

zswap has no suitable method for selecting objects that have not been
accessed for a long time; it only shrinks the pool when the limit is
hit. If the limit is set too high, there is a high probability of
wasting memory in zswap.

This patch adds a new interface, writeback_time_threshold, to shrink
the zswap pool proactively based on a time threshold in seconds, e.g.::

  echo 600 > /sys/module/zswap/parameters/writeback_time_threshold

zswap entries that have not been accessed for more than 600 seconds
are written back to swap. If the threshold is set to 0, all entries
are written back.

Signed-off-by: Zhongkun He
---
 Documentation/admin-guide/mm/zswap.rst |  9 +++
 mm/zswap.c                             | 76 ++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/Documentation/admin-guide/mm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst
index 45b98390e938..9ffaed26c3c0 100644
--- a/Documentation/admin-guide/mm/zswap.rst
+++ b/Documentation/admin-guide/mm/zswap.rst
@@ -153,6 +153,15 @@ attribute, e. g.::
 
 Setting this parameter to 100 will disable the hysteresis.
 
+When there is a lot of cold memory according to the store time in the zswap,
+it can be swapout and save memory in userspace proactively. User can write
+writeback time threshold in second to enable it, e.g.::
+
+  echo 600 > /sys/module/zswap/parameters/writeback_time_threshold
+
+If zswap_entrys have not been accessed for more than 600 seconds, they will be
+swapout. if set to 0, all of them will be swapout.
+
 A debugfs interface is provided for various statistic about pool size, number
 of pages stored, same-value filled pages and various counters for the reasons
 pages are rejected.
diff --git a/mm/zswap.c b/mm/zswap.c
index 083c693602b8..c3a19b56a29b 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -141,6 +141,16 @@ static bool zswap_exclusive_loads_enabled = IS_ENABLED(
 		CONFIG_ZSWAP_EXCLUSIVE_LOADS_DEFAULT_ON);
 module_param_named(exclusive_loads, zswap_exclusive_loads_enabled, bool, 0644);
 
+/* zswap writeback time threshold in second */
+static unsigned int zswap_writeback_time_thr;
+static int zswap_writeback_time_thr_param_set(const char *, const struct kernel_param *);
+static const struct kernel_param_ops zswap_writeback_param_ops = {
+	.set = zswap_writeback_time_thr_param_set,
+	.get = param_get_uint,
+};
+module_param_cb(writeback_time_threshold, &zswap_writeback_param_ops,
+		&zswap_writeback_time_thr, 0644);
+
 /* Number of zpools in zswap_pool (empirically determined for scalability) */
 #define ZSWAP_NR_ZPOOLS 32
 
@@ -197,6 +207,7 @@ struct zswap_pool {
  * value - value of the same-value filled pages which have same content
  * objcg - the obj_cgroup that the compressed memory is charged to
  * lru - handle to the pool's lru used to evict pages.
+ * sto_time - the store time of zswap_entry.
  */
 struct zswap_entry {
 	struct rb_node rbnode;
@@ -210,6 +221,7 @@ struct zswap_entry {
 	};
 	struct obj_cgroup *objcg;
 	struct list_head lru;
+	ktime_t sto_time;
 };
 
 /*
@@ -288,6 +300,31 @@ static void zswap_update_total_size(void)
 	zswap_pool_total_size = total;
 }
 
+static void zswap_reclaim_entry_by_timethr(void);
+
+static bool zswap_reach_timethr(struct zswap_pool *pool)
+{
+	struct zswap_entry *entry;
+	ktime_t expire_time = 0;
+	bool ret = false;
+
+	spin_lock(&pool->lru_lock);
+
+	if (list_empty(&pool->lru))
+		goto out;
+
+	entry = list_last_entry(&pool->lru, struct zswap_entry, lru);
+	expire_time = ktime_add(entry->sto_time,
+			ns_to_ktime(zswap_writeback_time_thr * NSEC_PER_SEC));
+
+	if (ktime_after(ktime_get_boottime(), expire_time))
+		ret = true;
+out:
+	spin_unlock(&pool->lru_lock);
+	return ret;
+}
+
+
 /*********************************
 * zswap entry functions
 **********************************/
@@ -395,6 +432,7 @@ static void zswap_free_entry(struct zswap_entry *entry)
 	else {
 		spin_lock(&entry->pool->lru_lock);
 		list_del(&entry->lru);
+		entry->sto_time = 0;
 		spin_unlock(&entry->pool->lru_lock);
 		zpool_free(zswap_find_zpool(entry), entry->handle);
 		zswap_pool_put(entry->pool);
@@ -709,6 +747,28 @@ static void shrink_worker(struct work_struct *w)
 	zswap_pool_put(pool);
 }
 
+static void zswap_reclaim_entry_by_timethr(void)
+{
+	struct zswap_pool *pool = zswap_pool_current_get();
+	int ret, failures = 0;
+
+	if (!pool)
+		return;
+
+	while (zswap_reach_timethr(pool)) {
+		ret = zswap_reclaim_entry(pool);
+		if (ret) {
+			zswap_reject_reclaim_fail++;
+			if (ret != -EAGAIN)
+				break;
+			if (++failures == MAX_RECLAIM_RETRIES)
+				break;
+		}
+		cond_resched();
+	}
+	zswap_pool_put(pool);
+}
+
 static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 {
 	int i;
@@ -1037,6 +1097,21 @@ static int zswap_enabled_param_set(const char *val,
 	return ret;
 }
 
+static int zswap_writeback_time_thr_param_set(const char *val,
+				const struct kernel_param *kp)
+{
+	int ret = -ENODEV;
+
+	/* if this is load-time (pre-init) param setting, just return. */
+	if (system_state != SYSTEM_RUNNING)
+		return ret;
+
+	ret = param_set_uint(val, kp);
+	if (!ret)
+		zswap_reclaim_entry_by_timethr();
+	return ret;
+}
+
 /*********************************
 * writeback code
 **********************************/
@@ -1360,6 +1435,7 @@ bool zswap_store(struct folio *folio)
 	if (entry->length) {
 		spin_lock(&entry->pool->lru_lock);
 		list_add(&entry->lru, &entry->pool->lru);
+		entry->sto_time = ktime_get_boottime();
 		spin_unlock(&entry->pool->lru_lock);
 	}
 	spin_unlock(&tree->lock);
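
For readers who want to experiment with the threshold logic outside the
kernel, here is a minimal userspace sketch in C of the age check and the
reclaim loop. It is not kernel code and not part of the patch. struct
entry, struct lru, lru_add(), lru_reached_threshold() and
lru_reclaim_by_threshold() are hypothetical stand-ins for zswap_entry,
the pool LRU, the store path, zswap_reach_timethr() and
zswap_reclaim_entry_by_timethr(). Timestamps are supplied by the caller
rather than read from ktime_get_boottime(), eviction only unlinks the
entry, and the -EAGAIN retry handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for struct zswap_entry: only what the age check needs. */
struct entry {
	uint64_t sto_time_ns;	/* store timestamp in nanoseconds */
	struct entry *newer;	/* next entry towards the most recently stored */
};

/* Hypothetical LRU: entries are queued in store order, oldest first. */
struct lru {
	struct entry *oldest;
	struct entry *newest;
};

/* Mirror of the store path: stamp the entry and queue it as the newest. */
void lru_add(struct lru *lru, struct entry *e, uint64_t now_ns)
{
	assert(lru != NULL && e != NULL);
	e->sto_time_ns = now_ns;
	e->newer = NULL;
	if (lru->newest)
		lru->newest->newer = e;
	else
		lru->oldest = e;
	lru->newest = e;
}

/*
 * True when the oldest entry was stored more than thr_sec seconds ago.
 * The comparison is strict, like ktime_after(). The threshold is widened
 * to 64 bits before the multiplication so it cannot wrap.
 */
bool lru_reached_threshold(const struct lru *lru, uint64_t now_ns,
			   unsigned int thr_sec)
{
	uint64_t expire_ns;

	if (!lru->oldest)
		return false;

	expire_ns = lru->oldest->sto_time_ns + (uint64_t)thr_sec * NSEC_PER_SEC;
	return now_ns > expire_ns;
}

/* Unlink expired entries from the old end; returns how many were removed. */
unsigned int lru_reclaim_by_threshold(struct lru *lru, uint64_t now_ns,
				      unsigned int thr_sec)
{
	unsigned int reclaimed = 0;

	while (lru_reached_threshold(lru, now_ns, thr_sec)) {
		struct entry *victim = lru->oldest;

		lru->oldest = victim->newer;
		if (!lru->oldest)
			lru->newest = NULL;
		victim->newer = NULL;
		reclaimed++;
	}
	return reclaimed;
}
```

Because entries are queued in store order, the loop can stop at the
first entry that is still young enough. The sketch widens the threshold
to 64 bits before multiplying by NSEC_PER_SEC; whether the patch's
zswap_writeback_time_thr * NSEC_PER_SEC needs the same treatment on
32-bit builds may be worth checking.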