From patchwork Mon Mar 25 23:50:15 2024
Subject: [RFC PATCH 7/9] mm: zswap: store zero-filled pages without a
 zswap_entry
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Nhat Pham, Chengming Zhou, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Yosry Ahmed
Date: Mon, 25 Mar 2024 23:50:15 +0000
Message-ID: <20240325235018.2028408-8-yosryahmed@google.com>
In-Reply-To: <20240325235018.2028408-1-yosryahmed@google.com>
References: <20240325235018.2028408-1-yosryahmed@google.com>
X-Patchwork-Id: 13603188
After the rbtree to xarray conversion, and dropping zswap_entry.refcount
and zswap_entry.value, the only members of zswap_entry utilized by
zero-filled pages are zswap_entry.length (always 0) and
zswap_entry.objcg. Store the objcg pointer directly in the xarray as a
tagged pointer and avoid allocating a zswap_entry completely for
zero-filled pages.

This simplifies the code as we no longer need to special-case
zero-length entries. We are also able to further separate the
zero-filled page handling logic and completely isolate it within
store/load helpers. Tagged xarray pointers are handled in these two
helpers, as well as in the newly introduced helper for freeing tree
elements, zswap_tree_free_element().

There is also a small performance improvement observed over 50 runs of
the kernel build test (kernbench) comparing the mean build time on a
Skylake machine when building the kernel in a cgroup v1 container with
a 3G limit. This is on top of the improvement from dropping support for
non-zero same-filled pages:

		base		patched		% diff
real		69.915		69.757		-0.229%
user		2956.147	2955.244	-0.031%
sys		2594.718	2575.747	-0.731%

This probably comes from avoiding the zswap_entry allocation and
cleanup/freeing for zero-filled pages. Note that the percentage of
zero-filled pages during this test was only around 1.5% on average.
Practical workloads could have a larger proportion of such pages (e.g.
Johannes observed around 10% [1]), so the performance improvement
should be larger.

This change also saves a small amount of memory due to fewer
zswap_entry allocations. In the kernel build test above, we save around
2M of slab usage when we swap out 3G to zswap.
[1] https://lore.kernel.org/linux-mm/20240320210716.GH294822@cmpxchg.org/

Signed-off-by: Yosry Ahmed
Reviewed-by: Chengming Zhou
---
 mm/zswap.c | 137 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 78 insertions(+), 59 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 413d9242cf500..efc323bab2f22 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -183,12 +183,11 @@ static struct shrinker *zswap_shrinker;
  * struct zswap_entry
  *
  * This structure contains the metadata for tracking a single compressed
- * page within zswap.
+ * page within zswap, it does not track zero-filled pages.
  *
  * swpentry - associated swap entry, the offset indexes into the red-black tree
  * length - the length in bytes of the compressed page data. Needed during
- *          decompression. For a zero-filled page length is 0, and both
- *          pool and lru are invalid and must be ignored.
+ *          decompression.
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
  * objcg - the obj_cgroup that the compressed memory is charged to
@@ -794,30 +793,35 @@ static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
 	return entry->pool->zpools[hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))];
 }
 
-/*
- * Carries out the common pattern of freeing and entry's zpool allocation,
- * freeing the entry itself, and decrementing the number of stored pages.
- */
 static void zswap_entry_free(struct zswap_entry *entry)
 {
-	if (!entry->length)
-		atomic_dec(&zswap_zero_filled_pages);
-	else {
-		zswap_lru_del(&zswap_list_lru, entry);
-		zpool_free(zswap_find_zpool(entry), entry->handle);
-		zswap_pool_put(entry->pool);
-	}
+	zswap_lru_del(&zswap_list_lru, entry);
+	zpool_free(zswap_find_zpool(entry), entry->handle);
+	zswap_pool_put(entry->pool);
 	if (entry->objcg) {
 		obj_cgroup_uncharge_zswap(entry->objcg, entry->length);
 		obj_cgroup_put(entry->objcg);
 	}
 	zswap_entry_cache_free(entry);
-	atomic_dec(&zswap_stored_pages);
 }
 
 /*********************************
 * zswap tree functions
 **********************************/
+static void zswap_tree_free_element(void *elem)
+{
+	if (!elem)
+		return;
+
+	if (xa_pointer_tag(elem)) {
+		obj_cgroup_put(xa_untag_pointer(elem));
+		atomic_dec(&zswap_zero_filled_pages);
+	} else {
+		zswap_entry_free((struct zswap_entry *)elem);
+	}
+	atomic_dec(&zswap_stored_pages);
+}
+
 static int zswap_tree_store(struct xarray *tree, pgoff_t offset, void *new)
 {
 	void *old;
@@ -834,7 +838,7 @@ static int zswap_tree_store(struct xarray *tree, pgoff_t offset, void *new)
 		 * the folio was redirtied and now the new version is being
 		 * swapped out. Get rid of the old.
 		 */
-		zswap_entry_free(old);
+		zswap_tree_free_element(old);
 	}
 	return err;
 }
@@ -1089,7 +1093,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 	if (entry->objcg)
 		count_objcg_event(entry->objcg, ZSWPWB);
 
-	zswap_entry_free(entry);
+	zswap_tree_free_element(entry);
 
 	/* folio is up to date */
 	folio_mark_uptodate(folio);
@@ -1373,6 +1377,33 @@ static void shrink_worker(struct work_struct *w)
 	} while (zswap_total_pages() > thr);
 }
 
+/*********************************
+* zero-filled functions
+**********************************/
+#define ZSWAP_ZERO_FILLED_TAG 1UL
+
+static int zswap_store_zero_filled(struct xarray *tree, pgoff_t offset,
+				   struct obj_cgroup *objcg)
+{
+	int err = zswap_tree_store(tree, offset,
+			xa_tag_pointer(objcg, ZSWAP_ZERO_FILLED_TAG));
+
+	if (!err)
+		atomic_inc(&zswap_zero_filled_pages);
+	return err;
+}
+
+static bool zswap_load_zero_filled(void *elem, struct page *page,
+				   struct obj_cgroup **objcg)
+{
+	if (!xa_pointer_tag(elem))
+		return false;
+
+	clear_highpage(page);
+	*objcg = xa_untag_pointer(elem);
+	return true;
+}
+
 static bool zswap_is_folio_zero_filled(struct folio *folio)
 {
 	unsigned long *kaddr;
@@ -1432,22 +1463,21 @@ bool zswap_store(struct folio *folio)
 	if (!zswap_check_limit())
 		goto reject;
 
-	/* allocate entry */
+	if (zswap_is_folio_zero_filled(folio)) {
+		if (zswap_store_zero_filled(tree, offset, objcg))
+			goto reject;
+		goto stored;
+	}
+
+	if (!zswap_non_zero_filled_pages_enabled)
+		goto reject;
+
 	entry = zswap_entry_cache_alloc(GFP_KERNEL, folio_nid(folio));
 	if (!entry) {
 		zswap_reject_kmemcache_fail++;
 		goto reject;
 	}
 
-	if (zswap_is_folio_zero_filled(folio)) {
-		entry->length = 0;
-		atomic_inc(&zswap_zero_filled_pages);
-		goto insert_entry;
-	}
-
-	if (!zswap_non_zero_filled_pages_enabled)
-		goto freepage;
-
 	/* if entry is successfully added, it keeps the reference */
 	entry->pool = zswap_pool_current_get();
 	if (!entry->pool)
@@ -1465,17 +1495,14 @@ bool zswap_store(struct folio *folio)
 	if (!zswap_compress(folio, entry))
 		goto put_pool;
 
-insert_entry:
 	entry->swpentry = swp;
 	entry->objcg = objcg;
 	if (zswap_tree_store(tree, offset, entry))
 		goto store_failed;
 
-	if (objcg) {
+	if (objcg)
 		obj_cgroup_charge_zswap(objcg, entry->length);
-		count_objcg_event(objcg, ZSWPOUT);
-	}
 
 	/*
 	 * We finish initializing the entry while it's already in xarray.
@@ -1487,25 +1514,21 @@ bool zswap_store(struct folio *folio)
 	 * The publishing order matters to prevent writeback from seeing
 	 * an incoherent entry.
 	 */
-	if (entry->length) {
-		INIT_LIST_HEAD(&entry->lru);
-		zswap_lru_add(&zswap_list_lru, entry);
-	}
+	INIT_LIST_HEAD(&entry->lru);
+	zswap_lru_add(&zswap_list_lru, entry);
 
-	/* update stats */
+stored:
+	if (objcg)
+		count_objcg_event(objcg, ZSWPOUT);
 	atomic_inc(&zswap_stored_pages);
 	count_vm_event(ZSWPOUT);
 
 	return true;
 
 store_failed:
-	if (!entry->length)
-		atomic_dec(&zswap_zero_filled_pages);
-	else {
-		zpool_free(zswap_find_zpool(entry), entry->handle);
+	zpool_free(zswap_find_zpool(entry), entry->handle);
 put_pool:
-		zswap_pool_put(entry->pool);
-	}
+	zswap_pool_put(entry->pool);
 freepage:
 	zswap_entry_cache_free(entry);
 reject:
@@ -1518,9 +1541,7 @@ bool zswap_store(struct folio *folio)
 	 * possibly stale entry which was previously stored at this offset.
 	 * Otherwise, writeback could overwrite the new data in the swapfile.
 	 */
-	entry = xa_erase(tree, offset);
-	if (entry)
-		zswap_entry_free(entry);
+	zswap_tree_free_element(xa_erase(tree, offset));
 	return false;
 }
 
@@ -1531,26 +1552,27 @@ bool zswap_load(struct folio *folio)
 	struct page *page = &folio->page;
 	struct xarray *tree = swap_zswap_tree(swp);
 	struct zswap_entry *entry;
+	struct obj_cgroup *objcg;
+	void *elem;
 
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
 
-	entry = xa_erase(tree, offset);
-	if (!entry)
+	elem = xa_erase(tree, offset);
+	if (!elem)
 		return false;
 
-	if (entry->length)
+	if (!zswap_load_zero_filled(elem, page, &objcg)) {
+		entry = elem;
+		objcg = entry->objcg;
 		zswap_decompress(entry, page);
-	else
-		clear_highpage(page);
+	}
 
 	count_vm_event(ZSWPIN);
-	if (entry->objcg)
-		count_objcg_event(entry->objcg, ZSWPIN);
-
-	zswap_entry_free(entry);
+	if (objcg)
+		count_objcg_event(objcg, ZSWPIN);
 
+	zswap_tree_free_element(elem);
 	folio_mark_dirty(folio);
-
 	return true;
 }
 
@@ -1558,11 +1580,8 @@ void zswap_invalidate(swp_entry_t swp)
 {
 	pgoff_t offset = swp_offset(swp);
 	struct xarray *tree = swap_zswap_tree(swp);
-	struct zswap_entry *entry;
 
-	entry = xa_erase(tree, offset);
-	if (entry)
-		zswap_entry_free(entry);
+	zswap_tree_free_element(xa_erase(tree, offset));
 }
 
 int zswap_swapon(int type, unsigned long nr_pages)