From patchwork Mon Nov 28 19:16:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 13057890
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v7 1/6] zswap: fix writeback lock ordering for zsmalloc
Date: Mon, 28 Nov 2022 11:16:10 -0800
Message-Id: <20221128191616.1261026-2-nphamcs@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221128191616.1261026-1-nphamcs@gmail.com>
References: <20221128191616.1261026-1-nphamcs@gmail.com>

From: Johannes Weiner

zswap's customary lock order is tree->lock before pool->lock, because
the tree->lock protects the entries' refcount, and the free callbacks in
the backends acquire their respective pool locks to dispatch the backing
object. zsmalloc's map callback takes the pool lock, so zswap must not
grab the tree->lock while a handle is mapped. This currently only
happens during writeback, which isn't implemented for zsmalloc.
In preparation for it, move the tree->lock section out of the mapped
entry section.

Signed-off-by: Johannes Weiner
Signed-off-by: Nhat Pham
Reviewed-by: Sergey Senozhatsky
---
 mm/zswap.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

-- 
2.30.2

diff --git a/mm/zswap.c b/mm/zswap.c
index 3019f0bde194..f6c89049cf70 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -968,6 +968,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	swpentry = zhdr->swpentry; /* here */
 	tree = zswap_trees[swp_type(swpentry)];
 	offset = swp_offset(swpentry);
+	zpool_unmap_handle(pool, handle);
 
 	/* find and ref zswap entry */
 	spin_lock(&tree->lock);
@@ -975,20 +976,12 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	if (!entry) {
 		/* entry was invalidated */
 		spin_unlock(&tree->lock);
-		zpool_unmap_handle(pool, handle);
 		kfree(tmp);
 		return 0;
 	}
 	spin_unlock(&tree->lock);
 	BUG_ON(offset != entry->offset);
 
-	src = (u8 *)zhdr + sizeof(struct zswap_header);
-	if (!zpool_can_sleep_mapped(pool)) {
-		memcpy(tmp, src, entry->length);
-		src = tmp;
-		zpool_unmap_handle(pool, handle);
-	}
-
 	/* try to allocate swap cache page */
 	switch (zswap_get_swap_cache_page(swpentry, &page)) {
 	case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happened */
@@ -1006,6 +999,14 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 		dlen = PAGE_SIZE;
 
+		zhdr = zpool_map_handle(pool, handle, ZPOOL_MM_RO);
+		src = (u8 *)zhdr + sizeof(struct zswap_header);
+		if (!zpool_can_sleep_mapped(pool)) {
+			memcpy(tmp, src, entry->length);
+			src = tmp;
+			zpool_unmap_handle(pool, handle);
+		}
+
 		mutex_lock(acomp_ctx->mutex);
 		sg_init_one(&input, src, entry->length);
 		sg_init_table(&output, 1);
@@ -1015,6 +1016,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		dlen = acomp_ctx->req->dlen;
 		mutex_unlock(acomp_ctx->mutex);
 
+		if (!zpool_can_sleep_mapped(pool))
+			kfree(tmp);
+		else
+			zpool_unmap_handle(pool, handle);
+
 		BUG_ON(ret);
 		BUG_ON(dlen != PAGE_SIZE);
 
@@ -1045,7 +1051,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	zswap_entry_put(tree, entry);
 	spin_unlock(&tree->lock);
 
-	goto end;
+	return ret;
+
+fail:
+	if (!zpool_can_sleep_mapped(pool))
+		kfree(tmp);
 
 	/*
 	* if we get here due to ZSWAP_SWAPCACHE_EXIST
@@ -1054,17 +1064,10 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	* if we free the entry in the following put
 	* it is also okay to return !0
 	*/
-fail:
 	spin_lock(&tree->lock);
 	zswap_entry_put(tree, entry);
 	spin_unlock(&tree->lock);
 
-end:
-	if (zpool_can_sleep_mapped(pool))
-		zpool_unmap_handle(pool, handle);
-	else
-		kfree(tmp);
-
 	return ret;
 }
 

From patchwork Mon Nov 28 19:16:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 13057891
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v7 2/6] zpool: clean out dead code
Date: Mon, 28 Nov 2022 11:16:11 -0800
Message-Id: <20221128191616.1261026-3-nphamcs@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221128191616.1261026-1-nphamcs@gmail.com>
References: <20221128191616.1261026-1-nphamcs@gmail.com>

From: Johannes Weiner

There is a lot of provision for flexibility that isn't actually needed
or used. Zswap (the only zpool user) always passes zpool_ops with an
.evict method set. The backends that reclaim only do so for zswap, so
they can also directly call zpool_ops without indirection or checks.

Finally, there is no need to check the retries parameter and bail
with -EINVAL in the reclaim function, when that's called just a few
lines below with a hard-coded 8. There is no need to duplicate the
evictable and sleep_mapped attrs from the driver in struct zpool.

Signed-off-by: Johannes Weiner
Signed-off-by: Nhat Pham
Reviewed-by: Sergey Senozhatsky
---
 mm/z3fold.c | 36 +++++-------------------------------
 mm/zbud.c   | 32 +++++---------------------------
 mm/zpool.c  | 10 ++--------
 3 files changed, 12 insertions(+), 66 deletions(-)

-- 
2.30.2

diff --git a/mm/z3fold.c b/mm/z3fold.c
index cf71da10d04e..a4de0c317ac7 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -68,9 +68,6 @@
  * Structures
 *****************/
 struct z3fold_pool;
-struct z3fold_ops {
-	int (*evict)(struct z3fold_pool *pool, unsigned long handle);
-};
 
 enum buddy {
 	HEADLESS = 0,
@@ -138,8 +135,6 @@ struct z3fold_header {
  * @stale:	list of pages marked for freeing
  * @pages_nr:	number of z3fold pages in the pool.
  * @c_handle:	cache for z3fold_buddy_slots allocation
- * @ops:	pointer to a structure of user defined operations specified at
- *		pool creation time.
  * @zpool:	zpool driver
  * @zpool_ops:	zpool operations structure with an evict callback
  * @compact_wq:	workqueue for page layout background optimization
@@ -158,7 +153,6 @@ struct z3fold_pool {
 	struct list_head stale;
 	atomic64_t pages_nr;
 	struct kmem_cache *c_handle;
-	const struct z3fold_ops *ops;
 	struct zpool *zpool;
 	const struct zpool_ops *zpool_ops;
 	struct workqueue_struct *compact_wq;
@@ -907,13 +901,11 @@ static inline struct z3fold_header *__z3fold_alloc(struct z3fold_pool *pool,
  * z3fold_create_pool() - create a new z3fold pool
  * @name:	pool name
  * @gfp:	gfp flags when allocating the z3fold pool structure
- * @ops:	user-defined operations for the z3fold pool
  *
  * Return: pointer to the new z3fold pool or NULL if the metadata allocation
  * failed.
  */
-static struct z3fold_pool *z3fold_create_pool(const char *name, gfp_t gfp,
-		const struct z3fold_ops *ops)
+static struct z3fold_pool *z3fold_create_pool(const char *name, gfp_t gfp)
 {
 	struct z3fold_pool *pool = NULL;
 	int i, cpu;
@@ -949,7 +941,6 @@ static struct z3fold_pool *z3fold_create_pool(const char *name, gfp_t gfp,
 	if (!pool->release_wq)
 		goto out_wq;
 	INIT_WORK(&pool->work, free_pages_work);
-	pool->ops = ops;
 	return pool;
 
 out_wq:
@@ -1230,10 +1221,6 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 	slots.pool = (unsigned long)pool | (1 << HANDLES_NOFREE);
 
 	spin_lock(&pool->lock);
-	if (!pool->ops || !pool->ops->evict || retries == 0) {
-		spin_unlock(&pool->lock);
-		return -EINVAL;
-	}
 	for (i = 0; i < retries; i++) {
 		if (list_empty(&pool->lru)) {
 			spin_unlock(&pool->lock);
@@ -1319,17 +1306,17 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		}
 		/* Issue the eviction callback(s) */
 		if (middle_handle) {
-			ret = pool->ops->evict(pool, middle_handle);
+			ret = pool->zpool_ops->evict(pool->zpool, middle_handle);
 			if (ret)
 				goto next;
 		}
 		if (first_handle) {
-			ret = pool->ops->evict(pool, first_handle);
+			ret = pool->zpool_ops->evict(pool->zpool, first_handle);
 			if (ret)
 				goto next;
 		}
 		if (last_handle) {
-			ret = pool->ops->evict(pool, last_handle);
+			ret = pool->zpool_ops->evict(pool->zpool, last_handle);
 			if (ret)
 				goto next;
 		}
@@ -1593,26 +1580,13 @@ static const struct movable_operations z3fold_mops = {
  * zpool
  ****************/
 
-static int z3fold_zpool_evict(struct z3fold_pool *pool, unsigned long handle)
-{
-	if (pool->zpool && pool->zpool_ops && pool->zpool_ops->evict)
-		return pool->zpool_ops->evict(pool->zpool, handle);
-	else
-		return -ENOENT;
-}
-
-static const struct z3fold_ops z3fold_zpool_ops = {
-	.evict =	z3fold_zpool_evict
-};
-
 static void *z3fold_zpool_create(const char *name, gfp_t gfp,
 			       const struct zpool_ops *zpool_ops,
 			       struct zpool *zpool)
 {
 	struct z3fold_pool *pool;
 
-	pool = z3fold_create_pool(name, gfp,
-				zpool_ops ? &z3fold_zpool_ops : NULL);
+	pool = z3fold_create_pool(name, gfp);
 	if (pool) {
 		pool->zpool = zpool;
 		pool->zpool_ops = zpool_ops;
diff --git a/mm/zbud.c b/mm/zbud.c
index 6348932430b8..3acd26193920 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -74,10 +74,6 @@
 
 struct zbud_pool;
 
-struct zbud_ops {
-	int (*evict)(struct zbud_pool *pool, unsigned long handle);
-};
-
 /**
  * struct zbud_pool - stores metadata for each zbud pool
  * @lock:	protects all pool fields and first|last_chunk fields of any
@@ -90,8 +86,6 @@ struct zbud_ops {
  * @lru:	list tracking the zbud pages in LRU order by most recently
  *		added buddy.
  * @pages_nr:	number of zbud pages in the pool.
- * @ops:	pointer to a structure of user defined operations specified at
- *		pool creation time.
  * @zpool:	zpool driver
  * @zpool_ops:	zpool operations structure with an evict callback
  *
@@ -110,7 +104,6 @@ struct zbud_pool {
 	};
 	struct list_head lru;
 	u64 pages_nr;
-	const struct zbud_ops *ops;
 	struct zpool *zpool;
 	const struct zpool_ops *zpool_ops;
 };
@@ -212,12 +205,11 @@ static int num_free_chunks(struct zbud_header *zhdr)
 /**
  * zbud_create_pool() - create a new zbud pool
  * @gfp:	gfp flags when allocating the zbud pool structure
- * @ops:	user-defined operations for the zbud pool
  *
  * Return: pointer to the new zbud pool or NULL if the metadata allocation
  * failed.
  */
-static struct zbud_pool *zbud_create_pool(gfp_t gfp, const struct zbud_ops *ops)
+static struct zbud_pool *zbud_create_pool(gfp_t gfp)
 {
 	struct zbud_pool *pool;
 	int i;
@@ -231,7 +223,6 @@ static struct zbud_pool *zbud_create_pool(gfp_t gfp, const struct zbud_ops *ops)
 	INIT_LIST_HEAD(&pool->buddied);
 	INIT_LIST_HEAD(&pool->lru);
 	pool->pages_nr = 0;
-	pool->ops = ops;
 	return pool;
 }
 
@@ -419,8 +410,7 @@ static int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 	unsigned long first_handle = 0, last_handle = 0;
 
 	spin_lock(&pool->lock);
-	if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) ||
-			retries == 0) {
+	if (list_empty(&pool->lru)) {
 		spin_unlock(&pool->lock);
 		return -EINVAL;
 	}
@@ -444,12 +434,12 @@ static int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 
 		/* Issue the eviction callback(s) */
 		if (first_handle) {
-			ret = pool->ops->evict(pool, first_handle);
+			ret = pool->zpool_ops->evict(pool->zpool, first_handle);
 			if (ret)
 				goto next;
 		}
 		if (last_handle) {
-			ret = pool->ops->evict(pool, last_handle);
+			ret = pool->zpool_ops->evict(pool->zpool, last_handle);
 			if (ret)
 				goto next;
 		}
@@ -524,25 +514,13 @@ static u64 zbud_get_pool_size(struct zbud_pool *pool)
  * zpool
  ****************/
 
-static int zbud_zpool_evict(struct zbud_pool *pool, unsigned long handle)
-{
-	if (pool->zpool && pool->zpool_ops && pool->zpool_ops->evict)
-		return pool->zpool_ops->evict(pool->zpool, handle);
-	else
-		return -ENOENT;
-}
-
-static const struct zbud_ops zbud_zpool_ops = {
-	.evict =	zbud_zpool_evict
-};
-
 static void *zbud_zpool_create(const char *name, gfp_t gfp,
 			       const struct zpool_ops *zpool_ops,
 			       struct zpool *zpool)
 {
 	struct zbud_pool *pool;
 
-	pool = zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);
+	pool = zbud_create_pool(gfp);
 	if (pool) {
 		pool->zpool = zpool;
 		pool->zpool_ops = zpool_ops;
diff --git a/mm/zpool.c b/mm/zpool.c
index f46c0d5e766c..571f5c5031dd 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -21,9 +21,6 @@
 struct zpool {
 	struct zpool_driver *driver;
 	void *pool;
-	const struct zpool_ops *ops;
-	bool evictable;
-	bool can_sleep_mapped;
 };
 
 static LIST_HEAD(drivers_head);
@@ -177,9 +174,6 @@ struct zpool *zpool_create_pool(const char *type, const char *name, gfp_t gfp,
 
 	zpool->driver = driver;
 	zpool->pool = driver->create(name, gfp, ops, zpool);
-	zpool->ops = ops;
-	zpool->evictable = driver->shrink && ops && ops->evict;
-	zpool->can_sleep_mapped = driver->sleep_mapped;
 
 	if (!zpool->pool) {
 		pr_err("couldn't create %s pool\n", type);
@@ -380,7 +374,7 @@ u64 zpool_get_total_size(struct zpool *zpool)
  */
 bool zpool_evictable(struct zpool *zpool)
 {
-	return zpool->evictable;
+	return zpool->driver->shrink;
 }
 
 /**
@@ -398,7 +392,7 @@ bool zpool_evictable(struct zpool *zpool)
  */
 bool zpool_can_sleep_mapped(struct zpool *zpool)
 {
-	return zpool->can_sleep_mapped;
+	return zpool->driver->sleep_mapped;
 }
 
 MODULE_LICENSE("GPL");

From patchwork Mon Nov 28 19:16:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 13057892
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v7 3/6] zsmalloc: Consolidate zs_pool's migrate_lock and
 size_class's locks
Date: Mon, 28 Nov 2022 11:16:12 -0800
Message-Id: <20221128191616.1261026-4-nphamcs@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221128191616.1261026-1-nphamcs@gmail.com>
References: <20221128191616.1261026-1-nphamcs@gmail.com>

Currently, zsmalloc has a hierarchy of locks, which includes a
pool-level migrate_lock, and a lock for each size class. We have to
obtain both locks in the hotpath in most cases anyway, except for
zs_malloc. This exception will no longer exist when we introduce an LRU
into the zs_pool for the new writeback functionality - we will need to
obtain a pool-level lock to synchronize LRU handling even in zs_malloc.
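
To make the point about zs_malloc concrete, here is a minimal user-space
sketch (not zsmalloc code) of a pool whose allocation path has to update
both per-class state and a pool-wide LRU. Once the LRU exists, every path
needs the pool-level lock, so a separate per-class lock buys little. The
names toy_pool, toy_class, toy_pool_init, toy_alloc and toy_free are
hypothetical, and a pthread mutex stands in for the kernel spinlock.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_NR_CLASSES 4	/* hypothetical number of size classes */

/* Per-class state; note that it carries no lock of its own. */
struct toy_class {
	int nr_objs;
};

struct toy_pool {
	pthread_mutex_t lock;	/* single pool-level lock */
	struct toy_class classes[TOY_NR_CLASSES];
	int lru_len;		/* stand-in for a pool-wide LRU list */
};

void toy_pool_init(struct toy_pool *pool)
{
	int i;

	pthread_mutex_init(&pool->lock, NULL);
	for (i = 0; i < TOY_NR_CLASSES; i++)
		pool->classes[i].nr_objs = 0;
	pool->lru_len = 0;
}

/* Allocation updates the class state and the LRU under the same lock. */
void toy_alloc(struct toy_pool *pool, int class_idx)
{
	assert(class_idx >= 0 && class_idx < TOY_NR_CLASSES);

	pthread_mutex_lock(&pool->lock);
	pool->classes[class_idx].nr_objs++;
	pool->lru_len++;
	pthread_mutex_unlock(&pool->lock);
}

/* Freeing takes the same lock; returns 0 on success, -1 if the class is empty. */
int toy_free(struct toy_pool *pool, int class_idx)
{
	int ret = -1;

	assert(class_idx >= 0 && class_idx < TOY_NR_CLASSES);

	pthread_mutex_lock(&pool->lock);
	if (pool->classes[class_idx].nr_objs > 0) {
		pool->classes[class_idx].nr_objs--;
		pool->lru_len--;
		ret = 0;
	}
	pthread_mutex_unlock(&pool->lock);
	return ret;
}
```

In this sketch a caller initializes the pool once with toy_pool_init() and
then calls toy_alloc() and toy_free() from any thread; there is no lock
ordering to remember because only one lock exists.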

In preparation for zsmalloc writeback, consolidate these locks into a
single pool-level lock, which drastically reduces the complexity of
synchronization in zsmalloc.

We have also benchmarked the lock consolidation to see the performance
effect of this change on zram.

First, we ran a synthetic FS workload on a server machine with 36 cores
(same machine for all runs), using

fs_mark -d ../zram1mnt -s 100000 -n 2500 -t 32 -k

before and after for btrfs and ext4 on zram (FS usage is 80%).

Here is the result (unit is files/second):

With lock consolidation (btrfs):
Average: 13520.2, Median: 13531.0, Stddev: 137.5961482019028

Without lock consolidation (btrfs):
Average: 13487.2, Median: 13575.0, Stddev: 309.08283679298665

With lock consolidation (ext4):
Average: 16824.4, Median: 16839.0, Stddev: 89.97388510006668

Without lock consolidation (ext4):
Average: 16958.0, Median: 16986.0, Stddev: 194.7370021336469

As you can see, comparing medians, we observe a 0.3% regression for
btrfs, and a 0.9% regression for ext4. This is a small, barely
measurable difference in my opinion.

For a more realistic scenario, we also tried building the kernel on zram.
Here is the time it takes (in seconds):

With lock consolidation (btrfs):
real
Average: 319.6, Median: 320.0, Stddev: 0.8944271909999159
user
Average: 6894.2, Median: 6895.0, Stddev: 25.528415540334656
sys
Average: 521.4, Median: 522.0, Stddev: 1.51657508881031

Without lock consolidation (btrfs):
real
Average: 319.8, Median: 320.0, Stddev: 0.8366600265340756
user
Average: 6896.6, Median: 6899.0, Stddev: 16.04057355583023
sys
Average: 520.6, Median: 521.0, Stddev: 1.140175425099138

With lock consolidation (ext4):
real
Average: 320.0, Median: 319.0, Stddev: 1.4142135623730951
user
Average: 6896.8, Median: 6878.0, Stddev: 28.621670111997307
sys
Average: 521.2, Median: 521.0, Stddev: 1.7888543819998317

Without lock consolidation (ext4):
real
Average: 319.6, Median: 319.0, Stddev: 0.8944271909999159
user
Average: 6886.2, Median: 6887.0, Stddev: 16.93221781102523
sys
Average: 520.4, Median: 520.0, Stddev: 1.140175425099138

The difference is entirely within the noise of a typical run on zram.
This hardly justifies the complexity of maintaining both the pool lock
and the class lock. In fact, for writeback, we would need to introduce
yet another lock to prevent data races on the pool's LRU, further
complicating the lock handling logic. IMHO, it is just better to
collapse all of these into a single pool-level lock.
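
As a cross-check on the fs_mark percentages, the relative change can be
recomputed from the reported figures. The sketch below is only an
illustration; the helper names rel_change_pct, btrfs_median_change and
ext4_median_change are made up for this note, and the inputs are the
medians quoted in the fs_mark results.

```c
#include <assert.h>

/*
 * Relative change of the "with consolidation" figure against the
 * "without consolidation" baseline, in percent. For throughput
 * numbers a negative result is a regression.
 */
double rel_change_pct(double with, double without)
{
	assert(without != 0.0);
	return (with - without) / without * 100.0;
}

/* fs_mark medians (files/second) copied from the btrfs results */
double btrfs_median_change(void)
{
	return rel_change_pct(13531.0, 13575.0);	/* about -0.32% */
}

/* fs_mark medians (files/second) copied from the ext4 results */
double ext4_median_change(void)
{
	return rel_change_pct(16839.0, 16986.0);	/* about -0.87% */
}
```

The medians give roughly -0.3% for btrfs and -0.9% for ext4, which
matches the quoted regressions. Using the averages instead gives about
+0.2% for btrfs and -0.8% for ext4, so the btrfs difference changes sign
depending on the statistic, which is consistent with it being noise.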
Suggested-by: Johannes Weiner Signed-off-by: Nhat Pham Acked-by: Minchan Kim Acked-by: Johannes Weiner Reviewed-by: Sergey Senozhatsky --- mm/zsmalloc.c | 87 ++++++++++++++++++++++----------------------------- 1 file changed, 37 insertions(+), 50 deletions(-) -- 2.30.2 diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 78feda34ad9a..5427a00a0518 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -33,8 +33,7 @@ /* * lock ordering: * page_lock - * pool->migrate_lock - * class->lock + * pool->lock * zspage->lock */ @@ -192,7 +191,6 @@ static const int fullness_threshold_frac = 4; static size_t huge_class_size; struct size_class { - spinlock_t lock; struct list_head fullness_list[NR_ZS_FULLNESS]; /* * Size of objects stored in this class. Must be multiple @@ -247,8 +245,7 @@ struct zs_pool { #ifdef CONFIG_COMPACTION struct work_struct free_work; #endif - /* protect page/zspage migration */ - rwlock_t migrate_lock; + spinlock_t lock; }; struct zspage { @@ -355,7 +352,7 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage) kmem_cache_free(pool->zspage_cachep, zspage); } -/* class->lock(which owns the handle) synchronizes races */ +/* pool->lock(which owns the handle) synchronizes races */ static void record_obj(unsigned long handle, unsigned long obj) { *(unsigned long *)handle = obj; @@ -452,7 +449,7 @@ static __maybe_unused int is_first_page(struct page *page) return PagePrivate(page); } -/* Protected by class->lock */ +/* Protected by pool->lock */ static inline int get_zspage_inuse(struct zspage *zspage) { return zspage->inuse; @@ -597,13 +594,13 @@ static int zs_stats_size_show(struct seq_file *s, void *v) if (class->index != i) continue; - spin_lock(&class->lock); + spin_lock(&pool->lock); class_almost_full = zs_stat_get(class, CLASS_ALMOST_FULL); class_almost_empty = zs_stat_get(class, CLASS_ALMOST_EMPTY); obj_allocated = zs_stat_get(class, OBJ_ALLOCATED); obj_used = zs_stat_get(class, OBJ_USED); freeable = zs_can_compact(class); - 
spin_unlock(&class->lock); + spin_unlock(&pool->lock); objs_per_zspage = class->objs_per_zspage; pages_used = obj_allocated / objs_per_zspage * @@ -916,7 +913,7 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class, get_zspage_mapping(zspage, &class_idx, &fg); - assert_spin_locked(&class->lock); + assert_spin_locked(&pool->lock); VM_BUG_ON(get_zspage_inuse(zspage)); VM_BUG_ON(fg != ZS_EMPTY); @@ -1268,19 +1265,19 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle, BUG_ON(in_interrupt()); /* It guarantees it can get zspage from handle safely */ - read_lock(&pool->migrate_lock); + spin_lock(&pool->lock); obj = handle_to_obj(handle); obj_to_location(obj, &page, &obj_idx); zspage = get_zspage(page); /* - * migration cannot move any zpages in this zspage. Here, class->lock + * migration cannot move any zpages in this zspage. Here, pool->lock * is too heavy since callers would take some time until they calls * zs_unmap_object API so delegate the locking from class to zspage * which is smaller granularity. 
*/ migrate_read_lock(zspage); - read_unlock(&pool->migrate_lock); + spin_unlock(&pool->lock); class = zspage_class(pool, zspage); off = (class->size * obj_idx) & ~PAGE_MASK; @@ -1433,8 +1430,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) size += ZS_HANDLE_SIZE; class = pool->size_class[get_size_class_index(size)]; - /* class->lock effectively protects the zpage migration */ - spin_lock(&class->lock); + /* pool->lock effectively protects the zpage migration */ + spin_lock(&pool->lock); zspage = find_get_zspage(class); if (likely(zspage)) { obj = obj_malloc(pool, zspage, handle); @@ -1442,12 +1439,12 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) fix_fullness_group(class, zspage); record_obj(handle, obj); class_stat_inc(class, OBJ_USED, 1); - spin_unlock(&class->lock); + spin_unlock(&pool->lock); return handle; } - spin_unlock(&class->lock); + spin_unlock(&pool->lock); zspage = alloc_zspage(pool, class, gfp); if (!zspage) { @@ -1455,7 +1452,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) return (unsigned long)ERR_PTR(-ENOMEM); } - spin_lock(&class->lock); + spin_lock(&pool->lock); obj = obj_malloc(pool, zspage, handle); newfg = get_fullness_group(class, zspage); insert_zspage(class, zspage, newfg); @@ -1468,7 +1465,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) /* We completely set up zspage so mark them as movable */ SetZsPageMovable(pool, zspage); - spin_unlock(&class->lock); + spin_unlock(&pool->lock); return handle; } @@ -1512,16 +1509,14 @@ void zs_free(struct zs_pool *pool, unsigned long handle) return; /* - * The pool->migrate_lock protects the race with zpage's migration + * The pool->lock protects the race with zpage's migration * so it's safe to get the page from handle. 
*/ - read_lock(&pool->migrate_lock); + spin_lock(&pool->lock); obj = handle_to_obj(handle); obj_to_page(obj, &f_page); zspage = get_zspage(f_page); class = zspage_class(pool, zspage); - spin_lock(&class->lock); - read_unlock(&pool->migrate_lock); obj_free(class->size, obj); class_stat_dec(class, OBJ_USED, 1); @@ -1531,7 +1526,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle) free_zspage(pool, class, zspage); out: - spin_unlock(&class->lock); + spin_unlock(&pool->lock); cache_free_handle(pool, handle); } EXPORT_SYMBOL_GPL(zs_free); @@ -1888,16 +1883,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page, pool = zspage->pool; /* - * The pool migrate_lock protects the race between zpage migration + * The pool's lock protects the race between zpage migration * and zs_free. */ - write_lock(&pool->migrate_lock); + spin_lock(&pool->lock); class = zspage_class(pool, zspage); - /* - * the class lock protects zpage alloc/free in the zspage. - */ - spin_lock(&class->lock); /* the migrate_write_lock protects zpage access via zs_map_object */ migrate_write_lock(zspage); @@ -1927,10 +1918,9 @@ static int zs_page_migrate(struct page *newpage, struct page *page, replace_sub_page(class, zspage, newpage, page); /* * Since we complete the data copy and set up new zspage structure, - * it's okay to release migration_lock. + * it's okay to release the pool's lock. 
*/ - write_unlock(&pool->migrate_lock); - spin_unlock(&class->lock); + spin_unlock(&pool->lock); dec_zspage_isolation(zspage); migrate_write_unlock(zspage); @@ -1985,9 +1975,9 @@ static void async_free_zspage(struct work_struct *work) if (class->index != i) continue; - spin_lock(&class->lock); + spin_lock(&pool->lock); list_splice_init(&class->fullness_list[ZS_EMPTY], &free_pages); - spin_unlock(&class->lock); + spin_unlock(&pool->lock); } list_for_each_entry_safe(zspage, tmp, &free_pages, list) { @@ -1997,9 +1987,9 @@ static void async_free_zspage(struct work_struct *work) get_zspage_mapping(zspage, &class_idx, &fullness); VM_BUG_ON(fullness != ZS_EMPTY); class = pool->size_class[class_idx]; - spin_lock(&class->lock); + spin_lock(&pool->lock); __free_zspage(pool, class, zspage); - spin_unlock(&class->lock); + spin_unlock(&pool->lock); } }; @@ -2060,10 +2050,11 @@ static unsigned long __zs_compact(struct zs_pool *pool, struct zspage *dst_zspage = NULL; unsigned long pages_freed = 0; - /* protect the race between zpage migration and zs_free */ - write_lock(&pool->migrate_lock); - /* protect zpage allocation/free */ - spin_lock(&class->lock); + /* + * protect the race between zpage migration and zs_free + * as well as zpage allocation/free + */ + spin_lock(&pool->lock); while ((src_zspage = isolate_zspage(class, true))) { /* protect someone accessing the zspage(i.e., zs_map_object) */ migrate_write_lock(src_zspage); @@ -2088,7 +2079,7 @@ static unsigned long __zs_compact(struct zs_pool *pool, putback_zspage(class, dst_zspage); migrate_write_unlock(dst_zspage); dst_zspage = NULL; - if (rwlock_is_contended(&pool->migrate_lock)) + if (spin_is_contended(&pool->lock)) break; } @@ -2105,11 +2096,9 @@ static unsigned long __zs_compact(struct zs_pool *pool, pages_freed += class->pages_per_zspage; } else migrate_write_unlock(src_zspage); - spin_unlock(&class->lock); - write_unlock(&pool->migrate_lock); + spin_unlock(&pool->lock); cond_resched(); - 
write_lock(&pool->migrate_lock); - spin_lock(&class->lock); + spin_lock(&pool->lock); } if (src_zspage) { @@ -2117,8 +2106,7 @@ static unsigned long __zs_compact(struct zs_pool *pool, migrate_write_unlock(src_zspage); } - spin_unlock(&class->lock); - write_unlock(&pool->migrate_lock); + spin_unlock(&pool->lock); return pages_freed; } @@ -2221,7 +2209,7 @@ struct zs_pool *zs_create_pool(const char *name) return NULL; init_deferred_free(pool); - rwlock_init(&pool->migrate_lock); + spin_lock_init(&pool->lock); pool->name = kstrdup(name, GFP_KERNEL); if (!pool->name) @@ -2292,7 +2280,6 @@ struct zs_pool *zs_create_pool(const char *name) class->index = i; class->pages_per_zspage = pages_per_zspage; class->objs_per_zspage = objs_per_zspage; - spin_lock_init(&class->lock); pool->size_class[i] = class; for (fullness = ZS_EMPTY; fullness < NR_ZS_FULLNESS; fullness++) From patchwork Mon Nov 28 19:16:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nhat Pham X-Patchwork-Id: 13057893 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE03AC4332F for ; Mon, 28 Nov 2022 19:16:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AB4576B007D; Mon, 28 Nov 2022 14:16:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9EDDA6B007E; Mon, 28 Nov 2022 14:16:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 890226B0080; Mon, 28 Nov 2022 14:16:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 78E3A6B007D for ; Mon, 28 Nov 2022 14:16:24 -0500 (EST) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) 
by unirelay06.hostedemail.com (Postfix) with ESMTP id 485D6AAA30 for ; Mon, 28 Nov 2022 19:16:24 +0000 (UTC) X-FDA: 80183807088.28.13F67B5 Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) by imf03.hostedemail.com (Postfix) with ESMTP id CB1692001B for ; Mon, 28 Nov 2022 19:16:23 +0000 (UTC) Received: by mail-pf1-f172.google.com with SMTP id o1so6914229pfp.12 for ; Mon, 28 Nov 2022 11:16:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=aHuTThRVlbG3opW3Eui02soyfeJZjUm1L8jiWi26BCg=; b=C1njT9ct1VEYu1a7Som+DHiPAWCnbNJXRNIJ0x1OklwTE76JeUyTP2yaMXnFmywfH4 WuWw8qqctMsDQtHmLfNOBUU86VVgOuFkPcDatoYete6FUZ2pGju05ox7BnhKuLBOuLCZ XLvJ+3eQpnSODgFAYSB+hMu1Aid/ShlAzqtk0dgyFNUac/dUdoFgZw2iIRtXQB0BOBsC ECpRgFsAC/0ZP0R1CHybZI8+e76M58NrWLNrYzB2dvIRN8IT+sxV4fZjs8A+Eu2cM4l5 TO/JWLXj6SKTpgP6jedFUG1at6oSCDn680+q5y7aLWPOWN1aKsp0LGtBqQoh4oNIX+4w RWyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=aHuTThRVlbG3opW3Eui02soyfeJZjUm1L8jiWi26BCg=; b=ygs0wFREpD/kT4aZe5pMRDpl2t2NTNI8Kt+8pv2oD7tF61F/bch4/YVL0kBqBFaDB+ nTNwebvhG+3Hysvxjxtn0+BFuJRmIi0XswCg0g/eSIQ4ZvwPzdlWnR313pTdufT1Vgnq E9Azp+6VH49WE/KoqXxHzI9EBKv1lTuym9SGPfVDzRwx0MvdTRRH/MyCNshW1xLL4lhJ HAE8lqPgNWjAS77OC4d7xRFquDwjKxak8qwDEWF0PFUZOdiVOK2TB/LeT1xyRORnFR5F OOo78xTbQDdIDM5cDNswULqEuql9RlavYQ7IcWfjylcc/sFqupmwOx14ZhhEmWqJnQ94 hx9g== X-Gm-Message-State: ANoB5pm2K+qGGlflXLhOhzouBI/rLCvclW0NNBBMNBzTvP2n7/Ujd/S1 5CCgHeUrGA+I7N/6i3uDS8I= X-Google-Smtp-Source: AA0mqf4RsNu9K1vn8qwUjdZ6i2kL/AxZ7ZpWTB9LPwc2WGsc/bsG43pHTovayPvnSUlNKGvmMgmFsA== X-Received: by 2002:a63:a61:0:b0:478:2d2c:6e82 with SMTP id 
z33-20020a630a61000000b004782d2c6e82mr3001168pgk.136.1669662982851; Mon, 28 Nov 2022 11:16:22 -0800 (PST) Received: from localhost (fwdproxy-prn-018.fbsv.net. [2a03:2880:ff:12::face:b00c]) by smtp.gmail.com with ESMTPSA id 4-20020a630d44000000b00439c6a4e1ccsm7006068pgn.62.2022.11.28.11.16.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 11:16:22 -0800 (PST) From: Nhat Pham To: akpm@linux-foundation.org Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com Subject: [PATCH v7 4/6] zsmalloc: Add a LRU to zs_pool to keep track of zspages in LRU order Date: Mon, 28 Nov 2022 11:16:13 -0800 Message-Id: <20221128191616.1261026-5-nphamcs@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221128191616.1261026-1-nphamcs@gmail.com> References: <20221128191616.1261026-1-nphamcs@gmail.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669662983; a=rsa-sha256; cv=none; b=Kf2r9yYAa+uHmmrTKtqV5FRvhl1a55MpTUwmpAy7kH87nFIr/JzRbwvweyGsuWTI6mMWDA Ic50cjJm9eLPAqBhly/6j36yf91d6aSCOmNapj5CprmHPFEYrmXo7FyHCdQuqFY6fK/sxW 4gsv6YC78n5TN73zDVvMl/o82M1TjM8= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=C1njT9ct; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf03.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.210.172 as permitted sender) smtp.mailfrom=nphamcs@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669662983; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=aHuTThRVlbG3opW3Eui02soyfeJZjUm1L8jiWi26BCg=; 
b=fdHFS3pcnuH6LA8uY7KuS6KlK9O0eeOWY9KBrQA3yAWwREe7RGaSDC5md97RLwUU1X7Eqg ziNZq182oTIXphBSPsFFSWniXxUrKIUuubrSw0j5GgSQHaXAeYKV20499OKAk7E/ocNtRJ 5TJblf4oZteibNI4MlWQ204pluRQdZk= X-Rspamd-Queue-Id: CB1692001B Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=C1njT9ct; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf03.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.210.172 as permitted sender) smtp.mailfrom=nphamcs@gmail.com X-Rspamd-Server: rspam12 X-Rspam-User: X-Stat-Signature: c5iezfy5nkg3oh7krrra4yy5jpjzkphe X-HE-Tag: 1669662983-946606 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This helps determine the coldest zspages as candidates for writeback. Signed-off-by: Nhat Pham Acked-by: Johannes Weiner Reviewed-by: Sergey Senozhatsky --- mm/zsmalloc.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) -- 2.30.2 diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 5427a00a0518..b1bc231d94a3 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -239,6 +239,11 @@ struct zs_pool { /* Compact classes */ struct shrinker shrinker; +#ifdef CONFIG_ZPOOL + /* List tracking the zspages in LRU order by most recently added object */ + struct list_head lru; +#endif + #ifdef CONFIG_ZSMALLOC_STAT struct dentry *stat_dentry; #endif @@ -260,6 +265,12 @@ struct zspage { unsigned int freeobj; struct page *first_page; struct list_head list; /* fullness list */ + +#ifdef CONFIG_ZPOOL + /* links the zspage to the lru list in the pool */ + struct list_head lru; +#endif + struct zs_pool *pool; #ifdef CONFIG_COMPACTION rwlock_t lock; @@ -953,6 +964,9 @@ static void free_zspage(struct zs_pool *pool, struct size_class *class, } remove_zspage(class, zspage, ZS_EMPTY); +#ifdef CONFIG_ZPOOL + list_del(&zspage->lru); +#endif __free_zspage(pool, class, zspage); } @@ -998,6 
+1012,10 @@ static void init_zspage(struct size_class *class, struct zspage *zspage) off %= PAGE_SIZE; } +#ifdef CONFIG_ZPOOL + INIT_LIST_HEAD(&zspage->lru); +#endif + set_freeobj(zspage, 0); } @@ -1270,6 +1288,31 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle, obj_to_location(obj, &page, &obj_idx); zspage = get_zspage(page); +#ifdef CONFIG_ZPOOL + /* + * Move the zspage to front of pool's LRU. + * + * Note that this is swap-specific, so by definition there are no ongoing + * accesses to the memory while the page is swapped out that would make + * it "hot". A new entry is hot, then ages to the tail until it gets either + * written back or swaps back in. + * + * Furthermore, map is also called during writeback. We must not put an + * isolated page on the LRU mid-reclaim. + * + * As a result, only update the LRU when the page is mapped for write + * when it's first instantiated. + * + * This is a deviation from the other backends, which perform this update + * in the allocation function (zbud_alloc, z3fold_alloc). + */ + if (mm == ZS_MM_WO) { + if (!list_empty(&zspage->lru)) + list_del(&zspage->lru); + list_add(&zspage->lru, &pool->lru); + } +#endif + /* * migration cannot move any zpages in this zspage. 
Here, pool->lock * is too heavy since callers would take some time until they calls @@ -1988,6 +2031,9 @@ static void async_free_zspage(struct work_struct *work) VM_BUG_ON(fullness != ZS_EMPTY); class = pool->size_class[class_idx]; spin_lock(&pool->lock); +#ifdef CONFIG_ZPOOL + list_del(&zspage->lru); +#endif __free_zspage(pool, class, zspage); spin_unlock(&pool->lock); } @@ -2299,6 +2345,10 @@ struct zs_pool *zs_create_pool(const char *name) */ zs_register_shrinker(pool); +#ifdef CONFIG_ZPOOL + INIT_LIST_HEAD(&pool->lru); +#endif + return pool; err: From patchwork Mon Nov 28 19:16:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nhat Pham X-Patchwork-Id: 13057894 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33064C4321E for ; Mon, 28 Nov 2022 19:16:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 50FE96B007E; Mon, 28 Nov 2022 14:16:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 421106B0080; Mon, 28 Nov 2022 14:16:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 275316B0081; Mon, 28 Nov 2022 14:16:26 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 045526B007E for ; Mon, 28 Nov 2022 14:16:26 -0500 (EST) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 941641C5DB4 for ; Mon, 28 Nov 2022 19:16:25 +0000 (UTC) X-FDA: 80183807130.10.5C77462 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) by imf26.hostedemail.com (Postfix) with ESMTP id 0C719140013 for ; Mon, 28 Nov 2022 19:16:24 
+0000 (UTC) Received: by mail-pj1-f41.google.com with SMTP id v3-20020a17090ac90300b00218441ac0f6so13110712pjt.0 for ; Mon, 28 Nov 2022 11:16:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6+uMQ1ItOkPZCYgWOP1o+0fNZIWxA74Wiz+rf0diPIk=; b=mOBGBJZ7OABTvFUk36b1YD2VXURC5VWzKV8eesthXAyoyPSB+PXrQg60yl9zNo2Fe6 5R+vCBcdHWDJSwYwRE8bxkcD1HE+J+n8xVi1PJ2GN4Tkc+LJGg0hvIho2uUOnWhBSEd3 iwd7sUGiLgOAdvU/hkrLPYXFWbiaYOwuNvbAcrYKaT5WbuY6OYM6lV4nWrFQvpOvNADG Cdea8r/LMopjvcb4eJQ8XeTvuV6rjbOX0Adyb5fWTY8Jo3itY02O4XcLSch8reL4He38 lwFd26FA3CgiMjzo5fhii63nJUqb4QQHeUw50avhWQrbPMn+MO7Rq6JAeWvim3PXQTgr N5ug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6+uMQ1ItOkPZCYgWOP1o+0fNZIWxA74Wiz+rf0diPIk=; b=iCoR+zvZwOPguog94rHlbc1c3yEg0LAhDDjUngsZUhIxKhUYuQZnovC7SgXFkRzU+Q uPg9ccZaIsBBIfOBu3MCD3ApWYje7JzOzwCmFYiTWI/2795N9mfdYCP/XVcT9S98P3KT uZVrlWTiSJeyyT9mlvG2EkWzk4yjk1xPxKzQq2X3TggaAFZlqGaWoxBu0C+W2Q1Ie+iv 6IrUvuZ32roUuJPqq1T7R4b3zld4Jk/zBB2qRNq6SkViSaWXU8q2BkMtCYfvxyAkIRaj Szn+gh8jyRPL8wwlkn/zLMH3nh5hLrjHgkjr7elr/c7llPvUN1GzqGSiRS3zreOiHo+X pGAA== X-Gm-Message-State: ANoB5pkCySx9uMsWJTh4XmZZz2SSyD9GevU8VOUvdZT8t7T+F9+Wx3eI W5Ak+LNTNZMc2RT5BP2ixxs= X-Google-Smtp-Source: AA0mqf7BhiiwUiOLsHG8xP4tWbnATiNkWArLtdQfxMZhqLlRBpCQEXvsOymAXRXARI6nxTmj42BqcA== X-Received: by 2002:a17:902:b493:b0:176:a6fb:801a with SMTP id y19-20020a170902b49300b00176a6fb801amr33820008plr.97.1669662984133; Mon, 28 Nov 2022 11:16:24 -0800 (PST) Received: from localhost (fwdproxy-prn-009.fbsv.net. 
[2a03:2880:ff:9::face:b00c]) by smtp.gmail.com with ESMTPSA id w28-20020aa79a1c000000b005754106e364sm1900230pfj.199.2022.11.28.11.16.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 11:16:23 -0800 (PST) From: Nhat Pham To: akpm@linux-foundation.org Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com Subject: [PATCH v7 5/6] zsmalloc: Add zpool_ops field to zs_pool to store evict handlers Date: Mon, 28 Nov 2022 11:16:14 -0800 Message-Id: <20221128191616.1261026-6-nphamcs@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221128191616.1261026-1-nphamcs@gmail.com> References: <20221128191616.1261026-1-nphamcs@gmail.com> MIME-Version: 1.0 ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=mOBGBJZ7; spf=pass (imf26.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.216.41 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669662985; a=rsa-sha256; cv=none; b=EYihr1rGFB4T0/sILCjVsMfs/06Gpe4lvBG1d1sxkJCL/5plGCnANjGsnI0bICcY/s/z2L UBhvOQtRT/+H+CqYoqMgHUele0HlohQBp02YivrnnsPiT+xpHyR2aWqm+U4JQtIwTtdZ1x KNtG/uDOR8QxrVBQESvWxRHx3vo6yMs= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669662985; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6+uMQ1ItOkPZCYgWOP1o+0fNZIWxA74Wiz+rf0diPIk=; b=AYsVHlMXQsLvonS3lRIMaB5HZA0pQQ8iKzsjqohpy3LT2Ei5MoQACqCxGb2K6/uRaM2wcV Hz5yG7TyRfkmHRsp+UDnCwmWfZbf/0eGZS6Q+etSc4ZSetNtFl+fcqEmQBn9ZEJXyggTcp JApEKXiZcK4PONm7VmK5jozmGHlKni4= 
Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=mOBGBJZ7; spf=pass (imf26.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.216.41 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 0C719140013 X-Stat-Signature: 5ihx4nkq33p3w8efkkuaat7wbxywwsce X-Rspam-User: X-HE-Tag: 1669662984-468632 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This adds a new field to zs_pool to store evict handlers for writeback, analogous to the zbud allocator. Signed-off-by: Nhat Pham Acked-by: Minchan Kim Acked-by: Johannes Weiner Reviewed-by: Sergey Senozhatsky --- mm/zsmalloc.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) -- 2.30.2 diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index b1bc231d94a3..d06f9150b9da 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -242,6 +242,8 @@ struct zs_pool { #ifdef CONFIG_ZPOOL /* List tracking the zspages in LRU order by most recently added object */ struct list_head lru; + struct zpool *zpool; + const struct zpool_ops *zpool_ops; #endif #ifdef CONFIG_ZSMALLOC_STAT @@ -382,7 +384,14 @@ static void *zs_zpool_create(const char *name, gfp_t gfp, * different contexts and its caller must provide a valid * gfp mask. 
*/ - return zs_create_pool(name); + struct zs_pool *pool = zs_create_pool(name); + + if (pool) { + pool->zpool = zpool; + pool->zpool_ops = zpool_ops; + } + + return pool; } static void zs_zpool_destroy(void *pool) From patchwork Mon Nov 28 19:16:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nhat Pham X-Patchwork-Id: 13057895 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 757ACC4167D for ; Mon, 28 Nov 2022 19:16:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8B89D6B0080; Mon, 28 Nov 2022 14:16:27 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7A6368E0001; Mon, 28 Nov 2022 14:16:27 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 559FF6B0082; Mon, 28 Nov 2022 14:16:27 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 3F7766B0080 for ; Mon, 28 Nov 2022 14:16:27 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id D998012069F for ; Mon, 28 Nov 2022 19:16:26 +0000 (UTC) X-FDA: 80183807172.22.011432D Received: from mail-pg1-f182.google.com (mail-pg1-f182.google.com [209.85.215.182]) by imf20.hostedemail.com (Postfix) with ESMTP id 5B88A1C0017 for ; Mon, 28 Nov 2022 19:16:26 +0000 (UTC) Received: by mail-pg1-f182.google.com with SMTP id r18so10785702pgr.12 for ; Mon, 28 Nov 2022 11:16:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date 
:message-id:reply-to; bh=a4gi1oG2jocLUE4hkXZ/M31m4aC6HWi7g5/rCEC+57M=; b=FGM0jY3WNm+ApgBJGMPpYq7TVSQbR2kUx0BMOXpRfUckWD9UKFecYNRwtU3znkUFd+ jgovPgGYP96PGP/C0sUG/6ptMDPQ6wfR/RicLLaJLSWkJfQ4B4FogNMhx/lUzMeKcqlZ Q851rePp9nEIoe79Ioro4je17Z2PG3pVQ19e0W7I9o+YDgndV7OXTg0+sYt2aD23UjPA lFtuuA/a8a3ascu2MNPfRoS8pSQi16a7yQVIK9uxbvloeC5/vzObjTcNux+wf67KO5NC FYT2vIQ2f4UM9DFOGbEj1dPMcSK7a9kDtJQalCuOGE5rT1jNU2X5/mADTUi36NZqTfwS Pg0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=a4gi1oG2jocLUE4hkXZ/M31m4aC6HWi7g5/rCEC+57M=; b=zioH+L+cs4wCHXr6rHAcNOt0HH4kNHJEzDj8H7X5ebgYzdW9DnMx28aD2IehEX5ZYV K/xNTqj9hDEjav6H/2oDRQFKXh7z3toGVitnG07lueobvwK8c2y2AtDfRaPdcFwjqp1N 992DCZM7lD4vI2e+XXJDq/m/2LaF4dOE7zpjoeWlZZy03//oGwFQ5nDH5IvDHxmaqqK2 w1MTpn5qdfTHQ/KGKr2WbOOtRoOtI05jvbxQm0YjLFYQ1ojKtOC22zS56fJZqreaUBpz Fbk50jnCbfDlZbcWVG87w9pCmV31UDr+deAExDrj9kzfpAaISJuRJlvhz0cpVTUvFufd p1Vg== X-Gm-Message-State: ANoB5pnovoSJgbk084cEqjchm5gg50UsA08z3VTesafQy7c1Xa8+9LUN DMXBhaPfKk/wTRNsNEohtc8= X-Google-Smtp-Source: AA0mqf4k+8ejpzi+51JV8sGSjHrlAuQjlPDP9ehqDk2GJ9wAQScXid4TygsdVRkzUkcpqiUJDtCOWg== X-Received: by 2002:aa7:8595:0:b0:574:3ccd:a468 with SMTP id w21-20020aa78595000000b005743ccda468mr28783695pfn.61.1669662985347; Mon, 28 Nov 2022 11:16:25 -0800 (PST) Received: from localhost (fwdproxy-prn-012.fbsv.net. 
[2a03:2880:ff:c::face:b00c]) by smtp.gmail.com with ESMTPSA id u143-20020a627995000000b0056cee8af3a5sm8436415pfc.29.2022.11.28.11.16.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 11:16:24 -0800 (PST) From: Nhat Pham To: akpm@linux-foundation.org Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com Subject: [PATCH v7 6/6] zsmalloc: Implement writeback mechanism for zsmalloc Date: Mon, 28 Nov 2022 11:16:15 -0800 Message-Id: <20221128191616.1261026-7-nphamcs@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221128191616.1261026-1-nphamcs@gmail.com> References: <20221128191616.1261026-1-nphamcs@gmail.com> MIME-Version: 1.0 ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=FGM0jY3W; spf=pass (imf20.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.215.182 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669662986; a=rsa-sha256; cv=none; b=LFb9/WcDlY5cpJcArdPc0Bg9KPKgZTgNduWb7eI+wShk3T8bkj0eAZJtEWxXsBsUPT8CMK uAWB16o+YtfhMCMa4Rsefza62qL7h+Ez3Lvb+iKUBAa6EDyD9L3a2fVLc6RAthtUE/niok oH4mK3yPs8+BHwwyETP+BTg+KbBaBLM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669662986; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=a4gi1oG2jocLUE4hkXZ/M31m4aC6HWi7g5/rCEC+57M=; b=5snF4I+1fI2es4+pBSQkPwMeapvRYjyP620I/zhEFSS4E9mYwEgo7UrF/gUP/bj0lde+Zg m1ZJWHUtmiMgKiTVGwUQne/jpHYpt+7K+HU4/fY4ebOIpMx6jBPIbQxxtIXJwn8r5FQq1P Z5X2cZODFHrxN1ZXPLwbjoLojJdj80k= Authentication-Results: 
imf20.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=FGM0jY3W; spf=pass (imf20.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.215.182 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com X-Rspamd-Server: rspam01 X-Stat-Signature: 3ineetq476uatkiia8a9q3tmaipghxsd X-Rspamd-Queue-Id: 5B88A1C0017 X-Rspam-User: X-HE-Tag: 1669662986-221553 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This commit adds the writeback mechanism for zsmalloc, analogous to the zbud allocator. Zsmalloc will attempt to determine the coldest zspage (i.e., the least recently used) in the pool, and attempt to write back all the stored compressed objects via the pool's evict handler. Signed-off-by: Nhat Pham Acked-by: Johannes Weiner Reviewed-by: Sergey Senozhatsky --- mm/zsmalloc.c | 194 +++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 183 insertions(+), 11 deletions(-) -- 2.30.2 diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index d06f9150b9da..9445bee6b014 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -271,12 +271,13 @@ struct zspage { #ifdef CONFIG_ZPOOL /* links the zspage to the lru list in the pool */ struct list_head lru; + bool under_reclaim; + /* list of unfreed handles whose objects have been reclaimed */ + unsigned long *deferred_handles; #endif struct zs_pool *pool; -#ifdef CONFIG_COMPACTION rwlock_t lock; -#endif }; struct mapping_area { @@ -297,10 +298,11 @@ static bool ZsHugePage(struct zspage *zspage) return zspage->huge; } -#ifdef CONFIG_COMPACTION static void migrate_lock_init(struct zspage *zspage); static void migrate_read_lock(struct zspage *zspage); static void migrate_read_unlock(struct zspage *zspage); + +#ifdef CONFIG_COMPACTION static void migrate_write_lock(struct zspage *zspage); static void migrate_write_lock_nested(struct zspage *zspage); static void 
migrate_write_unlock(struct zspage *zspage); @@ -308,9 +310,6 @@ static void kick_deferred_free(struct zs_pool *pool); static void init_deferred_free(struct zs_pool *pool); static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage); #else -static void migrate_lock_init(struct zspage *zspage) {} -static void migrate_read_lock(struct zspage *zspage) {} -static void migrate_read_unlock(struct zspage *zspage) {} static void migrate_write_lock(struct zspage *zspage) {} static void migrate_write_lock_nested(struct zspage *zspage) {} static void migrate_write_unlock(struct zspage *zspage) {} @@ -413,6 +412,27 @@ static void zs_zpool_free(void *pool, unsigned long handle) zs_free(pool, handle); } +static int zs_reclaim_page(struct zs_pool *pool, unsigned int retries); + +static int zs_zpool_shrink(void *pool, unsigned int pages, + unsigned int *reclaimed) +{ + unsigned int total = 0; + int ret = -EINVAL; + + while (total < pages) { + ret = zs_reclaim_page(pool, 8); + if (ret < 0) + break; + total++; + } + + if (reclaimed) + *reclaimed = total; + + return ret; +} + static void *zs_zpool_map(void *pool, unsigned long handle, enum zpool_mapmode mm) { @@ -451,6 +471,7 @@ static struct zpool_driver zs_zpool_driver = { .malloc_support_movable = true, .malloc = zs_zpool_malloc, .free = zs_zpool_free, + .shrink = zs_zpool_shrink, .map = zs_zpool_map, .unmap = zs_zpool_unmap, .total_size = zs_zpool_total_size, @@ -924,6 +945,25 @@ static int trylock_zspage(struct zspage *zspage) return 0; } +#ifdef CONFIG_ZPOOL +/* + * Free all the deferred handles whose objects are freed in zs_free. 
+ */ +static void free_handles(struct zs_pool *pool, struct zspage *zspage) +{ + unsigned long handle = (unsigned long)zspage->deferred_handles; + + while (handle) { + unsigned long nxt_handle = handle_to_obj(handle); + + cache_free_handle(pool, handle); + handle = nxt_handle; + } +} +#else +static inline void free_handles(struct zs_pool *pool, struct zspage *zspage) {} +#endif + static void __free_zspage(struct zs_pool *pool, struct size_class *class, struct zspage *zspage) { @@ -938,6 +978,9 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class, VM_BUG_ON(get_zspage_inuse(zspage)); VM_BUG_ON(fg != ZS_EMPTY); + /* Free all deferred handles from zs_free */ + free_handles(pool, zspage); + next = page = get_first_page(zspage); do { VM_BUG_ON_PAGE(!PageLocked(page), page); @@ -1023,6 +1066,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage) #ifdef CONFIG_ZPOOL INIT_LIST_HEAD(&zspage->lru); + zspage->under_reclaim = false; + zspage->deferred_handles = NULL; #endif set_freeobj(zspage, 0); @@ -1572,12 +1617,26 @@ void zs_free(struct zs_pool *pool, unsigned long handle) obj_free(class->size, obj); class_stat_dec(class, OBJ_USED, 1); + +#ifdef CONFIG_ZPOOL + if (zspage->under_reclaim) { + /* + * Reclaim needs the handles during writeback. It'll free + * them along with the zspage when it's done with them. + * + * Record current deferred handle at the memory location + * whose address is given by handle. 
+ */ + record_obj(handle, (unsigned long)zspage->deferred_handles); + zspage->deferred_handles = (unsigned long *)handle; + spin_unlock(&pool->lock); + return; + } +#endif fullness = fix_fullness_group(class, zspage); - if (fullness != ZS_EMPTY) - goto out; + if (fullness == ZS_EMPTY) + free_zspage(pool, class, zspage); - free_zspage(pool, class, zspage); -out: spin_unlock(&pool->lock); cache_free_handle(pool, handle); } @@ -1777,7 +1836,7 @@ static enum fullness_group putback_zspage(struct size_class *class, return fullness; } -#ifdef CONFIG_COMPACTION +#if defined(CONFIG_ZPOOL) || defined(CONFIG_COMPACTION) /* * To prevent zspage destroy during migration, zspage freeing should * hold locks of all pages in the zspage. @@ -1819,6 +1878,24 @@ static void lock_zspage(struct zspage *zspage) } migrate_read_unlock(zspage); } +#endif /* defined(CONFIG_ZPOOL) || defined(CONFIG_COMPACTION) */ + +#ifdef CONFIG_ZPOOL +/* + * Unlocks all the pages of the zspage. + * + * pool->lock must be held before this function is called + * to prevent the underlying pages from migrating. 
+ */ +static void unlock_zspage(struct zspage *zspage) +{ + struct page *page = get_first_page(zspage); + + do { + unlock_page(page); + } while ((page = get_next_page(page)) != NULL); +} +#endif /* CONFIG_ZPOOL */ static void migrate_lock_init(struct zspage *zspage) { @@ -1835,6 +1912,7 @@ static void migrate_read_unlock(struct zspage *zspage) __releases(&zspage->lock) read_unlock(&zspage->lock); } +#ifdef CONFIG_COMPACTION static void migrate_write_lock(struct zspage *zspage) { write_lock(&zspage->lock); @@ -2399,6 +2477,100 @@ void zs_destroy_pool(struct zs_pool *pool) } EXPORT_SYMBOL_GPL(zs_destroy_pool); +#ifdef CONFIG_ZPOOL +static int zs_reclaim_page(struct zs_pool *pool, unsigned int retries) +{ + int i, obj_idx, ret = 0; + unsigned long handle; + struct zspage *zspage; + struct page *page; + enum fullness_group fullness; + + /* Lock LRU and fullness list */ + spin_lock(&pool->lock); + if (list_empty(&pool->lru)) { + spin_unlock(&pool->lock); + return -EINVAL; + } + + for (i = 0; i < retries; i++) { + struct size_class *class; + + zspage = list_last_entry(&pool->lru, struct zspage, lru); + list_del(&zspage->lru); + + /* zs_free may free objects, but not the zspage and handles */ + zspage->under_reclaim = true; + + class = zspage_class(pool, zspage); + fullness = get_fullness_group(class, zspage); + + /* Lock out object allocations and object compaction */ + remove_zspage(class, zspage, fullness); + + spin_unlock(&pool->lock); + cond_resched(); + + /* Lock backing pages into place */ + lock_zspage(zspage); + + obj_idx = 0; + page = get_first_page(zspage); + while (1) { + handle = find_alloced_obj(class, page, &obj_idx); + if (!handle) { + page = get_next_page(page); + if (!page) + break; + obj_idx = 0; + continue; + } + + /* + * This will write the object and call zs_free. + * + * zs_free will free the object, but the + * under_reclaim flag prevents it from freeing + * the zspage altogether. 
This is necessary so + * that we can continue working with the + * zspage potentially after the last object + * has been freed. + */ + ret = pool->zpool_ops->evict(pool->zpool, handle); + if (ret) + goto next; + + obj_idx++; + } + +next: + /* For freeing the zspage, or putting it back in the pool and LRU list. */ + spin_lock(&pool->lock); + zspage->under_reclaim = false; + + if (!get_zspage_inuse(zspage)) { + /* + * Fullness went stale as zs_free() won't touch it + * while the page is removed from the pool. Fix it + * up for the check in __free_zspage(). + */ + zspage->fullness = ZS_EMPTY; + + __free_zspage(pool, class, zspage); + spin_unlock(&pool->lock); + return 0; + } + + putback_zspage(class, zspage); + list_add(&zspage->lru, &pool->lru); + unlock_zspage(zspage); + } + + spin_unlock(&pool->lock); + return -EAGAIN; +} +#endif /* CONFIG_ZPOOL */ + static int __init zs_init(void) { int ret;
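
The deferred_handles list in this patch needs no extra memory because it is threaded through the handles themselves: while a zspage is under reclaim, zs_free() uses record_obj() to store the previous list head in the handle's own slot and makes that handle the new head, and free_handles() later walks the chain with handle_to_obj(). The following is a minimal userspace sketch of that idea, not the kernel code. handle_t and every toy_* name are hypothetical stand-ins, malloc()/free() replace the kernel's handle cache, and all locking is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a zsmalloc handle: the address of a one-word slot. */
typedef uintptr_t handle_t;

/* Hypothetical, stripped-down zspage: only the two fields the patch adds. */
struct toy_zspage {
	int under_reclaim;
	handle_t deferred_handles;	/* head of the deferred list, 0 if empty */
};

/* Allocate a slot that initially holds the (fake) object location. */
static handle_t toy_alloc_handle(uintptr_t obj)
{
	uintptr_t *slot = malloc(sizeof(*slot));

	if (!slot)
		return 0;
	*slot = obj;
	return (handle_t)slot;
}

/* Analogue of record_obj(): overwrite the word stored in the handle's slot. */
static void toy_record_obj(handle_t handle, uintptr_t value)
{
	*(uintptr_t *)handle = value;
}

/* Analogue of handle_to_obj(): read the word stored in the handle's slot. */
static uintptr_t toy_handle_to_obj(handle_t handle)
{
	return *(uintptr_t *)handle;
}

/*
 * Analogue of the new zs_free() branch. Returns 1 if the handle was freed
 * right away, 0 if it was pushed onto the deferred list instead.
 */
static int toy_free(struct toy_zspage *zspage, handle_t handle)
{
	if (zspage->under_reclaim) {
		/* Reuse the slot as a "next" pointer, then make it the head. */
		toy_record_obj(handle, zspage->deferred_handles);
		zspage->deferred_handles = handle;
		return 0;
	}
	free((void *)handle);
	return 1;
}

/*
 * Analogue of free_handles(): walk the chain and release every slot.
 * Returns the number of handles freed. Resetting the head afterwards is an
 * addition of this sketch; the patch frees the whole zspage right after.
 */
static unsigned int toy_free_handles(struct toy_zspage *zspage)
{
	handle_t handle = zspage->deferred_handles;
	unsigned int freed = 0;

	while (handle) {
		/* Read the next link before the slot is released. */
		handle_t next = toy_handle_to_obj(handle);

		free((void *)handle);
		handle = next;
		freed++;
	}
	zspage->deferred_handles = 0;
	return freed;
}
```

For instance, calling toy_free() on two handles while under_reclaim is set leaves the second handle as the list head with the first chained behind it, and a later toy_free_handles() releases both. The next link must be read before the slot is freed, which mirrors the order used in free_handles().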