From patchwork Thu Jul 25 23:28:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 13742209
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, yosryahmed@google.com, shakeelb@google.com,
 linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org,
 flintglass@gmail.com
Subject: [PATCH 2/2] zswap: increment swapin count for non-pivot swapped in pages
Date: Thu, 25 Jul 2024 16:28:13 -0700
Message-ID: <20240725232813.2260665-3-nphamcs@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240725232813.2260665-1-nphamcs@gmail.com>
References: <20240725232813.2260665-1-nphamcs@gmail.com>

Currently, we only increment the swapin counter on pivot pages. This
means we are not taking into account pages that also need to be swapped
in, but are already taken care of as part of the readahead window. We
are also incrementing when the pages are read from the zswap pool, which
is inaccurate.

This patch rectifies this issue by incrementing whenever we need to
perform a non-zswap read.

To test this change, I built the kernel under a cgroup with its
memory.max set to 2 GB:

real: 236.66s
user: 4286.06s
sys: 652.86s
swapins: 81552

For comparison, with just the new second chance algorithm, the build
time is as follows:

real: 244.85s
user: 4327.22s
sys: 664.39s
swapins: 94663

With neither change:

real: 263.89s
user: 4318.11s
sys: 673.29s
swapins: 227300.5

(average over 5 runs)

With this change, the kernel CPU time is reduced by a further 1.7%, and
the real time by another 3.3%, compared to the second chance algorithm
by itself. The swapins count is also reduced by another 13.85%.

Combining the two changes, we reduce the real time by 10.32%, kernel CPU
time by 3%, and number of swapins by 64.12%.

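The percentages above follow directly from the quoted timings. As a quick
sanity check, here is a minimal user-space sketch that recomputes them. The
struct and function names (run, pct_reduction, report, print_all) are made
up for illustration and are not part of the patch; only the numbers come
from the runs quoted above.

```c
#include <stdio.h>

/* One benchmark configuration: wall time, kernel CPU time, swapin count. */
struct run {
	const char *name;
	double real_s;
	double sys_s;
	double swapins;
};

/* Figures copied from the commit message; the names are illustrative. */
const struct run neither       = { "neither",       263.89, 673.29, 227300.5 };
const struct run second_chance = { "second chance", 244.85, 664.39, 94663.0 };
const struct run both          = { "both",          236.66, 652.86, 81552.0 };

/* Relative reduction from `before` to `after`, as a percentage of `before`. */
double pct_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* Print how much `with` improves on `base` for each metric. */
void report(const struct run *base, const struct run *with)
{
	printf("%s -> %s: real -%.2f%%, sys -%.2f%%, swapins -%.2f%%\n",
	       base->name, with->name,
	       pct_reduction(base->real_s, with->real_s),
	       pct_reduction(base->sys_s, with->sys_s),
	       pct_reduction(base->swapins, with->swapins));
}

/* Reproduce the two comparisons made in the text. */
void print_all(void)
{
	report(&second_chance, &both);	/* effect of this patch alone */
	report(&neither, &both);	/* effect of both changes combined */
}
```

Calling print_all() from a main() prints the two comparisons, which should
round to the figures quoted in the text.
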
To gauge the new scheme's ability to offload cold data, I ran another
benchmark, in which the kernel was built under a cgroup with memory.max
set to 3 GB, but with 0.5 GB worth of cold data allocated before each
build (in a shmem file).

Under the old scheme:

real: 197.18s
user: 4365.08s
sys: 289.02s
zswpwb: 72115.2

Under the new scheme:

real: 195.8s
user: 4362.25s
sys: 290.14s
zswpwb: 87277.8

(average over 5 runs)

Notice that we actually observe a 21% increase in the number of written
back pages - so the new scheme is just as good, if not better at
offloading pages from the zswap pool when they are cold. Build time
reduces by around 0.7% as a result.

Suggested-by: Johannes Weiner
Signed-off-by: Nhat Pham
---
 mm/page_io.c    | 11 ++++++++++-
 mm/swap_state.c |  8 ++------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index ff8c99ee3af7..0004c9fbf7e8 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -521,7 +521,15 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 
 	if (zswap_load(folio)) {
 		folio_unlock(folio);
-	} else if (data_race(sis->flags & SWP_FS_OPS)) {
+		goto finish;
+	}
+
+	/*
+	 * We have to read the page from slower devices. Increase zswap protection.
+	 */
+	zswap_folio_swapin(folio);
+
+	if (data_race(sis->flags & SWP_FS_OPS)) {
 		swap_read_folio_fs(folio, plug);
 	} else if (synchronous) {
 		swap_read_folio_bdev_sync(folio, sis);
@@ -529,6 +537,7 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 		swap_read_folio_bdev_async(folio, sis);
 	}
 
+finish:
 	if (workingset) {
 		delayacct_thrashing_end(&in_thrashing);
 		psi_memstall_leave(&pflags);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index a1726e49a5eb..3a0cf965f32b 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -698,10 +698,8 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	/* The page was likely read above, so no need for plugging here */
 	folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
 					&page_allocated, false);
-	if (unlikely(page_allocated)) {
-		zswap_folio_swapin(folio);
+	if (unlikely(page_allocated))
 		swap_read_folio(folio, NULL);
-	}
 	return folio;
 }
 
@@ -850,10 +848,8 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
 	/* The folio was likely read above, so no need for plugging here */
 	folio = __read_swap_cache_async(targ_entry, gfp_mask, mpol, targ_ilx,
 					&page_allocated, false);
-	if (unlikely(page_allocated)) {
-		zswap_folio_swapin(folio);
+	if (unlikely(page_allocated))
 		swap_read_folio(folio, NULL);
-	}
 	return folio;
 }
 
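
To illustrate the accounting change in isolation, here is a rough user-space
model of the patched read path. It is only a sketch: mock_folio,
mock_zswap_load, mock_swap_read_folio, mock_readahead and swapin_count are
hypothetical stand-ins, not kernel symbols, and the model ignores locking,
the thrashing/PSI accounting after the finish label, and the fact that the
real readahead code only reads folios it has just allocated. It shows one
thing: the counter is bumped for every folio that misses zswap, whether or
not it is the pivot page, and never for a folio served from the zswap pool.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: only records where the data lives. */
struct mock_folio {
	bool in_zswap;	/* true if the compressed copy is still in the zswap pool */
};

/* Counts the calls that model zswap_folio_swapin(). */
static unsigned long swapin_count;

/* Models zswap_load(): succeeds only when the page is in the pool. */
static bool mock_zswap_load(const struct mock_folio *folio)
{
	return folio->in_zswap;
}

/* Models the patched swap_read_folio(): count only non-zswap reads. */
void mock_swap_read_folio(const struct mock_folio *folio)
{
	if (mock_zswap_load(folio))
		return;		/* served from zswap: no protection bump */

	swapin_count++;		/* stands in for zswap_folio_swapin(folio) */
	/* The real function would now issue the fs or block device read. */
}

/*
 * Models a readahead window: every folio in the window goes through the
 * same read path, so non-pivot pages are counted as well.
 * Returns the number of swapins recorded for this window.
 */
unsigned long mock_readahead(const struct mock_folio *window, int n)
{
	swapin_count = 0;
	for (int i = 0; i < n; i++)
		mock_swap_read_folio(&window[i]);
	return swapin_count;
}
```

For a window of four folios where exactly one is still in the zswap pool,
mock_readahead() reports three swapins. Under the old placement of the
call, in the readahead helpers, at most the pivot folio would have been
counted, and it would have been counted even on a zswap hit.
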