From patchwork Thu Nov 16 02:24:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13457527
Date: Thu, 16 Nov 2023 02:24:05 +0000
X-Mailer: git-send-email 2.43.0.rc0.421.g78406f8d94-goog
Message-ID: <20231116022411.2250072-1-yosryahmed@google.com>
Subject: [PATCH v3 0/5] mm: memcg: subtree stats flushing and thresholds
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
 Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long,
 kernel-team@cloudflare.com, Wei Xu, Greg Thelen, Domenico Cerasuolo,
 linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, Yosry Ahmed

This series attempts to address shortcomings in today's
approach for memcg stats flushing, namely occasionally stale or
expensive stat reads. The series does so by changing the threshold that
we use to decide whether to trigger a flush to be per memcg instead of
global (patch 3), and then changing flushing to be per memcg (i.e.
subtree flushes) instead of global (patch 5).

Patches 3 and 5 are the core of the series, and they include more
details and testing results. The rest are either cleanups or prep work.

This series replaces the "memcg: more sophisticated stats flushing"
series [1], which itself replaced another series, in a long list of
attempts to improve memcg stats flushing. It is not a new version of the
same patchset, as it takes a completely different approach. It is based
on feedback collected from discussions on lkml in all previous attempts.
Hopefully, this is the final attempt.

There was a reported regression in v2 [2] for the
will-it-scale::fallocate benchmark. I believe this regression should not
affect production workloads. This specific benchmark allocates and frees
memory (using fallocate/ftruncate) at a rate that is far too fast to
make actual use of the memory. Testing this series on 100+ machines
running production workloads did not show any practical regressions in
page fault latency or allocation latency, but it showed great
improvements in stats read time. I do not have numbers for the exact
improvements from this series alone, but combined with another
optimization for cgroup v1 [3] we see 5-10x improvements. A significant
chunk of that comes from the cgroup v1 optimization, but this series
also made an improvement, as reported by Domenico [4].
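
As a rough illustration of the two ideas in the first paragraph (a
per-memcg flush threshold, and flushing only the subtree being read),
here is a small user-space model. It is only a sketch and not the code
in patches 3 and 5: every name in it (cgroup_model, model_update(),
model_read(), FLUSH_THRESHOLD and its value of 64) is made up for this
example, and how pending updates are accounted to ancestors is an
assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define FLUSH_THRESHOLD 64      /* illustrative value only */
#define MAX_CHILDREN    4

struct cgroup_model {
    struct cgroup_model *parent;
    struct cgroup_model *children[MAX_CHILDREN];
    size_t nr_children;
    long pending;   /* delta not yet folded into 'flushed' */
    long flushed;   /* hierarchical total as of the last flush */
    long updates;   /* magnitude of unflushed updates in this subtree */
};

static void model_add_child(struct cgroup_model *parent,
                            struct cgroup_model *child)
{
    assert(parent->nr_children < MAX_CHILDREN);
    parent->children[parent->nr_children++] = child;
    child->parent = parent;
}

/* Record a stat change and charge it to this group and all ancestors,
 * so that each group tracks its own (per-group) pending-update count. */
static void model_update(struct cgroup_model *cg, long delta)
{
    assert(cg != NULL);
    cg->pending += delta;
    for (struct cgroup_model *p = cg; p; p = p->parent)
        p->updates += labs(delta);
}

/* Fold all pending deltas below 'cg' into it; returns the folded sum. */
static long model_flush_subtree(struct cgroup_model *cg)
{
    long delta = cg->pending;

    cg->pending = 0;
    for (size_t i = 0; i < cg->nr_children; i++)
        delta += model_flush_subtree(cg->children[i]);
    cg->flushed += delta;
    cg->updates = 0;
    return delta;
}

/* Read a stat: flush only this group's subtree, and only when this
 * group's own counter is over the threshold. Siblings are untouched. */
static long model_read(struct cgroup_model *cg)
{
    if (cg->updates > FLUSH_THRESHOLD) {
        long delta = model_flush_subtree(cg);

        /* Hand the folded delta to the parent for its next flush. */
        if (cg->parent)
            cg->parent->pending += delta;
    }
    return cg->flushed;
}
```

In this model, a read of a group with few pending updates returns a
slightly stale value without flushing, while a read of a busy group
flushes that group's subtree and leaves its siblings alone.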

[1]https://lore.kernel.org/lkml/20230913073846.1528938-1-yosryahmed@google.com/
[2]https://lore.kernel.org/lkml/202310202303.c68e7639-oliver.sang@intel.com/
[3]https://lore.kernel.org/lkml/20230803185046.1385770-1-yosryahmed@google.com/
[4]https://lore.kernel.org/lkml/CAFYChMv_kv_KXOMRkrmTN-7MrfgBHMcK3YXv0dPYEL7nK77e2A@mail.gmail.com/

v2 -> v3:
- Rebased on top of v6.7-rc1.
- Updated commit messages based on discussions in previous versions.
- Reset percpu stats_updates in mem_cgroup_css_rstat_flush().
- Added a mem_cgroup_disabled() check to mem_cgroup_flush_stats().

v2: https://lore.kernel.org/lkml/20231010032117.1577496-1-yosryahmed@google.com/

Yosry Ahmed (5):
  mm: memcg: change flush_next_time to flush_last_time
  mm: memcg: move vmstats structs definition above flushing code
  mm: memcg: make stats flushing threshold per-memcg
  mm: workingset: move the stats flush into workingset_test_recent()
  mm: memcg: restore subtree stats flushing

 include/linux/memcontrol.h |   8 +-
 mm/memcontrol.c            | 272 +++++++++++++++++++++----------------
 mm/vmscan.c                |   2 +-
 mm/workingset.c            |  42 ++++--
 4 files changed, 188 insertions(+), 136 deletions(-)