From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin, Waiman Long, Shakeel Butt, Nhat Pham, Michal Hocko, Chengming Zhou, Qi Zheng, Muchun Song, Chris Li, Yosry Ahmed, "Huang, Ying", Kairui Song
Subject: [PATCH 0/7] Split list_lru lock into per-cgroup scope
Date: Tue, 25 Jun 2024 01:53:06 +0800
Message-ID: <20240624175313.47329-1-ryncsn@gmail.com>
X-Patchwork-Id: 13709935

Currently, every list_lru has a per-node lock that protects adding,
deletion, isolation, and reparenting of all list_lru_one instances
belonging to this list_lru on this node.
Contention on this lock is heavy when multiple cgroups modify the same
list_lru. This can be alleviated by splitting the lock into per-cgroup
scope.

To achieve this, this series reworks and optimizes the reparenting
process step by step, making it possible to have a stable list_lru_one
and to pin the list_lru_one. The lock is then split into per-cgroup
scope.

The result is reduced LOC and better performance: I see a ~25%
improvement for multi-cgroup SWAP over ZRAM and a ~10% improvement for
a multi-cgroup inode / dentry workload, as tested in PATCH 6/7:

memhog SWAP test (shadow nodes):

Before:
  real 0m20.328s   user 0m4.315s    sys 10m23.639s
  real 0m20.440s   user 0m4.142s    sys 10m34.756s
  real 0m20.381s   user 0m4.164s    sys 10m29.035s

After:
  real 0m15.156s   user 0m4.590s    sys 7m34.361s
  real 0m15.161s   user 0m4.776s    sys 7m35.086s
  real 0m15.429s   user 0m4.734s    sys 7m42.919s

File read test (inode / dentry):

Before:
  real 0m26.939s   user 0m36.322s   sys 6m30.248s
  real 0m15.111s   user 0m33.749s   sys 5m4.991s
  real 0m16.796s   user 0m33.438s   sys 5m22.865s
  real 0m15.256s   user 0m34.060s   sys 4m56.870s
  real 0m14.826s   user 0m33.531s   sys 4m55.907s
  real 0m15.664s   user 0m35.619s   sys 6m3.638s
  real 0m15.746s   user 0m34.066s   sys 4m56.519s

After:
  real 0m22.166s   user 0m35.155s   sys 6m21.045s
  real 0m13.753s   user 0m34.554s   sys 4m40.982s
  real 0m13.815s   user 0m34.693s   sys 4m39.605s
  real 0m13.495s   user 0m34.372s   sys 4m40.776s
  real 0m13.895s   user 0m34.005s   sys 4m39.061s
  real 0m13.629s   user 0m33.476s   sys 4m43.626s
  real 0m14.001s   user 0m33.463s   sys 4m41.261s

PATCH 1/7: Fixes a long-existing bug, so shadow nodes will be accounted
    to the right cgroup and put into the right list_lru.
PATCH 2/7 - 4/7: Clean up.
PATCH 5/7: Reworks and optimizes the reparenting process, and avoids
    touching kmemcg_id on reparenting as a first step.
PATCH 6/7: Makes it possible to pin the list_lru_one and prevent racing
    with reparenting, and splits the lock.
PATCH 7/7: Simplifies the list_lru walk callback function.
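
As a rough userspace illustration of the locking change described above
(not the kernel code from this series), the sketch below contrasts a
single per-node lock with one lock per cgroup list. Everything in it is
hypothetical: the demo_* names, the fixed NR_CGROUPS, and the use of
pthread mutexes in place of kernel spinlocks are assumptions made for
the example. It also leaves out reparenting entirely, which is the part
that makes pinning a list_lru_one necessary in the real series.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_CGROUPS 4	/* hypothetical fixed cgroup count, illustration only */

/* Stand-in for one per-cgroup list; only an item counter is kept. */
struct demo_lru_one {
	pthread_mutex_t lock;	/* used only by the per-cgroup scheme */
	long nr_items;
};

/* Stand-in for the per-node part of a list_lru. */
struct demo_lru_node {
	pthread_mutex_t lock;	/* used only by the per-node scheme */
	struct demo_lru_one one[NR_CGROUPS];
};

void demo_node_init(struct demo_lru_node *node)
{
	pthread_mutex_init(&node->lock, NULL);
	for (int i = 0; i < NR_CGROUPS; i++) {
		pthread_mutex_init(&node->one[i].lock, NULL);
		node->one[i].nr_items = 0;
	}
}

/* Per-node scheme: every cgroup serializes on the same node lock. */
pthread_mutex_t *demo_lock_per_node(struct demo_lru_node *node, int cg)
{
	(void)cg;
	return &node->lock;
}

/* Per-cgroup scheme: each cgroup's list is protected by its own lock. */
pthread_mutex_t *demo_lock_per_cgroup(struct demo_lru_node *node, int cg)
{
	return &node->one[cg].lock;
}

typedef pthread_mutex_t *(*demo_lock_fn)(struct demo_lru_node *, int);

/* Add one item to cgroup cg's list under the lock the scheme selects. */
void demo_add(struct demo_lru_node *node, int cg, demo_lock_fn pick)
{
	pthread_mutex_t *lock;

	assert(cg >= 0 && cg < NR_CGROUPS);
	lock = pick(node, cg);

	pthread_mutex_lock(lock);
	node->one[cg].nr_items++;
	pthread_mutex_unlock(lock);
}
```

With demo_lock_per_node, adds from cgroup 0 and cgroup 1 take the same
mutex; with demo_lock_per_cgroup they take different ones, so under
these assumptions they no longer contend with each other.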
Kairui Song (7):
  mm/swap, workingset: make anon workingset nodes memcg aware
  mm/list_lru: don't pass unnecessary key parameters
  mm/list_lru: don't export list_lru_add
  mm/list_lru: code clean up for reparenting
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: Simplify the list_lru walk callback function

 drivers/android/binder_alloc.c |   6 +-
 drivers/android/binder_alloc.h |   2 +-
 fs/dcache.c                    |   4 +-
 fs/gfs2/quota.c                |   2 +-
 fs/inode.c                     |   5 +-
 fs/nfs/nfs42xattr.c            |   4 +-
 fs/nfsd/filecache.c            |   5 +-
 fs/xfs/xfs_buf.c               |   2 -
 fs/xfs/xfs_qm.c                |   6 +-
 include/linux/list_lru.h       |  26 ++-
 mm/list_lru.c                  | 387 +++++++++++++++++---------------------
 mm/memcontrol.c                |  10 +-
 mm/swap_state.c                |   3 +-
 mm/workingset.c                |  20 +-
 mm/zswap.c                     |  12 +-
 15 files changed, 246 insertions(+), 248 deletions(-)
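
For reference, the ~25% figure quoted for the memhog SWAP test can be
re-derived from the "real" timings listed earlier. The sketch below
assumes that "improvement" means the relative reduction in mean
wall-clock time; the cover letter does not state the metric, so that
reading and the helper names (mean_seconds, reduction_pct) are
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n wall-clock samples, in seconds. */
double mean_seconds(const double *samples, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/*
 * Relative reduction in mean runtime, as a percentage of the "before"
 * mean (assumed definition of "improvement").
 */
double reduction_pct(const double *before, const double *after, size_t n)
{
	double b = mean_seconds(before, n);
	double a = mean_seconds(after, n);

	return (b - a) / b * 100.0;
}

/* "real" times of the memhog SWAP test, copied from the results. */
const double swap_before[] = { 20.328, 20.440, 20.381 };
const double swap_after[]  = { 15.156, 15.161, 15.429 };
```

Calling reduction_pct(swap_before, swap_after, 3) gives roughly 25%,
in line with the number quoted for SWAP over ZRAM.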