From patchwork Tue Dec 10 09:28:02 2024
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13901072
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Chris Li, Hugh Dickins, "Huang, Ying", Yosry Ahmed,
 Roman Gushchin, Shakeel Butt, Johannes Weiner, Barry Song, Michal Hocko,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v2 0/3] mm/swap_cgroup: remove global swap cgroup lock
Date: Tue, 10 Dec 2024 17:28:02 +0800
Message-ID: <20241210092805.87281-1-ryncsn@gmail.com>
X-Mailer: git-send-email 2.47.1
Reply-To: Kairui Song

From: Kairui Song

This series removes the global swap cgroup lock. The critical section of
this lock is very short, but it is still a bottleneck for massively
parallel swap workloads.

Up to 10% performance gain for the tmpfs kernel build test on a 48c96t
system, and no regression for other cases:

Testing using a 64G brd and building the kernel with make -j96 in a 1.5G
memory cgroup using 4k folios showed the improvement below (10 test runs):

Before this series:
Sys time: 10809.46 (stdev 80.831491)
Real time: 171.41 (stdev 1.239894)

After this series:
Sys time: 9621.26 (stdev 34.620000), -10.42%
Real time: 160.00 (stdev 0.497814), -6.57%

With 64k folios and a 2G memcg:

Before this series:
Sys time: 8231.99 (stdev 30.030994)
Real time: 143.57 (stdev 0.577394)

After this series:
Sys time: 7403.47 (stdev 6.270000), -10.06%
Real time: 135.18 (stdev 0.605000), -5.84%

Sequential swapout of 8G 64k zero folios (24 test runs):
Before this series: 5461409.12 us (stdev 183957.827084)
After this series:  5420447.26 us (stdev 196419.240317)

Sequential swapin of 8G 4k zero folios (24 test runs):
Before this series: 19736958.916667 us (stdev 189027.246676)
After this series:  19662182.629630 us (stdev 172717.640614)

V1: https://lore.kernel.org/linux-mm/20241202184154.19321-1-ryncsn@gmail.com/
Updates:
- Collect Review and Ack tags.
- Use a bit shift instead of mixing short and atomic accesses to emulate
  the 2 byte xchg [Chris Li].
- Merge patch 3 into patch 4 for simplicity [Roman Gushchin].
- Drop the call to mem_cgroup_disabled in patch 1 instead, and fix the
  build error reported by the bot [Yosry Ahmed].
- Wrap access to the atomic_t map with proper helpers, so the emulation
  can be dropped in favor of a native 2 byte xchg once one is available.

Kairui Song (3):
  mm, memcontrol: avoid duplicated memcg enable check
  mm/swap_cgroup: remove swap_cgroup_cmpxchg
  mm, swap_cgroup: remove global swap cgroup lock

 include/linux/swap_cgroup.h |  2 -
 mm/memcontrol.c             |  2 +-
 mm/swap_cgroup.c            | 96 ++++++++++++++++---------------------
 3 files changed, 43 insertions(+), 57 deletions(-)
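
For readers unfamiliar with the "bit shift" item in the changelog above, here is a rough userspace sketch of the general idea: two 16-bit IDs are packed into one 32-bit atomic word, and a 2 byte xchg is emulated with a compare-and-exchange loop on the whole word. This is not the code from the patches. It uses C11 <stdatomic.h> rather than the kernel's atomic_t API, and the names id_xchg, id_read, ID_BITS, ID_MASK and IDS_PER_WORD are all hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch: two 16-bit IDs per 32-bit atomic word. */
#define ID_BITS       16u
#define ID_MASK       ((1u << ID_BITS) - 1u)
#define IDS_PER_WORD  (32u / ID_BITS)

/*
 * Hypothetical helper: atomically store new_id at index idx and return
 * the ID that was there before. Only the 16 bits selected by the shift
 * are changed; the neighbouring ID in the same word is preserved.
 */
static uint16_t id_xchg(_Atomic uint32_t *map, size_t idx, uint16_t new_id)
{
	assert(map != NULL);

	_Atomic uint32_t *word = &map[idx / IDS_PER_WORD];
	unsigned int shift = (unsigned int)(idx % IDS_PER_WORD) * ID_BITS;
	uint32_t old = atomic_load(word);
	uint32_t new;

	do {
		/* Clear the target 16 bits, then OR in the new ID. */
		new = (old & ~(ID_MASK << shift)) | ((uint32_t)new_id << shift);
		/* On failure, 'old' is reloaded with the current word. */
	} while (!atomic_compare_exchange_weak(word, &old, new));

	return (uint16_t)((old >> shift) & ID_MASK);
}

/* Hypothetical helper: read the 16-bit ID at index idx. */
static uint16_t id_read(_Atomic uint32_t *map, size_t idx)
{
	assert(map != NULL);

	unsigned int shift = (unsigned int)(idx % IDS_PER_WORD) * ID_BITS;

	return (uint16_t)((atomic_load(&map[idx / IDS_PER_WORD]) >> shift) & ID_MASK);
}
```

Because every update goes through a single-word compare-and-exchange, concurrent writers to neighbouring IDs retry instead of serializing on a shared lock. Keeping the shift and mask inside helpers like these also means callers would not need to change if the loop were later replaced by a native 2 byte xchg.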