From patchwork Sun Jul 16 14:54:51 2023
X-Patchwork-Submitter: "Zhu, Lipeng"
X-Patchwork-Id: 13314825
From: "Zhu, Lipeng"
To: lipeng.zhu@intel.com
Cc: akpm@linux-foundation.org, brauner@kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, pan.deng@intel.com, tianyou.li@intel.com,
    tim.c.chen@linux.intel.com, viro@zeniv.linux.org.uk, yu.ma@intel.com
Subject: [PATCH v2] fs/address_space: add alignment padding for i_mmap and
    i_mmap_rwsem to mitigate false sharing.
Date: Sun, 16 Jul 2023 22:54:51 +0800
Message-Id: <20230716145450.20108-1-lipeng.zhu@intel.com>

When running UnixBench/Shell Scripts, we observed heavy false sharing
between accesses to i_mmap and accesses to i_mmap_rwsem.

UnixBench/Shell Scripts is a typical load/execute command test scenario
that concurrently launches, executes, and exits a large number of shell
commands. Many processes invoke vma_interval_tree_remove, which touches
'i_mmap'. The call stack:

----vma_interval_tree_remove
    |----unlink_file_vma
    |    free_pgtables
    |    |----exit_mmap
    |    |    mmput
    |    |    |----begin_new_exec
    |    |    |    load_elf_binary
    |    |    |    bprm_execve

Meanwhile, many other processes touch 'i_mmap_rwsem' to acquire the
semaphore in order to access 'i_mmap'. In the existing 'address_space'
layout, 'i_mmap' and 'i_mmap_rwsem' are in the same cache line.

This patch places i_mmap and i_mmap_rwsem in separate cache lines to
avoid this false sharing.

With this patch, based on kernel v6.4.0, on an Intel Sapphire Rapids
112C/224T platform, the score improves by ~5.3%. The perf c2c tool shows
that the false sharing is resolved as expected: the symbol
vma_interval_tree_remove no longer appears in cache line 0 after this
change.

Baseline:
=================================================
      Shared Cache Line Distribution Pareto
=================================================
-------------------------------------------------------------
    0     3729     5791        0        0  0xff19b3818445c740
-------------------------------------------------------------
   3.27%    3.02%    0.00%    0.00%  0x18  0  1  0xffffffffa194403b   604   483  389   692  203  [k] vma_interval_tree_insert   [kernel.kallsyms]  vma_interval_tree_insert+75    0  1
   4.13%    3.63%    0.00%    0.00%  0x20  0  1  0xffffffffa19440a2   553   413  415   962  215  [k] vma_interval_tree_remove   [kernel.kallsyms]  vma_interval_tree_remove+18    0  1
   2.04%    1.35%    0.00%    0.00%  0x28  0  1  0xffffffffa219a1d6  1210   855  460  1229  222  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+678  0  1
   0.62%    1.85%    0.00%    0.00%  0x28  0  1  0xffffffffa219a1bf   762   329  577   527  198  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+655  0  1
   0.48%    0.31%    0.00%    0.00%  0x28  0  1  0xffffffffa219a58c  1677  1476  733  1544  224  [k] down_write                 [kernel.kallsyms]  down_write+28                  0  1
   0.05%    0.07%    0.00%    0.00%  0x28  0  1  0xffffffffa219a21d  1040   819  689    33   27  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+749  0  1
   0.00%    0.05%    0.00%    0.00%  0x28  0  1  0xffffffffa17707db     0  1005  786  1373  223  [k] up_write                   [kernel.kallsyms]  up_write+27                    0  1
   0.00%    0.02%    0.00%    0.00%  0x28  0  1  0xffffffffa219a064     0   233  778    32   30  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+308  0  1
  33.82%   34.10%    0.00%    0.00%  0x30  0  1  0xffffffffa1770945   779   495  534  6011  224  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+53         0  1
  17.06%   15.28%    0.00%    0.00%  0x30  0  1  0xffffffffa1770915   593   438  468  2715  224  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+5          0  1
   3.54%    3.52%    0.00%    0.00%  0x30  0  1  0xffffffffa2199f84   881   601  583  1421  223  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+84   0  1

With this change:
-------------------------------------------------------------
    0      556      838        0        0  0xff2780d7965d2780
-------------------------------------------------------------
   0.18%    0.60%    0.00%    0.00%  0x8   0  1  0xffffffffafff27b8   503   453  569    14   13  [k] do_dentry_open             [kernel.kallsyms]  do_dentry_open+456             0  1
   0.54%    0.12%    0.00%    0.00%  0x8   0  1  0xffffffffaffc51ac   510   199  428    15   12  [k] hugepage_vma_check         [kernel.kallsyms]  hugepage_vma_check+252         0  1
   1.80%    2.15%    0.00%    0.00%  0x18  0  1  0xffffffffb079a1d6  1778   799  343   215  136  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+678  0  1
   0.54%    1.31%    0.00%    0.00%  0x18  0  1  0xffffffffb079a1bf   547   296  528    91   71  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+655  0  1
   0.72%    0.72%    0.00%    0.00%  0x18  0  1  0xffffffffb079a58c  1479  1534  676   288  163  [k] down_write                 [kernel.kallsyms]  down_write+28                  0  1
   0.00%    0.12%    0.00%    0.00%  0x18  0  1  0xffffffffafd707db     0  2381  744   282  158  [k] up_write                   [kernel.kallsyms]  up_write+27                    0  1
   0.00%    0.12%    0.00%    0.00%  0x18  0  1  0xffffffffb079a064     0   239  518     6    6  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+308  0  1
  46.58%   47.02%    0.00%    0.00%  0x20  0  1  0xffffffffafd70945   704   403  499  1137  219  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+53         0  1
  23.92%   25.78%    0.00%    0.00%  0x20  0  1  0xffffffffafd70915   558   413  500   542  185  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+5          0  1

v1->v2: change padding to exchange fields.

Reviewed-by: Tim Chen
Signed-off-by: Lipeng Zhu
---
 include/linux/fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 133f0640fb24..4a525ed17eab 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -451,11 +451,11 @@ struct address_space {
 	atomic_t		nr_thps;
 #endif
 	struct rb_root_cached	i_mmap;
-	struct rw_semaphore	i_mmap_rwsem;
 	unsigned long		nrpages;
 	pgoff_t			writeback_index;
 	const struct address_space_operations *a_ops;
 	unsigned long		flags;
+	struct rw_semaphore	i_mmap_rwsem;
 	errseq_t		wb_err;
 	spinlock_t		private_lock;
 	struct list_head	private_list;