From patchwork Sun Jul 16 14:56:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhu, Lipeng"
X-Patchwork-Id: 13314827
From: "Zhu, Lipeng"
To: akpm@linux-foundation.org
Cc: lipeng.zhu@inte.com, brauner@kernel.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, pan.deng@intel.com,
 tianyou.li@intel.com, tim.c.chen@linux.intel.com, viro@zeniv.linux.org.uk,
 yu.ma@intel.com, "Zhu, Lipeng"
Subject: [PATCH v2] fs/address_space: add alignment padding for i_mmap and
 i_mmap_rwsem to mitigate false sharing.
Date: Sun, 16 Jul 2023 22:56:54 +0800
Message-Id: <20230716145653.20122-1-lipeng.zhu@intel.com>
X-Mailer: git-send-email 2.39.1

When running UnixBench/Shell Scripts, we observed heavy false sharing
between accesses to i_mmap and accesses to i_mmap_rwsem.

UnixBench/Shell Scripts is a typical load/execute command test scenario
that concurrently launches, executes and exits a large number of shell
commands. Many processes invoke vma_interval_tree_remove, which touches
'i_mmap'. The call stack:

----vma_interval_tree_remove
    |----unlink_file_vma
    |    free_pgtables
    |    |----exit_mmap
    |    |    mmput
    |    |    |----begin_new_exec
    |    |    |    load_elf_binary
    |    |    |    bprm_execve

Meanwhile, many other processes touch 'i_mmap_rwsem' to acquire the
semaphore that protects 'i_mmap'. In the existing 'address_space'
layout, 'i_mmap' and 'i_mmap_rwsem' sit in the same cache line, so
waiters spinning on the semaphore and the holder updating the tree keep
invalidating each other's copy of that line.

This patch places i_mmap and i_mmap_rwsem in separate cache lines to
avoid the false sharing.

With this patch applied on kernel v6.4.0, the score improves by ~5.3%
on an Intel Sapphire Rapids 112C/224T platform. The perf c2c tool
confirms that the false sharing is resolved as expected: the symbol
vma_interval_tree_remove no longer appears in cache line 0 after this
change.
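The layout effect described above can be illustrated with a small
user-space sketch. Everything in it is an assumption made for
illustration: the mock_* types are stand-ins rather than the kernel
definitions, the 24-byte 'lead' member is a placeholder for the members
that precede i_mmap, a 64-byte cache line is assumed, and the struct is
assumed to start on a cache line boundary. It only shows how moving one
member changes which line it lands in; the real offsets depend on the
kernel configuration and on how the containing inode is allocated.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE_BYTES 64 /* assumed cache line size */

/* Hypothetical stand-ins, NOT the kernel definitions. */
struct mock_rb_root_cached {
	uint64_t rb_node;
	uint64_t rb_leftmost;
};

struct mock_rw_semaphore {
	uint64_t count;
	uint64_t owner;
	uint32_t osq;
	uint32_t wait_lock;
	uint64_t wait_list[2];
};

/* Member order before the patch (lead[] is a placeholder). */
struct mock_before {
	uint8_t lead[24];
	struct mock_rb_root_cached i_mmap;
	struct mock_rw_semaphore i_mmap_rwsem;
	uint64_t nrpages;
	uint64_t writeback_index;
	uint64_t a_ops;
	uint64_t flags;
};

/* Member order after the patch: the semaphore moves below 'flags'. */
struct mock_after {
	uint8_t lead[24];
	struct mock_rb_root_cached i_mmap;
	uint64_t nrpages;
	uint64_t writeback_index;
	uint64_t a_ops;
	uint64_t flags;
	struct mock_rw_semaphore i_mmap_rwsem;
};

/* Index of the cache line that holds a given byte offset. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE_BYTES;
}

/* Nonzero if i_mmap and i_mmap_rwsem start in the same line (old order). */
static int before_shares_line(void)
{
	return cache_line_of(offsetof(struct mock_before, i_mmap)) ==
	       cache_line_of(offsetof(struct mock_before, i_mmap_rwsem));
}

/* Nonzero if i_mmap and i_mmap_rwsem start in the same line (new order). */
static int after_shares_line(void)
{
	return cache_line_of(offsetof(struct mock_after, i_mmap)) ==
	       cache_line_of(offsetof(struct mock_after, i_mmap_rwsem));
}

/* Print the start offset and cache line of both members for each order. */
static void print_layout_report(void)
{
	printf("before: i_mmap@%zu (line %zu), i_mmap_rwsem@%zu (line %zu)\n",
	       offsetof(struct mock_before, i_mmap),
	       cache_line_of(offsetof(struct mock_before, i_mmap)),
	       offsetof(struct mock_before, i_mmap_rwsem),
	       cache_line_of(offsetof(struct mock_before, i_mmap_rwsem)));
	printf("after:  i_mmap@%zu (line %zu), i_mmap_rwsem@%zu (line %zu)\n",
	       offsetof(struct mock_after, i_mmap),
	       cache_line_of(offsetof(struct mock_after, i_mmap)),
	       offsetof(struct mock_after, i_mmap_rwsem),
	       cache_line_of(offsetof(struct mock_after, i_mmap_rwsem)));
}
```

Calling print_layout_report() from a main() prints both layouts. On a
real build, the struct layout reported from the kernel's debug info is
the authoritative check; this sketch only mirrors the reasoning.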

Baseline:
=================================================
      Shared Cache Line Distribution Pareto
=================================================
-------------------------------------------------------------
    0     3729     5791        0        0  0xff19b3818445c740
-------------------------------------------------------------
   3.27%    3.02%    0.00%    0.00%   0x18  0  1  0xffffffffa194403b   604   483   389   692   203  [k] vma_interval_tree_insert   [kernel.kallsyms]  vma_interval_tree_insert+75    0  1
   4.13%    3.63%    0.00%    0.00%   0x20  0  1  0xffffffffa19440a2   553   413   415   962   215  [k] vma_interval_tree_remove   [kernel.kallsyms]  vma_interval_tree_remove+18    0  1
   2.04%    1.35%    0.00%    0.00%   0x28  0  1  0xffffffffa219a1d6  1210   855   460  1229   222  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+678  0  1
   0.62%    1.85%    0.00%    0.00%   0x28  0  1  0xffffffffa219a1bf   762   329   577   527   198  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+655  0  1
   0.48%    0.31%    0.00%    0.00%   0x28  0  1  0xffffffffa219a58c  1677  1476   733  1544   224  [k] down_write                 [kernel.kallsyms]  down_write+28                  0  1
   0.05%    0.07%    0.00%    0.00%   0x28  0  1  0xffffffffa219a21d  1040   819   689    33    27  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+749  0  1
   0.00%    0.05%    0.00%    0.00%   0x28  0  1  0xffffffffa17707db     0  1005   786  1373   223  [k] up_write                   [kernel.kallsyms]  up_write+27                    0  1
   0.00%    0.02%    0.00%    0.00%   0x28  0  1  0xffffffffa219a064     0   233   778    32    30  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+308  0  1
  33.82%   34.10%    0.00%    0.00%   0x30  0  1  0xffffffffa1770945   779   495   534  6011   224  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+53         0  1
  17.06%   15.28%    0.00%    0.00%   0x30  0  1  0xffffffffa1770915   593   438   468  2715   224  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+5          0  1
   3.54%    3.52%    0.00%    0.00%   0x30  0  1  0xffffffffa2199f84   881   601   583  1421   223  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+84   0  1

With this change:
-------------------------------------------------------------
    0      556      838        0        0  0xff2780d7965d2780
-------------------------------------------------------------
   0.18%    0.60%    0.00%    0.00%    0x8  0  1  0xffffffffafff27b8   503   453   569    14    13  [k] do_dentry_open             [kernel.kallsyms]  do_dentry_open+456             0  1
   0.54%    0.12%    0.00%    0.00%    0x8  0  1  0xffffffffaffc51ac   510   199   428    15    12  [k] hugepage_vma_check         [kernel.kallsyms]  hugepage_vma_check+252         0  1
   1.80%    2.15%    0.00%    0.00%   0x18  0  1  0xffffffffb079a1d6  1778   799   343   215   136  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+678  0  1
   0.54%    1.31%    0.00%    0.00%   0x18  0  1  0xffffffffb079a1bf   547   296   528    91    71  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+655  0  1
   0.72%    0.72%    0.00%    0.00%   0x18  0  1  0xffffffffb079a58c  1479  1534   676   288   163  [k] down_write                 [kernel.kallsyms]  down_write+28                  0  1
   0.00%    0.12%    0.00%    0.00%   0x18  0  1  0xffffffffafd707db     0  2381   744   282   158  [k] up_write                   [kernel.kallsyms]  up_write+27                    0  1
   0.00%    0.12%    0.00%    0.00%   0x18  0  1  0xffffffffb079a064     0   239   518     6     6  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+308  0  1
  46.58%   47.02%    0.00%    0.00%   0x20  0  1  0xffffffffafd70945   704   403   499  1137   219  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+53         0  1
  23.92%   25.78%    0.00%    0.00%   0x20  0  1  0xffffffffafd70915   558   413   500   542   185  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+5          0  1

v1->v2: change padding to exchange fields.

Reviewed-by: Tim Chen
Signed-off-by: Lipeng Zhu
---
 include/linux/fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 133f0640fb24..4a525ed17eab 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -451,11 +451,11 @@ struct address_space {
 	atomic_t		nr_thps;
 #endif
 	struct rb_root_cached	i_mmap;
-	struct rw_semaphore	i_mmap_rwsem;
 	unsigned long		nrpages;
 	pgoff_t			writeback_index;
 	const struct address_space_operations *a_ops;
 	unsigned long		flags;
+	struct rw_semaphore	i_mmap_rwsem;
 	errseq_t		wb_err;
 	spinlock_t		private_lock;
 	struct list_head	private_list;