From patchwork Tue Sep 5 14:13:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang <feng.tang@intel.com>
X-Patchwork-Id: 13374493
E=McAfee;i="6600,9927,10824"; a="380609604" X-IronPort-AV: E=Sophos;i="6.02,229,1688454000"; d="scan'208";a="380609604" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Sep 2023 07:06:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10824"; a="811242153" X-IronPort-AV: E=Sophos;i="6.02,229,1688454000"; d="scan'208";a="811242153" Received: from shbuild999.sh.intel.com ([10.239.146.107]) by fmsmga004.fm.intel.com with ESMTP; 05 Sep 2023 07:06:23 -0700 From: Feng Tang To: Vlastimil Babka , Andrew Morton , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Feng Tang Subject: [RFC Patch 2/3] mm/slub: double per-cpu partial number for large systems Date: Tue, 5 Sep 2023 22:13:47 +0800 Message-Id: <20230905141348.32946-3-feng.tang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230905141348.32946-1-feng.tang@intel.com> References: <20230905141348.32946-1-feng.tang@intel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 78A711801B1 X-Stat-Signature: 4ff9ix91y4j1abirrizd9i3m5fbgibbm X-HE-Tag: 1693922859-97519 X-HE-Meta: U2FsdGVkX1/5IjxTq6sIIeUt6WeMJdKS4vKeo0sVU5ZP20KatPC2YJC20t2jbWNptlKv4QQbvuko1A1CVP/ktF2x+VXrxpzmhRK0Okv5f0dEyMk0AQGHJjfqg2fXe2YhAsLpOdlpoRCsTznwY0hRt/79sC5/d8Iia+e+ASaaD/reGY9Av7QKb/2wq0MUzCA2rES6cdKI9boBHST0Td+NLSgPb94lpHEL9pkm1BALjcDjRQfbeB+3JkHA6KpXJGcSkeEk1yI51iuEyauhJzrcRLExY9YXy5kHSEz4GJROng8MXlMPjlNyNlwT0Qyeu/Su2N+afj8o01IZM6ZfV8B4K4RDPvh/my/LS86Gzu/ELRTaG49UyF9DpeZLKo3uESrbrM4Kmi3tiny1rE8ZEKwfNGa2fN8wGx6BLllQJQqsW147dx/4hqEoJjoFY8eMwmT92DQ67QUuGQJL45zptqfGB3sivzAZDCFArR3BP8ZLy/gFM6FObEVkFRF0DdUWddNpq8MA5pTQH9u4Q0YWfaelXl0Q6HGbKcLYBv9cf1RaksAuejoL6wyb7b/11mQNHJC/VVPKkf6ij4RVE/yHovQFSEwNosFkM/nOJcx9zGc+IDXfARDqSGFS63vUco42/9dKtPvvGn0lhstIiNrL9YrW4XjtzZ7mqmkZw310+JcOF+Yi/Icp1SfVXGya8uuFZU2V+QsFkIBAiOHoMs6jCvfcZnEriIgjuB7qgT8fXttje56VdzFlQ63Wvn2gRyivLHEI4xit395io0AyXn1RUDF6sPrUnKqSXrZwihfZAbneoluuMbMAnxb2Wa/pDbeTghpri1agWtno+kUu4YHCLF/9IZGV9YwdKu2RPu5Vb3ESLT76MRXNo4GhtQwKQRRkzwDba2Z5a+GiPNE78W7GTulOAnZunLnOYRnVWqSZiIqq+VRM6ieeOwWze6eA7couHlmnkJ3WAxdxMBZan96ImJv NBPEhfI/ g6q+FFXNi/NQmNyJLoRSy51qrDWM5xOqCWt0qpgQduKox30MVVK1If+JZU02hObLPpgZrP+8x0L8LTFinbUthAShQgN3s5n+IDB7m+xsfTBWlMKDvI2KTupFGc2659nteybWOlH2IYspBRcPs6d7jId3XUvnxakJhforaWRzuYvlU9ZWNcj/kJwhCfqTo7lhs0CM3dDUMIcDj3DCCl9N1+Wuxnjn84/WKlcjleG0SRkk72TrLrcLbrr7pAviXtNFIRK7Q1h5BhaN2usHTANlAHenZtIvyfBfT/Rus X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There are reports about severe lock contention for slub's per-node 'list_lock' in 'hackbench' test, [1][2], on server systems. And similar contention is also seen when running 'mmap1' case of will-it-scale on big systems. As the trend is one processor (socket) will have more and more CPUs (100+, 200+), the contention could be much more severe and becomes a scalability issue. One way to help reducing the contention is to double the per-cpu partial number for large systems. Following is some performance data, where it shows big improvment in will-it-scale/mmap1 case, but no ovbious change for the 'hackbench' test. 
The patch itself only doubles (2X) the per-cpu partial number; for
better analysis, a 4X case is also profiled (a sketch of that variant
follows the diff below).

will-it-scale/mmap1
-------------------

Run the will-it-scale benchmark's 'mmap1' test case on a 2-socket
Sapphire Rapids server (112 cores / 224 threads) with 256 GB DRAM, in
3 configurations with parallel test threads at 25%, 50% and 100% of
the number of CPUs. The data is (base is the vanilla v6.5 kernel):

                   base          base + 2X patch     base + 4X patch
wis-mmap1-25     223670    +12.7%    251999    +34.9%    301749    per_process_ops
wis-mmap1-50     186020    +28.0%    238067    +55.6%    289521    per_process_ops
wis-mmap1-100     89200    +40.7%    125478    +62.4%    144858    per_process_ops

Taking the perf-profile comparison of the 50% test case, the lock
contention is greatly reduced:

    43.80   -11.5    32.27   -27.9    15.91   pp.self.native_queued_spin_lock_slowpath

hackbench
---------

Run the same hackbench test case mentioned in [1], on the same HW/SW
as will-it-scale:

                   base          base + 2X patch     base + 4X patch
hackbench        759951    +0.2%     761506    +0.5%     763972    hackbench.throughput

[1]. https://lore.kernel.org/all/202307172140.3b34825a-oliver.sang@intel.com/
[2]. https://lore.kernel.org/lkml/ZORaUsd+So+tnyMV@chenyu5-mobl2/

Signed-off-by: Feng Tang <feng.tang@intel.com>
---
 mm/slub.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index f7940048138c..51ca6dbaad09 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4361,6 +4361,13 @@ static void set_cpu_partial(struct kmem_cache *s)
 	else
 		nr_objects = 120;
 
+	/*
+	 * Give larger systems more per-cpu partial slabs to reduce/postpone
+	 * contention on the per-node partial list.
+	 */
+	if (num_cpus() >= 32)
+		nr_objects *= 2;
+
 	slub_set_cpu_partial(s, nr_objects);
 #endif
 }
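For reference, the "base + 4X patch" kernel profiled above is not part
of this submission; it is the same change with a larger multiplier,
roughly:

	/* 4X variant, used only for the profiling above */
	if (num_cpus() >= 32)
		nr_objects *= 4;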