From patchwork Mon Apr 15 08:12:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kefeng Wang
X-Patchwork-Id: 13629608
X-Delivered-To: linux-mm@kvack.org
From: Kefeng Wang
To: Andrew Morton
CC: Huang Ying, Mel Gorman, Ryan Roberts, David Hildenbrand, Barry Song,
 Vlastimil Babka, Zi Yan, "Matthew Wilcox (Oracle)", Jonathan Corbet,
 Yang Shi, Yu Zhao, Kefeng Wang
Subject: [PATCH rfc 0/3] mm: allow more high-order pages stored on PCP lists
Date: Mon, 15 Apr 2024 16:12:17 +0800
Message-ID: <20240415081220.3246839-1-wangkefeng.wang@huawei.com>
X-Mailer: git-send-email 2.27.0
Sender: owner-linux-mm@kvack.org

Both file pages and anonymous pages now support large folios, so
high-order pages other than PMD_ORDER will also be allocated frequently,
which could increase zone lock contention. Allowing high-order pages on
the PCP lists could reduce that contention, but as commit 44042b449872
("mm/page_alloc: allow high-order pages to be stored on the per-cpu
lists") pointed out, it may not be a win in all scenarios. Add a new
sysfs control to enable or disable storing specified high-order pages on
the PCP lists; orders in the range
(PAGE_ALLOC_COSTLY_ORDER, PMD_ORDER) will not be stored on PCP lists by
default.

With the perf lock tool, the lock contention from will-it-scale
page_fault1 (90 tasks, run for 10s, hugepage-2048KB never, hugepage-64K
always) is shown below (only the zone spinlock and pcp spinlock are
listed):

Without patches,
 contended   total wait     max wait     avg wait         type   caller
       713      4.64 ms     74.37 us      6.51 us     spinlock   __alloc_pages+0x23c

With patches,
 contended   total wait     max wait     avg wait         type   caller
         2     25.66 us     16.31 us     12.83 us     spinlock   rmqueue_pcplist+0x2b0

Similar results on shell8 from unixbench:

Without patches,
 contended   total wait     max wait     avg wait         type   caller
      4942    901.09 ms      1.31 ms    182.33 us     spinlock   __alloc_pages+0x23c
      1556    298.76 ms      1.23 ms    192.01 us     spinlock   rmqueue_pcplist+0x2b0
       991    182.73 ms    879.80 us    184.39 us     spinlock   rmqueue_pcplist+0x2b0

With patches,
 contended   total wait     max wait     avg wait         type   caller
       988    187.63 ms    855.18 us    189.91 us     spinlock   rmqueue_pcplist+0x2b0
       505     88.99 ms    793.27 us    176.21 us     spinlock   rmqueue_pcplist+0x2b0

The shell8 benchmark score shows only a small improvement (0.28%), but
the zone lock contention from __alloc_pages() disappeared.

Kefeng Wang (3):
  mm: prepare more high-order pages to be stored on the per-cpu lists
  mm: add control to allow specified high-order pages stored on PCP list
  mm: pcp: show each order page count

 Documentation/admin-guide/mm/transhuge.rst | 11 ++++
 include/linux/gfp.h                        |  1 +
 include/linux/huge_mm.h                    |  1 +
 include/linux/mmzone.h                     | 10 ++-
 include/linux/vmstat.h                     | 19 ++++++
 mm/Kconfig.debug                           |  8 +++
 mm/huge_memory.c                           | 74 ++++++++++++++++++++++
 mm/page_alloc.c                            | 30 +++++++--
 mm/vmstat.c                                | 16 +++++
 9 files changed, 164 insertions(+), 6 deletions(-)
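
For readers unfamiliar with the default described in the cover letter, the
following is a minimal Python sketch of the order check, not the kernel code
from the patches. It assumes a 4K base page configuration where
PAGE_ALLOC_COSTLY_ORDER is 3 and PMD_ORDER is 9. The function name and the
`enabled_extra` parameter are hypothetical stand-ins for the proposed sysfs
control, whose actual interface is defined by the patches themselves.

```python
# Illustrative model only. Constants assume 4K base pages; both are
# assumptions for this sketch, not values quoted from the series.
PAGE_ALLOC_COSTLY_ORDER = 3
PMD_ORDER = 9


def pcp_allowed_order(order, enabled_extra=frozenset()):
    """Return True if pages of this order may sit on a PCP list.

    Orders up to PAGE_ALLOC_COSTLY_ORDER and PMD_ORDER itself are treated
    as always allowed. Orders strictly between the two are allowed only
    when listed in enabled_extra (hypothetical model of the new control).
    """
    if order <= PAGE_ALLOC_COSTLY_ORDER or order == PMD_ORDER:
        return True
    in_open_range = PAGE_ALLOC_COSTLY_ORDER < order < PMD_ORDER
    return in_open_range and order in enabled_extra


if __name__ == "__main__":
    # Default: nothing from the open range (3, 9) is enabled.
    print([o for o in range(11) if pcp_allowed_order(o)])
    # Enable order 4 (a 64K folio with 4K base pages).
    print([o for o in range(11) if pcp_allowed_order(o, {4})])
```

Under these assumptions the first list omits orders 4 to 8, and enabling
order 4 corresponds to the hugepage-64K case used in the measurements above.
Orders above PMD_ORDER stay excluded regardless of the control.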