From patchwork Thu Dec 17 03:43:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979049 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DDC5C2BB9A for ; Thu, 17 Dec 2020 03:46:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 20F9B2371F for ; Thu, 17 Dec 2020 03:46:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728104AbgLQDq3 (ORCPT ); Wed, 16 Dec 2020 22:46:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50336 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727990AbgLQDq3 (ORCPT ); Wed, 16 Dec 2020 22:46:29 -0500 Received: from mail-pl1-x62c.google.com (mail-pl1-x62c.google.com [IPv6:2607:f8b0:4864:20::62c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 83500C061282 for ; Wed, 16 Dec 2020 19:45:43 -0800 (PST) Received: by mail-pl1-x62c.google.com with SMTP id g20so13554954plo.2 for ; Wed, 16 Dec 2020 19:45:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=I8JqAljVYtmaPaBvHCRChrR0oheI9SrolUe2et8YFn8=; b=LFRZS+7DcgILx1b9ou9M96wmw6taZL3I+iRan+cCcipo6HVZ24/ujb+bZ41PZXuLCZ 1ReDBz6W6bu3NpG+2xk3pKfOFqrr9g2Qrb+uxz9T03e6UIzD/fN/FfVR0NBGQ5XLo0iW 92K9qsyjSKNMYvLIcXdcn4DmKQ6yGUXvZ65QwyV7KMh0r+dZcCC79wnl9nGlvvWbYfOn RsEhQDpDm1uB7/ugs30Y6r4YkQ5eTOqbqec+OmXWgHo8olU4a4utfAysCogecVHdUejg SBw8Ad8KH0AGA1ChhpBwqy5mYnB+rBtL8Da74sdNyCe6Hm4IgtSSQ7doCiueJfa1paez mfkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I8JqAljVYtmaPaBvHCRChrR0oheI9SrolUe2et8YFn8=; b=YAW0TsSWrPbCt1apHdf/NIxmFDc0mcpED10wR4am5slR0B4BrRhfk+Ftz7UUrKiXl3 ygmk9AVB82De7VHWB0+yOV8FHxoDQFnFWjHVk2keI7laNyMoBt8rUTgCVbFhZlAj9k3z 7YBwJYsboEerfOHQRXq3nh6hG7IeweKHKdZ2LhAirY5JPO+lRHxZhmBBBkoNNGsFuDqR jQErq4PVCGaXjRyMCalLe9MALbdth4YOKnXpy2lcSIEMuDHPA7xS+BY1rszmkbBjLOcF xanygDjvhg4kaqmk+d3YgHr4WQAvY2jOjOT4nPtKtspR8txA/IUpUutkFy+9vr7eb/MF RboA== X-Gm-Message-State: AOAM533WPhgOAQRc/RF/NzuD06WLfbsyJqN3eq2SxMpMvHNVnVvx52UQ azX7MueCUMsA+1mX0FNSZLtHqQ== X-Google-Smtp-Source: ABdhPJw4NdNDKtIfv8xy6x6h9OgRGXasL/6YLMvw+2wPMLJxvSJx/uoemPCfa3HQ7YnKhTF68dvRxw== X-Received: by 2002:a17:90a:bc0a:: with SMTP id w10mr5877433pjr.79.1608176742627; Wed, 16 Dec 2020 19:45:42 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.45.35 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:45:41 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song , Michal Hocko , Pankaj Gupta Subject: [PATCH v5 1/7] mm: memcontrol: fix NR_ANON_THPS accounting in charge moving Date: Thu, 17 Dec 2020 11:43:50 +0800 Message-Id: <20201217034356.4708-2-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The unit of NR_ANON_THPS is HPAGE_PMD_NR already. So it should inc/dec by one rather than nr_pages. Fixes: 468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter") Signed-off-by: Muchun Song Acked-by: Michal Hocko Acked-by: Johannes Weiner Acked-by: Pankaj Gupta Reviewed-by: Roman Gushchin Reviewed-by: Shakeel Butt --- mm/memcontrol.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b80328f52fb4..8818bf64d6fe 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5653,10 +5653,8 @@ static int mem_cgroup_move_account(struct page *page, __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages); __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages); if (PageTransHuge(page)) { - __mod_lruvec_state(from_vec, NR_ANON_THPS, - -nr_pages); - __mod_lruvec_state(to_vec, NR_ANON_THPS, - nr_pages); + __dec_lruvec_state(from_vec, NR_ANON_THPS); + __inc_lruvec_state(to_vec, NR_ANON_THPS); } } From patchwork Thu Dec 17 03:43:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979055 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9265C2BB40 for ; Thu, 17 Dec 2020 03:47:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 94441238D6 for ; Thu, 17 Dec 2020 03:47:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728167AbgLQDq4 (ORCPT ); Wed, 16 Dec 2020 22:46:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727665AbgLQDqz (ORCPT ); Wed, 16 Dec 2020 22:46:55 -0500 Received: from mail-pg1-x534.google.com (mail-pg1-x534.google.com [IPv6:2607:f8b0:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 04E09C061257 for ; Wed, 16 Dec 2020 19:45:51 -0800 (PST) Received: by mail-pg1-x534.google.com with SMTP id n10so11375827pgl.10 for ; Wed, 16 Dec 2020 19:45:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=b5xEcw+QILJhfSVYPIObqEXv3BhdeKivb4McVYg+alA=; b=OCXu6xCtbb8DYZp7/Ioy5nSX2nDvquLvBCofU++HMxBYRH+92k8QkrhtuZkIbJwRMJ +xolUpvc1jmheSVyBdHM5bSuNtUh3f6/VInAo9RxEY8ePH9/IDhtDj8OLQZ9LUq2m/+H UQTXHAyXus79XqDk0F5fWuzOGPJ6MSn7giJ0sIMogi40Yiqg09sHpMB5Z1R4VUUMI2IG OmYMKRIbn6GhvLNz17uWiRiQr1lb8W0vgW3B9hKP0F/CxDvJ5SLEuZ6D4pF8kbPSY7/b uizMwaKSZit3RPO5WSFdt9yLp/Ig+yWGHJFclOwLvv3x7nQJ4GyZkn9B7kZRK200C+jd /CIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b5xEcw+QILJhfSVYPIObqEXv3BhdeKivb4McVYg+alA=; b=UafA+jH5d8xUjB4hZSPQL1ioNItZ1W+Io0Y6PLsx2xKvzhiOW1EE40skFXyU4UvUkA Sr6iqXHTxrP4x+INxLtXWkYx4OkhmH4wjo6NdXWigtT0mUK/rE5UTO74euJkAI/Z5FbB gLnrvaGIZeY0njcCKwj66vQoLDIIZy2+xSNnhMFgJcGFhP365NJflJHfqDrVYqD4XorK mNqk+7n/qBW82+h9ZtoYBJ6sA5Wim2JYX2s2qhXDeMTwmWppuh5nCbox6MYIi3z5tWUX gYllmtQM0mzqisi6u76DGB3jPRMlFO+LXyZScF5cvIvIBxkPB7XXWV4wi/lZjW5wJEWr j9Lg== X-Gm-Message-State: AOAM5330dQCXC8DwCKZl2j4jzUttI6PpwQeEu53w6BizIKXMtlauaeq3 6ROU/xIynEnRCADm0nD5/vlkRg== X-Google-Smtp-Source: ABdhPJwPT1oMLSX8UQdWwyo5hbl582ypBWmpwsR0Z6Eb+BjfQA1bXYdeYwPRSM2T4Uu92vHD2cgTjg== X-Received: by 2002:a62:27c3:0:b029:196:63f6:cfac with SMTP id n186-20020a6227c30000b029019663f6cfacmr34949297pfn.75.1608176750558; Wed, 16 Dec 2020 19:45:50 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.45.43 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:45:49 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song Subject: [PATCH v5 2/7] mm: memcontrol: convert NR_ANON_THPS account to pages Date: Thu, 17 Dec 2020 11:43:51 +0800 Message-Id: <20201217034356.4708-3-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics expecially THP vmstat counters. In the systems with hundreads of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_ANON_THPS account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Signed-off-by: Muchun Song --- drivers/base/node.c | 15 +++++++++------ fs/proc/meminfo.c | 2 +- include/linux/mmzone.h | 10 ++++++++++ mm/huge_memory.c | 3 ++- mm/memcontrol.c | 20 ++++++-------------- mm/page_alloc.c | 2 +- mm/rmap.c | 6 +++--- mm/vmstat.c | 11 +++++++++-- 8 files changed, 41 insertions(+), 28 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 04f71c7bc3f8..6da0c3508bc9 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -461,8 +461,7 @@ static ssize_t node_read_meminfo(struct device *dev, nid, K(sunreclaimable) #ifdef CONFIG_TRANSPARENT_HUGEPAGE , - nid, K(node_page_state(pgdat, NR_ANON_THPS) * - HPAGE_PMD_NR), + nid, K(node_page_state(pgdat, NR_ANON_THPS)), nid, K(node_page_state(pgdat, NR_SHMEM_THPS) * HPAGE_PMD_NR), nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * @@ -519,10 +518,14 @@ static ssize_t node_read_vmstat(struct device *dev, sum_zone_numa_state(nid, i)); #endif - for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) - len += sysfs_emit_at(buf, len, "%s %lu\n", - node_stat_name(i), - node_page_state_pages(pgdat, i)); + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) { + unsigned long pages = node_page_state_pages(pgdat, i); + + if (vmstat_item_print_in_thp(i)) + pages /= HPAGE_PMD_NR; + len += sysfs_emit_at(buf, len, "%s %lu\n", node_stat_name(i), + pages); + } return len; } diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index d6fc74619625..a635c8a84ddf 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -129,7 +129,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v) #ifdef CONFIG_TRANSPARENT_HUGEPAGE show_val_kb(m, "AnonHugePages: ", - global_node_page_state(NR_ANON_THPS) * HPAGE_PMD_NR); + global_node_page_state(NR_ANON_THPS)); show_val_kb(m, "ShmemHugePages: ", global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR); show_val_kb(m, "ShmemPmdMapped: ", diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index b593316bff3d..71029c09782b 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -210,6 +210,16 @@ enum node_stat_item { }; /* + * Returns true if the item should be printed in THPs (/proc/vmstat + * currently prints number of anon, file and shmem THPs. But the item + * is charged in pages). + */ +static __always_inline bool vmstat_item_print_in_thp(enum node_stat_item item) +{ + return item == NR_ANON_THPS; +} + +/* * Returns true if the value is measured in bytes (most vmstat values are * measured in pages). This defines the API part, the internal representation * might be different. diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 10dd3cae5f53..66ec454120de 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2178,7 +2178,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, lock_page_memcg(page); if (atomic_add_negative(-1, compound_mapcount_ptr(page))) { /* Last compound_mapcount is gone. */ - __dec_lruvec_page_state(page, NR_ANON_THPS); + __mod_lruvec_page_state(page, NR_ANON_THPS, + -HPAGE_PMD_NR); if (TestClearPageDoubleMap(page)) { /* No need in mapcount reference anymore */ for (i = 0; i < HPAGE_PMD_NR; i++) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 8818bf64d6fe..b18e25a5cdf3 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1532,7 +1532,7 @@ static struct memory_stat memory_stats[] = { * on some architectures, the macro of HPAGE_PMD_SIZE is not * constant(e.g. powerpc). */ - { "anon_thp", 0, NR_ANON_THPS }, + { "anon_thp", PAGE_SIZE, NR_ANON_THPS }, { "file_thp", 0, NR_FILE_THPS }, { "shmem_thp", 0, NR_SHMEM_THPS }, #endif @@ -1565,8 +1565,7 @@ static int __init memory_stats_init(void) for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (memory_stats[i].idx == NR_ANON_THPS || - memory_stats[i].idx == NR_FILE_THPS || + if (memory_stats[i].idx == NR_FILE_THPS || memory_stats[i].idx == NR_SHMEM_THPS) memory_stats[i].ratio = HPAGE_PMD_SIZE; #endif @@ -4088,10 +4087,6 @@ static int memcg_stat_show(struct seq_file *m, void *v) if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account()) continue; nr = memcg_page_state_local(memcg, memcg1_stats[i]); -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (memcg1_stats[i] == NR_ANON_THPS) - nr *= HPAGE_PMD_NR; -#endif seq_printf(m, "%s %lu\n", memcg1_stat_names[i], nr * PAGE_SIZE); } @@ -4122,10 +4117,6 @@ static int memcg_stat_show(struct seq_file *m, void *v) if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account()) continue; nr = memcg_page_state(memcg, memcg1_stats[i]); -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (memcg1_stats[i] == NR_ANON_THPS) - nr *= HPAGE_PMD_NR; -#endif seq_printf(m, "total_%s %llu\n", memcg1_stat_names[i], (u64)nr * PAGE_SIZE); } @@ -5653,10 +5644,11 @@ static int mem_cgroup_move_account(struct page *page, __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages); __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages); if (PageTransHuge(page)) { - __dec_lruvec_state(from_vec, NR_ANON_THPS); - __inc_lruvec_state(to_vec, NR_ANON_THPS); + __mod_lruvec_state(from_vec, NR_ANON_THPS, + -nr_pages); + __mod_lruvec_state(to_vec, NR_ANON_THPS, + nr_pages); } - } } else { __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 469e28f95ce7..1700f52b7869 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5580,7 +5580,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) K(node_page_state(pgdat, NR_SHMEM_THPS) * HPAGE_PMD_NR), K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR), - K(node_page_state(pgdat, NR_ANON_THPS) * HPAGE_PMD_NR), + K(node_page_state(pgdat, NR_ANON_THPS)), #endif K(node_page_state(pgdat, NR_WRITEBACK_TEMP)), node_page_state(pgdat, NR_KERNEL_STACK_KB), diff --git a/mm/rmap.c b/mm/rmap.c index 08c56aaf72eb..c4d5c63cfd29 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1144,7 +1144,7 @@ void do_page_add_anon_rmap(struct page *page, * disabled. */ if (compound) - __inc_lruvec_page_state(page, NR_ANON_THPS); + __mod_lruvec_page_state(page, NR_ANON_THPS, nr); __mod_lruvec_page_state(page, NR_ANON_MAPPED, nr); } @@ -1186,7 +1186,7 @@ void page_add_new_anon_rmap(struct page *page, if (hpage_pincount_available(page)) atomic_set(compound_pincount_ptr(page), 0); - __inc_lruvec_page_state(page, NR_ANON_THPS); + __mod_lruvec_page_state(page, NR_ANON_THPS, nr); } else { /* Anon THP always mapped first with PMD */ VM_BUG_ON_PAGE(PageTransCompound(page), page); @@ -1292,7 +1292,7 @@ static void page_remove_anon_compound_rmap(struct page *page) if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) return; - __dec_lruvec_page_state(page, NR_ANON_THPS); + __mod_lruvec_page_state(page, NR_ANON_THPS, -thp_nr_pages(page)); if (TestClearPageDoubleMap(page)) { /* diff --git a/mm/vmstat.c b/mm/vmstat.c index 663f49ed1fd6..37c9e7b21e1e 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1624,8 +1624,12 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat, if (is_zone_first_populated(pgdat, zone)) { seq_printf(m, "\n per-node stats"); for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) { + unsigned long pages = node_page_state_pages(pgdat, i); + + if (vmstat_item_print_in_thp(i)) + pages /= HPAGE_PMD_NR; seq_printf(m, "\n %-12s %lu", node_stat_name(i), - node_page_state_pages(pgdat, i)); + pages); } } seq_printf(m, @@ -1745,8 +1749,11 @@ static void *vmstat_start(struct seq_file *m, loff_t *pos) v += NR_VM_NUMA_STAT_ITEMS; #endif - for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) { v[i] = global_node_page_state_pages(i); + if (vmstat_item_print_in_thp(i)) + v[i] /= HPAGE_PMD_NR; + } v += NR_VM_NODE_STAT_ITEMS; global_dirty_limits(v + NR_DIRTY_BG_THRESHOLD, From patchwork Thu Dec 17 03:43:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E649C2BBCA for ; Thu, 17 Dec 2020 03:47:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DC95F238D6 for ; Thu, 17 Dec 2020 03:47:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728460AbgLQDrK (ORCPT ); Wed, 16 Dec 2020 22:47:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50452 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727165AbgLQDrJ (ORCPT ); Wed, 16 Dec 2020 22:47:09 -0500 Received: from mail-pf1-x42e.google.com (mail-pf1-x42e.google.com [IPv6:2607:f8b0:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 249E9C0611CC for ; Wed, 16 Dec 2020 19:45:58 -0800 (PST) Received: by mail-pf1-x42e.google.com with SMTP id m6so8438853pfm.6 for ; Wed, 16 Dec 2020 19:45:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ChPy3SYz1507gCWRlnvA5RU8RNcwK6R9U8YqgL54fkc=; b=JLO/0y47sQz7EC+4uaKewlvvDxhNaJ7IMfmItkyYaKNL0/UduSC1AqhLpJcgLhYD4i mINmRlssJshSO4JkFuXJV4S1eHZqSYm0lxI2OQCKH7unbaAgoxE1Wd2YGbPu55lSqnxB FlGyoiwg0FVhhzBKBCxvXre94vECLc9vTH4QYDeLbKFhWvs/r0X4lA5AzkgMGQxSq7Gt Ur1XayIC5dOJuzSy88KOwPwGZrmnvVpX77SyJZ32r+vXwJxI+Qs2VmVNMmIhmgLewOq8 PCWQfgZP2mX62knI/ZgY2zlKUvzsUhfoEpo6sfBSRUwx77/TahmyNtRj5fTI8FjZU1p5 3OWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ChPy3SYz1507gCWRlnvA5RU8RNcwK6R9U8YqgL54fkc=; b=RMn3ywdROA1EaFjKD1Nk0eJfbqkARdvmyKZgpOMDXH1yoUpVjmOvCWZ+W3KRe5zOwi 64M2CM6kE+WZ6OLWN1IhnrUvFFc5X+klMTgk0URUpXua/zJ9H9L3ydpCxeeDlg4CEPct ELtvVZKmFCbYdj1T6FXCxotBw+sPVW+fXazqoNY/3MtdMeuqO/fC4KU+E74DJ/cLTR6I cTkeVmbQcGK/62Sx1Gs3M/57Ph7xik6G5fG0RHXSMiwaYzHeZq4ZfT5pD4qDr7p2lsLv ACqPqRj6epeHo8FSY+FAKrhOwkC/jz/EXhOt7UZDmE9c6GyCC5NqcVKaWxn9H/XaK4wO TlkQ== X-Gm-Message-State: AOAM533YY3jX2DdnW3Wxafj4tofZ01mUICdyX56HhVij6PVshjidZGPU Ml1+Qx2B2Xh4SPull7A3GWpFXA== X-Google-Smtp-Source: ABdhPJxP3PP2vH+nLB1q8F0lLYL41DTLr53Moyw79kHaXuI2zxof2mGKryd3jKFYZFhYGLMwpkJdtg== X-Received: by 2002:a65:4785:: with SMTP id e5mr7988863pgs.0.1608176757741; Wed, 16 Dec 2020 19:45:57 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.45.50 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:45:57 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song Subject: [PATCH v5 3/7] mm: memcontrol: convert NR_FILE_THPS account to pages Date: Thu, 17 Dec 2020 11:43:52 +0800 Message-Id: <20201217034356.4708-4-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics expecially THP vmstat counters. In the systems with hundreads of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_FILE_THPS account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Signed-off-by: Muchun Song --- drivers/base/node.c | 3 +-- fs/proc/meminfo.c | 2 +- include/linux/mmzone.h | 3 ++- mm/filemap.c | 2 +- mm/huge_memory.c | 5 ++++- mm/khugepaged.c | 4 +++- mm/memcontrol.c | 5 ++--- 7 files changed, 14 insertions(+), 10 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 6da0c3508bc9..d5952f754911 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -466,8 +466,7 @@ static ssize_t node_read_meminfo(struct device *dev, HPAGE_PMD_NR), nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR), - nid, K(node_page_state(pgdat, NR_FILE_THPS) * - HPAGE_PMD_NR), + nid, K(node_page_state(pgdat, NR_FILE_THPS)), nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED) * HPAGE_PMD_NR) #endif diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index a635c8a84ddf..7ea4679880c8 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -135,7 +135,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v) show_val_kb(m, "ShmemPmdMapped: ", global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR); show_val_kb(m, "FileHugePages: ", - global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR); + global_node_page_state(NR_FILE_THPS)); show_val_kb(m, "FilePmdMapped: ", global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR); #endif diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 71029c09782b..19e77f656410 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -216,7 +216,8 @@ enum node_stat_item { */ static __always_inline bool vmstat_item_print_in_thp(enum node_stat_item item) { - return item == NR_ANON_THPS; + return item == NR_ANON_THPS || + item == NR_FILE_THPS; } /* diff --git a/mm/filemap.c b/mm/filemap.c index 78090ee08ac2..c5e6f5202476 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -207,7 +207,7 @@ static void unaccount_page_cache_page(struct address_space *mapping, if (PageTransHuge(page)) __dec_lruvec_page_state(page, NR_SHMEM_THPS); } else if (PageTransHuge(page)) { - __dec_lruvec_page_state(page, NR_FILE_THPS); + __mod_lruvec_page_state(page, NR_FILE_THPS, -nr); filemap_nr_thps_dec(mapping); } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 66ec454120de..cdf61596ef76 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2745,10 +2745,13 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) } spin_unlock(&ds_queue->split_queue_lock); if (mapping) { + int nr = thp_nr_pages(head); + if (PageSwapBacked(head)) __dec_lruvec_page_state(head, NR_SHMEM_THPS); else - __dec_lruvec_page_state(head, NR_FILE_THPS); + __mod_lruvec_page_state(head, NR_FILE_THPS, + -nr); } __split_huge_page(page, list, end); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 494d3cb0b58a..23f93a3e2e69 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1654,6 +1654,7 @@ static void collapse_file(struct mm_struct *mm, XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER); int nr_none = 0, result = SCAN_SUCCEED; bool is_shmem = shmem_file(file); + int nr; VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem); VM_BUG_ON(start & (HPAGE_PMD_NR - 1)); @@ -1865,11 +1866,12 @@ static void collapse_file(struct mm_struct *mm, put_page(page); goto xa_unlocked; } + nr = thp_nr_pages(new_page); if (is_shmem) __inc_lruvec_page_state(new_page, NR_SHMEM_THPS); else { - __inc_lruvec_page_state(new_page, NR_FILE_THPS); + __mod_lruvec_page_state(new_page, NR_FILE_THPS, nr); filemap_nr_thps_inc(mapping); } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b18e25a5cdf3..04985c8c6a0a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1533,7 +1533,7 @@ static struct memory_stat memory_stats[] = { * constant(e.g. powerpc). */ { "anon_thp", PAGE_SIZE, NR_ANON_THPS }, - { "file_thp", 0, NR_FILE_THPS }, + { "file_thp", PAGE_SIZE, NR_FILE_THPS }, { "shmem_thp", 0, NR_SHMEM_THPS }, #endif { "inactive_anon", PAGE_SIZE, NR_INACTIVE_ANON }, @@ -1565,8 +1565,7 @@ static int __init memory_stats_init(void) for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (memory_stats[i].idx == NR_FILE_THPS || - memory_stats[i].idx == NR_SHMEM_THPS) + if (memory_stats[i].idx == NR_SHMEM_THPS) memory_stats[i].ratio = HPAGE_PMD_SIZE; #endif VM_BUG_ON(!memory_stats[i].ratio); From patchwork Thu Dec 17 03:43:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979059 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 11853C4361B for ; Thu, 17 Dec 2020 03:47:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D74212371F for ; Thu, 17 Dec 2020 03:47:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728356AbgLQDrK (ORCPT ); Wed, 16 Dec 2020 22:47:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50470 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728403AbgLQDrJ (ORCPT ); Wed, 16 Dec 2020 22:47:09 -0500 Received: from mail-pj1-x102d.google.com (mail-pj1-x102d.google.com [IPv6:2607:f8b0:4864:20::102d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13C0FC0611CE for ; Wed, 16 Dec 2020 19:46:06 -0800 (PST) Received: by mail-pj1-x102d.google.com with SMTP id b5so3312634pjk.2 for ; Wed, 16 Dec 2020 19:46:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xPjoCn2Uhs8i6RF8GvQfWf4AwMz+svXGYxnAYdmZuDI=; b=FdA31vfhWIaMgeUtFlI+xUN18KN7HWs+1u/iMKZ8h9cFMId0stAFiuLOownLzArh8L 2fzM9WXVHXZsmTiYNgo8QU7+nQP5UM8beR3pMq/h2X1dBGboRcETj+dNmhqlDaQoz+tu jz1pWXsd2WzKOVU326tp4I/9a5CfKgWPYoZN4guHONSWx/6s4ArO6aaZ/yICNOSS3owR 4Hz/ORKvyCLq6lO7V8pspKM5mgeVd12DulFBRz5iBhBkH0+NqpSvhza0Yo3upeYmRnhj nFIsfm6BHRg2N3TztOzHwrCed+GDNCsUpURley6YjEjMkGrm0zE9jNwnxsaMV+iX5M96 t44A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xPjoCn2Uhs8i6RF8GvQfWf4AwMz+svXGYxnAYdmZuDI=; b=a7q/mWdDg468hK/vEUrhpa1XHolBQ3eQSEob//OojgYwg+X27zw7B0KAhhU06JqPSF NrGDegZGAjBu3eCPGLyo+5M8KPO5kZX6LUf39lyfyj/BOJBV/ZW0tjQ8OxSYmlV5/qPr /7KT7gZMzYlEGgHy/1iWtcqgz6EYfjna0erq7WA2atKCL7ytBomXyxP76luBSCHNPBpd AEecmihBJuRk4OIoyUNjho0AUjkMPKDBtUy9gCi8oYv5Z9gPI/B19ljdx4Hmhkm4QTmL DquEJG12Zj+9oG5gLenwEp+WFG8gQN8x0+Xw2uPDLGA89OXHRgJGHQY87Jbc4jF4QgQF Rgfg== X-Gm-Message-State: AOAM532grtOxAEJDUSO9Y7kFx44vXGiN9ByPzvVZqvGk4a7aBSI9apMT USo6tDwvF+B4NQ+4Bje0SJbiMg== X-Google-Smtp-Source: ABdhPJyslSNhwkUQPenLnPmRpXXnGeJnaiZemfAb8ZSiBZaOJ+bSnMxU+5XmF2/JoA9qIu+/teWM9Q== X-Received: by 2002:a17:90a:638a:: with SMTP id f10mr6021880pjj.191.1608176765601; Wed, 16 Dec 2020 19:46:05 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.45.58 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:46:05 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song Subject: [PATCH v5 4/7] mm: memcontrol: convert NR_SHMEM_THPS account to pages Date: Thu, 17 Dec 2020 11:43:53 +0800 Message-Id: <20201217034356.4708-5-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics expecially THP vmstat counters. In the systems with hundreads of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_SHMEM_THPS account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Signed-off-by: Muchun Song --- drivers/base/node.c | 3 +-- fs/proc/meminfo.c | 2 +- include/linux/mmzone.h | 3 ++- mm/filemap.c | 2 +- mm/huge_memory.c | 3 ++- mm/khugepaged.c | 2 +- mm/memcontrol.c | 26 ++------------------------ mm/page_alloc.c | 2 +- mm/shmem.c | 2 +- 9 files changed, 12 insertions(+), 33 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index d5952f754911..6d5ac6ffb6e1 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -462,8 +462,7 @@ static ssize_t node_read_meminfo(struct device *dev, #ifdef CONFIG_TRANSPARENT_HUGEPAGE , nid, K(node_page_state(pgdat, NR_ANON_THPS)), - nid, K(node_page_state(pgdat, NR_SHMEM_THPS) * - HPAGE_PMD_NR), + nid, K(node_page_state(pgdat, NR_SHMEM_THPS)), nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR), nid, K(node_page_state(pgdat, NR_FILE_THPS)), diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index 7ea4679880c8..cfb107eaa3e6 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -131,7 +131,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v) show_val_kb(m, "AnonHugePages: ", global_node_page_state(NR_ANON_THPS)); show_val_kb(m, "ShmemHugePages: ", - global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR); + global_node_page_state(NR_SHMEM_THPS)); show_val_kb(m, "ShmemPmdMapped: ", global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR); show_val_kb(m, "FileHugePages: ", diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 19e77f656410..3b45f39011ab 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -217,7 +217,8 @@ enum node_stat_item { static __always_inline bool vmstat_item_print_in_thp(enum node_stat_item item) { return item == NR_ANON_THPS || - item == NR_FILE_THPS; + item == NR_FILE_THPS || + item == NR_SHMEM_THPS; } /* diff --git a/mm/filemap.c b/mm/filemap.c index c5e6f5202476..1952e923cc2e 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -205,7 +205,7 @@ static void unaccount_page_cache_page(struct address_space *mapping, if (PageSwapBacked(page)) { __mod_lruvec_page_state(page, NR_SHMEM, -nr); if (PageTransHuge(page)) - __dec_lruvec_page_state(page, NR_SHMEM_THPS); + __mod_lruvec_page_state(page, NR_SHMEM_THPS, -nr); } else if (PageTransHuge(page)) { __mod_lruvec_page_state(page, NR_FILE_THPS, -nr); filemap_nr_thps_dec(mapping); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index cdf61596ef76..5aa045e3b5dc 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2748,7 +2748,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) int nr = thp_nr_pages(head); if (PageSwapBacked(head)) - __dec_lruvec_page_state(head, NR_SHMEM_THPS); + __mod_lruvec_page_state(head, NR_SHMEM_THPS, + -nr); else __mod_lruvec_page_state(head, NR_FILE_THPS, -nr); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 23f93a3e2e69..8369d9620f6d 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1869,7 +1869,7 @@ static void collapse_file(struct mm_struct *mm, nr = thp_nr_pages(new_page); if (is_shmem) - __inc_lruvec_page_state(new_page, NR_SHMEM_THPS); + __mod_lruvec_page_state(new_page, NR_SHMEM_THPS, nr); else { __mod_lruvec_page_state(new_page, NR_FILE_THPS, nr); filemap_nr_thps_inc(mapping); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 04985c8c6a0a..a40797a27f87 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1515,7 +1515,7 @@ struct memory_stat { unsigned int idx; }; -static struct memory_stat memory_stats[] = { +static const struct memory_stat memory_stats[] = { { "anon", PAGE_SIZE, NR_ANON_MAPPED }, { "file", PAGE_SIZE, NR_FILE_PAGES }, { "kernel_stack", 1024, NR_KERNEL_STACK_KB }, @@ -1527,14 +1527,9 @@ static struct memory_stat memory_stats[] = { { "file_dirty", PAGE_SIZE, NR_FILE_DIRTY }, { "file_writeback", PAGE_SIZE, NR_WRITEBACK }, #ifdef CONFIG_TRANSPARENT_HUGEPAGE - /* - * The ratio will be initialized in memory_stats_init(). Because - * on some architectures, the macro of HPAGE_PMD_SIZE is not - * constant(e.g. powerpc). - */ { "anon_thp", PAGE_SIZE, NR_ANON_THPS }, { "file_thp", PAGE_SIZE, NR_FILE_THPS }, - { "shmem_thp", 0, NR_SHMEM_THPS }, + { "shmem_thp", PAGE_SIZE, NR_SHMEM_THPS }, #endif { "inactive_anon", PAGE_SIZE, NR_INACTIVE_ANON }, { "active_anon", PAGE_SIZE, NR_ACTIVE_ANON }, @@ -1559,23 +1554,6 @@ static struct memory_stat memory_stats[] = { { "workingset_nodereclaim", 1, WORKINGSET_NODERECLAIM }, }; -static int __init memory_stats_init(void) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (memory_stats[i].idx == NR_SHMEM_THPS) - memory_stats[i].ratio = HPAGE_PMD_SIZE; -#endif - VM_BUG_ON(!memory_stats[i].ratio); - VM_BUG_ON(memory_stats[i].idx >= MEMCG_NR_STAT); - } - - return 0; -} -pure_initcall(memory_stats_init); - static char *memory_stat_format(struct mem_cgroup *memcg) { struct seq_buf s; diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 1700f52b7869..720fb5a220b6 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5577,7 +5577,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) K(node_page_state(pgdat, NR_WRITEBACK)), K(node_page_state(pgdat, NR_SHMEM)), #ifdef CONFIG_TRANSPARENT_HUGEPAGE - K(node_page_state(pgdat, NR_SHMEM_THPS) * HPAGE_PMD_NR), + K(node_page_state(pgdat, NR_SHMEM_THPS)), K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR), K(node_page_state(pgdat, NR_ANON_THPS)), diff --git a/mm/shmem.c b/mm/shmem.c index 53d84d2c9fe5..de261cfbf987 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -713,7 +713,7 @@ static int shmem_add_to_page_cache(struct page *page, } if (PageTransHuge(page)) { count_vm_event(THP_FILE_ALLOC); - __inc_lruvec_page_state(page, NR_SHMEM_THPS); + __mod_lruvec_page_state(page, NR_SHMEM_THPS, nr); } mapping->nrpages += nr; __mod_lruvec_page_state(page, NR_FILE_PAGES, nr); From patchwork Thu Dec 17 03:43:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979057 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F9BBC2BB48 for ; Thu, 17 Dec 2020 03:47:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 012422376F for ; Thu, 17 Dec 2020 03:47:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728539AbgLQDrL (ORCPT ); Wed, 16 Dec 2020 22:47:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728491AbgLQDrK (ORCPT ); Wed, 16 Dec 2020 22:47:10 -0500 Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B86E7C0611E4 for ; Wed, 16 Dec 2020 19:46:13 -0800 (PST) Received: by mail-pl1-x62f.google.com with SMTP id s2so14382119plr.9 for ; Wed, 16 Dec 2020 19:46:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=uCtNZ3XPH5CZg3kErhLn4u4ZqNgApfd/QZFgeqYuuAI=; b=pcm4215a4X2RNo2dZf7EyXAAYC5KGczkCgIidM57M+/r2ke+P81Ojx8ebrNuwXoYdh blqdQ0Fy9Y19G0wNejnxAoH+PRL3/buG8NTNh4djOaI8sHDZFOnaATngcnevJwtKOhXm NYZwdtZ05PtuL5ve9dpyV+kRtyfehzRh2B3179sLk6U2mx3joFvWEicyjiaiiv5Y23Wm 62xm6axFw7nYZ0fMoAnxktTa8c/FcyXJBayaP0OqrNjKjRpzm6YWI9uNZQYnuHjoYjX7 qhDfJrWK/nJTNhsqmXj3i3UgwFpL2MZTzKipwmpSOLYcw/avMBVh4r+EkK0ydX2WO1rJ 23Ag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uCtNZ3XPH5CZg3kErhLn4u4ZqNgApfd/QZFgeqYuuAI=; b=FHF2fFlfaYwlQJVo9zDnsvA1chBpJOzF44elLv0ExPUx2N1Nq30l+SICdTY05rwqln 2VUydHIfbrozcfdNHu5sep5SCjyGWLgggmmEvXxPIrEXMSiy4jacGXYUqbhY40JPToi2 PsIcYoBDOkzVyK525vCwn80Tinao1QoNzyQU7IYlJsgdNwjn5nFiayOCX8xhBDbcHrC0 geRY34q6YK5oSknhgOqb7E0COF1ZQctY/dIn/UnZZkq0UZb7yw05sGehoYyFrWd31IOV W9XkMIRVBrh7nteTtFsvbcMNJYzb6z37Krd3VfbeTQqUbCwr/k4HC0CkaL6jKjslYZbG OwmQ== X-Gm-Message-State: AOAM533zTvnnkT0N9bVKhJikBvKWG1X+/BI62RqSpDRbND3Zs8tzEB8K Jjz8jwtngR42pjb82vb7KeS51Q== X-Google-Smtp-Source: ABdhPJydrabIcHW4vN8GmUTqrv2fTW8jWRUbABXMbydTMp3cZy/rFGk/PqA2SCgBGY+rBTpsfQILIw== X-Received: by 2002:a17:90a:73c5:: with SMTP id n5mr5948234pjk.118.1608176773175; Wed, 16 Dec 2020 19:46:13 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.46.05 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:46:12 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song Subject: [PATCH v5 5/7] mm: memcontrol: convert NR_SHMEM_PMDMAPPED account to pages Date: Thu, 17 Dec 2020 11:43:54 +0800 Message-Id: <20201217034356.4708-6-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics expecially THP vmstat counters. In the systems with hundreads of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_SHMEM_PMDMAPPED account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Signed-off-by: Muchun Song --- drivers/base/node.c | 3 +-- fs/proc/meminfo.c | 2 +- include/linux/mmzone.h | 3 ++- mm/page_alloc.c | 3 +-- mm/rmap.c | 14 ++++++++++---- 5 files changed, 15 insertions(+), 10 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 6d5ac6ffb6e1..7a66aefe4e46 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -463,8 +463,7 @@ static ssize_t node_read_meminfo(struct device *dev, , nid, K(node_page_state(pgdat, NR_ANON_THPS)), nid, K(node_page_state(pgdat, NR_SHMEM_THPS)), - nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * - HPAGE_PMD_NR), + nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)), nid, K(node_page_state(pgdat, NR_FILE_THPS)), nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED) * HPAGE_PMD_NR) diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index cfb107eaa3e6..c61f440570f9 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -133,7 +133,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v) show_val_kb(m, "ShmemHugePages: ", global_node_page_state(NR_SHMEM_THPS)); show_val_kb(m, "ShmemPmdMapped: ", - global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR); + global_node_page_state(NR_SHMEM_PMDMAPPED)); show_val_kb(m, "FileHugePages: ", global_node_page_state(NR_FILE_THPS)); show_val_kb(m, "FilePmdMapped: ", diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 3b45f39011ab..1ca6a34f6a4a 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -218,7 +218,8 @@ static __always_inline bool vmstat_item_print_in_thp(enum node_stat_item item) { return item == NR_ANON_THPS || item == NR_FILE_THPS || - item == NR_SHMEM_THPS; + item == NR_SHMEM_THPS || + item == NR_SHMEM_PMDMAPPED; } /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 720fb5a220b6..575fbfeea4b5 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5578,8 +5578,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) K(node_page_state(pgdat, NR_SHMEM)), #ifdef CONFIG_TRANSPARENT_HUGEPAGE K(node_page_state(pgdat, NR_SHMEM_THPS)), - K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) - * HPAGE_PMD_NR), + K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)), K(node_page_state(pgdat, NR_ANON_THPS)), #endif K(node_page_state(pgdat, NR_WRITEBACK_TEMP)), diff --git a/mm/rmap.c b/mm/rmap.c index c4d5c63cfd29..1c1b576c0627 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1211,14 +1211,17 @@ void page_add_file_rmap(struct page *page, bool compound) VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page); lock_page_memcg(page); if (compound && PageTransHuge(page)) { - for (i = 0, nr = 0; i < thp_nr_pages(page); i++) { + int nr_pages = thp_nr_pages(page); + + for (i = 0, nr = 0; i < nr_pages; i++) { if (atomic_inc_and_test(&page[i]._mapcount)) nr++; } if (!atomic_inc_and_test(compound_mapcount_ptr(page))) goto out; if (PageSwapBacked(page)) - __inc_node_page_state(page, NR_SHMEM_PMDMAPPED); + __mod_lruvec_page_state(page, NR_SHMEM_PMDMAPPED, + nr_pages); else __inc_node_page_state(page, NR_FILE_PMDMAPPED); } else { @@ -1252,14 +1255,17 @@ static void page_remove_file_rmap(struct page *page, bool compound) /* page still mapped by someone else? */ if (compound && PageTransHuge(page)) { - for (i = 0, nr = 0; i < thp_nr_pages(page); i++) { + int nr_pages = thp_nr_pages(page); + + for (i = 0, nr = 0; i < nr_pages; i++) { if (atomic_add_negative(-1, &page[i]._mapcount)) nr++; } if (!atomic_add_negative(-1, compound_mapcount_ptr(page))) return; if (PageSwapBacked(page)) - __dec_node_page_state(page, NR_SHMEM_PMDMAPPED); + __mod_lruvec_page_state(page, NR_SHMEM_PMDMAPPED, + -nr_pages); else __dec_node_page_state(page, NR_FILE_PMDMAPPED); } else { From patchwork Thu Dec 17 03:43:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979061 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFC72C4361B for ; Thu, 17 Dec 2020 03:47:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8A9492376F for ; Thu, 17 Dec 2020 03:47:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728784AbgLQDrg (ORCPT ); Wed, 16 Dec 2020 22:47:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50540 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727268AbgLQDrg (ORCPT ); Wed, 16 Dec 2020 22:47:36 -0500 Received: from mail-pj1-x1031.google.com (mail-pj1-x1031.google.com [IPv6:2607:f8b0:4864:20::1031]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 79293C0617A7 for ; Wed, 16 Dec 2020 19:46:21 -0800 (PST) Received: by mail-pj1-x1031.google.com with SMTP id iq13so2896124pjb.3 for ; Wed, 16 Dec 2020 19:46:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=UauSishbX7RnBX7K58fFhe45sU8LPO/cdBxsyz6B3SU=; b=x2tW2mmU9L1wdYbgWOJBGPpHM/c0zI8LzK8gbOzNeqMxSEvT39yMzMhK++aZe0mwnn UinbyALx0XGkuKBzBEpw1U6woTvBcMG/jgBdIE4CBIJncj5dCI5UZCrZrfM+gMfOSwdL ACAo8bNr3y5hvoVxik4WNw2ddz1fy7xbFYjKRSGPCMzO7CurA/OzBtahS8sDWKzI6MmK 298avBcWSmMbmXgRVITspGUDf89lCfA3YRdR/hKPVBOkgZ5awc2EfPtXGiKyq+NmIRYl y9xg9gYEOwFylyDulLVSIAG3C7MUDet7P51oa8ankzc64E8dTk4d6TSA+knwLlzFmnqT 7KJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=UauSishbX7RnBX7K58fFhe45sU8LPO/cdBxsyz6B3SU=; b=qyeWjDenS4SPO86psidDdVLj1VrKjyPVuqKcN2wL3EvBa1BvfkMqpOP52ZJax8Wo+T OWdtpLHWPE3pPbNlCU87SfLp32HiKDBRz3c+GhAfh1HflRGnQ6UNKAACeKBkzfn1ERoq 4A/iWPPs/SKckFLTB50e6OrEIekVvfh0mreEom1FKvO8UmaHnMwlr2zdTPDnogYEzMS/ Fu5XCqjvqC0n6CS8apA/copXdNI2ZXEPtPErRUH47RQOg/SU0+LF/utWP6EpTLQqXZeI bgscLmwYKwnxm8dvTuXHyzhp9+ObmJrh8JmIwAPcAqqr78cEwQyt5JlZpuf62eKw08Gh 2ApA== X-Gm-Message-State: AOAM533sgjIZOyowt7eC384kYN+/JNxPCjcZ10Tlg9SCDbzxAcXxyXf7 AojUM5sBdf8dSrbbmvu4zqH7SA== X-Google-Smtp-Source: ABdhPJxeMkklPh+rYfkTZFFo93lMxHKhiwftHUyUbrO6fSQpXmBiTaS5E/LJ4Nh1ONcKI8Djt2NVjQ== X-Received: by 2002:a17:902:59d0:b029:da:69a8:11a8 with SMTP id d16-20020a17090259d0b02900da69a811a8mr34531436plj.63.1608176781034; Wed, 16 Dec 2020 19:46:21 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.46.13 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:46:20 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song Subject: [PATCH v5 6/7] mm: memcontrol: convert NR_FILE_PMDMAPPED account to pages Date: Thu, 17 Dec 2020 11:43:55 +0800 Message-Id: <20201217034356.4708-7-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics expecially THP vmstat counters. In the systems with hundreads of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_FILE_PMDMAPPED account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Signed-off-by: Muchun Song --- drivers/base/node.c | 3 +-- fs/proc/meminfo.c | 2 +- include/linux/mmzone.h | 3 ++- mm/rmap.c | 6 ++++-- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 7a66aefe4e46..d02d86aec19f 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -465,8 +465,7 @@ static ssize_t node_read_meminfo(struct device *dev, nid, K(node_page_state(pgdat, NR_SHMEM_THPS)), nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)), nid, K(node_page_state(pgdat, NR_FILE_THPS)), - nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED) * - HPAGE_PMD_NR) + nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED)) #endif ); len += hugetlb_report_node_meminfo(buf, len, nid); diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index c61f440570f9..6fa761c9cc78 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -137,7 +137,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v) show_val_kb(m, "FileHugePages: ", global_node_page_state(NR_FILE_THPS)); show_val_kb(m, "FilePmdMapped: ", - global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR); + global_node_page_state(NR_FILE_PMDMAPPED)); #endif #ifdef CONFIG_CMA diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 1ca6a34f6a4a..6139479ba417 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -219,7 +219,8 @@ static __always_inline bool vmstat_item_print_in_thp(enum node_stat_item item) return item == NR_ANON_THPS || item == NR_FILE_THPS || item == NR_SHMEM_THPS || - item == NR_SHMEM_PMDMAPPED; + item == NR_SHMEM_PMDMAPPED || + item == NR_FILE_PMDMAPPED; } /* diff --git a/mm/rmap.c b/mm/rmap.c index 1c1b576c0627..5ebf16fae4b9 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1223,7 +1223,8 @@ void page_add_file_rmap(struct page *page, bool compound) __mod_lruvec_page_state(page, NR_SHMEM_PMDMAPPED, nr_pages); else - __inc_node_page_state(page, NR_FILE_PMDMAPPED); + __mod_lruvec_page_state(page, NR_FILE_PMDMAPPED, + nr_pages); } else { if (PageTransCompound(page) && page_mapping(page)) { VM_WARN_ON_ONCE(!PageLocked(page)); @@ -1267,7 +1268,8 @@ static void page_remove_file_rmap(struct page *page, bool compound) __mod_lruvec_page_state(page, NR_SHMEM_PMDMAPPED, -nr_pages); else - __dec_node_page_state(page, NR_FILE_PMDMAPPED); + __mod_lruvec_page_state(page, NR_FILE_PMDMAPPED, + -nr_pages); } else { if (!atomic_add_negative(-1, &page->_mapcount)) return; From patchwork Thu Dec 17 03:43:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 11979063 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25063C2BB48 for ; Thu, 17 Dec 2020 03:47:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EF46B23877 for ; Thu, 17 Dec 2020 03:47:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728824AbgLQDrg (ORCPT ); Wed, 16 Dec 2020 22:47:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728779AbgLQDrg (ORCPT ); Wed, 16 Dec 2020 22:47:36 -0500 Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 23DD4C0619D2 for ; Wed, 16 Dec 2020 19:46:29 -0800 (PST) Received: by mail-pf1-x42f.google.com with SMTP id d2so18175605pfq.5 for ; Wed, 16 Dec 2020 19:46:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=p3OHjqKD6x4JCybHo/LDnqZ2DJrExSzBnbiuStXpNr8=; b=gNrD3FTQyEAeYIei7WvGyw9+CpWASIvUxWGkgRGWCKaevxWSwLYl80l46XILQxfms3 Kq0H/64myBzlRrDA2axWEhYble0/v/GmQpF2dBxpHmI7JMjPK0pe4XGhI/K2RIyfblNy I5Hv3MkiORMHb/en93POgbLuGGoZVI9xqB0swUgbPqNnIKC7CZxHOz7R0aduSiDYjiv9 h1yyeQzdOLfQ8E86PX5U0uFrzWxf8okIG5zyV2PfbBNkHJJZK2z7wph6WMcjA2Z0o/Sc Givi4tcZp7ggF0cKQOEmet7K/DiXNRfFIKzbs076wPiOR0Gvg48GtKD1iIm8xR5R9aaa 0m7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=p3OHjqKD6x4JCybHo/LDnqZ2DJrExSzBnbiuStXpNr8=; b=QUnLaAeCng/xXGabQZkvJ4YnfHqmBhjCK6g0YDzfcmjmoCvNKG8WHVwtI3Rs4M2dn0 Rm5MG2/frTFOVyoWU2y9+yx2dgde0YnHQ5FrAp11ecKjOl/48I0rPAtx9ZhnsJm5YDKF fL36B8LkBJygUBq4cNiQoMA9n34GFakBF/7XQ5BkgcBy4h+DQN7t2k00gCiETZW314wW wxeXXrMtqwvrS58zI93nRG8g5ZJVhdvuqNj2OJbfSK0ZXkQ1AyOpGNfwElnXpX342Ath TOlVr+raaZLg5GQD3As7BgaoKesOz8QTeCWIXOJZJmbKelL6lulvrAjq8rk/sfe6wAqz BDvg== X-Gm-Message-State: AOAM531AUGoDl0vS5HbohHFFD5hZ1eCSL3XHACihb9M18nkS9KVa+nL3 +fetTP0QjrKCigx+uAmJjay0+A== X-Google-Smtp-Source: ABdhPJwimDhYyddymRxBCGj9F04EBR3CDzft0fiHwDgJ3l08K4VUVjY6HdPu2jGBPVyaPoRUT9luVA== X-Received: by 2002:a62:d142:0:b029:19e:62a0:ca1a with SMTP id t2-20020a62d1420000b029019e62a0ca1amr34445419pfl.80.1608176788668; Wed, 16 Dec 2020 19:46:28 -0800 (PST) Received: from localhost.localdomain ([139.177.225.237]) by smtp.gmail.com with ESMTPSA id b2sm3792412pfo.164.2020.12.16.19.46.21 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Dec 2020 19:46:28 -0800 (PST) From: Muchun Song To: gregkh@linuxfoundation.org, rafael@kernel.org, adobriyan@gmail.com, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, hughd@google.com, shakeelb@google.com, guro@fb.com, samitolvanen@google.com, feng.tang@intel.com, neilb@suse.de, iamjoonsoo.kim@lge.com, rdunlap@infradead.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Muchun Song Subject: [PATCH v5 7/7] mm: memcontrol: make the slab calculation consistent Date: Thu, 17 Dec 2020 11:43:56 +0800 Message-Id: <20201217034356.4708-8-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20201217034356.4708-1-songmuchun@bytedance.com> References: <20201217034356.4708-1-songmuchun@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Although the ratio of the slab is one, we also should read the ratio from the related memory_stats instead of hard-coding. And the local variable of size is already the value of slab_unreclaimable. So we do not need to read again. To do this we need some code like below: if (unlikely(memory_stats[i].idx == NR_SLAB_UNRECLAIMABLE_B)) { - size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + - memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B); + size += memcg_page_state(memcg, memory_stats[i - 1].idx) * + memory_stats[i - 1].ratio; It requires a series of BUG_ONs or comments to ensure these two items are actually adjacent and in the right order. So it would probably be easier to implement this using a wrapper that has a big switch() for unit conversion. This would fix the ratio inconsistency and get rid of the order guarantee. Signed-off-by: Muchun Song --- mm/memcontrol.c | 105 +++++++++++++++++++++++++++++++++++--------------------- 1 file changed, 66 insertions(+), 39 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a40797a27f87..eec44918d373 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1511,49 +1511,71 @@ static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg) struct memory_stat { const char *name; - unsigned int ratio; unsigned int idx; }; static const struct memory_stat memory_stats[] = { - { "anon", PAGE_SIZE, NR_ANON_MAPPED }, - { "file", PAGE_SIZE, NR_FILE_PAGES }, - { "kernel_stack", 1024, NR_KERNEL_STACK_KB }, - { "pagetables", PAGE_SIZE, NR_PAGETABLE }, - { "percpu", 1, MEMCG_PERCPU_B }, - { "sock", PAGE_SIZE, MEMCG_SOCK }, - { "shmem", PAGE_SIZE, NR_SHMEM }, - { "file_mapped", PAGE_SIZE, NR_FILE_MAPPED }, - { "file_dirty", PAGE_SIZE, NR_FILE_DIRTY }, - { "file_writeback", PAGE_SIZE, NR_WRITEBACK }, + { "anon", NR_ANON_MAPPED }, + { "file", NR_FILE_PAGES }, + { "kernel_stack", NR_KERNEL_STACK_KB }, + { "pagetables", NR_PAGETABLE }, + { "percpu", MEMCG_PERCPU_B }, + { "sock", MEMCG_SOCK }, + { "shmem", NR_SHMEM }, + { "file_mapped", NR_FILE_MAPPED }, + { "file_dirty", NR_FILE_DIRTY }, + { "file_writeback", NR_WRITEBACK }, #ifdef CONFIG_TRANSPARENT_HUGEPAGE - { "anon_thp", PAGE_SIZE, NR_ANON_THPS }, - { "file_thp", PAGE_SIZE, NR_FILE_THPS }, - { "shmem_thp", PAGE_SIZE, NR_SHMEM_THPS }, + { "anon_thp", NR_ANON_THPS }, + { "file_thp", NR_FILE_THPS }, + { "shmem_thp", NR_SHMEM_THPS }, #endif - { "inactive_anon", PAGE_SIZE, NR_INACTIVE_ANON }, - { "active_anon", PAGE_SIZE, NR_ACTIVE_ANON }, - { "inactive_file", PAGE_SIZE, NR_INACTIVE_FILE }, - { "active_file", PAGE_SIZE, NR_ACTIVE_FILE }, - { "unevictable", PAGE_SIZE, NR_UNEVICTABLE }, - - /* - * Note: The slab_reclaimable and slab_unreclaimable must be - * together and slab_reclaimable must be in front. - */ - { "slab_reclaimable", 1, NR_SLAB_RECLAIMABLE_B }, - { "slab_unreclaimable", 1, NR_SLAB_UNRECLAIMABLE_B }, + { "inactive_anon", NR_INACTIVE_ANON }, + { "active_anon", NR_ACTIVE_ANON }, + { "inactive_file", NR_INACTIVE_FILE }, + { "active_file", NR_ACTIVE_FILE }, + { "unevictable", NR_UNEVICTABLE }, + { "slab_reclaimable", NR_SLAB_RECLAIMABLE_B }, + { "slab_unreclaimable", NR_SLAB_UNRECLAIMABLE_B }, /* The memory events */ - { "workingset_refault_anon", 1, WORKINGSET_REFAULT_ANON }, - { "workingset_refault_file", 1, WORKINGSET_REFAULT_FILE }, - { "workingset_activate_anon", 1, WORKINGSET_ACTIVATE_ANON }, - { "workingset_activate_file", 1, WORKINGSET_ACTIVATE_FILE }, - { "workingset_restore_anon", 1, WORKINGSET_RESTORE_ANON }, - { "workingset_restore_file", 1, WORKINGSET_RESTORE_FILE }, - { "workingset_nodereclaim", 1, WORKINGSET_NODERECLAIM }, + { "workingset_refault_anon", WORKINGSET_REFAULT_ANON }, + { "workingset_refault_file", WORKINGSET_REFAULT_FILE }, + { "workingset_activate_anon", WORKINGSET_ACTIVATE_ANON }, + { "workingset_activate_file", WORKINGSET_ACTIVATE_FILE }, + { "workingset_restore_anon", WORKINGSET_RESTORE_ANON }, + { "workingset_restore_file", WORKINGSET_RESTORE_FILE }, + { "workingset_nodereclaim", WORKINGSET_NODERECLAIM }, }; +/* Translate stat items to the correct unit for memory.stat output */ +static int memcg_page_state_unit(int item) +{ + switch (item) { + case MEMCG_PERCPU_B: + case NR_SLAB_RECLAIMABLE_B: + case NR_SLAB_UNRECLAIMABLE_B: + case WORKINGSET_REFAULT_ANON: + case WORKINGSET_REFAULT_FILE: + case WORKINGSET_ACTIVATE_ANON: + case WORKINGSET_ACTIVATE_FILE: + case WORKINGSET_RESTORE_ANON: + case WORKINGSET_RESTORE_FILE: + case WORKINGSET_NODERECLAIM: + return 1; + case NR_KERNEL_STACK_KB: + return SZ_1K; + default: + return PAGE_SIZE; + } +} + +static inline unsigned long memcg_page_state_output(struct mem_cgroup *memcg, + int item) +{ + return memcg_page_state(memcg, item) * memcg_page_state_unit(item); +} + static char *memory_stat_format(struct mem_cgroup *memcg) { struct seq_buf s; @@ -1577,13 +1599,12 @@ static char *memory_stat_format(struct mem_cgroup *memcg) for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { u64 size; - size = memcg_page_state(memcg, memory_stats[i].idx); - size *= memory_stats[i].ratio; + size = memcg_page_state_output(memcg, memory_stats[i].idx); seq_buf_printf(&s, "%s %llu\n", memory_stats[i].name, size); if (unlikely(memory_stats[i].idx == NR_SLAB_UNRECLAIMABLE_B)) { - size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + - memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B); + size += memcg_page_state_output(memcg, + NR_SLAB_RECLAIMABLE_B); seq_buf_printf(&s, "slab %llu\n", size); } } @@ -6377,6 +6398,12 @@ static int memory_stat_show(struct seq_file *m, void *v) } #ifdef CONFIG_NUMA +static inline unsigned long lruvec_page_state_output(struct lruvec *lruvec, + int item) +{ + return lruvec_page_state(lruvec, item) * memcg_page_state_unit(item); +} + static int memory_numa_stat_show(struct seq_file *m, void *v) { int i; @@ -6394,8 +6421,8 @@ static int memory_numa_stat_show(struct seq_file *m, void *v) struct lruvec *lruvec; lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid)); - size = lruvec_page_state(lruvec, memory_stats[i].idx); - size *= memory_stats[i].ratio; + size = lruvec_page_state_output(lruvec, + memory_stats[i].idx); seq_printf(m, " N%d=%llu", nid, size); } seq_putc(m, '\n');