From patchwork Fri Apr 12 15:15:04 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 10898539
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/4] mm: memcontrol: make cgroup stats and events query API explicitly local
Date: Fri, 12 Apr 2019 11:15:04 -0400
Message-Id: <20190412151507.2769-2-hannes@cmpxchg.org>
In-Reply-To: <20190412151507.2769-1-hannes@cmpxchg.org>
References: <20190412151507.2769-1-hannes@cmpxchg.org>

memcg_page_state(), lruvec_page_state() and memcg_sum_events() currently
return the state of the local memcg or lruvec, not the recursive state.
In practice there is a demand for both versions, although the callers
that want the recursive counts currently sum them up by hand.

By default, cgroups are considered recursive entities, and we generally
expect more users of the recursive counters, with the local counts being
the special case. To reflect that in the name, add a _local suffix to
the current implementations. The following patch will reintroduce these
functions with recursive semantics, but with an O(1) implementation.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h | 16 +++++++--------
 mm/memcontrol.c            | 40 ++++++++++++++++++++------------------
 mm/vmscan.c                |  4 ++--
 mm/workingset.c            |  7 ++++---
 4 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3823cb335b60..139be7d44c29 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -569,8 +569,8 @@ void unlock_page_memcg(struct page *page);
  * idx can be of type enum memcg_stat_item or node_stat_item.
  * Keep in sync with memcg_exact_page_state().
  */
-static inline unsigned long memcg_page_state(struct mem_cgroup *memcg,
-                                             int idx)
+static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
+                                                   int idx)
 {
         long x = atomic_long_read(&memcg->vmstats[idx]);
 #ifdef CONFIG_SMP
@@ -639,8 +639,8 @@ static inline void mod_memcg_page_state(struct page *page,
         mod_memcg_state(page->mem_cgroup, idx, val);
 }
 
-static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
-                                              enum node_stat_item idx)
+static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
+                                                    enum node_stat_item idx)
 {
         struct mem_cgroup_per_node *pn;
         long x;
@@ -1043,8 +1043,8 @@ static inline void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 {
 }
 
-static inline unsigned long memcg_page_state(struct mem_cgroup *memcg,
-                                             int idx)
+static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
+                                                   int idx)
 {
         return 0;
 }
@@ -1073,8 +1073,8 @@ static inline void mod_memcg_page_state(struct page *page,
 {
 }
 
-static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
-                                              enum node_stat_item idx)
+static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
+                                                    enum node_stat_item idx)
 {
         return node_page_state(lruvec_pgdat(lruvec), idx);
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cd03b1181f7f..109608b8091f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -687,8 +687,8 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
         return mz;
 }
 
-static unsigned long memcg_sum_events(struct mem_cgroup *memcg,
-                                      int event)
+static unsigned long memcg_events_local(struct mem_cgroup *memcg,
+                                        int event)
 {
         return atomic_long_read(&memcg->vmevents[event]);
 }
@@ -1325,12 +1325,14 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
                 if (memcg1_stats[i] == MEMCG_SWAP && !do_swap_account)
                         continue;
                 pr_cont(" %s:%luKB", memcg1_stat_names[i],
-                        K(memcg_page_state(iter, memcg1_stats[i])));
+                        K(memcg_page_state_local(iter,
+                                                 memcg1_stats[i])));
         }
 
         for (i = 0; i < NR_LRU_LISTS; i++)
                 pr_cont(" %s:%luKB", mem_cgroup_lru_names[i],
-                        K(memcg_page_state(iter, NR_LRU_BASE + i)));
+                        K(memcg_page_state_local(iter,
+                                                 NR_LRU_BASE + i)));
 
         pr_cont("\n");
 }
@@ -1401,13 +1403,13 @@ static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg,
 {
         struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
 
-        if (lruvec_page_state(lruvec, NR_INACTIVE_FILE) ||
-            lruvec_page_state(lruvec, NR_ACTIVE_FILE))
+        if (lruvec_page_state_local(lruvec, NR_INACTIVE_FILE) ||
+            lruvec_page_state_local(lruvec, NR_ACTIVE_FILE))
                 return true;
         if (noswap || !total_swap_pages)
                 return false;
-        if (lruvec_page_state(lruvec, NR_INACTIVE_ANON) ||
-            lruvec_page_state(lruvec, NR_ACTIVE_ANON))
+        if (lruvec_page_state_local(lruvec, NR_INACTIVE_ANON) ||
+            lruvec_page_state_local(lruvec, NR_ACTIVE_ANON))
                 return true;
 
         return false;
@@ -2976,16 +2978,16 @@ static void accumulate_vmstats(struct mem_cgroup *memcg,
 
         for_each_mem_cgroup_tree(mi, memcg) {
                 for (i = 0; i < acc->vmstats_size; i++)
-                        acc->vmstats[i] += memcg_page_state(mi,
+                        acc->vmstats[i] += memcg_page_state_local(mi,
                                 acc->vmstats_array ? acc->vmstats_array[i] : i);
 
                 for (i = 0; i < acc->vmevents_size; i++)
-                        acc->vmevents[i] += memcg_sum_events(mi,
+                        acc->vmevents[i] += memcg_events_local(mi,
                                 acc->vmevents_array
                                 ? acc->vmevents_array[i] : i);
 
                 for (i = 0; i < NR_LRU_LISTS; i++)
-                        acc->lru_pages[i] += memcg_page_state(mi,
+                        acc->lru_pages[i] += memcg_page_state_local(mi,
                                               NR_LRU_BASE + i);
         }
 }
@@ -2998,10 +3000,10 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
                 struct mem_cgroup *iter;
 
                 for_each_mem_cgroup_tree(iter, memcg) {
-                        val += memcg_page_state(iter, MEMCG_CACHE);
-                        val += memcg_page_state(iter, MEMCG_RSS);
+                        val += memcg_page_state_local(iter, MEMCG_CACHE);
+                        val += memcg_page_state_local(iter, MEMCG_RSS);
                         if (swap)
-                                val += memcg_page_state(iter, MEMCG_SWAP);
+                                val += memcg_page_state_local(iter, MEMCG_SWAP);
                 }
         } else {
                 if (!swap)
@@ -3343,7 +3345,7 @@ static unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
         for_each_lru(lru) {
                 if (!(BIT(lru) & lru_mask))
                         continue;
-                nr += lruvec_page_state(lruvec, NR_LRU_BASE + lru);
+                nr += lruvec_page_state_local(lruvec, NR_LRU_BASE + lru);
         }
         return nr;
 }
@@ -3357,7 +3359,7 @@ static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *memcg,
         for_each_lru(lru) {
                 if (!(BIT(lru) & lru_mask))
                         continue;
-                nr += memcg_page_state(memcg, NR_LRU_BASE + lru);
+                nr += memcg_page_state_local(memcg, NR_LRU_BASE + lru);
         }
         return nr;
 }
@@ -3442,17 +3444,17 @@ static int memcg_stat_show(struct seq_file *m, void *v)
                 if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account())
                         continue;
                 seq_printf(m, "%s %lu\n", memcg1_stat_names[i],
-                           memcg_page_state(memcg, memcg1_stats[i]) *
+                           memcg_page_state_local(memcg, memcg1_stats[i]) *
                            PAGE_SIZE);
         }
 
         for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
                 seq_printf(m, "%s %lu\n", memcg1_event_names[i],
-                           memcg_sum_events(memcg, memcg1_events[i]));
+                           memcg_events_local(memcg, memcg1_events[i]));
 
         for (i = 0; i < NR_LRU_LISTS; i++)
                 seq_printf(m, "%s %lu\n", mem_cgroup_lru_names[i],
-                           memcg_page_state(memcg, NR_LRU_BASE + i) *
+                           memcg_page_state_local(memcg, NR_LRU_BASE + i) *
                            PAGE_SIZE);
 
         /* Hierarchical information */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c9f8afe61ae3..6e99a8b9b2ad 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -346,7 +346,7 @@ unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone
         int zid;
 
         if (!mem_cgroup_disabled())
-                lru_size = lruvec_page_state(lruvec, NR_LRU_BASE + lru);
+                lru_size = lruvec_page_state_local(lruvec, NR_LRU_BASE + lru);
         else
                 lru_size = node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);
 
@@ -2163,7 +2163,7 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
          * is being established. Disable active list protection to get
          * rid of the stale workingset quickly.
          */
-        refaults = lruvec_page_state(lruvec, WORKINGSET_ACTIVATE);
+        refaults = lruvec_page_state_local(lruvec, WORKINGSET_ACTIVATE);
         if (file && actual_reclaim && lruvec->refaults != refaults) {
                 inactive_ratio = 0;
         } else {
diff --git a/mm/workingset.c b/mm/workingset.c
index 6419baebd306..e0b4edcb88c8 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -430,9 +430,10 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 
                 lruvec = mem_cgroup_lruvec(NODE_DATA(sc->nid), sc->memcg);
                 for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
-                        pages += lruvec_page_state(lruvec, NR_LRU_BASE + i);
-                pages += lruvec_page_state(lruvec, NR_SLAB_RECLAIMABLE);
-                pages += lruvec_page_state(lruvec, NR_SLAB_UNRECLAIMABLE);
+                        pages += lruvec_page_state_local(lruvec,
+                                                         NR_LRU_BASE + i);
+                pages += lruvec_page_state_local(lruvec, NR_SLAB_RECLAIMABLE);
+                pages += lruvec_page_state_local(lruvec, NR_SLAB_UNRECLAIMABLE);
         } else
 #endif
                 pages = node_present_pages(sc->nid);
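For context, the by-hand pattern such callers are left with (and that
patch 3 in this series replaces with O(1) reads) looks like the sketch
below: walk the subtree and sum the local counters. This is a minimal
illustration only; the helper name memcg_page_state_recursive() is
hypothetical and not part of this series:

	/*
	 * Hypothetical helper: recursive state via a subtree walk, the
	 * O(number-of-cgroups) pattern used e.g. by mem_cgroup_usage().
	 */
	static unsigned long memcg_page_state_recursive(struct mem_cgroup *memcg,
							int idx)
	{
		unsigned long val = 0;
		struct mem_cgroup *iter;

		for_each_mem_cgroup_tree(iter, memcg)
			val += memcg_page_state_local(iter, idx);
		return val;
	}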
From patchwork Fri Apr 12 15:15:05 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 10898543
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/4] mm: memcontrol: move stat/event counting functions out-of-line
Date: Fri, 12 Apr 2019 11:15:05 -0400
Message-Id: <20190412151507.2769-3-hannes@cmpxchg.org>
In-Reply-To: <20190412151507.2769-1-hannes@cmpxchg.org>
References: <20190412151507.2769-1-hannes@cmpxchg.org>

These are getting too big to be inlined in every callsite. They were
stolen from vmstat.c, which already out-of-lines them, and they have
only been growing since. The callsites aren't that hot, either.

Move __mod_memcg_state(), __mod_lruvec_state() and __count_memcg_events()
out of line and add kerneldoc comments.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h | 62 +++---------------------------
 mm/memcontrol.c            | 79 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+), 57 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 139be7d44c29..cae7d1b11eea 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -580,22 +580,7 @@ static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
         return x;
 }
 
-/* idx can be of type enum memcg_stat_item or node_stat_item */
-static inline void __mod_memcg_state(struct mem_cgroup *memcg,
-                                     int idx, int val)
-{
-        long x;
-
-        if (mem_cgroup_disabled())
-                return;
-
-        x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
-        if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
-                atomic_long_add(x, &memcg->vmstats[idx]);
-                x = 0;
-        }
-        __this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
-}
+void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val);
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
 static inline void mod_memcg_state(struct mem_cgroup *memcg,
@@ -657,31 +642,8 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
         return x;
 }
 
-static inline void __mod_lruvec_state(struct lruvec *lruvec,
-                                      enum node_stat_item idx, int val)
-{
-        struct mem_cgroup_per_node *pn;
-        long x;
-
-        /* Update node */
-        __mod_node_page_state(lruvec_pgdat(lruvec), idx, val);
-
-        if (mem_cgroup_disabled())
-                return;
-
-        pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-
-        /* Update memcg */
-        __mod_memcg_state(pn->memcg, idx, val);
-
-        /* Update lruvec */
-        x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
-        if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
-                atomic_long_add(x, &pn->lruvec_stat[idx]);
-                x = 0;
-        }
-        __this_cpu_write(pn->lruvec_stat_cpu->count[idx], x);
-}
+void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
+                        int val);
 
 static inline void mod_lruvec_state(struct lruvec *lruvec,
                                     enum node_stat_item idx, int val)
@@ -723,22 +685,8 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
                                                 gfp_t gfp_mask,
                                                 unsigned long *total_scanned);
 
-static inline void __count_memcg_events(struct mem_cgroup *memcg,
-                                        enum vm_event_item idx,
-                                        unsigned long count)
-{
-        unsigned long x;
-
-        if (mem_cgroup_disabled())
-                return;
-
-        x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]);
-        if (unlikely(x > MEMCG_CHARGE_BATCH)) {
-                atomic_long_add(x, &memcg->vmevents[idx]);
-                x = 0;
-        }
-        __this_cpu_write(memcg->vmstats_percpu->events[idx], x);
-}
+void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
+                          unsigned long count);
 
 static inline void count_memcg_events(struct mem_cgroup *memcg,
                                       enum vm_event_item idx,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 109608b8091f..3535270ebeec 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -687,6 +687,85 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
         return mz;
 }
 
+/**
+ * __mod_memcg_state - update cgroup memory statistics
+ * @memcg: the memory cgroup
+ * @idx: the stat item - can be enum memcg_stat_item or enum node_stat_item
+ * @val: delta to add to the counter, can be negative
+ */
+void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
+{
+        long x;
+
+        if (mem_cgroup_disabled())
+                return;
+
+        x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
+        if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
+                atomic_long_add(x, &memcg->vmstats[idx]);
+                x = 0;
+        }
+        __this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
+}
+
+/**
+ * __mod_lruvec_state - update lruvec memory statistics
+ * @lruvec: the lruvec
+ * @idx: the stat item
+ * @val: delta to add to the counter, can be negative
+ *
+ * The lruvec is the intersection of the NUMA node and a cgroup. This
+ * function updates all three counters that are affected by a
+ * change of state at this level: per-node, per-cgroup, per-lruvec.
+ */
+void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
+                        int val)
+{
+        struct mem_cgroup_per_node *pn;
+        long x;
+
+        /* Update node */
+        __mod_node_page_state(lruvec_pgdat(lruvec), idx, val);
+
+        if (mem_cgroup_disabled())
+                return;
+
+        pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+
+        /* Update memcg */
+        __mod_memcg_state(pn->memcg, idx, val);
+
+        /* Update lruvec */
+        x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
+        if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
+                atomic_long_add(x, &pn->lruvec_stat[idx]);
+                x = 0;
+        }
+        __this_cpu_write(pn->lruvec_stat_cpu->count[idx], x);
+}
+
+/**
+ * __count_memcg_events - account VM events in a cgroup
+ * @memcg: the memory cgroup
+ * @idx: the event item
+ * @count: the number of events that occurred
+ */
+void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
+                          unsigned long count)
+{
+        unsigned long x;
+
+        if (mem_cgroup_disabled())
+                return;
+
+        x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]);
+        if (unlikely(x > MEMCG_CHARGE_BATCH)) {
+                atomic_long_add(x, &memcg->vmevents[idx]);
+                x = 0;
+        }
+        __this_cpu_write(memcg->vmstats_percpu->events[idx], x);
+}
+
 static unsigned long memcg_events_local(struct mem_cgroup *memcg,
                                         int event)
 {
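All three moved functions share one per-cpu batching scheme: deltas
accumulate in a per-cpu slot and only spill into the shared atomic once
they exceed MEMCG_CHARGE_BATCH. A stripped-down sketch of just that
pattern, with generic names rather than the memcg data structures:

	#define BATCH	32	/* stand-in for MEMCG_CHARGE_BATCH */

	struct batched_counter {
		atomic_long_t	total;		/* shared; one RMW per spill */
		long __percpu	*pending;	/* cheap CPU-local slack */
	};

	static void batched_counter_add(struct batched_counter *c, long val)
	{
		long x = val + __this_cpu_read(*c->pending);

		if (unlikely(abs(x) > BATCH)) {
			atomic_long_add(x, &c->total);	/* infrequent spill */
			x = 0;
		}
		__this_cpu_write(*c->pending, x);	/* keep the remainder */
	}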
From patchwork Fri Apr 12 15:15:06 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 10898545
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 3/4] mm: memcontrol: fix recursive statistics correctness & scalability
Date: Fri, 12 Apr 2019 11:15:06 -0400
Message-Id: <20190412151507.2769-4-hannes@cmpxchg.org>
In-Reply-To: <20190412151507.2769-1-hannes@cmpxchg.org>
References: <20190412151507.2769-1-hannes@cmpxchg.org>

Right now, when somebody needs to know the recursive memory statistics
and events of a cgroup subtree, they need to walk the entire subtree and
sum up the counters manually. There are two issues with this:

1. When a cgroup gets deleted, its stats are lost. The state counters
   should all be 0 at that point, of course, but the events are not.
   When this happens, the event counters, which are supposed to be
   monotonic, can go backwards in the parent cgroups.

2. During regular operation, we always have a certain number of lazily
   freed cgroups sitting around that have been deleted, have no tasks,
   but have a few cache pages remaining. These groups' statistics do not
   change until we eventually hit memory pressure, but somebody
   watching, say, memory.stat on an ancestor has to iterate those every
   time.

This patch addresses both issues by introducing recursive counters at
each level that are propagated from the write side when stats change.

Upward propagation happens when the per-cpu caches spill over into the
local atomic counter. This is the same thing we do during charge and
uncharge, except that the latter uses atomic RMWs, which are more
expensive; stat changes happen at around the same rate. In a sparse file
test (page faults and reclaim at maximum CPU speed) with 5 cgroup
nesting levels, perf shows __mod_memcg_page_state() at ~1%.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt
---
 include/linux/memcontrol.h |  54 +++++++++-
 mm/memcontrol.c            | 205 ++++++++++++++++++-------------------
 2 files changed, 150 insertions(+), 109 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index cae7d1b11eea..36bdfe8e5965 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -128,6 +128,7 @@ struct mem_cgroup_per_node {
 
         struct lruvec_stat __percpu *lruvec_stat_cpu;
         atomic_long_t           lruvec_stat[NR_VM_NODE_STAT_ITEMS];
+        atomic_long_t           lruvec_stat_local[NR_VM_NODE_STAT_ITEMS];
 
         unsigned long           lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
 
@@ -279,8 +280,12 @@ struct mem_cgroup {
         MEMCG_PADDING(_pad2_);
 
         atomic_long_t           vmstats[MEMCG_NR_STAT];
+        atomic_long_t           vmstats_local[MEMCG_NR_STAT];
+
         atomic_long_t           vmevents[NR_VM_EVENT_ITEMS];
-        atomic_long_t           memory_events[MEMCG_NR_MEMORY_EVENTS];
+        atomic_long_t           vmevents_local[NR_VM_EVENT_ITEMS];
+
+        atomic_long_t           memory_events[MEMCG_NR_MEMORY_EVENTS];
 
         unsigned long           socket_pressure;
 
@@ -565,6 +570,20 @@ struct mem_cgroup *lock_page_memcg(struct page *page);
 void __unlock_page_memcg(struct mem_cgroup *memcg);
 void unlock_page_memcg(struct page *page);
 
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page_state().
+ */
+static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
+{
+        long x = atomic_long_read(&memcg->vmstats[idx]);
+#ifdef CONFIG_SMP
+        if (x < 0)
+                x = 0;
+#endif
+        return x;
+}
+
 /*
  * idx can be of type enum memcg_stat_item or node_stat_item.
  * Keep in sync with memcg_exact_page_state().
@@ -572,7 +591,7 @@ void unlock_page_memcg(struct page *page);
 static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
                                                    int idx)
 {
-        long x = atomic_long_read(&memcg->vmstats[idx]);
+        long x = atomic_long_read(&memcg->vmstats_local[idx]);
 #ifdef CONFIG_SMP
         if (x < 0)
                 x = 0;
@@ -624,6 +643,24 @@ static inline void mod_memcg_page_state(struct page *page,
         mod_memcg_state(page->mem_cgroup, idx, val);
 }
 
+static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
+                                              enum node_stat_item idx)
+{
+        struct mem_cgroup_per_node *pn;
+        long x;
+
+        if (mem_cgroup_disabled())
+                return node_page_state(lruvec_pgdat(lruvec), idx);
+
+        pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+        x = atomic_long_read(&pn->lruvec_stat[idx]);
+#ifdef CONFIG_SMP
+        if (x < 0)
+                x = 0;
+#endif
+        return x;
+}
+
 static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
                                                     enum node_stat_item idx)
 {
@@ -634,7 +671,7 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
                 return node_page_state(lruvec_pgdat(lruvec), idx);
 
         pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-        x = atomic_long_read(&pn->lruvec_stat[idx]);
+        x = atomic_long_read(&pn->lruvec_stat_local[idx]);
 #ifdef CONFIG_SMP
         if (x < 0)
                 x = 0;
@@ -991,6 +1028,11 @@ static inline void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 {
 }
 
+static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
+{
+        return 0;
+}
+
 static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
                                                    int idx)
 {
@@ -1021,6 +1063,12 @@ static inline void mod_memcg_page_state(struct page *page,
 {
 }
 
+static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
+                                              enum node_stat_item idx)
+{
+        return node_page_state(lruvec_pgdat(lruvec), idx);
+}
+
 static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
                                                     enum node_stat_item idx)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3535270ebeec..2eb2d4ef9b34 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -702,12 +702,27 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 
         x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
         if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
-                atomic_long_add(x, &memcg->vmstats[idx]);
+                struct mem_cgroup *mi;
+
+                atomic_long_add(x, &memcg->vmstats_local[idx]);
+                for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
+                        atomic_long_add(x, &mi->vmstats[idx]);
                 x = 0;
         }
         __this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
 }
 
+static struct mem_cgroup_per_node *
+parent_nodeinfo(struct mem_cgroup_per_node *pn, int nid)
+{
+        struct mem_cgroup *parent;
+
+        parent = parent_mem_cgroup(pn->memcg);
+        if (!parent)
+                return NULL;
+        return mem_cgroup_nodeinfo(parent, nid);
+}
+
 /**
  * __mod_lruvec_state - update lruvec memory statistics
  * @lruvec: the lruvec
@@ -721,24 +736,31 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
                         int val)
 {
+        pg_data_t *pgdat = lruvec_pgdat(lruvec);
         struct mem_cgroup_per_node *pn;
+        struct mem_cgroup *memcg;
         long x;
 
         /* Update node */
-        __mod_node_page_state(lruvec_pgdat(lruvec), idx, val);
+        __mod_node_page_state(pgdat, idx, val);
 
         if (mem_cgroup_disabled())
                 return;
 
         pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+        memcg = pn->memcg;
 
         /* Update memcg */
-        __mod_memcg_state(pn->memcg, idx, val);
+        __mod_memcg_state(memcg, idx, val);
 
         /* Update lruvec */
         x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
         if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
-                atomic_long_add(x, &pn->lruvec_stat[idx]);
+                struct mem_cgroup_per_node *pi;
+
+                atomic_long_add(x, &pn->lruvec_stat_local[idx]);
+                for (pi = pn; pi; pi = parent_nodeinfo(pi, pgdat->node_id))
+                        atomic_long_add(x, &pi->lruvec_stat[idx]);
                 x = 0;
         }
         __this_cpu_write(pn->lruvec_stat_cpu->count[idx], x);
@@ -760,18 +782,26 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 
         x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]);
         if (unlikely(x > MEMCG_CHARGE_BATCH)) {
-                atomic_long_add(x, &memcg->vmevents[idx]);
+                struct mem_cgroup *mi;
+
+                atomic_long_add(x, &memcg->vmevents_local[idx]);
+                for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
+                        atomic_long_add(x, &mi->vmevents[idx]);
                 x = 0;
         }
         __this_cpu_write(memcg->vmstats_percpu->events[idx], x);
 }
 
-static unsigned long memcg_events_local(struct mem_cgroup *memcg,
-                                        int event)
+static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
 {
         return atomic_long_read(&memcg->vmevents[event]);
 }
 
+static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
+{
+        return atomic_long_read(&memcg->vmevents_local[event]);
+}
+
 static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
                                          struct page *page,
                                          bool compound, int nr_pages)
@@ -2162,7 +2192,7 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
         struct memcg_stock_pcp *stock;
-        struct mem_cgroup *memcg;
+        struct mem_cgroup *memcg, *mi;
 
         stock = &per_cpu(memcg_stock, cpu);
         drain_stock(stock);
@@ -2175,8 +2205,11 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
                         long x;
 
                         x = this_cpu_xchg(memcg->vmstats_percpu->stat[i], 0);
-                        if (x)
-                                atomic_long_add(x, &memcg->vmstats[i]);
+                        if (x) {
+                                atomic_long_add(x, &memcg->vmstats_local[i]);
+                                for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
+                                        atomic_long_add(x, &mi->vmstats[i]);
+                        }
 
                         if (i >= NR_VM_NODE_STAT_ITEMS)
                                 continue;
@@ -2186,8 +2219,12 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
                                 pn = mem_cgroup_nodeinfo(memcg, nid);
 
                                 x = this_cpu_xchg(pn->lruvec_stat_cpu->count[i], 0);
-                                if (x)
-                                        atomic_long_add(x, &pn->lruvec_stat[i]);
+                                if (x) {
+                                        atomic_long_add(x, &pn->lruvec_stat_local[i]);
+                                        do {
+                                                atomic_long_add(x, &pn->lruvec_stat[i]);
+                                        } while ((pn = parent_nodeinfo(pn, nid)));
+                                }
                         }
                 }
 
@@ -2195,8 +2232,11 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
                         long x;
 
                         x = this_cpu_xchg(memcg->vmstats_percpu->events[i], 0);
-                        if (x)
-                                atomic_long_add(x, &memcg->vmevents[i]);
+                        if (x) {
+                                atomic_long_add(x, &memcg->vmevents_local[i]);
+                                for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
+                                        atomic_long_add(x, &mi->vmevents[i]);
+                        }
                 }
         }
 
@@ -3036,54 +3076,15 @@ static int mem_cgroup_hierarchy_write(struct cgroup_subsys_state *css,
         return retval;
 }
 
-struct accumulated_vmstats {
-        unsigned long vmstats[MEMCG_NR_STAT];
-        unsigned long vmevents[NR_VM_EVENT_ITEMS];
-        unsigned long lru_pages[NR_LRU_LISTS];
-
-        /* overrides for v1 */
-        const unsigned int *vmstats_array;
-        const unsigned int *vmevents_array;
-
-        int vmstats_size;
-        int vmevents_size;
-};
-
-static void accumulate_vmstats(struct mem_cgroup *memcg,
-                               struct accumulated_vmstats *acc)
-{
-        struct mem_cgroup *mi;
-        int i;
-
-        for_each_mem_cgroup_tree(mi, memcg) {
-                for (i = 0; i < acc->vmstats_size; i++)
-                        acc->vmstats[i] += memcg_page_state_local(mi,
-                                acc->vmstats_array ? acc->vmstats_array[i] : i);
-
-                for (i = 0; i < acc->vmevents_size; i++)
-                        acc->vmevents[i] += memcg_events_local(mi,
-                                acc->vmevents_array
-                                ? acc->vmevents_array[i] : i);
-
-                for (i = 0; i < NR_LRU_LISTS; i++)
-                        acc->lru_pages[i] += memcg_page_state_local(mi,
-                                              NR_LRU_BASE + i);
-        }
-}
-
 static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 {
-        unsigned long val = 0;
+        unsigned long val;
 
         if (mem_cgroup_is_root(memcg)) {
-                struct mem_cgroup *iter;
-
-                for_each_mem_cgroup_tree(iter, memcg) {
-                        val += memcg_page_state_local(iter, MEMCG_CACHE);
-                        val += memcg_page_state_local(iter, MEMCG_RSS);
-                        if (swap)
-                                val += memcg_page_state_local(iter, MEMCG_SWAP);
-                }
+                val = memcg_page_state(memcg, MEMCG_CACHE) +
+                        memcg_page_state(memcg, MEMCG_RSS);
+                if (swap)
+                        val += memcg_page_state(memcg, MEMCG_SWAP);
         } else {
                 if (!swap)
                         val = page_counter_read(&memcg->memory);
@@ -3514,7 +3515,6 @@ static int memcg_stat_show(struct seq_file *m, void *v)
         unsigned long memory, memsw;
         struct mem_cgroup *mi;
         unsigned int i;
-        struct accumulated_vmstats acc;
 
         BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));
         BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
@@ -3548,27 +3548,21 @@ static int memcg_stat_show(struct seq_file *m, void *v)
                 seq_printf(m, "hierarchical_memsw_limit %llu\n",
                            (u64)memsw * PAGE_SIZE);
 
-        memset(&acc, 0, sizeof(acc));
-        acc.vmstats_size = ARRAY_SIZE(memcg1_stats);
-        acc.vmstats_array = memcg1_stats;
-        acc.vmevents_size = ARRAY_SIZE(memcg1_events);
-        acc.vmevents_array = memcg1_events;
-        accumulate_vmstats(memcg, &acc);
-
         for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) {
                 if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account())
                         continue;
                 seq_printf(m, "total_%s %llu\n", memcg1_stat_names[i],
-                           (u64)acc.vmstats[i] * PAGE_SIZE);
+                           (u64)memcg_page_state(memcg, i) * PAGE_SIZE);
         }
 
         for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
                 seq_printf(m, "total_%s %llu\n", memcg1_event_names[i],
-                           (u64)acc.vmevents[i]);
+                           (u64)memcg_events(memcg, i));
 
         for (i = 0; i < NR_LRU_LISTS; i++)
                 seq_printf(m, "total_%s %llu\n", mem_cgroup_lru_names[i],
-                           (u64)acc.lru_pages[i] * PAGE_SIZE);
+                           (u64)memcg_page_state(memcg, NR_LRU_BASE + i) *
+                           PAGE_SIZE);
 
 #ifdef CONFIG_DEBUG_VM
         {
@@ -5661,7 +5655,6 @@ static int memory_events_show(struct seq_file *m, void *v)
 static int memory_stat_show(struct seq_file *m, void *v)
 {
         struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
-        struct accumulated_vmstats acc;
         int i;
 
         /*
@@ -5675,31 +5668,27 @@ static int memory_stat_show(struct seq_file *m, void *v)
          * Current memory state:
          */
 
-        memset(&acc, 0, sizeof(acc));
-        acc.vmstats_size = MEMCG_NR_STAT;
-        acc.vmevents_size = NR_VM_EVENT_ITEMS;
-        accumulate_vmstats(memcg, &acc);
-
         seq_printf(m, "anon %llu\n",
-                   (u64)acc.vmstats[MEMCG_RSS] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, MEMCG_RSS) * PAGE_SIZE);
         seq_printf(m, "file %llu\n",
-                   (u64)acc.vmstats[MEMCG_CACHE] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, MEMCG_CACHE) * PAGE_SIZE);
         seq_printf(m, "kernel_stack %llu\n",
-                   (u64)acc.vmstats[MEMCG_KERNEL_STACK_KB] * 1024);
+                   (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) * 1024);
         seq_printf(m, "slab %llu\n",
-                   (u64)(acc.vmstats[NR_SLAB_RECLAIMABLE] +
-                         acc.vmstats[NR_SLAB_UNRECLAIMABLE]) * PAGE_SIZE);
+                   (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) +
+                         memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE)) *
+                   PAGE_SIZE);
         seq_printf(m, "sock %llu\n",
-                   (u64)acc.vmstats[MEMCG_SOCK] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, MEMCG_SOCK) * PAGE_SIZE);
 
         seq_printf(m, "shmem %llu\n",
-                   (u64)acc.vmstats[NR_SHMEM] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, NR_SHMEM) * PAGE_SIZE);
         seq_printf(m, "file_mapped %llu\n",
-                   (u64)acc.vmstats[NR_FILE_MAPPED] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, NR_FILE_MAPPED) * PAGE_SIZE);
         seq_printf(m, "file_dirty %llu\n",
-                   (u64)acc.vmstats[NR_FILE_DIRTY] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, NR_FILE_DIRTY) * PAGE_SIZE);
         seq_printf(m, "file_writeback %llu\n",
-                   (u64)acc.vmstats[NR_WRITEBACK] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, NR_WRITEBACK) * PAGE_SIZE);
 
         /*
          * TODO: We should eventually replace our own MEMCG_RSS_HUGE counter
@@ -5708,43 +5697,47 @@ static int memory_stat_show(struct seq_file *m, void *v)
          * where the page->mem_cgroup is set up and stable.
          */
         seq_printf(m, "anon_thp %llu\n",
-                   (u64)acc.vmstats[MEMCG_RSS_HUGE] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, MEMCG_RSS_HUGE) * PAGE_SIZE);
 
         for (i = 0; i < NR_LRU_LISTS; i++)
                 seq_printf(m, "%s %llu\n", mem_cgroup_lru_names[i],
-                           (u64)acc.lru_pages[i] * PAGE_SIZE);
+                           (u64)memcg_page_state(memcg, NR_LRU_BASE + i) *
+                           PAGE_SIZE);
 
         seq_printf(m, "slab_reclaimable %llu\n",
-                   (u64)acc.vmstats[NR_SLAB_RECLAIMABLE] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) *
+                   PAGE_SIZE);
         seq_printf(m, "slab_unreclaimable %llu\n",
-                   (u64)acc.vmstats[NR_SLAB_UNRECLAIMABLE] * PAGE_SIZE);
+                   (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE) *
+                   PAGE_SIZE);
 
         /* Accumulated memory events */
 
-        seq_printf(m, "pgfault %lu\n", acc.vmevents[PGFAULT]);
-        seq_printf(m, "pgmajfault %lu\n", acc.vmevents[PGMAJFAULT]);
+        seq_printf(m, "pgfault %lu\n", memcg_events(memcg, PGFAULT));
+        seq_printf(m, "pgmajfault %lu\n", memcg_events(memcg, PGMAJFAULT));
 
         seq_printf(m, "workingset_refault %lu\n",
-                   acc.vmstats[WORKINGSET_REFAULT]);
+                   memcg_page_state(memcg, WORKINGSET_REFAULT));
         seq_printf(m, "workingset_activate %lu\n",
-                   acc.vmstats[WORKINGSET_ACTIVATE]);
+                   memcg_page_state(memcg, WORKINGSET_ACTIVATE));
         seq_printf(m, "workingset_nodereclaim %lu\n",
-                   acc.vmstats[WORKINGSET_NODERECLAIM]);
-
-        seq_printf(m, "pgrefill %lu\n", acc.vmevents[PGREFILL]);
-        seq_printf(m, "pgscan %lu\n", acc.vmevents[PGSCAN_KSWAPD] +
-                   acc.vmevents[PGSCAN_DIRECT]);
-        seq_printf(m, "pgsteal %lu\n", acc.vmevents[PGSTEAL_KSWAPD] +
-                   acc.vmevents[PGSTEAL_DIRECT]);
-        seq_printf(m, "pgactivate %lu\n", acc.vmevents[PGACTIVATE]);
-        seq_printf(m, "pgdeactivate %lu\n", acc.vmevents[PGDEACTIVATE]);
-        seq_printf(m, "pglazyfree %lu\n", acc.vmevents[PGLAZYFREE]);
-        seq_printf(m, "pglazyfreed %lu\n", acc.vmevents[PGLAZYFREED]);
+                   memcg_page_state(memcg, WORKINGSET_NODERECLAIM));
+
+        seq_printf(m, "pgrefill %lu\n", memcg_events(memcg, PGREFILL));
+        seq_printf(m, "pgscan %lu\n", memcg_events(memcg, PGSCAN_KSWAPD) +
+                   memcg_events(memcg, PGSCAN_DIRECT));
+        seq_printf(m, "pgsteal %lu\n", memcg_events(memcg, PGSTEAL_KSWAPD) +
+                   memcg_events(memcg, PGSTEAL_DIRECT));
+        seq_printf(m, "pgactivate %lu\n", memcg_events(memcg, PGACTIVATE));
+        seq_printf(m, "pgdeactivate %lu\n", memcg_events(memcg, PGDEACTIVATE));
+        seq_printf(m, "pglazyfree %lu\n", memcg_events(memcg, PGLAZYFREE));
+        seq_printf(m, "pglazyfreed %lu\n", memcg_events(memcg, PGLAZYFREED));
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-        seq_printf(m, "thp_fault_alloc %lu\n", acc.vmevents[THP_FAULT_ALLOC]);
+        seq_printf(m, "thp_fault_alloc %lu\n",
+                   memcg_events(memcg, THP_FAULT_ALLOC));
         seq_printf(m, "thp_collapse_alloc %lu\n",
-                   acc.vmevents[THP_COLLAPSE_ALLOC]);
+                   memcg_events(memcg, THP_COLLAPSE_ALLOC));
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
         return 0;
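With the local and recursive counters maintained side by side, both
views of a group become O(1) reads. A usage sketch (the stat item is
arbitrary; the difference is approximate because of per-cpu batching
slack):

	/* Subtree-inclusive vs. cgroup-only page cache, one read each: */
	unsigned long recursive = memcg_page_state(memcg, MEMCG_CACHE);
	unsigned long local = memcg_page_state_local(memcg, MEMCG_CACHE);
	unsigned long descendants = recursive - local;	/* approximate */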
From patchwork Fri Apr 12 15:15:07 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 10898541
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/4] mm: memcontrol: fix NUMA round-robin reclaim at intermediate level
Date: Fri, 12 Apr 2019 11:15:07 -0400
Message-Id: <20190412151507.2769-5-hannes@cmpxchg.org>
In-Reply-To: <20190412151507.2769-1-hannes@cmpxchg.org>
References: <20190412151507.2769-1-hannes@cmpxchg.org>

When a cgroup is reclaimed on behalf of a configured limit, reclaim
needs to round-robin through all NUMA nodes that hold pages of the memcg
in question. However, when assembling the mask of candidate NUMA nodes,
the code only consults the *local* cgroup LRU counters, not the
recursive counters for the entire subtree.
Cgroup limits are frequently configured against intermediate cgroups
that do not have memory on their own LRUs. In this case, the node mask
will always come up empty and reclaim falls back to scanning only the
current node. If a cgroup subtree has some memory on one node but the
processes are bound to another node afterwards, the limit reclaim will
never age or reclaim that memory anymore.

To fix this, use the recursive LRU counts for a cgroup subtree to
determine which nodes hold memory of that cgroup.

The code has been broken like this forever, so it doesn't seem to be a
problem in practice. I just noticed it while reviewing the way the LRU
counters are used in general.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2eb2d4ef9b34..2535e54e7989 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1512,13 +1512,13 @@ static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg,
 {
         struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
 
-        if (lruvec_page_state_local(lruvec, NR_INACTIVE_FILE) ||
-            lruvec_page_state_local(lruvec, NR_ACTIVE_FILE))
+        if (lruvec_page_state(lruvec, NR_INACTIVE_FILE) ||
+            lruvec_page_state(lruvec, NR_ACTIVE_FILE))
                 return true;
         if (noswap || !total_swap_pages)
                 return false;
-        if (lruvec_page_state_local(lruvec, NR_INACTIVE_ANON) ||
-            lruvec_page_state_local(lruvec, NR_ACTIVE_ANON))
+        if (lruvec_page_state(lruvec, NR_INACTIVE_ANON) ||
+            lruvec_page_state(lruvec, NR_ACTIVE_ANON))
                 return true;
 
         return false;
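For reference, the fixed predicate drives the round-robin scan mask
roughly as follows (condensed from mem_cgroup_may_update_nodemask() in
the same file; simplified, not the verbatim kernel code):

	/*
	 * A node stays in the scan mask only if the predicate sees pages
	 * there -- after this fix, pages anywhere in the subtree count.
	 */
	memcg->scan_nodes = node_states[N_MEMORY];
	for_each_node_mask(nid, node_states[N_MEMORY]) {
		if (!test_mem_cgroup_node_reclaimable(memcg, nid, false))
			node_clear(nid, memcg->scan_nodes);
	}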