From patchwork Tue Oct 22 14:47:56 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204649
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/8] mm: vmscan: simplify lruvec_lru_size()
Date: Tue, 22 Oct 2019 10:47:56 -0400
Message-Id: <20191022144803.302233-2-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>

This function currently takes the node or lruvec size and subtracts the
zones that are excluded by the classzone index of the allocation. It
uses four different types of counters to do this. Instead, just add up
the eligible zones.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko <mhocko@suse.com>
---
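[ Editor's sketch: a minimal userspace model of the arithmetic being
  simplified. The zone counts and MAX_NR_ZONES value are made up for
  illustration; the kernel version additionally switches between memcg
  and node counters. Both shapes give the same answer, but summing the
  eligible zones needs one counter type and one loop. ]

#include <stdio.h>

#define MAX_NR_ZONES 4

/* Made-up per-zone LRU counts for one node; the kernel reads real counters. */
static unsigned long zone_lru[MAX_NR_ZONES] = { 100, 200, 300, 400 };

/* Old shape: build a total, then subtract the zones above the classzone. */
static unsigned long lru_size_subtract(int zone_idx)
{
	unsigned long lru_size = 0, size;
	int zid;

	for (zid = 0; zid < MAX_NR_ZONES; zid++)
		lru_size += zone_lru[zid];
	for (zid = zone_idx + 1; zid < MAX_NR_ZONES; zid++) {
		size = zone_lru[zid];
		lru_size -= size < lru_size ? size : lru_size;
	}
	return lru_size;
}

/* New shape: count the eligible zones directly. */
static unsigned long lru_size_sum(int zone_idx)
{
	unsigned long size = 0;
	int zid;

	for (zid = 0; zid <= zone_idx; zid++)
		size += zone_lru[zid];
	return size;
}

int main(void)
{
	for (int zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++)
		printf("classzone %d: old=%lu new=%lu\n", zone_idx,
		       lru_size_subtract(zone_idx), lru_size_sum(zone_idx));
	return 0;
}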
 mm/vmscan.c | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1154b3a2b637..57f533b808f2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -351,32 +351,21 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
  */
 unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
 {
-	unsigned long lru_size = 0;
+	unsigned long size = 0;
 	int zid;
 
-	if (!mem_cgroup_disabled()) {
-		for (zid = 0; zid < MAX_NR_ZONES; zid++)
-			lru_size += mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
-	} else
-		lru_size = node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);
-
-	for (zid = zone_idx + 1; zid < MAX_NR_ZONES; zid++) {
+	for (zid = 0; zid <= zone_idx; zid++) {
 		struct zone *zone = &lruvec_pgdat(lruvec)->node_zones[zid];
-		unsigned long size;
 
 		if (!managed_zone(zone))
 			continue;
 
 		if (!mem_cgroup_disabled())
-			size = mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
+			size += mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
 		else
-			size = zone_page_state(&lruvec_pgdat(lruvec)->node_zones[zid],
-					       NR_ZONE_LRU_BASE + lru);
-		lru_size -= min(size, lru_size);
+			size += zone_page_state(zone, NR_ZONE_LRU_BASE + lru);
 	}
-
-	return lru_size;
-
+	return size;
 }
 
 /*

From patchwork Tue Oct 22 14:47:57 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204651
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/8] mm: clean up and clarify lruvec lookup procedure
Date: Tue, 22 Oct 2019 10:47:57 -0400
Message-Id: <20191022144803.302233-3-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>
There is a per-memcg lruvec and a NUMA node lruvec. Which one is being
used is somewhat confusing right now, and it's easy to make mistakes -
especially when it comes to global reclaim.

How it works: when memory cgroups are enabled, we always use the
root_mem_cgroup's per-node lruvecs. When memory cgroups are not
compiled in or disabled at runtime, we use pgdat->lruvec.

Document that in a comment.

Due to the way the reclaim code is generalized, all lookups use the
mem_cgroup_lruvec() helper function, and nobody should have to find the
right lruvec manually right now. But to avoid future mistakes, rename
the pgdat->lruvec member to pgdat->__lruvec and delete the convenience
wrapper that suggests it's a commonly accessed member.

While in this area, swap the mem_cgroup_lruvec() argument order. The
name suggests a memcg operation, yet it takes a pgdat first and a memcg
second. I have to do a double take every time I call this. Fix that.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
---
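[ Editor's sketch: the lookup rule restated as a compilable model. All
  types here are illustrative stand-ins, not the real kernel structures;
  only the helper's shape and its new memcg-first argument order mirror
  the patch. With the controller enabled, lookups resolve to a per-memcg,
  per-node lruvec; pgdat->__lruvec is used only when it is disabled. ]

#include <stdbool.h>
#include <stdio.h>

#define MAX_NUMNODES 2

struct lruvec { const char *owner; };

struct pglist_data {
	int node_id;
	struct lruvec __lruvec;			/* used only if memcg is off */
};

struct mem_cgroup {
	struct lruvec nodeinfo[MAX_NUMNODES];	/* one lruvec per node */
};

static bool memcg_disabled;

/* Memcg first, pgdat second - matching the new argument order. */
static struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
					struct pglist_data *pgdat)
{
	if (memcg_disabled)
		return &pgdat->__lruvec;
	return &memcg->nodeinfo[pgdat->node_id];
}

int main(void)
{
	struct pglist_data node0 = { .node_id = 0, .__lruvec = { "pgdat 0" } };
	struct mem_cgroup root = { .nodeinfo = { { "root memcg, node 0" },
						 { "root memcg, node 1" } } };

	printf("%s\n", mem_cgroup_lruvec(&root, &node0)->owner); /* per-memcg */
	memcg_disabled = true;
	printf("%s\n", mem_cgroup_lruvec(&root, &node0)->owner); /* pgdat */
	return 0;
}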
 include/linux/memcontrol.h | 26 +++++++++++++-------------
 include/linux/mmzone.h     | 15 ++++++++-------
 mm/memcontrol.c            | 12 ++++++------
 mm/page_alloc.c            |  2 +-
 mm/slab.h                  |  4 ++--
 mm/vmscan.c                |  6 +++---
 mm/workingset.c            |  8 ++++----
 7 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 2b34925fc19d..498cea07cbb1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -393,22 +393,22 @@ mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
 }
 
 /**
- * mem_cgroup_lruvec - get the lru list vector for a node or a memcg zone
- * @node: node of the wanted lruvec
+ * mem_cgroup_lruvec - get the lru list vector for a memcg & node
  * @memcg: memcg of the wanted lruvec
+ * @node: node of the wanted lruvec
  *
- * Returns the lru list vector holding pages for a given @node or a given
- * @memcg and @zone. This can be the node lruvec, if the memory controller
- * is disabled.
+ * Returns the lru list vector holding pages for a given @memcg &
+ * @node combination. This can be the node lruvec, if the memory
+ * controller is disabled.
  */
-static inline struct lruvec *mem_cgroup_lruvec(struct pglist_data *pgdat,
-				struct mem_cgroup *memcg)
+static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
+					       struct pglist_data *pgdat)
 {
 	struct mem_cgroup_per_node *mz;
 	struct lruvec *lruvec;
 
 	if (mem_cgroup_disabled()) {
-		lruvec = node_lruvec(pgdat);
+		lruvec = &pgdat->__lruvec;
 		goto out;
 	}
 
@@ -727,7 +727,7 @@ static inline void __mod_lruvec_page_state(struct page *page,
 		return;
 	}
 
-	lruvec = mem_cgroup_lruvec(pgdat, page->mem_cgroup);
+	lruvec = mem_cgroup_lruvec(page->mem_cgroup, pgdat);
 	__mod_lruvec_state(lruvec, idx, val);
 }
 
@@ -898,16 +898,16 @@ static inline void mem_cgroup_migrate(struct page *old, struct page *new)
 {
 }
 
-static inline struct lruvec *mem_cgroup_lruvec(struct pglist_data *pgdat,
-				struct mem_cgroup *memcg)
+static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
+					       struct pglist_data *pgdat)
 {
-	return node_lruvec(pgdat);
+	return &pgdat->__lruvec;
 }
 
 static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
 						    struct pglist_data *pgdat)
 {
-	return &pgdat->lruvec;
+	return &pgdat->__lruvec;
 }
 
 static inline bool mm_match_cgroup(struct mm_struct *mm,
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d4ca03b93373..449a44171026 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -777,7 +777,13 @@ typedef struct pglist_data {
 #endif
 
 	/* Fields commonly accessed by the page reclaim scanner */
-	struct lruvec		lruvec;
+
+	/*
+	 * NOTE: THIS IS UNUSED IF MEMCG IS ENABLED.
+	 *
+	 * Use mem_cgroup_lruvec() to look up lruvecs.
+	 */
+	struct lruvec		__lruvec;
 
 	unsigned long		flags;
 
@@ -800,11 +806,6 @@ typedef struct pglist_data {
 #define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
 #define node_end_pfn(nid) pgdat_end_pfn(NODE_DATA(nid))
 
-static inline struct lruvec *node_lruvec(struct pglist_data *pgdat)
-{
-	return &pgdat->lruvec;
-}
-
 static inline unsigned long pgdat_end_pfn(pg_data_t *pgdat)
 {
 	return pgdat->node_start_pfn + pgdat->node_spanned_pages;
 }
 
@@ -842,7 +843,7 @@ static inline struct pglist_data *lruvec_pgdat(struct lruvec *lruvec)
 #ifdef CONFIG_MEMCG
 	return lruvec->pgdat;
 #else
-	return container_of(lruvec, struct pglist_data, lruvec);
+	return container_of(lruvec, struct pglist_data, __lruvec);
 #endif
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 98c2fd902533..055975b0b3a3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -770,7 +770,7 @@ void __mod_lruvec_slab_state(void *p, enum node_stat_item idx, int val)
 	if (!memcg || memcg == root_mem_cgroup) {
 		__mod_node_page_state(pgdat, idx, val);
 	} else {
-		lruvec = mem_cgroup_lruvec(pgdat, memcg);
+		lruvec = mem_cgroup_lruvec(memcg, pgdat);
 		__mod_lruvec_state(lruvec, idx, val);
 	}
 	rcu_read_unlock();
@@ -1226,7 +1226,7 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd
 	struct lruvec *lruvec;
 
 	if (mem_cgroup_disabled()) {
-		lruvec = &pgdat->lruvec;
+		lruvec = &pgdat->__lruvec;
 		goto out;
 	}
 
@@ -1605,7 +1605,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg,
 		int nid, bool noswap)
 {
-	struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
+	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
 
 	if (lruvec_page_state(lruvec, NR_INACTIVE_FILE) ||
 	    lruvec_page_state(lruvec, NR_ACTIVE_FILE))
@@ -3735,7 +3735,7 @@ static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
 static unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
 				int nid, unsigned int lru_mask)
 {
-	struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
+	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
 	unsigned long nr = 0;
 	enum lru_list lru;
 
@@ -5433,8 +5433,8 @@ static int mem_cgroup_move_account(struct page *page,
 	anon = PageAnon(page);
 
 	pgdat = page_pgdat(page);
-	from_vec = mem_cgroup_lruvec(pgdat, from);
-	to_vec = mem_cgroup_lruvec(pgdat, to);
+	from_vec = mem_cgroup_lruvec(from, pgdat);
+	to_vec = mem_cgroup_lruvec(to, pgdat);
 
 	spin_lock_irqsave(&from->move_lock, flags);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fe76be55c9d5..791c018314b3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6708,7 +6708,7 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
 
 	pgdat_page_ext_init(pgdat);
 	spin_lock_init(&pgdat->lru_lock);
-	lruvec_init(node_lruvec(pgdat));
+	lruvec_init(&pgdat->__lruvec);
 }
 
 static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx, int nid,
diff --git a/mm/slab.h b/mm/slab.h
index 3eb29ae75743..2bbecf28688d 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -369,7 +369,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
 	if (ret)
 		goto out;
 
-	lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
 	mod_lruvec_state(lruvec, cache_vmstat_idx(s), 1 << order);
 
 	/* transer try_charge() page references to kmem_cache */
@@ -393,7 +393,7 @@ static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 	rcu_read_lock();
 	memcg = READ_ONCE(s->memcg_params.memcg);
 	if (likely(!mem_cgroup_is_root(memcg))) {
-		lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
+		lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
 		mod_lruvec_state(lruvec, cache_vmstat_idx(s), -(1 << order));
 		memcg_kmem_uncharge_memcg(page, order, memcg);
 	} else {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 57f533b808f2..be3c22c274c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2545,7 +2545,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memcg,
 			      struct scan_control *sc)
 {
-	struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg);
+	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	unsigned long nr[NR_LRU_LISTS];
 	unsigned long targets[NR_LRU_LISTS];
 	unsigned long nr_to_scan;
@@ -3023,7 +3023,7 @@ static void snapshot_refaults(struct mem_cgroup *root_memcg, pg_data_t *pgdat)
 		unsigned long refaults;
 		struct lruvec *lruvec;
 
-		lruvec = mem_cgroup_lruvec(pgdat, memcg);
+		lruvec = mem_cgroup_lruvec(memcg, pgdat);
 		refaults = lruvec_page_state_local(lruvec, WORKINGSET_ACTIVATE);
 		lruvec->refaults = refaults;
 	} while ((memcg = mem_cgroup_iter(root_memcg, memcg, NULL)));
@@ -3391,7 +3391,7 @@ static void age_active_anon(struct pglist_data *pgdat,
 
 	memcg = mem_cgroup_iter(NULL, NULL, NULL);
 	do {
-		struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg);
+		struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 		if (inactive_list_is_low(lruvec, false, sc, true))
 			shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
diff --git a/mm/workingset.c b/mm/workingset.c
index c963831d354f..e8212123c1c3 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -233,7 +233,7 @@ void *workingset_eviction(struct page *page)
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 
-	lruvec = mem_cgroup_lruvec(pgdat, memcg);
+	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	eviction = atomic_long_inc_return(&lruvec->inactive_age);
 	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
 }
@@ -280,7 +280,7 @@ void workingset_refault(struct page *page, void *shadow)
 	memcg = mem_cgroup_from_id(memcgid);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	lruvec = mem_cgroup_lruvec(pgdat, memcg);
+	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	refault = atomic_long_read(&lruvec->inactive_age);
 	active_file = lruvec_lru_size(lruvec, LRU_ACTIVE_FILE, MAX_NR_ZONES);
 
@@ -345,7 +345,7 @@ void workingset_activation(struct page *page)
 	memcg = page_memcg_rcu(page);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
 	atomic_long_inc(&lruvec->inactive_age);
 out:
 	rcu_read_unlock();
@@ -426,7 +426,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 		struct lruvec *lruvec;
 		int i;
 
-		lruvec = mem_cgroup_lruvec(NODE_DATA(sc->nid), sc->memcg);
+		lruvec = mem_cgroup_lruvec(sc->memcg, NODE_DATA(sc->nid));
 		for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
 			pages += lruvec_page_state_local(lruvec,
 							 NR_LRU_BASE + i);

From patchwork Tue Oct 22 14:47:58 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204653
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 3/8] mm: vmscan: move inactive_list_is_low() swap check to the caller
Date: Tue, 22 Oct 2019 10:47:58 -0400
Message-Id: <20191022144803.302233-4-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>
inactive_list_is_low() should be about one thing: checking the ratio
between the inactive and active lists. Kitchen-sink checks like the one
for swap space make the function hard to use and its callsites hard to
modify.

Luckily, most callers already have an understanding of the swap
situation, so it's easy to clean up.

get_scan_count() has its own, memcg-aware swap check, and doesn't even
get to the inactive_list_is_low() check on the anon list when there is
no swap space available.

shrink_list() is called on the results of get_scan_count(), so that
check is redundant too.

age_active_anon() has its own totalswap_pages check right before it
checks the list proportions.

The shrink_node_memcg() site is the only one that doesn't do its own
swap check. Add it there.

Then delete the swap check from inactive_list_is_low().

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko <mhocko@suse.com>
---
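[ Editor's sketch: the shape of the cleanup in miniature. The names and
  the ratio below are illustrative, not the kernel's. Once the predicate
  answers only the ratio question, each caller composes it with the swap
  knowledge it already has. ]

#include <stdbool.h>
#include <stdio.h>

static long total_swap_pages = 1;	/* stand-in for the kernel global */

/* After the patch: the predicate checks one thing - the list ratio. */
static bool inactive_list_is_low(unsigned long inactive, unsigned long active)
{
	return inactive * 3 < active;	/* illustrative ratio */
}

/* The one caller without its own swap awareness supplies the check itself. */
static void rebalance_anon(unsigned long inactive, unsigned long active)
{
	if (total_swap_pages && inactive_list_is_low(inactive, active))
		printf("would shrink_active_list() here\n");
}

int main(void)
{
	rebalance_anon(10, 100);
	return 0;
}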
 mm/vmscan.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index be3c22c274c1..622b77488144 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2226,13 +2226,6 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	unsigned long refaults;
 	unsigned long gb;
 
-	/*
-	 * If we don't have swap space, anonymous page deactivation
-	 * is pointless.
-	 */
-	if (!file && !total_swap_pages)
-		return false;
-
 	inactive = lruvec_lru_size(lruvec, inactive_lru, sc->reclaim_idx);
 	active = lruvec_lru_size(lruvec, active_lru, sc->reclaim_idx);
 
@@ -2653,7 +2646,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_list_is_low(lruvec, false, sc, true))
+	if (total_swap_pages && inactive_list_is_low(lruvec, false, sc, true))
 		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 				   sc, LRU_ACTIVE_ANON);
 }
From patchwork Tue Oct 22 14:47:59 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204655
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/8] mm: vmscan: naming fixes: global_reclaim() and sane_reclaim()
Date: Tue, 22 Oct 2019 10:47:59 -0400
Message-Id: <20191022144803.302233-5-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>

Seven years after introducing the global_reclaim() function, I still
have to do a double take when reading a callsite. I don't know how
others do it; this is a terrible name.

Invert the meaning and rename it to cgroup_reclaim().

[ After all, "global reclaim" is just regular reclaim invoked from the
  page allocator. It's reclaim on behalf of a cgroup limit that is a
  special case of reclaim, and should be explicit - not the reverse. ]

sane_reclaim() isn't very descriptive either: it tests whether we can
use the regular writeback throttling - available during regular page
reclaim or cgroup2 limit reclaim - or need to use the broken
wait_on_page_writeback() method. Use "writeback_throttling_sane()".

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko <mhocko@suse.com>
---
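[ Editor's sketch: the crux of the rename, condensed from the hunks
  below into a standalone compilable fragment (the struct is pared down
  to the one field that matters). Naming the special case removes the
  double negation at the common-case call sites: "if (!cgroup_reclaim(sc))"
  instead of "if (global_reclaim(sc))". ]

#include <stdbool.h>
#include <assert.h>

struct mem_cgroup;
struct scan_control { struct mem_cgroup *target_mem_cgroup; };

/* Old: the special case (cgroup limit reclaim) hid behind a negation. */
static bool global_reclaim(struct scan_control *sc)
{
	return !sc->target_mem_cgroup;
}

/* New: the special case carries the name. */
static bool cgroup_reclaim(struct scan_control *sc)
{
	return sc->target_mem_cgroup;
}

int main(void)
{
	struct scan_control sc = { 0 };

	/* The two predicates are exact opposites by construction. */
	assert(global_reclaim(&sc) == !cgroup_reclaim(&sc));
	return 0;
}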
 mm/vmscan.c | 38 ++++++++++++++++++--------------------
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 622b77488144..302dad112f75 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -239,13 +239,13 @@ static void unregister_memcg_shrinker(struct shrinker *shrinker)
 	up_write(&shrinker_rwsem);
 }
 
-static bool global_reclaim(struct scan_control *sc)
+static bool cgroup_reclaim(struct scan_control *sc)
 {
-	return !sc->target_mem_cgroup;
+	return sc->target_mem_cgroup;
 }
 
 /**
- * sane_reclaim - is the usual dirty throttling mechanism operational?
+ * writeback_throttling_sane - is the usual dirty throttling mechanism available?
  * @sc: scan_control in question
  *
  * The normal page dirty throttling mechanism in balance_dirty_pages() is
@@ -257,11 +257,9 @@ static bool global_reclaim(struct scan_control *sc)
  * This function tests whether the vmscan currently in progress can assume
  * that the normal dirty throttling mechanism is operational.
  */
-static bool sane_reclaim(struct scan_control *sc)
+static bool writeback_throttling_sane(struct scan_control *sc)
 {
-	struct mem_cgroup *memcg = sc->target_mem_cgroup;
-
-	if (!memcg)
+	if (!cgroup_reclaim(sc))
 		return true;
 #ifdef CONFIG_CGROUP_WRITEBACK
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
@@ -302,12 +300,12 @@ static void unregister_memcg_shrinker(struct shrinker *shrinker)
 {
 }
 
-static bool global_reclaim(struct scan_control *sc)
+static bool cgroup_reclaim(struct scan_control *sc)
 {
-	return true;
+	return false;
 }
 
-static bool sane_reclaim(struct scan_control *sc)
+static bool writeback_throttling_sane(struct scan_control *sc)
 {
 	return true;
 }
@@ -1227,7 +1225,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				goto activate_locked;
 
 			/* Case 2 above */
-			} else if (sane_reclaim(sc) ||
+			} else if (writeback_throttling_sane(sc) ||
 			    !PageReclaim(page) || !may_enter_fs) {
 				/*
 				 * This is slightly racy - end_page_writeback()
@@ -1821,7 +1819,7 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
 	if (current_is_kswapd())
 		return 0;
 
-	if (!sane_reclaim(sc))
+	if (!writeback_throttling_sane(sc))
 		return 0;
 
 	if (file) {
@@ -1971,7 +1969,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	reclaim_stat->recent_scanned[file] += nr_taken;
 
 	item = current_is_kswapd() ? PGSCAN_KSWAPD : PGSCAN_DIRECT;
-	if (global_reclaim(sc))
+	if (!cgroup_reclaim(sc))
 		__count_vm_events(item, nr_scanned);
 	__count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned);
 	spin_unlock_irq(&pgdat->lru_lock);
@@ -1985,7 +1983,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	spin_lock_irq(&pgdat->lru_lock);
 
 	item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT;
-	if (global_reclaim(sc))
+	if (!cgroup_reclaim(sc))
 		__count_vm_events(item, nr_reclaimed);
 	__count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
 	reclaim_stat->recent_rotated[0] += stat.nr_activate[0];
@@ -2309,7 +2307,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	 * using the memory controller's swap limit feature would be
 	 * too expensive.
 	 */
-	if (!global_reclaim(sc) && !swappiness) {
+	if (cgroup_reclaim(sc) && !swappiness) {
 		scan_balance = SCAN_FILE;
 		goto out;
 	}
@@ -2333,7 +2331,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	 * thrashing file LRU becomes infinitely more attractive than
 	 * anon pages.  Try to detect this based on file LRU size.
 	 */
-	if (global_reclaim(sc)) {
+	if (!cgroup_reclaim(sc)) {
 		unsigned long pgdatfile;
 		unsigned long pgdatfree;
 		int z;
@@ -2564,7 +2562,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	 * abort proportional reclaim if either the file or anon lru has already
 	 * dropped to zero at the first pass.
	 */
-	scan_adjusted = (global_reclaim(sc) && !current_is_kswapd() &&
+	scan_adjusted = (!cgroup_reclaim(sc) && !current_is_kswapd() &&
 			 sc->priority == DEF_PRIORITY);
 
 	blk_start_plug(&plug);
@@ -2853,7 +2851,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 		 * Legacy memcg will stall in page writeback so avoid forcibly
 		 * stalling in wait_iff_congested().
 		 */
-		if (!global_reclaim(sc) && sane_reclaim(sc) &&
+		if (cgroup_reclaim(sc) && writeback_throttling_sane(sc) &&
 		    sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
 			set_memcg_congestion(pgdat, root, true);
 
@@ -2948,7 +2946,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 		 * Take care memory controller reclaiming has small influence
 		 * to global LRU.
 		 */
-		if (global_reclaim(sc)) {
+		if (!cgroup_reclaim(sc)) {
 			if (!cpuset_zone_allowed(zone,
 						 GFP_KERNEL | __GFP_HARDWALL))
 				continue;
@@ -3048,7 +3046,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 retry:
 	delayacct_freepages_start();
 
-	if (global_reclaim(sc))
+	if (!cgroup_reclaim(sc))
 		__count_zid_vm_events(ALLOCSTALL, sc->reclaim_idx, 1);
 
 	do {
From patchwork Tue Oct 22 14:48:00 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204657
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 5/8] mm: vmscan: replace shrink_node() loop with a retry jump
Date: Tue, 22 Oct 2019 10:48:00 -0400
Message-Id: <20191022144803.302233-6-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>

Most of the function body is inside a loop, which imposes an additional
indentation and scoping level that makes the code a bit hard to follow
and modify.

The looping only happens in case of reclaim-compaction, which isn't the
common case. So rather than adding yet another function level to the
reclaim path and having every reclaim invocation go through a level
that exists only for one specific corner case, use a retry goto.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin
---
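[ Editor's sketch: the transformation in miniature, with made-up work
  and retry functions rather than the kernel's. The uncommon repeat no
  longer forces the entire body into a loop's extra indentation level. ]

#include <stdbool.h>

static int passes;

static void do_work(void) { passes++; }		/* one pass of the real work */
static bool should_retry(void) { return passes < 2; }	/* rare retry condition */

/* Before: the whole body sat inside do { ... } while (should_retry()); */
static void shrink(void)
{
again:
	do_work();
	if (should_retry())
		goto again;	/* only the corner case loops */
}

int main(void)
{
	shrink();
	return 0;
}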
 mm/vmscan.c | 231 ++++++++++++++++++++++++++--------------------
 1 file changed, 115 insertions(+), 116 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 302dad112f75..235d1fc72311 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2729,144 +2729,143 @@ static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
 static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 {
 	struct reclaim_state *reclaim_state = current->reclaim_state;
+	struct mem_cgroup *root = sc->target_mem_cgroup;
 	unsigned long nr_reclaimed, nr_scanned;
 	bool reclaimable = false;
+	struct mem_cgroup *memcg;
+again:
+	memset(&sc->nr, 0, sizeof(sc->nr));
 
-	do {
-		struct mem_cgroup *root = sc->target_mem_cgroup;
-		struct mem_cgroup *memcg;
-
-		memset(&sc->nr, 0, sizeof(sc->nr));
-
-		nr_reclaimed = sc->nr_reclaimed;
-		nr_scanned = sc->nr_scanned;
+	nr_reclaimed = sc->nr_reclaimed;
+	nr_scanned = sc->nr_scanned;
 
-		memcg = mem_cgroup_iter(root, NULL, NULL);
-		do {
-			unsigned long reclaimed;
-			unsigned long scanned;
+	memcg = mem_cgroup_iter(root, NULL, NULL);
+	do {
+		unsigned long reclaimed;
+		unsigned long scanned;
 
-			switch (mem_cgroup_protected(root, memcg)) {
-			case MEMCG_PROT_MIN:
-				/*
-				 * Hard protection.
-				 * If there is no reclaimable memory, OOM.
-				 */
+		switch (mem_cgroup_protected(root, memcg)) {
+		case MEMCG_PROT_MIN:
+			/*
+			 * Hard protection.
+			 * If there is no reclaimable memory, OOM.
+			 */
+			continue;
+		case MEMCG_PROT_LOW:
+			/*
+			 * Soft protection.
+			 * Respect the protection only as long as
+			 * there is an unprotected supply
+			 * of reclaimable memory from other cgroups.
+			 */
+			if (!sc->memcg_low_reclaim) {
+				sc->memcg_low_skipped = 1;
 				continue;
-			case MEMCG_PROT_LOW:
-				/*
-				 * Soft protection.
-				 * Respect the protection only as long as
-				 * there is an unprotected supply
-				 * of reclaimable memory from other cgroups.
-				 */
-				if (!sc->memcg_low_reclaim) {
-					sc->memcg_low_skipped = 1;
-					continue;
-				}
-				memcg_memory_event(memcg, MEMCG_LOW);
-				break;
-			case MEMCG_PROT_NONE:
-				/*
-				 * All protection thresholds breached. We may
-				 * still choose to vary the scan pressure
-				 * applied based on by how much the cgroup in
-				 * question has exceeded its protection
-				 * thresholds (see get_scan_count).
-				 */
-				break;
 			}
+			memcg_memory_event(memcg, MEMCG_LOW);
+			break;
+		case MEMCG_PROT_NONE:
+			/*
+			 * All protection thresholds breached. We may
+			 * still choose to vary the scan pressure
+			 * applied based on by how much the cgroup in
+			 * question has exceeded its protection
+			 * thresholds (see get_scan_count).
+			 */
+			break;
+		}
 
-			reclaimed = sc->nr_reclaimed;
-			scanned = sc->nr_scanned;
-			shrink_node_memcg(pgdat, memcg, sc);
-
-			shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
-				    sc->priority);
-
-			/* Record the group's reclaim efficiency */
-			vmpressure(sc->gfp_mask, memcg, false,
-				   sc->nr_scanned - scanned,
-				   sc->nr_reclaimed - reclaimed);
-
-		} while ((memcg = mem_cgroup_iter(root, memcg, NULL)));
+		reclaimed = sc->nr_reclaimed;
+		scanned = sc->nr_scanned;
+		shrink_node_memcg(pgdat, memcg, sc);
 
-		if (reclaim_state) {
-			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
-			reclaim_state->reclaimed_slab = 0;
-		}
+		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
+			    sc->priority);
 
-		/* Record the subtree's reclaim efficiency */
-		vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
-			   sc->nr_scanned - nr_scanned,
-			   sc->nr_reclaimed - nr_reclaimed);
+		/* Record the group's reclaim efficiency */
+		vmpressure(sc->gfp_mask, memcg, false,
+			   sc->nr_scanned - scanned,
+			   sc->nr_reclaimed - reclaimed);
 
-		if (sc->nr_reclaimed - nr_reclaimed)
-			reclaimable = true;
+	} while ((memcg = mem_cgroup_iter(root, memcg, NULL)));
 
-		if (current_is_kswapd()) {
-			/*
-			 * If reclaim is isolating dirty pages under writeback,
-			 * it implies that the long-lived page allocation rate
-			 * is exceeding the page laundering rate. Either the
-			 * global limits are not being effective at throttling
-			 * processes due to the page distribution throughout
-			 * zones or there is heavy usage of a slow backing
-			 * device. The only option is to throttle from reclaim
-			 * context which is not ideal as there is no guarantee
-			 * the dirtying process is throttled in the same way
-			 * balance_dirty_pages() manages.
-			 *
-			 * Once a node is flagged PGDAT_WRITEBACK, kswapd will
-			 * count the number of pages under pages flagged for
-			 * immediate reclaim and stall if any are encountered
-			 * in the nr_immediate check below.
-			 */
-			if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
-				set_bit(PGDAT_WRITEBACK, &pgdat->flags);
+	if (reclaim_state) {
+		sc->nr_reclaimed += reclaim_state->reclaimed_slab;
+		reclaim_state->reclaimed_slab = 0;
+	}
 
-			/*
-			 * Tag a node as congested if all the dirty pages
-			 * scanned were backed by a congested BDI and
-			 * wait_iff_congested will stall.
-			 */
-			if (sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
-				set_bit(PGDAT_CONGESTED, &pgdat->flags);
+	/* Record the subtree's reclaim efficiency */
+	vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
+		   sc->nr_scanned - nr_scanned,
+		   sc->nr_reclaimed - nr_reclaimed);
 
-			/* Allow kswapd to start writing pages during reclaim.*/
-			if (sc->nr.unqueued_dirty == sc->nr.file_taken)
-				set_bit(PGDAT_DIRTY, &pgdat->flags);
+	if (sc->nr_reclaimed - nr_reclaimed)
+		reclaimable = true;
 
-			/*
-			 * If kswapd scans pages marked marked for immediate
-			 * reclaim and under writeback (nr_immediate), it
-			 * implies that pages are cycling through the LRU
-			 * faster than they are written so also forcibly stall.
-			 */
-			if (sc->nr.immediate)
-				congestion_wait(BLK_RW_ASYNC, HZ/10);
-		}
+	if (current_is_kswapd()) {
+		/*
+		 * If reclaim is isolating dirty pages under writeback,
+		 * it implies that the long-lived page allocation rate
+		 * is exceeding the page laundering rate. Either the
+		 * global limits are not being effective at throttling
+		 * processes due to the page distribution throughout
+		 * zones or there is heavy usage of a slow backing
+		 * device. The only option is to throttle from reclaim
+		 * context which is not ideal as there is no guarantee
+		 * the dirtying process is throttled in the same way
+		 * balance_dirty_pages() manages.
+		 *
+		 * Once a node is flagged PGDAT_WRITEBACK, kswapd will
+		 * count the number of pages under pages flagged for
+		 * immediate reclaim and stall if any are encountered
+		 * in the nr_immediate check below.
+		 */
+		if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
+			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
 		/*
-		 * Legacy memcg will stall in page writeback so avoid forcibly
-		 * stalling in wait_iff_congested().
+		 * Tag a node as congested if all the dirty pages
+		 * scanned were backed by a congested BDI and
+		 * wait_iff_congested will stall.
 		 */
-		if (cgroup_reclaim(sc) && writeback_throttling_sane(sc) &&
-		    sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
-			set_memcg_congestion(pgdat, root, true);
+		if (sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
+			set_bit(PGDAT_CONGESTED, &pgdat->flags);
+
+		/* Allow kswapd to start writing pages during reclaim.*/
+		if (sc->nr.unqueued_dirty == sc->nr.file_taken)
+			set_bit(PGDAT_DIRTY, &pgdat->flags);
 
 		/*
-		 * Stall direct reclaim for IO completions if underlying BDIs
-		 * and node is congested. Allow kswapd to continue until it
-		 * starts encountering unqueued dirty pages or cycling through
-		 * the LRU too quickly.
+		 * If kswapd scans pages marked marked for immediate
+		 * reclaim and under writeback (nr_immediate), it
+		 * implies that pages are cycling through the LRU
+		 * faster than they are written so also forcibly stall.
 		 */
-		if (!sc->hibernation_mode && !current_is_kswapd() &&
-		    current_may_throttle() && pgdat_memcg_congested(pgdat, root))
-			wait_iff_congested(BLK_RW_ASYNC, HZ/10);
+		if (sc->nr.immediate)
+			congestion_wait(BLK_RW_ASYNC, HZ/10);
+	}
+
+	/*
+	 * Legacy memcg will stall in page writeback so avoid forcibly
+	 * stalling in wait_iff_congested().
+	 */
+	if (cgroup_reclaim(sc) && writeback_throttling_sane(sc) &&
+	    sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
+		set_memcg_congestion(pgdat, root, true);
+
+	/*
+	 * Stall direct reclaim for IO completions if underlying BDIs
+	 * and node is congested. Allow kswapd to continue until it
+	 * starts encountering unqueued dirty pages or cycling through
+	 * the LRU too quickly.
+	 */
+	if (!sc->hibernation_mode && !current_is_kswapd() &&
+	    current_may_throttle() && pgdat_memcg_congested(pgdat, root))
+		wait_iff_congested(BLK_RW_ASYNC, HZ/10);
 
-	} while (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
-					 sc));
+	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
+				    sc))
+		goto again;
 
 	/*
 	 * Kswapd gives up on balancing particular nodes after too
From: Johannes Weiner
To: Andrew Morton
Cc: Michal Hocko, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 6/8] mm: vmscan: turn shrink_node_memcg() into shrink_lruvec()
Date: Tue, 22 Oct 2019 10:48:01 -0400
Message-Id: <20191022144803.302233-7-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>

A lruvec holds LRU pages owned by a certain NUMA node and cgroup.
Instead of awkwardly passing around a combination of a pgdat and a
memcg pointer, pass down the lruvec as soon as we can look it up.

Nested callers that need to access node or cgroup properties can look
them up if necessary, but there are only a few cases.
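To see the shape of the refactor at a glance, here is a minimal
editorial sketch (not part of the patch): mem_cgroup_lruvec(),
lruvec_memcg() and shrink_lruvec() are the helpers that appear in the
diff below, while reclaim_one_cgroup_node() is a hypothetical caller
invented for illustration.

/* Sketch only -- assumes the mm/vmscan.c context of this series. */
static void reclaim_one_cgroup_node(struct mem_cgroup *memcg,
				    struct pglist_data *pgdat,
				    struct scan_control *sc)
{
	/* Resolve the node x cgroup intersection once, up front. */
	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);

	/*
	 * From here on only the lruvec is passed down; nested code
	 * that still needs the cgroup recovers it with
	 * lruvec_memcg(lruvec) instead of carrying a (pgdat, memcg)
	 * pair through every signature.
	 */
	shrink_lruvec(lruvec, sc);
}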
Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 mm/vmscan.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 235d1fc72311..db073b40c432 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2280,9 +2280,10 @@ enum scan_balance {
  * nr[0] = anon inactive pages to scan; nr[1] = anon active pages to scan
  * nr[2] = file inactive pages to scan; nr[3] = file active pages to scan
  */
-static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
-			   struct scan_control *sc, unsigned long *nr)
+static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
+			   unsigned long *nr)
 {
+	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
 	int swappiness = mem_cgroup_swappiness(memcg);
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 	u64 fraction[2];
@@ -2530,13 +2531,8 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	}
 }
 
-/*
- * This is a basic per-node page freer. Used by both kswapd and direct reclaim.
- */
-static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memcg,
-			      struct scan_control *sc)
+static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
-	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	unsigned long nr[NR_LRU_LISTS];
 	unsigned long targets[NR_LRU_LISTS];
 	unsigned long nr_to_scan;
@@ -2546,7 +2542,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	struct blk_plug plug;
 	bool scan_adjusted;
 
-	get_scan_count(lruvec, memcg, sc, nr);
+	get_scan_count(lruvec, sc, nr);
 
 	/* Record the original scan target for proportional adjustments later */
 	memcpy(targets, nr, sizeof(nr));
@@ -2741,6 +2737,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 
 	memcg = mem_cgroup_iter(root, NULL, NULL);
 	do {
+		struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
 		unsigned long reclaimed;
 		unsigned long scanned;
 
@@ -2777,7 +2774,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 
 		reclaimed = sc->nr_reclaimed;
 		scanned = sc->nr_scanned;
-		shrink_node_memcg(pgdat, memcg, sc);
+
+		shrink_lruvec(lruvec, sc);
 
 		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
 			    sc->priority);
@@ -3281,6 +3279,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 						pg_data_t *pgdat,
 						unsigned long *nr_scanned)
 {
+	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	struct scan_control sc = {
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
 		.target_mem_cgroup = memcg,
@@ -3307,7 +3306,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	 * will pick up pages from other mem cgroup's as well. We hack
 	 * the priority and make it zero.
 	 */
-	shrink_node_memcg(pgdat, memcg, &sc);
+	shrink_lruvec(lruvec, &sc);
 
 	trace_mm_vmscan_memcg_softlimit_reclaim_end(
 					cgroup_ino(memcg->css.cgroup),

From patchwork Tue Oct 22 14:48:02 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204661
From: Johannes Weiner
To: Andrew Morton
Cc: Michal Hocko, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 7/8] mm: vmscan: split shrink_node() into node part and memcgs part
Date: Tue, 22 Oct 2019 10:48:02 -0400
Message-Id: <20191022144803.302233-8-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>

This function is getting long and unwieldy; split out the memcg bits.

The updated shrink_node() handles the generic (node) reclaim aspects:
 - global vmpressure notifications
 - writeback and congestion throttling
 - reclaim/compaction management
 - kswapd giving up on unreclaimable nodes

It then calls a new shrink_node_memcgs() which handles cgroup
specifics:
 - the cgroup tree traversal
 - memory.low considerations
 - per-cgroup slab shrinking callbacks
 - per-cgroup vmpressure notifications
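To make the division of labor concrete, here is a condensed editorial
sketch of the resulting structure; the real bodies are in the diff
below, and details such as memory.low handling and the vmpressure
calls are omitted.

/* Condensed sketch -- not the literal kernel code. */
static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
{
	struct mem_cgroup *memcg;

	/*
	 * Cgroup part: walk the tree under the reclaim root and
	 * shrink each cgroup's LRU lists and slab caches.
	 */
	memcg = mem_cgroup_iter(sc->target_mem_cgroup, NULL, NULL);
	do {
		struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);

		shrink_lruvec(lruvec, sc);
		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
			    sc->priority);
	} while ((memcg = mem_cgroup_iter(sc->target_mem_cgroup, memcg, NULL)));
}

static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
{
	/*
	 * Node part: run the cgroup pass, then make node-wide
	 * decisions: throttling, and whether to retry reclaim or
	 * fall back to compaction.
	 */
	shrink_node_memcgs(pgdat, sc);
	/* ... writeback throttling, should_continue_reclaim() ... */
	return true;	/* placeholder; see the diff for the real logic */
}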
Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 mm/vmscan.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index db073b40c432..65baa89740dd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2722,18 +2722,10 @@ static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
 		(memcg && memcg_congested(pgdat, memcg));
 }
 
-static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
+static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 {
-	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct mem_cgroup *root = sc->target_mem_cgroup;
-	unsigned long nr_reclaimed, nr_scanned;
-	bool reclaimable = false;
 	struct mem_cgroup *memcg;
-again:
-	memset(&sc->nr, 0, sizeof(sc->nr));
-
-	nr_reclaimed = sc->nr_reclaimed;
-	nr_scanned = sc->nr_scanned;
 
 	memcg = mem_cgroup_iter(root, NULL, NULL);
 	do {
@@ -2786,6 +2778,22 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 				   sc->nr_reclaimed - reclaimed);
 
 	} while ((memcg = mem_cgroup_iter(root, memcg, NULL)));
+}
+
+static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
+{
+	struct reclaim_state *reclaim_state = current->reclaim_state;
+	struct mem_cgroup *root = sc->target_mem_cgroup;
+	unsigned long nr_reclaimed, nr_scanned;
+	bool reclaimable = false;
+
+again:
+	memset(&sc->nr, 0, sizeof(sc->nr));
+
+	nr_reclaimed = sc->nr_reclaimed;
+	nr_scanned = sc->nr_scanned;
+
+	shrink_node_memcgs(pgdat, sc);
 
 	if (reclaim_state) {
 		sc->nr_reclaimed += reclaim_state->reclaimed_slab;
@@ -2793,7 +2801,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	}
 
 	/* Record the subtree's reclaim efficiency */
-	vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
+	vmpressure(sc->gfp_mask, root, true,
 		   sc->nr_scanned - nr_scanned,
 		   sc->nr_reclaimed - nr_reclaimed);

From patchwork Tue Oct 22 14:48:03 2019
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11204663
From: Johannes Weiner
To: Andrew Morton
Cc: Michal Hocko, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 8/8] mm: vmscan: harmonize writeback congestion tracking for nodes & memcgs
Date: Tue, 22 Oct 2019 10:48:03 -0400
Message-Id: <20191022144803.302233-9-hannes@cmpxchg.org>
In-Reply-To: <20191022144803.302233-1-hannes@cmpxchg.org>
References: <20191022144803.302233-1-hannes@cmpxchg.org>

The current writeback congestion tracking has separate flags for
kswapd reclaim (node level) and cgroup limit reclaim (memcg-node
level). This is unnecessarily complicated: the lruvec is an existing
abstraction layer for that node-memcg intersection.

Introduce lruvec->flags and LRUVEC_CONGESTED. Then track that at the
reclaim root level, which is either the NUMA node for global reclaim,
or the cgroup-node intersection for cgroup reclaim.
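For orientation, a small editorial sketch of the unified mechanism:
LRUVEC_CONGESTED and the lruvec->flags field come from the diff
below, note_congestion() is a hypothetical helper, and the NULL-memcg
behavior relies on the mem_cgroup_lruvec() hunk in this patch.

/* Sketch only: one congestion flag for both reclaim modes. */
static void note_congestion(struct scan_control *sc, pg_data_t *pgdat,
			    bool congested)
{
	/*
	 * After this patch, mem_cgroup_lruvec() maps a NULL memcg to
	 * root_mem_cgroup: global reclaim lands on the node's root
	 * lruvec, cgroup reclaim on the cgroup-node lruvec -- the
	 * same bit serves both.
	 */
	struct lruvec *target_lruvec;

	target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);

	if (congested)
		set_bit(LRUVEC_CONGESTED, &target_lruvec->flags);
	else
		clear_bit(LRUVEC_CONGESTED, &target_lruvec->flags);
}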
Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
---
 include/linux/memcontrol.h |  6 +--
 include/linux/mmzone.h     | 11 ++++--
 mm/vmscan.c                | 80 ++++++++++++--------------------------
 3 files changed, 36 insertions(+), 61 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 498cea07cbb1..d8ffcf60440c 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -133,9 +133,6 @@ struct mem_cgroup_per_node {
 	unsigned long		usage_in_excess;/* Set to the value by which */
 					/* the soft limit is exceeded*/
 	bool			on_tree;
-	bool			congested;	/* memcg has many dirty pages */
-						/* backed by a congested BDI */
-
 	struct mem_cgroup	*memcg;		/* Back pointer, we cannot */
 						/* use container_of	   */
 };
@@ -412,6 +409,9 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 		goto out;
 	}
 
+	if (!memcg)
+		memcg = root_mem_cgroup;
+
 	mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
 	lruvec = &mz->lruvec;
 out:
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 449a44171026..c04b4c1f01fa 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -296,6 +296,12 @@ struct zone_reclaim_stat {
 	unsigned long		recent_scanned[2];
 };
 
+enum lruvec_flags {
+	LRUVEC_CONGESTED,		/* lruvec has many dirty pages
+					 * backed by a congested BDI
+					 */
+};
+
 struct lruvec {
 	struct list_head		lists[NR_LRU_LISTS];
 	struct zone_reclaim_stat	reclaim_stat;
@@ -303,6 +309,8 @@ struct lruvec {
 	atomic_long_t			inactive_age;
 	/* Refaults at the time of last reclaim cycle */
 	unsigned long			refaults;
+	/* Various lruvec state flags (enum lruvec_flags) */
+	unsigned long			flags;
 #ifdef CONFIG_MEMCG
 	struct pglist_data *pgdat;
 #endif
@@ -572,9 +580,6 @@ struct zone {
 } ____cacheline_internodealigned_in_smp;
 
 enum pgdat_flags {
-	PGDAT_CONGESTED,		/* pgdat has many dirty pages backed by
-					 * a congested BDI
-					 */
 	PGDAT_DIRTY,			/* reclaim scanning has recently found
 					 * many dirty file pages at the tail
 					 * of the LRU.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 65baa89740dd..3e21166d5198 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -267,29 +267,6 @@ static bool writeback_throttling_sane(struct scan_control *sc)
 #endif
 	return false;
 }
-
-static void set_memcg_congestion(pg_data_t *pgdat,
-				struct mem_cgroup *memcg,
-				bool congested)
-{
-	struct mem_cgroup_per_node *mn;
-
-	if (!memcg)
-		return;
-
-	mn = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
-	WRITE_ONCE(mn->congested, congested);
-}
-
-static bool memcg_congested(pg_data_t *pgdat,
-			struct mem_cgroup *memcg)
-{
-	struct mem_cgroup_per_node *mn;
-
-	mn = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
-	return READ_ONCE(mn->congested);
-
-}
 #else
 static int prealloc_memcg_shrinker(struct shrinker *shrinker)
 {
@@ -309,18 +286,6 @@ static bool writeback_throttling_sane(struct scan_control *sc)
 {
 	return true;
 }
-
-static inline void set_memcg_congestion(struct pglist_data *pgdat,
-				struct mem_cgroup *memcg, bool congested)
-{
-}
-
-static inline bool memcg_congested(struct pglist_data *pgdat,
-			struct mem_cgroup *memcg)
-{
-	return false;
-
-}
 #endif
 
 /*
@@ -2716,12 +2681,6 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 	return inactive_lru_pages > pages_for_compaction;
 }
 
-static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
-{
-	return test_bit(PGDAT_CONGESTED, &pgdat->flags) ||
-		(memcg && memcg_congested(pgdat, memcg));
-}
-
 static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 {
 	struct mem_cgroup *root = sc->target_mem_cgroup;
@@ -2785,8 +2744,11 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct mem_cgroup *root = sc->target_mem_cgroup;
 	unsigned long nr_reclaimed, nr_scanned;
+	struct lruvec *target_lruvec;
 	bool reclaimable = false;
 
+	target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
+
 again:
 	memset(&sc->nr, 0, sizeof(sc->nr));
 
@@ -2829,14 +2791,6 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 		if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
 			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
-		/*
-		 * Tag a node as congested if all the dirty pages
-		 * scanned were backed by a congested BDI and
-		 * wait_iff_congested will stall.
-		 */
-		if (sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
-			set_bit(PGDAT_CONGESTED, &pgdat->flags);
-
 		/* Allow kswapd to start writing pages during reclaim.*/
 		if (sc->nr.unqueued_dirty == sc->nr.file_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
@@ -2852,12 +2806,17 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	}
 
 	/*
+	 * Tag a node/memcg as congested if all the dirty pages
+	 * scanned were backed by a congested BDI and
+	 * wait_iff_congested will stall.
+	 *
 	 * Legacy memcg will stall in page writeback so avoid forcibly
 	 * stalling in wait_iff_congested().
 	 */
-	if (cgroup_reclaim(sc) && writeback_throttling_sane(sc) &&
+	if ((current_is_kswapd() ||
+	     (cgroup_reclaim(sc) && writeback_throttling_sane(sc))) &&
 	    sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
-		set_memcg_congestion(pgdat, root, true);
+		set_bit(LRUVEC_CONGESTED, &target_lruvec->flags);
 
 	/*
 	 * Stall direct reclaim for IO completions if underlying BDIs
 	 * starts encountering unqueued dirty pages or cycling through
 	 * the LRU too quickly.
 	 */
-	if (!sc->hibernation_mode && !current_is_kswapd() &&
-	    current_may_throttle() && pgdat_memcg_congested(pgdat, root))
+	if (!current_is_kswapd() && current_may_throttle() &&
+	    !sc->hibernation_mode &&
+	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
 		wait_iff_congested(BLK_RW_ASYNC, HZ/10);
 
 	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
@@ -3080,8 +3040,16 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		if (zone->zone_pgdat == last_pgdat)
 			continue;
 		last_pgdat = zone->zone_pgdat;
+
 		snapshot_refaults(sc->target_mem_cgroup, zone->zone_pgdat);
-		set_memcg_congestion(last_pgdat, sc->target_mem_cgroup, false);
+
+		if (cgroup_reclaim(sc)) {
+			struct lruvec *lruvec;
+
+			lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup,
+						   zone->zone_pgdat);
+			clear_bit(LRUVEC_CONGESTED, &lruvec->flags);
+		}
 	}
 
 	delayacct_freepages_end();
@@ -3461,7 +3429,9 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int classzone_idx)
 /* Clear pgdat state for congested, dirty or under writeback. */
 static void clear_pgdat_congested(pg_data_t *pgdat)
 {
-	clear_bit(PGDAT_CONGESTED, &pgdat->flags);
+	struct lruvec *lruvec = mem_cgroup_lruvec(NULL, pgdat);
+
+	clear_bit(LRUVEC_CONGESTED, &lruvec->flags);
 	clear_bit(PGDAT_DIRTY, &pgdat->flags);
 	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
 }