From patchwork Tue Dec 24 07:53:22 2019
From: Yafang Shao <laoar.shao@gmail.com>
To: hannes@cmpxchg.org, david@fromorbit.com, mhocko@kernel.org, vdavydov.dev@gmail.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v2 1/5] mm, memcg: reduce size of struct mem_cgroup by using bit field
Date: Tue, 24 Dec 2019 02:53:22 -0500
Message-Id: <1577174006-13025-2-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>
References: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>

Some members of struct mem_cgroup can only be 0 (false) or 1 (true), so we can define them as bit fields to reduce the size of the struct.
With this patch, the size of struct mem_cgroup can in theory be reduced by 64 bytes. Because of the MEMCG_PADDING()s the actual saving depends on the cacheline size, but the struct does get smaller either way.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a7a0a1a5..612a457 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -229,20 +229,26 @@ struct mem_cgroup {
 	/*
 	 * Should the accounting and control be hierarchical, per subtree?
 	 */
-	bool use_hierarchy;
+	unsigned int use_hierarchy : 1;
 
 	/*
 	 * Should the OOM killer kill all belonging tasks, had it kill one?
 	 */
-	bool oom_group;
+	unsigned int oom_group : 1;
 
 	/* protected by memcg_oom_lock */
-	bool oom_lock;
-	int under_oom;
+	unsigned int oom_lock : 1;
 
-	int swappiness;
 	/* OOM-Killer disable */
-	int oom_kill_disable;
+	unsigned int oom_kill_disable : 1;
+
+	/* Legacy tcp memory accounting */
+	unsigned int tcpmem_active : 1;
+	unsigned int tcpmem_pressure : 1;
+
+	int under_oom;
+
+	int swappiness;
 
 	/* memory.events and memory.events.local */
 	struct cgroup_file events_file;
@@ -297,9 +303,6 @@ struct mem_cgroup {
 
 	unsigned long socket_pressure;
 
-	/* Legacy tcp memory accounting */
-	bool tcpmem_active;
-	int tcpmem_pressure;
 
 #ifdef CONFIG_MEMCG_KMEM
 	/* Index in the kmem_cache->memcg_params.memcg_caches array */

From patchwork Tue Dec 24 07:53:23 2019
From: Yafang Shao <laoar.shao@gmail.com>
To: hannes@cmpxchg.org, david@fromorbit.com, mhocko@kernel.org, vdavydov.dev@gmail.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Roman Gushchin <guro@fb.com>
Subject: [PATCH v2 2/5] mm, memcg: introduce MEMCG_PROT_SKIP for memcg zero usage case
Date: Tue, 24 Dec 2019 02:53:23 -0500
Message-Id: <1577174006-13025-3-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>
References: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>

If the usage of a memcg is zero, there is no need to waste work scanning it. That is a minor optimization.
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 1 +
 mm/memcontrol.c            | 2 +-
 mm/vmscan.c                | 6 ++++++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 612a457..1a315c7 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -54,6 +54,7 @@ enum mem_cgroup_protection {
 	MEMCG_PROT_NONE,
 	MEMCG_PROT_LOW,
 	MEMCG_PROT_MIN,
+	MEMCG_PROT_SKIP,	/* For zero usage case */
 };
 
 struct mem_cgroup_reclaim_cookie {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c5b5f74..f35fcca 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6292,7 +6292,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 
 	usage = page_counter_read(&memcg->memory);
 	if (!usage)
-		return MEMCG_PROT_NONE;
+		return MEMCG_PROT_SKIP;
 
 	emin = memcg->memory.min;
 	elow = memcg->memory.low;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5a6445e..3c4c2da 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2677,6 +2677,12 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 			 * thresholds (see get_scan_count).
 			 */
 			break;
+		case MEMCG_PROT_SKIP:
+			/*
+			 * Skip scanning this memcg if the usage of it is
+			 * zero.
+			 */
+			continue;
 		}
 
 		reclaimed = sc->nr_reclaimed;

From patchwork Tue Dec 24 07:53:24 2019
From: Yafang Shao <laoar.shao@gmail.com>
To: hannes@cmpxchg.org, david@fromorbit.com, mhocko@kernel.org, vdavydov.dev@gmail.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Chris Down <chris@chrisdown.name>
Subject: [PATCH v2 3/5] mm, memcg: reset memcg's memory.{min, low} for reclaiming itself
Date: Tue, 24 Dec 2019 02:53:24 -0500
Message-Id: <1577174006-13025-4-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>
References: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>

memory.{emin, elow} are set in mem_cgroup_protected(), and their values do not change until the next recalculation in that function. After either or both are set, the next reclaimer of this memcg may be a different one, e.g. this memcg may be the root memcg of the new reclaimer. In that case mem_cgroup_protection(), called from get_scan_count(), would use the stale values to calculate the scan count, which is not correct. We should reset them to zero in this case.

Here's an example of this issue.

    root_mem_cgroup
         /
        A    memory.max=1024M memory.min=512M memory.current=800M

Once kswapd is woken up, it will try to scan all memcgs, including A, and it will set memory.emin of A to 512M. After that, A may reach its hard limit (memory.max) and start memcg reclaim. Because A is the root of this reclaim, its memory.emin is not recalculated, so the stale value 512M is then used by mem_cgroup_protection() in get_scan_count() to compute the scan count, which is wrong.

Cc: Chris Down <chris@chrisdown.name>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/memcontrol.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f35fcca..2e78931 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6287,8 +6287,17 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (!root)
 		root = root_mem_cgroup;
 
-	if (memcg == root)
+	if (memcg == root) {
+		/*
+		 * Reset memory.(emin, elow) for reclaiming the memcg
+		 * itself.
+		 */
+		if (memcg != root_mem_cgroup) {
+			memcg->memory.emin = 0;
+			memcg->memory.elow = 0;
+		}
 		return MEMCG_PROT_NONE;
+	}
 
 	usage = page_counter_read(&memcg->memory);
 	if (!usage)

From patchwork Tue Dec 24 07:53:25 2019
From: Yafang Shao <laoar.shao@gmail.com>
To: hannes@cmpxchg.org, david@fromorbit.com, mhocko@kernel.org, vdavydov.dev@gmail.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Dave Chinner <dchinner@redhat.com>
Subject: [PATCH v2 4/5] mm: make memcg visible to lru walker isolation function
Date: Tue, 24 Dec 2019 02:53:25 -0500
Message-Id: <1577174006-13025-5-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>
References: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>

The lru walker isolation function may use the memcg to do something, e.g. the inode isolation function will use the memcg to do inode protection in a followup patch.
So make the memcg visible to the lru walker isolation function.

One thing to emphasize about this patch: it replaces for_each_memcg_cache_index() with for_each_mem_cgroup() in list_lru_walk_node(). There is a gap between these two macros: for_each_mem_cgroup() depends on CONFIG_MEMCG while the other depends on CONFIG_MEMCG_KMEM. But since list_lru_memcg_aware() returns false if CONFIG_MEMCG_KMEM is not configured, this replacement is safe.

Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 21 +++++++++++++++++++++
 mm/list_lru.c              | 22 ++++++++++++----------
 mm/memcontrol.c            | 15 ---------------
 3 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 1a315c7..f36ada9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -449,6 +449,21 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
 int mem_cgroup_scan_tasks(struct mem_cgroup *,
 			  int (*)(struct task_struct *, void *), void *);
 
+/*
+ * Iteration constructs for visiting all cgroups (under a tree). If
+ * loops are exited prematurely (break), mem_cgroup_iter_break() must
+ * be used for reference counting.
+ */
+#define for_each_mem_cgroup_tree(iter, root)		\
+	for (iter = mem_cgroup_iter(root, NULL, NULL);	\
+	     iter != NULL;				\
+	     iter = mem_cgroup_iter(root, iter, NULL))
+
+#define for_each_mem_cgroup(iter)			\
+	for (iter = mem_cgroup_iter(NULL, NULL, NULL);	\
+	     iter != NULL;				\
+	     iter = mem_cgroup_iter(NULL, iter, NULL))
+
 static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
 {
 	if (mem_cgroup_disabled())
@@ -949,6 +964,12 @@ static inline int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
 	return 0;
 }
 
+#define for_each_mem_cgroup_tree(iter) \
+	for (iter = NULL; iter; )
+
+#define for_each_mem_cgroup(iter) \
+	for (iter = NULL; iter; )
+
 static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
 {
 	return 0;
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 0f1f6b0..536830d 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -207,11 +207,11 @@ unsigned long list_lru_count_node(struct list_lru *lru, int nid)
 EXPORT_SYMBOL_GPL(list_lru_count_node);
 
 static unsigned long
-__list_lru_walk_one(struct list_lru_node *nlru, int memcg_idx,
+__list_lru_walk_one(struct list_lru_node *nlru, struct mem_cgroup *memcg,
 		    list_lru_walk_cb isolate, void *cb_arg,
 		    unsigned long *nr_to_walk)
 {
-
+	int memcg_idx = memcg_cache_id(memcg);
 	struct list_lru_one *l;
 	struct list_head *item, *n;
 	unsigned long isolated = 0;
@@ -273,7 +273,7 @@ unsigned long list_lru_count_node(struct list_lru *lru, int nid)
 	unsigned long ret;
 
 	spin_lock(&nlru->lock);
-	ret = __list_lru_walk_one(nlru, memcg_cache_id(memcg), isolate, cb_arg,
+	ret = __list_lru_walk_one(nlru, memcg, isolate, cb_arg,
 				  nr_to_walk);
 	spin_unlock(&nlru->lock);
 	return ret;
@@ -289,7 +289,7 @@ unsigned long list_lru_count_node(struct list_lru *lru, int nid)
 	unsigned long ret;
 
 	spin_lock_irq(&nlru->lock);
-	ret = __list_lru_walk_one(nlru, memcg_cache_id(memcg), isolate, cb_arg,
+	ret = __list_lru_walk_one(nlru, memcg, isolate, cb_arg,
 				  nr_to_walk);
 	spin_unlock_irq(&nlru->lock);
 	return ret;
@@ -299,17 +299,15 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
 				 list_lru_walk_cb isolate, void *cb_arg,
 				 unsigned long *nr_to_walk)
 {
+	struct mem_cgroup *memcg;
 	long isolated = 0;
-	int memcg_idx;
 
-	isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
-				      nr_to_walk);
-	if (*nr_to_walk > 0 && list_lru_memcg_aware(lru)) {
-		for_each_memcg_cache_index(memcg_idx) {
+	if (list_lru_memcg_aware(lru)) {
+		for_each_mem_cgroup(memcg) {
 			struct list_lru_node *nlru = &lru->node[nid];
 
 			spin_lock(&nlru->lock);
-			isolated += __list_lru_walk_one(nlru, memcg_idx,
+			isolated += __list_lru_walk_one(nlru, memcg,
 							isolate, cb_arg,
 							nr_to_walk);
 			spin_unlock(&nlru->lock);
@@ -317,7 +315,11 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
 			if (*nr_to_walk <= 0)
 				break;
 		}
+	} else {
+		isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
+					      nr_to_walk);
 	}
+
 	return isolated;
 }
 EXPORT_SYMBOL_GPL(list_lru_walk_node);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2e78931..2fc2bf4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -222,21 +222,6 @@ enum res_type {
 /* Used for OOM nofiier */
 #define OOM_CONTROL		(0)
 
-/*
- * Iteration constructs for visiting all cgroups (under a tree). If
- * loops are exited prematurely (break), mem_cgroup_iter_break() must
- * be used for reference counting.
- */
-#define for_each_mem_cgroup_tree(iter, root)		\
-	for (iter = mem_cgroup_iter(root, NULL, NULL);	\
-	     iter != NULL;				\
-	     iter = mem_cgroup_iter(root, iter, NULL))
-
-#define for_each_mem_cgroup(iter)			\
-	for (iter = mem_cgroup_iter(NULL, NULL, NULL);	\
-	     iter != NULL;				\
-	     iter = mem_cgroup_iter(NULL, iter, NULL))
-
 static inline bool should_force_charge(void)
 {
 	return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||

From patchwork Tue Dec 24 07:53:26 2019
From: Yafang Shao <laoar.shao@gmail.com>
To: hannes@cmpxchg.org, david@fromorbit.com, mhocko@kernel.org,
    vdavydov.dev@gmail.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    Yafang Shao <laoar.shao@gmail.com>, Roman Gushchin <guro@fb.com>,
    Chris Down <chris@chrisdown.name>, Dave Chinner <dchinner@redhat.com>
Subject: [PATCH v2 5/5] memcg, inode: protect page cache from freeing inode
Date: Tue, 24 Dec 2019 02:53:26 -0500
Message-Id: <1577174006-13025-6-git-send-email-laoar.shao@gmail.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>
References: <1577174006-13025-1-git-send-email-laoar.shao@gmail.com>
On my server there are some running memcgs protected by memory.{min, low},
but I found the usage of these memcgs abruptly became very small, far less
than the protection limit. It confused me, and finally I found the cause:
inode stealing. Once an inode is freed, all of its page cache is dropped as
well, no matter how much page cache it has. So if we intend to protect the
page cache in a memcg, we must protect its host (the inode) first.
Otherwise the memcg protection can easily be bypassed by freeing the inode,
especially if there are big files in this memcg.

Suppose we have a memcg whose stat is:

	memory.current = 1024M
	memory.min     = 512M

and in this memcg there is an inode with 800M of page cache. Once this
memcg is scanned by kswapd or another regular reclaimer:

	kswapd				<<<< or any other regular reclaimer
	  shrink_node_memcgs
	    switch (mem_cgroup_protected())	<<<< not protected
	    case MEMCG_PROT_NONE:		<<<< will scan this memcg
		break;
	    shrink_lruvec()			<<<< reclaims the page cache
	    shrink_slab()			<<<< may free this inode and drop
						     all its page cache (800M)

So we must protect the inode first if we want to protect page cache.

The inherent mismatch between memcg and inode is a trouble. One inode can
be shared by different memcgs, but that is a very rare case. If an inode
is shared, its page cache may be charged to different memcgs. Currently
there is no perfect solution for this kind of issue, but the inode
majority-writer ownership switching can help more or less.
Cc: Roman Gushchin <guro@fb.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reported-by: kbuild test robot
---
 fs/inode.c                 | 25 +++++++++++++++++++++++--
 include/linux/memcontrol.h | 11 ++++++++++-
 mm/memcontrol.c            | 43 +++++++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                |  5 +++++
 4 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index fef457a..4f4b2f3 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -54,6 +54,13 @@
  *   inode_hash_lock
  */
 
+struct inode_head {
+	struct list_head *freeable;
+#ifdef CONFIG_MEMCG_KMEM
+	struct mem_cgroup *memcg;
+#endif
+};
+
 static unsigned int i_hash_mask __read_mostly;
 static unsigned int i_hash_shift __read_mostly;
 static struct hlist_head *inode_hashtable __read_mostly;
@@ -724,8 +731,10 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 static enum lru_status inode_lru_isolate(struct list_head *item,
 		struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
 {
-	struct list_head *freeable = arg;
+	struct inode_head *ihead = (struct inode_head *)arg;
+	struct list_head *freeable = ihead->freeable;
 	struct inode	*inode = container_of(item, struct inode, i_lru);
+	struct mem_cgroup *memcg = NULL;
 
 	/*
 	 * we are inverting the lru lock/inode->i_lock here, so use a trylock.
@@ -734,6 +743,15 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
 	if (!spin_trylock(&inode->i_lock))
 		return LRU_SKIP;
 
+#ifdef CONFIG_MEMCG_KMEM
+	memcg = ihead->memcg;
+#endif
+	if (memcg && inode->i_data.nrpages &&
+	    !(memcg_can_reclaim_inode(memcg, inode))) {
+		spin_unlock(&inode->i_lock);
+		return LRU_ROTATE;
+	}
+
 	/*
 	 * Referenced or dirty inodes are still in use. Give them another pass
 	 * through the LRU as we cannot reclaim them now.
@@ -789,11 +807,14 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
  */
 long prune_icache_sb(struct super_block *sb, struct shrink_control *sc)
 {
+	struct inode_head ihead;
 	LIST_HEAD(freeable);
 	long freed;
 
+	ihead.freeable = &freeable;
+	ihead.memcg = sc->memcg;
 	freed = list_lru_shrink_walk(&sb->s_inode_lru, sc,
-				     inode_lru_isolate, &freeable);
+				     inode_lru_isolate, &ihead);
 	dispose_list(&freeable);
 	return freed;
 }
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f36ada9..d1d4175 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -247,6 +247,9 @@ struct mem_cgroup {
 	unsigned int tcpmem_active : 1;
 	unsigned int tcpmem_pressure : 1;
 
+	/* Soft protection will be ignored if it's true */
+	unsigned int in_low_reclaim : 1;
+
 	int under_oom;
 
 	int	swappiness;
@@ -363,7 +366,7 @@ static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
 
 enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 						struct mem_cgroup *memcg);
-
+bool memcg_can_reclaim_inode(struct mem_cgroup *memcg, struct inode *inode);
 int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
 			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
 			  bool compound);
@@ -865,6 +868,12 @@ static inline enum mem_cgroup_protection mem_cgroup_protected(
 	return MEMCG_PROT_NONE;
 }
 
+static inline bool memcg_can_reclaim_inode(struct mem_cgroup *memcg,
+					   struct inode *inode)
+{
+	return true;
+}
+
 static inline int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
 					gfp_t gfp_mask,
 					struct mem_cgroup **memcgp,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2fc2bf4..c3498fd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6340,6 +6340,49 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 }
 
 /**
+ * Once an inode is freed, all of its page cache will be dropped as
+ * well, even if there is a lot of page cache.
+ * So if we intend to protect the page cache in a memcg, we must
+ * protect its host (the inode) first. Otherwise the memcg protection
+ * can easily be bypassed by freeing the inode, especially if there
+ * are big files in this memcg.
+ * Note that it may happen that the page cache is already charged to
+ * the memcg, but the inode hasn't been added to this memcg yet. In
+ * that case this inode is not protected.
+ * The inherent mismatch between memcg and inode is a trouble. One
+ * inode can be shared by different memcgs, but it is a very rare
+ * case. If an inode is shared, its page cache may be charged to
+ * different memcgs. Currently there's no perfect solution for this
+ * kind of issue, but the inode majority-writer ownership switching
+ * can help more or less.
+ */
+bool memcg_can_reclaim_inode(struct mem_cgroup *memcg,
+			     struct inode *inode)
+{
+	unsigned long cgroup_size;
+	unsigned long protection;
+	bool reclaimable = true;
+
+	if (memcg == root_mem_cgroup)
+		goto out;
+
+	protection = mem_cgroup_protection(memcg, memcg->in_low_reclaim);
+	if (!protection)
+		goto out;
+
+	/*
+	 * Don't protect this inode if the usage of this memcg is still
+	 * above the protection after reclaiming this inode and all of
+	 * its page cache.
+	 */
+	cgroup_size = mem_cgroup_size(memcg);
+	if (inode->i_data.nrpages + protection > cgroup_size)
+		reclaimable = false;
+
+out:
+	return reclaimable;
+}
+
+/**
  * mem_cgroup_try_charge - try charging a page
  * @page: page to charge
  * @mm: mm context of the victim
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3c4c2da..ecc5c1d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2666,6 +2666,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 				sc->memcg_low_skipped = 1;
 				continue;
 			}
+
+			memcg->in_low_reclaim = 1;
 			memcg_memory_event(memcg, MEMCG_LOW);
 			break;
 		case MEMCG_PROT_NONE:
@@ -2693,6 +2695,9 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
 			    sc->priority);
 
+		if (memcg->in_low_reclaim)
+			memcg->in_low_reclaim = 0;
+
 		/* Record the group's reclaim efficiency */
 		vmpressure(sc->gfp_mask, memcg, false,
 			   sc->nr_scanned - scanned,