From patchwork Fri May 8 18:30:48 2020
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko,
 "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 01/19] mm: fix NUMA node file count error in replace_page_cache()
Date: Fri, 8 May 2020 14:30:48 -0400
Message-Id: <20200508183105.225460-2-hannes@cmpxchg.org>
In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org>
References: <20200508183105.225460-1-hannes@cmpxchg.org>

When replacing one page with another one in the cache, we have to
decrease the file count of the old page's NUMA node and increase the
one of the new NUMA node, otherwise the old node leaks the count and
the new node eventually underflows its counter.

Fixes: 74d609585d8b ("page cache: Add and replace pages using the XArray")
Signed-off-by: Johannes Weiner
Reviewed-by: Alex Shi
Reviewed-by: Shakeel Butt
Reviewed-by: Joonsoo Kim
Reviewed-by: Balbir Singh
---
 mm/filemap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index af1c6adad5bd..2b057b0aa882 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -808,11 +808,11 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
 	old->mapping = NULL;
 	/* hugetlb pages do not participate in page cache accounting. */
 	if (!PageHuge(old))
-		__dec_node_page_state(new, NR_FILE_PAGES);
+		__dec_node_page_state(old, NR_FILE_PAGES);
 	if (!PageHuge(new))
 		__inc_node_page_state(new, NR_FILE_PAGES);
 	if (PageSwapBacked(old))
-		__dec_node_page_state(new, NR_SHMEM);
+		__dec_node_page_state(old, NR_SHMEM);
 	if (PageSwapBacked(new))
 		__inc_node_page_state(new, NR_SHMEM);
 	xas_unlock_irqrestore(&xas, flags);
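
[Editor's illustration, not part of the patch: the fix pairs the decrement
with the old page's node and the increment with the new page's node. A
minimal user-space sketch of that invariant - every name here is a
hypothetical stand-in, not kernel code:]

	#include <assert.h>

	#define MAX_NODES 8			/* hypothetical node count */

	static long nr_file_pages[MAX_NODES];	/* stand-in for per-node NR_FILE_PAGES */

	struct fake_page { int nid; };		/* stand-in for struct page */

	/* Replacing @old with @new must move the count between their nodes. */
	static void replace_accounted(struct fake_page *old, struct fake_page *new)
	{
		assert(nr_file_pages[old->nid] > 0);	/* or the counter underflows */
		nr_file_pages[old->nid]--;	/* the old node gives up the page... */
		nr_file_pages[new->nid]++;	/* ...and the new node takes it over */
	}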
From patchwork Fri May 8 18:30:49 2020
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko,
 "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 02/19] mm: memcontrol: fix stat-corrupting race in charge moving
Date: Fri, 8 May 2020 14:30:49 -0400
Message-Id: <20200508183105.225460-3-hannes@cmpxchg.org>
In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org>
References: <20200508183105.225460-1-hannes@cmpxchg.org>

The move_lock is a per-memcg lock, but the VM accounting code that
needs to acquire it comes from the page and follows page->mem_cgroup
under RCU protection. That means that the page becomes unlocked not
when we drop the move_lock, but when we update page->mem_cgroup. And
that assignment doesn't imply any memory ordering. If that pointer
write gets reordered against the reads of the page state -
page_mapped(), PageDirty() etc. - the state may change while we rely
on it being stable, and we can end up corrupting the counters.

Place an SMP memory barrier to make sure we're done with all page
state by the time the new page->mem_cgroup becomes visible.

Also replace the open-coded move_lock with a lock_page_memcg() to
make it more obvious what we're serializing against.
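
[Editor's illustration, not part of the patch: a user-space C11 sketch of
why the pointer store must be ordered after the preceding page-state reads.
The types are hypothetical stand-ins; the seq_cst fence plays the role of
the kernel's smp_mb().]

	#include <stdatomic.h>
	#include <stdbool.h>

	struct fake_page {			/* stand-in for struct page */
		bool mapped;			/* stand-in for page_mapped() */
		_Atomic(void *) mem_cgroup;	/* readers follow this under RCU */
	};

	static void move_account(struct fake_page *page, void *to,
				 long *from_mapped, long *to_mapped)
	{
		if (page->mapped) {		/* read page state... */
			(*from_mapped)--;	/* ...and migrate the counter */
			(*to_mapped)++;
		}
		/*
		 * Order the state reads and counter updates above before
		 * the pointer switch below - the role smp_mb() plays in
		 * the patch. Once the new pointer is visible, the page is
		 * effectively unlocked and its state may change again.
		 */
		atomic_thread_fence(memory_order_seq_cst);
		atomic_store_explicit(&page->mem_cgroup, to, memory_order_relaxed);
	}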
Signed-off-by: Johannes Weiner
Reviewed-by: Joonsoo Kim
Reviewed-by: Shakeel Butt
---
 mm/memcontrol.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 317dbbaac603..cdd29b59929b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5376,7 +5376,6 @@ static int mem_cgroup_move_account(struct page *page,
 {
 	struct lruvec *from_vec, *to_vec;
 	struct pglist_data *pgdat;
-	unsigned long flags;
 	unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
 	int ret;
 	bool anon;
@@ -5403,18 +5402,13 @@ static int mem_cgroup_move_account(struct page *page,
 	from_vec = mem_cgroup_lruvec(from, pgdat);
 	to_vec = mem_cgroup_lruvec(to, pgdat);
 
-	spin_lock_irqsave(&from->move_lock, flags);
+	lock_page_memcg(page);
 
 	if (!anon && page_mapped(page)) {
 		__mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages);
 		__mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages);
 	}
 
-	/*
-	 * move_lock grabbed above and caller set from->moving_account, so
-	 * mod_memcg_page_state will serialize updates to PageDirty.
-	 * So mapping should be stable for dirty pages.
-	 */
 	if (!anon && PageDirty(page)) {
 		struct address_space *mapping = page_mapping(page);
 
@@ -5430,15 +5424,23 @@ static int mem_cgroup_move_account(struct page *page,
 	}
 
 	/*
+	 * All state has been migrated, let's switch to the new memcg.
+	 *
 	 * It is safe to change page->mem_cgroup here because the page
-	 * is referenced, charged, and isolated - we can't race with
-	 * uncharging, charging, migration, or LRU putback.
+	 * is referenced, charged, isolated, and locked: we can't race
+	 * with (un)charging, migration, LRU putback, or anything else
+	 * that would rely on a stable page->mem_cgroup.
+	 *
+	 * Note that lock_page_memcg is a memcg lock, not a page lock,
+	 * to save space. As soon as we switch page->mem_cgroup to a
+	 * new memcg that isn't locked, the above state can change
+	 * concurrently again. Make sure we're truly done with it.
 	 */
+	smp_mb();
 
-	/* caller should have done css_get */
-	page->mem_cgroup = to;
+	page->mem_cgroup = to;	/* caller should have done css_get */
 
-	spin_unlock_irqrestore(&from->move_lock, flags);
+	__unlock_page_memcg(from);
 
 	ret = 0;

From patchwork Fri May 8 18:30:50 2020
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko,
 "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 03/19] mm: memcontrol: drop @compound parameter from memcg charging API
Date: Fri, 8 May 2020 14:30:50 -0400
Message-Id: <20200508183105.225460-4-hannes@cmpxchg.org>
In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org>
References: <20200508183105.225460-1-hannes@cmpxchg.org>

The memcg charging API carries a boolean @compound parameter that
tells whether the page we're dealing with is a hugepage.
mem_cgroup_commit_charge() has another boolean @lrucare that indicates
whether the page needs LRU locking or not while charging. The majority
of callsites know those parameters at compile time, which results in a
lot of naked "false, false" argument lists. This makes for cryptic
code and is a breeding ground for subtle mistakes.

Thankfully, the huge page state can be inferred from the page itself
and doesn't need to be passed along. This is safe because charging
completes before the point where the page is published and somebody
could split it.

Simplify the callsites by removing @compound, and let memcg infer the
state by using hpage_nr_pages() unconditionally. That function does
PageTransHuge() to identify huge pages, which also helpfully asserts
that nobody passes in tail pages by accident.

The following patches will introduce a new charging API; best not to
carry over unnecessary weight.
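
[Editor's illustration, not part of the patch: how the charge size can be
derived from the page itself instead of a caller-supplied flag. A
user-space sketch with hypothetical stand-in types; hpage_nr_pages() and
the abs(nr_pages) > 1 check in mem_cgroup_charge_statistics() are the real
kernel constructs being mimicked.]

	#include <stdbool.h>
	#include <stdlib.h>

	#define HPAGE_PMD_NR 512		/* 2M THP with 4K base pages */

	struct fake_page { bool trans_huge; };	/* stand-in for struct page */

	/* Mimics hpage_nr_pages(): the page itself knows whether it is huge. */
	static unsigned int fake_hpage_nr_pages(const struct fake_page *page)
	{
		return page->trans_huge ? HPAGE_PMD_NR : 1;
	}

	/*
	 * Mimics the statistics check: with a signed nr_pages (negative on
	 * uncharge), |nr_pages| > 1 identifies a huge page, so no separate
	 * bool needs to travel through the whole call chain.
	 */
	static bool is_huge_charge(int nr_pages)
	{
		return abs(nr_pages) > 1;
	}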
Signed-off-by: Johannes Weiner
Reviewed-by: Alex Shi
Reviewed-by: Joonsoo Kim
Reviewed-by: Shakeel Butt
---
 include/linux/memcontrol.h | 22 ++++++++--------------
 kernel/events/uprobes.c    |  6 +++---
 mm/filemap.c               |  6 +++---
 mm/huge_memory.c           |  8 ++++----
 mm/khugepaged.c            | 20 ++++++++++----------
 mm/memcontrol.c            | 38 +++++++++++++++-----------------------
 mm/memory.c                | 32 +++++++++++++++-----------------
 mm/migrate.c               |  6 +++---
 mm/shmem.c                 | 22 +++++++++------------
 mm/swapfile.c              |  9 ++++-----
 mm/userfaultfd.c           |  6 +++---
 11 files changed, 77 insertions(+), 98 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index b67dd43aaa4b..30292d57c8af 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -373,15 +373,12 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 }
 
 int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
-			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
-			  bool compound);
+			  gfp_t gfp_mask, struct mem_cgroup **memcgp);
 int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm,
-			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
-			  bool compound);
+			  gfp_t gfp_mask, struct mem_cgroup **memcgp);
 void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
-			      bool lrucare, bool compound);
-void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
-			      bool compound);
+			      bool lrucare);
+void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg);
 
 void mem_cgroup_uncharge(struct page *page);
 void mem_cgroup_uncharge_list(struct list_head *page_list);
@@ -870,8 +867,7 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 static inline int mem_cgroup_try_charge(struct page *page,
 					struct mm_struct *mm, gfp_t gfp_mask,
-					struct mem_cgroup **memcgp,
-					bool compound)
+					struct mem_cgroup **memcgp)
 {
 	*memcgp = NULL;
 	return 0;
@@ -880,8 +876,7 @@ static inline int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
 static inline int mem_cgroup_try_charge_delay(struct page *page,
 					      struct mm_struct *mm,
 					      gfp_t gfp_mask,
-					      struct mem_cgroup **memcgp,
-					      bool compound)
+					      struct mem_cgroup **memcgp)
 {
 	*memcgp = NULL;
 	return 0;
@@ -889,13 +884,12 @@ static inline int mem_cgroup_try_charge_delay(struct page *page,
 static inline void mem_cgroup_commit_charge(struct page *page,
 					    struct mem_cgroup *memcg,
-					    bool lrucare, bool compound)
+					    bool lrucare)
 {
 }
 
 static inline void mem_cgroup_cancel_charge(struct page *page,
-					    struct mem_cgroup *memcg,
-					    bool compound)
+					    struct mem_cgroup *memcg)
 {
 }
 
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ece7e13f6e4a..40e7488ce467 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -169,7 +169,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	if (new_page) {
 		err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL,
-					    &memcg, false);
+					    &memcg);
 		if (err)
 			return err;
 	}
@@ -181,7 +181,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	err = -EAGAIN;
 	if (!page_vma_mapped_walk(&pvmw)) {
 		if (new_page)
-			mem_cgroup_cancel_charge(new_page, memcg, false);
+			mem_cgroup_cancel_charge(new_page, memcg);
 		goto unlock;
 	}
 	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
@@ -189,7 +189,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	if (new_page) {
 		get_page(new_page);
 		page_add_new_anon_rmap(new_page, vma, addr, false);
-		mem_cgroup_commit_charge(new_page, memcg, false, false);
+		mem_cgroup_commit_charge(new_page, memcg, false);
 		lru_cache_add_active_or_unevictable(new_page, vma);
 	} else
 		/* no new page, just dec_mm_counter for old_page */
diff --git a/mm/filemap.c b/mm/filemap.c
index 2b057b0aa882..ce200386736c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -842,7 +842,7 @@ static int __add_to_page_cache_locked(struct page *page,
 	if (!huge) {
 		error = mem_cgroup_try_charge(page, current->mm,
-					      gfp_mask, &memcg, false);
+					      gfp_mask, &memcg);
 		if (error)
 			return error;
 	}
@@ -878,14 +878,14 @@ static int __add_to_page_cache_locked(struct page *page,
 		goto error;
 
 	if (!huge)
-		mem_cgroup_commit_charge(page, memcg, false, false);
+		mem_cgroup_commit_charge(page, memcg, false);
 	trace_mm_filemap_add_to_page_cache(page);
 	return 0;
 error:
 	page->mapping = NULL;
 	/* Leave page->index set: truncation relies upon it */
 	if (!huge)
-		mem_cgroup_cancel_charge(page, memcg, false);
+		mem_cgroup_cancel_charge(page, memcg);
 	put_page(page);
 	return xas_error(&xas);
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d7384eb2e017..46c2bc20b7cb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -594,7 +594,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 	VM_BUG_ON_PAGE(!PageCompound(page), page);
 
-	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, gfp, &memcg, true)) {
+	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, gfp, &memcg)) {
 		put_page(page);
 		count_vm_event(THP_FAULT_FALLBACK);
 		count_vm_event(THP_FAULT_FALLBACK_CHARGE);
@@ -630,7 +630,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 			vm_fault_t ret2;
 
 			spin_unlock(vmf->ptl);
-			mem_cgroup_cancel_charge(page, memcg, true);
+			mem_cgroup_cancel_charge(page, memcg);
 			put_page(page);
 			pte_free(vma->vm_mm, pgtable);
 			ret2 = handle_userfault(vmf, VM_UFFD_MISSING);
@@ -641,7 +641,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		entry = mk_huge_pmd(page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 		page_add_new_anon_rmap(page, vma, haddr, true);
-		mem_cgroup_commit_charge(page, memcg, false, true);
+		mem_cgroup_commit_charge(page, memcg, false);
 		lru_cache_add_active_or_unevictable(page, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
@@ -658,7 +658,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 release:
 	if (pgtable)
 		pte_free(vma->vm_mm, pgtable);
-	mem_cgroup_cancel_charge(page, memcg, true);
+	mem_cgroup_cancel_charge(page, memcg);
 	put_page(page);
 	return ret;
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a02a4c5f2fe4..b73d2af6d11a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1067,7 +1067,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 		goto out_nolock;
 	}
 
-	if (unlikely(mem_cgroup_try_charge(new_page, mm, gfp, &memcg, true))) {
+	if (unlikely(mem_cgroup_try_charge(new_page, mm, gfp, &memcg))) {
 		result = SCAN_CGROUP_CHARGE_FAIL;
 		goto out_nolock;
 	}
@@ -1075,7 +1075,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	down_read(&mm->mmap_sem);
 	result = hugepage_vma_revalidate(mm, address, &vma);
 	if (result) {
-		mem_cgroup_cancel_charge(new_page, memcg, true);
+		mem_cgroup_cancel_charge(new_page, memcg);
 		up_read(&mm->mmap_sem);
 		goto out_nolock;
 	}
@@ -1083,7 +1083,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	pmd = mm_find_pmd(mm, address);
 	if (!pmd) {
 		result = SCAN_PMD_NULL;
-		mem_cgroup_cancel_charge(new_page, memcg, true);
+		mem_cgroup_cancel_charge(new_page, memcg);
 		up_read(&mm->mmap_sem);
 		goto out_nolock;
 	}
@@ -1095,7 +1095,7 @@ static void
collapse_huge_page(struct mm_struct *mm,
 	 */
 	if (unmapped && !__collapse_huge_page_swapin(mm, vma, address,
 						     pmd, referenced)) {
-		mem_cgroup_cancel_charge(new_page, memcg, true);
+		mem_cgroup_cancel_charge(new_page, memcg);
 		up_read(&mm->mmap_sem);
 		goto out_nolock;
 	}
@@ -1183,7 +1183,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
 	page_add_new_anon_rmap(new_page, vma, address, true);
-	mem_cgroup_commit_charge(new_page, memcg, false, true);
+	mem_cgroup_commit_charge(new_page, memcg, false);
 	count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1);
 	lru_cache_add_active_or_unevictable(new_page, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
@@ -1201,7 +1201,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	trace_mm_collapse_huge_page(mm, isolated, result);
 	return;
 out:
-	mem_cgroup_cancel_charge(new_page, memcg, true);
+	mem_cgroup_cancel_charge(new_page, memcg);
 	goto out_up_write;
 }
 
@@ -1628,7 +1628,7 @@ static void collapse_file(struct mm_struct *mm,
 		goto out;
 	}
 
-	if (unlikely(mem_cgroup_try_charge(new_page, mm, gfp, &memcg, true))) {
+	if (unlikely(mem_cgroup_try_charge(new_page, mm, gfp, &memcg))) {
 		result = SCAN_CGROUP_CHARGE_FAIL;
 		goto out;
 	}
@@ -1641,7 +1641,7 @@ static void collapse_file(struct mm_struct *mm,
 			break;
 		xas_unlock_irq(&xas);
 		if (!xas_nomem(&xas, GFP_KERNEL)) {
-			mem_cgroup_cancel_charge(new_page, memcg, true);
+			mem_cgroup_cancel_charge(new_page, memcg);
 			result = SCAN_FAIL;
 			goto out;
 		}
@@ -1877,7 +1877,7 @@ static void collapse_file(struct mm_struct *mm,
 
 		SetPageUptodate(new_page);
 		page_ref_add(new_page, HPAGE_PMD_NR - 1);
-		mem_cgroup_commit_charge(new_page, memcg, false, true);
+		mem_cgroup_commit_charge(new_page, memcg, false);
 
 		if (is_shmem) {
 			set_page_dirty(new_page);
@@ -1932,7 +1932,7 @@ static void collapse_file(struct mm_struct *mm,
 		VM_BUG_ON(nr_none);
 		xas_unlock_irq(&xas);
 
-		mem_cgroup_cancel_charge(new_page, memcg, true);
+		mem_cgroup_cancel_charge(new_page, memcg);
 		new_page->mapping = NULL;
 	}
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cdd29b59929b..13da46a5d8ae 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -834,7 +834,7 @@ static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
 
 static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 					 struct page *page,
-					 bool compound, int nr_pages)
+					 int nr_pages)
 {
 	/*
 	 * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
@@ -848,7 +848,7 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 			__mod_memcg_state(memcg, NR_SHMEM, nr_pages);
 	}
 
-	if (compound) {
+	if (abs(nr_pages) > 1) {
 		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
 		__mod_memcg_state(memcg, MEMCG_RSS_HUGE, nr_pages);
 	}
@@ -5445,9 +5445,9 @@ static int mem_cgroup_move_account(struct page *page,
 	ret = 0;
 
 	local_irq_disable();
-	mem_cgroup_charge_statistics(to, page, compound, nr_pages);
+	mem_cgroup_charge_statistics(to, page, nr_pages);
 	memcg_check_events(to, page);
-	mem_cgroup_charge_statistics(from, page, compound, -nr_pages);
+	mem_cgroup_charge_statistics(from, page, -nr_pages);
 	memcg_check_events(from, page);
 	local_irq_enable();
 out_unlock:
@@ -6435,7 +6435,6 @@ void mem_cgroup_calculate_protection(struct mem_cgroup *root,
  * @mm: mm context of the victim
  * @gfp_mask: reclaim mode
  * @memcgp: charged memcg return
- * @compound: charge the page as compound or small page
  *
  * Try to charge @page to the memcg that @mm belongs to, reclaiming
  * pages according to @gfp_mask if necessary.
@@ -6448,11 +6447,10 @@ void mem_cgroup_calculate_protection(struct mem_cgroup *root,
  * with mem_cgroup_cancel_charge() in case page instantiation fails.
  */
 int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
-			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
-			  bool compound)
+			  gfp_t gfp_mask, struct mem_cgroup **memcgp)
 {
+	unsigned int nr_pages = hpage_nr_pages(page);
 	struct mem_cgroup *memcg = NULL;
-	unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
 	int ret = 0;
 
 	if (mem_cgroup_disabled())
@@ -6494,13 +6492,12 @@ int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
 }
 
 int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm,
-			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
-			  bool compound)
+			  gfp_t gfp_mask, struct mem_cgroup **memcgp)
 {
 	struct mem_cgroup *memcg;
 	int ret;
 
-	ret = mem_cgroup_try_charge(page, mm, gfp_mask, memcgp, compound);
+	ret = mem_cgroup_try_charge(page, mm, gfp_mask, memcgp);
 	memcg = *memcgp;
 	mem_cgroup_throttle_swaprate(memcg, page_to_nid(page), gfp_mask);
 	return ret;
@@ -6511,7 +6508,6 @@ int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm,
  * @page: page to charge
  * @memcg: memcg to charge the page to
  * @lrucare: page might be on LRU already
- * @compound: charge the page as compound or small page
 *
  * Finalize a charge transaction started by mem_cgroup_try_charge(),
  * after page->mapping has been set up. This must happen atomically
@@ -6524,9 +6520,9 @@ int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm,
  * Use mem_cgroup_cancel_charge() to cancel the transaction instead.
  */
 void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
-			      bool lrucare, bool compound)
+			      bool lrucare)
 {
-	unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
+	unsigned int nr_pages = hpage_nr_pages(page);
 
 	VM_BUG_ON_PAGE(!page->mapping, page);
 	VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page);
@@ -6544,7 +6540,7 @@ void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
 	commit_charge(page, memcg, lrucare);
 
 	local_irq_disable();
-	mem_cgroup_charge_statistics(memcg, page, compound, nr_pages);
+	mem_cgroup_charge_statistics(memcg, page, nr_pages);
 	memcg_check_events(memcg, page);
 	local_irq_enable();
 
@@ -6563,14 +6559,12 @@ void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
  * mem_cgroup_cancel_charge - cancel a page charge
  * @page: page to charge
  * @memcg: memcg to charge the page to
- * @compound: charge the page as compound or small page
 *
  * Cancel a charge transaction started by mem_cgroup_try_charge().
  */
-void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
-			      bool compound)
+void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg)
 {
-	unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
+	unsigned int nr_pages = hpage_nr_pages(page);
 
 	if (mem_cgroup_disabled())
 		return;
@@ -6785,8 +6779,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 	commit_charge(newpage, memcg, false);
 
 	local_irq_save(flags);
-	mem_cgroup_charge_statistics(memcg, newpage, PageTransHuge(newpage),
-				     nr_pages);
+	mem_cgroup_charge_statistics(memcg, newpage, nr_pages);
 	memcg_check_events(memcg, newpage);
 	local_irq_restore(flags);
 }
@@ -7016,8 +7009,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	 * only synchronisation we have for updating the per-CPU variables.
 	 */
 	VM_BUG_ON(!irqs_disabled());
-	mem_cgroup_charge_statistics(memcg, page, PageTransHuge(page),
-				     -nr_entries);
+	mem_cgroup_charge_statistics(memcg, page, -nr_entries);
 	memcg_check_events(memcg, page);
 
 	if (!mem_cgroup_is_root(memcg))
diff --git a/mm/memory.c b/mm/memory.c
index 0ad29c7274de..a08cbaa81607 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2676,7 +2676,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		}
 	}
 
-	if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, &memcg, false))
+	if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, &memcg))
 		goto oom_free_new;
 
 	__SetPageUptodate(new_page);
@@ -2711,7 +2711,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
 		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
-		mem_cgroup_commit_charge(new_page, memcg, false, false);
+		mem_cgroup_commit_charge(new_page, memcg, false);
 		lru_cache_add_active_or_unevictable(new_page, vma);
 		/*
 		 * We call the notify macro here because, when using secondary
@@ -2750,7 +2750,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		new_page = old_page;
 		page_copied = 1;
 	} else {
-		mem_cgroup_cancel_charge(new_page, memcg, false);
+		mem_cgroup_cancel_charge(new_page, memcg);
 	}
 
 	if (new_page)
@@ -3193,8 +3193,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		goto out_page;
 	}
 
-	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL,
-					&memcg, false)) {
+	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg)) {
 		ret = VM_FAULT_OOM;
 		goto out_page;
 	}
@@ -3245,11 +3244,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	/* ksm created a completely new copy */
 	if (unlikely(page != swapcache && swapcache)) {
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
-		mem_cgroup_commit_charge(page, memcg, false, false);
+		mem_cgroup_commit_charge(page, memcg, false);
 		lru_cache_add_active_or_unevictable(page, vma);
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
-		mem_cgroup_commit_charge(page, memcg, true, false);
+		mem_cgroup_commit_charge(page, memcg, true);
 		activate_page(page);
 	}
 
@@ -3285,7 +3284,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 out:
 	return ret;
 out_nomap:
-	mem_cgroup_cancel_charge(page, memcg, false);
+	mem_cgroup_cancel_charge(page, memcg);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 out_page:
 	unlock_page(page);
@@ -3359,8 +3358,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (!page)
 		goto oom;
 
-	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg,
-					false))
+	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg))
 		goto oom_free_page;
 
 	/*
@@ -3386,14 +3384,14 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	/* Deliver the page fault to userland, check inside PT lock */
 	if (userfaultfd_missing(vma)) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
-		mem_cgroup_cancel_charge(page, memcg, false);
+		mem_cgroup_cancel_charge(page, memcg);
 		put_page(page);
 		return handle_userfault(vmf, VM_UFFD_MISSING);
 	}
 
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
-	mem_cgroup_commit_charge(page, memcg, false, false);
+	mem_cgroup_commit_charge(page, memcg, false);
 	lru_cache_add_active_or_unevictable(page, vma);
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
@@ -3404,7 +3402,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return ret;
 release:
-	mem_cgroup_cancel_charge(page, memcg, false);
+	mem_cgroup_cancel_charge(page, memcg);
 	put_page(page);
 	goto unlock;
oom_free_page:
@@ -3655,7 +3653,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
-		mem_cgroup_commit_charge(page, memcg, false, false);
+		mem_cgroup_commit_charge(page, memcg, false);
 		lru_cache_add_active_or_unevictable(page, vma);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
@@ -3864,8 +3862,8 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf)
 	if (!vmf->cow_page)
 		return VM_FAULT_OOM;
 
-	if (mem_cgroup_try_charge_delay(vmf->cow_page, vma->vm_mm, GFP_KERNEL,
-					&vmf->memcg, false)) {
+	if (mem_cgroup_try_charge_delay(vmf->cow_page, vma->vm_mm,
+					GFP_KERNEL, &vmf->memcg)) {
 		put_page(vmf->cow_page);
 		return VM_FAULT_OOM;
 	}
@@ -3886,7 +3884,7 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf)
 		goto uncharge_out;
 	return ret;
uncharge_out:
-	mem_cgroup_cancel_charge(vmf->cow_page, vmf->memcg, false);
+	mem_cgroup_cancel_charge(vmf->cow_page, vmf->memcg);
 	put_page(vmf->cow_page);
 	return ret;
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index f66f93f9a5e2..50c7a08f8f31 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2786,7 +2786,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	if (unlikely(anon_vma_prepare(vma)))
 		goto abort;
 
-	if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL, &memcg, false))
+	if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL, &memcg))
 		goto abort;
 
 	/*
@@ -2832,7 +2832,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	inc_mm_counter(mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, addr, false);
-	mem_cgroup_commit_charge(page, memcg, false, false);
+	mem_cgroup_commit_charge(page, memcg, false);
 	if (!is_zone_device_page(page))
 		lru_cache_add_active_or_unevictable(page, vma);
 	get_page(page);
@@ -2854,7 +2854,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
unlock_abort:
 	pte_unmap_unlock(ptep, ptl);
-	mem_cgroup_cancel_charge(page, memcg, false);
+	mem_cgroup_cancel_charge(page, memcg);
abort:
 	*src &= ~MIGRATE_PFN_MIGRATE;
 }
diff --git a/mm/shmem.c b/mm/shmem.c
index bd8840082c94..d505b6cce4ab 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1664,8 +1664,7 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
 		goto failed;
 	}
 
-	error = mem_cgroup_try_charge_delay(page, charge_mm, gfp, &memcg,
-					    false);
+	error = mem_cgroup_try_charge_delay(page, charge_mm, gfp, &memcg);
 	if (!error) {
 		error = shmem_add_to_page_cache(page, mapping, index,
 						swp_to_radix_entry(swap), gfp);
@@ -1680,14 +1679,14 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
 		 * the rest.
 		 */
 		if (error) {
-			mem_cgroup_cancel_charge(page, memcg, false);
+			mem_cgroup_cancel_charge(page, memcg);
 			delete_from_swap_cache(page);
 		}
 	}
 	if (error)
 		goto failed;
 
-	mem_cgroup_commit_charge(page, memcg, true, false);
+	mem_cgroup_commit_charge(page, memcg, true);
 
 	spin_lock_irq(&info->lock);
 	info->swapped--;
@@ -1859,8 +1858,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 	if (sgp == SGP_WRITE)
 		__SetPageReferenced(page);
 
-	error = mem_cgroup_try_charge_delay(page, charge_mm, gfp, &memcg,
-					    PageTransHuge(page));
+	error = mem_cgroup_try_charge_delay(page, charge_mm, gfp, &memcg);
 	if (error) {
 		if (PageTransHuge(page)) {
 			count_vm_event(THP_FILE_FALLBACK);
@@ -1871,12 +1869,10 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 	error = shmem_add_to_page_cache(page, mapping, hindex,
 					NULL, gfp & GFP_RECLAIM_MASK);
 	if (error) {
-		mem_cgroup_cancel_charge(page, memcg,
-					 PageTransHuge(page));
+		mem_cgroup_cancel_charge(page, memcg);
 		goto unacct;
 	}
-	mem_cgroup_commit_charge(page, memcg, false,
-				 PageTransHuge(page));
+	mem_cgroup_commit_charge(page, memcg, false);
 	lru_cache_add_anon(page);
 
 	spin_lock_irq(&info->lock);
@@ -2364,7 +2360,7 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 	if (unlikely(offset >= max_off))
 		goto out_release;
 
-	ret = mem_cgroup_try_charge_delay(page, dst_mm, gfp, &memcg, false);
+	ret = mem_cgroup_try_charge_delay(page, dst_mm, gfp, &memcg);
 	if (ret)
 		goto out_release;
 
@@ -2373,7 +2369,7 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 	if (ret)
 		goto out_release_uncharge;
 
-	mem_cgroup_commit_charge(page, memcg, false, false);
+	mem_cgroup_commit_charge(page, memcg, false);
 
 	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
 	if (dst_vma->vm_flags & VM_WRITE)
@@ -2424,7 +2420,7 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 	ClearPageDirty(page);
 	delete_from_page_cache(page);
out_release_uncharge:
-	mem_cgroup_cancel_charge(page, memcg, false);
+	mem_cgroup_cancel_charge(page, memcg);
out_release:
 	unlock_page(page);
 	put_page(page);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 0aa9a9dd5d3d..15e5f8f290cc 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1868,15 +1868,14 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 	if (unlikely(!page))
 		return -ENOMEM;
 
-	if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL,
-				  &memcg, false)) {
+	if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL, &memcg)) {
 		ret = -ENOMEM;
 		goto out_nolock;
 	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (unlikely(!pte_same_as_swp(*pte, swp_entry_to_pte(entry)))) {
-		mem_cgroup_cancel_charge(page, memcg, false);
+		mem_cgroup_cancel_charge(page, memcg);
 		ret = 0;
 		goto out;
 	}
@@ -1888,10 +1887,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
 	if (page == swapcache) {
 		page_add_anon_rmap(page, vma, addr, false);
-		mem_cgroup_commit_charge(page, memcg, true, false);
+		mem_cgroup_commit_charge(page, memcg, true);
 	} else { /* ksm created a completely new copy */
 		page_add_new_anon_rmap(page, vma, addr, false);
-		mem_cgroup_commit_charge(page, memcg, false, false);
+		mem_cgroup_commit_charge(page, memcg, false);
 		lru_cache_add_active_or_unevictable(page, vma);
 	}
 	swap_free(entry);
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 512576e171ce..bb57d0a3fca7 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -97,7 +97,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	__SetPageUptodate(page);
 
 	ret = -ENOMEM;
-	if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false))
+	if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg))
 		goto out_release;
 
 	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
@@ -124,7 +124,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	inc_mm_counter(dst_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, dst_vma, dst_addr, false);
-	mem_cgroup_commit_charge(page, memcg, false, false);
+	mem_cgroup_commit_charge(page, memcg, false);
 	lru_cache_add_active_or_unevictable(page, dst_vma);
 
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
@@ -138,7 +138,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	return ret;
out_release_uncharge_unlock:
 	pte_unmap_unlock(dst_pte, ptl);
-	mem_cgroup_cancel_charge(page, memcg, false);
+	mem_cgroup_cancel_charge(page, memcg);
out_release:
 	put_page(page);
 	goto out;

From patchwork Fri May 8 18:30:51 2020
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko,
 "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 04/19] mm: memcontrol: move out cgroup swaprate throttling
Date: Fri, 8 May 2020 14:30:51 -0400
Message-Id: <20200508183105.225460-5-hannes@cmpxchg.org>
In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org>
References: <20200508183105.225460-1-hannes@cmpxchg.org>

The cgroup swaprate throttling is about matching new anon allocations
to the rate of available IO when that is being throttled. It's the io
controller hooking into the VM, rather than a memory controller thing.

Rename mem_cgroup_throttle_swaprate() to cgroup_throttle_swaprate(),
and drop the @memcg argument, which is only used to check whether the
preceding page charge has succeeded and the fault is proceeding.

We could decouple the call from mem_cgroup_try_charge() here as well,
but that would cause unnecessary churn: the following patches convert
all callsites to a new charge API and we'll decouple as we go along.
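
[Editor's illustration, not part of the patch: the resulting call pattern -
charge first, then let the io controller throttle the fault if the swap
device is congested. This mirrors the new mem_cgroup_try_charge_delay()
body in the diff below; charge_and_throttle() itself is a hypothetical
wrapper, sketched with kernel types for readability only.]

	static int charge_and_throttle(struct page *page, struct mm_struct *mm,
				       gfp_t gfp_mask, struct mem_cgroup **memcgp)
	{
		int ret = mem_cgroup_try_charge(page, mm, gfp_mask, memcgp);

		/* Only throttle when the charge succeeded and the fault proceeds. */
		if (*memcgp)
			cgroup_throttle_swaprate(page, gfp_mask);
		return ret;
	}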
Signed-off-by: Johannes Weiner
Reviewed-by: Alex Shi
Reviewed-by: Joonsoo Kim
Reviewed-by: Shakeel Butt
---
 include/linux/swap.h |  6 ++----
 mm/memcontrol.c      |  5 ++---
 mm/swapfile.c        | 14 +++++++-------
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 873bf5206afb..b42fb47d8cbe 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -650,11 +650,9 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
 #endif
 
 #if defined(CONFIG_SWAP) && defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-extern void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
-					 gfp_t gfp_mask);
+extern void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask);
 #else
-static inline void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg,
-						int node, gfp_t gfp_mask)
+static inline void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask)
 {
 }
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 13da46a5d8ae..8188d462d7ce 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6494,12 +6494,11 @@ int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
 int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm,
 			  gfp_t gfp_mask, struct mem_cgroup **memcgp)
 {
-	struct mem_cgroup *memcg;
 	int ret;
 
 	ret = mem_cgroup_try_charge(page, mm, gfp_mask, memcgp);
-	memcg = *memcgp;
-	mem_cgroup_throttle_swaprate(memcg, page_to_nid(page), gfp_mask);
+	if (*memcgp)
+		cgroup_throttle_swaprate(page, gfp_mask);
 	return ret;
 }
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 15e5f8f290cc..ad42eac1822d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3748,11 +3748,12 @@ static void free_swap_count_continuations(struct swap_info_struct *si)
 }
 
 #if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
-				  gfp_t gfp_mask)
+void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask)
 {
 	struct swap_info_struct *si, *next;
-	if (!(gfp_mask & __GFP_IO) || !memcg)
+	int nid = page_to_nid(page);
+
+	if (!(gfp_mask & __GFP_IO))
 		return;
 
 	if (!blk_cgroup_congested())
@@ -3766,11 +3767,10 @@ void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
 		return;
 
 	spin_lock(&swap_avail_lock);
-	plist_for_each_entry_safe(si, next, &swap_avail_heads[node],
-				  avail_lists[node]) {
+	plist_for_each_entry_safe(si, next, &swap_avail_heads[nid],
+				  avail_lists[nid]) {
 		if (si->bdev) {
-			blkcg_schedule_throttle(bdev_get_queue(si->bdev),
-						true);
+			blkcg_schedule_throttle(bdev_get_queue(si->bdev), true);
 			break;
 		}
 	}

From patchwork Fri May 8 18:30:52 2020
From patchwork Fri May 8 18:30:52 2020
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko, "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 05/19] mm: memcontrol: convert page cache to a new mem_cgroup_charge() API
Date: Fri, 8 May 2020 14:30:52 -0400
Message-Id: <20200508183105.225460-6-hannes@cmpxchg.org>

The try/commit/cancel protocol that memcg uses dates back to when pages used to be uncharged upon removal from the page cache, and thus couldn't be committed before the insertion had succeeded. Nowadays, pages are uncharged when they are physically freed; it doesn't matter whether the insertion was successful or not. For the page cache, the transaction dance has become unnecessary.

Introduce a mem_cgroup_charge() function that simply charges a newly allocated page to a cgroup and sets up page->mem_cgroup in one single step. If the insertion fails, the caller doesn't have to do anything but free/put the page. Then switch the page cache over to this new API.

Subsequent patches will also convert anon pages, but that needs a bit more prep work. Right now, memcg depends on page->mapping being already set up at the time of charging, so that it can maintain its own MEMCG_CACHE and MEMCG_RSS counters. For anon, page->mapping is set under the same pte lock under which the page is published, so a single charge point that can block doesn't work there just yet.

The following prep patches will replace the private memcg counters with the generic vmstat counters, thus removing the page->mapping dependency, then complete the transition to the new single-point charge API and delete the old transactional scheme.
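To make the protocol change concrete, here is a schematic before/after of a page cache insertion site, condensed from the mm/filemap.c hunk below; insert_into_page_cache() is a hypothetical stand-in for the xarray insertion loop:

	/* Before: a three-step transaction bracketing the insertion */
	error = mem_cgroup_try_charge(page, current->mm, gfp_mask, &memcg);
	if (error)
		return error;
	if (insert_into_page_cache(page) == 0)
		mem_cgroup_commit_charge(page, memcg, false);
	else
		mem_cgroup_cancel_charge(page, memcg);

	/* After: charge in one step; on failure the caller only unwinds
	 * its own state (reset page->mapping, put_page()), because the
	 * charge is released when the page is freed, not when it leaves
	 * the page cache */
	error = mem_cgroup_charge(page, current->mm, gfp_mask, false);
	if (error)
		goto error;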
v2: leave shmem swapcache when charging fails to avoid double IO (Joonsoo) Signed-off-by: Johannes Weiner Reviewed-by: Alex Shi --- include/linux/memcontrol.h | 10 +++++ mm/filemap.c | 24 +++++------ mm/memcontrol.c | 29 ++++++++++++- mm/shmem.c | 88 ++++++++++++++++---------------------- 4 files changed, 85 insertions(+), 66 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 30292d57c8af..57339514d960 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -379,6 +379,10 @@ int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm, void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, bool lrucare); void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg); + +int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, + bool lrucare); + void mem_cgroup_uncharge(struct page *page); void mem_cgroup_uncharge_list(struct list_head *page_list); @@ -893,6 +897,12 @@ static inline void mem_cgroup_cancel_charge(struct page *page, { } +static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm, + gfp_t gfp_mask, bool lrucare) +{ + return 0; +} + static inline void mem_cgroup_uncharge(struct page *page) { } diff --git a/mm/filemap.c b/mm/filemap.c index ce200386736c..ee9882509566 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -832,7 +832,6 @@ static int __add_to_page_cache_locked(struct page *page, { XA_STATE(xas, &mapping->i_pages, offset); int huge = PageHuge(page); - struct mem_cgroup *memcg; int error; void *old; @@ -840,17 +839,16 @@ static int __add_to_page_cache_locked(struct page *page, VM_BUG_ON_PAGE(PageSwapBacked(page), page); mapping_set_update(&xas, mapping); - if (!huge) { - error = mem_cgroup_try_charge(page, current->mm, - gfp_mask, &memcg); - if (error) - return error; - } - get_page(page); page->mapping = mapping; page->index = offset; + if (!huge) { + error = mem_cgroup_charge(page, current->mm, gfp_mask, false); + if (error) + goto error; + } + do { xas_lock_irq(&xas); old = xas_load(&xas); @@ -874,20 +872,18 @@ static int __add_to_page_cache_locked(struct page *page, xas_unlock_irq(&xas); } while (xas_nomem(&xas, gfp_mask & GFP_RECLAIM_MASK)); - if (xas_error(&xas)) + if (xas_error(&xas)) { + error = xas_error(&xas); goto error; + } - if (!huge) - mem_cgroup_commit_charge(page, memcg, false); trace_mm_filemap_add_to_page_cache(page); return 0; error: page->mapping = NULL; /* Leave page->index set: truncation relies upon it */ - if (!huge) - mem_cgroup_cancel_charge(page, memcg); put_page(page); - return xas_error(&xas); + return error; } ALLOW_ERROR_INJECTION(__add_to_page_cache_locked, ERRNO); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 8188d462d7ce..1d45a09b334f 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6578,6 +6578,33 @@ void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg) cancel_charge(memcg, nr_pages); } +/** + * mem_cgroup_charge - charge a newly allocated page to a cgroup + * @page: page to charge + * @mm: mm context of the victim + * @gfp_mask: reclaim mode + * @lrucare: page might be on the LRU already + * + * Try to charge @page to the memcg that @mm belongs to, reclaiming + * pages according to @gfp_mask if necessary. + * + * Returns 0 on success. Otherwise, an error code is returned. 
+ */ +int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, + bool lrucare) +{ + struct mem_cgroup *memcg; + int ret; + + VM_BUG_ON_PAGE(!page->mapping, page); + + ret = mem_cgroup_try_charge(page, mm, gfp_mask, &memcg); + if (ret) + return ret; + mem_cgroup_commit_charge(page, memcg, lrucare); + return 0; +} + struct uncharge_gather { struct mem_cgroup *memcg; unsigned long pgpgout; @@ -6625,8 +6652,6 @@ static void uncharge_batch(const struct uncharge_gather *ug) static void uncharge_page(struct page *page, struct uncharge_gather *ug) { VM_BUG_ON_PAGE(PageLRU(page), page); - VM_BUG_ON_PAGE(page_count(page) && !is_zone_device_page(page) && - !PageHWPoison(page) , page); if (!page->mem_cgroup) return; diff --git a/mm/shmem.c b/mm/shmem.c index d505b6cce4ab..afd5a057ebb7 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -605,11 +605,13 @@ static inline bool is_huge_enabled(struct shmem_sb_info *sbinfo) */ static int shmem_add_to_page_cache(struct page *page, struct address_space *mapping, - pgoff_t index, void *expected, gfp_t gfp) + pgoff_t index, void *expected, gfp_t gfp, + struct mm_struct *charge_mm) { XA_STATE_ORDER(xas, &mapping->i_pages, index, compound_order(page)); unsigned long i = 0; unsigned long nr = compound_nr(page); + int error; VM_BUG_ON_PAGE(PageTail(page), page); VM_BUG_ON_PAGE(index != round_down(index, nr), page); @@ -621,12 +623,22 @@ static int shmem_add_to_page_cache(struct page *page, page->mapping = mapping; page->index = index; + error = mem_cgroup_charge(page, charge_mm, gfp, PageSwapCache(page)); + if (error) { + if (!PageSwapCache(page) && PageTransHuge(page)) { + count_vm_event(THP_FILE_FALLBACK); + count_vm_event(THP_FILE_FALLBACK_CHARGE); + } + goto error; + } + cgroup_throttle_swaprate(page, gfp); + do { void *entry; xas_lock_irq(&xas); entry = xas_find_conflict(&xas); if (entry != expected) - xas_set_err(&xas, -EEXIST); + xas_set_err(&xas, expected ? -ENOENT : -EEXIST); xas_create_range(&xas); if (xas_error(&xas)) goto unlock; @@ -648,12 +660,15 @@ static int shmem_add_to_page_cache(struct page *page, } while (xas_nomem(&xas, gfp)); if (xas_error(&xas)) { - page->mapping = NULL; - page_ref_sub(page, nr); - return xas_error(&xas); + error = xas_error(&xas); + goto error; } return 0; +error: + page->mapping = NULL; + page_ref_sub(page, nr); + return error; } /* @@ -1619,7 +1634,6 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index, struct address_space *mapping = inode->i_mapping; struct shmem_inode_info *info = SHMEM_I(inode); struct mm_struct *charge_mm = vma ? vma->vm_mm : current->mm; - struct mem_cgroup *memcg; struct page *page; swp_entry_t swap; int error; @@ -1664,29 +1678,23 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index, goto failed; } - error = mem_cgroup_try_charge_delay(page, charge_mm, gfp, &memcg); - if (!error) { - error = shmem_add_to_page_cache(page, mapping, index, - swp_to_radix_entry(swap), gfp); + error = shmem_add_to_page_cache(page, mapping, index, + swp_to_radix_entry(swap), gfp, + charge_mm); + if (error) { /* - * We already confirmed swap under page lock, and make - * no memory allocation here, so usually no possibility - * of error; but free_swap_and_cache() only trylocks a - * page, so it is just possible that the entry has been - * truncated or holepunched since swap was confirmed. 
+ * We already confirmed swap under page lock, but + * free_swap_and_cache() only trylocks a page, so it + * is just possible that the entry has been truncated + * or holepunched since swap was confirmed. * shmem_undo_range() will have done some of the * unaccounting, now delete_from_swap_cache() will do * the rest. */ - if (error) { - mem_cgroup_cancel_charge(page, memcg); + if (error == -ENOENT) delete_from_swap_cache(page); - } - } - if (error) goto failed; - - mem_cgroup_commit_charge(page, memcg, true); + } spin_lock_irq(&info->lock); info->swapped--; @@ -1733,7 +1741,6 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index, struct shmem_inode_info *info = SHMEM_I(inode); struct shmem_sb_info *sbinfo; struct mm_struct *charge_mm; - struct mem_cgroup *memcg; struct page *page; enum sgp_type sgp_huge = sgp; pgoff_t hindex = index; @@ -1858,21 +1865,11 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index, if (sgp == SGP_WRITE) __SetPageReferenced(page); - error = mem_cgroup_try_charge_delay(page, charge_mm, gfp, &memcg); - if (error) { - if (PageTransHuge(page)) { - count_vm_event(THP_FILE_FALLBACK); - count_vm_event(THP_FILE_FALLBACK_CHARGE); - } - goto unacct; - } error = shmem_add_to_page_cache(page, mapping, hindex, - NULL, gfp & GFP_RECLAIM_MASK); - if (error) { - mem_cgroup_cancel_charge(page, memcg); + NULL, gfp & GFP_RECLAIM_MASK, + charge_mm); + if (error) goto unacct; - } - mem_cgroup_commit_charge(page, memcg, false); lru_cache_add_anon(page); spin_lock_irq(&info->lock); @@ -2310,7 +2307,6 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm, struct address_space *mapping = inode->i_mapping; gfp_t gfp = mapping_gfp_mask(mapping); pgoff_t pgoff = linear_page_index(dst_vma, dst_addr); - struct mem_cgroup *memcg; spinlock_t *ptl; void *page_kaddr; struct page *page; @@ -2360,16 +2356,10 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm, if (unlikely(offset >= max_off)) goto out_release; - ret = mem_cgroup_try_charge_delay(page, dst_mm, gfp, &memcg); - if (ret) - goto out_release; - ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL, - gfp & GFP_RECLAIM_MASK); + gfp & GFP_RECLAIM_MASK, dst_mm); if (ret) - goto out_release_uncharge; - - mem_cgroup_commit_charge(page, memcg, false); + goto out_release; _dst_pte = mk_pte(page, dst_vma->vm_page_prot); if (dst_vma->vm_flags & VM_WRITE) @@ -2390,11 +2380,11 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm, ret = -EFAULT; max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); if (unlikely(offset >= max_off)) - goto out_release_uncharge_unlock; + goto out_release_unlock; ret = -EEXIST; if (!pte_none(*dst_pte)) - goto out_release_uncharge_unlock; + goto out_release_unlock; lru_cache_add_anon(page); @@ -2415,12 +2405,10 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm, ret = 0; out: return ret; -out_release_uncharge_unlock: +out_release_unlock: pte_unmap_unlock(dst_pte, ptl); ClearPageDirty(page); delete_from_page_cache(page); -out_release_uncharge: - mem_cgroup_cancel_charge(page, memcg); out_release: unlock_page(page); put_page(page); From patchwork Fri May 8 18:30:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537383 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B55515AB for ; Fri, 8 May 2020 18:32:30 +0000 (UTC) Received: from 
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko, "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 06/19] mm: memcontrol: prepare uncharging for removal of private page type counters
Date: Fri, 8 May 2020 14:30:53 -0400
Message-Id: <20200508183105.225460-7-hannes@cmpxchg.org>

The uncharge batching code adds up the anon, file, kmem counts to determine the total number of pages to uncharge and references to drop. But the next patches will remove the anon and file counters. Maintain an aggregate nr_pages in the uncharge_gather struct.
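Schematically, the change keeps a running total alongside the type-specific counts (a condensed sketch of the diff below, with unrelated fields omitted):

	struct uncharge_gather {
		struct mem_cgroup *memcg;
		unsigned long nr_pages;		/* new: aggregate total */
		unsigned long nr_anon;		/* these per-type counts */
		unsigned long nr_file;		/* are about to go away */
		unsigned long nr_kmem;
		/* ... */
	};

	static void uncharge_page(struct page *page, struct uncharge_gather *ug)
	{
		unsigned long nr_pages = compound_nr(page);

		ug->nr_pages += nr_pages;	/* counted once, for any page type */
		/* ... type-specific accounting continues as before ... */
	}

uncharge_batch() can then use ug->nr_pages directly instead of recomputing nr_anon + nr_file + nr_kmem, which keeps it correct once those fields disappear.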
Signed-off-by: Johannes Weiner Reviewed-by: Alex Shi Reviewed-by: Joonsoo Kim --- mm/memcontrol.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1d45a09b334f..a5efdad77be4 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6607,6 +6607,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, struct uncharge_gather { struct mem_cgroup *memcg; + unsigned long nr_pages; unsigned long pgpgout; unsigned long nr_anon; unsigned long nr_file; @@ -6623,13 +6624,12 @@ static inline void uncharge_gather_clear(struct uncharge_gather *ug) static void uncharge_batch(const struct uncharge_gather *ug) { - unsigned long nr_pages = ug->nr_anon + ug->nr_file + ug->nr_kmem; unsigned long flags; if (!mem_cgroup_is_root(ug->memcg)) { - page_counter_uncharge(&ug->memcg->memory, nr_pages); + page_counter_uncharge(&ug->memcg->memory, ug->nr_pages); if (do_memsw_account()) - page_counter_uncharge(&ug->memcg->memsw, nr_pages); + page_counter_uncharge(&ug->memcg->memsw, ug->nr_pages); if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && ug->nr_kmem) page_counter_uncharge(&ug->memcg->kmem, ug->nr_kmem); memcg_oom_recover(ug->memcg); @@ -6641,16 +6641,18 @@ static void uncharge_batch(const struct uncharge_gather *ug) __mod_memcg_state(ug->memcg, MEMCG_RSS_HUGE, -ug->nr_huge); __mod_memcg_state(ug->memcg, NR_SHMEM, -ug->nr_shmem); __count_memcg_events(ug->memcg, PGPGOUT, ug->pgpgout); - __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, nr_pages); + __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages); memcg_check_events(ug->memcg, ug->dummy_page); local_irq_restore(flags); if (!mem_cgroup_is_root(ug->memcg)) - css_put_many(&ug->memcg->css, nr_pages); + css_put_many(&ug->memcg->css, ug->nr_pages); } static void uncharge_page(struct page *page, struct uncharge_gather *ug) { + unsigned long nr_pages; + VM_BUG_ON_PAGE(PageLRU(page), page); if (!page->mem_cgroup) @@ -6670,13 +6672,12 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug) ug->memcg = page->mem_cgroup; } - if (!PageKmemcg(page)) { - unsigned int nr_pages = 1; + nr_pages = compound_nr(page); + ug->nr_pages += nr_pages; - if (PageTransHuge(page)) { - nr_pages = compound_nr(page); + if (!PageKmemcg(page)) { + if (PageTransHuge(page)) ug->nr_huge += nr_pages; - } if (PageAnon(page)) ug->nr_anon += nr_pages; else { @@ -6686,7 +6687,7 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug) } ug->pgpgout++; } else { - ug->nr_kmem += compound_nr(page); + ug->nr_kmem += nr_pages; __ClearPageKmemcg(page); } From patchwork Fri May 8 18:30:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537385 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4480C15AB for ; Fri, 8 May 2020 18:32:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 123272192A for ; Fri, 8 May 2020 18:32:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=cmpxchg-org.20150623.gappssmtp.com header.i=@cmpxchg-org.20150623.gappssmtp.com header.b="CmwijAWG" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 123272192A Authentication-Results: mail.kernel.org; 
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko, "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 07/19] mm: memcontrol: prepare move_account for removal of private page type counters
Date: Fri, 8 May 2020 14:30:54 -0400
Message-Id: <20200508183105.225460-8-hannes@cmpxchg.org>

When memcg uses the generic vmstat counters, it doesn't need to do anything at charging and uncharging time. It does, however, need to migrate counts when pages move to a different cgroup in move_account.

Prepare the move_account function for the arrival of NR_FILE_PAGES, NR_ANON_MAPPED, NR_ANON_THPS etc. by having a branch for files and a branch for anon, which can then be divided into sub-branches.

Signed-off-by: Johannes Weiner Reviewed-by: Alex Shi Reviewed-by: Joonsoo Kim --- mm/memcontrol.c | 25 +++++++++++++------------- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a5efdad77be4..fe4212db8411 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5378,7 +5378,6 @@ static int mem_cgroup_move_account(struct page *page, struct pglist_data *pgdat; unsigned int nr_pages = compound ?
hpage_nr_pages(page) : 1; int ret; - bool anon; VM_BUG_ON(from == to); VM_BUG_ON_PAGE(PageLRU(page), page); @@ -5396,25 +5395,27 @@ static int mem_cgroup_move_account(struct page *page, if (page->mem_cgroup != from) goto out_unlock; - anon = PageAnon(page); - pgdat = page_pgdat(page); from_vec = mem_cgroup_lruvec(from, pgdat); to_vec = mem_cgroup_lruvec(to, pgdat); lock_page_memcg(page); - if (!anon && page_mapped(page)) { - __mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages); - __mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages); - } + if (!PageAnon(page)) { + if (page_mapped(page)) { + __mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages); + __mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages); + } - if (!anon && PageDirty(page)) { - struct address_space *mapping = page_mapping(page); + if (PageDirty(page)) { + struct address_space *mapping = page_mapping(page); - if (mapping_cap_account_dirty(mapping)) { - __mod_lruvec_state(from_vec, NR_FILE_DIRTY, -nr_pages); - __mod_lruvec_state(to_vec, NR_FILE_DIRTY, nr_pages); + if (mapping_cap_account_dirty(mapping)) { + __mod_lruvec_state(from_vec, NR_FILE_DIRTY, + -nr_pages); + __mod_lruvec_state(to_vec, NR_FILE_DIRTY, + nr_pages); + } } }
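After this patch, the accounting section of mem_cgroup_move_account() has the following shape; the commented slots mark where the later patches in this series insert the NR_FILE_PAGES, NR_SHMEM and NR_ANON_MAPPED transfers (a structural sketch only, see the hunk above for the real code):

	lock_page_memcg(page);

	if (!PageAnon(page)) {
		/* file branch: NR_FILE_PAGES, NR_SHMEM slot in here later */
		if (page_mapped(page)) {
			/* move NR_FILE_MAPPED from -> to */
		}
		if (PageDirty(page)) {
			/* move NR_FILE_DIRTY from -> to, if the mapping
			 * accounts dirty pages */
		}
	} else {
		/* anon branch: NR_ANON_MAPPED, NR_ANON_THPS slot in here */
	}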
From patchwork Fri May 8 18:30:55 2020
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko, "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 08/19] mm: memcontrol: prepare cgroup vmstat infrastructure for native anon counters
Date: Fri, 8 May 2020 14:30:55 -0400
Message-Id: <20200508183105.225460-9-hannes@cmpxchg.org>

Anonymous compound pages can be mapped by ptes, which means that if we want to track NR_ANON_MAPPED and NR_ANON_THPS on a per-cgroup basis, we have to be prepared to see tail pages in our accounting functions.

Make mod_lruvec_page_state() and lock_page_memcg() deal with tail pages correctly, namely by redirecting to the head page, which has the page->mem_cgroup set up.

Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- include/linux/memcontrol.h | 5 +++-- mm/memcontrol.c | 9 ++++++--- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 57339514d960..5b110ac7dd83 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -723,16 +723,17 @@ static inline void mod_lruvec_state(struct lruvec *lruvec, static inline void __mod_lruvec_page_state(struct page *page, enum node_stat_item idx, int val) { + struct page *head = compound_head(page); /* rmap on tail pages */ pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; /* Untracked pages have no memcg, no lruvec. Update only the node */ - if (!page->mem_cgroup) { + if (!head->mem_cgroup) { __mod_node_page_state(pgdat, idx, val); return; } - lruvec = mem_cgroup_lruvec(page->mem_cgroup, pgdat); + lruvec = mem_cgroup_lruvec(head->mem_cgroup, pgdat); __mod_lruvec_state(lruvec, idx, val); } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index fe4212db8411..b7be4cd6ddc5 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1979,6 +1979,7 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg) */ struct mem_cgroup *lock_page_memcg(struct page *page) { + struct page *head = compound_head(page); /* rmap on tail pages */ struct mem_cgroup *memcg; unsigned long flags; @@ -1998,7 +1999,7 @@ struct mem_cgroup *lock_page_memcg(struct page *page) if (mem_cgroup_disabled()) return NULL; again: - memcg = page->mem_cgroup; + memcg = head->mem_cgroup; if (unlikely(!memcg)) return NULL; @@ -2006,7 +2007,7 @@ struct mem_cgroup *lock_page_memcg(struct page *page) return memcg; spin_lock_irqsave(&memcg->move_lock, flags); - if (memcg != page->mem_cgroup) { + if (memcg != head->mem_cgroup) { spin_unlock_irqrestore(&memcg->move_lock, flags); goto again; } @@ -2049,7 +2050,9 @@ void __unlock_page_memcg(struct mem_cgroup *memcg) */ void unlock_page_memcg(struct page *page) { - __unlock_page_memcg(page->mem_cgroup); + struct page *head = compound_head(page); + + __unlock_page_memcg(head->mem_cgroup); } EXPORT_SYMBOL(unlock_page_memcg);
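The invariant both helpers now rely on: only the head page of a compound page carries page->mem_cgroup, so any accounting path that can be handed a tail page (rmap on a pte-mapped THP, for example) must redirect first. The pattern, repeated in both hunks above:

	struct page *head = compound_head(page);	/* no-op for base pages */

	/* head == page for order-0 pages and for the head of a THP;
	 * for a tail page it points at the head, which has the
	 * page->mem_cgroup that tail pages lack */
	memcg = head->mem_cgroup;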
From patchwork Fri May 8 18:30:56 2020
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko, "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 09/19] mm: memcontrol: switch to native NR_FILE_PAGES and NR_SHMEM counters
Date: Fri, 8 May 2020 14:30:56 -0400
Message-Id: <20200508183105.225460-10-hannes@cmpxchg.org>

Memcg maintains private MEMCG_CACHE and NR_SHMEM counters. This divergence from the generic VM accounting means unnecessary code overhead, and creates a dependency for memcg that page->mapping is set up at the time of charging, so that page types can be told apart.

Convert the generic accounting sites to mod_lruvec_page_state and friends to maintain the per-cgroup vmstat counters of NR_FILE_PAGES and NR_SHMEM. The page is already locked in these places, so page->mem_cgroup is stable; we only need minimal tweaks of two mem_cgroup_migrate() calls to ensure it's set up in time.

Then replace MEMCG_CACHE with NR_FILE_PAGES and delete the private NR_SHMEM accounting sites.
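The conversion pattern repeats across the diff below: replace the node-only update with a single lruvec update that feeds the node, the memcg and the per-memcg-per-node views at once (schematic, with names as in the hunks):

	/* Before: generic VM accounting at the node level only ... */
	__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
	/* ... while memcg tracked a private MEMCG_CACHE count at charge
	 * and uncharge time, far away from this call site */

	/* After: one call keeps all three levels coherent */
	__mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);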
Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- include/linux/memcontrol.h | 3 +-- mm/filemap.c | 17 +++++++++-------- mm/khugepaged.c | 16 +++++++++++----- mm/memcontrol.c | 28 +++++++++++----------------- mm/migrate.c | 15 +++++++++++---- mm/shmem.c | 14 +++++++------- 6 files changed, 50 insertions(+), 43 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 5b110ac7dd83..f932e7e9fad8 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -29,8 +29,7 @@ struct kmem_cache; /* Cgroup-specific page state, on top of universal node page state */ enum memcg_stat_item { - MEMCG_CACHE = NR_VM_NODE_STAT_ITEMS, - MEMCG_RSS, + MEMCG_RSS = NR_VM_NODE_STAT_ITEMS, MEMCG_RSS_HUGE, MEMCG_SWAP, MEMCG_SOCK, diff --git a/mm/filemap.c b/mm/filemap.c index ee9882509566..d5b6e3d7d402 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -199,9 +199,9 @@ static void unaccount_page_cache_page(struct address_space *mapping, nr = hpage_nr_pages(page); - __mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr); + __mod_lruvec_page_state(page, NR_FILE_PAGES, -nr); if (PageSwapBacked(page)) { - __mod_node_page_state(page_pgdat(page), NR_SHMEM, -nr); + __mod_lruvec_page_state(page, NR_SHMEM, -nr); if (PageTransHuge(page)) __dec_node_page_state(page, NR_SHMEM_THPS); } else if (PageTransHuge(page)) { @@ -802,21 +802,22 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask) new->mapping = mapping; new->index = offset; + mem_cgroup_migrate(old, new); + xas_lock_irqsave(&xas, flags); xas_store(&xas, new); old->mapping = NULL; /* hugetlb pages do not participate in page cache accounting. */ if (!PageHuge(old)) - __dec_node_page_state(old, NR_FILE_PAGES); + __dec_lruvec_page_state(old, NR_FILE_PAGES); if (!PageHuge(new)) - __inc_node_page_state(new, NR_FILE_PAGES); + __inc_lruvec_page_state(new, NR_FILE_PAGES); if (PageSwapBacked(old)) - __dec_node_page_state(old, NR_SHMEM); + __dec_lruvec_page_state(old, NR_SHMEM); if (PageSwapBacked(new)) - __inc_node_page_state(new, NR_SHMEM); + __inc_lruvec_page_state(new, NR_SHMEM); xas_unlock_irqrestore(&xas, flags); - mem_cgroup_migrate(old, new); if (freepage) freepage(old); put_page(old); @@ -867,7 +868,7 @@ static int __add_to_page_cache_locked(struct page *page, /* hugetlb pages do not participate in page cache accounting */ if (!huge) - __inc_node_page_state(page, NR_FILE_PAGES); + __inc_lruvec_page_state(page, NR_FILE_PAGES); unlock: xas_unlock_irq(&xas); } while (xas_nomem(&xas, gfp_mask & GFP_RECLAIM_MASK)); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index b73d2af6d11a..e2be7f9a92db 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1834,12 +1834,18 @@ static void collapse_file(struct mm_struct *mm, } if (nr_none) { - struct zone *zone = page_zone(new_page); - - __mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none); + struct lruvec *lruvec; + /* + * XXX: We have started try_charge and pinned the + * memcg, but the page isn't committed yet so we + * cannot use mod_lruvec_page_state(). This hackery + * will be cleaned up when remove the page->mapping + * dependency from memcg and fully charge above. 
+ */ + lruvec = mem_cgroup_lruvec(memcg, page_pgdat(new_page)); + __mod_lruvec_state(lruvec, NR_FILE_PAGES, nr_none); if (is_shmem) - __mod_node_page_state(zone->zone_pgdat, - NR_SHMEM, nr_none); + __mod_lruvec_state(lruvec, NR_SHMEM, nr_none); } xa_locked: diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b7be4cd6ddc5..c4c060ce1876 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -842,11 +842,6 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg, */ if (PageAnon(page)) __mod_memcg_state(memcg, MEMCG_RSS, nr_pages); - else { - __mod_memcg_state(memcg, MEMCG_CACHE, nr_pages); - if (PageSwapBacked(page)) - __mod_memcg_state(memcg, NR_SHMEM, nr_pages); - } if (abs(nr_pages) > 1) { VM_BUG_ON_PAGE(!PageTransHuge(page), page); @@ -1392,7 +1387,7 @@ static char *memory_stat_format(struct mem_cgroup *memcg) (u64)memcg_page_state(memcg, MEMCG_RSS) * PAGE_SIZE); seq_buf_printf(&s, "file %llu\n", - (u64)memcg_page_state(memcg, MEMCG_CACHE) * + (u64)memcg_page_state(memcg, NR_FILE_PAGES) * PAGE_SIZE); seq_buf_printf(&s, "kernel_stack %llu\n", (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) * @@ -3302,7 +3297,7 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap) unsigned long val; if (mem_cgroup_is_root(memcg)) { - val = memcg_page_state(memcg, MEMCG_CACHE) + + val = memcg_page_state(memcg, NR_FILE_PAGES) + memcg_page_state(memcg, MEMCG_RSS); if (swap) val += memcg_page_state(memcg, MEMCG_SWAP); @@ -3773,7 +3768,7 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v) #endif /* CONFIG_NUMA */ static const unsigned int memcg1_stats[] = { - MEMCG_CACHE, + NR_FILE_PAGES, MEMCG_RSS, MEMCG_RSS_HUGE, NR_SHMEM, @@ -5405,6 +5400,14 @@ static int mem_cgroup_move_account(struct page *page, lock_page_memcg(page); if (!PageAnon(page)) { + __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages); + __mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages); + + if (PageSwapBacked(page)) { + __mod_lruvec_state(from_vec, NR_SHMEM, -nr_pages); + __mod_lruvec_state(to_vec, NR_SHMEM, nr_pages); + } + if (page_mapped(page)) { __mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages); __mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages); @@ -6614,10 +6617,8 @@ struct uncharge_gather { unsigned long nr_pages; unsigned long pgpgout; unsigned long nr_anon; - unsigned long nr_file; unsigned long nr_kmem; unsigned long nr_huge; - unsigned long nr_shmem; struct page *dummy_page; }; @@ -6641,9 +6642,7 @@ static void uncharge_batch(const struct uncharge_gather *ug) local_irq_save(flags); __mod_memcg_state(ug->memcg, MEMCG_RSS, -ug->nr_anon); - __mod_memcg_state(ug->memcg, MEMCG_CACHE, -ug->nr_file); __mod_memcg_state(ug->memcg, MEMCG_RSS_HUGE, -ug->nr_huge); - __mod_memcg_state(ug->memcg, NR_SHMEM, -ug->nr_shmem); __count_memcg_events(ug->memcg, PGPGOUT, ug->pgpgout); __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages); memcg_check_events(ug->memcg, ug->dummy_page); @@ -6684,11 +6683,6 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug) ug->nr_huge += nr_pages; if (PageAnon(page)) ug->nr_anon += nr_pages; - else { - ug->nr_file += nr_pages; - if (PageSwapBacked(page)) - ug->nr_shmem += nr_pages; - } ug->pgpgout++; } else { ug->nr_kmem += nr_pages; diff --git a/mm/migrate.c b/mm/migrate.c index 50c7a08f8f31..3af5447e7aca 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -490,11 +490,18 @@ int migrate_page_move_mapping(struct address_space *mapping, * are mapped to swap space. 
*/ if (newzone != oldzone) { - __dec_node_state(oldzone->zone_pgdat, NR_FILE_PAGES); - __inc_node_state(newzone->zone_pgdat, NR_FILE_PAGES); + struct lruvec *old_lruvec, *new_lruvec; + struct mem_cgroup *memcg; + + memcg = page_memcg(page); + old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat); + new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat); + + __dec_lruvec_state(old_lruvec, NR_FILE_PAGES); + __inc_lruvec_state(new_lruvec, NR_FILE_PAGES); if (PageSwapBacked(page) && !PageSwapCache(page)) { - __dec_node_state(oldzone->zone_pgdat, NR_SHMEM); - __inc_node_state(newzone->zone_pgdat, NR_SHMEM); + __dec_lruvec_state(old_lruvec, NR_SHMEM); + __inc_lruvec_state(new_lruvec, NR_SHMEM); } if (dirty && mapping_cap_account_dirty(mapping)) { __dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY); diff --git a/mm/shmem.c b/mm/shmem.c index afd5a057ebb7..d0306a36f42c 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -653,8 +653,8 @@ static int shmem_add_to_page_cache(struct page *page, __inc_node_page_state(page, NR_SHMEM_THPS); } mapping->nrpages += nr; - __mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr); - __mod_node_page_state(page_pgdat(page), NR_SHMEM, nr); + __mod_lruvec_page_state(page, NR_FILE_PAGES, nr); + __mod_lruvec_page_state(page, NR_SHMEM, nr); unlock: xas_unlock_irq(&xas); } while (xas_nomem(&xas, gfp)); @@ -685,8 +685,8 @@ static void shmem_delete_from_page_cache(struct page *page, void *radswap) error = shmem_replace_entry(mapping, page->index, page, radswap); page->mapping = NULL; mapping->nrpages--; - __dec_node_page_state(page, NR_FILE_PAGES); - __dec_node_page_state(page, NR_SHMEM); + __dec_lruvec_page_state(page, NR_FILE_PAGES); + __dec_lruvec_page_state(page, NR_SHMEM); xa_unlock_irq(&mapping->i_pages); put_page(page); BUG_ON(error); @@ -1593,8 +1593,9 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp, xa_lock_irq(&swap_mapping->i_pages); error = shmem_replace_entry(swap_mapping, swap_index, oldpage, newpage); if (!error) { - __inc_node_page_state(newpage, NR_FILE_PAGES); - __dec_node_page_state(oldpage, NR_FILE_PAGES); + mem_cgroup_migrate(oldpage, newpage); + __inc_lruvec_page_state(newpage, NR_FILE_PAGES); + __dec_lruvec_page_state(oldpage, NR_FILE_PAGES); } xa_unlock_irq(&swap_mapping->i_pages); @@ -1606,7 +1607,6 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp, */ oldpage = newpage; } else { - mem_cgroup_migrate(oldpage, newpage); lru_cache_add_anon(newpage); *pagep = newpage; } From patchwork Fri May 8 18:30:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537391 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7FE2215E6 for ; Fri, 8 May 2020 18:32:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2558E24953 for ; Fri, 8 May 2020 18:32:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=cmpxchg-org.20150623.gappssmtp.com header.i=@cmpxchg-org.20150623.gappssmtp.com header.b="vO6yWWOm" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2558E24953 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cmpxchg.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org 
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi, Joonsoo Kim, Shakeel Butt, Hugh Dickins, Michal Hocko, "Kirill A. Shutemov", Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 10/19] mm: memcontrol: switch to native NR_ANON_MAPPED counter
Date: Fri, 8 May 2020 14:30:57 -0400
Message-Id: <20200508183105.225460-11-hannes@cmpxchg.org>

Memcg maintains a private MEMCG_RSS counter. This divergence from the generic VM accounting means unnecessary code overhead, and creates a dependency for memcg that page->mapping is set up at the time of charging, so that page types can be told apart.

Convert the generic accounting sites to mod_lruvec_page_state and friends to maintain the per-cgroup vmstat counter of NR_ANON_MAPPED. We use lock_page_memcg() to stabilize page->mem_cgroup during rmap changes, the same way we do for NR_FILE_MAPPED.

With the previous patch removing MEMCG_CACHE and the private NR_SHMEM counter, this patch finally eliminates the need to have page->mapping set up at charge time. However, we need to have page->mem_cgroup set up by the time rmap runs and does the accounting, so switch the commit and the rmap callbacks around.

v2: fix temporary accounting bug by switching rmap<->commit (Joonsoo)

Signed-off-by: Johannes Weiner --- include/linux/memcontrol.h | 3 +-- kernel/events/uprobes.c | 2 +- mm/huge_memory.c | 2 +- mm/khugepaged.c | 2 +- mm/memcontrol.c | 27 ++++++++-------------- mm/memory.c | 10 ++++---- mm/migrate.c | 2 +- mm/rmap.c | 47 +++++++++++++++++++++++--------------- mm/swapfile.c | 4 ++-- mm/userfaultfd.c | 2 +- 10 files changed, 51 insertions(+), 50 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index f932e7e9fad8..2df978a3a253 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -29,8 +29,7 @@ struct kmem_cache; /* Cgroup-specific page state, on top of universal node page state */ enum memcg_stat_item { - MEMCG_RSS = NR_VM_NODE_STAT_ITEMS, - MEMCG_RSS_HUGE, + MEMCG_RSS_HUGE = NR_VM_NODE_STAT_ITEMS, MEMCG_SWAP, MEMCG_SOCK, /* XXX: why are these zone and not node counters?
*/ diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 40e7488ce467..89ef81b65bcb 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -188,8 +188,8 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, if (new_page) { get_page(new_page); - page_add_new_anon_rmap(new_page, vma, addr, false); mem_cgroup_commit_charge(new_page, memcg, false); + page_add_new_anon_rmap(new_page, vma, addr, false); lru_cache_add_active_or_unevictable(new_page, vma); } else /* no new page, just dec_mm_counter for old_page */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 46c2bc20b7cb..07c012d89570 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -640,8 +640,8 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, entry = mk_huge_pmd(page, vma->vm_page_prot); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); - page_add_new_anon_rmap(page, vma, haddr, true); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, vma, haddr, true); lru_cache_add_active_or_unevictable(page, vma); pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index e2be7f9a92db..be67ebe8a120 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1182,8 +1182,8 @@ static void collapse_huge_page(struct mm_struct *mm, spin_lock(pmd_ptl); BUG_ON(!pmd_none(*pmd)); - page_add_new_anon_rmap(new_page, vma, address, true); mem_cgroup_commit_charge(new_page, memcg, false); + page_add_new_anon_rmap(new_page, vma, address, true); count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1); lru_cache_add_active_or_unevictable(new_page, vma); pgtable_trans_huge_deposit(mm, pmd, pgtable); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index c4c060ce1876..fccb396ed7bd 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -836,13 +836,6 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg, struct page *page, int nr_pages) { - /* - * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is - * counted as CACHE even if it's on ANON LRU. 
- */ - if (PageAnon(page)) - __mod_memcg_state(memcg, MEMCG_RSS, nr_pages); - if (abs(nr_pages) > 1) { VM_BUG_ON_PAGE(!PageTransHuge(page), page); __mod_memcg_state(memcg, MEMCG_RSS_HUGE, nr_pages); @@ -1384,7 +1377,7 @@ static char *memory_stat_format(struct mem_cgroup *memcg) */ seq_buf_printf(&s, "anon %llu\n", - (u64)memcg_page_state(memcg, MEMCG_RSS) * + (u64)memcg_page_state(memcg, NR_ANON_MAPPED) * PAGE_SIZE); seq_buf_printf(&s, "file %llu\n", (u64)memcg_page_state(memcg, NR_FILE_PAGES) * @@ -3298,7 +3291,7 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap) if (mem_cgroup_is_root(memcg)) { val = memcg_page_state(memcg, NR_FILE_PAGES) + - memcg_page_state(memcg, MEMCG_RSS); + memcg_page_state(memcg, NR_ANON_MAPPED); if (swap) val += memcg_page_state(memcg, MEMCG_SWAP); } else { @@ -3769,7 +3762,7 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v) static const unsigned int memcg1_stats[] = { NR_FILE_PAGES, - MEMCG_RSS, + NR_ANON_MAPPED, MEMCG_RSS_HUGE, NR_SHMEM, NR_FILE_MAPPED, @@ -5399,7 +5392,12 @@ static int mem_cgroup_move_account(struct page *page, lock_page_memcg(page); - if (!PageAnon(page)) { + if (PageAnon(page)) { + if (page_mapped(page)) { + __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages); + __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages); + } + } else { __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages); __mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages); @@ -6530,7 +6528,6 @@ void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, { unsigned int nr_pages = hpage_nr_pages(page); - VM_BUG_ON_PAGE(!page->mapping, page); VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page); if (mem_cgroup_disabled()) @@ -6603,8 +6600,6 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, struct mem_cgroup *memcg; int ret; - VM_BUG_ON_PAGE(!page->mapping, page); - ret = mem_cgroup_try_charge(page, mm, gfp_mask, &memcg); if (ret) return ret; @@ -6616,7 +6611,6 @@ struct uncharge_gather { struct mem_cgroup *memcg; unsigned long nr_pages; unsigned long pgpgout; - unsigned long nr_anon; unsigned long nr_kmem; unsigned long nr_huge; struct page *dummy_page; @@ -6641,7 +6635,6 @@ static void uncharge_batch(const struct uncharge_gather *ug) } local_irq_save(flags); - __mod_memcg_state(ug->memcg, MEMCG_RSS, -ug->nr_anon); __mod_memcg_state(ug->memcg, MEMCG_RSS_HUGE, -ug->nr_huge); __count_memcg_events(ug->memcg, PGPGOUT, ug->pgpgout); __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages); @@ -6681,8 +6674,6 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug) if (!PageKmemcg(page)) { if (PageTransHuge(page)) ug->nr_huge += nr_pages; - if (PageAnon(page)) - ug->nr_anon += nr_pages; ug->pgpgout++; } else { ug->nr_kmem += nr_pages; diff --git a/mm/memory.c b/mm/memory.c index a08cbaa81607..46c3e5dc918d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2710,8 +2710,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * thread doing COW. 
*/ ptep_clear_flush_notify(vma, vmf->address, vmf->pte); - page_add_new_anon_rmap(new_page, vma, vmf->address, false); mem_cgroup_commit_charge(new_page, memcg, false); + page_add_new_anon_rmap(new_page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(new_page, vma); /* * We call the notify macro here because, when using secondary @@ -3243,12 +3243,12 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) /* ksm created a completely new copy */ if (unlikely(page != swapcache && swapcache)) { - page_add_new_anon_rmap(page, vma, vmf->address, false); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(page, vma); } else { - do_page_add_anon_rmap(page, vma, vmf->address, exclusive); mem_cgroup_commit_charge(page, memcg, true); + do_page_add_anon_rmap(page, vma, vmf->address, exclusive); activate_page(page); } @@ -3390,8 +3390,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) } inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); - page_add_new_anon_rmap(page, vma, vmf->address, false); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(page, vma); setpte: set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); @@ -3652,8 +3652,8 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg, /* copy-on-write page */ if (write && !(vma->vm_flags & VM_SHARED)) { inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); - page_add_new_anon_rmap(page, vma, vmf->address, false); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(page, vma); } else { inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page)); diff --git a/mm/migrate.c b/mm/migrate.c index 3af5447e7aca..e84fb5b87a85 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2838,8 +2838,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, goto unlock_abort; inc_mm_counter(mm, MM_ANONPAGES); - page_add_new_anon_rmap(page, vma, addr, false); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, vma, addr, false); if (!is_zone_device_page(page)) lru_cache_add_active_or_unevictable(page, vma); get_page(page); diff --git a/mm/rmap.c b/mm/rmap.c index 2126fd4a254b..e96f1d099c3f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1114,6 +1114,11 @@ void do_page_add_anon_rmap(struct page *page, bool compound = flags & RMAP_COMPOUND; bool first; + if (unlikely(PageKsm(page))) + lock_page_memcg(page); + else + VM_BUG_ON_PAGE(!PageLocked(page), page); + if (compound) { atomic_t *mapcount; VM_BUG_ON_PAGE(!PageLocked(page), page); @@ -1134,12 +1139,13 @@ void do_page_add_anon_rmap(struct page *page, */ if (compound) __inc_node_page_state(page, NR_ANON_THPS); - __mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, nr); + __mod_lruvec_page_state(page, NR_ANON_MAPPED, nr); } - if (unlikely(PageKsm(page))) - return; - VM_BUG_ON_PAGE(!PageLocked(page), page); + if (unlikely(PageKsm(page))) { + unlock_page_memcg(page); + return; + } /* address might be in next vma when migration races vma_adjust */ if (first) @@ -1181,7 +1187,7 @@ void page_add_new_anon_rmap(struct page *page, /* increment count (starts at -1) */ atomic_set(&page->_mapcount, 0); } - __mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, nr); + __mod_lruvec_page_state(page, NR_ANON_MAPPED, nr); __page_set_anon_rmap(page, vma, address, 1); } @@ -1230,13 +1236,12 @@ static void 
page_remove_file_rmap(struct page *page, bool compound) int i, nr = 1; VM_BUG_ON_PAGE(compound && !PageHead(page), page); - lock_page_memcg(page); /* Hugepages are not counted in NR_FILE_MAPPED for now. */ if (unlikely(PageHuge(page))) { /* hugetlb pages are always mapped with pmds */ atomic_dec(compound_mapcount_ptr(page)); - goto out; + return; } /* page still mapped by someone else? */ @@ -1246,14 +1251,14 @@ static void page_remove_file_rmap(struct page *page, bool compound) nr++; } if (!atomic_add_negative(-1, compound_mapcount_ptr(page))) - goto out; + return; if (PageSwapBacked(page)) __dec_node_page_state(page, NR_SHMEM_PMDMAPPED); else __dec_node_page_state(page, NR_FILE_PMDMAPPED); } else { if (!atomic_add_negative(-1, &page->_mapcount)) - goto out; + return; } /* @@ -1265,8 +1270,6 @@ static void page_remove_file_rmap(struct page *page, bool compound) if (unlikely(PageMlocked(page))) clear_page_mlock(page); -out: - unlock_page_memcg(page); } static void page_remove_anon_compound_rmap(struct page *page) @@ -1310,7 +1313,7 @@ static void page_remove_anon_compound_rmap(struct page *page) clear_page_mlock(page); if (nr) - __mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, -nr); + __mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr); } /** @@ -1322,22 +1325,28 @@ static void page_remove_anon_compound_rmap(struct page *page) */ void page_remove_rmap(struct page *page, bool compound) { - if (!PageAnon(page)) - return page_remove_file_rmap(page, compound); + lock_page_memcg(page); - if (compound) - return page_remove_anon_compound_rmap(page); + if (!PageAnon(page)) { + page_remove_file_rmap(page, compound); + goto out; + } + + if (compound) { + page_remove_anon_compound_rmap(page); + goto out; + } /* page still mapped by someone else? */ if (!atomic_add_negative(-1, &page->_mapcount)) - return; + goto out; /* * We use the irq-unsafe __{inc|mod}_zone_page_stat because * these counters are not modified in interrupt context, and * pte lock(a spinlock) is held, which implies preemption disabled. */ - __dec_node_page_state(page, NR_ANON_MAPPED); + __dec_lruvec_page_state(page, NR_ANON_MAPPED); if (unlikely(PageMlocked(page))) clear_page_mlock(page); @@ -1354,6 +1363,8 @@ void page_remove_rmap(struct page *page, bool compound) * Leaving it set also helps swapoff to reinstate ptes * faster for those pages still in swapcache. 
*/ +out: + unlock_page_memcg(page); } /* diff --git a/mm/swapfile.c b/mm/swapfile.c index ad42eac1822d..45b937b924f5 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1886,11 +1886,11 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, set_pte_at(vma->vm_mm, addr, pte, pte_mkold(mk_pte(page, vma->vm_page_prot))); if (page == swapcache) { - page_add_anon_rmap(page, vma, addr, false); mem_cgroup_commit_charge(page, memcg, true); + page_add_anon_rmap(page, vma, addr, false); } else { /* ksm created a completely new copy */ - page_add_new_anon_rmap(page, vma, addr, false); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, vma, addr, false); lru_cache_add_active_or_unevictable(page, vma); } swap_free(entry); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index bb57d0a3fca7..3dea268d2850 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -123,8 +123,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, goto out_release_uncharge_unlock; inc_mm_counter(dst_mm, MM_ANONPAGES); - page_add_new_anon_rmap(page, dst_vma, dst_addr, false); mem_cgroup_commit_charge(page, memcg, false); + page_add_new_anon_rmap(page, dst_vma, dst_addr, false); lru_cache_add_active_or_unevictable(page, dst_vma); set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
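As an illustration (not part of the patch), the ordering invariant that every anonymous instantiation site follows after this change can be sketched as below. The helper name is hypothetical and error handling is elided:

	/*
	 * Post-patch instantiation order for a new anonymous page.
	 * mem_cgroup_commit_charge() must run before the rmap call, so
	 * that page->mem_cgroup is set and stable by the time
	 * page_add_new_anon_rmap() accounts NR_ANON_MAPPED against the
	 * page's lruvec.
	 */
	static void map_new_anon_page(struct page *page,
				      struct vm_area_struct *vma,
				      unsigned long addr,
				      struct mem_cgroup *memcg)
	{
		mem_cgroup_commit_charge(page, memcg, false);	/* sets page->mem_cgroup */
		page_add_new_anon_rmap(page, vma, addr, false);	/* accounts NR_ANON_MAPPED */
		lru_cache_add_active_or_unevictable(page, vma);
	}

With the pre-patch order, page_add_new_anon_rmap() would run before page->mem_cgroup is set, and the new per-lruvec accounting would miss the cgroup - which is exactly the temporary bug the v2 note above refers to.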
From patchwork Fri May 8 18:30:58 2020
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11537393
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A.
Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 11/19] mm: memcontrol: switch to native NR_ANON_THPS counter Date: Fri, 8 May 2020 14:30:58 -0400 Message-Id: <20200508183105.225460-12-hannes@cmpxchg.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: With rmap memcg locking already in place for NR_ANON_MAPPED, it's just a small step to remove the MEMCG_RSS_HUGE wart and switch memcg to the native NR_ANON_THPS accounting sites. Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- include/linux/memcontrol.h | 3 +-- mm/huge_memory.c | 4 +++- mm/memcontrol.c | 39 ++++++++++++++++---------------------- mm/rmap.c | 6 +++--- 4 files changed, 23 insertions(+), 29 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 2df978a3a253..9b1054bf6d35 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -29,8 +29,7 @@ struct kmem_cache; /* Cgroup-specific page state, on top of universal node page state */ enum memcg_stat_item { - MEMCG_RSS_HUGE = NR_VM_NODE_STAT_ITEMS, - MEMCG_SWAP, + MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS, MEMCG_SOCK, /* XXX: why are these zone and not node counters? */ MEMCG_KERNEL_STACK_KB, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 07c012d89570..74f8b4013203 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2159,15 +2159,17 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, atomic_inc(&page[i]._mapcount); } + lock_page_memcg(page); if (atomic_add_negative(-1, compound_mapcount_ptr(page))) { /* Last compound_mapcount is gone. */ - __dec_node_page_state(page, NR_ANON_THPS); + __dec_lruvec_page_state(page, NR_ANON_THPS); if (TestClearPageDoubleMap(page)) { /* No need in mapcount reference anymore */ for (i = 0; i < HPAGE_PMD_NR; i++) atomic_dec(&page[i]._mapcount); } } + unlock_page_memcg(page); smp_wmb(); /* make pte visible before pmd */ pmd_populate(mm, pmd, pgtable); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index fccb396ed7bd..fd92c1c99e1f 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -836,11 +836,6 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg, struct page *page, int nr_pages) { - if (abs(nr_pages) > 1) { - VM_BUG_ON_PAGE(!PageTransHuge(page), page); - __mod_memcg_state(memcg, MEMCG_RSS_HUGE, nr_pages); - } - /* pagein of a big page is an event. So, ignore page size */ if (nr_pages > 0) __count_memcg_events(memcg, PGPGIN, 1); @@ -1406,15 +1401,9 @@ static char *memory_stat_format(struct mem_cgroup *memcg) (u64)memcg_page_state(memcg, NR_WRITEBACK) * PAGE_SIZE); - /* - * TODO: We should eventually replace our own MEMCG_RSS_HUGE counter - * with the NR_ANON_THP vm counter, but right now it's a pain in the - * arse because it requires migrating the work out of rmap to a place - * where the page->mem_cgroup is set up and stable. 
- */ seq_buf_printf(&s, "anon_thp %llu\n", - (u64)memcg_page_state(memcg, MEMCG_RSS_HUGE) * - PAGE_SIZE); + (u64)memcg_page_state(memcg, NR_ANON_THPS) * + HPAGE_PMD_NR * PAGE_SIZE); for (i = 0; i < NR_LRU_LISTS; i++) seq_buf_printf(&s, "%s %llu\n", lru_list_name(i), @@ -3006,8 +2995,6 @@ void mem_cgroup_split_huge_fixup(struct page *head) for (i = 1; i < HPAGE_PMD_NR; i++) head[i].mem_cgroup = head->mem_cgroup; - - __mod_memcg_state(head->mem_cgroup, MEMCG_RSS_HUGE, -HPAGE_PMD_NR); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ @@ -3763,7 +3750,7 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v) static const unsigned int memcg1_stats[] = { NR_FILE_PAGES, NR_ANON_MAPPED, - MEMCG_RSS_HUGE, + NR_ANON_THPS, NR_SHMEM, NR_FILE_MAPPED, NR_FILE_DIRTY, @@ -3800,11 +3787,14 @@ static int memcg_stat_show(struct seq_file *m, void *v) BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats)); for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) { + unsigned long nr; + if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account()) continue; - seq_printf(m, "%s %lu\n", memcg1_stat_names[i], - memcg_page_state_local(memcg, memcg1_stats[i]) * - PAGE_SIZE); + nr = memcg_page_state_local(memcg, memcg1_stats[i]); + if (memcg1_stats[i] == NR_ANON_THPS) + nr *= HPAGE_PMD_NR; + seq_printf(m, "%s %lu\n", memcg1_stat_names[i], nr * PAGE_SIZE); } for (i = 0; i < ARRAY_SIZE(memcg1_events); i++) @@ -5396,6 +5386,13 @@ static int mem_cgroup_move_account(struct page *page, if (page_mapped(page)) { __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages); __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages); + if (PageTransHuge(page)) { + __mod_lruvec_state(from_vec, NR_ANON_THPS, + -nr_pages); + __mod_lruvec_state(to_vec, NR_ANON_THPS, + nr_pages); + } + } } else { __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages); @@ -6612,7 +6609,6 @@ struct uncharge_gather { unsigned long nr_pages; unsigned long pgpgout; unsigned long nr_kmem; - unsigned long nr_huge; struct page *dummy_page; }; @@ -6635,7 +6631,6 @@ static void uncharge_batch(const struct uncharge_gather *ug) } local_irq_save(flags); - __mod_memcg_state(ug->memcg, MEMCG_RSS_HUGE, -ug->nr_huge); __count_memcg_events(ug->memcg, PGPGOUT, ug->pgpgout); __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages); memcg_check_events(ug->memcg, ug->dummy_page); @@ -6672,8 +6667,6 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug) ug->nr_pages += nr_pages; if (!PageKmemcg(page)) { - if (PageTransHuge(page)) - ug->nr_huge += nr_pages; ug->pgpgout++; } else { ug->nr_kmem += nr_pages; diff --git a/mm/rmap.c b/mm/rmap.c index e96f1d099c3f..bd98a995c573 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1138,7 +1138,7 @@ void do_page_add_anon_rmap(struct page *page, * disabled. 
*/ if (compound) - __inc_node_page_state(page, NR_ANON_THPS); + __inc_lruvec_page_state(page, NR_ANON_THPS); __mod_lruvec_page_state(page, NR_ANON_MAPPED, nr); } @@ -1180,7 +1180,7 @@ void page_add_new_anon_rmap(struct page *page, if (hpage_pincount_available(page)) atomic_set(compound_pincount_ptr(page), 0); - __inc_node_page_state(page, NR_ANON_THPS); + __inc_lruvec_page_state(page, NR_ANON_THPS); } else { /* Anon THP always mapped first with PMD */ VM_BUG_ON_PAGE(PageTransCompound(page), page); @@ -1286,7 +1286,7 @@ static void page_remove_anon_compound_rmap(struct page *page) if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) return; - __dec_node_page_state(page, NR_ANON_THPS); + __dec_lruvec_page_state(page, NR_ANON_THPS); if (TestClearPageDoubleMap(page)) { /*
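One unit subtlety in this conversion is worth spelling out: NR_ANON_THPS counts huge pages (incremented by one per THP), whereas the old MEMCG_RSS_HUGE counted base pages. Consumers therefore multiply by HPAGE_PMD_NR when converting the native counter to bytes, as the memory_stat_format() and memcg_stat_show() hunks above do. A sketch, using a hypothetical helper that is not part of the patch:

	/* Bytes of anon THP charged to @memcg, from the native counter. */
	static u64 memcg_anon_thp_bytes(struct mem_cgroup *memcg)
	{
		return (u64)memcg_page_state(memcg, NR_ANON_THPS) *
			HPAGE_PMD_NR * PAGE_SIZE;
	}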
From patchwork Fri May 8 18:30:59 2020
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11537395
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A. Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 12/19] mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API
Date: Fri, 8 May 2020 14:30:59 -0400
Message-Id: <20200508183105.225460-13-hannes@cmpxchg.org>
In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org>
References: <20200508183105.225460-1-hannes@cmpxchg.org>

With the page->mapping requirement gone from memcg, we can charge anon and file-thp pages in one single step, right after they're allocated. This removes two out of three API calls - especially the tricky commit step that needed to happen at just the right time between when the page is "set up" and when it's "published" - somewhat vague and fluid concepts that varied by page type. All we need is a freshly allocated page and a memcg context to charge.
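To make the API change concrete, here is the calling convention before and after, as a sketch with error paths elided (signatures as they appear in this series):

	/* Before: three calls, with a commit window the caller had to time right. */
	struct mem_cgroup *memcg;

	if (mem_cgroup_try_charge(page, mm, GFP_KERNEL, &memcg))
		goto err;
	/* ... set up the page, e.g. page->mapping ... */
	mem_cgroup_commit_charge(page, memcg, false);
	/* ... or mem_cgroup_cancel_charge(page, memcg) if instantiation fails */

	/* After: one call against a freshly allocated page. */
	if (mem_cgroup_charge(page, mm, GFP_KERNEL, false))
		goto err;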
v2: prevent double charges on pre-allocated hugepages in khugepaged Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- include/linux/mm.h | 4 +--- kernel/events/uprobes.c | 11 +++-------- mm/filemap.c | 2 +- mm/huge_memory.c | 9 +++------ mm/khugepaged.c | 35 ++++++++++------------------------- mm/memory.c | 36 ++++++++++-------------------------- mm/migrate.c | 5 +---- mm/swapfile.c | 6 +----- mm/userfaultfd.c | 5 +---- 9 files changed, 31 insertions(+), 82 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index bb8d3716bfe4..87a2c2b66d05 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -501,7 +501,6 @@ struct vm_fault { pte_t orig_pte; /* Value of PTE at the time of fault */ struct page *cow_page; /* Page handler may use for COW fault */ - struct mem_cgroup *memcg; /* Cgroup cow_page belongs to */ struct page *page; /* ->fault handlers should return a * page here, unless VM_FAULT_NOPAGE * is set (which is also implied by @@ -935,8 +934,7 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) return pte; } -vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg, - struct page *page); +vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page); vm_fault_t finish_fault(struct vm_fault *vmf); vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf); #endif diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 89ef81b65bcb..4253c153e985 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -162,14 +162,13 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, }; int err; struct mmu_notifier_range range; - struct mem_cgroup *memcg; mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, addr, addr + PAGE_SIZE); if (new_page) { - err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL, - &memcg); + err = mem_cgroup_charge(new_page, vma->vm_mm, GFP_KERNEL, + false); if (err) return err; } @@ -179,16 +178,12 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, mmu_notifier_invalidate_range_start(&range); err = -EAGAIN; - if (!page_vma_mapped_walk(&pvmw)) { - if (new_page) - mem_cgroup_cancel_charge(new_page, memcg); + if (!page_vma_mapped_walk(&pvmw)) goto unlock; - } VM_BUG_ON_PAGE(addr != pvmw.address, old_page); if (new_page) { get_page(new_page); - mem_cgroup_commit_charge(new_page, memcg, false); page_add_new_anon_rmap(new_page, vma, addr, false); lru_cache_add_active_or_unevictable(new_page, vma); } else diff --git a/mm/filemap.c b/mm/filemap.c index d5b6e3d7d402..fa47f160e1cc 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2638,7 +2638,7 @@ void filemap_map_pages(struct vm_fault *vmf, if (vmf->pte) vmf->pte += xas.xa_index - last_pgoff; last_pgoff = xas.xa_index; - if (alloc_set_pte(vmf, NULL, page)) + if (alloc_set_pte(vmf, page)) goto unlock; unlock_page(page); goto next; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 74f8b4013203..d0f1e8cee93c 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -587,19 +587,19 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, struct page *page, gfp_t gfp) { struct vm_area_struct *vma = vmf->vma; - struct mem_cgroup *memcg; pgtable_t pgtable; unsigned long haddr = vmf->address & HPAGE_PMD_MASK; vm_fault_t ret = 0; VM_BUG_ON_PAGE(!PageCompound(page), page); - if (mem_cgroup_try_charge_delay(page, vma->vm_mm, gfp, &memcg)) { + if (mem_cgroup_charge(page, vma->vm_mm, gfp, false)) { put_page(page); count_vm_event(THP_FAULT_FALLBACK); 
count_vm_event(THP_FAULT_FALLBACK_CHARGE); return VM_FAULT_FALLBACK; } + cgroup_throttle_swaprate(page, gfp); pgtable = pte_alloc_one(vma->vm_mm); if (unlikely(!pgtable)) { @@ -630,7 +630,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, vm_fault_t ret2; spin_unlock(vmf->ptl); - mem_cgroup_cancel_charge(page, memcg); put_page(page); pte_free(vma->vm_mm, pgtable); ret2 = handle_userfault(vmf, VM_UFFD_MISSING); @@ -640,7 +639,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, entry = mk_huge_pmd(page, vma->vm_page_prot); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, vma, haddr, true); lru_cache_add_active_or_unevictable(page, vma); pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); @@ -649,7 +647,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, mm_inc_nr_ptes(vma->vm_mm); spin_unlock(vmf->ptl); count_vm_event(THP_FAULT_ALLOC); - count_memcg_events(memcg, THP_FAULT_ALLOC, 1); + count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC); } return 0; @@ -658,7 +656,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, release: if (pgtable) pte_free(vma->vm_mm, pgtable); - mem_cgroup_cancel_charge(page, memcg); put_page(page); return ret; diff --git a/mm/khugepaged.c b/mm/khugepaged.c index be67ebe8a120..34731e7c9a67 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1044,7 +1044,6 @@ static void collapse_huge_page(struct mm_struct *mm, struct page *new_page; spinlock_t *pmd_ptl, *pte_ptl; int isolated = 0, result = 0; - struct mem_cgroup *memcg; struct vm_area_struct *vma; struct mmu_notifier_range range; gfp_t gfp; @@ -1067,15 +1066,15 @@ static void collapse_huge_page(struct mm_struct *mm, goto out_nolock; } - if (unlikely(mem_cgroup_try_charge(new_page, mm, gfp, &memcg))) { + if (unlikely(mem_cgroup_charge(new_page, mm, gfp, false))) { result = SCAN_CGROUP_CHARGE_FAIL; goto out_nolock; } + count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC); down_read(&mm->mmap_sem); result = hugepage_vma_revalidate(mm, address, &vma); if (result) { - mem_cgroup_cancel_charge(new_page, memcg); up_read(&mm->mmap_sem); goto out_nolock; } @@ -1083,7 +1082,6 @@ static void collapse_huge_page(struct mm_struct *mm, pmd = mm_find_pmd(mm, address); if (!pmd) { result = SCAN_PMD_NULL; - mem_cgroup_cancel_charge(new_page, memcg); up_read(&mm->mmap_sem); goto out_nolock; } @@ -1095,7 +1093,6 @@ static void collapse_huge_page(struct mm_struct *mm, */ if (unmapped && !__collapse_huge_page_swapin(mm, vma, address, pmd, referenced)) { - mem_cgroup_cancel_charge(new_page, memcg); up_read(&mm->mmap_sem); goto out_nolock; } @@ -1182,9 +1179,7 @@ static void collapse_huge_page(struct mm_struct *mm, spin_lock(pmd_ptl); BUG_ON(!pmd_none(*pmd)); - mem_cgroup_commit_charge(new_page, memcg, false); page_add_new_anon_rmap(new_page, vma, address, true); - count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1); lru_cache_add_active_or_unevictable(new_page, vma); pgtable_trans_huge_deposit(mm, pmd, pgtable); set_pmd_at(mm, address, pmd, _pmd); @@ -1198,10 +1193,11 @@ static void collapse_huge_page(struct mm_struct *mm, out_up_write: up_write(&mm->mmap_sem); out_nolock: + if (*hpage) + mem_cgroup_uncharge(*hpage); trace_mm_collapse_huge_page(mm, isolated, result); return; out: - mem_cgroup_cancel_charge(new_page, memcg); goto out_up_write; } @@ -1609,7 +1605,6 @@ static void collapse_file(struct mm_struct *mm, struct address_space *mapping = file->f_mapping; gfp_t 
gfp; struct page *new_page; - struct mem_cgroup *memcg; pgoff_t index, end = start + HPAGE_PMD_NR; LIST_HEAD(pagelist); XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER); @@ -1628,10 +1623,11 @@ static void collapse_file(struct mm_struct *mm, goto out; } - if (unlikely(mem_cgroup_try_charge(new_page, mm, gfp, &memcg))) { + if (unlikely(mem_cgroup_charge(new_page, mm, gfp, false))) { result = SCAN_CGROUP_CHARGE_FAIL; goto out; } + count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC); /* This will be less messy when we use multi-index entries */ do { @@ -1641,7 +1637,6 @@ static void collapse_file(struct mm_struct *mm, break; xas_unlock_irq(&xas); if (!xas_nomem(&xas, GFP_KERNEL)) { - mem_cgroup_cancel_charge(new_page, memcg); result = SCAN_FAIL; goto out; } @@ -1834,18 +1829,9 @@ static void collapse_file(struct mm_struct *mm, } if (nr_none) { - struct lruvec *lruvec; - /* - * XXX: We have started try_charge and pinned the - * memcg, but the page isn't committed yet so we - * cannot use mod_lruvec_page_state(). This hackery - * will be cleaned up when remove the page->mapping - * dependency from memcg and fully charge above. - */ - lruvec = mem_cgroup_lruvec(memcg, page_pgdat(new_page)); - __mod_lruvec_state(lruvec, NR_FILE_PAGES, nr_none); + __mod_lruvec_page_state(new_page, NR_FILE_PAGES, nr_none); if (is_shmem) - __mod_lruvec_state(lruvec, NR_SHMEM, nr_none); + __mod_lruvec_page_state(new_page, NR_SHMEM, nr_none); } xa_locked: @@ -1883,7 +1869,6 @@ static void collapse_file(struct mm_struct *mm, SetPageUptodate(new_page); page_ref_add(new_page, HPAGE_PMD_NR - 1); - mem_cgroup_commit_charge(new_page, memcg, false); if (is_shmem) { set_page_dirty(new_page); @@ -1891,7 +1876,6 @@ static void collapse_file(struct mm_struct *mm, } else { lru_cache_add_file(new_page); } - count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1); /* * Remove pte page tables, so we can re-fault the page as huge. @@ -1938,13 +1922,14 @@ static void collapse_file(struct mm_struct *mm, VM_BUG_ON(nr_none); xas_unlock_irq(&xas); - mem_cgroup_cancel_charge(new_page, memcg); new_page->mapping = NULL; } unlock_page(new_page); out: VM_BUG_ON(!list_empty(&pagelist)); + if (*hpage) + mem_cgroup_uncharge(*hpage); /* TODO: tracepoints */ } diff --git a/mm/memory.c b/mm/memory.c index 46c3e5dc918d..832ee914cbcf 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2645,7 +2645,6 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) struct page *new_page = NULL; pte_t entry; int page_copied = 0; - struct mem_cgroup *memcg; struct mmu_notifier_range range; if (unlikely(anon_vma_prepare(vma))) @@ -2676,8 +2675,9 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) } } - if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, &memcg)) + if (mem_cgroup_charge(new_page, mm, GFP_KERNEL, false)) goto oom_free_new; + cgroup_throttle_swaprate(new_page, GFP_KERNEL); __SetPageUptodate(new_page); @@ -2710,7 +2710,6 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * thread doing COW. */ ptep_clear_flush_notify(vma, vmf->address, vmf->pte); - mem_cgroup_commit_charge(new_page, memcg, false); page_add_new_anon_rmap(new_page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(new_page, vma); /* @@ -2749,8 +2748,6 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) /* Free the old page.. 
*/ new_page = old_page; page_copied = 1; - } else { - mem_cgroup_cancel_charge(new_page, memcg); } if (new_page) @@ -3088,7 +3085,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; struct page *page = NULL, *swapcache; - struct mem_cgroup *memcg; swp_entry_t entry; pte_t pte; int locked; @@ -3193,10 +3189,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) goto out_page; } - if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg)) { + if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, true)) { ret = VM_FAULT_OOM; goto out_page; } + cgroup_throttle_swaprate(page, GFP_KERNEL); /* * Back out if somebody else already faulted in this pte. @@ -3243,11 +3240,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) /* ksm created a completely new copy */ if (unlikely(page != swapcache && swapcache)) { - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(page, vma); } else { - mem_cgroup_commit_charge(page, memcg, true); do_page_add_anon_rmap(page, vma, vmf->address, exclusive); activate_page(page); } @@ -3284,7 +3279,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) out: return ret; out_nomap: - mem_cgroup_cancel_charge(page, memcg); pte_unmap_unlock(vmf->pte, vmf->ptl); out_page: unlock_page(page); @@ -3305,7 +3299,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) static vm_fault_t do_anonymous_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; - struct mem_cgroup *memcg; struct page *page; vm_fault_t ret = 0; pte_t entry; @@ -3358,8 +3351,9 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) if (!page) goto oom; - if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg)) + if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, false)) goto oom_free_page; + cgroup_throttle_swaprate(page, GFP_KERNEL); /* * The memory barrier inside __SetPageUptodate makes sure that @@ -3384,13 +3378,11 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) /* Deliver the page fault to userland, check inside PT lock */ if (userfaultfd_missing(vma)) { pte_unmap_unlock(vmf->pte, vmf->ptl); - mem_cgroup_cancel_charge(page, memcg); put_page(page); return handle_userfault(vmf, VM_UFFD_MISSING); } inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(page, vma); setpte: @@ -3402,7 +3394,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) pte_unmap_unlock(vmf->pte, vmf->ptl); return ret; release: - mem_cgroup_cancel_charge(page, memcg); put_page(page); goto unlock; oom_free_page: @@ -3607,7 +3598,6 @@ static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) * mapping. If needed, the fucntion allocates page table or use pre-allocated. * * @vmf: fault environment - * @memcg: memcg to charge page (only for private mappings) * @page: page to map * * Caller must take care of unlocking vmf->ptl, if vmf->pte is non-NULL on @@ -3618,8 +3608,7 @@ static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) * * Return: %0 on success, %VM_FAULT_ code in case of error. 
*/ -vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg, - struct page *page) +vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) { struct vm_area_struct *vma = vmf->vma; bool write = vmf->flags & FAULT_FLAG_WRITE; @@ -3627,9 +3616,6 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg, vm_fault_t ret; if (pmd_none(*vmf->pmd) && PageTransCompound(page)) { - /* THP on COW? */ - VM_BUG_ON_PAGE(memcg, page); - ret = do_set_pmd(vmf, page); if (ret != VM_FAULT_FALLBACK) return ret; @@ -3652,7 +3638,6 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg, /* copy-on-write page */ if (write && !(vma->vm_flags & VM_SHARED)) { inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, vma, vmf->address, false); lru_cache_add_active_or_unevictable(page, vma); } else { @@ -3702,7 +3687,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf) if (!(vmf->vma->vm_flags & VM_SHARED)) ret = check_stable_address_space(vmf->vma->vm_mm); if (!ret) - ret = alloc_set_pte(vmf, vmf->memcg, page); + ret = alloc_set_pte(vmf, page); if (vmf->pte) pte_unmap_unlock(vmf->pte, vmf->ptl); return ret; @@ -3862,11 +3847,11 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) if (!vmf->cow_page) return VM_FAULT_OOM; - if (mem_cgroup_try_charge_delay(vmf->cow_page, vma->vm_mm, - GFP_KERNEL, &vmf->memcg)) { + if (mem_cgroup_charge(vmf->cow_page, vma->vm_mm, GFP_KERNEL, false)) { put_page(vmf->cow_page); return VM_FAULT_OOM; } + cgroup_throttle_swaprate(vmf->cow_page, GFP_KERNEL); ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) @@ -3884,7 +3869,6 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) goto uncharge_out; return ret; uncharge_out: - mem_cgroup_cancel_charge(vmf->cow_page, vmf->memcg); put_page(vmf->cow_page); return ret; } diff --git a/mm/migrate.c b/mm/migrate.c index e84fb5b87a85..2028f08e3e8d 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2746,7 +2746,6 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, { struct vm_area_struct *vma = migrate->vma; struct mm_struct *mm = vma->vm_mm; - struct mem_cgroup *memcg; bool flush = false; spinlock_t *ptl; pte_t entry; @@ -2793,7 +2792,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, if (unlikely(anon_vma_prepare(vma))) goto abort; - if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL, &memcg)) + if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, false)) goto abort; /* @@ -2838,7 +2837,6 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, goto unlock_abort; inc_mm_counter(mm, MM_ANONPAGES); - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, vma, addr, false); if (!is_zone_device_page(page)) lru_cache_add_active_or_unevictable(page, vma); @@ -2861,7 +2859,6 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, unlock_abort: pte_unmap_unlock(ptep, ptl); - mem_cgroup_cancel_charge(page, memcg); abort: *src &= ~MIGRATE_PFN_MIGRATE; } diff --git a/mm/swapfile.c b/mm/swapfile.c index 45b937b924f5..8c9b6767013b 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1858,7 +1858,6 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, swp_entry_t entry, struct page *page) { struct page *swapcache; - struct mem_cgroup *memcg; spinlock_t *ptl; pte_t *pte; int ret = 1; @@ -1868,14 +1867,13 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, if (unlikely(!page)) 
return -ENOMEM; - if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL, &memcg)) { + if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, true)) { ret = -ENOMEM; goto out_nolock; } pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); if (unlikely(!pte_same_as_swp(*pte, swp_entry_to_pte(entry)))) { - mem_cgroup_cancel_charge(page, memcg); ret = 0; goto out; } @@ -1886,10 +1884,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, set_pte_at(vma->vm_mm, addr, pte, pte_mkold(mk_pte(page, vma->vm_page_prot))); if (page == swapcache) { - mem_cgroup_commit_charge(page, memcg, true); page_add_anon_rmap(page, vma, addr, false); } else { /* ksm created a completely new copy */ - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, vma, addr, false); lru_cache_add_active_or_unevictable(page, vma); } diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 3dea268d2850..2745489415cc 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -56,7 +56,6 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, struct page **pagep, bool wp_copy) { - struct mem_cgroup *memcg; pte_t _dst_pte, *dst_pte; spinlock_t *ptl; void *page_kaddr; @@ -97,7 +96,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, __SetPageUptodate(page); ret = -ENOMEM; - if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg)) + if (mem_cgroup_charge(page, dst_mm, GFP_KERNEL, false)) goto out_release; _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot)); @@ -123,7 +122,6 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, goto out_release_uncharge_unlock; inc_mm_counter(dst_mm, MM_ANONPAGES); - mem_cgroup_commit_charge(page, memcg, false); page_add_new_anon_rmap(page, dst_vma, dst_addr, false); lru_cache_add_active_or_unevictable(page, dst_vma); @@ -138,7 +136,6 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm, return ret; out_release_uncharge_unlock: pte_unmap_unlock(dst_pte, ptl); - mem_cgroup_cancel_charge(page, memcg); out_release: put_page(page); goto out;
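Note one side effect visible throughout these hunks: the swap throttling that mem_cgroup_try_charge_delay() used to bundle with the charge now has to be an explicit cgroup_throttle_swaprate() call at the sites that want it. The resulting anonymous-fault pattern, condensed from the do_anonymous_page() hunk above as a sketch with error paths elided:

	if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, false))
		goto oom_free_page;
	cgroup_throttle_swaprate(page, GFP_KERNEL);

	__SetPageUptodate(page);
	/* ... take the pte lock and install the pte ... */
	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
	page_add_new_anon_rmap(page, vma, vmf->address, false);
	lru_cache_add_active_or_unevictable(page, vma);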
From patchwork Fri May 8 18:31:00 2020
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11537397
From: Johannes Weiner
To: Andrew Morton
Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A.
Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 13/19] mm: memcontrol: drop unused try/commit/cancel charge API Date: Fri, 8 May 2020 14:31:00 -0400 Message-Id: <20200508183105.225460-14-hannes@cmpxchg.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There are no more users. RIP in peace. Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- include/linux/memcontrol.h | 36 ----------- mm/memcontrol.c | 126 +++++-------------------------------- 2 files changed, 15 insertions(+), 147 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 9b1054bf6d35..23608d3ee70f 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -369,14 +369,6 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg) page_counter_read(&memcg->memory); } -int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm, - gfp_t gfp_mask, struct mem_cgroup **memcgp); -int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm, - gfp_t gfp_mask, struct mem_cgroup **memcgp); -void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, - bool lrucare); -void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg); - int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, bool lrucare); @@ -867,34 +859,6 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg) return false; } -static inline int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm, - gfp_t gfp_mask, - struct mem_cgroup **memcgp) -{ - *memcgp = NULL; - return 0; -} - -static inline int mem_cgroup_try_charge_delay(struct page *page, - struct mm_struct *mm, - gfp_t gfp_mask, - struct mem_cgroup **memcgp) -{ - *memcgp = NULL; - return 0; -} - -static inline void mem_cgroup_commit_charge(struct page *page, - struct mem_cgroup *memcg, - bool lrucare) -{ -} - -static inline void mem_cgroup_cancel_charge(struct page *page, - struct mem_cgroup *memcg) -{ -} - static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, bool lrucare) { diff --git a/mm/memcontrol.c b/mm/memcontrol.c index fd92c1c99e1f..7b9bb7ca0b44 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6432,29 +6432,26 @@ void mem_cgroup_calculate_protection(struct mem_cgroup *root, } /** - * mem_cgroup_try_charge - try charging a page + * mem_cgroup_charge - charge a newly allocated page to a cgroup * @page: page to charge * @mm: mm context of the victim * @gfp_mask: reclaim mode - * @memcgp: charged memcg return + * @lrucare: page might be on the LRU already * * Try to charge @page to the memcg that @mm belongs to, reclaiming * pages according to @gfp_mask if necessary. * - * Returns 0 on success, with *@memcgp pointing to the charged memcg. - * Otherwise, an error code is returned. - * - * After page->mapping has been set up, the caller must finalize the - * charge with mem_cgroup_commit_charge(). Or abort the transaction - * with mem_cgroup_cancel_charge() in case page instantiation fails. + * Returns 0 on success. Otherwise, an error code is returned. 
*/ -int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm, - gfp_t gfp_mask, struct mem_cgroup **memcgp) +int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, + bool lrucare) { unsigned int nr_pages = hpage_nr_pages(page); struct mem_cgroup *memcg = NULL; int ret = 0; + VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page); + if (mem_cgroup_disabled()) goto out; @@ -6486,56 +6483,8 @@ int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm, memcg = get_mem_cgroup_from_mm(mm); ret = try_charge(memcg, gfp_mask, nr_pages); - - css_put(&memcg->css); -out: - *memcgp = memcg; - return ret; -} - -int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm, - gfp_t gfp_mask, struct mem_cgroup **memcgp) -{ - int ret; - - ret = mem_cgroup_try_charge(page, mm, gfp_mask, memcgp); - if (*memcgp) - cgroup_throttle_swaprate(page, gfp_mask); - return ret; -} - -/** - * mem_cgroup_commit_charge - commit a page charge - * @page: page to charge - * @memcg: memcg to charge the page to - * @lrucare: page might be on LRU already - * - * Finalize a charge transaction started by mem_cgroup_try_charge(), - * after page->mapping has been set up. This must happen atomically - * as part of the page instantiation, i.e. under the page table lock - * for anonymous pages, under the page lock for page and swap cache. - * - * In addition, the page must not be on the LRU during the commit, to - * prevent racing with task migration. If it might be, use @lrucare. - * - * Use mem_cgroup_cancel_charge() to cancel the transaction instead. - */ -void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, - bool lrucare) -{ - unsigned int nr_pages = hpage_nr_pages(page); - - VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page); - - if (mem_cgroup_disabled()) - return; - /* - * Swap faults will attempt to charge the same page multiple - * times. But reuse_swap_page() might have removed the page - * from swapcache already, so we can't check PageSwapCache(). - */ - if (!memcg) - return; + if (ret) + goto out_put; commit_charge(page, memcg, lrucare); @@ -6553,55 +6502,11 @@ void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, */ mem_cgroup_uncharge_swap(entry, nr_pages); } -} -/** - * mem_cgroup_cancel_charge - cancel a page charge - * @page: page to charge - * @memcg: memcg to charge the page to - * - * Cancel a charge transaction started by mem_cgroup_try_charge(). - */ -void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg) -{ - unsigned int nr_pages = hpage_nr_pages(page); - - if (mem_cgroup_disabled()) - return; - /* - * Swap faults will attempt to charge the same page multiple - * times. But reuse_swap_page() might have removed the page - * from swapcache already, so we can't check PageSwapCache(). - */ - if (!memcg) - return; - - cancel_charge(memcg, nr_pages); -} - -/** - * mem_cgroup_charge - charge a newly allocated page to a cgroup - * @page: page to charge - * @mm: mm context of the victim - * @gfp_mask: reclaim mode - * @lrucare: page might be on the LRU already - * - * Try to charge @page to the memcg that @mm belongs to, reclaiming - * pages according to @gfp_mask if necessary. - * - * Returns 0 on success. Otherwise, an error code is returned. 
- */ -int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, - bool lrucare) -{ - struct mem_cgroup *memcg; - int ret; - - ret = mem_cgroup_try_charge(page, mm, gfp_mask, &memcg); - if (ret) - return ret; - mem_cgroup_commit_charge(page, memcg, lrucare); - return 0; +out_put: + css_put(&memcg->css); +out: + return ret; } struct uncharge_gather { @@ -6706,8 +6611,7 @@ static void uncharge_list(struct list_head *page_list) * mem_cgroup_uncharge - uncharge a page * @page: page to uncharge * - * Uncharge a page previously charged with mem_cgroup_try_charge() and - * mem_cgroup_commit_charge(). + * Uncharge a page previously charged with mem_cgroup_charge(). */ void mem_cgroup_uncharge(struct page *page) { @@ -6730,7 +6634,7 @@ void mem_cgroup_uncharge(struct page *page) * @page_list: list of pages to uncharge * * Uncharge a list of pages previously charged with - * mem_cgroup_try_charge() and mem_cgroup_commit_charge(). + * mem_cgroup_charge(). */ void mem_cgroup_uncharge_list(struct list_head *page_list) { From patchwork Fri May 8 18:31:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537399
From: Johannes Weiner To: Andrew Morton Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A.
Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 14/19] mm: memcontrol: prepare swap controller setup for integration Date: Fri, 8 May 2020 14:31:01 -0400 Message-Id: <20200508183105.225460-15-hannes@cmpxchg.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A few cleanups to streamline the swap controller setup: - Replace the do_swap_account flag with cgroup_memory_noswap. This brings it in line with other functionality that is usually available unless explicitly opted out of - nosocket, nokmem. - Remove the really_do_swap_account flag that stores the boot option and is later used to switch the do_swap_account. It's not clear why this indirection is/was necessary. Use do_swap_account directly. - Minor coding style polishing Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- include/linux/memcontrol.h | 2 +- mm/memcontrol.c | 59 ++++++++++++++++++-------------------- mm/swap_cgroup.c | 4 +-- 3 files changed, 31 insertions(+), 34 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 23608d3ee70f..3fa70ca73c31 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -572,7 +572,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim, void mem_cgroup_print_oom_group(struct mem_cgroup *memcg); #ifdef CONFIG_MEMCG_SWAP -extern int do_swap_account; +extern bool cgroup_memory_noswap; #endif struct mem_cgroup *lock_page_memcg(struct page *page); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7b9bb7ca0b44..bb5f02ab92fb 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -83,10 +83,14 @@ static bool cgroup_memory_nokmem; /* Whether the swap controller is active */ #ifdef CONFIG_MEMCG_SWAP -int do_swap_account __read_mostly; +#ifdef CONFIG_MEMCG_SWAP_ENABLED +bool cgroup_memory_noswap __read_mostly; #else -#define do_swap_account 0 -#endif +bool cgroup_memory_noswap __read_mostly = 1; +#endif /* CONFIG_MEMCG_SWAP_ENABLED */ +#else +#define cgroup_memory_noswap 1 +#endif /* CONFIG_MEMCG_SWAP */ #ifdef CONFIG_CGROUP_WRITEBACK static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); @@ -95,7 +99,7 @@ static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); /* Whether legacy memory+swap accounting is active */ static bool do_memsw_account(void) { - return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && do_swap_account; + return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap; } #define THRESHOLDS_EVENTS_TARGET 128 @@ -6459,18 +6463,19 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, /* * Every swap fault against a single page tries to charge the * page, bail as early as possible. shmem_unuse() encounters - * already charged pages, too. The USED bit is protected by - * the page lock, which serializes swap cache removal, which + * already charged pages, too. page->mem_cgroup is protected + * by the page lock, which serializes swap cache removal, which * in turn serializes uncharging. 
*/ VM_BUG_ON_PAGE(!PageLocked(page), page); if (compound_head(page)->mem_cgroup) goto out; - if (do_swap_account) { + if (!cgroup_memory_noswap) { swp_entry_t ent = { .val = page_private(page), }; - unsigned short id = lookup_swap_cgroup_id(ent); + unsigned short id; + id = lookup_swap_cgroup_id(ent); rcu_read_lock(); memcg = mem_cgroup_from_id(id); if (memcg && !css_tryget_online(&memcg->css)) @@ -6943,7 +6948,7 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry) struct mem_cgroup *memcg; unsigned short oldid; - if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) || !do_swap_account) + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) || cgroup_memory_noswap) return 0; memcg = page->mem_cgroup; @@ -6987,7 +6992,7 @@ void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages) struct mem_cgroup *memcg; unsigned short id; - if (!do_swap_account) + if (cgroup_memory_noswap) return; id = swap_cgroup_record(entry, 0, nr_pages); @@ -7010,7 +7015,7 @@ long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg) { long nr_swap_pages = get_nr_swap_pages(); - if (!do_swap_account || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) + if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) return nr_swap_pages; for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg)) nr_swap_pages = min_t(long, nr_swap_pages, @@ -7027,7 +7032,7 @@ bool mem_cgroup_swap_full(struct page *page) if (vm_swap_full()) return true; - if (!do_swap_account || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) + if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) return false; memcg = page->mem_cgroup; @@ -7042,22 +7047,15 @@ bool mem_cgroup_swap_full(struct page *page) return false; } -/* for remember boot option*/ -#ifdef CONFIG_MEMCG_SWAP_ENABLED -static int really_do_swap_account __initdata = 1; -#else -static int really_do_swap_account __initdata; -#endif - -static int __init enable_swap_account(char *s) +static int __init setup_swap_account(char *s) { if (!strcmp(s, "1")) - really_do_swap_account = 1; + cgroup_memory_noswap = 0; else if (!strcmp(s, "0")) - really_do_swap_account = 0; + cgroup_memory_noswap = 1; return 1; } -__setup("swapaccount=", enable_swap_account); +__setup("swapaccount=", setup_swap_account); static u64 swap_current_read(struct cgroup_subsys_state *css, struct cftype *cft) @@ -7123,7 +7121,7 @@ static struct cftype swap_files[] = { { } /* terminate */ }; -static struct cftype memsw_cgroup_files[] = { +static struct cftype memsw_files[] = { { .name = "memsw.usage_in_bytes", .private = MEMFILE_PRIVATE(_MEMSWAP, RES_USAGE), @@ -7152,13 +7150,12 @@ static struct cftype memsw_cgroup_files[] = { static int __init mem_cgroup_swap_init(void) { - if (!mem_cgroup_disabled() && really_do_swap_account) { - do_swap_account = 1; - WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys, - swap_files)); - WARN_ON(cgroup_add_legacy_cftypes(&memory_cgrp_subsys, - memsw_cgroup_files)); - } + if (mem_cgroup_disabled() || cgroup_memory_noswap) + return 0; + + WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys, swap_files)); + WARN_ON(cgroup_add_legacy_cftypes(&memory_cgrp_subsys, memsw_files)); + return 0; } subsys_initcall(mem_cgroup_swap_init); diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c index 45affaef3bc6..7aa764f09079 100644 --- a/mm/swap_cgroup.c +++ b/mm/swap_cgroup.c @@ -171,7 +171,7 @@ int swap_cgroup_swapon(int type, unsigned long max_pages) unsigned long length; struct swap_cgroup_ctrl *ctrl; - if (!do_swap_account) + if (cgroup_memory_noswap) return 0; length 
= DIV_ROUND_UP(max_pages, SC_PER_PAGE); @@ -209,7 +209,7 @@ void swap_cgroup_swapoff(int type) unsigned long i, length; struct swap_cgroup_ctrl *ctrl; - if (!do_swap_account) + if (cgroup_memory_noswap) return; mutex_lock(&swap_cgroup_mutex); From patchwork Fri May 8 18:31:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537401
From: Johannes Weiner To: Andrew Morton Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A. Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 15/19] mm: memcontrol: make swap tracking an integral part of memory control Date: Fri, 8 May 2020 14:31:02 -0400 Message-Id: <20200508183105.225460-16-hannes@cmpxchg.org> In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> Without swap page tracking, users that are otherwise memory controlled can easily escape their containment and allocate significant amounts of memory that they're not being charged for. That's because swap does readahead, but without the cgroup records of who owned the page at swapout, readahead pages don't get charged until somebody actually faults them into their page table and we can identify an owner task. This can be maliciously exploited with MADV_WILLNEED, which triggers arbitrary readahead allocations without charging the pages. Make swap page tracking an integral part of memcg and remove the Kconfig options. In the first place, it was only made configurable to allow users to save some memory. But the overhead of tracking cgroup ownership per swap page is minimal - 2 bytes per page, or 512k per 1G of swap, or about 0.05%. Saving that at the expense of broken containment semantics is not something we should present as a coequal option.
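For reference, the arithmetic behind those numbers works out as follows. This is a standalone sketch, assuming 4K pages and the 2-byte (sizeof(unsigned short)) per-slot entry that mm/swap_cgroup.c keeps:

#include <stdio.h>

int main(void)
{
	const double page_size = 4096.0;   /* assumed 4K swap slots */
	const double entry_size = 2.0;     /* sizeof(unsigned short) per slot */
	const double swap_size = 1024.0 * 1024.0 * 1024.0; /* 1G of swap */

	double slots = swap_size / page_size;   /* 262144 slots */
	double overhead = slots * entry_size;   /* 524288 bytes */

	printf("%.0fk per 1G of swap (%.2f%%)\n",
	       overhead / 1024.0, 100.0 * entry_size / page_size);
	/* prints: 512k per 1G of swap (0.05%) */
	return 0;
}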
The swapaccount=0 boot option will continue to exist, and it will eliminate the page_counter overhead and hide the swap control files, but it won't disable swap slot ownership tracking. This patch makes sure we always have the cgroup records at swapin time; the next patch will fix the actual bug by charging readahead swap pages at swapin time rather than at fault time. v2: fix double swap charge bug in cgroup1/cgroup2 code gating Signed-off-by: Johannes Weiner Reviewed-by: Joonsoo Kim --- init/Kconfig | 17 +---------------- mm/memcontrol.c | 47 ++++++++++++++++++----------------------------- mm/swap_cgroup.c | 6 ------ 3 files changed, 19 insertions(+), 51 deletions(-) diff --git a/init/Kconfig b/init/Kconfig index 492bb7000aa4..9a874b2201bd 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -847,24 +847,9 @@ config MEMCG Provides control over the memory footprint of tasks in a cgroup. config MEMCG_SWAP - bool "Swap controller" + bool depends on MEMCG && SWAP - help - Provides control over the swap space consumed by tasks in a cgroup. - -config MEMCG_SWAP_ENABLED - bool "Swap controller enabled by default" - depends on MEMCG_SWAP default y - help - Memory Resource Controller Swap Extension comes with its price in - a bigger memory consumption. General purpose distribution kernels - which want to enable the feature but keep it disabled by default - and let the user enable it by swapaccount=1 boot command line - parameter should have this option unselected. - For those who want to have the feature enabled by default should - select this option (if, for some reason, they need to disable it - then swapaccount=0 does the trick). config MEMCG_KMEM bool diff --git a/mm/memcontrol.c b/mm/memcontrol.c index bb5f02ab92fb..4a003531af07 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -83,14 +83,10 @@ static bool cgroup_memory_nokmem; /* Whether the swap controller is active */ #ifdef CONFIG_MEMCG_SWAP -#ifdef CONFIG_MEMCG_SWAP_ENABLED bool cgroup_memory_noswap __read_mostly; #else -bool cgroup_memory_noswap __read_mostly = 1; -#endif /* CONFIG_MEMCG_SWAP_ENABLED */ -#else #define cgroup_memory_noswap 1 -#endif /* CONFIG_MEMCG_SWAP */ +#endif #ifdef CONFIG_CGROUP_WRITEBACK static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); @@ -5294,8 +5290,7 @@ static struct page *mc_handle_swap_pte(struct vm_area_struct *vma, * we call find_get_page() with swapper_space directly. */ page = find_get_page(swap_address_space(ent), swp_offset(ent)); - if (do_memsw_account()) - entry->val = ent.val; + entry->val = ent.val; return page; } @@ -5329,8 +5324,7 @@ static struct page *mc_handle_file_pte(struct vm_area_struct *vma, page = find_get_entry(mapping, pgoff); if (xa_is_value(page)) { swp_entry_t swp = radix_to_swp_entry(page); - if (do_memsw_account()) - *entry = swp; + *entry = swp; page = find_get_page(swap_address_space(swp), swp_offset(swp)); } @@ -6460,6 +6454,9 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, goto out; if (PageSwapCache(page)) { + swp_entry_t ent = { .val = page_private(page), }; + unsigned short id; + /* * Every swap fault against a single page tries to charge the * page, bail as early as possible. 
shmem_unuse() encounters @@ -6471,17 +6468,12 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, if (compound_head(page)->mem_cgroup) goto out; - if (!cgroup_memory_noswap) { - swp_entry_t ent = { .val = page_private(page), }; - unsigned short id; - - id = lookup_swap_cgroup_id(ent); - rcu_read_lock(); - memcg = mem_cgroup_from_id(id); - if (memcg && !css_tryget_online(&memcg->css)) - memcg = NULL; - rcu_read_unlock(); - } + id = lookup_swap_cgroup_id(ent); + rcu_read_lock(); + memcg = mem_cgroup_from_id(id); + if (memcg && !css_tryget_online(&memcg->css)) + memcg = NULL; + rcu_read_unlock(); } if (!memcg) @@ -6498,7 +6490,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, memcg_check_events(memcg, page); local_irq_enable(); - if (do_memsw_account() && PageSwapCache(page)) { + if (PageSwapCache(page)) { swp_entry_t entry = { .val = page_private(page) }; /* * The swap entry might not get freed for a long time, @@ -6883,7 +6875,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) VM_BUG_ON_PAGE(PageLRU(page), page); VM_BUG_ON_PAGE(page_count(page), page); - if (!do_memsw_account()) + if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) return; memcg = page->mem_cgroup; @@ -6912,7 +6904,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) if (!mem_cgroup_is_root(memcg)) page_counter_uncharge(&memcg->memory, nr_entries); - if (memcg != swap_memcg) { + if (!cgroup_memory_noswap && memcg != swap_memcg) { if (!mem_cgroup_is_root(swap_memcg)) page_counter_charge(&swap_memcg->memsw, nr_entries); page_counter_uncharge(&memcg->memsw, nr_entries); @@ -6948,7 +6940,7 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry) struct mem_cgroup *memcg; unsigned short oldid; - if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) || cgroup_memory_noswap) + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) return 0; memcg = page->mem_cgroup; @@ -6964,7 +6956,7 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry) memcg = mem_cgroup_id_get_online(memcg); - if (!mem_cgroup_is_root(memcg) && + if (!cgroup_memory_noswap && !mem_cgroup_is_root(memcg) && !page_counter_try_charge(&memcg->swap, nr_pages, &counter)) { memcg_memory_event(memcg, MEMCG_SWAP_MAX); memcg_memory_event(memcg, MEMCG_SWAP_FAIL); @@ -6992,14 +6984,11 @@ void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages) struct mem_cgroup *memcg; unsigned short id; - if (cgroup_memory_noswap) - return; - id = swap_cgroup_record(entry, 0, nr_pages); rcu_read_lock(); memcg = mem_cgroup_from_id(id); if (memcg) { - if (!mem_cgroup_is_root(memcg)) { + if (!cgroup_memory_noswap && !mem_cgroup_is_root(memcg)) { if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) page_counter_uncharge(&memcg->swap, nr_pages); else diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c index 7aa764f09079..7f34343c075a 100644 --- a/mm/swap_cgroup.c +++ b/mm/swap_cgroup.c @@ -171,9 +171,6 @@ int swap_cgroup_swapon(int type, unsigned long max_pages) unsigned long length; struct swap_cgroup_ctrl *ctrl; - if (cgroup_memory_noswap) - return 0; - length = DIV_ROUND_UP(max_pages, SC_PER_PAGE); array_size = length * sizeof(void *); @@ -209,9 +206,6 @@ void swap_cgroup_swapoff(int type) unsigned long i, length; struct swap_cgroup_ctrl *ctrl; - if (cgroup_memory_noswap) - return; - mutex_lock(&swap_cgroup_mutex); ctrl = &swap_cgroup_ctrl[type]; map = ctrl->map; From patchwork Fri May 8 18:31:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537403
From: Johannes Weiner To: Andrew Morton Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A. Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 16/19] mm: memcontrol: charge swapin pages on instantiation Date: Fri, 8 May 2020 14:31:03 -0400 Message-Id: <20200508183105.225460-17-hannes@cmpxchg.org> In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> Right now, users that are otherwise memory controlled can easily escape their containment and allocate significant amounts of memory that they're not being charged for. That's because swap readahead pages are not being charged until somebody actually faults them into their page table. This can be exploited with MADV_WILLNEED, which triggers arbitrary readahead allocations without charging the pages. There are additional problems with the delayed charging of swap pages: 1. To implement refault/workingset detection for anonymous pages, we need to have a target LRU available at swapin time, but the LRU is not determinable until the page has been charged. 2. To implement per-cgroup LRU locking, we need page->mem_cgroup to be stable when the page is isolated from the LRU; otherwise, the locks change under us. But swapcache gets charged after it's already on the LRU, and since charging is not exactly optional, the cgroup association changes while the page is already LRU-visible. The previous patch ensured we always maintain cgroup ownership records for swap pages. This patch moves the swapcache charging point from the fault handler to swapin time to fix all of the above problems.
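Condensed, the new ordering on the swapin path looks like the sketch below. This is simplified from the __read_swap_cache_async() rewrite in the diff that follows; error unwinding is abbreviated, and mem_cgroup_charge() still carries its lrucare argument at this point in the series:

static struct page *swapin_page_sketch(swp_entry_t entry, gfp_t gfp_mask,
				       struct vm_area_struct *vma,
				       unsigned long addr)
{
	struct page *page = alloc_page_vma(gfp_mask, vma, addr);

	if (!page)
		return NULL;

	__SetPageLocked(page);
	__SetPageSwapBacked(page);

	if (add_to_swap_cache(page, entry, gfp_mask & GFP_KERNEL))
		goto fail;

	/* Charge before the page can appear on any LRU list */
	if (mem_cgroup_charge(page, NULL, gfp_mask & GFP_KERNEL, false))
		goto fail_delete;

	SetPageWorkingset(page);
	lru_cache_add_anon(page);	/* only now does it become LRU-visible */
	return page;

fail_delete:
	delete_from_swap_cache(page);
fail:
	unlock_page(page);
	put_page(page);
	return NULL;
}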
v2: simplify swapin error checking (Joonsoo) Signed-off-by: Johannes Weiner Reviewed-by: Alex Shi --- mm/memory.c | 15 ++++++--- mm/shmem.c | 14 ++++---- mm/swap_state.c | 89 ++++++++++++++++++++++++++----------------------- mm/swapfile.c | 6 ---- 4 files changed, 67 insertions(+), 57 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 832ee914cbcf..93900b121b6e 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3125,9 +3125,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address); if (page) { + int err; + __SetPageLocked(page); __SetPageSwapBacked(page); set_page_private(page, entry.val); + + /* Tell memcg to use swap ownership records */ + SetPageSwapCache(page); + err = mem_cgroup_charge(page, vma->vm_mm, + GFP_KERNEL, false); + ClearPageSwapCache(page); + if (err) + goto out_page; + lru_cache_add_anon(page); swap_readpage(page, true); } @@ -3189,10 +3200,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) goto out_page; } - if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, true)) { - ret = VM_FAULT_OOM; - goto out_page; - } cgroup_throttle_swaprate(page, GFP_KERNEL); /* diff --git a/mm/shmem.c b/mm/shmem.c index d0306a36f42c..98547dc4642d 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -623,13 +623,15 @@ static int shmem_add_to_page_cache(struct page *page, page->mapping = mapping; page->index = index; - error = mem_cgroup_charge(page, charge_mm, gfp, PageSwapCache(page)); - if (error) { - if (!PageSwapCache(page) && PageTransHuge(page)) { - count_vm_event(THP_FILE_FALLBACK); - count_vm_event(THP_FILE_FALLBACK_CHARGE); + if (!PageSwapCache(page)) { + error = mem_cgroup_charge(page, charge_mm, gfp, false); + if (error) { + if (PageTransHuge(page)) { + count_vm_event(THP_FILE_FALLBACK); + count_vm_event(THP_FILE_FALLBACK_CHARGE); + } + goto error; } - goto error; } cgroup_throttle_swaprate(page, gfp); diff --git a/mm/swap_state.c b/mm/swap_state.c index 558e224138d1..4052c011391d 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -360,12 +360,13 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask, struct vm_area_struct *vma, unsigned long addr, bool *new_page_allocated) { - struct page *found_page = NULL, *new_page = NULL; struct swap_info_struct *si; - int err; + struct page *page; + *new_page_allocated = false; - do { + for (;;) { + int err; /* * First check the swap cache. Since this is normally * called after lookup_swap_cache() failed, re-calling @@ -373,12 +374,12 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask, */ si = get_swap_device(entry); if (!si) - break; - found_page = find_get_page(swap_address_space(entry), - swp_offset(entry)); + return NULL; + page = find_get_page(swap_address_space(entry), + swp_offset(entry)); put_swap_device(si); - if (found_page) - break; + if (page) + return page; /* * Just skip read ahead for unused swap slot. @@ -389,21 +390,15 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask, * else swap_off will be aborted if we return NULL. */ if (!__swp_swapcount(entry) && swap_slot_cache_enabled) - break; - - /* - * Get a new page to read into from swap. - */ - if (!new_page) { - new_page = alloc_page_vma(gfp_mask, vma, addr); - if (!new_page) - break; /* Out of memory */ - } + return NULL; /* * Swap entry may have been freed since our caller observed it. 
*/ err = swapcache_prepare(entry); + if (!err) + break; + if (err == -EEXIST) { /* * We might race against get_swap_page() and stumble @@ -412,31 +407,43 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask, */ cond_resched(); continue; - } else if (err) /* swp entry is obsolete ? */ - break; - - /* May fail (-ENOMEM) if XArray node allocation failed. */ - __SetPageLocked(new_page); - __SetPageSwapBacked(new_page); - err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL); - if (likely(!err)) { - /* Initiate read into locked page */ - SetPageWorkingset(new_page); - lru_cache_add_anon(new_page); - *new_page_allocated = true; - return new_page; } - __ClearPageLocked(new_page); - /* - * add_to_swap_cache() doesn't return -EEXIST, so we can safely - * clear SWAP_HAS_CACHE flag. - */ - put_swap_page(new_page, entry); - } while (err != -ENOMEM); - if (new_page) - put_page(new_page); - return found_page; + return NULL; + } + + /* + * The swap entry is ours to swap in. Prepare a new page. + */ + + page = alloc_page_vma(gfp_mask, vma, addr); + if (!page) + goto fail_free; + + __SetPageLocked(page); + __SetPageSwapBacked(page); + + /* May fail (-ENOMEM) if XArray node allocation failed. */ + if (add_to_swap_cache(page, entry, gfp_mask & GFP_KERNEL)) + goto fail_unlock; + + if (mem_cgroup_charge(page, NULL, gfp_mask & GFP_KERNEL, false)) + goto fail_delete; + + /* Initiate read into locked page */ + SetPageWorkingset(page); + lru_cache_add_anon(page); + *new_page_allocated = true; + return page; + +fail_delete: + delete_from_swap_cache(page); +fail_unlock: + unlock_page(page); + put_page(page); +fail_free: + swap_free(entry); + return NULL; } /* diff --git a/mm/swapfile.c b/mm/swapfile.c index 8c9b6767013b..3bc7acc68ba8 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1867,11 +1867,6 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, if (unlikely(!page)) return -ENOMEM; - if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, true)) { - ret = -ENOMEM; - goto out_nolock; - } - pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); if (unlikely(!pte_same_as_swp(*pte, swp_entry_to_pte(entry)))) { ret = 0; @@ -1897,7 +1892,6 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, activate_page(page); out: pte_unmap_unlock(pte, ptl); -out_nolock: if (page != swapcache) { unlock_page(page); put_page(page); From patchwork Fri May 8 18:31:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537405 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A948915AB for ; Fri, 8 May 2020 18:33:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7617C2192A for ; Fri, 8 May 2020 18:33:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=cmpxchg-org.20150623.gappssmtp.com header.i=@cmpxchg-org.20150623.gappssmtp.com header.b="q2dyjdEb" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7617C2192A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cmpxchg.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1F872900012; Fri, 8 May 2020 14:32:45 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org 
From: Johannes Weiner To: Andrew Morton Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A. Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 17/19] mm: memcontrol: document the new swap control behavior Date: Fri, 8 May 2020 14:31:04 -0400 Message-Id: <20200508183105.225460-18-hannes@cmpxchg.org> In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> From: Alex Shi Signed-off-by: Alex Shi Signed-off-by: Johannes Weiner --- .../admin-guide/cgroup-v1/memory.rst | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst index 0ae4f564c2d6..12757e63b26c 100644 --- a/Documentation/admin-guide/cgroup-v1/memory.rst +++ b/Documentation/admin-guide/cgroup-v1/memory.rst @@ -199,11 +199,11 @@ An RSS page is unaccounted when it's fully unmapped. A PageCache page is unaccounted when it's removed from radix-tree. Even if RSS pages are fully unmapped (by kswapd), they may exist as SwapCache in the system until they are really freed. Such SwapCaches are also accounted. -A swapped-in page is not accounted until it's mapped. +A swapped-in page is accounted when it is added to the swapcache. Note: The kernel does swapin-readahead and reads multiple swaps at once. -This means swapped-in pages may contain pages for other tasks than a task -causing page fault. So, we avoid accounting at swap-in I/O. +Since the page's memcg is recorded into swap whether memsw is enabled or not, +the page will be accounted after swapin. At page migration, accounting information is kept. @@ -222,18 +222,13 @@ the cgroup that brought it in -- this will happen on memory pressure). But see section 8.2: when moving a task to another cgroup, its pages may be recharged to the new cgroup, if move_charge_at_immigrate has been chosen. -Exception: If CONFIG_MEMCG_SWAP is not used. -When you do swapoff and make swapped-out pages of shmem(tmpfs) to -be backed into memory in force, charges for pages are accounted against the -caller of swapoff rather than the users of shmem. -2.4 Swap Extension (CONFIG_MEMCG_SWAP) +2.4 Swap Extension -------------------------------------- -Swap Extension allows you to record charge for swap. A swapped-in page is -charged back to original page allocator if possible. +Swap usage is always recorded for each cgroup. Swap Extension allows you to +read and limit it. -When swap is accounted, following files are added. +When CONFIG_SWAP is enabled, following files are added. - memory.memsw.usage_in_bytes. - memory.memsw.limit_in_bytes.
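As a usage illustration of the memsw interface (a hypothetical userspace sketch, not part of the patch; the cgroup name "mygroup" and the v1 mount point are assumptions): memory.memsw.usage_in_bytes counts memory plus swap, so subtracting memory.usage_in_bytes approximates the group's swap footprint.

#include <stdio.h>

static long read_counter(const char *path)
{
	FILE *f = fopen(path, "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

int main(void)
{
	/* assumes a cgroup v1 hierarchy mounted at /sys/fs/cgroup/memory */
	long mem = read_counter("/sys/fs/cgroup/memory/mygroup/memory.usage_in_bytes");
	long memsw = read_counter("/sys/fs/cgroup/memory/mygroup/memory.memsw.usage_in_bytes");

	if (mem < 0 || memsw < 0)
		return 1;
	printf("approximate swap usage: %ld bytes\n", memsw - mem);
	return 0;
}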
From patchwork Fri May 8 18:31:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Weiner X-Patchwork-Id: 11537407
From: Johannes Weiner To: Andrew Morton Cc: Alex Shi , Joonsoo Kim , Shakeel Butt , Hugh Dickins , Michal Hocko , "Kirill A. Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 18/19] mm: memcontrol: delete unused lrucare handling Date: Fri, 8 May 2020 14:31:05 -0400 Message-Id: <20200508183105.225460-19-hannes@cmpxchg.org> In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> Swapin faults were the last event to charge pages after they had already been put on the LRU list. Now that we charge directly on swapin, the lrucare portion of the charge code is unused.
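With charging now always happening before the page reaches the LRU, commit_charge() reduces to a plain ownership assignment. A sketch of the post-patch form, mirroring the mm/memcontrol.c hunk in the diff below:

static void commit_charge(struct page *page, struct mem_cgroup *memcg)
{
	VM_BUG_ON_PAGE(page->mem_cgroup, page);
	/*
	 * Nobody should be changing or seriously looking at
	 * page->mem_cgroup at this point: the page is either freshly
	 * allocated, or locked in the swap/page cache.
	 */
	page->mem_cgroup = memcg;
}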
Signed-off-by: Johannes Weiner
Reviewed-by: Joonsoo Kim
---
 include/linux/memcontrol.h |  5 ++--
 kernel/events/uprobes.c    |  3 +-
 mm/filemap.c               |  2 +-
 mm/huge_memory.c           |  2 +-
 mm/khugepaged.c            |  4 +--
 mm/memcontrol.c            | 57 +++-----------------------------------
 mm/memory.c                |  8 +++---
 mm/migrate.c               |  2 +-
 mm/shmem.c                 |  2 +-
 mm/swap_state.c            |  2 +-
 mm/userfaultfd.c           |  2 +-
 11 files changed, 19 insertions(+), 70 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3fa70ca73c31..e7209f4ca938 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -369,8 +369,7 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 		page_counter_read(&memcg->memory);
 }
 
-int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask,
-		      bool lrucare);
+int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask);
 
 void mem_cgroup_uncharge(struct page *page);
 void mem_cgroup_uncharge_list(struct list_head *page_list);
@@ -860,7 +859,7 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 }
 
 static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm,
-				    gfp_t gfp_mask, bool lrucare)
+				    gfp_t gfp_mask)
 {
 	return 0;
 }
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 4253c153e985..eddc8db96027 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -167,8 +167,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 				addr + PAGE_SIZE);
 
 	if (new_page) {
-		err = mem_cgroup_charge(new_page, vma->vm_mm, GFP_KERNEL,
-					false);
+		err = mem_cgroup_charge(new_page, vma->vm_mm, GFP_KERNEL);
 		if (err)
 			return err;
 	}
diff --git a/mm/filemap.c b/mm/filemap.c
index fa47f160e1cc..792e22e1e3c0 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -845,7 +845,7 @@ static int __add_to_page_cache_locked(struct page *page,
 	page->index = offset;
 
 	if (!huge) {
-		error = mem_cgroup_charge(page, current->mm, gfp_mask, false);
+		error = mem_cgroup_charge(page, current->mm, gfp_mask);
 		if (error)
 			goto error;
 	}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d0f1e8cee93c..21e6687895e2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -593,7 +593,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 
 	VM_BUG_ON_PAGE(!PageCompound(page), page);
 
-	if (mem_cgroup_charge(page, vma->vm_mm, gfp, false)) {
+	if (mem_cgroup_charge(page, vma->vm_mm, gfp)) {
 		put_page(page);
 		count_vm_event(THP_FAULT_FALLBACK);
 		count_vm_event(THP_FAULT_FALLBACK_CHARGE);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 34731e7c9a67..fbb1030091ca 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1066,7 +1066,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 		goto out_nolock;
 	}
 
-	if (unlikely(mem_cgroup_charge(new_page, mm, gfp, false))) {
+	if (unlikely(mem_cgroup_charge(new_page, mm, gfp))) {
 		result = SCAN_CGROUP_CHARGE_FAIL;
 		goto out_nolock;
 	}
@@ -1623,7 +1623,7 @@ static void collapse_file(struct mm_struct *mm,
 		goto out;
 	}
 
-	if (unlikely(mem_cgroup_charge(new_page, mm, gfp, false))) {
+	if (unlikely(mem_cgroup_charge(new_page, mm, gfp))) {
 		result = SCAN_CGROUP_CHARGE_FAIL;
 		goto out;
 	}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4a003531af07..491fdeec0ce4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2601,51 +2601,9 @@ static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
 	css_put_many(&memcg->css, nr_pages);
 }
 
-static void lock_page_lru(struct page *page, int *isolated)
+static void commit_charge(struct page *page, struct mem_cgroup *memcg)
 {
-	pg_data_t *pgdat = page_pgdat(page);
-
-	spin_lock_irq(&pgdat->lru_lock);
-	if (PageLRU(page)) {
-		struct lruvec *lruvec;
-
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
-		ClearPageLRU(page);
-		del_page_from_lru_list(page, lruvec, page_lru(page));
-		*isolated = 1;
-	} else
-		*isolated = 0;
-}
-
-static void unlock_page_lru(struct page *page, int isolated)
-{
-	pg_data_t *pgdat = page_pgdat(page);
-
-	if (isolated) {
-		struct lruvec *lruvec;
-
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
-		VM_BUG_ON_PAGE(PageLRU(page), page);
-		SetPageLRU(page);
-		add_page_to_lru_list(page, lruvec, page_lru(page));
-	}
-	spin_unlock_irq(&pgdat->lru_lock);
-}
-
-static void commit_charge(struct page *page, struct mem_cgroup *memcg,
-			  bool lrucare)
-{
-	int isolated;
-
 	VM_BUG_ON_PAGE(page->mem_cgroup, page);
-
-	/*
-	 * In some cases, SwapCache and FUSE(splice_buf->radixtree), the page
-	 * may already be on some other mem_cgroup's LRU. Take care of it.
-	 */
-	if (lrucare)
-		lock_page_lru(page, &isolated);
-
 	/*
 	 * Nobody should be changing or seriously looking at
 	 * page->mem_cgroup at this point:
@@ -2661,9 +2619,6 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 	 * have the page locked
 	 */
 	page->mem_cgroup = memcg;
-
-	if (lrucare)
-		unlock_page_lru(page, isolated);
 }
 
 #ifdef CONFIG_MEMCG_KMEM
@@ -6434,22 +6389,18 @@ void mem_cgroup_calculate_protection(struct mem_cgroup *root,
  * @page: page to charge
  * @mm: mm context of the victim
  * @gfp_mask: reclaim mode
- * @lrucare: page might be on the LRU already
  *
  * Try to charge @page to the memcg that @mm belongs to, reclaiming
  * pages according to @gfp_mask if necessary.
  *
  * Returns 0 on success. Otherwise, an error code is returned.
  */
-int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask,
-		      bool lrucare)
+int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask)
 {
 	unsigned int nr_pages = hpage_nr_pages(page);
 	struct mem_cgroup *memcg = NULL;
 	int ret = 0;
 
-	VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page);
-
 	if (mem_cgroup_disabled())
 		goto out;
@@ -6483,7 +6434,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask,
 	if (ret)
 		goto out_put;
 
-	commit_charge(page, memcg, lrucare);
+	commit_charge(page, memcg);
 
 	local_irq_disable();
 	mem_cgroup_charge_statistics(memcg, page, nr_pages);
@@ -6684,7 +6635,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 		page_counter_charge(&memcg->memsw, nr_pages);
 	css_get_many(&memcg->css, nr_pages);
 
-	commit_charge(newpage, memcg, false);
+	commit_charge(newpage, memcg);
 
 	local_irq_save(flags);
 	mem_cgroup_charge_statistics(memcg, newpage, nr_pages);
diff --git a/mm/memory.c b/mm/memory.c
index 93900b121b6e..7f19a73db0f0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2675,7 +2675,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		}
 	}
 
-	if (mem_cgroup_charge(new_page, mm, GFP_KERNEL, false))
+	if (mem_cgroup_charge(new_page, mm, GFP_KERNEL))
 		goto oom_free_new;
 	cgroup_throttle_swaprate(new_page, GFP_KERNEL);
@@ -3134,7 +3134,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			/* Tell memcg to use swap ownership records */
 			SetPageSwapCache(page);
 			err = mem_cgroup_charge(page, vma->vm_mm,
-						GFP_KERNEL, false);
+						GFP_KERNEL);
 			ClearPageSwapCache(page);
 			if (err)
 				goto out_page;
@@ -3358,7 +3358,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (!page)
 		goto oom;
 
-	if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, false))
+	if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL))
 		goto oom_free_page;
 	cgroup_throttle_swaprate(page, GFP_KERNEL);
@@ -3854,7 +3854,7 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf)
 	if (!vmf->cow_page)
 		return VM_FAULT_OOM;
 
-	if (mem_cgroup_charge(vmf->cow_page, vma->vm_mm, GFP_KERNEL, false)) {
+	if (mem_cgroup_charge(vmf->cow_page, vma->vm_mm, GFP_KERNEL)) {
 		put_page(vmf->cow_page);
 		return VM_FAULT_OOM;
 	}
diff --git a/mm/migrate.c b/mm/migrate.c
index 2028f08e3e8d..5fed0305d2ec 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2792,7 +2792,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	if (unlikely(anon_vma_prepare(vma)))
 		goto abort;
 
-	if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, false))
+	if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL))
 		goto abort;
 
 	/*
diff --git a/mm/shmem.c b/mm/shmem.c
index 98547dc4642d..ccda43fd0328 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -624,7 +624,7 @@ static int shmem_add_to_page_cache(struct page *page,
 	page->index = index;
 
 	if (!PageSwapCache(page)) {
-		error = mem_cgroup_charge(page, charge_mm, gfp, false);
+		error = mem_cgroup_charge(page, charge_mm, gfp);
 		if (error) {
 			if (PageTransHuge(page)) {
 				count_vm_event(THP_FILE_FALLBACK);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 4052c011391d..3a66ed4e3574 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -427,7 +427,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	if (add_to_swap_cache(page, entry, gfp_mask & GFP_KERNEL))
 		goto fail_unlock;
 
-	if (mem_cgroup_charge(page, NULL, gfp_mask & GFP_KERNEL, false))
+	if (mem_cgroup_charge(page, NULL, gfp_mask & GFP_KERNEL))
 		goto fail_delete;
 
 	/* Initiate read into locked page */
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 2745489415cc..7f5194046b01 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -96,7 +96,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	__SetPageUptodate(page);
 
 	ret = -ENOMEM;
-	if (mem_cgroup_charge(page, dst_mm, GFP_KERNEL, false))
+	if (mem_cgroup_charge(page, dst_mm, GFP_KERNEL))
 		goto out_release;
 
 	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));

From patchwork Fri May 8 18:31:06 2020
Shutemov" , Roman Gushchin , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 19/19] mm: memcontrol: update page->mem_cgroup stability rules Date: Fri, 8 May 2020 14:31:06 -0400 Message-Id: <20200508183105.225460-20-hannes@cmpxchg.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200508183105.225460-1-hannes@cmpxchg.org> References: <20200508183105.225460-1-hannes@cmpxchg.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The previous patches have simplified the access rules around page->mem_cgroup somewhat: 1. We never change page->mem_cgroup while the page is isolated by somebody else. This was by far the biggest exception to our rules and it didn't stop at lock_page() or lock_page_memcg(). 2. We charge pages before they get put into page tables now, so the somewhat fishy rule about "can be in page table as long as it's still locked" is now gone and boiled down to having an exclusive reference to the page. Document the new rules. Any of the following will stabilize the page->mem_cgroup association: - the page lock - LRU isolation - lock_page_memcg() - exclusive access to the page Signed-off-by: Johannes Weiner Reviewed-by: Alex Shi Reviewed-by: Joonsoo Kim --- mm/memcontrol.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 491fdeec0ce4..865440e8438e 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1201,9 +1201,8 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg, * @page: the page * @pgdat: pgdat of the page * - * This function is only safe when following the LRU page isolation - * and putback protocol: the LRU lock must be held, and the page must - * either be PageLRU() or the caller must have isolated/allocated it. + * This function relies on page->mem_cgroup being stable - see the + * access rules in commit_charge(). */ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgdat) { @@ -2605,18 +2604,12 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg) { VM_BUG_ON_PAGE(page->mem_cgroup, page); /* - * Nobody should be changing or seriously looking at - * page->mem_cgroup at this point: - * - * - the page is uncharged - * - * - the page is off-LRU - * - * - an anonymous fault has exclusive page access, except for - * a locked page table + * Any of the following ensures page->mem_cgroup stability: * - * - a page cache insertion, a swapin fault, or a migration - * have the page locked + * - the page lock + * - LRU isolation + * - lock_page_memcg() + * - exclusive reference */ page->mem_cgroup = memcg; }