From patchwork Tue Jul 28 13:52:10 2020
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 11689209
From: Johannes Weiner
To: Andrew Morton
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH] mm: memcontrol: don't count limit-setting reclaim as
 memory pressure
Date: Tue, 28 Jul 2020 09:52:10 -0400
Message-Id: <20200728135210.379885-2-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.27.0

When an outside process lowers one of the
memory limits of a cgroup (or uses the force_empty knob in cgroup1),
direct reclaim is performed in the context of the write(), in order to
directly enforce the new limit and have it met by the time the write()
returns.

Currently, this reclaim activity is accounted as memory pressure in
the cgroup that the writer(!) belongs to. This is unexpected. It
specifically causes problems for senpai
(https://github.com/facebookincubator/senpai), which is an agent that
routinely adjusts the memory limits and performs associated reclaim
work in tens or even hundreds of cgroups running on the host. The
cgroup that senpai itself runs in will report elevated levels of
memory pressure, even though it is under no memory shortage or any
sort of distress.

Move the psi annotation from the central cgroup reclaim function to
callsites in the allocation context, and thereby no longer count any
limit-setting reclaim as memory pressure. If the newly set limit
forces the workload inside the cgroup into direct reclaim, that of
course will continue to count as memory pressure.

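One way to observe the misattribution described above is to sample the
"some" stall total from the writer's own pressure file before and after
a limit write. The following is only a userspace sketch, not part of
this patch: it assumes the documented PSI line format
("some avg10=... avg60=... avg300=... total=...") and a caller-supplied
path such as a cgroup2 memory.pressure file. The names psi_line,
psi_parse_line() and psi_read_some_total() are hypothetical helpers
introduced for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One line of a PSI pressure file (hypothetical helper type). */
struct psi_line {
	char kind[8];                   /* "some" or "full" */
	double avg10, avg60, avg300;    /* stall percentages */
	unsigned long long total;       /* cumulative stall time, in us */
};

/* Parse one pressure line; returns 0 on success, -1 on mismatch. */
static int psi_parse_line(const char *line, struct psi_line *out)
{
	int n;

	assert(line && out);
	n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		   out->kind, &out->avg10, &out->avg60, &out->avg300,
		   &out->total);
	return n == 5 ? 0 : -1;
}

/*
 * Read the "some" total from the pressure file at @path.
 * Returns 0 on success, -1 if the file cannot be opened or has no
 * parsable "some" line.
 */
static int psi_read_some_total(const char *path, unsigned long long *total)
{
	char buf[256];
	struct psi_line pl;
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;

	while (fgets(buf, sizeof(buf), f)) {
		if (psi_parse_line(buf, &pl) == 0 &&
		    strcmp(pl.kind, "some") == 0) {
			*total = pl.total;
			ret = 0;
			break;
		}
	}

	fclose(f);
	return ret;
}
```

Sampling psi_read_some_total() on the writer's cgroup around a limit
write would be expected to show the total growing before this change
and staying flat afterwards, provided the writer has no memory stalls
of its own during the interval.
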
Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Chris Down
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 12 +++++++++++-
 mm/vmscan.c     |  6 ------
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 805a44bf948c..8377640ad494 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2233,11 +2233,18 @@ static void reclaim_high(struct mem_cgroup *memcg,
 			 gfp_t gfp_mask)
 {
 	do {
+		unsigned long pflags;
+
 		if (page_counter_read(&memcg->memory) <=
 		    READ_ONCE(memcg->memory.high))
 			continue;
+
 		memcg_memory_event(memcg, MEMCG_HIGH);
+
+		psi_memstall_enter(&pflags);
 		try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
+		psi_memstall_leave(&pflags);
+
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
 		 !mem_cgroup_is_root(memcg));
 }
@@ -2451,10 +2458,11 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
 	struct mem_cgroup *mem_over_limit;
 	struct page_counter *counter;
+	enum oom_status oom_status;
 	unsigned long nr_reclaimed;
 	bool may_swap = true;
 	bool drained = false;
-	enum oom_status oom_status;
+	unsigned long pflags;
 
 	if (mem_cgroup_is_root(memcg))
 		return 0;
@@ -2514,8 +2522,10 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 
 	memcg_memory_event(mem_over_limit, MEMCG_MAX);
 
+	psi_memstall_enter(&pflags);
 	nr_reclaimed = try_to_free_mem_cgroup_pages(mem_over_limit, nr_pages,
 						    gfp_mask, may_swap);
+	psi_memstall_leave(&pflags);
 
 	if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
 		goto retry;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 749d239c62b2..742538543c79 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3318,7 +3318,6 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 					   bool may_swap)
 {
 	unsigned long nr_reclaimed;
-	unsigned long pflags;
 	unsigned int noreclaim_flag;
 	struct scan_control sc = {
 		.nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX),
@@ -3339,17 +3338,12 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-
 	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask);
-
-	psi_memstall_enter(&pflags);
 	noreclaim_flag = memalloc_noreclaim_save();
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
 	memalloc_noreclaim_restore(noreclaim_flag);
-	psi_memstall_leave(&pflags);
-
 	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed);
 	set_task_reclaim_state(current, NULL);
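
The effect of moving the annotation can be illustrated with a small
userspace model. This is a sketch under stated assumptions, not kernel
code: every toy_* name below is hypothetical, toy_reclaim() stands in
for try_to_free_mem_cgroup_pages(), toy_charge_path() for the
allocation-side callers (try_charge(), reclaim_high()), and
toy_set_limit_path() for a write() that lowers a limit. Only the
placement of the enter/leave pair mirrors the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-task stall state; not a kernel type. */
struct toy_task {
	bool in_memstall;               /* currently inside a stall section */
	unsigned int stalls_recorded;   /* stall sections entered so far */
};

static void toy_memstall_enter(struct toy_task *t)
{
	assert(!t->in_memstall);
	t->in_memstall = true;
	t->stalls_recorded++;
}

static void toy_memstall_leave(struct toy_task *t)
{
	assert(t->in_memstall);
	t->in_memstall = false;
}

/*
 * Central reclaim function. As in the patched kernel function, it does
 * not touch stall accounting; it only reports how much was "freed".
 */
static unsigned long toy_reclaim(unsigned long usage, unsigned long target)
{
	return usage > target ? usage - target : 0;
}

/* Allocation context: the caller wraps reclaim in a stall section. */
static unsigned long toy_charge_path(struct toy_task *t,
				     unsigned long usage,
				     unsigned long limit)
{
	unsigned long freed;

	toy_memstall_enter(t);
	freed = toy_reclaim(usage, limit);
	toy_memstall_leave(t);

	return freed;
}

/* Limit-setting context: same reclaim work, no stall recorded. */
static unsigned long toy_set_limit_path(struct toy_task *t,
					unsigned long usage,
					unsigned long new_limit)
{
	(void)t;	/* the writer's stall state is left untouched */
	return toy_reclaim(usage, new_limit);
}
```

In this model a task that lowers a limit performs the same amount of
reclaim as a task that hits the limit while allocating, but only the
latter ends up with a recorded stall, which is the behavior the commit
message describes.
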