From patchwork Thu May 7 16:33:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 11534411
Date: Thu, 7 May 2020 09:33:01 -0700
Message-Id: <20200507163301.229070-1-shakeelb@google.com>
X-Mailer: git-send-email 2.26.2.526.g744177e7f7-goog
Subject: [PATCH] memcg: effective memory.high reclaim for remote charging
From: Shakeel Butt
To: Johannes Weiner, Roman Gushchin, Michal Hocko
Cc: Greg Thelen, Andrew Morton, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Shakeel Butt
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Currently the reclaim of excessive usage over memory.high is scheduled
to run on returning to userland. The main reason behind this approach
was simplicity, i.e. always reclaim in GFP_KERNEL context. However, the
approach rests on two assumptions: the current task shares the memcg
hierarchy with the given memcg, and the memcg of the current task most
probably will not change before the return to userland.

With remote charging, the first assumption breaks, which allows the
usage to grow well beyond memory.high because the reclaim and the
throttling become ineffective.

This patch forces synchronous reclaim, and potentially throttling, for
callers whose context allows blocking.
For callers that cannot block, or whose synchronous high reclaim still
does not bring the usage back under the limit, a high reclaim is
scheduled: on return to userland if the current task shares the
hierarchy with the given memcg, otherwise on the system work queue.

Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 63 +++++++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 26 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 317dbbaac603..7abb762f26cd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2387,23 +2387,13 @@ static unsigned long calculate_high_delay(struct mem_cgroup *memcg,
 	return min(penalty_jiffies, MEMCG_MAX_HIGH_DELAY_JIFFIES);
 }
 
-/*
- * Scheduled by try_charge() to be executed from the userland return path
- * and reclaims memory over the high limit.
- */
-void mem_cgroup_handle_over_high(void)
+static void reclaim_over_high(struct mem_cgroup *memcg, gfp_t gfp_mask,
+			      unsigned long nr_pages)
 {
 	unsigned long penalty_jiffies;
 	unsigned long pflags;
-	unsigned int nr_pages = current->memcg_nr_pages_over_high;
-	struct mem_cgroup *memcg;
 
-	if (likely(!nr_pages))
-		return;
-
-	memcg = get_mem_cgroup_from_mm(current->mm);
-	reclaim_high(memcg, nr_pages, GFP_KERNEL);
-	current->memcg_nr_pages_over_high = 0;
+	reclaim_high(memcg, nr_pages, gfp_mask);
 
 	/*
 	 * memory.high is breached and reclaim is unable to keep up. Throttle
@@ -2418,7 +2408,7 @@ void mem_cgroup_handle_over_high(void)
 	 * been aggressively reclaimed enough yet.
 	 */
 	if (penalty_jiffies <= HZ / 100)
-		goto out;
+		return;
 
 	/*
 	 * If we exit early, we're guaranteed to die (since
@@ -2428,8 +2418,23 @@ void mem_cgroup_handle_over_high(void)
 	psi_memstall_enter(&pflags);
 	schedule_timeout_killable(penalty_jiffies);
 	psi_memstall_leave(&pflags);
+}
 
-out:
+/*
+ * Scheduled by try_charge() to be executed from the userland return path
+ * and reclaims memory over the high limit.
+ */
+void mem_cgroup_handle_over_high(void)
+{
+	unsigned int nr_pages = current->memcg_nr_pages_over_high;
+	struct mem_cgroup *memcg;
+
+	if (likely(!nr_pages))
+		return;
+
+	memcg = get_mem_cgroup_from_mm(current->mm);
+	reclaim_over_high(memcg, GFP_KERNEL, nr_pages);
+	current->memcg_nr_pages_over_high = 0;
 	css_put(&memcg->css);
 }
 
@@ -2584,15 +2589,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (batch > nr_pages)
 		refill_stock(memcg, batch - nr_pages);
 
-	/*
-	 * If the hierarchy is above the normal consumption range, schedule
-	 * reclaim on returning to userland. We can perform reclaim here
-	 * if __GFP_RECLAIM but let's always punt for simplicity and so that
-	 * GFP_KERNEL can consistently be used during reclaim. @memcg is
-	 * not recorded as it most likely matches current's and won't
-	 * change in the meantime. As high limit is checked again before
-	 * reclaim, the cost of mismatch is negligible.
-	 */
 	do {
 		if (page_counter_read(&memcg->memory) > READ_ONCE(memcg->high)) {
 			/* Don't bother a random interrupted task */
@@ -2600,8 +2596,23 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 				schedule_work(&memcg->high_work);
 				break;
 			}
-			current->memcg_nr_pages_over_high += batch;
-			set_notify_resume(current);
+
+			if (gfpflags_allow_blocking(gfp_mask))
+				reclaim_over_high(memcg, gfp_mask, batch);
+
+			if (page_counter_read(&memcg->memory) <=
+			    READ_ONCE(memcg->high))
+				break;
+			/*
+			 * The above reclaim might not be able to do much. Punt
+			 * the high reclaim to return to userland if the current
+			 * task shares the hierarchy.
+			 */
+			if (current->mm && mm_match_cgroup(current->mm, memcg)) {
+				current->memcg_nr_pages_over_high += batch;
+				set_notify_resume(current);
+			} else
+				schedule_work(&memcg->high_work);
 			break;
 		}
 	} while ((memcg = parent_mem_cgroup(memcg)));
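
Reviewer note, not part of the patch: the decision order that the new
try_charge() loop follows for one over-limit memcg can be sketched in
user space as below. This is only an illustration under stated
assumptions. The names high_action, charge_ctx and over_high_action are
hypothetical and do not exist in the kernel; the boolean fields stand in
for in_interrupt(), gfpflags_allow_blocking(gfp_mask) and
current->mm && mm_match_cgroup(current->mm, memcg); reclaim is modelled
as simply subtracting a fixed number of pages, and throttling and the
walk up to parent memcgs are left out.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not kernel identifiers. */
enum high_action {
	HIGH_NONE,         /* usage is at or below high: nothing to do */
	HIGH_SYNC_ONLY,    /* synchronous reclaim brought usage back under high */
	HIGH_DEFER_RESUME, /* punt to the return-to-userland path */
	HIGH_DEFER_WORK    /* punt to the system work queue */
};

/* Hypothetical stand-in for the state try_charge() looks at. */
struct charge_ctx {
	unsigned long usage;       /* pages currently charged to the memcg */
	unsigned long high;        /* memory.high, in pages */
	bool in_interrupt;         /* models in_interrupt() */
	bool can_block;            /* models gfpflags_allow_blocking(gfp_mask) */
	bool task_in_hierarchy;    /* models current->mm && mm_match_cgroup() */
	unsigned long reclaimable; /* pages one synchronous pass would free */
};

static enum high_action over_high_action(struct charge_ctx *c)
{
	if (c->usage <= c->high)
		return HIGH_NONE;

	/* Don't bother a random interrupted task: go straight to the worker. */
	if (c->in_interrupt)
		return HIGH_DEFER_WORK;

	/* Blocking callers reclaim synchronously, in their own gfp context. */
	if (c->can_block) {
		unsigned long freed = c->reclaimable < c->usage ?
				      c->reclaimable : c->usage;
		c->usage -= freed;
	}

	/* Re-check the limit; stop here if the reclaim was enough. */
	if (c->usage <= c->high)
		return HIGH_SYNC_ONLY;

	/*
	 * Still over the limit (or the caller could not block). Defer to the
	 * return-to-userland path only when the charging task belongs to the
	 * hierarchy; a remote charger hands the work to the work queue.
	 */
	return c->task_in_hierarchy ? HIGH_DEFER_RESUME : HIGH_DEFER_WORK;
}
```

In this model a remote charge corresponds to task_in_hierarchy being
false: before the patch such a charge would have queued the reclaim on
the charging task's own return path, whereas here it falls through to
the work queue once the synchronous attempt has been made.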