From patchwork Tue Jan 15 10:17:27 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tetsuo Handa
X-Patchwork-Id: 10764235
Subject: [PATCH v2] memcg: killed threads should not invoke memcg OOM killer
From: Tetsuo Handa
To: Andrew Morton, Johannes Weiner, David Rientjes
Cc: Michal Hocko, linux-mm@kvack.org, Kirill Tkhai, Linus Torvalds
References: <1545819215-10892-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
 <20190107114139.GF31793@dhcp22.suse.cz>
 <20190107133720.GH31793@dhcp22.suse.cz>
 <935ae77c-9663-c3a4-c73a-fa69f9a3065f@i-love.sakura.ne.jp>
Message-ID: <01370f70-e1f6-ebe4-b95e-0df21a0bc15e@i-love.sakura.ne.jp>
Date: Tue, 15 Jan 2019 19:17:27 +0900
In-Reply-To: <935ae77c-9663-c3a4-c73a-fa69f9a3065f@i-love.sakura.ne.jp>

From: Tetsuo Handa

If $N > $M, a single process with $N threads in a memcg group can easily
kill all $M processes in that memcg group, because
mem_cgroup_out_of_memory() does not check whether the current thread
needs to invoke the memcg OOM killer.

  T1@P1     |T2...$N@P1|P2...$M   |OOM reaper
  ----------+----------+----------+----------
                        # all sleeping
  try_charge()
    mem_cgroup_out_of_memory()
      mutex_lock(oom_lock)
             try_charge()
               mem_cgroup_out_of_memory()
                 mutex_lock(oom_lock)
      out_of_memory()
        select_bad_process()
        oom_kill_process(P1)
        wake_oom_reaper()
                                   oom_reap_task() # ignores P1
      mutex_unlock(oom_lock)
                 out_of_memory()
                   select_bad_process(P2...$M)
                        # all killed by T2...$N@P1
                   wake_oom_reaper()
                                   oom_reap_task() # ignores P2...$M
                 mutex_unlock(oom_lock)

We don't need to invoke the memcg OOM killer if the current thread was
killed while waiting for oom_lock. When mem_cgroup_oom_synchronize(true)
cannot make forward progress, it can count on try_charge(), because
try_charge() allows already killed/exiting threads to make forward
progress, and memory_max_write() can bail out upon signals.

At first Michal thought that the fatal signal check is racy compared to
the tsk_is_oom_victim() check. But an experiment showed that trying to
call mark_oom_victim() on all killed thread groups is more racy than the
fatal signal check, due to the task_will_free_mem(current) path in
out_of_memory().

Therefore, this patch changes mem_cgroup_out_of_memory() to bail out when
should_force_charge() is true rather than when fatal_signal_pending() is
true. The combination should_force_charge() == true &&
signal_pending(current) == false cannot happen at memory_max_write(),
because the current thread won't call memory_max_write() after getting
PF_EXITING.

Signed-off-by: Tetsuo Handa
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index af7f18b..79a7d2a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -248,6 +248,12 @@ enum res_type {
 	     iter != NULL;				\
 	     iter = mem_cgroup_iter(NULL, iter, NULL))
 
+static inline bool should_force_charge(void)
+{
+	return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
+		(current->flags & PF_EXITING);
+}
+
 /* Some nice accessors for the vmpressure. */
 struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg)
 {
@@ -1389,8 +1395,13 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	};
 	bool ret;
 
-	mutex_lock(&oom_lock);
-	ret = out_of_memory(&oc);
+	if (mutex_lock_killable(&oom_lock))
+		return true;
+	/*
+	 * A few threads which were not waiting at mutex_lock_killable() can
+	 * fail to bail out. Therefore, check again after holding oom_lock.
+	 */
+	ret = should_force_charge() || out_of_memory(&oc);
 	mutex_unlock(&oom_lock);
 	return ret;
 }
@@ -2209,9 +2220,7 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	 * bypass the last charges so that they can exit quickly and
 	 * free their memory.
 	 */
-	if (unlikely(tsk_is_oom_victim(current) ||
-		     fatal_signal_pending(current) ||
-		     current->flags & PF_EXITING))
+	if (unlikely(should_force_charge()))
 		goto force;
 
 	/*