From patchwork Sat Jul 11 03:18:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11657651
From: Yafang Shao
To: mhocko@kernel.org, rientjes@google.com, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, Yafang Shao
Subject: [PATCH] mm, oom: don't invoke oom killer if current has been reapered
Date: Fri, 10 Jul 2020 23:18:01 -0400
Message-Id: <1594437481-11144-1-git-send-email-laoar.shao@gmail.com>
X-Mailer: git-send-email 1.8.3.1

If current's MMF_OOM_SKIP is set, it means that current is exiting or
dying and is likely to release its address space, so we don't need to
invoke the oom killer again. Otherwise that may cause some unexpected
issues; for example, below is an issue found in our production
environment.

There are many threads of a multi-threaded task running in parallel in a
container on many CPUs. Then many threads triggered OOM at the same time:

CPU-1               CPU-2           ...     CPU-n
thread-1            thread-2        ...     thread-n

wait oom_lock       wait oom_lock   ...     hold oom_lock

                                            (sigkill received)

                                            select current as victim
                                            and wakeup oom reaper

                                            release oom_lock

                                            (MMF_OOM_SKIP set by oom reaper)

                                            (lots of pages are freed)
hold oom_lock

because MMF_OOM_SKIP
is set, kill others

The thread running on CPU-n received SIGKILL, so it selects current as
the victim and wakes up the oom reaper. The oom reaper then reaps its
RSS and frees lots of pages, so there are many free pages available.
Although the multi-threaded task is exiting, the other threads will
continue to kill other tasks because of the MMF_OOM_SKIP check in
task_will_free_mem().

Signed-off-by: Yafang Shao
---
 mm/oom_kill.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 6e94962..a8a155a 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -825,13 +825,6 @@ static bool task_will_free_mem(struct task_struct *task)
 	if (!__task_will_free_mem(task))
 		return false;
 
-	/*
-	 * This task has already been drained by the oom reaper so there are
-	 * only small chances it will free some more
-	 */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags))
-		return false;
-
 	if (atomic_read(&mm->mm_users) <= 1)
 		return true;
 
@@ -963,7 +956,8 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	 * so it can die quickly
 	 */
 	task_lock(victim);
-	if (task_will_free_mem(victim)) {
+	if (!test_bit(MMF_OOM_SKIP, &victim->mm->flags) &&
+	    task_will_free_mem(victim)) {
 		mark_oom_victim(victim);
 		wake_oom_reaper(victim);
 		task_unlock(victim);
@@ -1056,6 +1050,10 @@ bool out_of_memory(struct oom_control *oc)
 			return true;
 	}
 
+	/* current has already been reaped */
+	if (test_bit(MMF_OOM_SKIP, &current->mm->flags))
+		return true;
+
 	/*
 	 * If current has a pending SIGKILL or is exiting, then automatically
 	 * select it.  The goal is to allow it to allocate so that it may