From patchwork Wed Oct 17 10:06:22 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tetsuo Handa
X-Patchwork-Id: 10645213
From: Tetsuo Handa
To: Michal Hocko
Cc: Johannes Weiner, linux-mm@kvack.org, syzkaller-bugs@googlegroups.com,
 guro@fb.com, kirill.shutemov@linux.intel.com,
 linux-kernel@vger.kernel.org, rientjes@google.com,
 yang.s@alibaba-inc.com, Andrew Morton, Sergey Senozhatsky, Petr Mladek,
 Sergey Senozhatsky, Steven Rostedt, Tetsuo Handa, Michal Hocko, syzbot
Subject: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.
Date: Wed, 17 Oct 2018 19:06:22 +0900
Message-Id: <1539770782-3343-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
X-Mailer: git-send-email 1.8.3.1
Sender: owner-linux-mm@kvack.org

syzbot is hitting an RCU stall in shmem_fault() [1].
This is because memcg-OOM events with no eligible task (the current
thread is marked as OOM-unkillable) keep calling dump_header() from
out_of_memory(), which was enabled by commit 3100dab2aa09dc6e ("mm:
memcontrol: print proper OOM header when no eligible victim left.").

Michal proposed ratelimiting dump_header() [2]. But I don't think that
patch is appropriate, because it does not ratelimit the

  "%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n"
  "Out of memory and no killable processes...\n"

messages, which can be printed every few milliseconds (i.e. effectively
a denial of service for console users) until the OOM situation is
solved.

Let's make sure that the next dump_header() waits for at least 60
seconds after the previous "Out of memory and no killable processes..."
message. Michal thinks that any interval is meaningless without knowing
the printk() throughput.
But since printk() is synchronous unless handed over to somebody else
by commit dbdda842fe96f893 ("printk: Add console owner and waiter logic
to load balance console writes"), it is likely that all OOM messages
from this out_of_memory() request are already flushed to the consoles
by the time pr_warn("Out of memory and no killable processes...\n")
returns. Thus, console users will be able to do what they need to do.

To summarize, this patch allows threads in the requested memcg to
complete the memory allocation requests needed for a recovery
operation, and also allows administrators to do a recovery operation
manually from the console if the OOM-unkillable thread fails to solve
the OOM situation automatically.

[1] https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
[2] https://lkml.kernel.org/r/20181010151135.25766-1-mhocko@kernel.org

Signed-off-by: Tetsuo Handa
Reported-by: syzbot
Cc: Johannes Weiner
Cc: Michal Hocko
---
 mm/oom_kill.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa53..9056f9b 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1106,6 +1106,11 @@ bool out_of_memory(struct oom_control *oc)
 	select_bad_process(oc);
 	/* Found nothing?!?! */
 	if (!oc->chosen) {
+		static unsigned long last_warned;
+
+		if ((is_sysrq_oom(oc) || is_memcg_oom(oc)) &&
+		    time_in_range(jiffies, last_warned, last_warned + 60 * HZ))
+			return false;
 		dump_header(oc, NULL);
 		pr_warn("Out of memory and no killable processes...\n");
 		/*
@@ -1115,6 +1120,7 @@ bool out_of_memory(struct oom_control *oc)
 		 */
 		if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
 			panic("System is deadlocked on memory\n");
+		last_warned = jiffies;
 	}
 	if (oc->chosen && oc->chosen != (void *)-1UL)
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
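The window check added to out_of_memory() can be tried outside the
kernel. The following is a minimal userspace sketch, not kernel code:
time_after_eq() and time_in_range() are re-implemented here to mirror
the wrap-safe comparisons in include/linux/jiffies.h, the HZ value is
an assumption, and suppress_report() is a hypothetical helper name that
does not appear in the patch.

```c
#include <stdbool.h>

/* Assumed tick rate, for this sketch only; the real HZ is a kernel config value. */
#define HZ 250UL

/*
 * Userspace copy of the kernel's wrap-safe comparison: true if a is at or
 * after b, even if the counter wrapped between b and a.
 */
static inline bool time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* True if a lies in the closed interval [b, c], wrap-safe like the kernel macro. */
static inline bool time_in_range(unsigned long a, unsigned long b,
				 unsigned long c)
{
	return time_after_eq(a, b) && time_after_eq(c, a);
}

/*
 * Hypothetical helper: should a report at time 'now' be suppressed, given
 * that the previous report was completed at 'last_warned'?
 */
static inline bool suppress_report(unsigned long now, unsigned long last_warned)
{
	return time_in_range(now, last_warned, last_warned + 60 * HZ);
}
```

With last_warned = 1000, a call at 1000 + 60 * HZ is still suppressed
because time_in_range() is inclusive at both ends; the first call that
reports again is at 1000 + 60 * HZ + 1. The unsigned subtraction keeps
the check correct when the counter wraps inside the window.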