From patchwork Fri Jul 20 08:41:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Rientjes X-Patchwork-Id: 10536225 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1EA506053F for ; Fri, 20 Jul 2018 08:41:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 19D8D283DA for ; Fri, 20 Jul 2018 08:41:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 08ADB29488; Fri, 20 Jul 2018 08:41:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.5 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE, USER_IN_DEF_DKIM_WL autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 66C962947C for ; Fri, 20 Jul 2018 08:41:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7BEB16B0269; Fri, 20 Jul 2018 04:41:47 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 76EA26B026A; Fri, 20 Jul 2018 04:41:47 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 65DE76B026B; Fri, 20 Jul 2018 04:41:47 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pg1-f199.google.com (mail-pg1-f199.google.com [209.85.215.199]) by kanga.kvack.org (Postfix) with ESMTP id 23DEA6B0269 for ; Fri, 20 Jul 2018 04:41:47 -0400 (EDT) Received: by mail-pg1-f199.google.com with SMTP id e19-v6so5488886pgv.11 for ; Fri, 20 Jul 2018 01:41:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:date:from:to:cc:subject :in-reply-to:message-id:references:user-agent:mime-version; bh=V2lrFYAtFoiUWYRe+OnkDf7z+E01OJeevSebTba9s2M=; b=rdd4ORiNYyeJPxi70513d4z7NQjTBZJ6vTmyJvHZyjsQb9NLhUFPKk28WmtajjSn+N pfDK6pkkXv45+IQ4cWV1R8Hqc8kgDOrmS6V2ryEtFAZmLUIq5zRmNvaChoGgI+qOIk43 EV0eu3xfA1a98QnqhCt5JikKaqZD8x8opifY0YpW/sojrnxbABDjhNA6izNlfRPGG2O2 TcalS5vbzOc0GjcylRmrc8R+BfuSv/5TpVGgU93pvHrSaxBdgeV5tPUJEN2Tbj/tcSXr woz2j02gMHhfKfvEYIFvFq3BfoKUd5KXX89iPNSJWQ53KE9JBgVYwmOWo2BPQ2XcrM/b 9jhg== X-Gm-Message-State: AOUpUlE5pdDGeVKVOKdeXk/p82G5eZIyuiZdqcmC+vwttsxsTshgU/Ym 9BscucWfa4/9ZiU824WUOAEKRi99etJimE7bQ08Z4Fm/Lnmz3MsIPhhJEPzlFW2QNC38m4JlD3G 0zEhbEmz4Pd9NAubxkptOSzDJDUF6Hobw28Dhn6CG2mZ3Yesah9/5IbgAacGEhXJJsKIqPLtfbk aBmBz1ZAVnQ+AXMtM2Zae7AmIVCy2U33rcm4vzf6GE7GeCpRh8FKc5+9lMpmASGbsHcv2slOCCA sKS2WGSC4O0UjnooNs+N2u9EoVPkQA0BXsdUZZfKRqKSMqAy7ZhIfWpaG2mObINM0MzVaW7akvE sA2FV4zf32Q79QgcT7P0GvmbgXRDi19GENQiHlGgbA1sFTHjIIreCt0EkLg3IaUmN5XbX1v3Lry C X-Received: by 2002:a62:e0d5:: with SMTP id d82-v6mr1257097pfm.59.1532076106826; Fri, 20 Jul 2018 01:41:46 -0700 (PDT) X-Received: by 2002:a62:e0d5:: with SMTP id d82-v6mr1257049pfm.59.1532076105856; Fri, 20 Jul 2018 01:41:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1532076105; cv=none; d=google.com; s=arc-20160816; b=FeiynJ7eEbca9ZAT9kEoaIkG+eq5F6bxzs2HzpnrDDO9r4l8UA603JiacS1HHLsU86 FKkir813ooSnpyEiL0746tzAljYop64UUAUi47Rs32XZm3sf3N0lWx9eNAkTedy6yxiP fRazfMKHWUAi1Xg15UU+l67o42AqpaWLQDi7Iwx4L9Cbc0xjRbb/ulbys1PsK2LYZLYf 7T2NchqpXViMqYj9v6ZnnR0revQfhd24lcctu+wipTSIjAP22U9EYf5HZSjAEjDnmiVe YL5BJZLG0aejt94i5t1T7vOaZwuEs7DUiFEXCWMk++n++eCqOOiAymRskRwaXlNkvjnU sq6w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:user-agent:references:message-id:in-reply-to:subject :cc:to:from:date:dkim-signature:arc-authentication-results; bh=V2lrFYAtFoiUWYRe+OnkDf7z+E01OJeevSebTba9s2M=; b=LgkDBgPDqBOCvFI2+1RywNAj/jDHC8xtHgCzN9iuyps/EelG2l5gnglDAh9uUiHTH2 bs+kpJiDaHV2UlNMPSauIl3vuXLhGRgrZ+HcB5juM2UmMGnGgYhJ/Qb/l9DL6d3ItDS5 QZSniUit6zpvN76lJ6iWiQKzvuJq3BYDcc2zt6ZJG4jJCTSLTrftZaDoSp0cfYZhAz0Q maeaj1oKHFvhWkIDmDaGM3UJgiB25L7kLlLJMRN2WxjmsMCt+BCtmcnQ4RU3VRvopmcL uxnAMWRqokuk1oOuOnVouzTUSJSYrynpDjjZwvWisI9mt3TW3Xa/5w/oWx5/AV77P3or g2PQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=sg+I2oJr; spf=pass (google.com: domain of rientjes@google.com designates 209.85.220.65 as permitted sender) smtp.mailfrom=rientjes@google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65]) by mx.google.com with SMTPS id g2-v6sor368684pgv.100.2018.07.20.01.41.45 for (Google Transport Security); Fri, 20 Jul 2018 01:41:45 -0700 (PDT) Received-SPF: pass (google.com: domain of rientjes@google.com designates 209.85.220.65 as permitted sender) client-ip=209.85.220.65; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=sg+I2oJr; spf=pass (google.com: domain of rientjes@google.com designates 209.85.220.65 as permitted sender) smtp.mailfrom=rientjes@google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=V2lrFYAtFoiUWYRe+OnkDf7z+E01OJeevSebTba9s2M=; b=sg+I2oJrhC81NAd8Ffrw2dRxXlfBoH0ygMqubfQMVANaQWKDewXmZ67+cdh7sb/OCi 21EMi5UDgfiCGmju6paTBwceFPM4VDqsdIuTvVYJvxCGcDBhlbF+IiqXOOHqV7hix1pM 6OMWCqFoXFYwnsXK1ffoP6xPAqo+EbkqssNwUsJJ2hKNT6uUwxddpuXtuNcMpedDfOaD xkAoE4TwH8m7KylqWZ0Yif+g+LfqB5dm//f7eRAkH7H6SiwNcFwjtMd3f6IpnHgcU2X5 Gsl1PHC/LdYaDJuD99AqGyQUZbyRfvrHbTcx8AACqQFZa6o9Ly0CBreqXlyC9spoP7Ej R10g== X-Google-Smtp-Source: AAOMgpdksQtfSKNF4NGqsG4HQOA02hX1KAnApPCVG/w1EUE/beyqnogrPyDSz7hPDCqPH3NUt24Ajw== X-Received: by 2002:a63:af14:: with SMTP id w20-v6mr1222606pge.47.1532076105330; Fri, 20 Jul 2018 01:41:45 -0700 (PDT) Received: from [2620:15c:17:3:3a5:23a7:5e32:4598] ([2620:15c:17:3:3a5:23a7:5e32:4598]) by smtp.gmail.com with ESMTPSA id o27-v6sm3698279pfj.35.2018.07.20.01.41.44 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 20 Jul 2018 01:41:44 -0700 (PDT) Date: Fri, 20 Jul 2018 01:41:43 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Tetsuo Handa cc: Andrew Morton , kbuild test robot , Michal Hocko , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [patch v3] mm, oom: fix unnecessary killing of additional processes In-Reply-To: Message-ID: References: User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP On Thu, 19 Jul 2018, Tetsuo Handa wrote: > Sigh... > > Nacked-by: Tetsuo Handa > > because David is not aware what is wrong. > Hmm, didn't you incorporate this exact patch into your own patch series that you proposed? :) I'm coming to this stark realization that all of these theater is only for effect. Perhaps other observers have come to that understanding earlier and I was late to the party. You're nacking a patch because it does a double set_bit() and jiffies can wraparound and we can add a process to the oom reaper list twice if the oom happens at the exact same moment. Ok. These are extremely trivial to fix. > Let's call "A" as a thread doing exit_mmap(), and "B" as the OOM reaper kernel thread. > > (1) "A" finds that unlikely(mm_is_oom_victim(mm)) == true. > (2) "B" finds that test_bit(MMF_OOM_SKIP, &mm->flags) in oom_reap_task() is false. > (3) "B" finds that !test_bit(MMF_UNSTABLE, &mm->flags) in oom_reap_task() is true. > (4) "B" enters into oom_reap_task_mm(tsk, mm). > (5) "B" finds that !down_read_trylock(&mm->mmap_sem) is false. > (6) "B" finds that mm_has_blockable_invalidate_notifiers(mm) is false. > (7) "B" finds that test_bit(MMF_UNSTABLE, &mm->flags) is false. > (8) "B" enters into __oom_reap_task_mm(mm). > (9) "A" finds that test_and_set_bit(MMF_UNSTABLE, &mm->flags) is false. > (10) "A" is preempted by somebody else. > (11) "B" finds that test_and_set_bit(MMF_UNSTABLE, &mm->flags) is true. > (12) "B" leaves __oom_reap_task_mm(mm). > (13) "B" leaves oom_reap_task_mm(). > (14) "B" finds that time_after_eq(jiffies, mm->oom_free_expire) became true. > (15) "B" finds that !test_bit(MMF_OOM_SKIP, &mm->flags) is true. > (16) "B" calls set_bit(MMF_OOM_SKIP, &mm->flags). > (17) "B" finds that test_bit(MMF_OOM_SKIP, &mm->flags) is true. > (18) select_bad_process() finds that MMF_OOM_SKIP is already set. > (19) out_of_memory() kills a new OOM victim. > (20) "A" resumes execution and start reclaiming memory. > > because oom_lock serialization was already removed. > Absent oom_lock serialization, this is exactly working as intended. You could argue that once the thread has reached exit_mmap() and begins oom reaping that it should be allowed to finish before the oom reaper declares MMF_OOM_SKIP. That could certainly be helpful, I simply haven't encountered a usecase where it were needed. Or, we could restart the oom expiration when MMF_UNSTABLE is set and deem that progress is being made so it give it some extra time. In practice, again, we haven't seen this needed. But either of those are very easy to add in as well. Which would you prefer? mm, oom: fix unnecessary killing of additional processes fix Fix double set_bit() per Tetsuo. Fix jiffies wraparound per Tetsuo. Signed-off-by: David Rientjes --- mm/mmap.c | 13 ++++++------- mm/oom_kill.c | 7 +++++-- 2 files changed, 11 insertions(+), 9 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3069,23 +3069,22 @@ void exit_mmap(struct mm_struct *mm) * Nothing can be holding mm->mmap_sem here and the above call * to mmu_notifier_release(mm) ensures mmu notifier callbacks in * __oom_reap_task_mm() will not block. - */ - __oom_reap_task_mm(mm); - - /* - * Now, set MMF_UNSTABLE to avoid racing with the oom reaper. + * + * This sets MMF_UNSTABLE to avoid racing with the oom reaper. * This needs to be done before calling munlock_vma_pages_all(), * which clears VM_LOCKED, otherwise the oom reaper cannot * reliably test for it. If the oom reaper races with * munlock_vma_pages_all(), this can result in a kernel oops if * a pmd is zapped, for example, after follow_page_mask() has * checked pmd_none(). - * + */ + __oom_reap_task_mm(mm); + + /* * Taking mm->mmap_sem for write after setting MMF_UNSTABLE will * guarantee that the oom reaper will not run on this mm again * after mmap_sem is dropped. */ - set_bit(MMF_UNSTABLE, &mm->flags); down_write(&mm->mmap_sem); up_write(&mm->mmap_sem); } diff --git a/mm/oom_kill.c b/mm/oom_kill.c --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -666,12 +666,15 @@ static int oom_reaper(void *unused) static u64 oom_free_timeout_ms = 1000; static void wake_oom_reaper(struct task_struct *tsk) { + unsigned long expire = jiffies + msecs_to_jiffies(oom_free_timeout_ms); + + if (!expire) + expire++; /* * Set the reap timeout; if it's already set, the mm is enqueued and * this tsk can be ignored. */ - if (cmpxchg(&tsk->signal->oom_mm->oom_free_expire, 0UL, - jiffies + msecs_to_jiffies(oom_free_timeout_ms))) + if (cmpxchg(&tsk->signal->oom_mm->oom_free_expire, 0UL, expire)) return; get_task_struct(tsk);