From patchwork Mon Sep 10 12:55:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 10594123
From: Michal Hocko
To:
Cc: Tetsuo Handa, Roman Gushchin, Andrew Morton, Michal Hocko
Subject: [RFC PATCH 3/3] mm, oom: hand over MMF_OOM_SKIP to exit path if it is guaranteed to finish
Date: Mon, 10 Sep 2018 14:55:13 +0200
Message-Id: <20180910125513.311-4-mhocko@kernel.org>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180910125513.311-1-mhocko@kernel.org>
References: <1536382452-3443-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
 <20180910125513.311-1-mhocko@kernel.org>
Sender: owner-linux-mm@kvack.org

From: Michal Hocko

David Rientjes has noted that certain user space memory allocators
leave a lot of page tables behind and that the current implementation
of the oom_reaper does not deal with those workloads very well.

To improve these workloads, define a point at which exit_mmap is
guaranteed to finish the teardown without any further unbounded
blocking. This point is right after we unlink the vmas (the unlinking
still depends on locks which are held while performing memory
allocations from other contexts) and before we start releasing page
tables.

Open-code free_pgtables and explicitly unlink all vmas first. Then set
mm->mmap to NULL (nobody should be looking at it at this stage) and
check mm->mmap in the oom_reaper path. If mm->mmap is NULL, we rely on
the exit path and do not set MMF_OOM_SKIP from the reaper.

Signed-off-by: Michal Hocko
---
 mm/mmap.c     | 24 ++++++++++++++++++++----
 mm/oom_kill.c | 13 +++++++------
 2 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 3481424717ac..99bb9ce29bc5 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3085,8 +3085,27 @@ void exit_mmap(struct mm_struct *mm)
 	/* oom_reaper cannot race with the page tables teardown */
 	if (oom)
 		down_write(&mm->mmap_sem);
+	/*
+	 * Hide vma from rmap and truncate_pagecache before freeing
+	 * pgtables
+	 */
+	while (vma) {
+		unlink_anon_vmas(vma);
+		unlink_file_vma(vma);
+		vma = vma->vm_next;
+	}
+	vma = mm->mmap;
+	if (oom) {
+		/*
+		 * the exit path is guaranteed to finish without any unbound
+		 * blocking at this stage so make it clear to the caller.
+		 */
+		mm->mmap = NULL;
+		up_write(&mm->mmap_sem);
+	}
 
-	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
+	free_pgd_range(&tlb, vma->vm_start, vma->vm_prev->vm_end,
+			FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
 	tlb_finish_mmu(&tlb, 0, -1);
 
 	/*
@@ -3099,9 +3118,6 @@ void exit_mmap(struct mm_struct *mm)
 		vma = remove_vma(vma);
 	}
 	vm_unacct_memory(nr_accounted);
-
-	if (oom)
-		up_write(&mm->mmap_sem);
 }
 
 /* Insert vm structure into process list sorted by address
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 049e67dc039b..0ebf93c76c81 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -570,12 +570,10 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
 	}
 
 	/*
-	 * MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
-	 * work on the mm anymore. The check for MMF_OOM_SKIP must run
-	 * under mmap_sem for reading because it serializes against the
-	 * down_write();up_write() cycle in exit_mmap().
+	 * If exit path clear mm->mmap then we know it will finish the tear down
+	 * and we can go and bail out here.
 	 */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
+	if (!mm->mmap) {
 		trace_skip_task_reaping(tsk->pid);
 		goto out_unlock;
 	}
@@ -624,8 +622,11 @@ static void oom_reap_task(struct task_struct *tsk)
 	/*
 	 * Hide this mm from OOM killer because it has been either reaped or
 	 * somebody can't call up_write(mmap_sem).
+	 * Leave the MMF_OOM_SKIP to the exit path if it managed to reach the
+	 * point it is guaranteed to finish without any blocking
 	 */
-	set_bit(MMF_OOM_SKIP, &mm->flags);
+	if (mm->mmap)
+		set_bit(MMF_OOM_SKIP, &mm->flags);
 
 	/* Drop a reference taken by wake_oom_reaper */
 	put_task_struct(tsk);
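
The hand-over described in the changelog can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: struct mm_model, struct vma_model, exit_unlink_and_publish(),
exit_finish() and reaper_pass() are hypothetical names, locking (mmap_sem) is
left out because the model is single-threaded, and the place where the exit
path finally sets MMF_OOM_SKIP is not part of this patch, so the model simply
sets it at the end of the teardown.

```c
/* User-space model of the MMF_OOM_SKIP hand-over; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vma_model {
	struct vma_model *next;
	bool linked;		/* still visible to rmap / file mappings */
};

struct mm_model {
	struct vma_model *mmap;	/* NULL once the exit path cannot block anymore */
	bool oom_skip;		/* stands in for MMF_OOM_SKIP */
};

/*
 * Exit path, first half: unlink every vma, then publish the point of no
 * return by clearing mm->mmap. In the patch this runs under
 * down_write(&mm->mmap_sem) for OOM victims; the model has no lock.
 */
static struct vma_model *exit_unlink_and_publish(struct mm_model *mm)
{
	struct vma_model *head = mm->mmap;
	struct vma_model *vma;

	for (vma = head; vma; vma = vma->next)
		vma->linked = false;
	mm->mmap = NULL;
	return head;	/* the caller keeps the list for the final teardown */
}

/*
 * Exit path, second half: page table and vma freeing are elided here.
 * Assumption: the exit path sets the flag once teardown is complete.
 */
static void exit_finish(struct mm_model *mm, struct vma_model *head)
{
	(void)head;
	mm->oom_skip = true;
}

/*
 * Reaper side, mirroring the two mm->mmap checks in mm/oom_kill.c.
 * Returns true if the reaper actually worked on the mm.
 */
static bool reaper_pass(struct mm_model *mm)
{
	bool reaped = false;

	if (mm->mmap)
		reaped = true;	/* exit path not at the safe point yet: reap */

	/* Hide the mm only if the exit path has not taken over. */
	if (mm->mmap)
		mm->oom_skip = true;
	return reaped;
}
```

In this model, a reaper pass that runs after exit_unlink_and_publish() neither
reaps nor sets the flag; the flag only appears once exit_finish() has run. A
reaper pass on an mm whose mmap pointer is still set reaps and sets the flag
itself, as before the patch.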