From patchwork Mon Aug 27 13:51:01 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 10577109
Sender: owner-linux-mm@kvack.org
From: Michal Hocko
To: Andrew Morton
Cc: LKML, Michal Hocko, Roman Gushchin, Johannes Weiner,
	Vladimir Davydov, David Rientjes, Tejun Heo
Subject: [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at
	should_reclaim_retry().
Date: Mon, 27 Aug 2018 15:51:01 +0200
Message-Id: <20180827135101.15700-1-mhocko@kernel.org>
X-Mailer: git-send-email 2.18.0

From: Michal Hocko

Tetsuo Handa has reported that it is possible to bypass the short sleep
for PF_WQ_WORKER threads which was introduced by commit 373ccbe5927034b5
("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't
make any progress") and lock up the system under OOM.

The primary reason is that WQ_MEM_RECLAIM WQs are not guaranteed to run
even when they have a rescuer available. Those workers might be
essential for reclaim to make forward progress, however. If we are
unlucky, all allocation requests can get stuck waiting for a
WQ_MEM_RECLAIM work item and the system is essentially stuck in an OOM
condition without much hope of moving on. Tetsuo has seen the reclaim
stuck on drain_local_pages_wq or xlog_cil_push_work (xfs). There might
be others.

Since should_reclaim_retry() should be a natural reschedule point,
let's do the short sleep for PF_WQ_WORKER threads unconditionally in
order to guarantee that other pending work items are started. This
works around the problem and is less fragile than hunting down every
place where the sleep is missed. E.g. we used to have a sleeping point
in the oom path but this has been removed recently because it caused
other issues.
Having a single sleeping point is more robust.

Reported-and-debugged-by: Tetsuo Handa
Signed-off-by: Michal Hocko
Cc: Roman Gushchin
Cc: Johannes Weiner
Cc: Vladimir Davydov
Cc: David Rientjes
Cc: Tejun Heo
---

Hi Andrew,
this has been previously posted [1] but it took quite some time to
finally understand the issue [2]. Can we push this to mmotm and
linux-next? I wouldn't hurry to merge this but the longer we have a
wider testing exposure the better.

Thanks!

[1] http://lkml.kernel.org/r/ca3da8b8-1bb5-c302-b190-fa6cebab58ca@I-love.SAKURA.ne.jp
[2] http://lkml.kernel.org/r/20180730145425.GE1206094@devbig004.ftw2.facebook.com

 mm/page_alloc.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e75865d58ba7..5fc5e500b5d0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3923,6 +3923,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 {
 	struct zone *zone;
 	struct zoneref *z;
+	bool ret = false;
 
 	/*
 	 * Costly allocations might have made a progress but this doesn't mean
@@ -3986,25 +3987,26 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 				}
 			}
 
-			/*
-			 * Memory allocation/reclaim might be called from a WQ
-			 * context and the current implementation of the WQ
-			 * concurrency control doesn't recognize that
-			 * a particular WQ is congested if the worker thread is
-			 * looping without ever sleeping. Therefore we have to
-			 * do a short sleep here rather than calling
-			 * cond_resched().
-			 */
-			if (current->flags & PF_WQ_WORKER)
-				schedule_timeout_uninterruptible(1);
-			else
-				cond_resched();
-
-			return true;
+			ret = true;
+			goto out;
 		}
 	}
 
-	return false;
+out:
+	/*
+	 * Memory allocation/reclaim might be called from a WQ
+	 * context and the current implementation of the WQ
+	 * concurrency control doesn't recognize that
+	 * a particular WQ is congested if the worker thread is
+	 * looping without ever sleeping. Therefore we have to
+	 * do a short sleep here rather than calling
+	 * cond_resched().
+	 */
+	if (current->flags & PF_WQ_WORKER)
+		schedule_timeout_uninterruptible(1);
+	else
+		cond_resched();
+	return ret;
 }
 
 static inline bool
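
For illustration only, the control-flow change in the diff can be
modelled in user space. This is a rough sketch, not kernel code: the
counters nr_sleeps and nr_yields, the helper model_resched_point() and
the functions retry_old() and retry_new() are all hypothetical names,
and the zone_ok array merely stands in for the per-zone watermark
check. It models only the two return paths touched by the diff, not the
other early returns in should_reclaim_retry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters standing in for the two kernel calls in the patch. */
static int nr_sleeps;	/* models schedule_timeout_uninterruptible(1) */
static int nr_yields;	/* models cond_resched() */

/* Models the reschedule point: workqueue workers sleep, others only yield. */
static void model_resched_point(bool is_wq_worker)
{
	if (is_wq_worker)
		nr_sleeps++;
	else
		nr_yields++;
}

/* Old shape: the reschedule point sits only on the "retry" path. */
static bool retry_old(const bool *zone_ok, size_t nr_zones, bool is_wq_worker)
{
	assert(zone_ok != NULL || nr_zones == 0);

	for (size_t i = 0; i < nr_zones; i++) {
		if (zone_ok[i]) {
			model_resched_point(is_wq_worker);
			return true;
		}
	}

	return false;	/* no sleep or yield on this path */
}

/* New shape: both outcomes funnel through the single "out" label. */
static bool retry_new(const bool *zone_ok, size_t nr_zones, bool is_wq_worker)
{
	bool ret = false;

	assert(zone_ok != NULL || nr_zones == 0);

	for (size_t i = 0; i < nr_zones; i++) {
		if (zone_ok[i]) {
			ret = true;
			goto out;
		}
	}
out:
	model_resched_point(is_wq_worker);
	return ret;
}
```

In this model, a worker calling retry_old() with no usable zone returns
false without touching nr_sleeps, which corresponds to the bypass
described in the changelog. retry_new() bumps nr_sleeps whether it
returns true or false.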