From patchwork Tue Mar 22 21:45:47 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12789227
Date: Tue, 22 Mar 2022 14:45:47 -0700
From: Andrew Morton
To: willy@infradead.org, tglx@linutronix.de, paulmck@kernel.org,
    nsaenzju@redhat.com, minchan@kernel.org, mgorman@techsingularity.net,
    juri.lelli@redhat.com, bigeasy@linutronix.de, mtosatti@redhat.com,
    akpm@linux-foundation.org, patches@lists.linux.dev, linux-mm@kvack.org,
    mm-commits@vger.kernel.org, torvalds@linux-foundation.org
In-Reply-To: <20220322143803.04a5e59a07e48284f196a2f9@linux-foundation.org>
Subject: [patch 143/227] mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu
Message-Id: <20220322214548.101B4C340EE@smtp.kernel.org>

From: Marcelo Tosatti
Subject: mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu
On systems that run FIFO:1 applications that busy loop, any SCHED_OTHER
task that attempts to execute on such a CPU (such as work threads) will
not be scheduled, which leads to system hangs.

Commit d479960e44f27e0e5 ("mm: disable LRU pagevec during the migration
temporarily") relies on queueing work items on all online CPUs to ensure
visibility of lru_disable_count.

To fix this, replace the usage of work items with synchronize_rcu, which
provides the same guarantees.

Readers of lru_disable_count are protected by either disabling preemption
or rcu_read_lock:

	preempt_disable, local_irq_disable	[bh_lru_lock()]
	rcu_read_lock				[rt_spin_lock CONFIG_PREEMPT_RT]
	preempt_disable				[local_lock !CONFIG_PREEMPT_RT]

Since v5.1 kernel, synchronize_rcu() is guaranteed to wait on
preempt_disable() regions of code.  So any CPU which sees
lru_disable_count = 0 will have exited the critical section when
synchronize_rcu() returns.

Link: https://lkml.kernel.org/r/Yin7hDxdt0s/x+fp@fuller.cnet
Signed-off-by: Marcelo Tosatti
Reviewed-by: Nicolas Saenz Julienne
Acked-by: Minchan Kim
Cc: Matthew Wilcox
Cc: Mel Gorman
Cc: Juri Lelli
Cc: Thomas Gleixner
Cc: Sebastian Andrzej Siewior
Cc: Paul E. McKenney
Signed-off-by: Andrew Morton
---

 mm/swap.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

--- a/mm/swap.c~mm-lru_cache_disable-replace-work-queue-synchronization-with-synchronize_rcu
+++ a/mm/swap.c
@@ -831,8 +831,7 @@ inline void __lru_add_drain_all(bool for
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 
-		if (force_all_cpus ||
-		    pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
+		if (pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
 		    data_race(pagevec_count(&per_cpu(lru_rotate.pvec, cpu))) ||
 		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate_file, cpu)) ||
 		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate, cpu)) ||
@@ -876,15 +875,21 @@ atomic_t lru_disable_count = ATOMIC_INIT
 void lru_cache_disable(void)
 {
 	atomic_inc(&lru_disable_count);
-#ifdef CONFIG_SMP
 	/*
-	 * lru_add_drain_all in the force mode will schedule draining on
-	 * all online CPUs so any calls of lru_cache_disabled wrapped by
-	 * local_lock or preemption disabled would be ordered by that.
-	 * The atomic operation doesn't need to have stronger ordering
-	 * requirements because that is enforced by the scheduling
-	 * guarantees.
+	 * Readers of lru_disable_count are protected by either disabling
+	 * preemption or rcu_read_lock:
+	 *
+	 * preempt_disable, local_irq_disable	[bh_lru_lock()]
+	 * rcu_read_lock			[rt_spin_lock CONFIG_PREEMPT_RT]
+	 * preempt_disable			[local_lock !CONFIG_PREEMPT_RT]
+	 *
+	 * Since v5.1 kernel, synchronize_rcu() is guaranteed to wait on
+	 * preempt_disable() regions of code. So any CPU which sees
+	 * lru_disable_count = 0 will have exited the critical
+	 * section when synchronize_rcu() returns.
 	 */
+	synchronize_rcu();
+#ifdef CONFIG_SMP
 	__lru_add_drain_all(true);
 #else
 	lru_add_and_bh_lrus_drain();
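
To see the ordering argument in isolation, the following is a minimal userspace
sketch built on liburcu (an assumption: it is not the kernel code, and the names
disable_count, local_batch and cache_disable are invented for illustration).
The reader only batches into its local cache while inside an RCU read-side
critical section and disable_count reads zero; the writer increments the counter
and then calls synchronize_rcu(), so by the time a drain would run, no reader
that observed zero can still be batching.

/*
 * Hypothetical userspace analogue of the lru_disable_count scheme,
 * using liburcu.  Not the kernel code; names are made up.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <urcu.h>			/* rcu_read_lock(), synchronize_rcu(), ... */

static atomic_int disable_count;		/* plays the role of lru_disable_count */
static _Thread_local int local_batch;		/* plays the role of a per-CPU pagevec */

static void *reader(void *arg)
{
	(void)arg;
	rcu_register_thread();
	for (int i = 0; i < 1000000; i++) {
		rcu_read_lock();		/* analogue of preempt_disable()/local_lock */
		if (atomic_load(&disable_count) == 0)
			local_batch++;		/* caching allowed: batch locally */
		else
			local_batch = 0;	/* caching disabled: bypass the batch */
		rcu_read_unlock();
	}
	rcu_unregister_thread();
	return NULL;
}

static void cache_disable(void)
{
	atomic_fetch_add(&disable_count, 1);
	/*
	 * Wait for every read-side critical section that might have
	 * observed disable_count == 0.  After this returns, draining
	 * the remote batches cannot race with new batching.
	 */
	synchronize_rcu();
	/* ... drain all remote batches here ... */
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, reader, NULL);
	cache_disable();
	pthread_join(t, NULL);
	printf("disable_count=%d\n", atomic_load(&disable_count));
	return 0;
}

Built with something like "gcc -pthread sketch.c -lurcu" (assuming liburcu is
installed).  The same argument carries over to the kernel case because, since
v5.1, synchronize_rcu() also waits for preempt_disable()/local_irq_disable()
regions, which is what the bh_lru_lock() and !CONFIG_PREEMPT_RT readers rely on.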