From patchwork Thu Sep 26 22:49:00 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13813707
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Kees Cook, Peter Zijlstra,
 Thomas Gleixner, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
 "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes, Boqun Feng, Zqiang,
 rcu@vger.kernel.org, Uladzislau Rezki
Subject: [PATCH 12/20] kthread: Make sure kthread hasn't started while
 binding it
Date: Fri, 27 Sep 2024 00:49:00 +0200
Message-ID: <20240926224910.11106-13-frederic@kernel.org>
In-Reply-To: <20240926224910.11106-1-frederic@kernel.org>
References: <20240926224910.11106-1-frederic@kernel.org>

Make sure the kthread is sleeping in the schedule_preempt_disabled()
call before calling its handler when kthread_bind[_mask]() is called on
it. This provides a sanity check verifying that the task is not randomly
blocked later at some point within its function handler, in which case
it could be just concurrently awakened, leaving the call to
do_set_cpus_allowed() without any effect until the next voluntary sleep.

Rely on the wake-up ordering to ensure that the newly introduced
"started" field returns the expected value:

    TASK A                                   TASK B
    ------                                   ------
    READ kthread->started
    wake_up_process(B)
       rq_lock()
       ...
       rq_unlock() // RELEASE
                                             schedule()
                                                rq_lock() // ACQUIRE
                                                // schedule task B
                                                rq_unlock()
                                                WRITE kthread->started

Similarly, writing kthread->started before subsequent voluntary sleeps
will be visible after calling wait_task_inactive() in
__kthread_bind_mask(), reporting potential misuse of the API.

Upcoming patches will make further use of this facility.
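For illustration only, the check described above can be modelled in
plain userspace C. This is a minimal single-threaded sketch, not kernel
code: struct model_kthread, model_kthread_run() and
model_kthread_bind_mask() are hypothetical names, the affinity mask is
a toy bitmask, and the return value stands in for the WARN_ON_ONCE()
that the patch adds. The ordering guarantees discussed in the diagram
are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_kthread {
	int started;		/* mirrors the new kthread->started field */
	unsigned long allowed;	/* toy affinity mask, one bit per CPU */
};

/* Models the point in kthread() right after the initial sleep returns. */
static void model_kthread_run(struct model_kthread *k)
{
	assert(k != NULL);
	k->started = 1;
}

/*
 * Models kthread_bind_mask(): apply the mask, then report whether the
 * bind came early enough. The kernel patch warns instead of returning
 * a value; the boolean is an assumption made to keep the model testable.
 */
static bool model_kthread_bind_mask(struct model_kthread *k, unsigned long mask)
{
	assert(k != NULL);
	k->allowed = mask;
	return !k->started;
}
```

In this model, binding before model_kthread_run() reports success, while
binding afterwards reports the misuse that the patch is meant to catch.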
Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker
---
 kernel/kthread.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index db4ceb0f503c..1527a522cdd3 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -53,6 +53,7 @@ struct kthread_create_info
 struct kthread {
 	unsigned long flags;
 	unsigned int cpu;
+	int started;
 	int result;
 	int (*threadfn)(void *);
 	void *data;
@@ -382,6 +383,8 @@ static int kthread(void *_create)
 	schedule_preempt_disabled();
 	preempt_enable();
 
+	self->started = 1;
+
 	ret = -EINTR;
 	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
 		cgroup_kthread_ready();
@@ -540,7 +543,9 @@ static void __kthread_bind(struct task_struct *p, unsigned int cpu, unsigned int
 
 void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask)
 {
+	struct kthread *kthread = to_kthread(p);
 	__kthread_bind_mask(p, mask, TASK_UNINTERRUPTIBLE);
+	WARN_ON_ONCE(kthread->started);
 }
 
 /**
@@ -554,7 +559,9 @@ void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask)
  */
 void kthread_bind(struct task_struct *p, unsigned int cpu)
 {
+	struct kthread *kthread = to_kthread(p);
 	__kthread_bind(p, cpu, TASK_UNINTERRUPTIBLE);
+	WARN_ON_ONCE(kthread->started);
 }
 EXPORT_SYMBOL(kthread_bind);

From patchwork Thu Sep 26 22:49:01 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13813708
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Kees Cook, Peter Zijlstra,
 Thomas Gleixner, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
 "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
 Uladzislau Rezki, Zqiang, rcu@vger.kernel.org
Subject: [PATCH 13/20] kthread: Default affine kthread to its preferred
 NUMA node
Date: Fri, 27 Sep 2024 00:49:01 +0200
Message-ID: <20240926224910.11106-14-frederic@kernel.org>
In-Reply-To: <20240926224910.11106-1-frederic@kernel.org>
References: <20240926224910.11106-1-frederic@kernel.org>

Kthreads attached to a preferred NUMA node for their task structure
allocation can also be assumed to run preferably within that same node.

A more precise affinity is usually specified by calling
kthread_create_on_cpu() or kthread_bind[_mask]() before the first wakeup.

For the others, a default affinity to the node is desired and sometimes
implemented with more or less success when it comes to dealing with
hotplug events and nohz_full / CPU Isolation interactions:

- kcompactd is affine to its node and handles hotplug but not CPU Isolation
- kswapd is affine to its node and ignores hotplug and CPU Isolation
- A bunch of drivers create their kthreads on a specific node and
  don't take care of affining further.

Handle that default node affinity preference at the generic level
instead, provided a kthread is created on an actual node and doesn't
apply any specific affinity such as a given CPU or a custom cpumask to
bind to before its first wake-up.

This generic handling is aware of CPU hotplug events and CPU isolation
such that:

* When a housekeeping CPU goes up that is part of the node of a given
  kthread, the related task is re-affined to its own node if it was
  previously running on the default last resort online housekeeping set
  from other nodes.

* When a housekeeping CPU goes down while it was part of the node of a
  kthread, the running task is migrated (or the sleeping task is woken
  up) automatically by the scheduler to other housekeepers within the
  same node or, as a last resort, to all housekeepers from other nodes.
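As a rough illustration of the selection rule described above, the
following userspace sketch models cpumasks as 64-bit bitmasks. It is
an assumption-laden model rather than the kernel implementation:
toy_cpumask, toy_fetch_affinity() and toy_placement() are hypothetical
names. The first helper follows the intersect-then-fall-back logic of
kthread_fetch_affinity() in this patch; the second approximates the
hotplug behaviour from the two bullets by also filtering on an online
mask, which in the kernel is left to the scheduler.

```c
#include <assert.h>
#include <stdint.h>

/* Toy cpumask: bit N set means CPU N belongs to the set. */
typedef uint64_t toy_cpumask;

/*
 * Preferred affinity: housekeeping CPUs of the kthread's node, or all
 * housekeeping CPUs when the node has none. The model assumes that at
 * least one housekeeping CPU exists.
 */
static toy_cpumask toy_fetch_affinity(toy_cpumask node_cpus,
				      toy_cpumask housekeeping)
{
	toy_cpumask preferred = node_cpus & housekeeping;

	assert(housekeeping != 0);
	return preferred ? preferred : housekeeping;
}

/*
 * Where the task may actually run for a given set of online CPUs:
 * online housekeepers of its node first, then, as a last resort,
 * any online housekeeper.
 */
static toy_cpumask toy_placement(toy_cpumask node_cpus,
				 toy_cpumask housekeeping,
				 toy_cpumask online)
{
	toy_cpumask preferred = node_cpus & housekeeping & online;

	return preferred ? preferred : (housekeeping & online);
}
```

With node CPUs 4-7 (0xF0) and housekeeping CPUs 2-5 (0x3C), the
preferred mask is CPUs 4-5. Taking both offline moves the model to
CPUs 2-3, and bringing CPU 5 back online returns it to the node.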
Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker
---
 include/linux/cpuhotplug.h |   1 +
 kernel/kthread.c           | 106 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 2361ed4d2b15..228f27150a93 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -239,6 +239,7 @@ enum cpuhp_state {
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RANDOM_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
+	CPUHP_AP_KTHREADS_ONLINE,
 	CPUHP_AP_BASE_CACHEINFO_ONLINE,
 	CPUHP_AP_ONLINE_DYN,
 	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 40,
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1527a522cdd3..736276d313c2 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -35,6 +35,9 @@ static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
 struct task_struct *kthreadd_task;
 
+static LIST_HEAD(kthreads_hotplug);
+static DEFINE_MUTEX(kthreads_hotplug_lock);
+
 struct kthread_create_info
 {
 	/* Information passed to kthread() from kthreadd. */
@@ -53,6 +56,7 @@ struct kthread_create_info
 struct kthread {
 	unsigned long flags;
 	unsigned int cpu;
+	unsigned int node;
 	int started;
 	int result;
 	int (*threadfn)(void *);
@@ -64,6 +68,8 @@ struct kthread {
 #endif
 	/* To store the full name if task comm is truncated. */
 	char *full_name;
+	struct task_struct *task;
+	struct list_head hotplug_node;
 };
 
 enum KTHREAD_BITS {
@@ -122,8 +128,11 @@ bool set_kthread_struct(struct task_struct *p)
 
 	init_completion(&kthread->exited);
 	init_completion(&kthread->parked);
+	INIT_LIST_HEAD(&kthread->hotplug_node);
 	p->vfork_done = &kthread->exited;
 
+	kthread->task = p;
+	kthread->node = tsk_fork_get_node(current);
 	p->worker_private = kthread;
 	return true;
 }
@@ -314,6 +323,11 @@ void __noreturn kthread_exit(long result)
 {
 	struct kthread *kthread = to_kthread(current);
 	kthread->result = result;
+	if (!list_empty(&kthread->hotplug_node)) {
+		mutex_lock(&kthreads_hotplug_lock);
+		list_del(&kthread->hotplug_node);
+		mutex_unlock(&kthreads_hotplug_lock);
+	}
 	do_exit(0);
 }
 EXPORT_SYMBOL(kthread_exit);
@@ -339,6 +353,48 @@ void __noreturn kthread_complete_and_exit(struct completion *comp, long code)
 }
 EXPORT_SYMBOL(kthread_complete_and_exit);
 
+static void kthread_fetch_affinity(struct kthread *kthread, struct cpumask *cpumask)
+{
+	cpumask_and(cpumask, cpumask_of_node(kthread->node),
+		    housekeeping_cpumask(HK_TYPE_KTHREAD));
+
+	if (cpumask_empty(cpumask))
+		cpumask_copy(cpumask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+}
+
+static void kthread_affine_node(void)
+{
+	struct kthread *kthread = to_kthread(current);
+	cpumask_var_t affinity;
+
+	WARN_ON_ONCE(kthread_is_per_cpu(current));
+
+	if (kthread->node == NUMA_NO_NODE) {
+		housekeeping_affine(current, HK_TYPE_RCU);
+	} else {
+		if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+
+		mutex_lock(&kthreads_hotplug_lock);
+		WARN_ON_ONCE(!list_empty(&kthread->hotplug_node));
+		list_add_tail(&kthread->hotplug_node, &kthreads_hotplug);
+		/*
+		 * The node cpumask is racy when read from kthread() but:
+		 * - a racing CPU going down will either fail on the subsequent
+		 *   call to set_cpus_allowed_ptr() or be migrated to housekeepers
+		 *   afterwards by the scheduler.
+		 * - a racing CPU going up will be handled by kthreads_online_cpu()
+		 */
+		kthread_fetch_affinity(kthread, affinity);
+		set_cpus_allowed_ptr(current, affinity);
+		mutex_unlock(&kthreads_hotplug_lock);
+
+		free_cpumask_var(affinity);
+	}
+}
+
 static int kthread(void *_create)
 {
 	static const struct sched_param param = { .sched_priority = 0 };
@@ -369,7 +425,6 @@ static int kthread(void *_create)
 	 * back to default in case they have been changed.
 	 */
 	sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
-	set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));
 
 	/* OK, tell user we're spawned, wait for stop or wakeup */
 	__set_current_state(TASK_UNINTERRUPTIBLE);
@@ -385,6 +440,9 @@ static int kthread(void *_create)
 
 	self->started = 1;
 
+	if (!(current->flags & PF_NO_SETAFFINITY))
+		kthread_affine_node();
+
 	ret = -EINTR;
 	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
 		cgroup_kthread_ready();
@@ -779,6 +837,52 @@ int kthreadd(void *unused)
 	return 0;
 }
 
+/*
+ * Re-affine kthreads according to their preferences
+ * and the newly online CPU. The CPU down part is handled
+ * by select_fallback_rq() which default re-affines to
+ * housekeepers in case the preferred affinity doesn't
+ * apply anymore.
+ */
+static int kthreads_online_cpu(unsigned int cpu)
+{
+	cpumask_var_t affinity;
+	struct kthread *k;
+	int ret;
+
+	guard(mutex)(&kthreads_hotplug_lock);
+
+	if (list_empty(&kthreads_hotplug))
+		return 0;
+
+	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
+		return -ENOMEM;
+
+	ret = 0;
+
+	list_for_each_entry(k, &kthreads_hotplug, hotplug_node) {
+		if (WARN_ON_ONCE((k->task->flags & PF_NO_SETAFFINITY) ||
+				 kthread_is_per_cpu(k->task) ||
+				 k->node == NUMA_NO_NODE)) {
+			ret = -EINVAL;
+			continue;
+		}
+		kthread_fetch_affinity(k, affinity);
+		set_cpus_allowed_ptr(k->task, affinity);
+	}
+
+	free_cpumask_var(affinity);
+
+	return ret;
+}
+
+static int kthreads_init(void)
+{
+	return cpuhp_setup_state(CPUHP_AP_KTHREADS_ONLINE, "kthreads:online",
+				kthreads_online_cpu, NULL);
+}
+early_initcall(kthreads_init);
+
 void __kthread_init_worker(struct kthread_worker *worker,
 				const char *name,
 				struct lock_class_key *key)

From patchwork Thu Sep 26 22:49:02 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13813709
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Michal Hocko, Vlastimil Babka, Andrew Morton,
 linux-mm@kvack.org, Peter Zijlstra, Thomas Gleixner, Michal Hocko
Subject: [PATCH 14/20] mm: Create/affine kcompactd to its preferred node
Date: Fri, 27 Sep 2024 00:49:02 +0200
Message-ID: <20240926224910.11106-15-frederic@kernel.org>
In-Reply-To: <20240926224910.11106-1-frederic@kernel.org>
References: <20240926224910.11106-1-frederic@kernel.org>

Kcompactd is dedicated to a specific node. As such it wants to be
preferably affine to it, memory and CPUs-wise.

Use the proper kthread API to achieve that. As a bonus it takes care of
CPU-hotplug events and CPU-isolation on its behalf.
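To make the API difference concrete, here is a small userspace model of
the two creation paths. It is only a sketch: struct toy_kthread and the
toy_*() helpers are hypothetical and do not exist in the kernel. It
relies on kthread_run() being a create-then-wake helper that passes no
node preference, whereas kthread_create_on_node() followed by
wake_up_process() records the node before the first wake-up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

/* Hypothetical model type; it is not a kernel structure. */
struct toy_kthread {
	int node;	/* preferred node recorded at creation time */
	bool awake;	/* false until the first wake-up */
};

/* Models kthread_create_on_node(): created asleep, node preference kept. */
static struct toy_kthread toy_create_on_node(int node)
{
	struct toy_kthread t = { .node = node, .awake = false };

	return t;
}

/* Models wake_up_process() on a freshly created thread. */
static void toy_wake_up(struct toy_kthread *t)
{
	assert(t != NULL);
	t->awake = true;
}

/* Models kthread_run(): create without a node preference, then wake. */
static struct toy_kthread toy_run(void)
{
	struct toy_kthread t = toy_create_on_node(TOY_NO_NODE);

	toy_wake_up(&t);
	return t;
}

/* True when generic code has a node to default the affinity to. */
static bool toy_has_node_preference(const struct toy_kthread *t)
{
	return t->node != TOY_NO_NODE;
}
```

In the model, only the thread created through toy_create_on_node()
carries a node that a generic affinity default could act on.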
Acked-by: Vlastimil Babka Acked-by: Michal Hocko Signed-off-by: Frederic Weisbecker --- mm/compaction.c | 43 +++---------------------------------------- 1 file changed, 3 insertions(+), 40 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index a2b16b08cbbf..a31c0f5758cf 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -3154,15 +3154,9 @@ void wakeup_kcompactd(pg_data_t *pgdat, int order, int highest_zoneidx) static int kcompactd(void *p) { pg_data_t *pgdat = (pg_data_t *)p; - struct task_struct *tsk = current; long default_timeout = msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC); long timeout = default_timeout; - const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id); - - if (!cpumask_empty(cpumask)) - set_cpus_allowed_ptr(tsk, cpumask); - set_freezable(); pgdat->kcompactd_max_order = 0; @@ -3233,10 +3227,12 @@ void __meminit kcompactd_run(int nid) if (pgdat->kcompactd) return; - pgdat->kcompactd = kthread_run(kcompactd, pgdat, "kcompactd%d", nid); + pgdat->kcompactd = kthread_create_on_node(kcompactd, pgdat, nid, "kcompactd%d", nid); if (IS_ERR(pgdat->kcompactd)) { pr_err("Failed to start kcompactd on node %d\n", nid); pgdat->kcompactd = NULL; + } else { + wake_up_process(pgdat->kcompactd); } } @@ -3254,30 +3250,6 @@ void __meminit kcompactd_stop(int nid) } } -/* - * It's optimal to keep kcompactd on the same CPUs as their memory, but - * not required for correctness. So if the last cpu in a node goes - * away, we get changed to run anywhere: as the first one comes back, - * restore their cpu bindings. 
- */ -static int kcompactd_cpu_online(unsigned int cpu) -{ - int nid; - - for_each_node_state(nid, N_MEMORY) { - pg_data_t *pgdat = NODE_DATA(nid); - const struct cpumask *mask; - - mask = cpumask_of_node(pgdat->node_id); - - if (cpumask_any_and(cpu_online_mask, mask) < nr_cpu_ids) - /* One of our CPUs online: restore mask */ - if (pgdat->kcompactd) - set_cpus_allowed_ptr(pgdat->kcompactd, mask); - } - return 0; -} - static int proc_dointvec_minmax_warn_RT_change(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { @@ -3337,15 +3309,6 @@ static struct ctl_table vm_compaction[] = { static int __init kcompactd_init(void) { int nid; - int ret; - - ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, - "mm/compaction:online", - kcompactd_cpu_online, NULL); - if (ret < 0) { - pr_err("kcompactd: failed to register hotplug callbacks.\n"); - return ret; - } for_each_node_state(nid, N_MEMORY) kcompactd_run(nid); From patchwork Thu Sep 26 22:49:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13813710 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEC2ACCFA17 for ; Thu, 26 Sep 2024 22:49:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DA9B56B009F; Thu, 26 Sep 2024 18:49:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CBDE86B00A1; Thu, 26 Sep 2024 18:49:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B81586B00A2; Thu, 26 Sep 2024 18:49:56 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 9741A6B009F for ; Thu, 26 Sep 2024 18:49:56 
-0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 3022F120337 for ; Thu, 26 Sep 2024 22:49:56 +0000 (UTC) X-FDA: 82608383592.06.C335850 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf23.hostedemail.com (Postfix) with ESMTP id 8CC00140014 for ; Thu, 26 Sep 2024 22:49:54 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b="qPdT/czz"; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf23.hostedemail.com: domain of frederic@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=frederic@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727390895; a=rsa-sha256; cv=none; b=5jzZU5Kso7YxadNT1aTK+a9kugJjP9NOywkVwxMItwhjcDur2XGr+Xmc2saMLgeLqYfH3I 5SA3abiFlAtiY9ZXAD5TVZ1MpUW8f2dU3JtC2CCoD6kEUndg4/hgzlYdqUa2IyQQWc83cM tjeBvoLPjOcO3JgSqX8ZpxZXZ4SDujg= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b="qPdT/czz"; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf23.hostedemail.com: domain of frederic@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=frederic@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727390895; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CmttsLVxdQAGPqv4EyY4vbYE12QAnLoDFwzJoqQT6E4=; b=F0NRcos1G8S/6quEfloCYAIw84jKjiKNndeNA35oaA3zY4yFv5oWSayWwDrVfiRTgGPqNx tFerbMWJ83Uh7bi1p+ApUaQJ5kkR29y0+8g+S3BjcABUJ3kvkw7rqkQFLhY72eAXJRq8r0 O2eupG//91omonDjLi6GglzkySIpcoY= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by 
dfw.source.kernel.org (Postfix) with ESMTP id C7D125C5696; Thu, 26 Sep 2024 22:49:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85329C4CECE; Thu, 26 Sep 2024 22:49:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727390993; bh=GGwHDTPCelnTuLQKmOHE+XEiEGRzn3nDzNLPlrg5wLQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qPdT/czzAUkVt9taXAsYs8HYqiPAzFsD1PahFMGeLLa7p2XkOjqAGBmtuWIxHHZzI L73JueO4kixmDYeWu4a8O6Hn0pH8EOVcPCTshEGMwt/hCvhT5sU0By3wHdlwzQxZeg iJQwSFGkPH8DB3lu63k2EugRuJGZ1VPzYDgoxy+UIi4yJzBy0vF7kxHNXBsdPjwhgf kgXfavf0yn90bg42dxvHuz0QrUskQDnr4er73zhZEdFiHbE93hnfsomY8jIU0clpCq dnEuS0e4bCA/Mg/ntuju71is5DUonzaMun23oaFLmCDFNQiEx7M05CYRWIWd2qVizV vL26k62znYIuQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Michal Hocko , Vlastimil Babka , linux-mm@kvack.org, Andrew Morton , Peter Zijlstra , Thomas Gleixner , Michal Hocko Subject: [PATCH 15/20] mm: Create/affine kswapd to its preferred node Date: Fri, 27 Sep 2024 00:49:03 +0200 Message-ID: <20240926224910.11106-16-frederic@kernel.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240926224910.11106-1-frederic@kernel.org> References: <20240926224910.11106-1-frederic@kernel.org> MIME-Version: 1.0 X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 8CC00140014 X-Stat-Signature: 51371xu4oijkshxud4xrp7up4wj9yjho X-Rspam-User: X-HE-Tag: 1727390994-652802 X-HE-Meta: 
U2FsdGVkX1/8tApfTHDLrzLVUN3Hki/vWDJDdyIJZQ8AnVDX5i6QLq+dsxJg3Zt2vqx2pK+B2w+U3sKTX5iOBP9deqaf2QsE7YiPhitCnLWKEmHqZQ2sn6A23TMqUMEuJHVQRF9xmiTU5zjqidlH8hmSSDuBs8lI3LphddrM4ITzIA63NiogQwBKaboXZCnJhk3F0jje2SvNjK+M/1vyvRzdSEJFHCFTEYb4JqtJHQaCYYMNK822BuzGHSSHn+mB8xZbWxKrpLSC5eXSBI1W97SdZCN1RngFdFcjbAZGbAjH+3rt1ddWL5CdCpnpRJJregpKhP2Rp0L6AzdOKx+KZQYcndwyQ2l7CFbiN3GhEHr0O4l3c7y1hMg6QjgEKE6TyMiz7LUov7UHInDF+/kPvOhdBTvOjnj6niJJvf0j8nFDxfbEzSI6Q79OK3quU1EYnz5ezrjP8JqIxKK7te0Vq1lFyfUQsnpiuZF8U4hhK4HAlH71Onc+mLajU9gjE1/aYIWIb3rUHxgqzkWATAOQKd8m2KITrlYpdeaVwUN0KTa5cMXGGGMb14SPohxG7Yjdj8Xq3Ka1UK90S5ihmoIF2WDf/5VmrNCjHoFbS/FOeb1R+8W4qex7wEYiLkh+RnS8gRph+1pgW5RYpiXneHi/4AyuVneHc2WDoTUFHBVAFr1jt6x1Zjsg5Z/LTOIzwrkOuZBqB11mrZjKcLlRvimtPF2KAr1j4P8BlHrG9+QN6oZ/lbqtK945+rDyCYH2Av/VW6fkrXFdMpqd8LSY+ZMOZmLkXPYnxdp7drFnAftMu/hcgRu/DQX2zcfnV7m9M5yEgdg2j7VoEuODSYMqCgeRPLKVLxPjMQ9DpedwcBTiuQjFG/5wdXo5WpgdDNvDXII+El2vAxgPaMgZaaO+H+ufLAQEDRo97TMwZjfc8v+Ev+kpIKXGtyBO+1QOJVrIp08wwQnOv8rGb/Fc70/gyA8 AwesOojD z2AWhd2jjjNXgEATlJCnjrPVwPX/EhM2DgaV8IVXge7s+tHLsROhnMfbZgQlCLKxmVfkBuuLvLaX7ZuF5C9KxuOX4vFtxDRRRq+GjkOhiCVLZVR6hW/sY2ncZbTxHuFkiH4X+/B/Be5lu/geMdn3fG6agp/fHv7Fi3DqgbaWiTSoZ9HX5ydQgwyw3FqCrx6uprYJWHxN+uhY9J0X76yUHPHLKeR/6x4vKZKdTSUWxhtGiJ2d50/K8BFRAghMyfgXZvpDPlxtdGkxAN1RsQzv2DmGfek48QNCQKJZ9QepMnppBsHESYU6sHSv305U43sD1xSHF1orpJQDKExI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe:

kswapd is dedicated to a specific node. As such, it should preferably be affine to that node, both memory-wise and CPU-wise. Use the proper kthread API to achieve that. As a bonus, the kthread API handles CPU-hotplug events and CPU isolation on its behalf.

Acked-by: Vlastimil Babka Acked-by: Michal Hocko Signed-off-by: Frederic Weisbecker --- mm/vmscan.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 749cdc110c74..2f2b75536d9c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -7162,10 +7162,6 @@ static int kswapd(void *p) unsigned int highest_zoneidx = MAX_NR_ZONES - 1; pg_data_t *pgdat = (pg_data_t *)p; struct task_struct *tsk = current; - const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id); - - if (!cpumask_empty(cpumask)) - set_cpus_allowed_ptr(tsk, cpumask); /* * Tell the memory management that we're a "memory allocator", @@ -7334,13 +7330,15 @@ void __meminit kswapd_run(int nid) pgdat_kswapd_lock(pgdat); if (!pgdat->kswapd) { - pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid); + pgdat->kswapd = kthread_create_on_node(kswapd, pgdat, nid, "kswapd%d", nid); if (IS_ERR(pgdat->kswapd)) { /* failure at boot is fatal */ pr_err("Failed to start kswapd on node %d,ret=%ld\n", nid, PTR_ERR(pgdat->kswapd)); BUG_ON(system_state < SYSTEM_RUNNING); pgdat->kswapd = NULL; + } else { + wake_up_process(pgdat->kswapd); } } pgdat_kswapd_unlock(pgdat); From patchwork Thu Sep 26 22:49:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13813711 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16003CCFA07 for ; Thu, 26 Sep 2024 22:50:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9B08D6B00A1; Thu, 26 Sep 2024 18:50:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9392A6B00A2; Thu, 26 Sep 2024 18:50:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7D7E16B00A3; Thu, 26 Sep 
2024 18:50:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 5EF156B00A1 for ; Thu, 26 Sep 2024 18:50:00 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id D4268140632 for ; Thu, 26 Sep 2024 22:49:59 +0000 (UTC) X-FDA: 82608383718.26.9090F52 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf29.hostedemail.com (Postfix) with ESMTP id 36BE1120007 for ; Thu, 26 Sep 2024 22:49:58 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=iWWKYk8L; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf29.hostedemail.com: domain of frederic@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=frederic@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727390863; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6s4zHwNyxY6DbMQl18Qg35alLGSnsrkEkixTzXOBmMw=; b=M0THpCC/NQ7q+JchktTVQNtveEsPsk+bSjhJJNf2e7AHNvkjZGTiu+2M0KRLr6L5VVysnX e2bNRm4KtA59MSKg8JlQkpJfOx0o0kwWvOaWU2nUoabcFoJ6eow0sQY+21xHZiv72ylutK fzj1A5CdiA+Yj4ki3q6BwRSHBkCmbhs= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727390863; a=rsa-sha256; cv=none; b=fg8+cyIhfmv0T/ZtGF5CFUukY+LS3BeTDnWBeWNwE6Vp12ePsL3iHtBRQziRc0pjSKrIrc 5XZYOia/++qQJ0IPvoVq14FVJPkM2XHCi9vj0W5BIVsbk0tVLQgb8beuNEiHmCyXKldgO8 0+01RJk0IVK284hJR2bU9Bu4O/ZXKbQ= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=iWWKYk8L; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass 
(imf29.hostedemail.com: domain of frederic@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=frederic@kernel.org Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 746D45C5696; Thu, 26 Sep 2024 22:49:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D5996C4CED1; Thu, 26 Sep 2024 22:49:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727390997; bh=m1Tcj1zEK+qAK6ToFK+B33y3KL3pudrwFEmXyRTSR20=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iWWKYk8L61abF7LxWDWRGHMgy+nOKnnunvJUW6jJ41wneZPY/6lNIfMOqSed6R/gW jYrTjTIZIFOLj5S9/D62bKvAsDKJlVPJ2WzeTRs1NDsf+swIeolOnOhhlyqYZwrjg8 C3/ohAwI76aHXOJp2XqCIN+THlWHkX+Dtjz2+LU4B7XMIRnGUgqj7s9Brak89HhBCs XLWJeiAf4oZhQIlBTJ63ZQvlf9wr3Mos+fhlYgs/hiXgr1bxJwGQ87/eZuQ290s0yE 6biSgYOA3Bmkgv2xZ+CrpL2IyvqvrMNfjI8hKSfgjFL8+HhCyyLz68L7TfR8fcZFpO 4sV4ZR3ULcjMQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Kees Cook , Peter Zijlstra , Thomas Gleixner , Michal Hocko , Vlastimil Babka , linux-mm@kvack.org, "Paul E. 
McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Uladzislau Rezki , Zqiang , rcu@vger.kernel.org Subject: [PATCH 16/20] kthread: Implement preferred affinity Date: Fri, 27 Sep 2024 00:49:04 +0200 Message-ID: <20240926224910.11106-17-frederic@kernel.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240926224910.11106-1-frederic@kernel.org> References: <20240926224910.11106-1-frederic@kernel.org> MIME-Version: 1.0 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 36BE1120007 X-Stat-Signature: n7y5pz817866xffi3e984uou1pdicsc8 X-Rspam-User: X-HE-Tag: 1727390998-647671 X-HE-Meta: U2FsdGVkX199oUu0TJl8k6nTXns0D1FUnvMj1oOO9oivwM6u8y2Z3h0Nv98Ypx0YrRRXswlvTyMw8JWWH/+Eq7+Mw2E/InQD/7ai/sOJTy8cfLwApP1GFeDQ8UBRO4raQXStI0Xy5+1JHycsdhE5RgQlZmpFrSX9aU9GpthMI+q01EMXSFlC0aVR0q3G1VGHjGlwjLvo1jwX+IoPIHJ67CUVLDf9nD+H32BPyIeucsPo80MRpCAD1yrRwFR2uHHpSWQ6qfoWy3FlZ9ojYPeRyc5OIK8Xolo/r7PyaGUMfSWCD8Q245g5qlESRxgWVmsugGlXQoJQS6C5LUKkztct6eGD1fW/PcOeHO2nLd17GG9N4MpJP+2v9ePaGPu3r/ka8ny3Pl8TalCoE3cwsUlvPmDc9xT+K4cJyr9TQkwIHmn7aZnefHN6LrQ+pIYi6flDnqViF6S8G2+KfGK9krOcmb5TsUdQ+xs21v6JwioHHVttH3Dws+Emofmbn4lw1PJagGYH2IkyglF7RDxpyog0cxBiRdJCsernF56+D2wNk85aVy8xlBOXcyuAVeYEomJjrCxyoc7tiVbNCfLv5J2WZywZQdflzN4SBezm0Abe0yE4UeST6buEDfSdeurBL4QkbHyXaAsZDT8R8YGq7ZQSXzrgaofSz4GpzAvz4K7G0LYbKA+g8h1tgX5/mvmy9W8ARQUpOW4chmt1Ke8Bjkdjzr5gUlFOki6o1yAGCFcA0p8Jc8e1DNDUbTQk8EiUhwGbqTQdj1q3wZ46AbORDH78TyAIKIm+bPM2tcSNP/GxkFP65YWSlSuS3HkPZkXx8fDX3w7ZNth2TNMcgf5uuTsK7AXKEs2Al5Qa6wvPvDDAAmQ+cdcgdUIdSx5c2Zs36gwYcACzPdBUwljGlP6EduOYhrgmQ69MoVFAjH0v6FjNytVKcxyj6cZ4BMetZBkZie31WhhP1rUTRYD3gYEtY1K qjtTznRr kGRRcItXykw+QlC4lHdd6I2+HPErUUjaeRkdeSldjXR2i5g5uI18WDFK2JdqY9msoYOnIdBO6m0WNsqji7CDAwH+VyyO2SNfZUoJ6QU03oXXnjhqGecsjopfMo0ruYltY4sd9+XXMGvQVFRwbr2oGcRiTyu8+GuEP1Qkpc1orrEs1w9zPM4xlY5ki2L81nmciR1jEMgLZ8b748RxG9Ue2BrKjeqRqiSjEwVlGkIyVuR/OL/SgCe/txNgUyOQi85KrV2T3WUBJqRniNK04I7q4wp9xdP8CisujfP3ULGOzLHuhBTFPb98oBPxnYw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org 
Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe:

Kthread affinity currently follows one of four patterns:

1) Per-CPU kthreads must stay affine to a single CPU and never execute relevant code on any other CPU. This is currently handled by smpboot code, which takes care of CPU-hotplug operations.

2) Kthreads that _have_ to be affine to a specific set of CPUs and can't run anywhere else. The affinity is set through kthread_bind_mask() and the subsystem handles CPU-hotplug operations itself.

3) Kthreads that prefer to be affine to a specific NUMA node. That preferred affinity is applied by default when an actual node ID is passed on kthread creation, provided the kthread is not per-CPU and no call to kthread_bind_mask() has been issued before the first wake-up.

4) Similar to the previous point, but the kthreads have a preferred affinity that is not a node. It is set manually, like for any other task, and CPU-hotplug is supposed to be handled by the relevant subsystem so that the task is properly re-affined whenever a given CPU from the preferred affinity comes up. Care must also be taken so that the preferred affinity doesn't cross housekeeping cpumask boundaries.

Provide a function to handle the last use case, mostly reusing the current node default affinity infrastructure. kthread_affine_preferred() is introduced, to be used just like kthread_bind_mask(), right after kthread creation and before the first wake-up. The kthread is then affine right away to the cpumask passed through the API if that cpumask contains online housekeeping CPUs. Otherwise it will be affine to all online housekeeping CPUs as a last resort.

As with node affinity, it is aware of CPU-hotplug events such that:

* When a housekeeping CPU that is part of the preferred affinity of a given kthread comes up, the related task is re-affined to that preferred affinity if it was previously running on the default last-resort online housekeeping set.

* When a housekeeping CPU goes down while it was part of the preferred affinity of a kthread, the running task is migrated (or the sleeping task is woken up) automatically by the scheduler to other housekeepers within the preferred affinity or, as a last resort, to all housekeepers from other nodes. Acked-by: Vlastimil Babka Signed-off-by: Frederic Weisbecker --- include/linux/kthread.h | 1 + kernel/kthread.c | 68 ++++++++++++++++++++++++++++++++++++----- 2 files changed, 62 insertions(+), 7 deletions(-) diff --git a/include/linux/kthread.h b/include/linux/kthread.h index b11f53c1ba2e..30209bdf83a2 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -85,6 +85,7 @@ kthread_run_on_cpu(int (*threadfn)(void *data), void *data, void free_kthread_struct(struct task_struct *k); void kthread_bind(struct task_struct *k, unsigned int cpu); void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask); +int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask); int kthread_stop(struct task_struct *k); int kthread_stop_put(struct task_struct *k); bool kthread_should_stop(void); diff --git a/kernel/kthread.c b/kernel/kthread.c index 736276d313c2..91037533afda 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -70,6 +70,7 @@ struct kthread { char *full_name; struct task_struct *task; struct list_head hotplug_node; + struct cpumask *preferred_affinity; }; enum KTHREAD_BITS { @@ -327,6 +328,11 @@ void __noreturn kthread_exit(long result) mutex_lock(&kthreads_hotplug_lock); list_del(&kthread->hotplug_node); mutex_unlock(&kthreads_hotplug_lock); + + if (kthread->preferred_affinity) { + kfree(kthread->preferred_affinity); + kthread->preferred_affinity = NULL; + } } do_exit(0); } @@ -355,9 +361,17 @@ EXPORT_SYMBOL(kthread_complete_and_exit); static void kthread_fetch_affinity(struct kthread *kthread, struct cpumask *cpumask) { - cpumask_and(cpumask, cpumask_of_node(kthread->node), - housekeeping_cpumask(HK_TYPE_KTHREAD)); + 
const struct cpumask *pref; + if (kthread->preferred_affinity) { + pref = kthread->preferred_affinity; + } else { + if (WARN_ON_ONCE(kthread->node == NUMA_NO_NODE)) + return; + pref = cpumask_of_node(kthread->node); + } + + cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_KTHREAD)); if (cpumask_empty(cpumask)) cpumask_copy(cpumask, housekeeping_cpumask(HK_TYPE_KTHREAD)); } @@ -440,7 +454,7 @@ static int kthread(void *_create) self->started = 1; - if (!(current->flags & PF_NO_SETAFFINITY)) + if (!(current->flags & PF_NO_SETAFFINITY) && !self->preferred_affinity) kthread_affine_node(); ret = -EINTR; @@ -837,12 +851,53 @@ int kthreadd(void *unused) return 0; } +int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask) +{ + struct kthread *kthread = to_kthread(p); + cpumask_var_t affinity; + unsigned long flags; + int ret = 0; + + if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) { + WARN_ON(1); + return -EINVAL; + } + + WARN_ON_ONCE(kthread->preferred_affinity); + + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) + return -ENOMEM; + + kthread->preferred_affinity = kzalloc(sizeof(struct cpumask), GFP_KERNEL); + if (!kthread->preferred_affinity) { + ret = -ENOMEM; + goto out; + } + + mutex_lock(&kthreads_hotplug_lock); + cpumask_copy(kthread->preferred_affinity, mask); + WARN_ON_ONCE(!list_empty(&kthread->hotplug_node)); + list_add_tail(&kthread->hotplug_node, &kthreads_hotplug); + kthread_fetch_affinity(kthread, affinity); + + /* It's safe because the task is inactive. */ + raw_spin_lock_irqsave(&p->pi_lock, flags); + do_set_cpus_allowed(p, affinity); + raw_spin_unlock_irqrestore(&p->pi_lock, flags); + + mutex_unlock(&kthreads_hotplug_lock); +out: + free_cpumask_var(affinity); + + return ret; +} + /* * Re-affine kthreads according to their preferences * and the newly online CPU. 
The CPU down part is handled * by select_fallback_rq() which default re-affines to - * housekeepers in case the preferred affinity doesn't - * apply anymore. + * housekeepers from other nodes in case the preferred + * affinity doesn't apply anymore. */ static int kthreads_online_cpu(unsigned int cpu) { @@ -862,8 +917,7 @@ static int kthreads_online_cpu(unsigned int cpu) list_for_each_entry(k, &kthreads_hotplug, hotplug_node) { if (WARN_ON_ONCE((k->task->flags & PF_NO_SETAFFINITY) || - kthread_is_per_cpu(k->task) || - k->node == NUMA_NO_NODE)) { + kthread_is_per_cpu(k->task))) { ret = -EINVAL; continue; }