From patchwork Wed Aug 7 16:02:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756468 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E8AFD14375A; Wed, 7 Aug 2024 16:03:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046587; cv=none; b=Vuko3T7EBZk+rLiLYJRKRd2RcZzM09gZt8RnEsYeHZVmdHWs731DZNspsJ7dx/qMvJfaKIVoBFz766Ot9XjNNyVLRwhwDIM1JqjkSTTnd0L5sj1o8pcPLSNxlVvuJTuDRirzq0KGRQUUxcp/sCdgXIjNfBrocL1ThfiXOK9Yyoo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046587; c=relaxed/simple; bh=ZmzCkFGiO4uVOjClN5Ffq3kFJzGfD03UONhdrDRJsTk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pEwoqNdiSMww/phz3cyPpVrOagYl8uuEPOi50gRwPqva0YnqpY2DuhGdR9FXvAvrA7fM9xhdUi+XDRuWWjkG00x8vinZkXJ30J32OSLNPaq5EGZMO0GDnjK5XA2yHoYwBMbB4CdW9yzK9ZFV0V2qa7K3DQIMLgMJODOeI5ed1+4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=u9esigVt; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="u9esigVt" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B64BC4AF0E; Wed, 7 Aug 2024 16:03:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046586; bh=ZmzCkFGiO4uVOjClN5Ffq3kFJzGfD03UONhdrDRJsTk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=u9esigVtaNkWaPmmPvQ6OxFME/PuYU8prUzvVEQ/DiH4Ak6Pojoi+lLOfQtPX3vPM t3/arDejOOyBdaKNAxEJ0Z0hjwhMCDGLKdUbPQGNRyAH8nQRnXCSE9MbhRo3Lm5PXX w9v0xh44sRaqnxgqyTf3WU5wFaTfBbLIUgLIg7hB6wDRbBQGZEJjOzyQs3RGEbV2xv 5gTZM9Iu7uXB/xQrTrQXhUmLN2bvwensqNmNXFEoQmOy84l0NOC9GS7gvu9mbMg4zn hphW3nif5vh4tAu4LUPva9fU/XGSfy6f8Oe2BKcudPpxCRWO6wvhlD2JfCoT75Yi9T heeFkQmGbt8hw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Kees Cook , Peter Zijlstra , Thomas Gleixner , Michal Hocko , Vlastimil Babka , linux-mm@kvack.org, "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org Subject: [PATCH 11/19] kthread: Make sure kthread hasn't started while binding it Date: Wed, 7 Aug 2024 18:02:17 +0200 Message-ID: <20240807160228.26206-12-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Make sure the kthread is sleeping in the schedule_preempt_disabled() call before calling its handler when kthread_bind[_mask]() is called on it. This provides a sanity check verifying that the task is not randomly blocked later at some point within its function handler, in which case it could be just concurrently awaken, leaving the call to do_set_cpus_allowed() without any effect until the next voluntary sleep. Rely on the wake-up ordering to ensure that the newly introduced "started" field returns the expected value: TASK A TASK B ------ ------ READ kthread->started wake_up_process(B) rq_lock() ... rq_unlock() // RELEASE schedule() rq_lock() // ACQUIRE // schedule task B rq_unlock() WRITE kthread->started Similarly, writing kthread->started before subsequent voluntary sleeps will be visible after calling wait_task_inactive() in __kthread_bind_mask(), reporting potential misuse of the API. Upcoming patches will make further use of this facility. Acked-by: Vlastimil Babka Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/kernel/kthread.c b/kernel/kthread.c index f7be976ff88a..ecb719f54f7a 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -53,6 +53,7 @@ struct kthread_create_info struct kthread { unsigned long flags; unsigned int cpu; + int started; int result; int (*threadfn)(void *); void *data; @@ -382,6 +383,8 @@ static int kthread(void *_create) schedule_preempt_disabled(); preempt_enable(); + self->started = 1; + ret = -EINTR; if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) { cgroup_kthread_ready(); @@ -540,7 +543,9 @@ static void __kthread_bind(struct task_struct *p, unsigned int cpu, unsigned int void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask) { + struct kthread *kthread = to_kthread(p); __kthread_bind_mask(p, mask, TASK_UNINTERRUPTIBLE); + WARN_ON_ONCE(kthread->started); } /** @@ -554,7 +559,9 @@ void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask) */ void kthread_bind(struct task_struct *p, unsigned int cpu) { + struct kthread *kthread = to_kthread(p); __kthread_bind(p, cpu, TASK_UNINTERRUPTIBLE); + WARN_ON_ONCE(kthread->started); } EXPORT_SYMBOL(kthread_bind); From patchwork Wed Aug 7 16:02:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756469 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8574F1442F2; Wed, 7 Aug 2024 16:03:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046591; cv=none; b=amT4tV+lu43qKud3hlhze1DWHJLiu/eqxQyG8lSG1zaQKRQwb+FKuc9rbvgoufoYzJMh04/QYSO9NtB8Qic+hCHD1bCa889CuysNbOeZFhgl9u6eb2cDjvCjeqcmO5MfYe1jy67uIZZPYxzGUGsby5bvEQacjAl7fMQ+J1D8GLw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046591; c=relaxed/simple; bh=Rrk8ndlaediFaYaEu8Vy9mx3y5m6nelHltTuw6hlLeM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rk+vD+rofYHZIX1gVjvt9wt8h28PAwcC6JHZIj/23RR4GixjTnLWm9+hxABuWoNgkHNakHUttCka6Do+KwRODmGQ2SMIIfv7dGqLPRezkLfmb4DCaSw0QiRIBvYeVmfKEeqiywpfxDL9Vag/7aoVIdQjLuEp8n9o0sl3DJUqDdM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QPmXmXay; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QPmXmXay" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E6FD0C4AF12; Wed, 7 Aug 2024 16:03:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046590; bh=Rrk8ndlaediFaYaEu8Vy9mx3y5m6nelHltTuw6hlLeM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QPmXmXaymg6t7h15lBu/3348hOVNH5ElKYhXHPEkQknD8Gta7FRz1QHhhqgDuQU23 NgW/Qdu+u0YdOY8PpwLigl2vICH99j9y1zxtKqaD5munJXZbgu6an7v5YWKpgh2xcS 7Rar4jmX9SJu92hRuWgDjkYEMHe2pZTxChm6/2jxTSHYImSlYvAlbkDAjpwBziKLcV qJIW6cC1/+XPlG0bPetzb4TIGh+LNUk050iua84tVm7w+MW7m6vdzSTfn0NL2PET7d XGrnyHz3gWXYTrA5QwilPeG2taSn4C8BhmEDh1rL90FW5uvaWq6HLzvge98EN6e01Q 9jqdbPtEBjZlg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Kees Cook , Peter Zijlstra , Thomas Gleixner , Michal Hocko , Vlastimil Babka , linux-mm@kvack.org, "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org Subject: [PATCH 12/19] kthread: Default affine kthread to its preferred NUMA node Date: Wed, 7 Aug 2024 18:02:18 +0200 Message-ID: <20240807160228.26206-13-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Kthreads attached to a preferred NUMA node for their task structure allocation can also be assumed to run preferrably within that same node. A more precise affinity is usually notified by calling kthread_create_on_cpu() or kthread_bind[_mask]() before the first wakeup. For the others, a default affinity to the node is desired and sometimes implemented with more or less success when it comes to deal with hotplug events and nohz_full / CPU Isolation interactions: - kcompactd is affine to its node and handles hotplug but not CPU Isolation - kswapd is affine to its node and ignores hotplug and CPU Isolation - A bunch of drivers create their kthreads on a specific node and don't take care about affining further. Handle that default node affinity preference at the generic level instead, provided a kthread is created on an actual node and doesn't apply any specific affinity such as a given CPU or a custom cpumask to bind to before its first wake-up. This generic handling is aware of CPU hotplug events and CPU isolation such that: * When a housekeeping CPU goes up and is part of the node of a given kthread, it is added to its applied affinity set (and possibly the default last resort online housekeeping set is removed from the set). * When a housekeeping CPU goes down while it was part of the node of a kthread, it is removed from the kthread's applied affinity. The last resort is to affine the kthread to all online housekeeping CPUs. Signed-off-by: Frederic Weisbecker Acked-by: Vlastimil Babka --- include/linux/cpuhotplug.h | 1 + kernel/kthread.c | 120 ++++++++++++++++++++++++++++++++++++- 2 files changed, 120 insertions(+), 1 deletion(-) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 9316c39260e0..89d852538b72 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -240,6 +240,7 @@ enum cpuhp_state { CPUHP_AP_WORKQUEUE_ONLINE, CPUHP_AP_RANDOM_ONLINE, CPUHP_AP_RCUTREE_ONLINE, + CPUHP_AP_KTHREADS_ONLINE, CPUHP_AP_BASE_CACHEINFO_ONLINE, CPUHP_AP_ONLINE_DYN, CPUHP_AP_ONLINE_DYN_END = CPUHP_AP_ONLINE_DYN + 40, diff --git a/kernel/kthread.c b/kernel/kthread.c index ecb719f54f7a..eee5925e7725 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -35,6 +35,10 @@ static DEFINE_SPINLOCK(kthread_create_lock); static LIST_HEAD(kthread_create_list); struct task_struct *kthreadd_task; +static struct cpumask kthread_online_mask; +static LIST_HEAD(kthreads_hotplug); +static DEFINE_MUTEX(kthreads_hotplug_lock); + struct kthread_create_info { /* Information passed to kthread() from kthreadd. */ @@ -53,6 +57,7 @@ struct kthread_create_info struct kthread { unsigned long flags; unsigned int cpu; + unsigned int node; int started; int result; int (*threadfn)(void *); @@ -64,6 +69,8 @@ struct kthread { #endif /* To store the full name if task comm is truncated. */ char *full_name; + struct task_struct *task; + struct list_head hotplug_node; }; enum KTHREAD_BITS { @@ -122,8 +129,11 @@ bool set_kthread_struct(struct task_struct *p) init_completion(&kthread->exited); init_completion(&kthread->parked); + INIT_LIST_HEAD(&kthread->hotplug_node); p->vfork_done = &kthread->exited; + kthread->task = p; + kthread->node = tsk_fork_get_node(current); p->worker_private = kthread; return true; } @@ -314,6 +324,13 @@ void __noreturn kthread_exit(long result) { struct kthread *kthread = to_kthread(current); kthread->result = result; + if (!list_empty(&kthread->hotplug_node)) { + mutex_lock(&kthreads_hotplug_lock); + list_del(&kthread->hotplug_node); + /* Make sure the kthread never gets re-affined globally */ + set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD)); + mutex_unlock(&kthreads_hotplug_lock); + } do_exit(0); } EXPORT_SYMBOL(kthread_exit); @@ -339,6 +356,45 @@ void __noreturn kthread_complete_and_exit(struct completion *comp, long code) } EXPORT_SYMBOL(kthread_complete_and_exit); +static void kthread_fetch_affinity(struct kthread *k, struct cpumask *mask) +{ + if (k->node == NUMA_NO_NODE) { + cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + } else { + /* + * The node cpumask is racy when read from kthread() but: + * - a racing CPU going down won't be present in kthread_online_mask + * - a racing CPU going up will be handled by kthreads_online_cpu() + */ + cpumask_and(mask, cpumask_of_node(k->node), &kthread_online_mask); + cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + if (cpumask_empty(mask)) + cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + } +} + +static int kthread_affine_node(void) +{ + struct kthread *kthread = to_kthread(current); + cpumask_var_t affinity; + + WARN_ON_ONCE(kthread_is_per_cpu(current)); + + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) + return -ENOMEM; + + mutex_lock(&kthreads_hotplug_lock); + WARN_ON_ONCE(!list_empty(&kthread->hotplug_node)); + list_add_tail(&kthread->hotplug_node, &kthreads_hotplug); + kthread_fetch_affinity(kthread, affinity); + set_cpus_allowed_ptr(current, affinity); + mutex_unlock(&kthreads_hotplug_lock); + + free_cpumask_var(affinity); + + return 0; +} + static int kthread(void *_create) { static const struct sched_param param = { .sched_priority = 0 }; @@ -369,7 +425,6 @@ static int kthread(void *_create) * back to default in case they have been changed. */ sched_setscheduler_nocheck(current, SCHED_NORMAL, ¶m); - set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD)); /* OK, tell user we're spawned, wait for stop or wakeup */ __set_current_state(TASK_UNINTERRUPTIBLE); @@ -385,6 +440,9 @@ static int kthread(void *_create) self->started = 1; + if (!(current->flags & PF_NO_SETAFFINITY)) + kthread_affine_node(); + ret = -EINTR; if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) { cgroup_kthread_ready(); @@ -779,6 +837,66 @@ int kthreadd(void *unused) return 0; } +static int kthreads_hotplug_update(void) +{ + cpumask_var_t affinity; + struct kthread *k; + int err; + + if (list_empty(&kthreads_hotplug)) + return 0; + + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) + return -ENOMEM; + + err = 0; + + list_for_each_entry(k, &kthreads_hotplug, hotplug_node) { + if (WARN_ON_ONCE((k->task->flags & PF_NO_SETAFFINITY) || + kthread_is_per_cpu(k->task))) { + err = -EINVAL; + continue; + } + kthread_fetch_affinity(k, affinity); + set_cpus_allowed_ptr(k->task, affinity); + } + + free_cpumask_var(affinity); + + return err; +} + +static int kthreads_offline_cpu(unsigned int cpu) +{ + int ret = 0; + + mutex_lock(&kthreads_hotplug_lock); + cpumask_clear_cpu(cpu, &kthread_online_mask); + ret = kthreads_hotplug_update(); + mutex_unlock(&kthreads_hotplug_lock); + + return ret; +} + +static int kthreads_online_cpu(unsigned int cpu) +{ + int ret = 0; + + mutex_lock(&kthreads_hotplug_lock); + cpumask_set_cpu(cpu, &kthread_online_mask); + ret = kthreads_hotplug_update(); + mutex_unlock(&kthreads_hotplug_lock); + + return ret; +} + +static int kthreads_init(void) +{ + return cpuhp_setup_state(CPUHP_AP_KTHREADS_ONLINE, "kthreads:online", + kthreads_online_cpu, kthreads_offline_cpu); +} +early_initcall(kthreads_init); + void __kthread_init_worker(struct kthread_worker *worker, const char *name, struct lock_class_key *key) From patchwork Wed Aug 7 16:02:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756470 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DCFAC13A26B; Wed, 7 Aug 2024 16:03:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046598; cv=none; b=F8YATjHbtZh2GWNBCuCNVv8dOIFXJ39WaItTivIeBrx7bysm7RUpUGYB2pFdrb/P2Rpw8lKwWwRbvHGTCFyUsGo1S7F6IkLDeWhpC+Ilo623cT8sdx9OFzF7Tk1iyHdm9Ui7IUDhcMdTxqknTXkeNgjVVu944I6wqeEEsA84e/8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046598; c=relaxed/simple; bh=ZJ7YBVdrjvsvoNjdhdYcfK+MVnXttd34QbaAF6bmuEI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LU1OKV+zRVs9WjLxVYr2Ysyczb4MgR+6IcNk4X1nPBciiaq0C+0xbJtUQ+aoKxiVXL/0evTZZzmaxrTTMsAHmk0qTlxmatF1KyxkE7H5OpRTo1ZXDI4ku70eJSlYPKyWeZ2gYWpxhDsA26GHAmrzeZUMmJRSj3mJ4TFyWG+574U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GHLEnZZ6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GHLEnZZ6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id ECE3DC32781; Wed, 7 Aug 2024 16:03:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046597; bh=ZJ7YBVdrjvsvoNjdhdYcfK+MVnXttd34QbaAF6bmuEI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GHLEnZZ6UuI+KikYPd7gMsreo7ZMeA3s/h3S/FXFD4ujUgZg+6YzFSn1aPc5DGaTE Xwv2IIug4AimbpLdO/MIJzfIouGYEDIk2joDm6gMCLAN4WvlZqTth1Qh7b1juXGK+t w18QYTM6KBiKfeIKiuBBhsZC+/CL79zM3zd7KEK5DDA7uUzHSpTyeoVMK8cxQ4+Uem z6e/mN/P/SYUJRCeolqLU2J8lZiv9r+43Q7LVqOJw3fCvmGlqQnFz1URzW1f2PQ587 tm2vajoGCi30c8UlLQ+ys7oKo88Ka4Y2nHPObvoslfvUEKCm+DXmkykOayfcn1OOIy cIlW0ss730lGA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Kees Cook , Peter Zijlstra , Thomas Gleixner , Michal Hocko , Vlastimil Babka , linux-mm@kvack.org, "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org Subject: [PATCH 15/19] kthread: Implement preferred affinity Date: Wed, 7 Aug 2024 18:02:21 +0200 Message-ID: <20240807160228.26206-16-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Affining kthreads follow either of four existing different patterns: 1) Per-CPU kthreads must stay affine to a single CPU and never execute relevant code on any other CPU. This is currently handled by smpboot code which takes care of CPU-hotplug operations. 2) Kthreads that _have_ to be affine to a specific set of CPUs and can't run anywhere else. The affinity is set through kthread_bind_mask() and the subsystem takes care by itself to handle CPU-hotplug operations. 3) Kthreads that prefer to be affine to a specific NUMA node. That preferred affinity is applied by default when an actual node ID is passed on kthread creation, provided the kthread is not per-CPU and no call to kthread_bind_mask() has been issued before the first wake-up. 4) Similar to the previous point but kthreads have a preferred affinity different than a node. It is set manually like any other task and CPU-hotplug is supposed to be handled by the relevant subsystem so that the task is properly reaffined whenever a given CPU from the preferred affinity comes up or down. Also care must be taken so that the preferred affinity doesn't cross housekeeping cpumask boundaries. Provide a function to handle the last usecase, mostly reusing the current node default affinity infrastructure. kthread_affine_preferred() is introduced, to be used just like kthread_bind_mask(), right after kthread creation and before the first wake up. The kthread is then affine right away to the cpumask passed through the API if it has online housekeeping CPUs. Otherwise it will be affine to all online housekeeping CPUs as a last resort. As with node affinity, it is aware of CPU hotplug events such that: * When a housekeeping CPU goes up and is part of the preferred affinity of a given kthread, it is added to its applied affinity set (and possibly the default last resort online housekeeping set is removed from the set). * When a housekeeping CPU goes down while it was part of the preferred affinity of a kthread, it is removed from the kthread's applied affinity. The last resort is to affine the kthread to all online housekeeping CPUs. Acked-by: Vlastimil Babka Signed-off-by: Frederic Weisbecker --- include/linux/kthread.h | 1 + kernel/kthread.c | 69 ++++++++++++++++++++++++++++++++++++----- 2 files changed, 62 insertions(+), 8 deletions(-) diff --git a/include/linux/kthread.h b/include/linux/kthread.h index b11f53c1ba2e..30209bdf83a2 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -85,6 +85,7 @@ kthread_run_on_cpu(int (*threadfn)(void *data), void *data, void free_kthread_struct(struct task_struct *k); void kthread_bind(struct task_struct *k, unsigned int cpu); void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask); +int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask); int kthread_stop(struct task_struct *k); int kthread_stop_put(struct task_struct *k); bool kthread_should_stop(void); diff --git a/kernel/kthread.c b/kernel/kthread.c index eee5925e7725..e4ffc776928a 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -71,6 +71,7 @@ struct kthread { char *full_name; struct task_struct *task; struct list_head hotplug_node; + struct cpumask *preferred_affinity; }; enum KTHREAD_BITS { @@ -330,6 +331,11 @@ void __noreturn kthread_exit(long result) /* Make sure the kthread never gets re-affined globally */ set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD)); mutex_unlock(&kthreads_hotplug_lock); + + if (kthread->preferred_affinity) { + kfree(kthread->preferred_affinity); + kthread->preferred_affinity = NULL; + } } do_exit(0); } @@ -358,19 +364,25 @@ EXPORT_SYMBOL(kthread_complete_and_exit); static void kthread_fetch_affinity(struct kthread *k, struct cpumask *mask) { - if (k->node == NUMA_NO_NODE) { - cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); - } else { + const struct cpumask *pref; + + if (k->preferred_affinity) { + pref = k->preferred_affinity; + } else if (k->node != NUMA_NO_NODE) { /* * The node cpumask is racy when read from kthread() but: * - a racing CPU going down won't be present in kthread_online_mask * - a racing CPU going up will be handled by kthreads_online_cpu() */ - cpumask_and(mask, cpumask_of_node(k->node), &kthread_online_mask); - cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); - if (cpumask_empty(mask)) - cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + pref = cpumask_of_node(k->node); + } else { + pref = housekeeping_cpumask(HK_TYPE_KTHREAD); } + + cpumask_and(mask, pref, &kthread_online_mask); + cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + if (cpumask_empty(mask)) + cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD)); } static int kthread_affine_node(void) @@ -440,7 +452,7 @@ static int kthread(void *_create) self->started = 1; - if (!(current->flags & PF_NO_SETAFFINITY)) + if (!(current->flags & PF_NO_SETAFFINITY) && !self->preferred_affinity) kthread_affine_node(); ret = -EINTR; @@ -837,6 +849,47 @@ int kthreadd(void *unused) return 0; } +int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask) +{ + struct kthread *kthread = to_kthread(p); + cpumask_var_t affinity; + unsigned long flags; + int ret; + + if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) { + WARN_ON(1); + return -EINVAL; + } + + WARN_ON_ONCE(kthread->preferred_affinity); + + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) + return -ENOMEM; + + kthread->preferred_affinity = kzalloc(sizeof(struct cpumask), GFP_KERNEL); + if (!kthread->preferred_affinity) { + ret = -ENOMEM; + goto out; + } + + mutex_lock(&kthreads_hotplug_lock); + cpumask_copy(kthread->preferred_affinity, mask); + WARN_ON_ONCE(!list_empty(&kthread->hotplug_node)); + list_add_tail(&kthread->hotplug_node, &kthreads_hotplug); + kthread_fetch_affinity(kthread, affinity); + + /* It's safe because the task is inactive. */ + raw_spin_lock_irqsave(&p->pi_lock, flags); + do_set_cpus_allowed(p, affinity); + raw_spin_unlock_irqrestore(&p->pi_lock, flags); + + mutex_unlock(&kthreads_hotplug_lock); +out: + free_cpumask_var(affinity); + + return 0; +} + static int kthreads_hotplug_update(void) { cpumask_var_t affinity; From patchwork Wed Aug 7 16:02:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756471 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 96150143C48; Wed, 7 Aug 2024 16:03:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046600; cv=none; b=UdQxo2RObZcHs3EDuZT0C9P5234urFexjA0U87/M4+FIMnHztxe7jzFrT3IdB0uTMD4WceeD06LVTki/ciLUkU5sAZ6yj9u7WyP0mIgjA/dTx9TeS5u5xzRakrzr6lqfvJH1Nx6FofHk4PE+71V4bpWS54iz3TJaUQB6nmwuJG4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046600; c=relaxed/simple; bh=Ne/dsyLLelRxT9PNILuzlG076E3PMgXD7LoRFHFuxtU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GplTWeHneD0YOl68imvb/2DLq2SF8VNbIXdNRV9OmCdI3P7iJe/RjYfAM9BbfGnAgOLmNZa2SUbJmEfwA2zdF4aV5gXFYrP4tMX7AB808TNwZ7Qt3aW8DJU+5NI8kEkr8cwPv5zGd1NaU5Zm8+od1exKOXSmNTEThZhZU9KkOBQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=DoSO5HtG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="DoSO5HtG" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2E79FC32781; Wed, 7 Aug 2024 16:03:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046600; bh=Ne/dsyLLelRxT9PNILuzlG076E3PMgXD7LoRFHFuxtU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DoSO5HtGHtyhc8XvQ4DhBLKCeExav2uNL31/lWi3a3QyNF6qLArWOMam4Yj9wOBHp 12MaGclgtZfLOmNing6Kj47Y3NRtStVN+Gcq9C3+MrmslvQOQ8opcWB1HIZha/LwA8 TFjqsoPZF38cVVUu+Q8ebSgIAI/p15LQdG2zT8CwxXNgG2++JWc21CK35tdank1obK Js3F+BfzLfWX0HyrwtPBWzvrO7YUvtrOP0VH4Yobk9hGD7Brsyk6mNo4Mu3ydE9wpt lqPObX4AXP2hyP9KQYgR9hw3ZBo1CpefUj4tdl0KvUcWUB9lmyKCaqBQ3rRmvLSuP1 ZSD+k81l4n5/A== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org, Andrew Morton , Peter Zijlstra , Thomas Gleixner Subject: [PATCH 16/19] rcu: Use kthread preferred affinity for RCU boost Date: Wed, 7 Aug 2024 18:02:22 +0200 Message-ID: <20240807160228.26206-17-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Now that kthreads have an infrastructure to handle preferred affinity against CPU hotplug and housekeeping cpumask, convert RCU boost to use it instead of handling all the constraints by itself. Acked-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker --- kernel/rcu/tree.c | 27 +++++++++++++++++++-------- kernel/rcu/tree_plugin.h | 11 ++--------- 2 files changed, 21 insertions(+), 17 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e641cc681901..945e196090ed 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -149,7 +149,6 @@ static int rcu_scheduler_fully_active __read_mostly; static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, unsigned long gps, unsigned long flags); -static struct task_struct *rcu_boost_task(struct rcu_node *rnp); static void invoke_rcu_core(void); static void rcu_report_exp_rdp(struct rcu_data *rdp); static void sync_sched_exp_online_cleanup(int cpu); @@ -4932,6 +4931,22 @@ int rcutree_prepare_cpu(unsigned int cpu) return 0; } +static void rcu_thread_affine_rnp(struct task_struct *t, struct rcu_node *rnp) +{ + cpumask_var_t affinity; + int cpu; + + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) + return; + + for_each_leaf_node_possible_cpu(rnp, cpu) + cpumask_set_cpu(cpu, affinity); + + kthread_affine_preferred(t, affinity); + + free_cpumask_var(affinity); +} + /* * Update kthreads affinity during CPU-hotplug changes. * @@ -4951,19 +4966,18 @@ static void rcutree_affinity_setting(unsigned int cpu, int outgoingcpu) unsigned long mask; struct rcu_data *rdp; struct rcu_node *rnp; - struct task_struct *task_boost, *task_exp; + struct task_struct *task_exp; rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; - task_boost = rcu_boost_task(rnp); task_exp = rcu_exp_par_gp_task(rnp); /* - * If CPU is the boot one, those tasks are created later from early + * If CPU is the boot one, this task is created later from early * initcall since kthreadd must be created first. */ - if (!task_boost && !task_exp) + if (!task_exp) return; if (!zalloc_cpumask_var(&cm, GFP_KERNEL)) @@ -4985,9 +4999,6 @@ static void rcutree_affinity_setting(unsigned int cpu, int outgoingcpu) if (task_exp) set_cpus_allowed_ptr(task_exp, cm); - if (task_boost) - set_cpus_allowed_ptr(task_boost, cm); - mutex_unlock(&rnp->kthread_mutex); free_cpumask_var(cm); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index c569da65b421..d1ff1b982ca2 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1216,16 +1216,13 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp) raw_spin_lock_irqsave_rcu_node(rnp, flags); rnp->boost_kthread_task = t; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + sp.sched_priority = kthread_prio; sched_setscheduler_nocheck(t, SCHED_FIFO, &sp); + rcu_thread_affine_rnp(t, rnp); wake_up_process(t); /* get to TASK_INTERRUPTIBLE quickly. */ } -static struct task_struct *rcu_boost_task(struct rcu_node *rnp) -{ - return READ_ONCE(rnp->boost_kthread_task); -} - #else /* #ifdef CONFIG_RCU_BOOST */ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags) @@ -1242,10 +1239,6 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp) { } -static struct task_struct *rcu_boost_task(struct rcu_node *rnp) -{ - return NULL; -} #endif /* #else #ifdef CONFIG_RCU_BOOST */ /* From patchwork Wed Aug 7 16:02:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756472 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A294A145B03; Wed, 7 Aug 2024 16:03:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046603; cv=none; b=oNlp0EKd5vTEAPzWHIP5Rb49uERcbu3tUz0kB5xpLyw9BJqVQHk+1wSfdnD1wniaVNj9EjVIrfHExXQed+jSLH2oJgeg15udx0WoJqgqWyDHLsosCY7QrKGHraRfhXSajSZwdpof4wpFXHB4tgJtNDuIR9qlQYFnBN6RY/JVAPM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046603; c=relaxed/simple; bh=9qI30ZQo3Tdt7iXgrN5KIoFtWBGg6s/0Oq9JRQN/dY4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CXmYI4le+tTow4ZtUyf0TaNUVm4zuGECcwgvg1KNhqM7W3JWmwQLoNmR+kODdBU7+8/sqFpAkQcDTKcgej3CeJPpWRbVbnml7+ZL3dbx3fNw/1BZh7urbBlQnq6YHuFUG89dXXe5pBy9xQm9pGabFrdaa5hd5u+3NqzfDki6V50= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dj446mh+; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dj446mh+" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DDE53C4AF0F; Wed, 7 Aug 2024 16:03:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046603; bh=9qI30ZQo3Tdt7iXgrN5KIoFtWBGg6s/0Oq9JRQN/dY4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dj446mh+H6yk+zGGOLG9sPl2h9HkNOIC34vPSdx8FYSe6GGdDGWMXISbQK2K6vJPc 1oQz+1MruS0f+NimYoD+/foleHObIb9RnV8a9EmSSsq2ONLewLpZFpYarK1H8uxYnv 6KxEKCK4EHxfs82jWqGsRxFiZWa0EvxkmdRPH7dMYwWz1rUhzHYIvxUe3DQrzAmAVx BlaVTMbQBIwpoIDkD0WCEM/4dxfGDMUkppJkEJ3vJ9U5uh4Kzb6M6Jy/Wqr4Bpu7rY 29JW5Bw5xph3VYXxhsbw6uzouvmUihYQQNnwvnOgfN1+2VX2SC5BJh5w6l5RQv/Ecm EnxjN26+NujYw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org, Andrew Morton , Peter Zijlstra , Thomas Gleixner Subject: [PATCH 17/19] kthread: Unify kthread_create_on_cpu() and kthread_create_worker_on_cpu() automatic format Date: Wed, 7 Aug 2024 18:02:23 +0200 Message-ID: <20240807160228.26206-18-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 kthread_create_on_cpu() uses the CPU argument as an implicit and unique printf argument to add to the format whereas kthread_create_worker_on_cpu() still relies on explicitly passing the printf arguments. This difference in behaviour is error prone and doesn't help standardizing per-CPU kthread names. Unify the behaviours and convert kthread_create_worker_on_cpu() to use the printf behaviour of kthread_create_on_cpu(). Signed-off-by: Frederic Weisbecker --- fs/erofs/zdata.c | 2 +- include/linux/kthread.h | 21 +++++++++++---- kernel/kthread.c | 59 ++++++++++++++++++++++++----------------- 3 files changed, 52 insertions(+), 30 deletions(-) diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index 424f656cd765..f5c4d9ffac8e 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -353,7 +353,7 @@ static void erofs_destroy_percpu_workers(void) static struct kthread_worker *erofs_init_percpu_worker(int cpu) { struct kthread_worker *worker = - kthread_create_worker_on_cpu(cpu, 0, "erofs_worker/%u", cpu); + kthread_create_worker_on_cpu(cpu, 0, "erofs_worker/%u"); if (IS_ERR(worker)) return worker; diff --git a/include/linux/kthread.h b/include/linux/kthread.h index 30209bdf83a2..0c66e7c1092a 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -187,13 +187,24 @@ extern void __kthread_init_worker(struct kthread_worker *worker, int kthread_worker_fn(void *worker_ptr); -__printf(2, 3) +__printf(3, 4) +struct kthread_worker *kthread_create_worker_on_node(unsigned int flags, + int node, + const char namefmt[], ...); + +#define kthread_create_worker(flags, namefmt, ...) \ +({ \ + struct kthread_worker *__kw \ + = kthread_create_worker_on_node(flags, NUMA_NO_NODE, \ + namefmt, ## __VA_ARGS__); \ + if (!IS_ERR(__kw)) \ + wake_up_process(__kw->task); \ + __kw; \ +}) + struct kthread_worker * -kthread_create_worker(unsigned int flags, const char namefmt[], ...); - -__printf(3, 4) struct kthread_worker * kthread_create_worker_on_cpu(int cpu, unsigned int flags, - const char namefmt[], ...); + const char namefmt[]); bool kthread_queue_work(struct kthread_worker *worker, struct kthread_work *work); diff --git a/kernel/kthread.c b/kernel/kthread.c index e4ffc776928a..d318621c6f4f 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -1033,12 +1033,11 @@ int kthread_worker_fn(void *worker_ptr) EXPORT_SYMBOL_GPL(kthread_worker_fn); static __printf(3, 0) struct kthread_worker * -__kthread_create_worker(int cpu, unsigned int flags, - const char namefmt[], va_list args) +__kthread_create_worker_on_node(unsigned int flags, int node, + const char namefmt[], va_list args) { struct kthread_worker *worker; struct task_struct *task; - int node = NUMA_NO_NODE; worker = kzalloc(sizeof(*worker), GFP_KERNEL); if (!worker) @@ -1046,20 +1045,14 @@ __kthread_create_worker(int cpu, unsigned int flags, kthread_init_worker(worker); - if (cpu >= 0) - node = cpu_to_node(cpu); - task = __kthread_create_on_node(kthread_worker_fn, worker, - node, namefmt, args); + node, namefmt, args); if (IS_ERR(task)) goto fail_task; - if (cpu >= 0) - kthread_bind(task, cpu); - worker->flags = flags; worker->task = task; - wake_up_process(task); + return worker; fail_task: @@ -1070,6 +1063,7 @@ __kthread_create_worker(int cpu, unsigned int flags, /** * kthread_create_worker - create a kthread worker * @flags: flags modifying the default behavior of the worker + * @node: task structure for the thread is allocated on this node * @namefmt: printf-style name for the kthread worker (task). * * Returns a pointer to the allocated worker on success, ERR_PTR(-ENOMEM) @@ -1077,25 +1071,49 @@ __kthread_create_worker(int cpu, unsigned int flags, * when the caller was killed by a fatal signal. */ struct kthread_worker * -kthread_create_worker(unsigned int flags, const char namefmt[], ...) +kthread_create_worker_on_node(unsigned int flags, int node, const char namefmt[], ...) { struct kthread_worker *worker; va_list args; va_start(args, namefmt); - worker = __kthread_create_worker(-1, flags, namefmt, args); + worker = __kthread_create_worker_on_node(flags, node, namefmt, args); va_end(args); + if (worker) + wake_up_process(worker->task); + + return worker; +} +EXPORT_SYMBOL(kthread_create_worker_on_node); + +static __printf(3, 4) struct kthread_worker * +__kthread_create_worker_on_cpu(int cpu, unsigned int flags, + const char namefmt[], ...) +{ + struct kthread_worker *worker; + va_list args; + + va_start(args, namefmt); + worker = __kthread_create_worker_on_node(flags, cpu_to_node(cpu), + namefmt, args); + va_end(args); + + if (worker) { + kthread_bind(worker->task, cpu); + wake_up_process(worker->task); + } + return worker; } -EXPORT_SYMBOL(kthread_create_worker); /** * kthread_create_worker_on_cpu - create a kthread worker and bind it * to a given CPU and the associated NUMA node. * @cpu: CPU number * @flags: flags modifying the default behavior of the worker - * @namefmt: printf-style name for the kthread worker (task). + * @namefmt: printf-style name for the thread. Format is restricted + * to "name.*%u". Code fills in cpu number. * * Use a valid CPU number if you want to bind the kthread worker * to the given CPU and the associated NUMA node. @@ -1127,16 +1145,9 @@ EXPORT_SYMBOL(kthread_create_worker); */ struct kthread_worker * kthread_create_worker_on_cpu(int cpu, unsigned int flags, - const char namefmt[], ...) + const char namefmt[]) { - struct kthread_worker *worker; - va_list args; - - va_start(args, namefmt); - worker = __kthread_create_worker(cpu, flags, namefmt, args); - va_end(args); - - return worker; + return __kthread_create_worker_on_cpu(cpu, flags, namefmt, cpu); } EXPORT_SYMBOL(kthread_create_worker_on_cpu); From patchwork Wed Aug 7 16:02:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756473 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56358145B1D; Wed, 7 Aug 2024 16:03:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046606; cv=none; b=DBqJnlDY7DLJBQDOdkhusEIXpU9UhqqEhUL++jV5+0IgF+NsFL5k+NW9cdZ+CNFJaG+5o0Mrvl2Tn0/kGStbOJvgQ2tG34n3EE9U0ktCgYZLUxlQOGsvtTeSHjrsGaccLuDKWfkPjPyw3JxQJcVyZdiZn0xmVgBfnh/G2kSmQsk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046606; c=relaxed/simple; bh=tl4oie1Tyy6+E/klxSeKzqf/qDGkSzUug+hS2JOM3Co=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CK98ZO/rHz/3xpeY4zTWerDNMbDPrvyKdxYCmbWyI3Q7eVjgT7uvHY9C8q8nt2ZuuBU8QY1sXP89592FjbznxsaaWALW1U0KJ8Yy43ru29r32yffG6W5lC57e4TA3IIBb1mtfn/380dutfzY72rutD9vkJS30mwQF9tqQx++5BQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=RRXpA22Y; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RRXpA22Y" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6F5BC4AF0B; Wed, 7 Aug 2024 16:03:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046606; bh=tl4oie1Tyy6+E/klxSeKzqf/qDGkSzUug+hS2JOM3Co=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RRXpA22YN9x0mMYpAcOmQbr50dgJE7n7D49kaJEaoBHEocSsqfVDMQQjNp2I/tAe6 412Xvsx5GLScP6p/O9RzP+4RDPKI7/LBv4Siyv4aVQXky5aJqyVlat7k9l5WCQT3VB x8vEFIRs+HkTnRqoLau13qFRlyN6V9dxOLFkbKC1x3frGm2v52ETC9WlKGokOzLLeu 9gmrP7p1hOJ7E4kQQZ87dTp1cBvX7jDm1KzbDnaAOU/S8dipNHDmttmGcllJX76svK 2JyEMftwR7ZnOqTfA0amQsvwF5+sWMHEfKtsjxjV0LzZdX3GNv1bmGk/mqtiR/PFXo RTT9udcCYDWng== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org, Andrew Morton , Peter Zijlstra , Thomas Gleixner Subject: [PATCH 18/19] treewide: Introduce kthread_run_worker[_on_cpu]() Date: Wed, 7 Aug 2024 18:02:24 +0200 Message-ID: <20240807160228.26206-19-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 kthread_create() creates a kthread without running it yet. kthread_run() creates a kthread and runs it. On the other hand, kthread_create_worker() creates a kthread worker and runs it. This difference in behaviours is confusing. Also there is no way to create a kthread worker and affine it using kthread_bind_mask() or kthread_affine_preferred() before starting it. Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]() that behaves just like kthread_run(). kthread_create_worker[_on_cpu]() will now only create a kthread worker without starting it. Signed-off-by: Frederic Weisbecker --- arch/x86/kvm/i8254.c | 2 +- crypto/crypto_engine.c | 2 +- drivers/cpufreq/cppc_cpufreq.c | 2 +- drivers/gpu/drm/drm_vblank_work.c | 2 +- .../drm/i915/gem/selftests/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/gt/selftest_execlists.c | 2 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 2 +- drivers/gpu/drm/i915/gt/selftest_slpc.c | 2 +- drivers/gpu/drm/i915/selftests/i915_request.c | 8 ++-- drivers/gpu/drm/msm/disp/msm_disp_snapshot.c | 2 +- drivers/gpu/drm/msm/msm_atomic.c | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 2 +- drivers/gpu/drm/msm/msm_kms.c | 2 +- .../platform/chips-media/wave5/wave5-vpu.c | 2 +- drivers/net/dsa/mv88e6xxx/chip.c | 2 +- drivers/net/ethernet/intel/ice/ice_dpll.c | 2 +- drivers/net/ethernet/intel/ice/ice_gnss.c | 2 +- drivers/net/ethernet/intel/ice/ice_ptp.c | 2 +- drivers/platform/chrome/cros_ec_spi.c | 2 +- drivers/ptp/ptp_clock.c | 2 +- drivers/spi/spi.c | 2 +- drivers/usb/typec/tcpm/tcpm.c | 2 +- drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +- drivers/watchdog/watchdog_dev.c | 2 +- fs/erofs/zdata.c | 2 +- include/linux/kthread.h | 48 ++++++++++++++++--- kernel/kthread.c | 31 +++--------- kernel/rcu/tree.c | 4 +- kernel/workqueue.c | 2 +- net/dsa/tag_ksz.c | 2 +- net/dsa/tag_ocelot_8021q.c | 2 +- net/dsa/tag_sja1105.c | 2 +- 32 files changed, 82 insertions(+), 65 deletions(-) diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c index cd57a517d04a..d7ab8780ab9e 100644 --- a/arch/x86/kvm/i8254.c +++ b/arch/x86/kvm/i8254.c @@ -681,7 +681,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags) pid_nr = pid_vnr(pid); put_pid(pid); - pit->worker = kthread_create_worker(0, "kvm-pit/%d", pid_nr); + pit->worker = kthread_run_worker(0, "kvm-pit/%d", pid_nr); if (IS_ERR(pit->worker)) goto fail_kthread; diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c index e60a0eb628e8..c7c16da5e649 100644 --- a/crypto/crypto_engine.c +++ b/crypto/crypto_engine.c @@ -517,7 +517,7 @@ struct crypto_engine *crypto_engine_alloc_init_and_set(struct device *dev, crypto_init_queue(&engine->queue, qlen); spin_lock_init(&engine->queue_lock); - engine->kworker = kthread_create_worker(0, "%s", engine->name); + engine->kworker = kthread_run_worker(0, "%s", engine->name); if (IS_ERR(engine->kworker)) { dev_err(dev, "failed to create crypto request pump task\n"); return NULL; diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index bafa32dd375d..0a567af2f06d 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -241,7 +241,7 @@ static void __init cppc_freq_invariance_init(void) if (fie_disabled) return; - kworker_fie = kthread_create_worker(0, "cppc_fie"); + kworker_fie = kthread_run_worker(0, "cppc_fie"); if (IS_ERR(kworker_fie)) { pr_warn("%s: failed to create kworker_fie: %ld\n", __func__, PTR_ERR(kworker_fie)); diff --git a/drivers/gpu/drm/drm_vblank_work.c b/drivers/gpu/drm/drm_vblank_work.c index 1752ffb44e1d..9cc71120246f 100644 --- a/drivers/gpu/drm/drm_vblank_work.c +++ b/drivers/gpu/drm/drm_vblank_work.c @@ -277,7 +277,7 @@ int drm_vblank_worker_init(struct drm_vblank_crtc *vblank) INIT_LIST_HEAD(&vblank->pending_work); init_waitqueue_head(&vblank->work_wait_queue); - worker = kthread_create_worker(0, "card%d-crtc%d", + worker = kthread_run_worker(0, "card%d-crtc%d", vblank->dev->primary->index, vblank->pipe); if (IS_ERR(worker)) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 89d4dc8b60c6..eb0158e43417 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -369,7 +369,7 @@ static int live_parallel_switch(void *arg) if (!data[n].ce[0]) continue; - worker = kthread_create_worker(0, "igt/parallel:%s", + worker = kthread_run_worker(0, "igt/parallel:%s", data[n].ce[0]->engine->name); if (IS_ERR(worker)) { err = PTR_ERR(worker); diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index 4202df5b8c12..7da05fd6807f 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -3574,7 +3574,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) arg[id].batch = NULL; arg[id].count = 0; - worker[id] = kthread_create_worker(0, "igt/smoke:%d", id); + worker[id] = kthread_run_worker(0, "igt/smoke:%d", id); if (IS_ERR(worker[id])) { err = PTR_ERR(worker[id]); break; diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 9ce8ff1c04fe..9d3aeb237295 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -1025,7 +1025,7 @@ static int __igt_reset_engines(struct intel_gt *gt, threads[tmp].engine = other; threads[tmp].flags = flags; - worker = kthread_create_worker(0, "igt/%s", + worker = kthread_run_worker(0, "igt/%s", other->name); if (IS_ERR(worker)) { err = PTR_ERR(worker); diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c b/drivers/gpu/drm/i915/gt/selftest_slpc.c index 4ecc4ae74a54..e218b229681f 100644 --- a/drivers/gpu/drm/i915/gt/selftest_slpc.c +++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c @@ -489,7 +489,7 @@ static int live_slpc_tile_interaction(void *arg) return -ENOMEM; for_each_gt(gt, i915, i) { - threads[i].worker = kthread_create_worker(0, "igt/slpc_parallel:%d", gt->info.id); + threads[i].worker = kthread_run_worker(0, "igt/slpc_parallel:%d", gt->info.id); if (IS_ERR(threads[i].worker)) { ret = PTR_ERR(threads[i].worker); diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index acae30a04a94..88870844b5bd 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -492,7 +492,7 @@ static int mock_breadcrumbs_smoketest(void *arg) for (n = 0; n < ncpus; n++) { struct kthread_worker *worker; - worker = kthread_create_worker(0, "igt/%d", n); + worker = kthread_run_worker(0, "igt/%d", n); if (IS_ERR(worker)) { ret = PTR_ERR(worker); ncpus = n; @@ -1645,7 +1645,7 @@ static int live_parallel_engines(void *arg) for_each_uabi_engine(engine, i915) { struct kthread_worker *worker; - worker = kthread_create_worker(0, "igt/parallel:%s", + worker = kthread_run_worker(0, "igt/parallel:%s", engine->name); if (IS_ERR(worker)) { err = PTR_ERR(worker); @@ -1806,7 +1806,7 @@ static int live_breadcrumbs_smoketest(void *arg) unsigned int i = idx * ncpus + n; struct kthread_worker *worker; - worker = kthread_create_worker(0, "igt/%d.%d", idx, n); + worker = kthread_run_worker(0, "igt/%d.%d", idx, n); if (IS_ERR(worker)) { ret = PTR_ERR(worker); goto out_flush; @@ -3219,7 +3219,7 @@ static int perf_parallel_engines(void *arg) memset(&engines[idx].p, 0, sizeof(engines[idx].p)); - worker = kthread_create_worker(0, "igt:%s", + worker = kthread_run_worker(0, "igt:%s", engine->name); if (IS_ERR(worker)) { err = PTR_ERR(worker); diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c index e75b97127c0d..2be00b11e557 100644 --- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c +++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c @@ -109,7 +109,7 @@ int msm_disp_snapshot_init(struct drm_device *drm_dev) mutex_init(&kms->dump_mutex); - kms->dump_worker = kthread_create_worker(0, "%s", "disp_snapshot"); + kms->dump_worker = kthread_run_worker(0, "%s", "disp_snapshot"); if (IS_ERR(kms->dump_worker)) DRM_ERROR("failed to create disp state task\n"); diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c index 9c45d641b521..a7a2384044ff 100644 --- a/drivers/gpu/drm/msm/msm_atomic.c +++ b/drivers/gpu/drm/msm/msm_atomic.c @@ -115,7 +115,7 @@ int msm_atomic_init_pending_timer(struct msm_pending_timer *timer, timer->kms = kms; timer->crtc_idx = crtc_idx; - timer->worker = kthread_create_worker(0, "atomic-worker-%d", crtc_idx); + timer->worker = kthread_run_worker(0, "atomic-worker-%d", crtc_idx); if (IS_ERR(timer->worker)) { int ret = PTR_ERR(timer->worker); timer->worker = NULL; diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 3666b42b4ecd..87f151dfc241 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -859,7 +859,7 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, gpu->funcs = funcs; gpu->name = name; - gpu->worker = kthread_create_worker(0, "gpu-worker"); + gpu->worker = kthread_run_worker(0, "gpu-worker"); if (IS_ERR(gpu->worker)) { ret = PTR_ERR(gpu->worker); gpu->worker = NULL; diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c index af6a6fcb1173..8db9f3afb8ac 100644 --- a/drivers/gpu/drm/msm/msm_kms.c +++ b/drivers/gpu/drm/msm/msm_kms.c @@ -269,7 +269,7 @@ int msm_drm_kms_init(struct device *dev, const struct drm_driver *drv) /* initialize event thread */ ev_thread = &priv->event_thread[drm_crtc_index(crtc)]; ev_thread->dev = ddev; - ev_thread->worker = kthread_create_worker(0, "crtc_event:%d", crtc->base.id); + ev_thread->worker = kthread_run_worker(0, "crtc_event:%d", crtc->base.id); if (IS_ERR(ev_thread->worker)) { ret = PTR_ERR(ev_thread->worker); DRM_DEV_ERROR(dev, "failed to create crtc_event kthread\n"); diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c index 7273254ecb03..c49f5ed461cf 100644 --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c @@ -231,7 +231,7 @@ static int wave5_vpu_probe(struct platform_device *pdev) dev_err(&pdev->dev, "failed to get irq resource, falling back to polling\n"); hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); dev->hrtimer.function = &wave5_vpu_timer_callback; - dev->worker = kthread_create_worker(0, "vpu_irq_thread"); + dev->worker = kthread_run_worker(0, "vpu_irq_thread"); if (IS_ERR(dev->worker)) { dev_err(&pdev->dev, "failed to create vpu irq worker\n"); ret = PTR_ERR(dev->worker); diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 5b4e2ce5470d..a5908e2ff2cf 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -393,7 +393,7 @@ static int mv88e6xxx_irq_poll_setup(struct mv88e6xxx_chip *chip) kthread_init_delayed_work(&chip->irq_poll_work, mv88e6xxx_irq_poll); - chip->kworker = kthread_create_worker(0, "%s", dev_name(chip->dev)); + chip->kworker = kthread_run_worker(0, "%s", dev_name(chip->dev)); if (IS_ERR(chip->kworker)) return PTR_ERR(chip->kworker); diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c b/drivers/net/ethernet/intel/ice/ice_dpll.c index e92be6f130a3..f065a1dc0414 100644 --- a/drivers/net/ethernet/intel/ice/ice_dpll.c +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c @@ -1833,7 +1833,7 @@ static int ice_dpll_init_worker(struct ice_pf *pf) struct kthread_worker *kworker; kthread_init_delayed_work(&d->work, ice_dpll_periodic_work); - kworker = kthread_create_worker(0, "ice-dplls-%s", + kworker = kthread_run_worker(0, "ice-dplls-%s", dev_name(ice_pf_to_dev(pf))); if (IS_ERR(kworker)) return PTR_ERR(kworker); diff --git a/drivers/net/ethernet/intel/ice/ice_gnss.c b/drivers/net/ethernet/intel/ice/ice_gnss.c index c8ea1af51ad3..fcd1f808b696 100644 --- a/drivers/net/ethernet/intel/ice/ice_gnss.c +++ b/drivers/net/ethernet/intel/ice/ice_gnss.c @@ -182,7 +182,7 @@ static struct gnss_serial *ice_gnss_struct_init(struct ice_pf *pf) pf->gnss_serial = gnss; kthread_init_delayed_work(&gnss->read_work, ice_gnss_read); - kworker = kthread_create_worker(0, "ice-gnss-%s", dev_name(dev)); + kworker = kthread_run_worker(0, "ice-gnss-%s", dev_name(dev)); if (IS_ERR(kworker)) { kfree(gnss); return NULL; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index e2786cc13286..9c2c65112781 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -3181,7 +3181,7 @@ static int ice_ptp_init_work(struct ice_pf *pf, struct ice_ptp *ptp) /* Allocate a kworker for handling work required for the ports * connected to the PTP hardware clock. */ - kworker = kthread_create_worker(0, "ice-ptp-%s", + kworker = kthread_run_worker(0, "ice-ptp-%s", dev_name(ice_pf_to_dev(pf))); if (IS_ERR(kworker)) return PTR_ERR(kworker); diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c index 86a3d32a7763..08f566cc1480 100644 --- a/drivers/platform/chrome/cros_ec_spi.c +++ b/drivers/platform/chrome/cros_ec_spi.c @@ -715,7 +715,7 @@ static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, int err; ec_spi->high_pri_worker = - kthread_create_worker(0, "cros_ec_spi_high_pri"); + kthread_run_worker(0, "cros_ec_spi_high_pri"); if (IS_ERR(ec_spi->high_pri_worker)) { err = PTR_ERR(ec_spi->high_pri_worker); diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index c56cd0f63909..89a4420972e7 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -295,7 +295,7 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, if (ptp->info->do_aux_work) { kthread_init_delayed_work(&ptp->aux_work, ptp_aux_kworker); - ptp->kworker = kthread_create_worker(0, "ptp%d", ptp->index); + ptp->kworker = kthread_run_worker(0, "ptp%d", ptp->index); if (IS_ERR(ptp->kworker)) { err = PTR_ERR(ptp->kworker); pr_err("failed to create ptp aux_worker %d\n", err); diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 6ebe5dd9bbb1..1a615fd696fc 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -2053,7 +2053,7 @@ static int spi_init_queue(struct spi_controller *ctlr) ctlr->busy = false; ctlr->queue_empty = true; - ctlr->kworker = kthread_create_worker(0, dev_name(&ctlr->dev)); + ctlr->kworker = kthread_run_worker(0, dev_name(&ctlr->dev)); if (IS_ERR(ctlr->kworker)) { dev_err(&ctlr->dev, "failed to create message pump kworker\n"); return PTR_ERR(ctlr->kworker); diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c index 26f9006e95e1..0d03a8667565 100644 --- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -7583,7 +7583,7 @@ struct tcpm_port *tcpm_register_port(struct device *dev, struct tcpc_dev *tcpc) mutex_init(&port->lock); mutex_init(&port->swap_lock); - port->wq = kthread_create_worker(0, dev_name(dev)); + port->wq = kthread_run_worker(0, dev_name(dev)); if (IS_ERR(port->wq)) return ERR_CAST(port->wq); sched_set_fifo(port->wq->task); diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index 8ffea8430f95..c204fc8e471a 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -229,7 +229,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr, dev = &vdpasim->vdpa.dev; kthread_init_work(&vdpasim->work, vdpasim_work_fn); - vdpasim->worker = kthread_create_worker(0, "vDPA sim worker: %s", + vdpasim->worker = kthread_run_worker(0, "vDPA sim worker: %s", dev_attr->name); if (IS_ERR(vdpasim->worker)) goto err_iommu; diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 4190cb800cc4..19698d87dc57 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -1229,7 +1229,7 @@ int __init watchdog_dev_init(void) { int err; - watchdog_kworker = kthread_create_worker(0, "watchdogd"); + watchdog_kworker = kthread_run_worker(0, "watchdogd"); if (IS_ERR(watchdog_kworker)) { pr_err("Failed to create watchdog kworker\n"); return PTR_ERR(watchdog_kworker); diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index f5c4d9ffac8e..e13959973fc4 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -353,7 +353,7 @@ static void erofs_destroy_percpu_workers(void) static struct kthread_worker *erofs_init_percpu_worker(int cpu) { struct kthread_worker *worker = - kthread_create_worker_on_cpu(cpu, 0, "erofs_worker/%u"); + kthread_run_worker_on_cpu(cpu, 0, "erofs_worker/%u"); if (IS_ERR(worker)) return worker; diff --git a/include/linux/kthread.h b/include/linux/kthread.h index 0c66e7c1092a..8d27403888ce 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -193,19 +193,53 @@ struct kthread_worker *kthread_create_worker_on_node(unsigned int flags, const char namefmt[], ...); #define kthread_create_worker(flags, namefmt, ...) \ -({ \ - struct kthread_worker *__kw \ - = kthread_create_worker_on_node(flags, NUMA_NO_NODE, \ - namefmt, ## __VA_ARGS__); \ - if (!IS_ERR(__kw)) \ - wake_up_process(__kw->task); \ - __kw; \ + kthread_create_worker_on_node(flags, NUMA_NO_NODE, namefmt, ## __VA_ARGS__); + +/** + * kthread_run_worker - create and wake a kthread worker. + * @flags: flags modifying the default behavior of the worker + * @namefmt: printf-style name for the thread. + * + * Description: Convenient wrapper for kthread_create_worker() followed by + * wake_up_process(). Returns the kthread_worker or ERR_PTR(-ENOMEM). + */ +#define kthread_run_worker(flags, namefmt, ...) \ +({ \ + struct kthread_worker *__kw \ + = kthread_create_worker(flags, namefmt, ## __VA_ARGS__); \ + if (!IS_ERR(__kw)) \ + wake_up_process(__kw->task); \ + __kw; \ }) struct kthread_worker * kthread_create_worker_on_cpu(int cpu, unsigned int flags, const char namefmt[]); +/** + * kthread_run_worker_on_cpu - create and wake a cpu bound kthread worker. + * @cpu: CPU number + * @flags: flags modifying the default behavior of the worker + * @namefmt: printf-style name for the thread. Format is restricted + * to "name.*%u". Code fills in cpu number. + * + * Description: Convenient wrapper for kthread_create_worker_on_cpu() + * followed by wake_up_process(). Returns the kthread_worker or + * ERR_PTR(-ENOMEM). + */ +static inline struct kthread_worker * +kthread_run_worker_on_cpu(int cpu, unsigned int flags, + const char namefmt[]) +{ + struct kthread_worker *kw; + + kw = kthread_create_worker_on_cpu(cpu, flags, namefmt); + if (!IS_ERR(kw)) + wake_up_process(kw->task); + + return kw; +} + bool kthread_queue_work(struct kthread_worker *worker, struct kthread_work *work); diff --git a/kernel/kthread.c b/kernel/kthread.c index d318621c6f4f..a203e366204c 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -1080,33 +1080,10 @@ kthread_create_worker_on_node(unsigned int flags, int node, const char namefmt[] worker = __kthread_create_worker_on_node(flags, node, namefmt, args); va_end(args); - if (worker) - wake_up_process(worker->task); - return worker; } EXPORT_SYMBOL(kthread_create_worker_on_node); -static __printf(3, 4) struct kthread_worker * -__kthread_create_worker_on_cpu(int cpu, unsigned int flags, - const char namefmt[], ...) -{ - struct kthread_worker *worker; - va_list args; - - va_start(args, namefmt); - worker = __kthread_create_worker_on_node(flags, cpu_to_node(cpu), - namefmt, args); - va_end(args); - - if (worker) { - kthread_bind(worker->task, cpu); - wake_up_process(worker->task); - } - - return worker; -} - /** * kthread_create_worker_on_cpu - create a kthread worker and bind it * to a given CPU and the associated NUMA node. @@ -1147,7 +1124,13 @@ struct kthread_worker * kthread_create_worker_on_cpu(int cpu, unsigned int flags, const char namefmt[]) { - return __kthread_create_worker_on_cpu(cpu, flags, namefmt, cpu); + struct kthread_worker *worker; + + worker = kthread_create_worker_on_node(flags, cpu_to_node(cpu), namefmt, cpu); + if (worker) + kthread_bind(worker->task, cpu); + + return worker; } EXPORT_SYMBOL(kthread_create_worker_on_cpu); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 945e196090ed..902d4b5abbe7 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -4827,7 +4827,7 @@ static void rcu_spawn_exp_par_gp_kworker(struct rcu_node *rnp) if (rnp->exp_kworker) return; - kworker = kthread_create_worker(0, name, rnp_index); + kworker = kthread_run_worker(0, name, rnp_index); if (IS_ERR_OR_NULL(kworker)) { pr_err("Failed to create par gp kworker on %d/%d\n", rnp->grplo, rnp->grphi); @@ -4854,7 +4854,7 @@ static void __init rcu_start_exp_gp_kworker(void) const char *name = "rcu_exp_gp_kthread_worker"; struct sched_param param = { .sched_priority = kthread_prio }; - rcu_exp_gp_kworker = kthread_create_worker(0, name); + rcu_exp_gp_kworker = kthread_run_worker(0, name); if (IS_ERR_OR_NULL(rcu_exp_gp_kworker)) { pr_err("Failed to create %s!\n", name); rcu_exp_gp_kworker = NULL; diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 1745ca788ede..ade8e7e526ac 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -7749,7 +7749,7 @@ static void __init wq_cpu_intensive_thresh_init(void) unsigned long thresh; unsigned long bogo; - pwq_release_worker = kthread_create_worker(0, "pool_workqueue_release"); + pwq_release_worker = kthread_run_worker(0, "pool_workqueue_release"); BUG_ON(IS_ERR(pwq_release_worker)); /* if the user set it to a specific value, keep it */ diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index ee7b272ab715..f21f2711921f 100644 --- a/net/dsa/tag_ksz.c +++ b/net/dsa/tag_ksz.c @@ -66,7 +66,7 @@ static int ksz_connect(struct dsa_switch *ds) if (!priv) return -ENOMEM; - xmit_worker = kthread_create_worker(0, "dsa%d:%d_xmit", + xmit_worker = kthread_run_worker(0, "dsa%d:%d_xmit", ds->dst->index, ds->index); if (IS_ERR(xmit_worker)) { ret = PTR_ERR(xmit_worker); diff --git a/net/dsa/tag_ocelot_8021q.c b/net/dsa/tag_ocelot_8021q.c index 8e8b1bef6af6..6ce0bc166792 100644 --- a/net/dsa/tag_ocelot_8021q.c +++ b/net/dsa/tag_ocelot_8021q.c @@ -110,7 +110,7 @@ static int ocelot_connect(struct dsa_switch *ds) if (!priv) return -ENOMEM; - priv->xmit_worker = kthread_create_worker(0, "felix_xmit"); + priv->xmit_worker = kthread_run_worker(0, "felix_xmit"); if (IS_ERR(priv->xmit_worker)) { err = PTR_ERR(priv->xmit_worker); kfree(priv); diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c index 3e902af7eea6..02adec693811 100644 --- a/net/dsa/tag_sja1105.c +++ b/net/dsa/tag_sja1105.c @@ -707,7 +707,7 @@ static int sja1105_connect(struct dsa_switch *ds) spin_lock_init(&priv->meta_lock); - xmit_worker = kthread_create_worker(0, "dsa%d:%d_xmit", + xmit_worker = kthread_run_worker(0, "dsa%d:%d_xmit", ds->dst->index, ds->index); if (IS_ERR(xmit_worker)) { err = PTR_ERR(xmit_worker); From patchwork Wed Aug 7 16:02:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 13756474 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A07DB145B2E; Wed, 7 Aug 2024 16:03:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046609; cv=none; b=BcHiZQ+MoTbbXBkq21W2aS9OpoUgpmDiSAopgbjWDs9dlMGIxJu44pDAKTruhsq+P4Lnr5L/fQAonaqV9+Kzh45x9y3CdOPFA3zehALX4XM9e9GJVKB8l0pDllBHShagsrvH7I/05CJy4DU6rCjcgZBg8YJ007rY7vm9gq2ezXM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723046609; c=relaxed/simple; bh=YkDPolWzrbmlPbWb3+WNRO58pHRGY65SSsMMAwry6vI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Lqyv6pJm0GZLRB+moRAvSyjpK73KxzHs1jsSRQ/rtjl5CmzEhimX7DSOLRwz724nePXFvPF6/tIcTzOq3ZPR/iB6xb1prBo/1vG5MGrVAmAls8k6kQDBBa1gZxP0fNtmalMkBmvvjkaspdy9jxEcV8L889horgWac4wEqm169AA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AThBGUuY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AThBGUuY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9FD74C4AF0D; Wed, 7 Aug 2024 16:03:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723046609; bh=YkDPolWzrbmlPbWb3+WNRO58pHRGY65SSsMMAwry6vI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AThBGUuYq7Y2XTGld+MpH15sepu3BMI02QuJkLK3qWuOnS5Ktyjc6yRMAPRNt2Yej xoR8aUtXDQgXgDxXRwbzkQAALFf8UbrEHx8D0KwjswFePypqz8mjo8DMncgKSUWlQX wYPdBlk6XWlxOK6AZT8V4AyXbSd/Urd7Kheb/qo1PrOvu3sWru2ibX5Bxik5Yv5Qge ecRHoJBmm6yKhr1CtK3wgJ6p9rgWUmS0ZSUaVBnE2L5UDCmAUXJUKTo+yWMt5qNqo7 vebhNaslqQz7M998AZLCfYpLPgtneK1ZooSLebzfthKzMnvd+EhNMq1OHudZMEfheP lnKk5L5KHGqQw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , "Paul E. McKenney" , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Zqiang , rcu@vger.kernel.org, Andrew Morton , Peter Zijlstra , Thomas Gleixner Subject: [PATCH 19/19] rcu: Use kthread preferred affinity for RCU exp kworkers Date: Wed, 7 Aug 2024 18:02:25 +0200 Message-ID: <20240807160228.26206-20-frederic@kernel.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240807160228.26206-1-frederic@kernel.org> References: <20240807160228.26206-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Now that kthreads have an infrastructure to handle preferred affinity against CPU hotplug and housekeeping cpumask, convert RCU exp workers to use it instead of handling all the constraints by itself. Acked-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker --- kernel/rcu/tree.c | 105 +++++++++------------------------------------- 1 file changed, 19 insertions(+), 86 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 902d4b5abbe7..118477a6dda4 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -4815,6 +4815,22 @@ rcu_boot_init_percpu_data(int cpu) rcu_boot_init_nocb_percpu_data(rdp); } +static void rcu_thread_affine_rnp(struct task_struct *t, struct rcu_node *rnp) +{ + cpumask_var_t affinity; + int cpu; + + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) + return; + + for_each_leaf_node_possible_cpu(rnp, cpu) + cpumask_set_cpu(cpu, affinity); + + kthread_affine_preferred(t, affinity); + + free_cpumask_var(affinity); +} + struct kthread_worker *rcu_exp_gp_kworker; static void rcu_spawn_exp_par_gp_kworker(struct rcu_node *rnp) @@ -4827,7 +4843,7 @@ static void rcu_spawn_exp_par_gp_kworker(struct rcu_node *rnp) if (rnp->exp_kworker) return; - kworker = kthread_run_worker(0, name, rnp_index); + kworker = kthread_create_worker(0, name, rnp_index); if (IS_ERR_OR_NULL(kworker)) { pr_err("Failed to create par gp kworker on %d/%d\n", rnp->grplo, rnp->grphi); @@ -4837,16 +4853,9 @@ static void rcu_spawn_exp_par_gp_kworker(struct rcu_node *rnp) if (IS_ENABLED(CONFIG_RCU_EXP_KTHREAD)) sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); -} -static struct task_struct *rcu_exp_par_gp_task(struct rcu_node *rnp) -{ - struct kthread_worker *kworker = READ_ONCE(rnp->exp_kworker); - - if (!kworker) - return NULL; - - return kworker->task; + rcu_thread_affine_rnp(kworker->task, rnp); + wake_up_process(kworker->task); } static void __init rcu_start_exp_gp_kworker(void) @@ -4931,79 +4940,6 @@ int rcutree_prepare_cpu(unsigned int cpu) return 0; } -static void rcu_thread_affine_rnp(struct task_struct *t, struct rcu_node *rnp) -{ - cpumask_var_t affinity; - int cpu; - - if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) - return; - - for_each_leaf_node_possible_cpu(rnp, cpu) - cpumask_set_cpu(cpu, affinity); - - kthread_affine_preferred(t, affinity); - - free_cpumask_var(affinity); -} - -/* - * Update kthreads affinity during CPU-hotplug changes. - * - * Set the per-rcu_node kthread's affinity to cover all CPUs that are - * served by the rcu_node in question. The CPU hotplug lock is still - * held, so the value of rnp->qsmaskinit will be stable. - * - * We don't include outgoingcpu in the affinity set, use -1 if there is - * no outgoing CPU. If there are no CPUs left in the affinity set, - * this function allows the kthread to execute on any CPU. - * - * Any future concurrent calls are serialized via ->kthread_mutex. - */ -static void rcutree_affinity_setting(unsigned int cpu, int outgoingcpu) -{ - cpumask_var_t cm; - unsigned long mask; - struct rcu_data *rdp; - struct rcu_node *rnp; - struct task_struct *task_exp; - - rdp = per_cpu_ptr(&rcu_data, cpu); - rnp = rdp->mynode; - - task_exp = rcu_exp_par_gp_task(rnp); - - /* - * If CPU is the boot one, this task is created later from early - * initcall since kthreadd must be created first. - */ - if (!task_exp) - return; - - if (!zalloc_cpumask_var(&cm, GFP_KERNEL)) - return; - - mutex_lock(&rnp->kthread_mutex); - mask = rcu_rnp_online_cpus(rnp); - for_each_leaf_node_possible_cpu(rnp, cpu) - if ((mask & leaf_node_cpu_bit(rnp, cpu)) && - cpu != outgoingcpu) - cpumask_set_cpu(cpu, cm); - cpumask_and(cm, cm, housekeeping_cpumask(HK_TYPE_RCU)); - if (cpumask_empty(cm)) { - cpumask_copy(cm, housekeeping_cpumask(HK_TYPE_RCU)); - if (outgoingcpu >= 0) - cpumask_clear_cpu(outgoingcpu, cm); - } - - if (task_exp) - set_cpus_allowed_ptr(task_exp, cm); - - mutex_unlock(&rnp->kthread_mutex); - - free_cpumask_var(cm); -} - /* * Has the specified (known valid) CPU ever been fully online? */ @@ -5032,7 +4968,6 @@ int rcutree_online_cpu(unsigned int cpu) if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE) return 0; /* Too early in boot for scheduler work. */ sync_sched_exp_online_cleanup(cpu); - rcutree_affinity_setting(cpu, -1); // Stop-machine done, so allow nohz_full to disable tick. tick_dep_clear(TICK_DEP_BIT_RCU); @@ -5249,8 +5184,6 @@ int rcutree_offline_cpu(unsigned int cpu) rnp->ffmask &= ~rdp->grpmask; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - rcutree_affinity_setting(cpu, cpu); - // nohz_full CPUs need the tick for stop-machine to work quickly tick_dep_set(TICK_DEP_BIT_RCU); return 0;