From patchwork Mon Sep 16 22:49:19 2024
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13805810
From: Frederic Weisbecker <frederic@kernel.org>
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Kees Cook, Peter Zijlstra,
    Thomas Gleixner, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
    "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
    Zqiang, rcu@vger.kernel.org
Subject: [PATCH 15/19] kthread: Implement preferred affinity
Date: Tue, 17 Sep 2024 00:49:19 +0200
Message-ID: <20240916224925.20540-16-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240916224925.20540-1-frederic@kernel.org>
References: <20240916224925.20540-1-frederic@kernel.org>
MIME-Version: 1.0
Affining kthreads currently follows one of four patterns:

1) Per-CPU kthreads must stay affine to a single CPU and never execute
   relevant code on any other CPU. This is currently handled by smpboot
   code which takes care of CPU-hotplug operations.

2) Kthreads that _have_ to be affine to a specific set of CPUs and can't
   run anywhere else. The affinity is set through kthread_bind_mask()
   and the subsystem takes care by itself to handle CPU-hotplug
   operations.

3) Kthreads that prefer to be affine to a specific NUMA node. That
   preferred affinity is applied by default when an actual node ID is
   passed on kthread creation, provided the kthread is not per-CPU and
   no call to kthread_bind_mask() has been issued before the first
   wake-up.

4) Similar to the previous point except the kthreads have a preferred
   affinity different than a node. It is set manually like any other
   task and CPU-hotplug is supposed to be handled by the relevant
   subsystem so that the task is properly reaffined whenever a given
   CPU from the preferred affinity comes up or down. Also care must be
   taken so that the preferred affinity doesn't cross housekeeping
   cpumask boundaries.

Provide a function to handle the last usecase, mostly reusing the
current node default affinity infrastructure. kthread_affine_preferred()
is introduced, to be used just like kthread_bind_mask(), right after
kthread creation and before the first wake-up. The kthread is then
affined right away to the cpumask passed through the API if it has
online housekeeping CPUs. Otherwise it will be affined to all online
housekeeping CPUs as a last resort.
As with node affinity, the preferred affinity is aware of CPU-hotplug
events such that:

* When a housekeeping CPU goes up and is part of the preferred affinity
  of a given kthread, it is added to that kthread's applied affinity
  set (and the default last-resort online housekeeping set is removed
  from the set if it was in use).

* When a housekeeping CPU goes down while it was part of the preferred
  affinity of a kthread, it is removed from the kthread's applied
  affinity. The last resort is to affine the kthread to all online
  housekeeping CPUs.

Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 include/linux/kthread.h |  1 +
 kernel/kthread.c        | 69 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index b11f53c1ba2e..30209bdf83a2 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -85,6 +85,7 @@ kthread_run_on_cpu(int (*threadfn)(void *data), void *data,
 void free_kthread_struct(struct task_struct *k);
 void kthread_bind(struct task_struct *k, unsigned int cpu);
 void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask);
+int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask);
 int kthread_stop(struct task_struct *k);
 int kthread_stop_put(struct task_struct *k);
 bool kthread_should_stop(void);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index eee5925e7725..e4ffc776928a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -71,6 +71,7 @@ struct kthread {
 	char *full_name;
 	struct task_struct *task;
 	struct list_head hotplug_node;
+	struct cpumask *preferred_affinity;
 };
 
 enum KTHREAD_BITS {
@@ -330,6 +331,11 @@ void __noreturn kthread_exit(long result)
 		/* Make sure the kthread never gets re-affined globally */
 		set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));
 		mutex_unlock(&kthreads_hotplug_lock);
+
+		if (kthread->preferred_affinity) {
+			kfree(kthread->preferred_affinity);
+			kthread->preferred_affinity = NULL;
+		}
 	}
 	do_exit(0);
 }
@@ -358,19 +364,25 @@ EXPORT_SYMBOL(kthread_complete_and_exit);
 
 static void kthread_fetch_affinity(struct kthread *k, struct cpumask *mask)
 {
-	if (k->node == NUMA_NO_NODE) {
-		cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
-	} else {
+	const struct cpumask *pref;
+
+	if (k->preferred_affinity) {
+		pref = k->preferred_affinity;
+	} else if (k->node != NUMA_NO_NODE) {
 		/*
 		 * The node cpumask is racy when read from kthread() but:
 		 * - a racing CPU going down won't be present in kthread_online_mask
 		 * - a racing CPU going up will be handled by kthreads_online_cpu()
 		 */
-		cpumask_and(mask, cpumask_of_node(k->node), &kthread_online_mask);
-		cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
-		if (cpumask_empty(mask))
-			cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+		pref = cpumask_of_node(k->node);
+	} else {
+		pref = housekeeping_cpumask(HK_TYPE_KTHREAD);
 	}
+
+	cpumask_and(mask, pref, &kthread_online_mask);
+	cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+	if (cpumask_empty(mask))
+		cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
 }
 
 static int kthread_affine_node(void)
@@ -440,7 +452,7 @@ static int kthread(void *_create)
 
 	self->started = 1;
 
-	if (!(current->flags & PF_NO_SETAFFINITY))
+	if (!(current->flags & PF_NO_SETAFFINITY) && !self->preferred_affinity)
 		kthread_affine_node();
 
 	ret = -EINTR;
@@ -837,6 +849,47 @@ int kthreadd(void *unused)
 	return 0;
 }
 
+int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask)
+{
+	struct kthread *kthread = to_kthread(p);
+	cpumask_var_t affinity;
+	unsigned long flags;
+	int ret;
+
+	if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) {
+		WARN_ON(1);
+		return -EINVAL;
+	}
+
+	WARN_ON_ONCE(kthread->preferred_affinity);
+
+	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
+		return -ENOMEM;
+
+	kthread->preferred_affinity = kzalloc(sizeof(struct cpumask), GFP_KERNEL);
+	if (!kthread->preferred_affinity) {
+		ret = -ENOMEM;
goto out; + } + + mutex_lock(&kthreads_hotplug_lock); + cpumask_copy(kthread->preferred_affinity, mask); + WARN_ON_ONCE(!list_empty(&kthread->hotplug_node)); + list_add_tail(&kthread->hotplug_node, &kthreads_hotplug); + kthread_fetch_affinity(kthread, affinity); + + /* It's safe because the task is inactive. */ + raw_spin_lock_irqsave(&p->pi_lock, flags); + do_set_cpus_allowed(p, affinity); + raw_spin_unlock_irqrestore(&p->pi_lock, flags); + + mutex_unlock(&kthreads_hotplug_lock); +out: + free_cpumask_var(affinity); + + return 0; +} + static int kthreads_hotplug_update(void) { cpumask_var_t affinity;