From patchwork Mon Sep 16 22:49:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13805806
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Kees Cook, Peter Zijlstra,
 Thomas Gleixner, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
 "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes, Boqun Feng, Zqiang,
 rcu@vger.kernel.org
Subject: [PATCH 11/19] kthread: Make sure kthread hasn't started while binding it
Date: Tue, 17 Sep 2024 00:49:15 +0200
Message-ID: <20240916224925.20540-12-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240916224925.20540-1-frederic@kernel.org>
References: <20240916224925.20540-1-frederic@kernel.org>

Make sure the kthread is sleeping in the schedule_preempt_disabled()
call before calling its handler when kthread_bind[_mask]() is called
on it. This provides a sanity check verifying that the task is not
randomly blocked later at some point within its function handler, in
which case it could be just concurrently awakened, leaving the call to
do_set_cpus_allowed() without any effect until the next voluntary
sleep.

Rely on the wake-up ordering to ensure that the newly introduced "started"
field returns the expected value:

    TASK A                                   TASK B
    ------                                   ------
READ kthread->started
wake_up_process(B)
   rq_lock()
   ...
   rq_unlock() // RELEASE
                                           schedule()
                                              rq_lock() // ACQUIRE
                                              // schedule task B
                                              rq_unlock()
                                              WRITE kthread->started

Similarly, writing kthread->started before subsequent voluntary sleeps
will be visible after calling wait_task_inactive() in
__kthread_bind_mask(), reporting potential misuse of the API.

Upcoming patches will make further use of this facility.

Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker
---
 kernel/kthread.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..ecb719f54f7a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -53,6 +53,7 @@ struct kthread_create_info
 struct kthread {
 	unsigned long flags;
 	unsigned int cpu;
+	int started;
 	int result;
 	int (*threadfn)(void *);
 	void *data;
@@ -382,6 +383,8 @@ static int kthread(void *_create)
 	schedule_preempt_disabled();
 	preempt_enable();
 
+	self->started = 1;
+
 	ret = -EINTR;
 	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
 		cgroup_kthread_ready();
@@ -540,7 +543,9 @@ static void __kthread_bind(struct task_struct *p, unsigned int cpu, unsigned int
 
 void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask)
 {
+	struct kthread *kthread = to_kthread(p);
 	__kthread_bind_mask(p, mask, TASK_UNINTERRUPTIBLE);
+	WARN_ON_ONCE(kthread->started);
 }
 
 /**
@@ -554,7 +559,9 @@ void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask)
  */
 void kthread_bind(struct task_struct *p, unsigned int cpu)
 {
+	struct kthread *kthread = to_kthread(p);
 	__kthread_bind(p, cpu, TASK_UNINTERRUPTIBLE);
+	WARN_ON_ONCE(kthread->started);
 }
 EXPORT_SYMBOL(kthread_bind);
 

From patchwork Mon Sep 16 22:49:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13805807
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Kees Cook, Peter Zijlstra,
 Thomas Gleixner, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
 "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes, Boqun Feng, Zqiang,
 rcu@vger.kernel.org
Subject: [PATCH 12/19] kthread: Default affine kthread to its preferred NUMA node
Date: Tue, 17 Sep 2024 00:49:16 +0200
Message-ID: <20240916224925.20540-13-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240916224925.20540-1-frederic@kernel.org>
References: <20240916224925.20540-1-frederic@kernel.org>

Kthreads attached to a preferred NUMA node for their task structure
allocation can also be assumed to run preferably within that same node.

A more precise affinity is usually requested by calling
kthread_create_on_cpu() or kthread_bind[_mask]() before the first wakeup.

For the others, a default affinity to the node is desired and sometimes
implemented with more or less success when it comes to dealing with
hotplug events and nohz_full / CPU Isolation interactions:

- kcompactd is affine to its node and handles hotplug but not CPU Isolation
- kswapd is affine to its node and ignores hotplug and CPU Isolation
- A bunch of drivers create their kthreads on a specific node and
  don't take care of affining further.

Handle that default node affinity preference at the generic level
instead, provided a kthread is created on an actual node and doesn't
apply any specific affinity such as a given CPU or a custom cpumask to
bind to before its first wake-up.

This generic handling is aware of CPU hotplug events and CPU isolation
such that:

* When a housekeeping CPU goes up and is part of the node of a given
  kthread, it is added to its applied affinity set (and possibly the
  default last resort online housekeeping set is removed from the set).

* When a housekeeping CPU goes down while it was part of the node of a
  kthread, it is removed from the kthread's applied affinity. The last
  resort is to affine the kthread to all online housekeeping CPUs.

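For illustration only, the mask selection described above can be modelled
in user space. This is a sketch, not the kernel implementation: CPU masks
are reduced to plain 64-bit integers, and MODEL_NO_NODE, model_cpumask_t
and model_fetch_affinity() are names made up for the example. It follows
the order of operations of kthread_fetch_affinity() in the diff: CPUs of
the node, restricted to online CPUs, restricted to housekeeping CPUs, with
the housekeeping mask as the last resort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NUMA_NO_NODE; not the kernel definition. */
#define MODEL_NO_NODE (-1)

/* One bit per CPU. A real cpumask is an arbitrary-length bitmap. */
typedef uint64_t model_cpumask_t;

/*
 * Pick the CPUs a node-affine kthread may run on:
 * (CPUs of the node) & (online CPUs) & (housekeeping CPUs),
 * falling back to the whole housekeeping mask when that set is empty
 * or when the kthread has no preferred node.
 */
static model_cpumask_t model_fetch_affinity(int node,
					    const model_cpumask_t *node_masks,
					    model_cpumask_t online,
					    model_cpumask_t housekeeping)
{
	model_cpumask_t mask;

	if (node == MODEL_NO_NODE)
		return housekeeping;

	assert(node_masks != NULL);
	mask = node_masks[node] & online & housekeeping;

	return mask ? mask : housekeeping;
}
```

With node 1 covering CPUs 4-7 (0xF0) and CPUs 0, 1, 4 and 5 as
housekeeping (0x33), the model yields 0x30 while all CPUs are online,
falls back to 0x33 once CPUs 4 and 5 go offline, and narrows to 0x10 when
CPU 4 comes back.
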
Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker
---
 include/linux/cpuhotplug.h |   1 +
 kernel/kthread.c           | 120 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 9316c39260e0..89d852538b72 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -240,6 +240,7 @@ enum cpuhp_state {
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RANDOM_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
+	CPUHP_AP_KTHREADS_ONLINE,
 	CPUHP_AP_BASE_CACHEINFO_ONLINE,
 	CPUHP_AP_ONLINE_DYN,
 	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 40,
diff --git a/kernel/kthread.c b/kernel/kthread.c
index ecb719f54f7a..eee5925e7725 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -35,6 +35,10 @@ static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
 struct task_struct *kthreadd_task;
 
+static struct cpumask kthread_online_mask;
+static LIST_HEAD(kthreads_hotplug);
+static DEFINE_MUTEX(kthreads_hotplug_lock);
+
 struct kthread_create_info
 {
 	/* Information passed to kthread() from kthreadd. */
@@ -53,6 +57,7 @@ struct kthread_create_info
 struct kthread {
 	unsigned long flags;
 	unsigned int cpu;
+	unsigned int node;
 	int started;
 	int result;
 	int (*threadfn)(void *);
@@ -64,6 +69,8 @@ struct kthread {
 #endif
 	/* To store the full name if task comm is truncated. */
 	char *full_name;
+	struct task_struct *task;
+	struct list_head hotplug_node;
 };
 
 enum KTHREAD_BITS {
@@ -122,8 +129,11 @@ bool set_kthread_struct(struct task_struct *p)
 
 	init_completion(&kthread->exited);
 	init_completion(&kthread->parked);
+	INIT_LIST_HEAD(&kthread->hotplug_node);
 	p->vfork_done = &kthread->exited;
 
+	kthread->task = p;
+	kthread->node = tsk_fork_get_node(current);
 	p->worker_private = kthread;
 	return true;
 }
@@ -314,6 +324,13 @@ void __noreturn kthread_exit(long result)
 {
 	struct kthread *kthread = to_kthread(current);
 	kthread->result = result;
+	if (!list_empty(&kthread->hotplug_node)) {
+		mutex_lock(&kthreads_hotplug_lock);
+		list_del(&kthread->hotplug_node);
+		/* Make sure the kthread never gets re-affined globally */
+		set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));
+		mutex_unlock(&kthreads_hotplug_lock);
+	}
 	do_exit(0);
 }
 EXPORT_SYMBOL(kthread_exit);
@@ -339,6 +356,45 @@ void __noreturn kthread_complete_and_exit(struct completion *comp, long code)
 }
 EXPORT_SYMBOL(kthread_complete_and_exit);
 
+static void kthread_fetch_affinity(struct kthread *k, struct cpumask *mask)
+{
+	if (k->node == NUMA_NO_NODE) {
+		cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+	} else {
+		/*
+		 * The node cpumask is racy when read from kthread() but:
+		 * - a racing CPU going down won't be present in kthread_online_mask
+		 * - a racing CPU going up will be handled by kthreads_online_cpu()
+		 */
+		cpumask_and(mask, cpumask_of_node(k->node), &kthread_online_mask);
+		cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+		if (cpumask_empty(mask))
+			cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+	}
+}
+
+static int kthread_affine_node(void)
+{
+	struct kthread *kthread = to_kthread(current);
+	cpumask_var_t affinity;
+
+	WARN_ON_ONCE(kthread_is_per_cpu(current));
+
+	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
+		return -ENOMEM;
+
+	mutex_lock(&kthreads_hotplug_lock);
+	WARN_ON_ONCE(!list_empty(&kthread->hotplug_node));
+	list_add_tail(&kthread->hotplug_node, &kthreads_hotplug);
+	kthread_fetch_affinity(kthread, affinity);
+	set_cpus_allowed_ptr(current, affinity);
+	mutex_unlock(&kthreads_hotplug_lock);
+
+	free_cpumask_var(affinity);
+
+	return 0;
+}
+
 static int kthread(void *_create)
 {
 	static const struct sched_param param = { .sched_priority = 0 };
@@ -369,7 +425,6 @@ static int kthread(void *_create)
 	 * back to default in case they have been changed.
 	 */
 	sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
-	set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));
 
 	/* OK, tell user we're spawned, wait for stop or wakeup */
 	__set_current_state(TASK_UNINTERRUPTIBLE);
@@ -385,6 +440,9 @@ static int kthread(void *_create)
 
 	self->started = 1;
 
+	if (!(current->flags & PF_NO_SETAFFINITY))
+		kthread_affine_node();
+
 	ret = -EINTR;
 	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
 		cgroup_kthread_ready();
@@ -779,6 +837,66 @@ int kthreadd(void *unused)
 	return 0;
 }
 
+static int kthreads_hotplug_update(void)
+{
+	cpumask_var_t affinity;
+	struct kthread *k;
+	int err;
+
+	if (list_empty(&kthreads_hotplug))
+		return 0;
+
+	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
+		return -ENOMEM;
+
+	err = 0;
+
+	list_for_each_entry(k, &kthreads_hotplug, hotplug_node) {
+		if (WARN_ON_ONCE((k->task->flags & PF_NO_SETAFFINITY) ||
+				 kthread_is_per_cpu(k->task))) {
+			err = -EINVAL;
+			continue;
+		}
+		kthread_fetch_affinity(k, affinity);
+		set_cpus_allowed_ptr(k->task, affinity);
+	}
+
+	free_cpumask_var(affinity);
+
+	return err;
+}
+
+static int kthreads_offline_cpu(unsigned int cpu)
+{
+	int ret = 0;
+
+	mutex_lock(&kthreads_hotplug_lock);
+	cpumask_clear_cpu(cpu, &kthread_online_mask);
+	ret = kthreads_hotplug_update();
+	mutex_unlock(&kthreads_hotplug_lock);
+
+	return ret;
+}
+
+static int kthreads_online_cpu(unsigned int cpu)
+{
+	int ret = 0;
+
+	mutex_lock(&kthreads_hotplug_lock);
+	cpumask_set_cpu(cpu, &kthread_online_mask);
+	ret = kthreads_hotplug_update();
+	mutex_unlock(&kthreads_hotplug_lock);
+
+	return ret;
+}
+
+static int kthreads_init(void)
+{
+	return cpuhp_setup_state(CPUHP_AP_KTHREADS_ONLINE, "kthreads:online",
+				kthreads_online_cpu, kthreads_offline_cpu);
+}
+early_initcall(kthreads_init);
+
 void __kthread_init_worker(struct kthread_worker *worker,
 				const char *name,
 				struct lock_class_key *key)

From patchwork Mon Sep 16 22:49:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13805808
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Michal Hocko, Vlastimil Babka, Andrew Morton,
 linux-mm@kvack.org, Peter Zijlstra, Thomas Gleixner
Subject: [PATCH 13/19] mm: Create/affine kcompactd to its preferred node
Date: Tue, 17 Sep 2024 00:49:17 +0200
Message-ID: <20240916224925.20540-14-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240916224925.20540-1-frederic@kernel.org>
References: <20240916224925.20540-1-frederic@kernel.org>

KsBTsKE5bn7wbztKa/D92qzfw2IA6+hN5A0V4i1wQJkZPMy9lJ1ALzQQDhib2Fd+1ZYmifzMk/t/Vj3hZ3P87FfuCJBmc5j7WOQMDEQvyBWs/q1jp1k//Tu0hZ2ky1lNb3WJzzWy5X/qe3iSupMkC8BJDFhZED+3J6JFhAtGEqx6yHW4b+mkA3+i94Jn+Sl/DglNIOdxgEUWoyyPCqrf6fXIjV5T3tG3eMifEAmIk3rouvp4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Kcompactd is dedicated to a specific node. As such it wants to be preferrably affine to it, memory and CPUs-wise. Use the proper kthread API to achieve that. As a bonus it takes care of CPU-hotplug events and CPU-isolation on its behalf. Acked-by: Vlastimil Babka Signed-off-by: Frederic Weisbecker Acked-by: Michal Hocko --- mm/compaction.c | 43 +++---------------------------------------- 1 file changed, 3 insertions(+), 40 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index eb95e9b435d0..69742555f2e5 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -3179,15 +3179,9 @@ void wakeup_kcompactd(pg_data_t *pgdat, int order, int highest_zoneidx) static int kcompactd(void *p) { pg_data_t *pgdat = (pg_data_t *)p; - struct task_struct *tsk = current; long default_timeout = msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC); long timeout = default_timeout; - const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id); - - if (!cpumask_empty(cpumask)) - set_cpus_allowed_ptr(tsk, cpumask); - set_freezable(); pgdat->kcompactd_max_order = 0; @@ -3258,10 +3252,12 @@ void __meminit kcompactd_run(int nid) if (pgdat->kcompactd) return; - pgdat->kcompactd = kthread_run(kcompactd, pgdat, "kcompactd%d", nid); + pgdat->kcompactd = kthread_create_on_node(kcompactd, pgdat, nid, "kcompactd%d", nid); if (IS_ERR(pgdat->kcompactd)) { pr_err("Failed to start kcompactd on node %d\n", nid); pgdat->kcompactd = NULL; + } else { + wake_up_process(pgdat->kcompactd); } } @@ -3279,30 +3275,6 @@ void __meminit kcompactd_stop(int nid) } } -/* - * It's 
optimal to keep kcompactd on the same CPUs as their memory, but
- * not required for correctness. So if the last cpu in a node goes
- * away, we get changed to run anywhere: as the first one comes back,
- * restore their cpu bindings.
- */
-static int kcompactd_cpu_online(unsigned int cpu)
-{
-	int nid;
-
-	for_each_node_state(nid, N_MEMORY) {
-		pg_data_t *pgdat = NODE_DATA(nid);
-		const struct cpumask *mask;
-
-		mask = cpumask_of_node(pgdat->node_id);
-
-		if (cpumask_any_and(cpu_online_mask, mask) < nr_cpu_ids)
-			/* One of our CPUs online: restore mask */
-			if (pgdat->kcompactd)
-				set_cpus_allowed_ptr(pgdat->kcompactd, mask);
-	}
-	return 0;
-}
-
 static int proc_dointvec_minmax_warn_RT_change(const struct ctl_table *table,
 		int write, void *buffer, size_t *lenp, loff_t *ppos)
 {
@@ -3362,15 +3334,6 @@ static struct ctl_table vm_compaction[] = {
 static int __init kcompactd_init(void)
 {
 	int nid;
-	int ret;
-
-	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
-					"mm/compaction:online",
-					kcompactd_cpu_online, NULL);
-	if (ret < 0) {
-		pr_err("kcompactd: failed to register hotplug callbacks.\n");
-		return ret;
-	}
 
 	for_each_node_state(nid, N_MEMORY)
 		kcompactd_run(nid);

From patchwork Mon Sep 16 22:49:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13805809
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Michal Hocko, Vlastimil Babka,
 linux-mm@kvack.org, Andrew Morton, Peter Zijlstra, Thomas Gleixner
Subject: [PATCH 14/19] mm: Create/affine kswapd to its preferred node
Date: Tue, 17 Sep 2024 00:49:18 +0200
Message-ID: <20240916224925.20540-15-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240916224925.20540-1-frederic@kernel.org>
References: <20240916224925.20540-1-frederic@kernel.org>

kswapd is dedicated to a specific node. As such, it should preferably
be affine to that node, both memory-wise and CPU-wise. Use the proper
kthread API to achieve that. As a bonus, the kthread API takes care of
CPU-hotplug events and CPU isolation on its behalf.
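The hotplug and isolation handling mentioned above can be modelled in
userspace as follows. This is only a minimal sketch, assuming the
node-default affinity is computed the way kthread_fetch_affinity() does
it in kernel/kthread.c in this series. The names cpumask_model_t,
effective_affinity() and effective_affinity_selftest() are hypothetical,
and a 64-bit word stands in for struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t cpumask_model_t;

/*
 * Model of the affinity applied to a node-affine kthread: the node's
 * CPUs restricted to online CPUs and to housekeeping CPUs. If that
 * intersection is empty, fall back to the housekeeping mask.
 */
static cpumask_model_t effective_affinity(cpumask_model_t node_cpus,
					  cpumask_model_t online,
					  cpumask_model_t housekeeping)
{
	cpumask_model_t mask = node_cpus & online & housekeeping;

	return mask ? mask : housekeeping;
}

/* Walk through the three situations described in the text. */
void effective_affinity_selftest(void)
{
	/* Node owns CPUs 2-3, CPUs 0-3 online, CPU 2 isolated: only CPU 3 remains. */
	assert(effective_affinity(0x0C, 0x0F, 0x0B) == 0x08);
	/* Both node CPUs went offline: fall back to the housekeeping mask. */
	assert(effective_affinity(0x0C, 0x03, 0x0B) == 0x0B);
	/* Both node CPUs are isolated: fall back to the housekeeping mask. */
	assert(effective_affinity(0x0C, 0x0F, 0x03) == 0x03);
}
```

In this model, re-running effective_affinity() whenever the online mask
changes corresponds to what the removed per-subsystem hotplug callbacks
used to do by hand.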
Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker
Acked-by: Michal Hocko
---
 mm/vmscan.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bd489c1af228..00a7f1e92447 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -7139,10 +7139,6 @@ static int kswapd(void *p)
 	unsigned int highest_zoneidx = MAX_NR_ZONES - 1;
 	pg_data_t *pgdat = (pg_data_t *)p;
 	struct task_struct *tsk = current;
-	const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
-
-	if (!cpumask_empty(cpumask))
-		set_cpus_allowed_ptr(tsk, cpumask);
 
 	/*
 	 * Tell the memory management that we're a "memory allocator",
@@ -7311,13 +7307,15 @@ void __meminit kswapd_run(int nid)
 
 	pgdat_kswapd_lock(pgdat);
 	if (!pgdat->kswapd) {
-		pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid);
+		pgdat->kswapd = kthread_create_on_node(kswapd, pgdat, nid, "kswapd%d", nid);
 		if (IS_ERR(pgdat->kswapd)) {
 			/* failure at boot is fatal */
 			pr_err("Failed to start kswapd on node %d,ret=%ld\n",
 				   nid, PTR_ERR(pgdat->kswapd));
 			BUG_ON(system_state < SYSTEM_RUNNING);
 			pgdat->kswapd = NULL;
+		} else {
+			wake_up_process(pgdat->kswapd);
 		}
 	}
 	pgdat_kswapd_unlock(pgdat);

From patchwork Mon Sep 16 22:49:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13805810
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Kees Cook, Peter Zijlstra,
 Thomas Gleixner, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
 "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
 Zqiang, rcu@vger.kernel.org
Subject: [PATCH 15/19] kthread: Implement preferred affinity
Date: Tue, 17 Sep 2024 00:49:19 +0200
Message-ID: <20240916224925.20540-16-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240916224925.20540-1-frederic@kernel.org>
References: <20240916224925.20540-1-frederic@kernel.org>

Affining kthreads currently follows one of four different patterns:

1) Per-CPU kthreads must stay affine to a single CPU and never execute
   relevant code on any other CPU. This is currently handled by smpboot
   code which takes care of CPU-hotplug operations.

2) Kthreads that _have_ to be affine to a specific set of CPUs and can't
   run anywhere else. The affinity is set through kthread_bind_mask()
   and the subsystem itself is responsible for handling CPU-hotplug
   operations.

3) Kthreads that prefer to be affine to a specific NUMA node. That
   preferred affinity is applied by default when an actual node ID is
   passed on kthread creation, provided the kthread is not per-CPU and
   no call to kthread_bind_mask() has been issued before the first
   wake-up.

4) Similar to the previous point, but the kthreads have a preferred
   affinity other than a node. It is set manually like for any other
   task, and CPU-hotplug is supposed to be handled by the relevant
   subsystem so that the task is properly reaffined whenever a given
   CPU from the preferred affinity comes up or goes down. Care must
   also be taken so that the preferred affinity doesn't cross
   housekeeping cpumask boundaries.

Provide a function to handle the last use case, mostly reusing the
current node default affinity infrastructure. kthread_affine_preferred()
is introduced, to be used just like kthread_bind_mask(), right after
kthread creation and before the first wake-up. The kthread is then
affine right away to the cpumask passed through the API if it has online
housekeeping CPUs. Otherwise it will be affine to all online
housekeeping CPUs as a last resort.

As with node affinity, it is aware of CPU hotplug events such that:

* When a housekeeping CPU goes up and is part of the preferred affinity
  of a given kthread, it is added to its applied affinity set (and
  possibly the default last resort online housekeeping set is removed
  from the set).
* When a housekeeping CPU goes down while it was part of the preferred
  affinity of a kthread, it is removed from the kthread's applied
  affinity. The last resort is to affine the kthread to all online
  housekeeping CPUs.

Acked-by: Vlastimil Babka
Signed-off-by: Frederic Weisbecker
---
 include/linux/kthread.h |  1 +
 kernel/kthread.c        | 69 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index b11f53c1ba2e..30209bdf83a2 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -85,6 +85,7 @@ kthread_run_on_cpu(int (*threadfn)(void *data), void *data,
 void free_kthread_struct(struct task_struct *k);
 void kthread_bind(struct task_struct *k, unsigned int cpu);
 void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask);
+int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask);
 int kthread_stop(struct task_struct *k);
 int kthread_stop_put(struct task_struct *k);
 bool kthread_should_stop(void);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index eee5925e7725..e4ffc776928a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -71,6 +71,7 @@ struct kthread {
 	char *full_name;
 	struct task_struct *task;
 	struct list_head hotplug_node;
+	struct cpumask *preferred_affinity;
 };
 
 enum KTHREAD_BITS {
@@ -330,6 +331,11 @@ void __noreturn kthread_exit(long result)
 		/* Make sure the kthread never gets re-affined globally */
 		set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));
 		mutex_unlock(&kthreads_hotplug_lock);
+
+		if (kthread->preferred_affinity) {
+			kfree(kthread->preferred_affinity);
+			kthread->preferred_affinity = NULL;
+		}
 	}
 	do_exit(0);
 }
@@ -358,19 +364,25 @@ EXPORT_SYMBOL(kthread_complete_and_exit);
 
 static void kthread_fetch_affinity(struct kthread *k, struct cpumask *mask)
 {
-	if (k->node == NUMA_NO_NODE) {
-		cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
-	} else {
+	const struct cpumask *pref;
+
+	if (k->preferred_affinity) {
+		pref = k->preferred_affinity;
+	} else if (k->node != NUMA_NO_NODE) {
 		/*
 		 * The node cpumask is racy when read from kthread() but:
 		 * - a racing CPU going down won't be present in kthread_online_mask
 		 * - a racing CPU going up will be handled by kthreads_online_cpu()
 		 */
-		cpumask_and(mask, cpumask_of_node(k->node), &kthread_online_mask);
-		cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
-		if (cpumask_empty(mask))
-			cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+		pref = cpumask_of_node(k->node);
+	} else {
+		pref = housekeeping_cpumask(HK_TYPE_KTHREAD);
 	}
+
+	cpumask_and(mask, pref, &kthread_online_mask);
+	cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
+	if (cpumask_empty(mask))
+		cpumask_copy(mask, housekeeping_cpumask(HK_TYPE_KTHREAD));
 }
 
 static int kthread_affine_node(void)
@@ -440,7 +452,7 @@ static int kthread(void *_create)
 
 	self->started = 1;
 
-	if (!(current->flags & PF_NO_SETAFFINITY))
+	if (!(current->flags & PF_NO_SETAFFINITY) && !self->preferred_affinity)
 		kthread_affine_node();
 
 	ret = -EINTR;
@@ -837,6 +849,47 @@ int kthreadd(void *unused)
 	return 0;
 }
 
+int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask)
+{
+	struct kthread *kthread = to_kthread(p);
+	cpumask_var_t affinity;
+	unsigned long flags;
+	int ret = 0;
+
+	if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) {
+		WARN_ON(1);
+		return -EINVAL;
+	}
+
+	WARN_ON_ONCE(kthread->preferred_affinity);
+
+	if (!zalloc_cpumask_var(&affinity, GFP_KERNEL))
+		return -ENOMEM;
+
+	kthread->preferred_affinity = kzalloc(sizeof(struct cpumask), GFP_KERNEL);
+	if (!kthread->preferred_affinity) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	mutex_lock(&kthreads_hotplug_lock);
+	cpumask_copy(kthread->preferred_affinity, mask);
+	WARN_ON_ONCE(!list_empty(&kthread->hotplug_node));
+	list_add_tail(&kthread->hotplug_node, &kthreads_hotplug);
+	kthread_fetch_affinity(kthread, affinity);
+
+	/* It's safe because the task is inactive. */
+	raw_spin_lock_irqsave(&p->pi_lock, flags);
+	do_set_cpus_allowed(p, affinity);
+	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+
+	mutex_unlock(&kthreads_hotplug_lock);
+out:
+	free_cpumask_var(affinity);
+
+	return ret;
+}
+
 static int kthreads_hotplug_update(void)
 {
 	cpumask_var_t affinity;