From patchwork Wed Sep 11 16:23:15 2024
X-Patchwork-Submitter: Felix Moessbauer
X-Patchwork-Id: 13800873
From: Felix Moessbauer <felix.moessbauer@siemens.com>
To: axboe@kernel.dk
Cc: stable@vger.kernel.org, asml.silence@gmail.com, linux-kernel@vger.kernel.org,
 io-uring@vger.kernel.org, cgroups@vger.kernel.org, dqminh@cloudflare.com,
 longman@redhat.com, adriaan.schmidt@siemens.com, florian.bezdeka@siemens.com,
 Felix Moessbauer <felix.moessbauer@siemens.com>
Subject: [PATCH 6.1 1/2] io_uring/io-wq: do not allow pinning outside of cpuset
Date: Wed, 11 Sep 2024 18:23:15 +0200
Message-Id: <20240911162316.516725-2-felix.moessbauer@siemens.com>
In-Reply-To: <20240911162316.516725-1-felix.moessbauer@siemens.com>
References: <20240911162316.516725-1-felix.moessbauer@siemens.com>
X-Mailing-List: io-uring@vger.kernel.org
commit 0997aa5497c714edbb349ca366d28bd550ba3408 upstream.

The io worker threads are userland threads that just never exit to the
userland. By that, they are also assigned to a cgroup (the group of the
creating task).

When changing the affinity of the io_wq thread via syscall, we must only
allow cpumasks within the limits defined by the cpuset controller of the
cgroup (if enabled).

Fixes: da64d6db3bd3 ("io_uring: One wqe per wq")
Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 io_uring/io-wq.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 139cd49b2c27..c74bcc8d2f06 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include <linux/cpuset.h>
 #include
 #include
 #include
@@ -1362,22 +1363,34 @@ static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
 
 int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask)
 {
+	cpumask_var_t allowed_mask;
+	int ret = 0;
 	int i;
 
 	if (!tctx || !tctx->io_wq)
 		return -EINVAL;
 
+	if (!alloc_cpumask_var(&allowed_mask, GFP_KERNEL))
+		return -ENOMEM;
+	cpuset_cpus_allowed(tctx->io_wq->task, allowed_mask);
+
 	rcu_read_lock();
 	for_each_node(i) {
 		struct io_wqe *wqe = tctx->io_wq->wqes[i];
-
-		if (mask)
-			cpumask_copy(wqe->cpu_mask, mask);
-		else
-			cpumask_copy(wqe->cpu_mask, cpumask_of_node(i));
+		if (mask) {
+			if (cpumask_subset(mask, allowed_mask))
+				cpumask_copy(wqe->cpu_mask, mask);
+			else
+				ret = -EINVAL;
+		} else {
+			if (!cpumask_and(wqe->cpu_mask, cpumask_of_node(i), allowed_mask))
+				cpumask_copy(wqe->cpu_mask, allowed_mask);
+		}
 	}
 	rcu_read_unlock();
-	return 0;
+
+	free_cpumask_var(allowed_mask);
+	return ret;
 }
 
 /*
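For reference, the check above is reached from userspace through IORING_REGISTER_IOWQ_AFF, e.g. via liburing's io_uring_register_iowq_aff() helper. The following is a minimal sketch (not part of the patch), assuming a liburing installation that provides that helper; the CPU number used is an illustrative assumption and must lie outside the calling task's cpuset for the rejection path to trigger.

/* Minimal sketch: with the fix applied, asking io-wq to pin workers to a
 * CPU outside the task's cpuset is rejected. Build against liburing.
 * CPU 3 below is an illustrative assumption. */
#define _GNU_SOURCE
#include <liburing.h>
#include <sched.h>
#include <stdio.h>

int main(void)
{
	struct io_uring ring;
	cpu_set_t mask;
	int ret;

	if (io_uring_queue_init(8, &ring, 0) < 0)
		return 1;

	CPU_ZERO(&mask);
	CPU_SET(3, &mask);	/* assumed to be outside the current cpuset */

	/* Routed to io_wq_cpu_affinity(); returns -EINVAL when the mask is
	 * not a subset of the cpuset-allowed CPUs. */
	ret = io_uring_register_iowq_aff(&ring, sizeof(mask), &mask);
	printf("io_uring_register_iowq_aff: %d\n", ret);

	io_uring_queue_exit(&ring);
	return 0;
}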
From patchwork Wed Sep 11 16:23:16 2024
X-Patchwork-Submitter: Felix Moessbauer
X-Patchwork-Id: 13800871
From: Felix Moessbauer <felix.moessbauer@siemens.com>
To: axboe@kernel.dk
Cc: stable@vger.kernel.org, asml.silence@gmail.com, linux-kernel@vger.kernel.org,
 io-uring@vger.kernel.org, cgroups@vger.kernel.org, dqminh@cloudflare.com,
 longman@redhat.com, adriaan.schmidt@siemens.com, florian.bezdeka@siemens.com,
 Felix Moessbauer <felix.moessbauer@siemens.com>
Subject: [PATCH 6.1 2/2] io_uring/io-wq: inherit cpuset of cgroup in io worker
Date: Wed, 11 Sep 2024 18:23:16 +0200
Message-Id: <20240911162316.516725-3-felix.moessbauer@siemens.com>
In-Reply-To: <20240911162316.516725-1-felix.moessbauer@siemens.com>
References: <20240911162316.516725-1-felix.moessbauer@siemens.com>
X-Mailing-List: io-uring@vger.kernel.org

commit 84eacf177faa605853c58e5b1c0d9544b88c16fd upstream.

The io worker threads are userland threads that just never exit to the
userland. By that, they are also assigned to a cgroup (the group of the
creating task).

When creating a new io worker, this worker should inherit the cpuset of
the cgroup.

Fixes: da64d6db3bd3 ("io_uring: One wqe per wq")
Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 io_uring/io-wq.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index c74bcc8d2f06..04265bf8d319 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -1157,6 +1157,7 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 {
 	int ret, node, i;
 	struct io_wq *wq;
+	cpumask_var_t allowed_mask;
 
 	if (WARN_ON_ONCE(!data->free_work || !data->do_work))
 		return ERR_PTR(-EINVAL);
@@ -1176,6 +1177,9 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 	wq->do_work = data->do_work;
 
 	ret = -ENOMEM;
+	if (!alloc_cpumask_var(&allowed_mask, GFP_KERNEL))
+		goto err;
+	cpuset_cpus_allowed(current, allowed_mask);
 	for_each_node(node) {
 		struct io_wqe *wqe;
 		int alloc_node = node;
@@ -1188,7 +1192,8 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 		wq->wqes[node] = wqe;
 		if (!alloc_cpumask_var(&wqe->cpu_mask, GFP_KERNEL))
 			goto err;
-		cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
+		if (!cpumask_and(wqe->cpu_mask, cpumask_of_node(node), allowed_mask))
+			cpumask_copy(wqe->cpu_mask, allowed_mask);
 		wqe->node = alloc_node;
 		wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
 		wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
@@ -1222,6 +1227,7 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 		free_cpumask_var(wq->wqes[node]->cpu_mask);
 		kfree(wq->wqes[node]);
 	}
+	free_cpumask_var(allowed_mask);
 err_wq:
 	kfree(wq);
 	return ERR_PTR(ret);
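The core of this patch is the intersect-or-fall-back pattern used when a worker's per-node mask is initialised: restrict it to the cpuset-allowed CPUs, and only if the intersection is empty fall back to the full allowed set. Below is a standalone userspace sketch of the same pattern, not kernel code; it uses glibc's cpu_set_t instead of kernel cpumasks, and the helper name and CPU numbers are illustrative assumptions.

/* Standalone sketch of the intersect-or-fall-back pattern the patch
 * applies to wqe->cpu_mask; helper name and CPU ids are illustrative. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Restrict 'node_mask' to 'allowed'; if nothing overlaps, use 'allowed'. */
static void restrict_to_allowed(cpu_set_t *node_mask, cpu_set_t *allowed)
{
	cpu_set_t tmp;

	CPU_AND(&tmp, node_mask, allowed);
	if (CPU_COUNT(&tmp) == 0)
		tmp = *allowed;		/* empty intersection: fall back */
	*node_mask = tmp;
}

int main(void)
{
	cpu_set_t node_mask, allowed;

	CPU_ZERO(&node_mask);
	CPU_ZERO(&allowed);
	CPU_SET(0, &node_mask);		/* CPUs of one NUMA node (example) */
	CPU_SET(1, &node_mask);
	CPU_SET(2, &allowed);		/* cpuset allows only CPU 2 */

	restrict_to_allowed(&node_mask, &allowed);
	printf("worker may run on CPU 2? %d\n", CPU_ISSET(2, &node_mask));
	return 0;
}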