From patchwork Tue Dec 17 18:29:35 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912418
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:35 +0100
Subject: [PATCH v4 1/9] lib/group_cpus: let group_cpus_evenly return number of groups
Message-Id: <20241217-isolcpus-io-queues-v4-1-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
    Kashyap Desai, Sumit Saxena, Shivasharan S, Chandrakanth patil,
    "Martin K. Petersen", Nilesh Javali,
    GR-QLogic-Storage-Upstream@marvell.com, Don Brace,
    "Michael S. Tsirkin", Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
    Eugenio Pérez, Xuan Zhuo, Andrew Morton, Thomas Gleixner
Cc: Costa Shulyupin, Juri Lelli, Valentin Schneider, Waiman Long,
    Ming Lei, Michal Koutný, Frederic Weisbecker, Mel Gorman,
    Hannes Reinecke, Sridhar Balaraman, "brookxu.cn",
    linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
    linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com,
    linux-scsi@vger.kernel.org, storagedev@microchip.com,
    virtualization@lists.linux.dev, Daniel Wagner
X-Mailer: b4 0.14.2
Tsirkin" , Jason Wang , Paolo Bonzini , Stefan Hajnoczi , =?utf-8?q?Eugenio_P=C3=A9rez?= , Xuan Zhuo , Andrew Morton , Thomas Gleixner Cc: Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , =?utf-8?q?Michal_Koutn=C3=BD?= , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , Sridhar Balaraman , "brookxu.cn" , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, Daniel Wagner X-Mailer: b4 0.14.2 group_cpu_evenly might allocated less groups then the requested: group_cpu_evenly __group_cpus_evenly alloc_nodes_groups # allocated total groups may be less than numgrps when # active total CPU number is less then numgrps In this case, the caller will do an out of bound access because the caller assumes the masks returned has numgrps. Return the number of groups created so the caller can limit the access range accordingly. Signed-off-by: Daniel Wagner --- block/blk-mq-cpumap.c | 7 ++++--- drivers/virtio/virtio_vdpa.c | 2 +- fs/fuse/virtio_fs.c | 7 ++++--- include/linux/group_cpus.h | 2 +- kernel/irq/affinity.c | 2 +- lib/group_cpus.c | 23 +++++++++++++---------- 6 files changed, 24 insertions(+), 19 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index ad8d6a363f24ae11968b42f7bcfd6a719a0499b7..85c0a7073bd8bff5d34aad1729d45d89da4c4bd1 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -19,9 +19,10 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap) { const struct cpumask *masks; - unsigned int queue, cpu; + unsigned int queue, cpu, nr_masks; - masks = group_cpus_evenly(qmap->nr_queues); + nr_masks = qmap->nr_queues; + masks = group_cpus_evenly(&nr_masks); if (!masks) { for_each_possible_cpu(cpu) qmap->mq_map[cpu] = qmap->queue_offset; @@ -29,7 +30,7 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap) } for (queue = 0; queue < qmap->nr_queues; queue++) { - for_each_cpu(cpu, &masks[queue]) + for_each_cpu(cpu, &masks[queue % nr_masks]) qmap->mq_map[cpu] = qmap->queue_offset + queue; } kfree(masks); diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index 1f60c9d5cb1810a6f208c24bb2ac640d537391a0..c478cccf5fd68b9c9c01332046c24316573d97cd 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -330,7 +330,7 @@ create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd) for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) { unsigned int this_vecs = affd->set_size[i]; int j; - struct cpumask *result = group_cpus_evenly(this_vecs); + struct cpumask *result = group_cpus_evenly(&this_vecs); if (!result) { kfree(masks); diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 82afe78ec542358e2db6f4d955d521652ae363ec..5acd875f1e9c9840dd9d2f3245665c91230f57a8 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -862,7 +862,7 @@ static void virtio_fs_requests_done_work(struct work_struct *work) static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs) { const struct cpumask *mask, *masks; - unsigned int q, cpu; + unsigned int q, cpu, nr_masks; /* First attempt to map using existing transport layer affinities * e.g. 
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 82afe78ec542358e2db6f4d955d521652ae363ec..5acd875f1e9c9840dd9d2f3245665c91230f57a8 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -862,7 +862,7 @@ static void virtio_fs_requests_done_work(struct work_struct *work)
 static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs)
 {
        const struct cpumask *mask, *masks;
-       unsigned int q, cpu;
+       unsigned int q, cpu, nr_masks;

        /* First attempt to map using existing transport layer affinities
         * e.g. PCIe MSI-X
@@ -882,7 +882,8 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs)
                return;
 fallback:
        /* Attempt to map evenly in groups over the CPUs */
-       masks = group_cpus_evenly(fs->num_request_queues);
+       nr_masks = fs->num_request_queues;
+       masks = group_cpus_evenly(&nr_masks);
        /* If even this fails we default to all CPUs use first request queue */
        if (!masks) {
                for_each_possible_cpu(cpu)
@@ -891,7 +892,7 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs)
        }

        for (q = 0; q < fs->num_request_queues; q++) {
-               for_each_cpu(cpu, &masks[q])
+               for_each_cpu(cpu, &masks[q % nr_masks])
                        fs->mq_map[cpu] = q + VQ_REQUEST;
        }
        kfree(masks);

diff --git a/include/linux/group_cpus.h b/include/linux/group_cpus.h
index e42807ec61f6e8cf3787af7daa0d8686edfef0a3..8659534a3423e92746738ac57e713b7416e05271 100644
--- a/include/linux/group_cpus.h
+++ b/include/linux/group_cpus.h
@@ -9,6 +9,6 @@
 #include <linux/kernel.h>
 #include <linux/cpu.h>

-struct cpumask *group_cpus_evenly(unsigned int numgrps);
+struct cpumask *group_cpus_evenly(unsigned int *numgrps);

 #endif

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 44a4eba80315cc098ecfa366ca1d88483641b12a..0188e133f1a508a623e33f08a0fca2e1f2cbf4e4 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -71,7 +71,7 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
        for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
                unsigned int this_vecs = affd->set_size[i];
                int j;
-               struct cpumask *result = group_cpus_evenly(this_vecs);
+               struct cpumask *result = group_cpus_evenly(&this_vecs);

                if (!result) {
                        kfree(masks);

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index ee272c4cefcc13907ce9f211f479615d2e3c9154..73da83ca2c45347a3a443d42d4f16801a47effd5 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -334,7 +334,8 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * @numgrps: number of groups
  *
  * Return: cpumask array if successful, NULL otherwise. And each element
- * includes CPUs assigned to this group
+ * includes CPUs assigned to this group. numgrps will be updated to the
+ * actual allocated number of masks.
  *
  * Try to put close CPUs from viewpoint of CPU and NUMA locality into
  * same group, and run two-stage grouping:
@@ -344,9 +345,9 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * We guarantee in the resulted grouping that all CPUs are covered, and
  * no same CPU is assigned to multiple groups
  */
-struct cpumask *group_cpus_evenly(unsigned int numgrps)
+struct cpumask *group_cpus_evenly(unsigned int *numgrps)
 {
-       unsigned int curgrp = 0, nr_present = 0, nr_others = 0;
+       unsigned int curgrp = 0, nr_present = 0, nr_others = 0, nr_grps;
        cpumask_var_t *node_to_cpumask;
        cpumask_var_t nmsk, npresmsk;
        int ret = -ENOMEM;
@@ -362,7 +363,8 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
        if (!node_to_cpumask)
                goto fail_npresmsk;

-       masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL);
+       nr_grps = *numgrps;
+       masks = kcalloc(nr_grps, sizeof(*masks), GFP_KERNEL);
        if (!masks)
                goto fail_node_to_cpumask;

@@ -383,7 +385,7 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
        cpumask_copy(npresmsk, data_race(cpu_present_mask));

        /* grouping present CPUs first */
-       ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask,
+       ret = __group_cpus_evenly(curgrp, nr_grps, node_to_cpumask,
                                  npresmsk, nmsk, masks);
        if (ret < 0)
                goto fail_build_affinity;
@@ -395,19 +397,19 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
         * group space, assign the non present CPUs to the already
         * allocated out groups.
         */
-       if (nr_present >= numgrps)
+       if (nr_present >= nr_grps)
                curgrp = 0;
        else
                curgrp = nr_present;
        cpumask_andnot(npresmsk, cpu_possible_mask, npresmsk);
-       ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask,
+       ret = __group_cpus_evenly(curgrp, nr_grps, node_to_cpumask,
                                  npresmsk, nmsk, masks);
        if (ret >= 0)
                nr_others = ret;

 fail_build_affinity:
        if (ret >= 0)
-               WARN_ON(nr_present + nr_others < numgrps);
+               WARN_ON(nr_present + nr_others < nr_grps);

 fail_node_to_cpumask:
        free_node_to_cpumask(node_to_cpumask);
@@ -421,12 +423,13 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
                kfree(masks);
                return NULL;
        }
+       *numgrps = nr_present + nr_others;
        return masks;
 }
 #else /* CONFIG_SMP */
-struct cpumask *group_cpus_evenly(unsigned int numgrps)
+struct cpumask *group_cpus_evenly(unsigned int *numgrps)
 {
-       struct cpumask *masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL);
+       struct cpumask *masks = kcalloc(*numgrps, sizeof(*masks), GFP_KERNEL);

        if (!masks)
                return NULL;
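[Editorial note: a minimal sketch of the changed calling convention,
mirroring the blk_mq_map_queues() change in this patch. The function
name example_assign_groups() and the pr_info() reporting are
hypothetical, not part of the series.]

#include <linux/group_cpus.h>
#include <linux/printk.h>
#include <linux/slab.h>

/*
 * Pass the desired group count by pointer, read back the number of
 * masks actually allocated, and wrap every index with it so fewer
 * groups than requested can never cause an out-of-bounds access.
 */
static int example_assign_groups(unsigned int wanted)
{
        unsigned int nr_masks = wanted;
        struct cpumask *masks = group_cpus_evenly(&nr_masks);
        unsigned int i;

        if (!masks)
                return -ENOMEM;

        /* nr_masks may now be smaller than wanted */
        for (i = 0; i < wanted; i++)
                pr_info("group %u uses mask %u\n", i, i % nr_masks);

        kfree(masks);
        return 0;
}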
From patchwork Tue Dec 17 18:29:36 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912419
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:36 +0100
Subject: [PATCH v4 2/9] sched/isolation: document HK_TYPE housekeeping option
Message-Id: <20241217-isolcpus-io-queues-v4-2-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

The enum is a public API which can be used all over the kernel. This
warrants a bit of documentation.
Signed-off-by: Daniel Wagner
---
 include/linux/sched/isolation.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index 2b461129d1fad0fd0ef1ad759fe44695dc635e8c..6649c3a48e0ea0a88c84bf5f2a74bff039fadaf2 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -6,6 +6,19 @@
 #include <linux/init.h>
 #include <linux/tick.h>

+/**
+ * enum hk_type - housekeeping cpu mask types
+ * @HK_TYPE_TIMER: housekeeping cpu mask for timers
+ * @HK_TYPE_RCU: housekeeping cpu mask for RCU
+ * @HK_TYPE_MISC: housekeeping cpu mask for miscellaneous resources
+ * @HK_TYPE_SCHED: housekeeping cpu mask for scheduling
+ * @HK_TYPE_TICK: housekeeping cpu mask for timer tick
+ * @HK_TYPE_DOMAIN: housekeeping cpu mask for general SMP balancing
+ *     and scheduling algorithms
+ * @HK_TYPE_WQ: housekeeping cpu mask for workqueues
+ * @HK_TYPE_MANAGED_IRQ: housekeeping cpu mask for managed IRQs
+ * @HK_TYPE_KTHREAD: housekeeping cpu mask for kthreads
+ */
 enum hk_type {
        HK_TYPE_TIMER,
        HK_TYPE_RCU,
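[Editorial note: a usage sketch of the documented API, not part of the
patch; example_eligible_cpus() is a hypothetical helper.]

#include <linux/sched/isolation.h>

/*
 * Select the CPUs that may receive queue/IRQ work. With
 * isolcpus=managed_irq on the kernel command line only the
 * housekeeping CPUs are returned, otherwise all possible CPUs.
 */
static const struct cpumask *example_eligible_cpus(void)
{
        if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
                return housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);

        return cpu_possible_mask;
}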
From patchwork Tue Dec 17 18:29:37 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912420
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:37 +0100
Subject: [PATCH v4 3/9] blk-mq: add number of queue calc helper
Message-Id: <20241217-isolcpus-io-queues-v4-3-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping
CPUs when isolcpus=managed_irq is set. This avoids disturbing the
isolated CPUs with OS workload.

Add two variants of a helper which calculate the correct number of
queues to use. Two variants are needed because some drivers calculate
their maximum number of queues based on the possible CPU mask, others
based on the online CPU mask.

Reviewed-by: Christoph Hellwig
Signed-off-by: Daniel Wagner
---
 block/blk-mq-cpumap.c  | 45 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/blk-mq.h |  2 ++
 2 files changed, 47 insertions(+)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 85c0a7073bd8bff5d34aad1729d45d89da4c4bd1..b3a863c2db3231624685ab54a1810b22af4111f4 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -12,10 +12,55 @@
 #include <linux/cpu.h>
 #include <linux/group_cpus.h>
 #include <linux/device/bus.h>
+#include <linux/sched/isolation.h>

 #include "blk.h"
 #include "blk-mq.h"

+static unsigned int blk_mq_num_queues(const struct cpumask *mask,
+                                     unsigned int max_queues)
+{
+       unsigned int num;
+
+       if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
+               mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+
+       num = cpumask_weight(mask);
+       return min_not_zero(num, max_queues);
+}
+
+/**
+ * blk_mq_num_possible_queues - Calc nr of queues for multiqueue devices
+ * @max_queues: The maximal number of queues the hardware/driver
+ *             supports. If max_queues is 0, the argument is
+ *             ignored.
+ *
+ * Calculate the number of queues which should be used for a multiqueue
+ * device based on the number of possible CPUs. The helper takes
+ * isolcpus settings into account.
+ */
+unsigned int blk_mq_num_possible_queues(unsigned int max_queues)
+{
+       return blk_mq_num_queues(cpu_possible_mask, max_queues);
+}
+EXPORT_SYMBOL_GPL(blk_mq_num_possible_queues);
+
+/**
+ * blk_mq_num_online_queues - Calc nr of queues for multiqueue devices
+ * @max_queues: The maximal number of queues the hardware/driver
+ *             supports. If max_queues is 0, the argument is
+ *             ignored.
+ *
+ * Calculate the number of queues which should be used for a multiqueue
+ * device based on the number of online CPUs. The helper takes
+ * isolcpus settings into account.
+ */
+unsigned int blk_mq_num_online_queues(unsigned int max_queues)
+{
+       return blk_mq_num_queues(cpu_online_mask, max_queues);
+}
+EXPORT_SYMBOL_GPL(blk_mq_num_online_queues);
+
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
        const struct cpumask *masks;

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 769eab6247d4921e574e0828ab41a580a5a9f2fe..4f0f2ea64de2057750e88c2a3ff7d49e13a7bfc5 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -920,6 +920,8 @@ int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 void blk_mq_unfreeze_queue_non_owner(struct request_queue *q);
 void blk_freeze_queue_start_non_owner(struct request_queue *q);

+unsigned int blk_mq_num_possible_queues(unsigned int max_queues);
+unsigned int blk_mq_num_online_queues(unsigned int max_queues);
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap);
 void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap,
                          struct device *dev, unsigned int offset);
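[Editorial note: a short sketch of how a driver would consume the new
helpers; the hardware limit of 32 vectors is an arbitrary example
value and example_nr_hw_queues() is hypothetical.]

#include <linux/blk-mq.h>

static unsigned int example_nr_hw_queues(void)
{
        /*
         * Clamp to the hardware limit; with isolcpus=managed_irq the
         * result is additionally capped by the number of housekeeping
         * CPUs rather than all possible CPUs.
         */
        unsigned int nr = blk_mq_num_possible_queues(32);

        /*
         * Passing 0 means "no hardware limit": min_not_zero(num, 0)
         * evaluates to num, so only the housekeeping-aware CPU count
         * matters.
         */
        unsigned int bound = blk_mq_num_possible_queues(0);

        return min(nr, bound);
}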
From patchwork Tue Dec 17 18:29:38 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912421
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:38 +0100
Subject: [PATCH v4 4/9] nvme-pci: use block layer helpers to calculate num of queues
Message-Id: <20241217-isolcpus-io-queues-v4-4-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping
CPUs when isolcpus=managed_irq is set. This avoids disturbing the
isolated CPUs with OS workload.

Use the helpers which calculate the correct number of queues to use
when isolcpus is in use.

Reviewed-by: Christoph Hellwig
Signed-off-by: Daniel Wagner
---
 drivers/nvme/host/pci.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 709328a67f915aede5c6bae956e1bdd5e6f3f1bc..4af22f09ed8474676edd118477344ed32236c497 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -81,7 +81,7 @@ static int io_queue_count_set(const char *val, const struct kernel_param *kp)
        int ret;

        ret = kstrtouint(val, 10, &n);
-       if (ret != 0 || n > num_possible_cpus())
+       if (ret != 0 || n > blk_mq_num_possible_queues(0))
                return -EINVAL;
        return param_set_uint(val, kp);
 }
@@ -2439,7 +2439,8 @@ static unsigned int nvme_max_io_queues(struct nvme_dev *dev)
         */
        if (dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS)
                return 1;
-       return num_possible_cpus() + dev->nr_write_queues + dev->nr_poll_queues;
+       return blk_mq_num_possible_queues(0) + dev->nr_write_queues +
+               dev->nr_poll_queues;
 }

 static int nvme_setup_io_queues(struct nvme_dev *dev)
From patchwork Tue Dec 17 18:29:39 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912422
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:39 +0100
Subject: [PATCH v4 5/9] scsi: use block layer helpers to calculate num of queues
Message-Id: <20241217-isolcpus-io-queues-v4-5-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping
CPUs when isolcpus=managed_irq is set. This avoids disturbing the
isolated CPUs with OS workload.

Use the helpers which calculate the correct number of queues to use
when isolcpus is in use.
Signed-off-by: Daniel Wagner
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 15 +++++++++------
 drivers/scsi/qla2xxx/qla_isr.c            | 10 +++++-----
 drivers/scsi/smartpqi/smartpqi_init.c     |  5 ++---
 3 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 49abd7dd75a7b7c1ddcfac41acecbbcf7de8f5a4..59d385e5a917979ae2f61f5db2c3355b9cab7e08 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -5962,7 +5962,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance)
        else
                instance->iopoll_q_count = 0;

-       num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+       num_msix_req = blk_mq_num_online_queues(0) +
+               instance->low_latency_index_start;
        instance->msix_vectors = min(num_msix_req,
                                     instance->msix_vectors);

@@ -5978,7 +5979,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance)
                /* Disable Balanced IOPS mode and try realloc vectors */
                instance->perf_mode = MR_LATENCY_PERF_MODE;
                instance->low_latency_index_start = 1;
-               num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+               num_msix_req = blk_mq_num_online_queues(0) +
+                       instance->low_latency_index_start;

                instance->msix_vectors = min(num_msix_req,
                                instance->msix_vectors);
@@ -6234,7 +6236,7 @@ static int megasas_init_fw(struct megasas_instance *instance)
        intr_coalescing = (scratch_pad_1 & MR_INTR_COALESCING_SUPPORT_OFFSET) ?
                                                                true : false;
        if (intr_coalescing &&
-           (num_online_cpus() >= MR_HIGH_IOPS_QUEUE_COUNT) &&
+           (blk_mq_num_online_queues(0) >= MR_HIGH_IOPS_QUEUE_COUNT) &&
            (instance->msix_vectors == MEGASAS_MAX_MSIX_QUEUES))
                instance->perf_mode = MR_BALANCED_PERF_MODE;
        else
@@ -6278,7 +6280,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
        else
                instance->low_latency_index_start = 1;

-       num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+       num_msix_req = blk_mq_num_online_queues(0) +
+               instance->low_latency_index_start;

        instance->msix_vectors = min(num_msix_req,
                        instance->msix_vectors);
@@ -6310,8 +6313,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
        megasas_setup_reply_map(instance);

        dev_info(&instance->pdev->dev,
-               "current msix/online cpus\t: (%d/%d)\n",
-               instance->msix_vectors, (unsigned int)num_online_cpus());
+               "current msix/max num queues\t: (%d/%u)\n",
+               instance->msix_vectors, blk_mq_num_online_queues(0));
        dev_info(&instance->pdev->dev, "RDPQ mode\t: (%s)\n",
                instance->is_rdpq ?
"enabled" : "disabled"); diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index fe98c76e9be32ff03a1960f366f0d700d1168383..c4c6b5c6658c0734f7ff68bcc31b33dde87296dd 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -4533,13 +4533,13 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)), - PCI_IRQ_MSIX); + blk_mq_num_online_queues(ha->msix_count) + min_vecs, + PCI_IRQ_MSIX); } else ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)), - PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, - &desc); + blk_mq_num_online_queues(ha->msix_count) + min_vecs, + PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, + &desc); if (ret < 0) { ql_log(ql_log_fatal, vha, 0x00c7, diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 04fb24d77e9b5c0137f26bc41f17191cc4c49728..7636c8d1c9f14a0d887c1d517c3664f0d0df7e6e 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -5278,15 +5278,14 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info) if (reset_devices) { num_queue_groups = 1; } else { - int num_cpus; int max_queue_groups; max_queue_groups = min(ctrl_info->max_inbound_queues / 2, ctrl_info->max_outbound_queues - 1); max_queue_groups = min(max_queue_groups, PQI_MAX_QUEUE_GROUPS); - num_cpus = num_online_cpus(); - num_queue_groups = min(num_cpus, ctrl_info->max_msix_vectors); + num_queue_groups = + blk_mq_num_online_queues(ctrl_info->max_msix_vectors); num_queue_groups = min(num_queue_groups, max_queue_groups); } From patchwork Tue Dec 17 18:29:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13912423 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 50D3B1FA17F; Tue, 17 Dec 2024 18:29:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460199; cv=none; b=sybJarCoyG1ePc7I/DxDBkI+rqr7bzcOIaJEu7/0UhUmUYNrdUo2VCHdLk86xu5uwErxJc7YsXZWfHWY7shQxdVcwGY17gdOFC8TBzwNbUxo3axa5kaVknlGvOYQHWg6ifsw/DFxG4xcwts3SV4r/37grQvQxhviwp29TCNfCDw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460199; c=relaxed/simple; bh=Y+iIQqMxZ1GFfz3mNIIyf5XCB6LBEtLrtx8kjb0qqss=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=bWMPQ/YVN5i3LVDVMKk4aNKEnEh3KoFW+EcdyhFyaCZyE5/EZy5/2vDI4Y7TFY3aXRbHkz9McMyCusQVsmo30M5O7rTGAbMfF63GHfRlX6TndWqGMizzurS+t6UiH1Y72Hf1pdY7XD2HpKhoIgt/rkewB5eCRnI0kIYx50TXwFs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pSoZ6F8q; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pSoZ6F8q" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5BB92C4CED7; Tue, 17 Dec 2024 18:29:58 +0000 (UTC) 
From patchwork Tue Dec 17 18:29:40 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912423
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:40 +0100
Subject: [PATCH v4 6/9] virtio: blk/scsi: use block layer helpers to calculate num of queues
Message-Id: <20241217-isolcpus-io-queues-v4-6-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping
CPUs when isolcpus=managed_irq is set. This avoids disturbing the
isolated CPUs with OS workload.

Use the helpers which calculate the correct number of queues to use
when isolcpus is in use.

Signed-off-by: Daniel Wagner
Reviewed-by: Christoph Hellwig
Acked-by: Michael S. Tsirkin
---
 drivers/block/virtio_blk.c                | 5 ++---
 drivers/scsi/megaraid/megaraid_sas_base.c | 3 ++-
 drivers/scsi/virtio_scsi.c                | 1 +
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index ed514ff46dc82acd629ae594cb0fa097bd301a9b..0287ceaaf19972f3a18e81cd2e3252e4d539ba93 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -976,9 +976,8 @@ static int init_vq(struct virtio_blk *vblk)
                return -EINVAL;
        }

-       num_vqs = min_t(unsigned int,
-                       min_not_zero(num_request_queues, nr_cpu_ids),
-                       num_vqs);
+       num_vqs = blk_mq_num_possible_queues(
+                       min_not_zero(num_request_queues, num_vqs));

        num_poll_vqs = min_t(unsigned int, poll_queues, num_vqs - 1);

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 59d385e5a917979ae2f61f5db2c3355b9cab7e08..3ff0978b3acb5baf757fee25d9fccf4971976272 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -6236,7 +6236,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
        intr_coalescing = (scratch_pad_1 & MR_INTR_COALESCING_SUPPORT_OFFSET) ?
                                                                true : false;
        if (intr_coalescing &&
-           (blk_mq_num_online_queues(0) >= MR_HIGH_IOPS_QUEUE_COUNT) &&
+           (blk_mq_num_online_queues(0) >=
+                   MR_HIGH_IOPS_QUEUE_COUNT) &&
            (instance->msix_vectors == MEGASAS_MAX_MSIX_QUEUES))
                instance->perf_mode = MR_BALANCED_PERF_MODE;
        else

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 60be1a0c61836ba643adcf9ad8d5b68563a86cb1..46ca0b82f57ce2211c7e2817dd40ee34e65bcbf9 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -919,6 +919,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
        /* We need to know how many queues before we allocate. */
        num_queues = virtscsi_config_get(vdev, num_queues) ? : 1;
        num_queues = min_t(unsigned int, nr_cpu_ids, num_queues);
+       num_queues = blk_mq_num_possible_queues(num_queues);

        num_targets = virtscsi_config_get(vdev, max_target) + 1;
From patchwork Tue Dec 17 18:29:41 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912424
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:41 +0100
Subject: [PATCH v4 7/9] lib/group_cpus: honor housekeeping config when grouping CPUs
Message-Id: <20241217-isolcpus-io-queues-v4-7-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

group_cpus_evenly distributes all present CPUs into groups. This
ignores the isolcpus configuration and assigns isolated CPUs to groups
as well.

Make group_cpus_evenly aware of the isolcpus configuration and use the
housekeeping CPU mask as the base for distributing the available CPUs
into groups.

Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Sagi Grimberg
Signed-off-by: Daniel Wagner
---
 lib/group_cpus.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 75 insertions(+), 2 deletions(-)

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index 73da83ca2c45347a3a443d42d4f16801a47effd5..927e4ed634d0d9ca14235c977fc53d6f5f649396 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -8,6 +8,7 @@
 #include <linux/cpu.h>
 #include <linux/sort.h>
 #include <linux/group_cpus.h>
+#include <linux/sched/isolation.h>

 #ifdef CONFIG_SMP

@@ -330,7 +331,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
 }

 /**
- * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * group_possible_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
  * @numgrps: number of groups
  *
  * Return: cpumask array if successful, NULL otherwise. And each element
@@ -345,7 +346,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * We guarantee in the resulted grouping that all CPUs are covered, and
  * no same CPU is assigned to multiple groups
  */
-struct cpumask *group_cpus_evenly(unsigned int *numgrps)
+static struct cpumask *group_possible_cpus_evenly(unsigned int *numgrps)
 {
        unsigned int curgrp = 0, nr_present = 0, nr_others = 0, nr_grps;
        cpumask_var_t *node_to_cpumask;
@@ -426,6 +427,78 @@ struct cpumask *group_cpus_evenly(unsigned int *numgrps)
        *numgrps = nr_present + nr_others;
        return masks;
 }
+
+/**
+ * group_mask_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * @numgrps: number of groups
+ * @cpu_mask: CPUs to consider for the grouping
+ *
+ * Return: cpumask array if successful, NULL otherwise. And each element
+ * includes CPUs assigned to this group.
+ *
+ * Try to put close CPUs from viewpoint of CPU and NUMA locality into
+ * same group. Allocate present CPUs on these groups evenly.
+ */
+static struct cpumask *group_mask_cpus_evenly(unsigned int *numgrps,
+                                             const struct cpumask *cpu_mask)
+{
+       cpumask_var_t *node_to_cpumask;
+       cpumask_var_t nmsk;
+       unsigned int nr_grps;
+       int ret = -ENOMEM;
+       struct cpumask *masks = NULL;
+
+       if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
+               return NULL;
+
+       node_to_cpumask = alloc_node_to_cpumask();
+       if (!node_to_cpumask)
+               goto fail_nmsk;
+
+       nr_grps = *numgrps;
+       masks = kcalloc(nr_grps, sizeof(*masks), GFP_KERNEL);
+       if (!masks)
+               goto fail_node_to_cpumask;
+
+       build_node_to_cpumask(node_to_cpumask);
+
+       ret = __group_cpus_evenly(0, nr_grps, node_to_cpumask, cpu_mask, nmsk,
+                                 masks);
+
+fail_node_to_cpumask:
+       free_node_to_cpumask(node_to_cpumask);
+
+fail_nmsk:
+       free_cpumask_var(nmsk);
+       if (ret < 0) {
+               kfree(masks);
+               return NULL;
+       }
+       *numgrps = ret;
+       return masks;
+}
+
+/**
+ * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * @numgrps: number of groups
+ *
+ * Return: cpumask array if successful, NULL otherwise.
+ *
+ * group_possible_cpus_evenly() is used for distributing the cpus on all
+ * possible cpus in absence of the isolcpus command line argument.
+ * group_mask_cpus_evenly() is used when the isolcpus command line
+ * argument is used with the managed_irq option. In this case only the
+ * housekeeping CPUs are considered.
+ */
+struct cpumask *group_cpus_evenly(unsigned int *numgrps)
+{
+       if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
+               return group_mask_cpus_evenly(numgrps,
+                               housekeeping_cpumask(HK_TYPE_MANAGED_IRQ));
+       }
+
+       return group_possible_cpus_evenly(numgrps);
+}
 #else /* CONFIG_SMP */
 struct cpumask *group_cpus_evenly(unsigned int *numgrps)
 {
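[Editorial note: a hedged, worked example of the resulting behavior.
The CPU numbers assume an 8-CPU machine booted with
isolcpus=managed_irq,2-3,6-7, matching the example used later in this
series; example_grouping() is a hypothetical caller.]

#include <linux/group_cpus.h>
#include <linux/printk.h>
#include <linux/slab.h>

static void example_grouping(void)
{
        unsigned int numgrps = 8;
        struct cpumask *masks = group_cpus_evenly(&numgrps);

        if (!masks)
                return;

        /*
         * The housekeeping mask is 0-1,4-5, so the grouping can
         * populate at most 4 groups; numgrps is updated accordingly
         * and masks[0..numgrps-1] contain only housekeeping CPUs.
         */
        pr_info("requested 8 groups, got %u\n", numgrps);
        kfree(masks);
}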
From patchwork Tue Dec 17 18:29:42 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912425
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:42 +0100
Subject: [PATCH v4 8/9] blk-mq: use hk cpus only when isolcpus=managed_irq is enabled
Message-Id: <20241217-isolcpus-io-queues-v4-8-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

When isolcpus=managed_irq is enabled, all hardware queues should run on
the housekeeping CPUs only. Thus ignore the affinity mask provided by
the driver. Also we can't use blk_mq_map_queues, because it maps all
CPUs to the first hctx unless a CPU matches the one the hctx has its
affinity set to.
For example, with 8 CPUs and an isolcpus=managed_irq,2-3,6-7
configuration the result is:

  queue mapping for /dev/nvme0n1
        hctx0: default 2 3 4 6 7
        hctx1: default 5
        hctx2: default 0
        hctx3: default 1

  PCI name is 00:05.0: nvme0n1
        irq 57 affinity 0-1 effective 1 is_managed:0 nvme0q0
        irq 58 affinity 4 effective 4 is_managed:1 nvme0q1
        irq 59 affinity 5 effective 5 is_managed:1 nvme0q2
        irq 60 affinity 0 effective 0 is_managed:1 nvme0q3
        irq 61 affinity 1 effective 1 is_managed:1 nvme0q4

whereas with blk_mq_map_hk_queues we get

  queue mapping for /dev/nvme0n1
        hctx0: default 2 4
        hctx1: default 3 5
        hctx2: default 0 6
        hctx3: default 1 7

  PCI name is 00:05.0: nvme0n1
        irq 56 affinity 0-1 effective 1 is_managed:0 nvme0q0
        irq 61 affinity 4 effective 4 is_managed:1 nvme0q1
        irq 62 affinity 5 effective 5 is_managed:1 nvme0q2
        irq 63 affinity 0 effective 0 is_managed:1 nvme0q3
        irq 64 affinity 1 effective 1 is_managed:1 nvme0q4

Signed-off-by: Daniel Wagner
Reviewed-by: Christoph Hellwig
---
 block/blk-mq-cpumap.c | 66 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index b3a863c2db3231624685ab54a1810b22af4111f4..38016bf1be8af14ef368e68d3fd12416858e3da6 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -61,11 +61,74 @@ unsigned int blk_mq_num_online_queues(unsigned int max_queues)
 }
 EXPORT_SYMBOL_GPL(blk_mq_num_online_queues);

+/*
+ * blk_mq_map_hk_queues - Create housekeeping CPU to hardware queue mapping
+ * @qmap:      CPU to hardware queue map
+ *
+ * Create a housekeeping CPU to hardware queue mapping in @qmap. If the
+ * isolcpus feature is enabled and blk_mq_map_hk_queues returns true,
+ * @qmap contains a valid configuration honoring the managed_irq
+ * configuration. If the isolcpus feature is disabled this function
+ * returns false.
+ */
+static bool blk_mq_map_hk_queues(struct blk_mq_queue_map *qmap)
+{
+       struct cpumask *hk_masks;
+       cpumask_var_t isol_mask;
+       unsigned int queue, cpu, nr_masks;
+
+       if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
+               return false;
+
+       /* map housekeeping cpus to matching hardware context */
+       nr_masks = qmap->nr_queues;
+       hk_masks = group_cpus_evenly(&nr_masks);
+       if (!hk_masks)
+               goto fallback;
+
+       for (queue = 0; queue < qmap->nr_queues; queue++) {
+               for_each_cpu(cpu, &hk_masks[queue % nr_masks])
+                       qmap->mq_map[cpu] = qmap->queue_offset + queue;
+       }
+
+       kfree(hk_masks);
+
+       /* map isolcpus to hardware context */
+       if (!alloc_cpumask_var(&isol_mask, GFP_KERNEL))
+               goto fallback;
+
+       queue = 0;
+       cpumask_andnot(isol_mask,
+                      cpu_possible_mask,
+                      housekeeping_cpumask(HK_TYPE_MANAGED_IRQ));
+
+       for_each_cpu(cpu, isol_mask) {
+               qmap->mq_map[cpu] = qmap->queue_offset + queue;
+               queue = (queue + 1) % qmap->nr_queues;
+       }
+
+       free_cpumask_var(isol_mask);
+
+       return true;
+
+fallback:
+       /* map all cpus to hardware context ignoring any affinity */
+       queue = 0;
+       for_each_possible_cpu(cpu) {
+               qmap->mq_map[cpu] = qmap->queue_offset + queue;
+               queue = (queue + 1) % qmap->nr_queues;
+       }
+       return true;
+}
+
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
        const struct cpumask *masks;
        unsigned int queue, cpu, nr_masks;

+       if (blk_mq_map_hk_queues(qmap))
+               return;
+
        nr_masks = qmap->nr_queues;
        masks = group_cpus_evenly(&nr_masks);
        if (!masks) {
@@ -121,6 +184,9 @@ void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap,
        if (!dev->bus->irq_get_affinity)
                goto fallback;

+       if (blk_mq_map_hk_queues(qmap))
+               return;
+
        for (queue = 0; queue < qmap->nr_queues; queue++) {
                mask = dev->bus->irq_get_affinity(dev, queue + offset);
                if (!mask)
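[Editorial note: the resulting mq_map for the 8-CPU example can be
written out explicitly. The values below are derived from the hctx
listing in the commit message; the array is illustrative only.]

/*
 * 4 queues, isolcpus=managed_irq,2-3,6-7. Housekeeping CPUs 0,1,4,5
 * are grouped evenly by group_cpus_evenly(); isolated CPUs 2,3,6,7 are
 * then spread round-robin over the queues, so every possible CPU has a
 * valid hctx even though no managed IRQ affinity points at an isolated
 * CPU.
 */
static const unsigned int example_mq_map[8] = {
        [0] = 2, [1] = 3, [4] = 0, [5] = 1,     /* housekeeping CPUs */
        [2] = 0, [3] = 1, [6] = 2, [7] = 3,     /* isolated CPUs, round-robin */
};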
From patchwork Tue Dec 17 18:29:43 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13912426
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:43 +0100
Subject: [PATCH v4 9/9] blk-mq: issue warning when offlining hctx with online isolcpus
Message-Id: <20241217-isolcpus-io-queues-v4-9-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
X-Mailer: b4 0.14.2

When offlining a hardware context which also has isolated CPUs mapped
to it, any IO issued by those isolated CPUs will stall as there is
nothing left to handle the interrupts etc.

This configuration/setup is not supported at this point, thus just
issue a warning.

Signed-off-by: Daniel Wagner
---
 block/blk-mq.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index de15c0c76f874a2a863b05a23e0f3dba20cb6488..f9af0f5dd6aac8da855777acf2ffc61128f15a74 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3619,6 +3619,45 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx)
        return data.has_rq;
 }

+static void blk_mq_hctx_check_isolcpus_online(struct blk_mq_hw_ctx *hctx,
+                                             unsigned int cpu)
+{
+       const struct cpumask *hk_mask;
+       int i;
+
+       if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
+               return;
+
+       hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+
+       for (i = 0; i < hctx->nr_ctx; i++) {
+               struct blk_mq_ctx *ctx = hctx->ctxs[i];
+
+               if (ctx->cpu == cpu)
+                       continue;
+
+               /*
+                * Check if this context has at least one online
+                * housekeeping CPU; in this case the hardware context is
+                * usable.
+                */
+               if (cpumask_test_cpu(ctx->cpu, hk_mask) &&
+                   cpu_online(ctx->cpu))
+                       break;
+
+               /*
+                * The context doesn't have any online housekeeping CPUs
+                * but there might be an online isolated CPU mapped to
+                * it.
+                */
+               if (cpu_is_offline(ctx->cpu))
+                       continue;
+
+               pr_warn("%s: offlining hctx%d but there is still an online isolcpu CPU %d mapped to it, IO stalls expected\n",
+                       hctx->queue->disk->disk_name,
+                       hctx->queue_num, ctx->cpu);
+       }
+}
+
 static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
                unsigned int this_cpu)
 {
@@ -3638,8 +3677,10 @@ static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
                        continue;

                /* this hctx has at least one online CPU */
-               if (this_cpu != cpu)
+               if (this_cpu != cpu) {
+                       blk_mq_hctx_check_isolcpus_online(hctx, this_cpu);
                        return true;
+               }
        }

        return false;