From patchwork Fri Jan 10 16:26:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935122 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E9C6B212FAC; Fri, 10 Jan 2025 16:26:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526409; cv=none; b=E66Xsq4aJQLj9iNIt3kUlV/tQt50a4qUzadzWgbbhfJgc0RcODrRfsnHd9hPpRVQFz4xaxzfiAPBtwA/zan3tldwh/lGEOdj3kbIkLDes6FBJ3Uu8p+wZRJSnW6dBGZFb7gjl7rmMolBmKnUIyVQ7f9mL+Sk11sBBSt4klQqgaI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526409; c=relaxed/simple; bh=2QlSRxWs/elhgqriGrrcLTL8l5gVPM+GPSu5vog+ndE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=mRY22PSyua/is48jQXaxcFSpX5eGTb9k+6dJMkekvtfAzzmxptsaB+gpE3huOaodD54/7LBVQFsDIEaUtEjls0A6tuY4DG9ZL+pVAhaGzgUZeUUK4bdQlGriGErSHPy2yGO0cbOmFTgmBLhpuv695xynWC2tj4RlSdYl2HUOIEg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=J0LyphEc; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="J0LyphEc" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 38DBFC4CED6; Fri, 10 Jan 2025 16:26:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526408; bh=2QlSRxWs/elhgqriGrrcLTL8l5gVPM+GPSu5vog+ndE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=J0LyphEcTT2dtqLr/I2zTqssQLGVB8RPIjRzMx9HKxRhCEsKEpRuOGRxZl1dE5vgH Go3EDTxQRm8ip63xxIylah/Yz3O0t5hUXYyMQvm7D6RKl/PuWbyJUeS8AMUbdNFm4f tzfh/d17n1VbVtbWGFaremV9OTyHJ9zbTsFJvjYsUtqUOVaAZlG/dE8miELPFHD3MF CcwhVomaAbd1+S6vfOHUD2oX2RKogpvM2ON4OuuJUPVcJv91rJdlir97WNeeMMNua2 zESOJzWifvvCN8hs2L5RdHWG80/u3AWpyVoQR4pOcda/8ZU0y9sOzRl+J1chTVu8rC 9TAilm8+wD9Qg== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:39 +0100 Subject: [PATCH v5 1/9] lib/group_cpus: let group_cpu_evenly return number initialized masks Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-1-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 group_cpu_evenly might allocated less groups then the requested: group_cpu_evenly __group_cpus_evenly alloc_nodes_groups # allocated total groups may be less than numgrps when # active total CPU number is less then numgrps In this case, the caller will do an out of bound access because the caller assumes the masks returned has numgrps. Return the number of groups created so the caller can limit the access range accordingly. Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- block/blk-mq-cpumap.c | 6 +++--- drivers/virtio/virtio_vdpa.c | 9 +++++---- fs/fuse/virtio_fs.c | 6 +++--- include/linux/group_cpus.h | 3 ++- kernel/irq/affinity.c | 9 +++++---- lib/group_cpus.c | 12 +++++++++--- 6 files changed, 27 insertions(+), 18 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index ad8d6a363f24ae11968b42f7bcfd6a719a0499b7..7d3dfe885dfac18711ae73eff510efe3877ffcb6 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -19,9 +19,9 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap) { const struct cpumask *masks; - unsigned int queue, cpu; + unsigned int queue, cpu, nr_masks; - masks = group_cpus_evenly(qmap->nr_queues); + masks = group_cpus_evenly(qmap->nr_queues, &nr_masks); if (!masks) { for_each_possible_cpu(cpu) qmap->mq_map[cpu] = qmap->queue_offset; @@ -29,7 +29,7 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap) } for (queue = 0; queue < qmap->nr_queues; queue++) { - for_each_cpu(cpu, &masks[queue]) + for_each_cpu(cpu, &masks[queue % nr_masks]) qmap->mq_map[cpu] = qmap->queue_offset + queue; } kfree(masks); diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index 1f60c9d5cb1810a6f208c24bb2ac640d537391a0..a7b297dae4890c9d6002744b90fc133bbedb7b44 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -329,20 +329,21 @@ create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd) for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) { unsigned int this_vecs = affd->set_size[i]; + unsigned int nr_masks; int j; - struct cpumask *result = group_cpus_evenly(this_vecs); + struct cpumask *result = group_cpus_evenly(this_vecs, &nr_masks); if (!result) { kfree(masks); return NULL; } - for (j = 0; j < this_vecs; j++) + for (j = 0; j < nr_masks; j++) cpumask_copy(&masks[curvec + j], &result[j]); kfree(result); - curvec += this_vecs; - usedvecs += this_vecs; + curvec += nr_masks; + usedvecs += nr_masks; } /* Fill out vectors at the end that don't need affinity */ diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 82afe78ec542358e2db6f4d955d521652ae363ec..47412bd40285a28d0dd61e4b3dabc59d5a1ba54e 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -862,7 +862,7 @@ static void virtio_fs_requests_done_work(struct work_struct *work) static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs) { const struct cpumask *mask, *masks; - unsigned int q, cpu; + unsigned int q, cpu, nr_masks; /* First attempt to map using existing transport layer affinities * e.g. PCIe MSI-X @@ -882,7 +882,7 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *f return; fallback: /* Attempt to map evenly in groups over the CPUs */ - masks = group_cpus_evenly(fs->num_request_queues); + masks = group_cpus_evenly(fs->num_request_queues, &nr_masks); /* If even this fails we default to all CPUs use first request queue */ if (!masks) { for_each_possible_cpu(cpu) @@ -891,7 +891,7 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *f } for (q = 0; q < fs->num_request_queues; q++) { - for_each_cpu(cpu, &masks[q]) + for_each_cpu(cpu, &masks[q % nr_masks]) fs->mq_map[cpu] = q + VQ_REQUEST; } kfree(masks); diff --git a/include/linux/group_cpus.h b/include/linux/group_cpus.h index e42807ec61f6e8cf3787af7daa0d8686edfef0a3..bd5dada6e8606fa6cf8f7babf939e39fd7475c8d 100644 --- a/include/linux/group_cpus.h +++ b/include/linux/group_cpus.h @@ -9,6 +9,7 @@ #include #include -struct cpumask *group_cpus_evenly(unsigned int numgrps); +struct cpumask *group_cpus_evenly(unsigned int numgrps, + unsigned int *nummasks); #endif diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c index 44a4eba80315cc098ecfa366ca1d88483641b12a..d2aefab5eb2b929877ced43f48b6268098484bd7 100644 --- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c @@ -70,20 +70,21 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd) */ for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) { unsigned int this_vecs = affd->set_size[i]; + unsigned int nr_masks; int j; - struct cpumask *result = group_cpus_evenly(this_vecs); + struct cpumask *result = group_cpus_evenly(this_vecs, &nr_masks); if (!result) { kfree(masks); return NULL; } - for (j = 0; j < this_vecs; j++) + for (j = 0; j < nr_masks; j++) cpumask_copy(&masks[curvec + j].mask, &result[j]); kfree(result); - curvec += this_vecs; - usedvecs += this_vecs; + curvec += nr_masks; + usedvecs += nr_masks; } /* Fill out vectors at the end that don't need affinity */ diff --git a/lib/group_cpus.c b/lib/group_cpus.c index ee272c4cefcc13907ce9f211f479615d2e3c9154..016c6578a07616959470b47121459a16a1bc99e5 100644 --- a/lib/group_cpus.c +++ b/lib/group_cpus.c @@ -332,9 +332,11 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, /** * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality * @numgrps: number of groups + * @nummasks: number of initialized cpumasks * * Return: cpumask array if successful, NULL otherwise. And each element - * includes CPUs assigned to this group + * includes CPUs assigned to this group. nummasks contains the number + * of initialized masks which can be less than numgrps. * * Try to put close CPUs from viewpoint of CPU and NUMA locality into * same group, and run two-stage grouping: @@ -344,7 +346,8 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, * We guarantee in the resulted grouping that all CPUs are covered, and * no same CPU is assigned to multiple groups */ -struct cpumask *group_cpus_evenly(unsigned int numgrps) +struct cpumask *group_cpus_evenly(unsigned int numgrps, + unsigned int *nummasks) { unsigned int curgrp = 0, nr_present = 0, nr_others = 0; cpumask_var_t *node_to_cpumask; @@ -421,10 +424,12 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps) kfree(masks); return NULL; } + *nummasks = nr_present + nr_others; return masks; } #else /* CONFIG_SMP */ -struct cpumask *group_cpus_evenly(unsigned int numgrps) +struct cpumask *group_cpus_evenly(unsigned int numgrps, + unsigned int *nummasks) { struct cpumask *masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL); @@ -433,6 +438,7 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps) /* assign all CPUs(cpu 0) to the 1st group only */ cpumask_copy(&masks[0], cpu_possible_mask); + *nummasks = 1; return masks; } #endif /* CONFIG_SMP */ From patchwork Fri Jan 10 16:26:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935123 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 758482135A0; Fri, 10 Jan 2025 16:26:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526411; cv=none; b=e5N+m+IfBOx3wdRtqX3vwOnkq29/Yhe6eHSApCPQPT/ydoWszSUt+h5syehzpYjkLJxL2axqNwOlPKJmG6cSVsVBa8Twflu/WIcAqyvbtrQmhB4zlTdltTIxrsoo8G8uza49XA4U8uxj8IOZl8uXwp07gGEurhnngkVeAI9ZZnk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526411; c=relaxed/simple; bh=QVb/GKe8Bqk7gvJ7t6g6sEH36EcUHUWPUDhvKJginFE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=JG3mu79/YfJrPySN349dRtpa82/poKAdBElHJgN/yTin4xUCqLRpvk77vq97VefmyoUDFAgjxey3Gs5orOCUMDHjyE7zD2r+yrez9nSvR8xtHMzxEyt5gUDM8d4xb1M+Qj8sy7EvjpUOX74YT/mybVSLNCrk2FBMF3pz32Gb2JM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=KkqCRO+h; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="KkqCRO+h" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B55A8C4CEE0; Fri, 10 Jan 2025 16:26:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526411; bh=QVb/GKe8Bqk7gvJ7t6g6sEH36EcUHUWPUDhvKJginFE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=KkqCRO+h0O7VDRnPT7DUgC0u8T+jFoJKaRtYozvfbqZPexSXFY78WUhpbuCZwE/iU THhWtXu515eUJkCX9AsKql7qOc9vFFDEPfy+PTBiKXIJyHPchIML3W5gdhENBmNpRu QQU/Dabi56ixbc2okTxXq84HJpurQximMgOvunw2sPDUPUgfPL6P0gazOIkgfd8DOT blv0125aOJX1h+xDLFvhI3RLTaSGH2ghQIKb2g4oEbaswV+2pOr+Uo8tlP4y83/LwN aQGzFANvP+QrVXrwpFEet/4Ll6APTstXweiSjvK/cJdy8PM9TNeJJBGEd1gZJqKwAG TucYSL5TiiAmQ== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:40 +0100 Subject: [PATCH v5 2/9] blk-mq: add number of queue calc helper Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-2-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This avoids that the isolated CPUs get disturbed with OS workload. Add two variants of helpers which calculates the correct number of queues which should be used. The need for two variants is necessary because some drivers calculate their max number of queues based on the possible CPU mask, others based on the online CPU mask. Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke Signed-off-by: Daniel Wagner --- block/blk-mq-cpumap.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/blk-mq.h | 2 ++ 2 files changed, 47 insertions(+) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 7d3dfe885dfac18711ae73eff510efe3877ffcb6..0923cccdcbcad75ad107c3636af15b723356e087 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -12,10 +12,55 @@ #include #include #include +#include #include "blk.h" #include "blk-mq.h" +static unsigned int blk_mq_num_queues(const struct cpumask *mask, + unsigned int max_queues) +{ + unsigned int num; + + if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) + mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ); + + num = cpumask_weight(mask); + return min_not_zero(num, max_queues); +} + +/** + * blk_mq_num_possible_queues - Calc nr of queues for multiqueue devices + * @max_queues: The maximal number of queues the hardware/driver + * supports. If max_queues is 0, the argument is + * ignored. + * + * Calculate the number of queues which should be used for a multiqueue + * device based on the number of possible cpu. The helper is considering + * isolcpus settings. + */ +unsigned int blk_mq_num_possible_queues(unsigned int max_queues) +{ + return blk_mq_num_queues(cpu_possible_mask, max_queues); +} +EXPORT_SYMBOL_GPL(blk_mq_num_possible_queues); + +/** + * blk_mq_num_online_queues - Calc nr of queues for multiqueue devices + * @max_queues: The maximal number of queues the hardware/driver + * supports. If max_queues is 0, the argument is + * ignored. + * + * Calculate the number of queues which should be used for a multiqueue + * device based on the number of online cpus. The helper is considering + * isolcpus settings. + */ +unsigned int blk_mq_num_online_queues(unsigned int max_queues) +{ + return blk_mq_num_queues(cpu_online_mask, max_queues); +} +EXPORT_SYMBOL_GPL(blk_mq_num_online_queues); + void blk_mq_map_queues(struct blk_mq_queue_map *qmap) { const struct cpumask *masks; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index a0a9007cc1e36f89ebb21e699de3234a3cf9ef5b..f58172755e3464e305a86bbaa0a4509270fd7e0e 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -909,6 +909,8 @@ int blk_mq_freeze_queue_wait_timeout(struct request_queue *q, void blk_mq_unfreeze_queue_non_owner(struct request_queue *q); void blk_freeze_queue_start_non_owner(struct request_queue *q); +unsigned int blk_mq_num_possible_queues(unsigned int max_queues); +unsigned int blk_mq_num_online_queues(unsigned int max_queues); void blk_mq_map_queues(struct blk_mq_queue_map *qmap); void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap, struct device *dev, unsigned int offset); From patchwork Fri Jan 10 16:26:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935124 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0451A2135CF; Fri, 10 Jan 2025 16:26:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526414; cv=none; b=L4DkHjbyfqHhawN4JYdQ5TLOB3XhX8bsrYQO8YNU8SjJ2Tu98tKhz2B6ubfI3d8e23WYJr3yhdGsy1yGGDXYZQ8qZBDCpAdgGh1Y1VcfNZnlrRtPkhNoQzs1K0zrKJvNEYyll+q6KN0eqGKmxKUwJkafHjrDFfvg61kdjUWCI4E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526414; c=relaxed/simple; bh=7KONgCPE9Ymk0Paeq6W03Cji817hrNEc4cLzoY7jtxk=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=GTl+bactlqnk6jaEE6FtgYTfmk9o7WW8ot6mApYNpZkRkMij+lS9mOzFrA8uBG4aX1cdAceVakssRb9CoUSYRPjm/jmbdJJQ627wX8tOE7lFpX3mUQRzl+KIu/fHQpwderaVo4ng8CWHHRiX/OKL1LJncxFSz2nCs8D1CZa/GIU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SGoFF60h; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SGoFF60h" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3B95DC4CEE3; Fri, 10 Jan 2025 16:26:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526413; bh=7KONgCPE9Ymk0Paeq6W03Cji817hrNEc4cLzoY7jtxk=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=SGoFF60hidDrBhvkwkP8fKG8mLcaHrEmpWtnixxZbRQUiS0yRmkrcUIHLLaoui3FV nECZr9rjjMWh4IqgtKgkMBe1c9PIdin5yS91ubVuSfXL3zgNLme4+EW+9ytOOi2CC9 dPEbEgjithMuJxvTEtj00f3fNmual//ZkRU9mITN5v9fNZw2ytIQ3JySeJygM+2YI/ qGSiu3DHHk2fbuoxJbm+05Mno1bHLVUCYEivSEuEElxAFXmZPwZ9fXKbjCUJglbso2 qWYj6SewkNywxo9CAOOMWTYPzD9mTcX+6YF8OPzX4x0EaG0pkOb5JIgWGR9C4hzKjt Ea08Eym1qoH2g== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:41 +0100 Subject: [PATCH v5 3/9] nvme-pci: use block layer helpers to calculate num of queues Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-3-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This avoids that the isolated CPUs get disturbed with OS workload. Use helpers which calculates the correct number of queues which should be used when isolcpus is used. Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke Signed-off-by: Daniel Wagner --- drivers/nvme/host/pci.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 709328a67f915aede5c6bae956e1bdd5e6f3f1bc..4af22f09ed8474676edd118477344ed32236c497 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -81,7 +81,7 @@ static int io_queue_count_set(const char *val, const struct kernel_param *kp) int ret; ret = kstrtouint(val, 10, &n); - if (ret != 0 || n > num_possible_cpus()) + if (ret != 0 || n > blk_mq_num_possible_queues(0)) return -EINVAL; return param_set_uint(val, kp); } @@ -2439,7 +2439,8 @@ static unsigned int nvme_max_io_queues(struct nvme_dev *dev) */ if (dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS) return 1; - return num_possible_cpus() + dev->nr_write_queues + dev->nr_poll_queues; + return blk_mq_num_possible_queues(0) + dev->nr_write_queues + + dev->nr_poll_queues; } static int nvme_setup_io_queues(struct nvme_dev *dev) From patchwork Fri Jan 10 16:26:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935125 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9ECFC2139D8; Fri, 10 Jan 2025 16:26:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526416; cv=none; b=PX+Q1jx4H4G7CerXuLpdkGO8ZYwocednP3Yc0XWsfwXzxFmNE2U8tz3qftkeBUGbV67veES7dY3UrtM1n1yyChVR7hZdSNFqqXxVnOjFFcF41MUWYUQpIUYerpOmgivp9Ki3ouJKySInMiByUpSLkv8hy60CVkNlPqJOaS80Kr0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526416; c=relaxed/simple; bh=0ycW44IR1uqmQrDJLN8IiSf67ET1Zi70MGLU1ocQlSU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=pGYMKc+qIirmoJ5GJ8eTGiKE3BWMXChjEJFIZnqFTWgCe4Wm96WVoU8hifwDlYMHQaitGVPq2Lx4pLXNdq1pywcmFSo2a2gzCqTroExoJYQ20PhFV9DkoZGpxJ0QsUpWaEzs5siOHfY27xhnfnQI/si3ewwP8OnCmlnc6riuSxM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tE61rrMh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tE61rrMh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C86B1C4CED6; Fri, 10 Jan 2025 16:26:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526416; bh=0ycW44IR1uqmQrDJLN8IiSf67ET1Zi70MGLU1ocQlSU=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=tE61rrMhwqmAQPxSwk5UrlzHUuve4bybQGl0HMGMlefY5rtcLGXwBkhUrhdiEFon7 pHeQSQRJqM9TfQeVSj9VryFQ02c2eIPchC/7OIoXlwYENopkVUdZNQQZYIFHvCK4K6 DrgGkfGTQbDEPKeSVxIEmoZf/rfyoo9WZ2F7DIviDLZZV0vNK/uLA3Wn0FTL4UBzqz fq6bpIvVgVXPBAl2Apr/RqVHBlJhfl5yAn1nTnuzlAFiS8Ht7JHi8fleLFTO9jxtzF cn4FaYgWsSehERVeL3HbipC8BdD8iSlmIDNjVsnwfGD6WFD20ko6EZr9tR/AornfNe uPI8F6zu0XfBQ== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:42 +0100 Subject: [PATCH v5 4/9] scsi: use block layer helpers to calculate num of queues Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-4-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This avoids that the isolated CPUs get disturbed with OS workload. Use helpers which calculates the correct number of queues which should be used when isolcpus is used. Signed-off-by: Daniel Wagner Reviewed-by: Martin K. Petersen Reviewed-by: Hannes Reinecke --- drivers/scsi/megaraid/megaraid_sas_base.c | 15 +++++++++------ drivers/scsi/qla2xxx/qla_isr.c | 10 +++++----- drivers/scsi/smartpqi/smartpqi_init.c | 5 ++--- 3 files changed, 16 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 49abd7dd75a7b7c1ddcfac41acecbbcf7de8f5a4..59d385e5a917979ae2f61f5db2c3355b9cab7e08 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -5962,7 +5962,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance) else instance->iopoll_q_count = 0; - num_msix_req = num_online_cpus() + instance->low_latency_index_start; + num_msix_req = blk_mq_num_online_queues(0) + + instance->low_latency_index_start; instance->msix_vectors = min(num_msix_req, instance->msix_vectors); @@ -5978,7 +5979,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance) /* Disable Balanced IOPS mode and try realloc vectors */ instance->perf_mode = MR_LATENCY_PERF_MODE; instance->low_latency_index_start = 1; - num_msix_req = num_online_cpus() + instance->low_latency_index_start; + num_msix_req = blk_mq_num_online_queues(0) + + instance->low_latency_index_start; instance->msix_vectors = min(num_msix_req, instance->msix_vectors); @@ -6234,7 +6236,7 @@ static int megasas_init_fw(struct megasas_instance *instance) intr_coalescing = (scratch_pad_1 & MR_INTR_COALESCING_SUPPORT_OFFSET) ? true : false; if (intr_coalescing && - (num_online_cpus() >= MR_HIGH_IOPS_QUEUE_COUNT) && + (blk_mq_num_online_queues(0) >= MR_HIGH_IOPS_QUEUE_COUNT) && (instance->msix_vectors == MEGASAS_MAX_MSIX_QUEUES)) instance->perf_mode = MR_BALANCED_PERF_MODE; else @@ -6278,7 +6280,8 @@ static int megasas_init_fw(struct megasas_instance *instance) else instance->low_latency_index_start = 1; - num_msix_req = num_online_cpus() + instance->low_latency_index_start; + num_msix_req = blk_mq_num_online_queues(0) + + instance->low_latency_index_start; instance->msix_vectors = min(num_msix_req, instance->msix_vectors); @@ -6310,8 +6313,8 @@ static int megasas_init_fw(struct megasas_instance *instance) megasas_setup_reply_map(instance); dev_info(&instance->pdev->dev, - "current msix/online cpus\t: (%d/%d)\n", - instance->msix_vectors, (unsigned int)num_online_cpus()); + "current msix/max num queues\t: (%d/%u)\n", + instance->msix_vectors, blk_mq_num_online_queues(0)); dev_info(&instance->pdev->dev, "RDPQ mode\t: (%s)\n", instance->is_rdpq ? "enabled" : "disabled"); diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index fe98c76e9be32ff03a1960f366f0d700d1168383..c4c6b5c6658c0734f7ff68bcc31b33dde87296dd 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -4533,13 +4533,13 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)), - PCI_IRQ_MSIX); + blk_mq_num_online_queues(ha->msix_count) + min_vecs, + PCI_IRQ_MSIX); } else ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)), - PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, - &desc); + blk_mq_num_online_queues(ha->msix_count) + min_vecs, + PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, + &desc); if (ret < 0) { ql_log(ql_log_fatal, vha, 0x00c7, diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 04fb24d77e9b5c0137f26bc41f17191cc4c49728..7636c8d1c9f14a0d887c1d517c3664f0d0df7e6e 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -5278,15 +5278,14 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info) if (reset_devices) { num_queue_groups = 1; } else { - int num_cpus; int max_queue_groups; max_queue_groups = min(ctrl_info->max_inbound_queues / 2, ctrl_info->max_outbound_queues - 1); max_queue_groups = min(max_queue_groups, PQI_MAX_QUEUE_GROUPS); - num_cpus = num_online_cpus(); - num_queue_groups = min(num_cpus, ctrl_info->max_msix_vectors); + num_queue_groups = + blk_mq_num_online_queues(ctrl_info->max_msix_vectors); num_queue_groups = min(num_queue_groups, max_queue_groups); } From patchwork Fri Jan 10 16:26:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935126 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1D15213E88; Fri, 10 Jan 2025 16:26:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526418; cv=none; b=iVbPEPWt68CknT+2V6JlMJhWcP6uCCMrHqqGmYbfyzWMljZgSmThWXFoIQY3HdetZGI3oM3HBJ9mHNSCMs5M9wwdeCX9C3N6QhNVyRb9oyCjCRpKNUSEXU0tDFe3x5zkQiglCuO6qojUXcCF7iTwZpK04R5B6q54QBjSeBlruJw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526418; c=relaxed/simple; bh=2WOikGN/ZqobQqHz6R5F6l81bCwlMDfxRu+pSaWFjzw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=DpkQH2nmVBK6TFpAsnIRwxglEoxQybG49/wclUbuBJXVRJydJJdkBrz3roKGGaYMuVfdmiVt/CWNWyxZaD2qj8RxMctuQfk447eLhSgEHRmhMw+VKR7nxBefWgFoyw84ky/aLrlDEb7Ym+2RB1EZNNgHWUUXuCBKsidP3pTYdAM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QwU9t/0n; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QwU9t/0n" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 33E13C4CED6; Fri, 10 Jan 2025 16:26:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526418; bh=2WOikGN/ZqobQqHz6R5F6l81bCwlMDfxRu+pSaWFjzw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=QwU9t/0nxb9qVso7q/8qOHCxbw3XJ06xXtDuidpB3Tg1BXgS0D3ZJxnFV/7hjC4rE /lKP/IL569W8mJGxXaq4Em1yJTxk6mzSx3P7nvOhBIbrMy4Z/rd4sGpFAg5B40C2eY IFPMxV0xAZVh3k0RY3YngN/+4EK87qSmY73dbheISoK63GL/klDISwivJ3F8WzkzYz jABk+o6gHzGxMzPUNPP5hvjLzjvgmWAxKNoKmI/LjnVaYYLqf9mzwle86xrAUD1nVe scBeTPVZYFajjLNU3yKHPTdxYExO+RzyqA4uqaE/uqtgf5WG34iW1ySVgh7m2MolVB yDFj55MlmiRNw== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:43 +0100 Subject: [PATCH v5 5/9] virtio: blk/scsi: use block layer helpers to calculate num of queues Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-5-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This avoids that the isolated CPUs get disturbed with OS workload. Use helpers which calculates the correct number of queues which should be used when isolcpus is used. Reviewed-by: Christoph Hellwig Acked-by: Michael S. Tsirkin Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- drivers/block/virtio_blk.c | 5 ++--- drivers/scsi/virtio_scsi.c | 1 + 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 71a7ffeafb32ccd6329102d3166da7cbc8bc9539..c5b2ceebd645659d86299d07224d85bb7671a9a7 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -976,9 +976,8 @@ static int init_vq(struct virtio_blk *vblk) return -EINVAL; } - num_vqs = min_t(unsigned int, - min_not_zero(num_request_queues, nr_cpu_ids), - num_vqs); + num_vqs = blk_mq_num_possible_queues( + min_not_zero(num_request_queues, num_vqs)); num_poll_vqs = min_t(unsigned int, poll_queues, num_vqs - 1); diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index 60be1a0c61836ba643adcf9ad8d5b68563a86cb1..46ca0b82f57ce2211c7e2817dd40ee34e65bcbf9 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -919,6 +919,7 @@ static int virtscsi_probe(struct virtio_device *vdev) /* We need to know how many queues before we allocate. */ num_queues = virtscsi_config_get(vdev, num_queues) ? : 1; num_queues = min_t(unsigned int, nr_cpu_ids, num_queues); + num_queues = blk_mq_num_possible_queues(num_queues); num_targets = virtscsi_config_get(vdev, max_target) + 1; From patchwork Fri Jan 10 16:26:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935127 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 60E8E214209; Fri, 10 Jan 2025 16:27:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526421; cv=none; b=R1ZNgeJTWKLVYPcxaLzJMK03dloQ0XYOy8RweAw2GDjwjQSAt53N13seej0w6EXMsxrrFwvkMuS01hQ7TGSDBUzHeh7SP+CcJ3WFpHL69s9YGI/LaUoH0okY4JJvLldBwnV/4IZcZOsJTNrXQcTB3e/FpPXyl0qIrF1v+0ECT8g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526421; c=relaxed/simple; bh=acSNL6VJuwcxl8u3g9fPp0k3KcJTFgUi5vnjUgLuaZg=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Y11TNSg02ZFX3y5F24ZfYgQ0ENhNngwfzkBX+x8YhMTSfHxDNRCKXaZNlvL16eYJBV8T2t9jxQ8BkBCfW5LzF9udAEnWEAH5T9u6/GNzW3fCKarfddddDFl9/ZGykXQIOpGWJ+UeWmQL/X9bo8/Wf69jezWojI8YX4n+cETfNPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Z0fhl7Xx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Z0fhl7Xx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9860EC4CED6; Fri, 10 Jan 2025 16:27:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526420; bh=acSNL6VJuwcxl8u3g9fPp0k3KcJTFgUi5vnjUgLuaZg=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=Z0fhl7Xx0B6viqZtmfP0ve2EFvhh9Umxb2dsQCLc+9Z5RtM4/eB9v2/aVulpakmXC xixeQVwz1yFTwOC7YOsLR3VuEcVRQzACUejnveh3W+ahiNxWzbtOtKZ+Ru1b5FpLFu H05rlru2fC6uC+l3lQQk+Z0xgbh93H4GF7keUesoqABEU8QktZabrvjfXmZ36vpCdc OrdKDKhw4QFL9nsrDOZZvr9HX2Kofaum4CaFQuXEPWtpPtMOpdGPOhBk6oXdM9sydK 43peoEW6bvrCFcZVg50pkPxlld/99avxof/pFnMkPm11kYJSN09zgG4zzuHf7cuOw0 g7gA1v8nqy47Q== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:44 +0100 Subject: [PATCH v5 6/9] lib/group_cpus: honor housekeeping config when grouping CPUs Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-6-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 group_cpus_evenly distributes all present CPUs into groups. This ignores the isolcpus configuration and assigns isolated CPUs into the groups. Make group_cpus_evenly aware of isolcpus configuration and use the housekeeping CPU mask as base for distributing the available CPUs into groups. Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke Reviewed-by: Sagi Grimberg Signed-off-by: Daniel Wagner --- lib/group_cpus.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 79 insertions(+), 3 deletions(-) diff --git a/lib/group_cpus.c b/lib/group_cpus.c index 016c6578a07616959470b47121459a16a1bc99e5..ba112dda527552a031dff083e77b748ac2629ca8 100644 --- a/lib/group_cpus.c +++ b/lib/group_cpus.c @@ -8,6 +8,7 @@ #include #include #include +#include #ifdef CONFIG_SMP @@ -330,7 +331,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, } /** - * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality + * group_possible_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality * @numgrps: number of groups * @nummasks: number of initialized cpumasks * @@ -346,8 +347,8 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, * We guarantee in the resulted grouping that all CPUs are covered, and * no same CPU is assigned to multiple groups */ -struct cpumask *group_cpus_evenly(unsigned int numgrps, - unsigned int *nummasks) +static struct cpumask *group_possible_cpus_evenly(unsigned int numgrps, + unsigned int *nummasks) { unsigned int curgrp = 0, nr_present = 0, nr_others = 0; cpumask_var_t *node_to_cpumask; @@ -427,6 +428,81 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps, *nummasks = nr_present + nr_others; return masks; } + +/** + * group_mask_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality + * @numgrps: number of groups + * @cpu_mask: CPU to consider for the grouping + * @nummasks: number of initialized cpusmasks + * + * Return: cpumask array if successful, NULL otherwise. And each element + * includes CPUs assigned to this group. + * + * Try to put close CPUs from viewpoint of CPU and NUMA locality into + * same group. Allocate present CPUs on these groups evenly. + */ +static struct cpumask *group_mask_cpus_evenly(unsigned int numgrps, + const struct cpumask *cpu_mask, + unsigned int *nummasks) +{ + cpumask_var_t *node_to_cpumask; + cpumask_var_t nmsk; + int ret = -ENOMEM; + struct cpumask *masks = NULL; + + if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL)) + return NULL; + + node_to_cpumask = alloc_node_to_cpumask(); + if (!node_to_cpumask) + goto fail_nmsk; + + masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL); + if (!masks) + goto fail_node_to_cpumask; + + build_node_to_cpumask(node_to_cpumask); + + ret = __group_cpus_evenly(0, numgrps, node_to_cpumask, cpu_mask, nmsk, + masks); + +fail_node_to_cpumask: + free_node_to_cpumask(node_to_cpumask); + +fail_nmsk: + free_cpumask_var(nmsk); + if (ret < 0) { + kfree(masks); + return NULL; + } + *nummasks = ret; + return masks; +} + +/** + * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality + * @numgrps: number of groups + * @nummasks: number of initialized cpusmasks + * + * Return: cpumask array if successful, NULL otherwise. + * + * group_possible_cpus_evently() is used for distributing the cpus on all + * possible cpus in absence of isolcpus command line argument. + * group_mask_cpu_evenly() is used when the isolcpus command line + * argument is used with managed_irq option. In this case only the + * housekeeping CPUs are considered. + */ +struct cpumask *group_cpus_evenly(unsigned int numgrps, + unsigned int *nummasks) +{ + if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) { + return group_mask_cpus_evenly(numgrps, + housekeeping_cpumask(HK_TYPE_MANAGED_IRQ), + nummasks); + } + + return group_possible_cpus_evenly(numgrps, nummasks); +} #else /* CONFIG_SMP */ struct cpumask *group_cpus_evenly(unsigned int numgrps, unsigned int *nummasks) From patchwork Fri Jan 10 16:26:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935128 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B32132147F5; Fri, 10 Jan 2025 16:27:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526423; cv=none; b=dnJv7bNCjxXOcMCWSQcnXQKK5Hd/UulrGl/lctW0AZsdXEYR0pISZxJEPGR0Qe2julmdDFhc7BFrCUt2ozfdve0rTBBdhyWZqdTPsk4rPY29rOvah/GHzUllRLwIXNeO1BMZN9C9labzOqDOFbJ4ZK82PAHIPgxnWdfXH+tOPLc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526423; c=relaxed/simple; bh=5//HpnRrVhYs9UNL8zeF3rbaQciAX5RlpgaaiZHbgaI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=SRnJrLdVQzMXjp6q6WXcbzjrOfux60+Gke+kLiELmTUMe2o4HDGFNkmbJERiTleScMcCHW6iJrbx1xgtoM6q/AUmMm7on0UCVFQKEDuRS0oWxAHDp0ePLRqIUraaaasyuH4iFjkUnXySRGSWquD0bgKeTpa4hUdOIjHS3/HstOs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=q6z3CTM5; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="q6z3CTM5" Received: by smtp.kernel.org (Postfix) with ESMTPSA id EE9D6C4CEE4; Fri, 10 Jan 2025 16:27:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526423; bh=5//HpnRrVhYs9UNL8zeF3rbaQciAX5RlpgaaiZHbgaI=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=q6z3CTM5LqYkAhnKyEgfu+nKe61Q+TUtw20L/MQGqybRIjNJX7ZZcP03cLORhgjxT lIyHkdVGy/TU6WbCHVLsGpHswnsX0VtBKs/bnxMsCPlLxFVOsZsMibMUWlA7pWQVr1 xSj3/OEjYE2jmf+Dw7S4prkQZb+rthlXQlNtqtidzqznBaRr4582JQpxTovT3PMWG/ Rxa1OOZhRgUNI8AlTiPQt0zwi73Mo4N19V78W26HqkFe0wjn1eQKcX7yKLZqcqmlK+ eMc1FWD3EPE+RsEBiNth3A3hxrb8hBNJLWfqaxnfW0VJv4fJT1HhpIq/9RZH66UR/e OKfsItufCQb5w== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:45 +0100 Subject: [PATCH v5 7/9] blk-mq: use hk cpus only when isolcpus=managed_irq is enabled Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-7-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 When isolcpus=managed_irq is enabled all hardware queues should run on the housekeeping CPUs only. Thus ignore the affinity mask provided by the driver. Also we can't use blk_mq_map_queues because it will map all CPUs to first hctx unless, the CPU is the same as the hctx has the affinity set to, e.g. 8 CPUs with isolcpus=managed_irq,2-3,6-7 config queue mapping for /dev/nvme0n1 hctx0: default 2 3 4 6 7 hctx1: default 5 hctx2: default 0 hctx3: default 1 PCI name is 00:05.0: nvme0n1 irq 57 affinity 0-1 effective 1 is_managed:0 nvme0q0 irq 58 affinity 4 effective 4 is_managed:1 nvme0q1 irq 59 affinity 5 effective 5 is_managed:1 nvme0q2 irq 60 affinity 0 effective 0 is_managed:1 nvme0q3 irq 61 affinity 1 effective 1 is_managed:1 nvme0q4 where as with blk_mq_hk_map_queues we get queue mapping for /dev/nvme0n1 hctx0: default 2 4 hctx1: default 3 5 hctx2: default 0 6 hctx3: default 1 7 PCI name is 00:05.0: nvme0n1 irq 56 affinity 0-1 effective 1 is_managed:0 nvme0q0 irq 61 affinity 4 effective 4 is_managed:1 nvme0q1 irq 62 affinity 5 effective 5 is_managed:1 nvme0q2 irq 63 affinity 0 effective 0 is_managed:1 nvme0q3 irq 64 affinity 1 effective 1 is_managed:1 nvme0q4 Reviewed-by: Christoph Hellwig Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- block/blk-mq-cpumap.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 0923cccdcbcad75ad107c3636af15b723356e087..e78eebbbaf0a2e0e8e03a2b31087c62a9090808c 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -61,11 +61,73 @@ unsigned int blk_mq_num_online_queues(unsigned int max_queues) } EXPORT_SYMBOL_GPL(blk_mq_num_online_queues); +/* + * blk_mq_map_hk_queues - Create housekeeping CPU to hardware queue mapping + * @qmap: CPU to hardware queue map + * + * Create a housekeeping CPU to hardware queue mapping in @qmap. If the + * isolcpus feature is enabled and blk_mq_map_hk_queues returns true, + * @qmap contains a valid configuration honoring the managed_irq + * configuration. If the isolcpus feature is disabled this function + * returns false. + */ +static bool blk_mq_map_hk_queues(struct blk_mq_queue_map *qmap) +{ + struct cpumask *hk_masks; + cpumask_var_t isol_mask; + unsigned int queue, cpu, nr_masks; + + if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) + return false; + + /* map housekeeping cpus to matching hardware context */ + hk_masks = group_cpus_evenly(qmap->nr_queues, &nr_masks); + if (!hk_masks) + goto fallback; + + for (queue = 0; queue < qmap->nr_queues; queue++) { + for_each_cpu(cpu, &hk_masks[queue % nr_masks]) + qmap->mq_map[cpu] = qmap->queue_offset + queue; + } + + kfree(hk_masks); + + /* map isolcpus to hardware context */ + if (!alloc_cpumask_var(&isol_mask, GFP_KERNEL)) + goto fallback; + + queue = 0; + cpumask_andnot(isol_mask, + cpu_possible_mask, + housekeeping_cpumask(HK_TYPE_MANAGED_IRQ)); + + for_each_cpu(cpu, isol_mask) { + qmap->mq_map[cpu] = qmap->queue_offset + queue; + queue = (queue + 1) % qmap->nr_queues; + } + + free_cpumask_var(isol_mask); + + return true; + +fallback: + /* map all cpus to hardware context ignoring any affinity */ + queue = 0; + for_each_possible_cpu(cpu) { + qmap->mq_map[cpu] = qmap->queue_offset + queue; + queue = (queue + 1) % qmap->nr_queues; + } + return true; +} + void blk_mq_map_queues(struct blk_mq_queue_map *qmap) { const struct cpumask *masks; unsigned int queue, cpu, nr_masks; + if (blk_mq_map_hk_queues(qmap)) + return; + masks = group_cpus_evenly(qmap->nr_queues, &nr_masks); if (!masks) { for_each_possible_cpu(cpu) @@ -120,6 +182,9 @@ void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap, if (!dev->bus->irq_get_affinity) goto fallback; + if (blk_mq_map_hk_queues(qmap)) + return; + for (queue = 0; queue < qmap->nr_queues; queue++) { mask = dev->bus->irq_get_affinity(dev, queue + offset); if (!mask) From patchwork Fri Jan 10 16:26:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935129 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0882C212FA7; Fri, 10 Jan 2025 16:27:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526426; cv=none; b=XhWUFNxRWkZyjZ5biSMu62+6WKIshOqGmG+PI0IyUFYreLLPNetoKDDf2kUjvlTVt1iBpj/UNGx0akeElsSadjaa+2+iNZ8NSxb/K04gLYBr1F7+4bWKnQq63nrgFPzqx0rlAr3NZkrUX3hpjl/Lp9FhDxyCZUoCkmY0K46abx8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526426; c=relaxed/simple; bh=m9BSbBAD2bW8ZLYqHByMdg0GNOBvxPgl189ozbJAOes=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=BPetOvAegE7GnjoKRXh3/81+TJgeXmXYiDQ3j+susRUDO5f/EygcKfPoucAB3M3VbCnEfjaC0S3KaOfre7smr/EjRaHSI6ngF84IzYXgbOsBMzACzTk+1F6wI0LLO0h9SLwEt2Ypi208zFSBsP71NSMmxA1h3rbWfLgLwQVznSQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=r+5ItLis; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="r+5ItLis" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8CAC7C4CED6; Fri, 10 Jan 2025 16:27:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526425; bh=m9BSbBAD2bW8ZLYqHByMdg0GNOBvxPgl189ozbJAOes=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=r+5ItLis6ADqjLZ00Y7vBeSuGNRtzNTJVGEtwoh4kB07EhnoA3SxtoGr/dXPCT3c1 kz15npsjmJ678zcsqURO9gI2VwV24kQrHFJH8kkBl1rqSWY9Q2rdrW45LjMvFwfE3q gUZkNLyyS9gTBOQnQnKkYjVkcbaG4phZN0R2UQq8zj8yfYEgh3JzreHw95snCkDf3U dzuAbKq/fk3DMi0Dlj8MuyNuzMUii/lRG/JyH7Qaxk62A+NN9twYCnys7GuW4Lu7Z4 XqvJ09YHEorEWgLvrag9v77l7BFp3nytx8ftJV8BEkJMxPVELC21G/pDlrM1pE9FXG 4hJk67zGncQoQ== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:46 +0100 Subject: [PATCH v5 8/9] blk-mq: issue warning when offlining hctx with online isolcpus Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-8-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 When isolcpus=managed_irq is enabled, and the last housekeeping CPU for a given hardware context goes offline, there is no CPU left which handles the IOs anymore. If isolated CPUs mapped to this hardware context are online and an application running on these isolated CPUs issue an IO this will lead to stalls. The kernel will not schedule IO to isolated CPUS thus this avoids IO stalls. Thus issue a warning when housekeeping CPUs are offlined for a hardware context while there are still isolated CPUs online. Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- block/blk-mq.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 2e6132f778fd958aae3cad545e4b3dd623c9c304..43eab0db776d37ffd7eb6c084211b5e05d41a574 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3620,6 +3620,45 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx) return data.has_rq; } +static void blk_mq_hctx_check_isolcpus_online(struct blk_mq_hw_ctx *hctx, unsigned int cpu) +{ + const struct cpumask *hk_mask; + int i; + + if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) + return; + + hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ); + + for (i = 0; i < hctx->nr_ctx; i++) { + struct blk_mq_ctx *ctx = hctx->ctxs[i]; + + if (ctx->cpu == cpu) + continue; + + /* + * Check if this context has at least one online + * housekeeping CPU in this case the hardware context is + * usable. + */ + if (cpumask_test_cpu(ctx->cpu, hk_mask) && + cpu_online(ctx->cpu)) + break; + + /* + * The context doesn't have any online housekeeping CPUs + * but there might be an online isolated CPU mapped to + * it. + */ + if (cpu_is_offline(ctx->cpu)) + continue; + + pr_warn("%s: offlining hctx%d but there is still an online isolcpu CPU %d mapped to it, IO stalls expected\n", + hctx->queue->disk->disk_name, + hctx->queue_num, ctx->cpu); + } +} + static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx, unsigned int this_cpu) { @@ -3639,8 +3678,10 @@ static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx, continue; /* this hctx has at least one online CPU */ - if (this_cpu != cpu) + if (this_cpu != cpu) { + blk_mq_hctx_check_isolcpus_online(hctx, this_cpu); return true; + } } return false; From patchwork Fri Jan 10 16:26:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 13935130 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 63E6D21504E; Fri, 10 Jan 2025 16:27:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526428; cv=none; b=CPDXOFf0wr/6WgvExL8DL65u/AZX6qGEvpodzg6Gtq+pISZF+EKEVEovr6/lpFtf76wEmzZ2ZoNBcLHXaLJ1GFlaCVNCi13rcwxU+o6e5HBgKNe6NtZJZLeRr7Vc16Xxl6MprVw0k6rAcFa5J4PeKGKDsgkgLNHjU4VxjpX+bjo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736526428; c=relaxed/simple; bh=KZEW4nhkxc3r/pWy7TVGP1x5+haiTt3SJOI/nLc24UA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=pzyuo2btqzHp2cuvzkz+K9BYEH0YH4cU9k+IrLb0rOs1jW/0lz6yYIHprFJgMV4/fewaTASN3EbJUwcEiinRbGrL5eZ81SgsqYMJGAf9WPVyfxjM2qkCWrjZLLjGmPEBTrMjF8SmyNtrHDsIyc0hwlOv5smF8OqzGsoC5C5SqOQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nh99LV66; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nh99LV66" Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB67BC4CED6; Fri, 10 Jan 2025 16:27:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1736526428; bh=KZEW4nhkxc3r/pWy7TVGP1x5+haiTt3SJOI/nLc24UA=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=nh99LV66Sd7DFDIC3HFT/s/WX1XS4Kouaa+FRlKEPrQl02bLY70e0ahgqaED7jDXt p4xtbPSM17TN7JBo/1uxr+fG5dERy8zsLHSSLb76T3dcpllib6SADSc+dIHq4L4OWa oxjPgtjDagdSD6gccU0RBHjtfD9okWrGKw6Oga5GHgllCZy7K2mtwP1ryFt0p+82QM JnoTutQcIv9B1cQBb9ZIrC0d6ls3+slYvZSP1oa+IFkcxLw5ZcA/uIahG0NRNKk640 YcT8t5bqRt7l4xBVudpSwfynD2GztItYFhhvepnH0jnp54GgSlwoNbeusBsvKt3RJA zuAL3xAzJk/ng== From: Daniel Wagner Date: Fri, 10 Jan 2025 17:26:47 +0100 Subject: [PATCH v5 9/9] doc: update managed_irq documentation Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250110-isolcpus-io-queues-v5-9-0e4f118680b0@kernel.org> References: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> In-Reply-To: <20250110-isolcpus-io-queues-v5-0-0e4f118680b0@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 The managed_irq documentation is a bit difficult to understand. Rephrase the current text and add the latest changes how managed_irq CPU sets are handled. Isolated CPUs and housekeeping CPUs are grouped into sets and the possibility of stalls if all housekeeping CPUs are offlined in a set. Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- Documentation/admin-guide/kernel-parameters.txt | 46 +++++++++++++------------ 1 file changed, 24 insertions(+), 22 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 3872bc6ec49d63772755504966ae70113f24a1db..e4bf1fc984943c1d4938dffb85d97da05010a325 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2460,28 +2460,30 @@ "number of CPUs in system - 1". managed_irq - - Isolate from being targeted by managed interrupts - which have an interrupt mask containing isolated - CPUs. The affinity of managed interrupts is - handled by the kernel and cannot be changed via - the /proc/irq/* interfaces. - - This isolation is best effort and only effective - if the automatically assigned interrupt mask of a - device queue contains isolated and housekeeping - CPUs. If housekeeping CPUs are online then such - interrupts are directed to the housekeeping CPU - so that IO submitted on the housekeeping CPU - cannot disturb the isolated CPU. - - If a queue's affinity mask contains only isolated - CPUs then this parameter has no effect on the - interrupt routing decision, though interrupts are - only delivered when tasks running on those - isolated CPUs submit IO. IO submitted on - housekeeping CPUs has no influence on those - queues. + Isolate CPUs from IRQ-related work for drivers + that support managed interrupts, ensuring no + IRQ work is scheduled on the isolated CPUs. The + kernel manages the affinity of managed + interrupts, which cannot be changed via the + /proc/irq/* interfaces. + + Since isolated CPUs do not handle IRQ work, the + work is forwarded to housekeeping CPUs. + Housekeeping and isolated CPUs are grouped into + sets, ensuring at least one housekeeping CPU is + available per set. Consequently, if all + housekeeping CPUs in a set are offlined, there + will be no CPU available to handle IRQ work for + the isolated CPUs. Therefore, users should + offline all isolated CPUs before offlining the + housekeeping CPUs in a set to avoid stalls. + + The block layer ensures that no I/O is + scheduled on isolated CPU, except when user + applications running on the isolated CPUs issue + I/O requests. In this case the I/O is issued + from the isolated CPU and the IRQ related work + is forwared to a housekeeping CPU. The format of is described above.