From patchwork Thu Jun 27 14:10:51 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13714467
From: Daniel Wagner
Date: Thu, 27 Jun 2024 16:10:51 +0200
Subject: [PATCH v2 1/3] blk-mq: add blk_mq_num_possible_queues helper
Message-Id: <20240627-isolcpus-io-queues-v2-1-26a32e3c4f75@suse.de>
References: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
In-Reply-To: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
To: Jens Axboe, Keith Busch, Sagi Grimberg, Thomas Gleixner, Christoph Hellwig
Cc: Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Sridhar Balaraman, "brookxu.cn", Ming Lei, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, Daniel Wagner

Multi queue devices which use managed IRQs should only allocate queues
for the housekeeping CPUs when isolcpus is set. This avoids disturbing
the isolated CPUs with OS workload.

Add a helper which calculates the correct number of queues to use.
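The helper's decision boils down to: prefer the weight of the housekeeping
cpumask when one is configured, otherwise fall back to the number of possible
CPUs. A minimal userspace sketch of that logic (cpumasks modeled as plain
64-bit bitmasks; the names here are illustrative stand-ins, not the kernel
API):

```c
#include <assert.h>
#include <stdint.h>

/* Toy cpumask_weight(): popcount of a 64-bit bitmask. */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int w = 0;

	while (mask) {
		w += mask & 1;
		mask >>= 1;
	}
	return w;
}

/*
 * Mirrors the proposed blk_mq_num_possible_queues(): if a housekeeping
 * mask is configured (non-empty), size the queue count by its weight;
 * otherwise fall back to the number of possible CPUs.
 */
static unsigned int num_possible_queues(uint64_t hk_mask,
					unsigned int nr_possible_cpus)
{
	if (hk_mask)
		return mask_weight(hk_mask);
	return nr_possible_cpus;
}
```

For example, on an 8-CPU machine booted with isolcpus=managed_irq,2-7 the
housekeeping mask covers CPUs 0-1 (0x3), so only two queues are allocated;
without isolcpus the mask is empty and all 8 possible CPUs get a queue.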
Signed-off-by: Daniel Wagner
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Sagi Grimberg
---
 block/blk-mq-cpumap.c  | 20 ++++++++++++++++++++
 include/linux/blk-mq.h |  1 +
 2 files changed, 21 insertions(+)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 9638b25fd521..9717e323f308 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -11,10 +11,30 @@
 #include
 #include
 #include
+#include <linux/sched/isolation.h>

 #include "blk.h"
 #include "blk-mq.h"

+/**
+ * blk_mq_num_possible_queues - Calc nr of queues for managed devices
+ *
+ * Calculate the number of queues which should be used for a multiqueue
+ * device which uses the managed IRQ API. The helper takes the isolcpus
+ * settings into account.
+ */
+unsigned int blk_mq_num_possible_queues(void)
+{
+	const struct cpumask *hk_mask;
+
+	hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+	if (!cpumask_empty(hk_mask))
+		return cpumask_weight(hk_mask);
+
+	return num_possible_cpus();
+}
+EXPORT_SYMBOL_GPL(blk_mq_num_possible_queues);
+
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
 	const struct cpumask *masks;

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 89ba6b16fe8b..2105cc78ca67 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -900,6 +900,7 @@ void blk_mq_freeze_queue_wait(struct request_queue *q);
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout);

+unsigned int blk_mq_num_possible_queues(void);
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap);
 void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		int nr_hw_queues);

From patchwork Thu Jun 27 14:10:52 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13714468
From: Daniel Wagner
Date: Thu, 27 Jun 2024 16:10:52 +0200
Subject: [PATCH v2 2/3] nvme-pci: limit queue count to housekeeping CPUs
Message-Id: <20240627-isolcpus-io-queues-v2-2-26a32e3c4f75@suse.de>
References: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
In-Reply-To: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
To: Jens Axboe, Keith Busch, Sagi Grimberg, Thomas Gleixner, Christoph Hellwig
Cc: Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Sridhar Balaraman, "brookxu.cn", Ming Lei, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, Daniel Wagner

When isolcpus is used, the nvme-pci driver should only allocate queues
for the housekeeping CPUs. Use the blk_mq_num_possible_queues helper,
which returns the correct number of queues for all configurations.
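The queue arithmetic this patch changes can be sketched in isolation: a
controller with the shared-tags quirk gets a single queue, everyone else gets
one queue per housekeeping-limited CPU plus the dedicated write and poll
queues. The struct below is an illustrative model of the relevant nvme_dev
fields, not the driver's actual layout:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the nvme_dev fields the calculation touches. */
struct model_dev {
	bool shared_tags_quirk;		/* stands in for NVME_QUIRK_SHARED_TAGS */
	unsigned int nr_write_queues;
	unsigned int nr_poll_queues;
};

/*
 * Mirrors nvme_max_io_queues() after this patch: shared-tag controllers
 * are limited to one queue; otherwise the housekeeping-limited queue
 * count plus the extra write and poll queues.
 */
static unsigned int max_io_queues(const struct model_dev *dev,
				  unsigned int num_possible_queues)
{
	if (dev->shared_tags_quirk)
		return 1;
	return num_possible_queues + dev->nr_write_queues +
		dev->nr_poll_queues;
}
```

With 4 housekeeping CPUs, 2 write queues and 1 poll queue this yields 7 I/O
queues; the same controller with the shared-tags quirk is capped at 1.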
Signed-off-by: Daniel Wagner
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Sagi Grimberg
---
 drivers/nvme/host/pci.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 102a9fb0c65f..193144e6d59b 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -81,7 +81,7 @@ static int io_queue_count_set(const char *val, const struct kernel_param *kp)
 	int ret;

 	ret = kstrtouint(val, 10, &n);
-	if (ret != 0 || n > num_possible_cpus())
+	if (ret != 0 || n > blk_mq_num_possible_queues())
 		return -EINVAL;
 	return param_set_uint(val, kp);
 }
@@ -2263,7 +2263,8 @@ static unsigned int nvme_max_io_queues(struct nvme_dev *dev)
 	 */
 	if (dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS)
 		return 1;
-	return num_possible_cpus() + dev->nr_write_queues + dev->nr_poll_queues;
+	return blk_mq_num_possible_queues() + dev->nr_write_queues +
+		dev->nr_poll_queues;
 }

 static int nvme_setup_io_queues(struct nvme_dev *dev)

From patchwork Thu Jun 27 14:10:53 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 13714470
From: Daniel Wagner
Date: Thu, 27 Jun 2024 16:10:53 +0200
Subject: [PATCH v2 3/3] lib/group_cpus.c: honor housekeeping config when grouping CPUs
Message-Id: <20240627-isolcpus-io-queues-v2-3-26a32e3c4f75@suse.de>
References: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
In-Reply-To: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
To: Jens Axboe, Keith Busch, Sagi Grimberg, Thomas Gleixner, Christoph Hellwig
Cc: Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Sridhar Balaraman, "brookxu.cn", Ming Lei, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, Daniel Wagner

group_cpus_evenly distributes all present CPUs into groups. This ignores
the isolcpus configuration and assigns isolated CPUs to groups as well.

Make group_cpus_evenly aware of the isolcpus configuration and use the
housekeeping CPU mask as the base for distributing the available CPUs
into groups.

Fixes: 11ea68f553e2 ("genirq, sched/isolation: Isolate from handling managed interrupts")
Signed-off-by: Daniel Wagner
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Sagi Grimberg
---
 lib/group_cpus.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 73 insertions(+), 2 deletions(-)

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index ee272c4cefcc..19fb7186f9d4 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include <linux/sched/isolation.h>

 #ifdef CONFIG_SMP

@@ -330,7 +331,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
 }

 /**
- * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * group_possible_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
  * @numgrps: number of groups
  *
  * Return: cpumask array if successful, NULL otherwise. And each element
@@ -344,7 +345,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * We guarantee in the resulted grouping that all CPUs are covered, and
  * no same CPU is assigned to multiple groups
  */
-struct cpumask *group_cpus_evenly(unsigned int numgrps)
+static struct cpumask *group_possible_cpus_evenly(unsigned int numgrps)
 {
 	unsigned int curgrp = 0, nr_present = 0, nr_others = 0;
 	cpumask_var_t *node_to_cpumask;
@@ -423,6 +424,76 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
 	}
 	return masks;
 }
+
+/**
+ * group_mask_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * @numgrps: number of groups
+ * @cpu_mask: CPUs to consider for the grouping
+ *
+ * Return: cpumask array if successful, NULL otherwise. And each element
+ * includes CPUs assigned to this group.
+ *
+ * Try to put close CPUs from viewpoint of CPU and NUMA locality into
+ * same group. Allocate present CPUs on these groups evenly.
+ */
+static struct cpumask *group_mask_cpus_evenly(unsigned int numgrps,
+					      const struct cpumask *cpu_mask)
+{
+	cpumask_var_t *node_to_cpumask;
+	cpumask_var_t nmsk;
+	int ret = -ENOMEM;
+	struct cpumask *masks = NULL;
+
+	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
+		return NULL;
+
+	node_to_cpumask = alloc_node_to_cpumask();
+	if (!node_to_cpumask)
+		goto fail_nmsk;
+
+	masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL);
+	if (!masks)
+		goto fail_node_to_cpumask;
+
+	build_node_to_cpumask(node_to_cpumask);
+
+	ret = __group_cpus_evenly(0, numgrps, node_to_cpumask, cpu_mask, nmsk,
+				  masks);
+
+fail_node_to_cpumask:
+	free_node_to_cpumask(node_to_cpumask);
+
+fail_nmsk:
+	free_cpumask_var(nmsk);
+	if (ret < 0) {
+		kfree(masks);
+		return NULL;
+	}
+	return masks;
+}
+
+/**
+ * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * @numgrps: number of groups
+ *
+ * Return: cpumask array if successful, NULL otherwise.
+ *
+ * group_possible_cpus_evenly() is used for distributing the CPUs on all
+ * possible CPUs in absence of the isolcpus command line argument.
+ * group_mask_cpus_evenly() is used when the isolcpus command line
+ * argument is used with the managed_irq option. In this case only the
+ * housekeeping CPUs are considered.
+ */
+struct cpumask *group_cpus_evenly(unsigned int numgrps)
+{
+	const struct cpumask *hk_mask;
+
+	hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+	if (!cpumask_empty(hk_mask))
+		return group_mask_cpus_evenly(numgrps, hk_mask);
+
+	return group_possible_cpus_evenly(numgrps);
+}
 #else /* CONFIG_SMP */
 struct cpumask *group_cpus_evenly(unsigned int numgrps)
 {
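With isolcpus=managed_irq set, only the housekeeping CPUs enter the
distribution above. The "evenly" invariant that __group_cpus_evenly()
maintains, group sizes differing by at most one, can be sketched in
userspace C (this toy version deliberately ignores the NUMA-locality
refinement the kernel code performs; names are illustrative):

```c
#include <assert.h>

/*
 * Spread ncpus across numgrps so that group sizes differ by at most
 * one: the first (ncpus % numgrps) groups get one extra CPU.
 */
static void split_evenly(unsigned int ncpus, unsigned int numgrps,
			 unsigned int *sizes)
{
	unsigned int base = ncpus / numgrps;
	unsigned int rem = ncpus % numgrps;

	for (unsigned int i = 0; i < numgrps; i++)
		sizes[i] = base + (i < rem ? 1 : 0);
}
```

Splitting 10 housekeeping CPUs into 4 groups gives sizes 3, 3, 2, 2; every CPU
is covered exactly once, matching the guarantee documented for
group_cpus_evenly().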