From patchwork Tue Oct 30 18:32:37 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661415
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 01/16] blk-mq: kill q->mq_map
Date: Tue, 30 Oct 2018 12:32:37 -0600
Message-Id: <20181030183252.17857-2-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

q->mq_map is just a pointer to set->mq_map, so use that directly
instead. Move the q->tag_set assignment a bit earlier, so we always
know it's valid.

Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Bart Van Assche
Signed-off-by: Jens Axboe
Reviewed-by: Sagi Grimberg
---
 block/blk-mq.c         | 13 ++++---------
 block/blk-mq.h         |  4 +++-
 include/linux/blkdev.h |  2 --
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 21e4147c4810..22d5beaab5a0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2321,7 +2321,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 	 * If the cpu isn't present, the cpu is mapped to first hctx.
 	 */
 	for_each_possible_cpu(i) {
-		hctx_idx = q->mq_map[i];
+		hctx_idx = set->mq_map[i];
 		/* unmapped hw queue can be remapped after CPU topo changed */
 		if (!set->tags[hctx_idx] &&
 		    !__blk_mq_alloc_rq_map(set, hctx_idx)) {
@@ -2331,7 +2331,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 			 * case, remap the current ctx to hctx[0] which
 			 * is guaranteed to always have tags allocated
 			 */
-			q->mq_map[i] = 0;
+			set->mq_map[i] = 0;
 		}
 
 		ctx = per_cpu_ptr(q->queue_ctx, i);
@@ -2429,8 +2429,6 @@ static void blk_mq_del_queue_tag_set(struct request_queue *q)
 static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
 				     struct request_queue *q)
 {
-	q->tag_set = set;
-
 	mutex_lock(&set->tag_list_lock);
 
 	/*
@@ -2467,8 +2465,6 @@ void blk_mq_release(struct request_queue *q)
 		kobject_put(&hctx->kobj);
 	}
 
-	q->mq_map = NULL;
-
 	kfree(q->queue_hw_ctx);
 
 	/*
@@ -2588,7 +2584,7 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 		int node;
 		struct blk_mq_hw_ctx *hctx;
 
-		node = blk_mq_hw_queue_to_node(q->mq_map, i);
+		node = blk_mq_hw_queue_to_node(set->mq_map, i);
 		/*
 		 * If the hw queue has been mapped to another numa node,
 		 * we need to realloc the hctx. If allocation fails, fallback
@@ -2665,8 +2661,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	if (!q->queue_hw_ctx)
 		goto err_percpu;
 
-	q->mq_map = set->mq_map;
-
 	blk_mq_realloc_hw_ctxs(set, q);
 	if (!q->nr_hw_queues)
 		goto err_hctxs;
@@ -2675,6 +2669,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ);
 
 	q->nr_queues = nr_cpu_ids;
+	q->tag_set = set;
 
 	q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9497b47e2526..9536be06d022 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -75,7 +75,9 @@ extern int blk_mq_hw_queue_to_node(unsigned int *map, unsigned int);
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 						     int cpu)
 {
-	return q->queue_hw_ctx[q->mq_map[cpu]];
+	struct blk_mq_tag_set *set = q->tag_set;
+
+	return q->queue_hw_ctx[set->mq_map[cpu]];
 }
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c675e2b5af62..4223ae2d2198 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -412,8 +412,6 @@ struct request_queue {
 
 	const struct blk_mq_ops *mq_ops;
 
-	unsigned int *mq_map;
-
 	/* sw queues */
 	struct blk_mq_ctx __percpu *queue_ctx;
 	unsigned int nr_queues;
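The net effect is that the per-CPU map now lives only in the tag set, and
every queue reaches it through q->tag_set. As a compilable toy model of that
indirection (illustrative only; the structs here merely mimic the kernel's,
and the 4-CPU map values are made up):

    #include <stdio.h>

    /* Simplified stand-ins for the kernel structures touched above. */
    struct tag_set { unsigned int mq_map[4]; };   /* one map, owned here  */
    struct queue   { struct tag_set *tag_set; };  /* q->mq_map is gone    */

    /* Mirrors the reworked blk_mq_map_queue(): resolve via the tag set. */
    static unsigned int map_queue(struct queue *q, int cpu)
    {
            return q->tag_set->mq_map[cpu];
    }

    int main(void)
    {
            struct tag_set set = { .mq_map = { 0, 0, 1, 1 } };
            struct queue q = { .tag_set = &set }; /* assigned early, as in the patch */

            for (int cpu = 0; cpu < 4; cpu++)
                    printf("cpu %d -> hw queue %u\n", cpu, map_queue(&q, cpu));
            return 0;
    }

Assigning q->tag_set before the hardware contexts are realloc'ed is what makes
the pointer safe to chase from every lookup path.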
From patchwork Tue Oct 30 18:32:38 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661413
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 02/16] blk-mq: abstract out queue map
Date: Tue, 30 Oct 2018 12:32:38 -0600
Message-Id: <20181030183252.17857-3-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

This is in preparation for allowing multiple sets of maps per
queue, if so desired.

Reviewed-by: Hannes Reinecke
Reviewed-by: Bart Van Assche
Signed-off-by: Jens Axboe
---
 block/blk-mq-cpumap.c                 | 10 ++++----
 block/blk-mq-pci.c                    | 10 ++++----
 block/blk-mq-rdma.c                   |  4 ++--
 block/blk-mq-virtio.c                 |  8 +++----
 block/blk-mq.c                        | 34 ++++++++++++++-------------
 block/blk-mq.h                        |  8 +++----
 drivers/block/virtio_blk.c            |  2 +-
 drivers/nvme/host/pci.c               |  2 +-
 drivers/scsi/qla2xxx/qla_os.c         |  5 ++--
 drivers/scsi/scsi_lib.c               |  2 +-
 drivers/scsi/smartpqi/smartpqi_init.c |  3 ++-
 drivers/scsi/virtio_scsi.c            |  3 ++-
 include/linux/blk-mq-pci.h            |  4 ++--
 include/linux/blk-mq-virtio.h         |  4 ++--
 include/linux/blk-mq.h                | 15 +++++++++---
 15 files changed, 64 insertions(+), 50 deletions(-)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 3eb169f15842..6e6686c55984 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -30,10 +30,10 @@ static int get_first_sibling(unsigned int cpu)
 	return cpu;
 }
 
-int blk_mq_map_queues(struct blk_mq_tag_set *set)
+int blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
-	unsigned int *map = set->mq_map;
-	unsigned int nr_queues = set->nr_hw_queues;
+	unsigned int *map = qmap->mq_map;
+	unsigned int nr_queues = qmap->nr_queues;
 	unsigned int cpu, first_sibling;
 
 	for_each_possible_cpu(cpu) {
@@ -62,12 +62,12 @@ EXPORT_SYMBOL_GPL(blk_mq_map_queues);
  * We have no quick way of doing reverse lookups. This is only used at
  * queue init time, so runtime isn't important.
  */
-int blk_mq_hw_queue_to_node(unsigned int *mq_map, unsigned int index)
+int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int index)
 {
 	int i;
 
 	for_each_possible_cpu(i) {
-		if (index == mq_map[i])
+		if (index == qmap->mq_map[i])
 			return local_memory_node(cpu_to_node(i));
 	}
 
diff --git a/block/blk-mq-pci.c b/block/blk-mq-pci.c
index db644ec624f5..40333d60a850 100644
--- a/block/blk-mq-pci.c
+++ b/block/blk-mq-pci.c
@@ -31,26 +31,26 @@
  * that maps a queue to the CPUs that have irq affinity for the corresponding
 * vector.
 */
-int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev,
+int blk_mq_pci_map_queues(struct blk_mq_queue_map *qmap, struct pci_dev *pdev,
 			  int offset)
 {
 	const struct cpumask *mask;
 	unsigned int queue, cpu;
 
-	for (queue = 0; queue < set->nr_hw_queues; queue++) {
+	for (queue = 0; queue < qmap->nr_queues; queue++) {
 		mask = pci_irq_get_affinity(pdev, queue + offset);
 		if (!mask)
 			goto fallback;
 
 		for_each_cpu(cpu, mask)
-			set->mq_map[cpu] = queue;
+			qmap->mq_map[cpu] = queue;
 	}
 
 	return 0;
 
 fallback:
-	WARN_ON_ONCE(set->nr_hw_queues > 1);
-	blk_mq_clear_mq_map(set);
+	WARN_ON_ONCE(qmap->nr_queues > 1);
+	blk_mq_clear_mq_map(qmap);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(blk_mq_pci_map_queues);
diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
index 996167f1de18..a71576aff3a5 100644
--- a/block/blk-mq-rdma.c
+++ b/block/blk-mq-rdma.c
@@ -41,12 +41,12 @@ int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
 			goto fallback;
 
 		for_each_cpu(cpu, mask)
-			set->mq_map[cpu] = queue;
+			set->map[0].mq_map[cpu] = queue;
 	}
 
 	return 0;
 
 fallback:
-	return blk_mq_map_queues(set);
+	return blk_mq_map_queues(&set->map[0]);
 }
 EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues);
diff --git a/block/blk-mq-virtio.c b/block/blk-mq-virtio.c
index c3afbca11299..661fbfef480f 100644
--- a/block/blk-mq-virtio.c
+++ b/block/blk-mq-virtio.c
@@ -29,7 +29,7 @@
  * that maps a queue to the CPUs that have irq affinity for the corresponding
  * vector.
  */
-int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set,
+int blk_mq_virtio_map_queues(struct blk_mq_queue_map *qmap,
 		struct virtio_device *vdev, int first_vec)
 {
 	const struct cpumask *mask;
@@ -38,17 +38,17 @@ int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set,
 	if (!vdev->config->get_vq_affinity)
 		goto fallback;
 
-	for (queue = 0; queue < set->nr_hw_queues; queue++) {
+	for (queue = 0; queue < qmap->nr_queues; queue++) {
 		mask = vdev->config->get_vq_affinity(vdev, first_vec + queue);
 		if (!mask)
 			goto fallback;
 
 		for_each_cpu(cpu, mask)
-			set->mq_map[cpu] = queue;
+			qmap->mq_map[cpu] = queue;
 	}
 
 	return 0;
 fallback:
-	return blk_mq_map_queues(set);
+	return blk_mq_map_queues(qmap);
 }
 EXPORT_SYMBOL_GPL(blk_mq_virtio_map_queues);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22d5beaab5a0..9f149429cfbd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1974,7 +1974,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 	struct blk_mq_tags *tags;
 	int node;
 
-	node = blk_mq_hw_queue_to_node(set->mq_map, hctx_idx);
+	node = blk_mq_hw_queue_to_node(&set->map[0], hctx_idx);
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 
@@ -2030,7 +2030,7 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 	size_t rq_size, left;
 	int node;
 
-	node = blk_mq_hw_queue_to_node(set->mq_map, hctx_idx);
+	node = blk_mq_hw_queue_to_node(&set->map[0], hctx_idx);
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 
@@ -2321,7 +2321,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 	 * If the cpu isn't present, the cpu is mapped to first hctx.
 	 */
 	for_each_possible_cpu(i) {
-		hctx_idx = set->mq_map[i];
+		hctx_idx = set->map[0].mq_map[i];
 		/* unmapped hw queue can be remapped after CPU topo changed */
 		if (!set->tags[hctx_idx] &&
 		    !__blk_mq_alloc_rq_map(set, hctx_idx)) {
@@ -2331,7 +2331,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 			 * case, remap the current ctx to hctx[0] which
 			 * is guaranteed to always have tags allocated
 			 */
-			set->mq_map[i] = 0;
+			set->map[0].mq_map[i] = 0;
 		}
 
 		ctx = per_cpu_ptr(q->queue_ctx, i);
@@ -2584,7 +2584,7 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 		int node;
 		struct blk_mq_hw_ctx *hctx;
 
-		node = blk_mq_hw_queue_to_node(set->mq_map, i);
+		node = blk_mq_hw_queue_to_node(&set->map[0], i);
 		/*
 		 * If the hw queue has been mapped to another numa node,
 		 * we need to realloc the hctx. If allocation fails, fallback
@@ -2793,18 +2793,18 @@ static int blk_mq_update_queue_map(struct blk_mq_tag_set *set)
 		 * for (queue = 0; queue < set->nr_hw_queues; queue++) {
 		 * 	mask = get_cpu_mask(queue)
 		 * 	for_each_cpu(cpu, mask)
-		 * 		set->mq_map[cpu] = queue;
+		 * 		set->map.mq_map[cpu] = queue;
 		 * }
 		 *
 		 * When we need to remap, the table has to be cleared for
 		 * killing stale mapping since one CPU may not be mapped
 		 * to any hw queue.
 		 */
-		blk_mq_clear_mq_map(set);
+		blk_mq_clear_mq_map(&set->map[0]);
 
 		return set->ops->map_queues(set);
 	} else
-		return blk_mq_map_queues(set);
+		return blk_mq_map_queues(&set->map[0]);
 }
 
 /*
@@ -2859,10 +2859,12 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 		return -ENOMEM;
 
 	ret = -ENOMEM;
-	set->mq_map = kcalloc_node(nr_cpu_ids, sizeof(*set->mq_map),
-				   GFP_KERNEL, set->numa_node);
-	if (!set->mq_map)
+	set->map[0].mq_map = kcalloc_node(nr_cpu_ids,
+					  sizeof(*set->map[0].mq_map),
+					  GFP_KERNEL, set->numa_node);
+	if (!set->map[0].mq_map)
 		goto out_free_tags;
+	set->map[0].nr_queues = set->nr_hw_queues;
 
 	ret = blk_mq_update_queue_map(set);
 	if (ret)
@@ -2878,8 +2880,8 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 	return 0;
 
 out_free_mq_map:
-	kfree(set->mq_map);
-	set->mq_map = NULL;
+	kfree(set->map[0].mq_map);
+	set->map[0].mq_map = NULL;
 out_free_tags:
 	kfree(set->tags);
 	set->tags = NULL;
@@ -2894,8 +2896,8 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set)
 	for (i = 0; i < nr_cpu_ids; i++)
 		blk_mq_free_map_and_requests(set, i);
 
-	kfree(set->mq_map);
-	set->mq_map = NULL;
+	kfree(set->map[0].mq_map);
+	set->map[0].mq_map = NULL;
 
 	kfree(set->tags);
 	set->tags = NULL;
@@ -3056,7 +3058,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 			pr_warn("Increasing nr_hw_queues to %d fails, fallback to %d\n",
 					nr_hw_queues, prev_nr_hw_queues);
 			set->nr_hw_queues = prev_nr_hw_queues;
-			blk_mq_map_queues(set);
+			blk_mq_map_queues(&set->map[0]);
 			goto fallback;
 		}
 		blk_mq_map_swqueue(q);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9536be06d022..889f0069dd80 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -70,14 +70,14 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 /*
  * CPU -> queue mappings
  */
-extern int blk_mq_hw_queue_to_node(unsigned int *map, unsigned int);
+extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int);
 
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
 						     int cpu)
 {
 	struct blk_mq_tag_set *set = q->tag_set;
 
-	return q->queue_hw_ctx[set->mq_map[cpu]];
+	return q->queue_hw_ctx[set->map[0].mq_map[cpu]];
 }
 
 /*
@@ -206,12 +206,12 @@ static inline void blk_mq_put_driver_tag(struct request *rq)
 	__blk_mq_put_driver_tag(hctx, rq);
 }
 
-static inline void blk_mq_clear_mq_map(struct blk_mq_tag_set *set)
+static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu)
-		set->mq_map[cpu] = 0;
+		qmap->mq_map[cpu] = 0;
 }
 
 #endif
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 086c6bb12baa..6e869d05f91e 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -624,7 +624,7 @@ static int virtblk_map_queues(struct blk_mq_tag_set *set)
 {
 	struct virtio_blk *vblk = set->driver_data;
 
-	return blk_mq_virtio_map_queues(set, vblk->vdev, 0);
+	return blk_mq_virtio_map_queues(&set->map[0], vblk->vdev, 0);
 }
 
 #ifdef CONFIG_VIRTIO_BLK_SCSI
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f30031945ee4..e5d783cb6937 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -435,7 +435,7 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 {
 	struct nvme_dev *dev = set->driver_data;
 
-	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
+	return blk_mq_pci_map_queues(&set->map[0], to_pci_dev(dev->dev),
 			dev->num_vecs > 1 ? 1 /* admin queue */ : 0);
 }
 
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 3e2665c66bc4..ca9ac124f218 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -6934,11 +6934,12 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost)
 {
 	int rc;
 	scsi_qla_host_t *vha = (scsi_qla_host_t *)shost->hostdata;
+	struct blk_mq_queue_map *qmap = &shost->tag_set.map[0];
 
 	if (USER_CTRL_IRQ(vha->hw))
-		rc = blk_mq_map_queues(&shost->tag_set);
+		rc = blk_mq_map_queues(qmap);
 	else
-		rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev, 0);
+		rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, 0);
 	return rc;
 }
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 651be30ba96a..ed81b8e74cfe 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1812,7 +1812,7 @@ static int scsi_map_queues(struct blk_mq_tag_set *set)
 
 	if (shost->hostt->map_queues)
 		return shost->hostt->map_queues(shost);
-	return blk_mq_map_queues(set);
+	return blk_mq_map_queues(&set->map[0]);
 }
 
 void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index a25a07a0b7f0..bac084260d80 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5319,7 +5319,8 @@ static int pqi_map_queues(struct Scsi_Host *shost)
 {
 	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
 
-	return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev, 0);
+	return blk_mq_pci_map_queues(&shost->tag_set.map[0],
+					ctrl_info->pci_dev, 0);
 }
 
 static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info,
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 1c72db94270e..c3c95b314286 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -719,8 +719,9 @@ static void virtscsi_target_destroy(struct scsi_target *starget)
 static int virtscsi_map_queues(struct Scsi_Host *shost)
 {
 	struct virtio_scsi *vscsi = shost_priv(shost);
+	struct blk_mq_queue_map *qmap = &shost->tag_set.map[0];
 
-	return blk_mq_virtio_map_queues(&shost->tag_set, vscsi->vdev, 2);
+	return blk_mq_virtio_map_queues(qmap, vscsi->vdev, 2);
 }
 
 /*
diff --git a/include/linux/blk-mq-pci.h b/include/linux/blk-mq-pci.h
index 9f4c17f0d2d8..0b1f45c62623 100644
--- a/include/linux/blk-mq-pci.h
+++ b/include/linux/blk-mq-pci.h
@@ -2,10 +2,10 @@
 #ifndef _LINUX_BLK_MQ_PCI_H
 #define _LINUX_BLK_MQ_PCI_H
 
-struct blk_mq_tag_set;
+struct blk_mq_queue_map;
 struct pci_dev;
 
-int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev,
+int blk_mq_pci_map_queues(struct blk_mq_queue_map *qmap, struct pci_dev *pdev,
 			  int offset);
 
 #endif /* _LINUX_BLK_MQ_PCI_H */
diff --git a/include/linux/blk-mq-virtio.h b/include/linux/blk-mq-virtio.h
index 69b4da262c45..687ae287e1dc 100644
--- a/include/linux/blk-mq-virtio.h
+++ b/include/linux/blk-mq-virtio.h
@@ -2,10 +2,10 @@
 #ifndef _LINUX_BLK_MQ_VIRTIO_H
 #define _LINUX_BLK_MQ_VIRTIO_H
 
-struct blk_mq_tag_set;
+struct blk_mq_queue_map;
 struct virtio_device;
 
-int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set,
+int blk_mq_virtio_map_queues(struct blk_mq_queue_map *qmap,
 		struct virtio_device *vdev, int first_vec);
 
 #endif /* _LINUX_BLK_MQ_VIRTIO_H */
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5c8418ebbfd6..da88e539601b 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -74,10 +74,19 @@ struct blk_mq_hw_ctx {
 	struct srcu_struct	srcu[0];
 };
 
+struct blk_mq_queue_map {
+	unsigned int *mq_map;
+	unsigned int nr_queues;
+};
+
+enum {
+	HCTX_MAX_TYPES = 1,
+};
+
 struct blk_mq_tag_set {
-	unsigned int		*mq_map;
+	struct blk_mq_queue_map	map[HCTX_MAX_TYPES];
 	const struct blk_mq_ops	*ops;
-	unsigned int		nr_hw_queues;
+	unsigned int		nr_hw_queues;	/* nr hw queues across maps */
 	unsigned int		queue_depth;	/* max hw supported */
 	unsigned int		reserved_tags;
 	unsigned int		cmd_size;	/* per-request extra data */
@@ -294,7 +303,7 @@ void blk_mq_freeze_queue_wait(struct request_queue *q);
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout);
 
-int blk_mq_map_queues(struct blk_mq_tag_set *set);
+int blk_mq_map_queues(struct blk_mq_queue_map *qmap);
 void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
 
 void blk_mq_quiesce_queue_nowait(struct request_queue *q);
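The essence of the patch is that mapping helpers now operate on a
(mq_map, nr_queues) pair rather than on the whole tag set. A compilable toy
model of that shape (illustrative only; the types mimic struct
blk_mq_queue_map and struct blk_mq_tag_set but are not kernel code, and the
round-robin mapping stands in for the real sibling-aware one):

    #include <stdio.h>

    #define NR_CPUS 4

    /* Mirrors struct blk_mq_queue_map: the map plus its queue count. */
    struct queue_map {
            unsigned int mq_map[NR_CPUS];
            unsigned int nr_queues;
    };

    enum { MAX_TYPES = 1 };                 /* HCTX_MAX_TYPES is 1 for now */

    struct tag_set {
            struct queue_map map[MAX_TYPES]; /* later: one map per type */
            unsigned int nr_hw_queues;
    };

    /* Stand-in for blk_mq_map_queues(): spread CPUs over nr_queues queues. */
    static void map_queues(struct queue_map *qmap)
    {
            for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
                    qmap->mq_map[cpu] = cpu % qmap->nr_queues;
    }

    int main(void)
    {
            struct tag_set set = { .nr_hw_queues = 2 };

            set.map[0].nr_queues = set.nr_hw_queues;
            map_queues(&set.map[0]);        /* callers now pass &set->map[0] */

            for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
                    printf("cpu %u -> hw queue %u\n", cpu, set.map[0].mq_map[cpu]);
            return 0;
    }

With the map carried in a self-describing struct, growing to several maps per
tag set later only means raising HCTX_MAX_TYPES, not touching every driver.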
From patchwork Tue Oct 30 18:32:39 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661385
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 03/16] blk-mq: provide dummy blk_mq_map_queue_type() helper
Date: Tue, 30 Oct 2018 12:32:39 -0600
Message-Id: <20181030183252.17857-4-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

Doesn't do anything right now, but it's needed as a prep patch
to get the interfaces right.

While in there, correct the blk_mq_map_queue() CPU type to an
unsigned int.
Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe
Reviewed-by: Sagi Grimberg
---
 block/blk-mq.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.h b/block/blk-mq.h
index 889f0069dd80..d9facfb9ca51 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -73,13 +73,20 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int);
 
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
-						     int cpu)
+						     unsigned int cpu)
 {
 	struct blk_mq_tag_set *set = q->tag_set;
 
 	return q->queue_hw_ctx[set->map[0].mq_map[cpu]];
 }
 
+static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q,
+							  unsigned int hctx_type,
+							  unsigned int cpu)
+{
+	return blk_mq_map_queue(q, cpu);
+}
+
 /*
  * sysfs helpers
  */
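The helper's shape matters more than its body at this point: lookups will
eventually key on a (type, cpu) tuple, and while HCTX_MAX_TYPES is still 1
the type argument is simply ignored. A compilable toy sketch of that calling
convention (illustrative only, not kernel code; the map values are made up):

    #include <stdio.h>

    enum { MAX_TYPES = 1, NR_CPUS = 4 };

    static unsigned int mq_map[MAX_TYPES][NR_CPUS] = {
            { 0, 0, 1, 1 },                 /* type 0: the only map for now */
    };

    /* Stand-in for blk_mq_map_queue(): type-less lookup via map type 0. */
    static unsigned int map_queue(unsigned int cpu)
    {
            return mq_map[0][cpu];
    }

    /* Stand-in for blk_mq_map_queue_type(): takes a type, ignores it. */
    static unsigned int map_queue_type(unsigned int type, unsigned int cpu)
    {
            (void)type;                     /* dummy until MAX_TYPES > 1 */
            return map_queue(cpu);
    }

    int main(void)
    {
            printf("cpu 2, type 0 -> hw queue %u\n", map_queue_type(0, 2));
            return 0;
    }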
From patchwork Tue Oct 30 18:32:40 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661411
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 04/16] blk-mq: pass in request/bio flags to queue mapping
Date: Tue, 30 Oct 2018 12:32:40 -0600
Message-Id: <20181030183252.17857-5-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

Prep patch for being able to place requests based not just on
CPU location, but also on the type of request.

Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe
---
 block/blk-flush.c      |  7 +++---
 block/blk-mq-debugfs.c |  4 +++-
 block/blk-mq-sched.c   | 16 ++++++++++----
 block/blk-mq-tag.c     |  5 +++--
 block/blk-mq.c         | 50 +++++++++++++++++++++++-------------------
 block/blk-mq.h         |  6 +++--
 block/blk.h            |  6 ++---
 7 files changed, 57 insertions(+), 37 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 9baa9a119447..7922dba81497 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -219,7 +219,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 
 	/* release the tag's ownership to the req cloned from */
 	spin_lock_irqsave(&fq->mq_flush_lock, flags);
-	hctx = blk_mq_map_queue(q, flush_rq->mq_ctx->cpu);
+	hctx = blk_mq_map_queue(q, flush_rq->cmd_flags, flush_rq->mq_ctx->cpu);
 	if (!q->elevator) {
 		blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);
 		flush_rq->tag = -1;
@@ -307,7 +307,8 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	if (!q->elevator) {
 		fq->orig_rq = first_rq;
 		flush_rq->tag = first_rq->tag;
-		hctx = blk_mq_map_queue(q, first_rq->mq_ctx->cpu);
+		hctx = blk_mq_map_queue(q, first_rq->cmd_flags,
+					first_rq->mq_ctx->cpu);
 		blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq);
 	} else {
 		flush_rq->internal_tag = first_rq->internal_tag;
@@ -330,7 +331,7 @@ static void mq_flush_data_end_io(struct request *rq, blk_status_t error)
 	unsigned long flags;
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, ctx);
 
-	hctx = blk_mq_map_queue(q, ctx->cpu);
+	hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu);
 
 	if (q->elevator) {
 		WARN_ON(rq->tag < 0);
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 9ed43a7c70b5..fac70c81b7de 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -427,8 +427,10 @@ struct show_busy_params {
 static void hctx_show_busy_rq(struct request *rq, void *data, bool reserved)
 {
 	const struct show_busy_params *params = data;
+	struct blk_mq_hw_ctx *hctx;
 
-	if (blk_mq_map_queue(rq->q, rq->mq_ctx->cpu) == params->hctx)
+	hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu);
+	if (hctx == params->hctx)
 		__blk_mq_debugfs_rq_show(params->m,
 					 list_entry_rq(&rq->queuelist));
 }
diff --git
 a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 29bfe8017a2d..8125e9393ec2 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -311,7 +311,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 {
 	struct elevator_queue *e = q->elevator;
 	struct blk_mq_ctx *ctx = blk_mq_get_ctx(q);
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx->cpu);
 	bool ret = false;
 
 	if (e && e->type->ops.mq.bio_merge) {
@@ -367,7 +367,9 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head,
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+	struct blk_mq_hw_ctx *hctx;
+
+	hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu);
 
 	/* flush rq in flush machinery need to be dispatched directly */
 	if (!(rq->rq_flags & RQF_FLUSH_SEQ) && op_is_flush(rq->cmd_flags)) {
@@ -400,9 +402,15 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 				  struct blk_mq_ctx *ctx,
 				  struct list_head *list, bool run_queue_async)
 {
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
-	struct elevator_queue *e = hctx->queue->elevator;
+	struct blk_mq_hw_ctx *hctx;
+	struct elevator_queue *e;
+	struct request *rq;
+
+	/* For list inserts, requests better be on the same hw queue */
+	rq = list_first_entry(list, struct request, queuelist);
+	hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu);
 
+	e = hctx->queue->elevator;
 	if (e && e->type->ops.mq.insert_requests)
 		e->type->ops.mq.insert_requests(hctx, list, false);
 	else {
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 4254e74c1446..478a959357f5 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -168,7 +168,8 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 		io_schedule();
 
 		data->ctx = blk_mq_get_ctx(data->q);
-		data->hctx = blk_mq_map_queue(data->q, data->ctx->cpu);
+		data->hctx = blk_mq_map_queue(data->q, data->cmd_flags,
+					      data->ctx->cpu);
 		tags = blk_mq_tags_from_data(data);
 		if (data->flags & BLK_MQ_REQ_RESERVED)
 			bt = &tags->breserved_tags;
@@ -530,7 +531,7 @@ u32 blk_mq_unique_tag(struct request *rq)
 	struct blk_mq_hw_ctx *hctx;
 	int hwq = 0;
 
-	hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+	hctx = blk_mq_map_queue(q, rq->cmd_flags, rq->mq_ctx->cpu);
 	hwq = hctx->queue_num;
 
 	return (hwq << BLK_MQ_UNIQUE_TAG_BITS) |
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9f149429cfbd..e3febb5691c4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -332,8 +332,8 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 }
 
 static struct request *blk_mq_get_request(struct request_queue *q,
-					  struct bio *bio, unsigned int op,
-					  struct blk_mq_alloc_data *data)
+					  struct bio *bio,
+					  struct blk_mq_alloc_data *data)
 {
 	struct elevator_queue *e = q->elevator;
 	struct request *rq;
@@ -347,8 +347,9 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 		put_ctx_on_error = true;
 	}
 	if (likely(!data->hctx))
-		data->hctx = blk_mq_map_queue(q, data->ctx->cpu);
-	if (op & REQ_NOWAIT)
+		data->hctx = blk_mq_map_queue(q, data->cmd_flags,
+					      data->ctx->cpu);
+	if (data->cmd_flags & REQ_NOWAIT)
 		data->flags |= BLK_MQ_REQ_NOWAIT;
 
 	if (e) {
@@ -359,9 +360,10 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 		 * dispatch list. Don't include reserved tags in the
 		 * limiting, as it isn't useful.
 		 */
-		if (!op_is_flush(op) && e->type->ops.mq.limit_depth &&
+		if (!op_is_flush(data->cmd_flags) &&
+		    e->type->ops.mq.limit_depth &&
 		    !(data->flags & BLK_MQ_REQ_RESERVED))
-			e->type->ops.mq.limit_depth(op, data);
+			e->type->ops.mq.limit_depth(data->cmd_flags, data);
 	} else {
 		blk_mq_tag_busy(data->hctx);
 	}
@@ -376,8 +378,8 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 		return NULL;
 	}
 
-	rq = blk_mq_rq_ctx_init(data, tag, op);
-	if (!op_is_flush(op)) {
+	rq = blk_mq_rq_ctx_init(data, tag, data->cmd_flags);
+	if (!op_is_flush(data->cmd_flags)) {
 		rq->elv.icq = NULL;
 		if (e && e->type->ops.mq.prepare_request) {
 			if (e->type->icq_cache && rq_ioc(bio))
@@ -394,7 +396,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
 		blk_mq_req_flags_t flags)
 {
-	struct blk_mq_alloc_data alloc_data = { .flags = flags };
+	struct blk_mq_alloc_data alloc_data = { .flags = flags, .cmd_flags = op };
 	struct request *rq;
 	int ret;
 
@@ -402,7 +404,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
 	if (ret)
 		return ERR_PTR(ret);
 
-	rq = blk_mq_get_request(q, NULL, op, &alloc_data);
+	rq = blk_mq_get_request(q, NULL, &alloc_data);
 	blk_queue_exit(q);
 
 	if (!rq)
@@ -420,7 +422,7 @@ EXPORT_SYMBOL(blk_mq_alloc_request);
 struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 	unsigned int op, blk_mq_req_flags_t flags, unsigned int hctx_idx)
 {
-	struct blk_mq_alloc_data alloc_data = { .flags = flags };
+	struct blk_mq_alloc_data alloc_data = { .flags = flags, .cmd_flags = op };
 	struct request *rq;
 	unsigned int cpu;
 	int ret;
@@ -453,7 +455,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 	cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask);
 	alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
 
-	rq = blk_mq_get_request(q, NULL, op, &alloc_data);
+	rq = blk_mq_get_request(q, NULL, &alloc_data);
 	blk_queue_exit(q);
 
 	if (!rq)
@@ -467,7 +469,7 @@ static void __blk_mq_free_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu);
 	const int sched_tag = rq->internal_tag;
 
 	blk_pm_mark_last_busy(rq);
@@ -484,7 +486,7 @@ void blk_mq_free_request(struct request *rq)
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu);
 
 	if (rq->rq_flags & RQF_ELVPRIV) {
 		if (e && e->type->ops.mq.finish_request)
@@ -976,8 +978,9 @@ bool blk_mq_get_driver_tag(struct request *rq)
 {
 	struct blk_mq_alloc_data data = {
 		.q = rq->q,
-		.hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu),
+		.hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu),
 		.flags = BLK_MQ_REQ_NOWAIT,
+		.cmd_flags = rq->cmd_flags,
 	};
 	bool shared;
 
@@ -1141,7 +1144,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 
 		rq = list_first_entry(list, struct request, queuelist);
 
-		hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu);
+		hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu);
 		if (!got_budget && !blk_mq_get_dispatch_budget(hctx))
 			break;
 
@@ -1572,7 +1575,8 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 void blk_mq_request_bypass_insert(struct request *rq, bool run_queue)
 {
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, rq->cmd_flags,
+						      ctx->cpu);
 
 	spin_lock(&hctx->lock);
 	list_add_tail(&rq->queuelist, &hctx->dispatch);
@@ -1782,7 +1786,8 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
 	int srcu_idx;
 	blk_qc_t unused_cookie;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, rq->cmd_flags,
+						      ctx->cpu);
 
 	hctx_lock(hctx, &srcu_idx);
 	ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
@@ -1816,7 +1821,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = op_is_sync(bio->bi_opf);
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
-	struct blk_mq_alloc_data data = { .flags = 0 };
+	struct blk_mq_alloc_data data = { .flags = 0, .cmd_flags = bio->bi_opf };
 	struct request *rq;
 	unsigned int request_count = 0;
 	struct blk_plug *plug;
@@ -1839,7 +1844,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 	rq_qos_throttle(q, bio, NULL);
 
-	rq = blk_mq_get_request(q, bio, bio->bi_opf, &data);
+	rq = blk_mq_get_request(q, bio, &data);
 	if (unlikely(!rq)) {
 		rq_qos_cleanup(q, bio);
 		if (bio->bi_opf & REQ_NOWAIT)
@@ -1908,6 +1913,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 		if (same_queue_rq) {
 			data.hctx = blk_mq_map_queue(q,
+					same_queue_rq->cmd_flags,
 					same_queue_rq->mq_ctx->cpu);
 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
 						  &cookie);
@@ -2262,7 +2268,7 @@ static void blk_mq_init_cpu_queues(struct request_queue *q,
 		 * Set local node, IFF we have more than one hw queue. If
 		 * not, we remain on the home node of the device
 		 */
-		hctx = blk_mq_map_queue(q, i);
+		hctx = blk_mq_map_queue_type(q, 0, i);
 		if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
 			hctx->numa_node = local_memory_node(cpu_to_node(i));
 	}
@@ -2335,7 +2341,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 		}
 
 		ctx = per_cpu_ptr(q->queue_ctx, i);
-		hctx = blk_mq_map_queue(q, i);
+		hctx = blk_mq_map_queue_type(q, 0, i);
 
 		cpumask_set_cpu(i, hctx->cpumask);
 		ctx->index_hw = hctx->nr_ctx;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index d9facfb9ca51..6a8f8b60d8ba 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -73,6 +73,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int);
 
 static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
+						     unsigned int flags,
 						     unsigned int cpu)
 {
 	struct blk_mq_tag_set *set = q->tag_set;
@@ -84,7 +85,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q,
 							  unsigned int hctx_type,
 							  unsigned int cpu)
 {
-	return blk_mq_map_queue(q, cpu);
+	return blk_mq_map_queue(q, hctx_type, cpu);
 }
 
 /*
@@ -135,6 +136,7 @@ struct blk_mq_alloc_data {
 	struct request_queue *q;
 	blk_mq_req_flags_t flags;
 	unsigned int shallow_depth;
+	unsigned int cmd_flags;
 
 	/* input & output parameter */
 	struct blk_mq_ctx *ctx;
@@ -209,7 +211,7 @@ static inline void blk_mq_put_driver_tag(struct request *rq)
 	if (rq->tag == -1 || rq->internal_tag == -1)
 		return;
 
-	hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu);
+	hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu);
 	__blk_mq_put_driver_tag(hctx, rq);
 }
 
diff --git a/block/blk.h b/block/blk.h
index 2bf1cfeeb9c0..78ae94886acf 100644
--- a/block/blk.h
+++
 b/block/blk.h
@@ -104,10 +104,10 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 	__clear_bit(flag, &q->queue_flags);
 }
 
-static inline struct blk_flush_queue *blk_get_flush_queue(
-		struct request_queue *q, struct blk_mq_ctx *ctx)
+static inline struct blk_flush_queue *
+blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
 {
-	return blk_mq_map_queue(q, ctx->cpu)->fq;
+	return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx->cpu)->fq;
 }
 
 static inline void __blk_get_queue(struct request_queue *q)
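To summarize the interface change: every mapping lookup now receives the
request's cmd_flags (or a bio's bi_opf), even though nothing consults them
yet. A toy sketch of the new shape (illustrative only; the flag bit and map
values are made up, not the kernel's):

    #include <stdio.h>

    #define TOY_REQ_FLUSH (1u << 0)         /* toy flag bit, not the kernel value */

    enum { NR_CPUS = 4 };
    static unsigned int mq_map[NR_CPUS] = { 0, 0, 1, 1 };

    /*
     * Mirrors the new blk_mq_map_queue(q, flags, cpu) shape: the flags
     * travel with every lookup, but in this patch they are carried and
     * not yet used to pick a different map.
     */
    static unsigned int map_queue(unsigned int flags, unsigned int cpu)
    {
            (void)flags;                    /* future: select a map by flags */
            return mq_map[cpu];
    }

    int main(void)
    {
            printf("flush on cpu 3 -> hw queue %u\n",
                   map_queue(TOY_REQ_FLUSH, 3));
            return 0;
    }

Threading the flags through now means later patches can route, say, reads and
writes to different hardware queues without touching every call site again.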
From patchwork Tue Oct 30 18:32:41 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661409
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 05/16] blk-mq: allow software queue to map to multiple hardware
 queues
Date: Tue, 30 Oct 2018 12:32:41 -0600
Message-Id: <20181030183252.17857-6-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

The mapping used to be dependent on just the CPU location, but
now it's a tuple of (type, cpu) instead. This is a prep patch for
allowing a single software queue to map to multiple hardware queues.
No functional changes in this patch.

Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe
---
 block/blk-mq-sched.c   |  2 +-
 block/blk-mq.c         | 22 ++++++++++++++++------
 block/blk-mq.h         |  2 +-
 block/kyber-iosched.c  |  6 +++---
 include/linux/blk-mq.h |  3 ++-
 5 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 8125e9393ec2..d232ecf3290c 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -110,7 +110,7 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
 					  struct blk_mq_ctx *ctx)
 {
-	unsigned idx = ctx->index_hw;
+	unsigned short idx = ctx->index_hw[hctx->type];
 
 	if (++idx == hctx->nr_ctx)
 		idx = 0;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3febb5691c4..34afbad0ebf6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -75,14 +75,18 @@ static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx)
 static void blk_mq_hctx_mark_pending(struct blk_mq_hw_ctx *hctx,
 				     struct blk_mq_ctx *ctx)
 {
-	if (!sbitmap_test_bit(&hctx->ctx_map, ctx->index_hw))
-		sbitmap_set_bit(&hctx->ctx_map, ctx->index_hw);
+	const int bit = ctx->index_hw[hctx->type];
+
+	if (!sbitmap_test_bit(&hctx->ctx_map, bit))
+		sbitmap_set_bit(&hctx->ctx_map, bit);
 }
 
 static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 				      struct blk_mq_ctx *ctx)
 {
-	sbitmap_clear_bit(&hctx->ctx_map, ctx->index_hw);
+	const int bit = ctx->index_hw[hctx->type];
+
+	sbitmap_clear_bit(&hctx->ctx_map, bit);
 }
 
 struct mq_inflight {
@@ -954,7 +958,7 @@ static bool dispatch_rq_from_ctx(struct sbitmap *sb, unsigned int bitnr,
 struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx,
 					struct blk_mq_ctx *start)
 {
-	unsigned off = start ? start->index_hw : 0;
+	unsigned off = start ? start->index_hw[hctx->type] : 0;
 	struct dispatch_rq_data data = {
 		.hctx = hctx,
 		.rq   = NULL,
@@ -2342,10 +2346,16 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 
 		ctx = per_cpu_ptr(q->queue_ctx, i);
 		hctx = blk_mq_map_queue_type(q, 0, i);
-
+		hctx->type = 0;
 		cpumask_set_cpu(i, hctx->cpumask);
-		ctx->index_hw = hctx->nr_ctx;
+		ctx->index_hw[hctx->type] = hctx->nr_ctx;
 		hctx->ctxs[hctx->nr_ctx++] = ctx;
+
+		/*
+		 * If the nr_ctx type overflows, we have exceeded the
+		 * amount of sw queues we can support.
+		 */
+		BUG_ON(!hctx->nr_ctx);
 	}
 
 	mutex_unlock(&q->sysfs_lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 6a8f8b60d8ba..1821f448f7c4 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -17,7 +17,7 @@ struct blk_mq_ctx {
 	}  ____cacheline_aligned_in_smp;
 
 	unsigned int		cpu;
-	unsigned int		index_hw;
+	unsigned short		index_hw[HCTX_MAX_TYPES];
 
 	/* incremented at dispatch time */
 	unsigned long		rq_dispatched[2];
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 728757a34fa0..b824a639d5d4 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -576,7 +576,7 @@ static bool kyber_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
 {
 	struct kyber_hctx_data *khd = hctx->sched_data;
 	struct blk_mq_ctx *ctx = blk_mq_get_ctx(hctx->queue);
-	struct kyber_ctx_queue *kcq = &khd->kcqs[ctx->index_hw];
+	struct kyber_ctx_queue *kcq = &khd->kcqs[ctx->index_hw[hctx->type]];
 	unsigned int sched_domain = kyber_sched_domain(bio->bi_opf);
 	struct list_head *rq_list = &kcq->rq_list[sched_domain];
 	bool merged;
@@ -602,7 +602,7 @@ static void kyber_insert_requests(struct blk_mq_hw_ctx *hctx,
 
 	list_for_each_entry_safe(rq, next, rq_list, queuelist) {
 		unsigned int sched_domain = kyber_sched_domain(rq->cmd_flags);
-		struct kyber_ctx_queue *kcq = &khd->kcqs[rq->mq_ctx->index_hw];
+		struct kyber_ctx_queue *kcq = &khd->kcqs[rq->mq_ctx->index_hw[hctx->type]];
 		struct list_head *head = &kcq->rq_list[sched_domain];
 
 		spin_lock(&kcq->lock);
@@ -611,7 +611,7 @@ static void kyber_insert_requests(struct blk_mq_hw_ctx *hctx,
 		else
 			list_move_tail(&rq->queuelist, head);
 		sbitmap_set_bit(&khd->kcq_map[sched_domain],
-				rq->mq_ctx->index_hw);
+				rq->mq_ctx->index_hw[hctx->type]);
 		blk_mq_sched_request_inserted(rq);
 		spin_unlock(&kcq->lock);
 	}
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index da88e539601b..466b9202b69c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -37,7 +37,8 @@ struct blk_mq_hw_ctx {
 	struct blk_mq_ctx	*dispatch_from;
 	unsigned int		dispatch_busy;
 
-	unsigned int		nr_ctx;
+	unsigned short		type;
+	unsigned short		nr_ctx;
 	struct blk_mq_ctx	**ctxs;
 
 	spinlock_t		dispatch_wait_lock;
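The key data-structure change is that a software queue's position within its
hardware queue is now recorded per type, as ctx->index_hw[hctx->type],
instead of as a single index. A compilable toy model of the swqueue-mapping
loop (illustrative only; the structs mimic blk_mq_ctx and blk_mq_hw_ctx but
are not kernel code):

    #include <stdio.h>

    enum { MAX_TYPES = 1, NR_CPUS = 4 };

    /* Mirrors struct blk_mq_ctx: one index per hardware-queue type. */
    struct sw_ctx {
            unsigned int cpu;
            unsigned short index_hw[MAX_TYPES];
    };

    struct hw_ctx {
            unsigned short type;
            unsigned short nr_ctx;
            struct sw_ctx *ctxs[NR_CPUS];
    };

    int main(void)
    {
            struct sw_ctx ctx[NR_CPUS];
            struct hw_ctx hctx = { .type = 0 };

            /* Mirrors blk_mq_map_swqueue(): record each ctx's slot, per type. */
            for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
                    ctx[cpu].cpu = cpu;
                    ctx[cpu].index_hw[hctx.type] = hctx.nr_ctx;
                    hctx.ctxs[hctx.nr_ctx++] = &ctx[cpu];
            }

            printf("cpu 2 is sw queue #%u within its type-%u hctx\n",
                   (unsigned)ctx[2].index_hw[hctx.type], (unsigned)hctx.type);
            return 0;
    }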
ESMTP id S1728213AbeJaD1o (ORCPT ); Tue, 30 Oct 2018 23:27:44 -0400 Received: by mail-it1-f193.google.com with SMTP id d6so9363304itl.4 for ; Tue, 30 Oct 2018 11:33:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=FDKGBdEnPJqmZMVEkJWhACz2yqqeabrzsupMLOYAhO4=; b=sAQt6ArmCG29q2FC2N3AEPWxQsxpU9ESDqTKtgWldkMW2zejie424VUyRdwDv7/jyk GyqYFEYz4ONd0OYnjasEn/TFB6hImfM5yLp0uRFa9aLHzTAnwfAloAjbv9XSQuHSEtpi Ao3wpVsOXf+CPjgybmvDgJUbd6fUHHKIWo/qqzzt0lK/C5TAEf693/Jnutc5CUDF3TtE oM2KFZpex+B94D1hHl+W/JBGdFdn7V9ZLHp3gW8ElWLMBn3kR57HI+LSyIaYqEtIZqAr GuZr6MvIyKcjcS2MFzlVIeh+kiOBxOFZvDMyvgkKuv62C4Y3FaavZy8s5pjh7iMzVqIL bhvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=FDKGBdEnPJqmZMVEkJWhACz2yqqeabrzsupMLOYAhO4=; b=XD5vxsU1iwmcAXdYpSohpedhu3ocAgEnImZ794OPudT3H11PcaKB0IsfcccIKg7OcX t5kA5piEXsDBWCkTpmAeSiyIrbpudXPiLRebQTWrFO28mvCJM/b0bBSiQsc7daaC5/mI FKCU+pTnJZcU5N5U01bPMxxLbHMUg98bvuNKYTDc5ti6Aa6zqfi59c7viRQ7SjCJhFwP haqqAgmWsjAeclHQpd3KfdGGGK04p+mmEYfh3w5bBSL/kL/qSjw6i6gZTbF+MA1MnS5y rdCouAtvTB4pKBw8vd7LqowoHYaGwCMpcftP7VGihwZmoswMrpuFoBSdnBPUxFl6xh3f RQfA== X-Gm-Message-State: AGRZ1gKWCf7hRYCW9I5WGYBaAVDqHpjVMwyqHXW9xqKR4Vrn0GBwFenm LOz9r1rFHNeIJ1jZz9bk1RKnqU/9jwc= X-Google-Smtp-Source: AJdET5di1MfVvmB6Hq618QWR7jsFmPGD/4Vc9jDMrIk7h9M2KWLd68t9oDSc56+f1ChmvsHRFM89Ag== X-Received: by 2002:a24:4f51:: with SMTP id c78-v6mr22872itb.56.1540924388040; Tue, 30 Oct 2018 11:33:08 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id o20-v6sm4895739itc.34.2018.10.30.11.33.06 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Oct 2018 11:33:06 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 06/16] blk-mq: add 'type' attribute to the sysfs hctx directory Date: Tue, 30 Oct 2018 12:32:42 -0600 Message-Id: <20181030183252.17857-7-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk> References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP It can be useful for a user to verify what type a given hardware queue is, expose this information in sysfs. 
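For a quick check from user space, a minimal C sketch follows. The path layout is an assumption based on the existing per-hctx mq sysfs directory (the same place nr_tags and cpu_list live); the disk name "nvme0n1" and queue number 0 are placeholders, not taken from this patch.

#include <stdio.h>

int main(void)
{
	char buf[16];
	/* Assumed layout: /sys/block/<disk>/mq/<hctx>/type */
	FILE *f = fopen("/sys/block/nvme0n1/mq/0/type", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("hctx 0 type: %s", buf);
	fclose(f);
	return 0;
}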
Reviewed-by: Hannes Reinecke Reviewed-by: Bart Van Assche Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg --- block/blk-mq-sysfs.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index aafb44224c89..2d737f9e7ba7 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -161,6 +161,11 @@ static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page) return ret; } +static ssize_t blk_mq_hw_sysfs_type_show(struct blk_mq_hw_ctx *hctx, char *page) +{ + return sprintf(page, "%u\n", hctx->type); +} + static struct attribute *default_ctx_attrs[] = { NULL, }; @@ -177,11 +182,16 @@ static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_cpus = { .attr = {.name = "cpu_list", .mode = 0444 }, .show = blk_mq_hw_sysfs_cpus_show, }; +static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_type = { + .attr = {.name = "type", .mode = 0444 }, + .show = blk_mq_hw_sysfs_type_show, +}; static struct attribute *default_hw_ctx_attrs[] = { &blk_mq_hw_sysfs_nr_tags.attr, &blk_mq_hw_sysfs_nr_reserved_tags.attr, &blk_mq_hw_sysfs_cpus.attr, + &blk_mq_hw_sysfs_type.attr, NULL, }; From patchwork Tue Oct 30 18:32:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661405 From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 07/16] blk-mq: support multiple hctx maps Date: Tue, 30 Oct 2018 12:32:43 -0600 Message-Id: <20181030183252.17857-8-axboe@kernel.dk> In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk> References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add support for the tag set carrying multiple queue maps, and for the driver to inform blk-mq how many it wishes to support through setting set->nr_maps. This adds an mq_ops helper for drivers that support more than one map, mq_ops->flags_to_type(). The helper takes the request/bio flags and returns a queue map index for them. We then use the type information in blk_mq_map_queue() to index the map set. Reviewed-by: Hannes Reinecke Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg --- block/blk-mq.c | 92 ++++++++++++++++++++++++++++-------------- block/blk-mq.h | 33 +++++++++++---- include/linux/blk-mq.h | 14 +++++++ 3 files changed, 100 insertions(+), 39 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 34afbad0ebf6..9d6e2f6f8ee9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2257,7 +2257,8 @@ static int blk_mq_init_hctx(struct request_queue *q, static void blk_mq_init_cpu_queues(struct request_queue *q, unsigned int nr_hw_queues) { - unsigned int i; + struct blk_mq_tag_set *set = q->tag_set; + unsigned int i, j; for_each_possible_cpu(i) { struct blk_mq_ctx *__ctx = per_cpu_ptr(q->queue_ctx, i); @@ -2272,9 +2273,11 @@ static void blk_mq_init_cpu_queues(struct request_queue *q, * Set local node, IFF we have more than one hw queue.
If * not, we remain on the home node of the device */ - hctx = blk_mq_map_queue_type(q, 0, i); - if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE) - hctx->numa_node = local_memory_node(cpu_to_node(i)); + for (j = 0; j < set->nr_maps; j++) { + hctx = blk_mq_map_queue_type(q, j, i); + if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE) + hctx->numa_node = local_memory_node(cpu_to_node(i)); + } } } @@ -2309,7 +2312,7 @@ static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set, static void blk_mq_map_swqueue(struct request_queue *q) { - unsigned int i, hctx_idx; + unsigned int i, j, hctx_idx; struct blk_mq_hw_ctx *hctx; struct blk_mq_ctx *ctx; struct blk_mq_tag_set *set = q->tag_set; @@ -2345,17 +2348,28 @@ static void blk_mq_map_swqueue(struct request_queue *q) } ctx = per_cpu_ptr(q->queue_ctx, i); - hctx = blk_mq_map_queue_type(q, 0, i); - hctx->type = 0; - cpumask_set_cpu(i, hctx->cpumask); - ctx->index_hw[hctx->type] = hctx->nr_ctx; - hctx->ctxs[hctx->nr_ctx++] = ctx; + for (j = 0; j < set->nr_maps; j++) { + hctx = blk_mq_map_queue_type(q, j, i); - /* - * If the nr_ctx type overflows, we have exceeded the - * amount of sw queues we can support. - */ - BUG_ON(!hctx->nr_ctx); + /* + * If the CPU is already set in the mask, then we've + * mapped this one already. This can happen if + * devices share queues across queue maps. + */ + if (cpumask_test_cpu(i, hctx->cpumask)) + continue; + + cpumask_set_cpu(i, hctx->cpumask); + hctx->type = j; + ctx->index_hw[hctx->type] = hctx->nr_ctx; + hctx->ctxs[hctx->nr_ctx++] = ctx; + + /* + * If the nr_ctx type overflows, we have exceeded the + * amount of sw queues we can support. + */ + BUG_ON(!hctx->nr_ctx); + } } mutex_unlock(&q->sysfs_lock); @@ -2523,6 +2537,7 @@ struct request_queue *blk_mq_init_sq_queue(struct blk_mq_tag_set *set, memset(set, 0, sizeof(*set)); set->ops = ops; set->nr_hw_queues = 1; + set->nr_maps = 1; set->queue_depth = queue_depth; set->numa_node = NUMA_NO_NODE; set->flags = set_flags; @@ -2802,6 +2817,8 @@ static int blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set) static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) { if (set->ops->map_queues) { + int i; + /* * transport .map_queues is usually done in the following * way: @@ -2809,18 +2826,21 @@ static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) * for (queue = 0; queue < set->nr_hw_queues; queue++) { * mask = get_cpu_mask(queue) * for_each_cpu(cpu, mask) - * set->map.mq_map[cpu] = queue; + * set->map[x].mq_map[cpu] = queue; * } * * When we need to remap, the table has to be cleared for * killing stale mapping since one CPU may not be mapped * to any hw queue. */ - blk_mq_clear_mq_map(&set->map[0]); + for (i = 0; i < set->nr_maps; i++) + blk_mq_clear_mq_map(&set->map[i]); return set->ops->map_queues(set); - } else + } else { + BUG_ON(set->nr_maps > 1); return blk_mq_map_queues(&set->map[0]); + } } /* @@ -2831,7 +2851,7 @@ static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) */ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) { - int ret; + int i, ret; BUILD_BUG_ON(BLK_MQ_MAX_DEPTH > 1 << BLK_MQ_UNIQUE_TAG_BITS); @@ -2854,6 +2874,11 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) set->queue_depth = BLK_MQ_MAX_DEPTH; } + if (!set->nr_maps) + set->nr_maps = 1; + else if (set->nr_maps > HCTX_MAX_TYPES) + return -EINVAL; + /* * If a crashdump is active, then we are potentially in a very * memory constrained environment. 
Limit us to 1 queue and @@ -2875,12 +2900,14 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) return -ENOMEM; ret = -ENOMEM; - set->map[0].mq_map = kcalloc_node(nr_cpu_ids, - sizeof(*set->map[0].mq_map), - GFP_KERNEL, set->numa_node); - if (!set->map[0].mq_map) - goto out_free_tags; - set->map[0].nr_queues = set->nr_hw_queues; + for (i = 0; i < set->nr_maps; i++) { + set->map[i].mq_map = kcalloc_node(nr_cpu_ids, + sizeof(struct blk_mq_queue_map), + GFP_KERNEL, set->numa_node); + if (!set->map[i].mq_map) + goto out_free_mq_map; + set->map[i].nr_queues = set->nr_hw_queues; + } ret = blk_mq_update_queue_map(set); if (ret) @@ -2896,9 +2923,10 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) return 0; out_free_mq_map: - kfree(set->map[0].mq_map); - set->map[0].mq_map = NULL; -out_free_tags: + for (i = 0; i < set->nr_maps; i++) { + kfree(set->map[i].mq_map); + set->map[i].mq_map = NULL; + } kfree(set->tags); set->tags = NULL; return ret; @@ -2907,13 +2935,15 @@ EXPORT_SYMBOL(blk_mq_alloc_tag_set); void blk_mq_free_tag_set(struct blk_mq_tag_set *set) { - int i; + int i, j; for (i = 0; i < nr_cpu_ids; i++) blk_mq_free_map_and_requests(set, i); - kfree(set->map[0].mq_map); - set->map[0].mq_map = NULL; + for (j = 0; j < set->nr_maps; j++) { + kfree(set->map[j].mq_map); + set->map[j].mq_map = NULL; + } kfree(set->tags); set->tags = NULL; diff --git a/block/blk-mq.h b/block/blk-mq.h index 1821f448f7c4..8329017badc8 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -72,20 +72,37 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, */ extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int); -static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, - unsigned int flags, - unsigned int cpu) +/* + * blk_mq_map_queue_type() - map (hctx_type,cpu) to hardware queue + * @q: request queue + * @hctx_type: the hctx type index + * @cpu: CPU + */ +static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q, + unsigned int hctx_type, + unsigned int cpu) { struct blk_mq_tag_set *set = q->tag_set; - return q->queue_hw_ctx[set->map[0].mq_map[cpu]]; + return q->queue_hw_ctx[set->map[hctx_type].mq_map[cpu]]; } -static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q, - unsigned int hctx_type, - unsigned int cpu) +/* + * blk_mq_map_queue() - map (cmd_flags,type) to hardware queue + * @q: request queue + * @flags: request command flags + * @cpu: CPU + */ +static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, + unsigned int flags, + unsigned int cpu) { - return blk_mq_map_queue(q, hctx_type, cpu); + int hctx_type = 0; + + if (q->mq_ops->flags_to_type) + hctx_type = q->mq_ops->flags_to_type(q, flags); + + return blk_mq_map_queue_type(q, hctx_type, cpu); } /* diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 466b9202b69c..26768c8f5af5 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -85,7 +85,14 @@ enum { }; struct blk_mq_tag_set { + /* + * map[] holds ctx -> hctx mappings, one map exists for each type + * that the driver wishes to support. There are no restrictions + * on maps being of the same size, and it's perfectly legal to + * share maps between types. 
+ */ struct blk_mq_queue_map map[HCTX_MAX_TYPES]; + unsigned int nr_maps; /* nr entries in map[] */ const struct blk_mq_ops *ops; unsigned int nr_hw_queues; /* nr hw queues across maps */ unsigned int queue_depth; /* max hw supported */ @@ -109,6 +116,8 @@ struct blk_mq_queue_data { typedef blk_status_t (queue_rq_fn)(struct blk_mq_hw_ctx *, const struct blk_mq_queue_data *); +/* takes rq->cmd_flags as input, returns a hardware type index */ +typedef int (flags_to_type_fn)(struct request_queue *, unsigned int); typedef bool (get_budget_fn)(struct blk_mq_hw_ctx *); typedef void (put_budget_fn)(struct blk_mq_hw_ctx *); typedef enum blk_eh_timer_return (timeout_fn)(struct request *, bool); @@ -133,6 +142,11 @@ struct blk_mq_ops { */ queue_rq_fn *queue_rq; + /* + * Return a queue map type for the given request/bio flags + */ + flags_to_type_fn *flags_to_type; + /* * Reserve budget before queue request, once .queue_rq is * run, it is driver's responsibility to release the From patchwork Tue Oct 30 18:32:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661387 From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 08/16] blk-mq: separate number of hardware queues from nr_cpu_ids Date: Tue, 30 Oct 2018 12:32:44 -0600 Message-Id: <20181030183252.17857-9-axboe@kernel.dk> In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk> References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP With multiple maps, nr_cpu_ids is no longer the maximum number of hardware queues we support on a given device. The initializer of the tag_set may have set ->nr_hw_queues larger than the available number of CPUs, since we can exceed that with multiple queue maps. Reviewed-by: Hannes Reinecke Reviewed-by: Bart Van Assche Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg --- block/blk-mq.c | 28 +++++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 9d6e2f6f8ee9..1ca48cf3bbc7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2668,6 +2668,19 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set, mutex_unlock(&q->sysfs_lock); } +/* + * Maximum number of hardware queues we support. For single sets, we'll never + * have more than the CPUs (software queues). For multiple sets, the tag_set + * user may have set ->nr_hw_queues larger. + */ +static unsigned int nr_hw_queues(struct blk_mq_tag_set *set) +{ + if (set->nr_maps == 1) + return nr_cpu_ids; + + return max(set->nr_hw_queues, nr_cpu_ids); +} + struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, struct request_queue *q) { @@ -2687,7 +2700,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, /* init q->mq_kobj and sw queues' kobjects */ blk_mq_sysfs_init(q); - q->queue_hw_ctx = kcalloc_node(nr_cpu_ids, sizeof(*(q->queue_hw_ctx)), + q->nr_queues = nr_hw_queues(set); + q->queue_hw_ctx = kcalloc_node(q->nr_queues, sizeof(*(q->queue_hw_ctx)), GFP_KERNEL, set->numa_node); if (!q->queue_hw_ctx) goto err_percpu; @@ -2699,7 +2713,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, INIT_WORK(&q->timeout_work, blk_mq_timeout_work); blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ); - q->nr_queues = nr_cpu_ids; q->tag_set = set; q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT; @@ -2889,12 +2902,13 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) set->queue_depth = min(64U, set->queue_depth); } /* - * There is no use for more h/w queues than cpus.
+ There is no use for more h/w queues than cpus if we just have + a single map */ - if (set->nr_hw_queues > nr_cpu_ids) + if (set->nr_maps == 1 && set->nr_hw_queues > nr_cpu_ids) set->nr_hw_queues = nr_cpu_ids; - set->tags = kcalloc_node(nr_cpu_ids, sizeof(struct blk_mq_tags *), + set->tags = kcalloc_node(nr_hw_queues(set), sizeof(struct blk_mq_tags *), GFP_KERNEL, set->numa_node); if (!set->tags) return -ENOMEM; @@ -2937,7 +2951,7 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set) { int i, j; - for (i = 0; i < nr_cpu_ids; i++) + for (i = 0; i < nr_hw_queues(set); i++) blk_mq_free_map_and_requests(set, i); for (j = 0; j < set->nr_maps; j++) { @@ -3069,7 +3083,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, lockdep_assert_held(&set->tag_list_lock); - if (nr_hw_queues > nr_cpu_ids) + if (set->nr_maps == 1 && nr_hw_queues > nr_cpu_ids) nr_hw_queues = nr_cpu_ids; if (nr_hw_queues < 1 || nr_hw_queues == set->nr_hw_queues) return; From patchwork Tue Oct 30 18:32:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661389 From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 09/16] blk-mq: cache request hardware queue mapping Date: Tue, 30 Oct 2018 12:32:45 -0600 Message-Id: <20181030183252.17857-10-axboe@kernel.dk> In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk> References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We call blk_mq_map_queue() a lot, at least twice for each request per IO, sometimes more. Since we now have an indirect call in that function as well, cache the mapping so we don't have to re-call blk_mq_map_queue() for the same request multiple times. Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg Reviewed-by: Hannes Reinecke --- block/blk-flush.c | 12 ++++-------- block/blk-mq-debugfs.c | 4 +--- block/blk-mq-sched.c | 6 ++---- block/blk-mq-tag.c | 9 +-------- block/blk-mq.c | 22 +++++++++------------- block/blk-mq.h | 5 +---- include/linux/blkdev.h | 1 + 7 files changed, 19 insertions(+), 40 deletions(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index 7922dba81497..2ff590b31a9d 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -219,7 +219,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error) /* release the tag's ownership to the req cloned from */ spin_lock_irqsave(&fq->mq_flush_lock, flags); - hctx = blk_mq_map_queue(q, flush_rq->cmd_flags, flush_rq->mq_ctx->cpu); + hctx = flush_rq->mq_hctx; if (!q->elevator) { blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq); flush_rq->tag = -1; @@ -268,7 +268,6 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq, struct request *first_rq = list_first_entry(pending, struct request, flush.list); struct request *flush_rq = fq->flush_rq; - struct blk_mq_hw_ctx *hctx; /* C1 described at the top of this file */ if (fq->flush_pending_idx != fq->flush_running_idx || list_empty(pending)) @@ -303,13 +302,12 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq, * just for cheating put/get driver tag.
*/ flush_rq->mq_ctx = first_rq->mq_ctx; + flush_rq->mq_hctx = first_rq->mq_hctx; if (!q->elevator) { fq->orig_rq = first_rq; flush_rq->tag = first_rq->tag; - hctx = blk_mq_map_queue(q, first_rq->cmd_flags, - first_rq->mq_ctx->cpu); - blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq); + blk_mq_tag_set_rq(flush_rq->mq_hctx, first_rq->tag, flush_rq); } else { flush_rq->internal_tag = first_rq->internal_tag; } @@ -326,13 +324,11 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq, static void mq_flush_data_end_io(struct request *rq, blk_status_t error) { struct request_queue *q = rq->q; - struct blk_mq_hw_ctx *hctx; + struct blk_mq_hw_ctx *hctx = rq->mq_hctx; struct blk_mq_ctx *ctx = rq->mq_ctx; unsigned long flags; struct blk_flush_queue *fq = blk_get_flush_queue(q, ctx); - hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); - if (q->elevator) { WARN_ON(rq->tag < 0); blk_mq_put_driver_tag_hctx(hctx, rq); diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index fac70c81b7de..cde19be36135 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -427,10 +427,8 @@ struct show_busy_params { static void hctx_show_busy_rq(struct request *rq, void *data, bool reserved) { const struct show_busy_params *params = data; - struct blk_mq_hw_ctx *hctx; - hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu); - if (hctx == params->hctx) + if (rq->mq_hctx == params->hctx) __blk_mq_debugfs_rq_show(params->m, list_entry_rq(&rq->queuelist)); } diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index d232ecf3290c..8bc1f37acca2 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -367,9 +367,7 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, struct request_queue *q = rq->q; struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx; - - hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); + struct blk_mq_hw_ctx *hctx = rq->mq_hctx; /* flush rq in flush machinery need to be dispatched directly */ if (!(rq->rq_flags & RQF_FLUSH_SEQ) && op_is_flush(rq->cmd_flags)) { @@ -408,7 +406,7 @@ void blk_mq_sched_insert_requests(struct request_queue *q, /* For list inserts, requests better be on the same hw queue */ rq = list_first_entry(list, struct request, queuelist); - hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); + hctx = rq->mq_hctx; e = hctx->queue->elevator; if (e && e->type->ops.mq.insert_requests) diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 478a959357f5..fb836d818b80 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -527,14 +527,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx, */ u32 blk_mq_unique_tag(struct request *rq) { - struct request_queue *q = rq->q; - struct blk_mq_hw_ctx *hctx; - int hwq = 0; - - hctx = blk_mq_map_queue(q, rq->cmd_flags, rq->mq_ctx->cpu); - hwq = hctx->queue_num; - - return (hwq << BLK_MQ_UNIQUE_TAG_BITS) | + return (rq->mq_hctx->queue_num << BLK_MQ_UNIQUE_TAG_BITS) | (rq->tag & BLK_MQ_UNIQUE_TAG_MASK); } EXPORT_SYMBOL(blk_mq_unique_tag); diff --git a/block/blk-mq.c b/block/blk-mq.c index 1ca48cf3bbc7..b86d725958d3 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -300,6 +300,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data, /* csd/requeue_work/fifo_time is initialized before use */ rq->q = data->q; rq->mq_ctx = data->ctx; + rq->mq_hctx = data->hctx; rq->rq_flags = rq_flags; rq->cpu = -1; rq->cmd_flags = op; @@ -473,10 +474,11 @@ static void __blk_mq_free_request(struct 
request *rq) { struct request_queue *q = rq->q; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); + struct blk_mq_hw_ctx *hctx = rq->mq_hctx; const int sched_tag = rq->internal_tag; blk_pm_mark_last_busy(rq); + rq->mq_hctx = NULL; if (rq->tag != -1) blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag); if (sched_tag != -1) @@ -490,7 +492,7 @@ void blk_mq_free_request(struct request *rq) struct request_queue *q = rq->q; struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); + struct blk_mq_hw_ctx *hctx = rq->mq_hctx; if (rq->rq_flags & RQF_ELVPRIV) { if (e && e->type->ops.mq.finish_request) @@ -982,7 +984,7 @@ bool blk_mq_get_driver_tag(struct request *rq) { struct blk_mq_alloc_data data = { .q = rq->q, - .hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu), + .hctx = rq->mq_hctx, .flags = BLK_MQ_REQ_NOWAIT, .cmd_flags = rq->cmd_flags, }; @@ -1148,7 +1150,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list, rq = list_first_entry(list, struct request, queuelist); - hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu); + hctx = rq->mq_hctx; if (!got_budget && !blk_mq_get_dispatch_budget(hctx)) break; @@ -1578,9 +1580,7 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, */ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue) { - struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, - ctx->cpu); + struct blk_mq_hw_ctx *hctx = rq->mq_hctx; spin_lock(&hctx->lock); list_add_tail(&rq->queuelist, &hctx->dispatch); @@ -1789,9 +1789,7 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) blk_status_t ret; int srcu_idx; blk_qc_t unused_cookie; - struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, - ctx->cpu); + struct blk_mq_hw_ctx *hctx = rq->mq_hctx; hctx_lock(hctx, &srcu_idx); ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); @@ -1916,9 +1914,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) blk_mq_put_ctx(data.ctx); if (same_queue_rq) { - data.hctx = blk_mq_map_queue(q, - same_queue_rq->cmd_flags, - same_queue_rq->mq_ctx->cpu); + data.hctx = same_queue_rq->mq_hctx; blk_mq_try_issue_directly(data.hctx, same_queue_rq, &cookie); } diff --git a/block/blk-mq.h b/block/blk-mq.h index 8329017badc8..74cb2f524824 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -223,13 +223,10 @@ static inline void blk_mq_put_driver_tag_hctx(struct blk_mq_hw_ctx *hctx, static inline void blk_mq_put_driver_tag(struct request *rq) { - struct blk_mq_hw_ctx *hctx; - if (rq->tag == -1 || rq->internal_tag == -1) return; - hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu); - __blk_mq_put_driver_tag(hctx, rq); + __blk_mq_put_driver_tag(rq->mq_hctx, rq); } static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 4223ae2d2198..7b351210ebcd 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -129,6 +129,7 @@ enum mq_rq_state { struct request { struct request_queue *q; struct blk_mq_ctx *mq_ctx; + struct blk_mq_hw_ctx *mq_hctx; int cpu; unsigned int cmd_flags; /* op and common flags */ From patchwork Tue Oct 30 18:32:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 
7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661403 From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 10/16] blk-mq: cleanup and improve list insertion Date: Tue, 30 Oct 2018 12:32:46 -0600 Message-Id: <20181030183252.17857-11-axboe@kernel.dk> In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP It's somewhat strange to have a list insertion function that relies on the fact that the caller has mapped things correctly. Pass in the hardware queue directly for insertion, which makes for a much cleaner interface and implementation. Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg Reviewed-by: Hannes Reinecke --- block/blk-mq-sched.c | 8 +------- block/blk-mq-sched.h | 2 +- block/blk-mq.c | 25 ++++++++++++++----------- 3 files changed, 16 insertions(+), 19 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 8bc1f37acca2..6e7375246e2f 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -396,17 +396,11 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, blk_mq_run_hw_queue(hctx, async); } -void blk_mq_sched_insert_requests(struct request_queue *q, +void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list, bool run_queue_async) { - struct blk_mq_hw_ctx *hctx; struct elevator_queue *e; - struct request *rq; - - /* For list inserts, requests better be on the same hw queue */ - rq = list_first_entry(list, struct request, queuelist); - hctx = rq->mq_hctx; e = hctx->queue->elevator; if (e && e->type->ops.mq.insert_requests) diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 8a9544203173..ffd7b5989d63 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -19,7 +19,7 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx); void blk_mq_sched_insert_request(struct request *rq, bool at_head, bool run_queue, bool async); -void blk_mq_sched_insert_requests(struct request_queue *q, +void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list, bool run_queue_async); diff --git a/block/blk-mq.c b/block/blk-mq.c index b86d725958d3..51b8166959b9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1623,11 +1623,12 @@ static int plug_ctx_cmp(void *priv, struct list_head *a, struct list_head *b) void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) { + struct blk_mq_hw_ctx *this_hctx; struct blk_mq_ctx *this_ctx; struct request_queue *this_q; struct request *rq; LIST_HEAD(list); - LIST_HEAD(ctx_list); + LIST_HEAD(rq_list); unsigned int depth; list_splice_init(&plug->mq_list, &list); @@ -1635,6 +1636,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) list_sort(NULL, &list, plug_ctx_cmp); this_q = NULL; + this_hctx = NULL; this_ctx = NULL; depth = 0; @@ -1642,30 +1644,31 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) rq = list_entry_rq(list.next); list_del_init(&rq->queuelist); BUG_ON(!rq->q); - if (rq->mq_ctx != this_ctx) { - if (this_ctx) { + if (rq->mq_hctx != this_hctx || rq->mq_ctx != this_ctx) { + if (this_hctx) { trace_block_unplug(this_q, depth, !from_schedule); - blk_mq_sched_insert_requests(this_q, this_ctx, - &ctx_list, + blk_mq_sched_insert_requests(this_hctx, this_ctx, + &rq_list, from_schedule); } - this_ctx = rq->mq_ctx; this_q = rq->q; + this_ctx = rq->mq_ctx; + this_hctx = rq->mq_hctx; depth = 0; } depth++; - list_add_tail(&rq->queuelist, &ctx_list); + list_add_tail(&rq->queuelist, &rq_list); } /* - * If 'this_ctx' is set, we know we have entries to complete - * on 'ctx_list'. Do those. 
+ * If 'this_hctx' is set, we know we have entries to complete + * on 'rq_list'. Do those. */ - if (this_ctx) { + if (this_hctx) { trace_block_unplug(this_q, depth, !from_schedule); - blk_mq_sched_insert_requests(this_q, this_ctx, &ctx_list, + blk_mq_sched_insert_requests(this_hctx, this_ctx, &rq_list, from_schedule); } } From patchwork Tue Oct 30 18:32:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661401 From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 11/16] blk-mq: improve plug list sorting Date: Tue, 30 Oct 2018 12:32:47 -0600 Message-Id: <20181030183252.17857-12-axboe@kernel.dk> In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk> References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently we only look at the software queue, but with support for multiple maps, we should also look at the hardware queue. This is important since we'll flush out the request list if either the software queue or the hardware queue doesn't match. This sorts by software queue first, then hardware queue if that differs. Finally we sort by request location like before. This minimizes the flush points per plug list. Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg Reviewed-by: Hannes Reinecke --- block/blk-mq.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 51b8166959b9..5a34c9374dc7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1611,14 +1611,21 @@ void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, spin_unlock(&ctx->lock); } -static int plug_ctx_cmp(void *priv, struct list_head *a, struct list_head *b) +static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b) { struct request *rqa = container_of(a, struct request, queuelist); struct request *rqb = container_of(b, struct request, queuelist); - return !(rqa->mq_ctx < rqb->mq_ctx || - (rqa->mq_ctx == rqb->mq_ctx && - blk_rq_pos(rqa) < blk_rq_pos(rqb))); + if (rqa->mq_ctx < rqb->mq_ctx) + return -1; + else if (rqa->mq_ctx > rqb->mq_ctx) + return 1; + else if (rqa->mq_hctx < rqb->mq_hctx) + return -1; + else if (rqa->mq_hctx > rqb->mq_hctx) + return 1; + + return blk_rq_pos(rqa) > blk_rq_pos(rqb); } void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) @@ -1633,7 +1640,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) list_splice_init(&plug->mq_list, &list); - list_sort(NULL, &list, plug_ctx_cmp); + list_sort(NULL, &list, plug_rq_cmp); this_q = NULL; this_hctx = NULL; From patchwork Tue Oct 30 18:32:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661399 From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 12/16] blk-mq: initial support for multiple queue maps Date: Tue, 30 Oct 2018 12:32:48 -0600 Message-Id: <20181030183252.17857-13-axboe@kernel.dk> In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk> References: <20181030183252.17857-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a queue offset to the tag map. This enables users to map iteratively, for each queue map type they support. Bump the maximum number of supported maps to 2; we're now fully able to support more than one map.
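To illustrate how a driver might use the offset, here is a sketch of a two-map setup. example_map_queues() and example_flags_to_type() are hypothetical, and the even read/write split is invented for illustration; only struct blk_mq_queue_map, blk_mq_map_queues(), and the ->flags_to_type() hook from this series are real.

/* Hypothetical .map_queues callback: lay each map's hardware queues
 * out back to back via queue_offset, then let the core spread CPUs
 * over each set. Assumes nr_hw_queues divides evenly by nr_maps.
 */
static int example_map_queues(struct blk_mq_tag_set *set)
{
	unsigned int offset = 0;
	int i;

	for (i = 0; i < set->nr_maps; i++) {
		struct blk_mq_queue_map *map = &set->map[i];

		map->nr_queues = set->nr_hw_queues / set->nr_maps;
		map->queue_offset = offset;
		offset += map->nr_queues;
		blk_mq_map_queues(map);
	}
	return 0;
}

/* Hypothetical companion ->flags_to_type(): send writes to map 1,
 * everything else to map 0. The policy is made up here.
 */
static int example_flags_to_type(struct request_queue *q, unsigned int flags)
{
	return op_is_write(flags) ? 1 : 0;
}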
Reviewed-by: Hannes Reinecke Signed-off-by: Jens Axboe Reviewed-by: Sagi Grimberg --- block/blk-mq-cpumap.c | 9 +++++---- block/blk-mq-pci.c | 2 +- block/blk-mq-virtio.c | 2 +- include/linux/blk-mq.h | 3 ++- 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 6e6686c55984..03a534820271 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -14,9 +14,10 @@ #include "blk.h" #include "blk-mq.h" -static int cpu_to_queue_index(unsigned int nr_queues, const int cpu) +static int cpu_to_queue_index(struct blk_mq_queue_map *qmap, + unsigned int nr_queues, const int cpu) { - return cpu % nr_queues; + return qmap->queue_offset + (cpu % nr_queues); } static int get_first_sibling(unsigned int cpu) @@ -44,11 +45,11 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap) * performace optimizations. */ if (cpu < nr_queues) { - map[cpu] = cpu_to_queue_index(nr_queues, cpu); + map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu); } else { first_sibling = get_first_sibling(cpu); if (first_sibling == cpu) - map[cpu] = cpu_to_queue_index(nr_queues, cpu); + map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu); else map[cpu] = map[first_sibling]; } diff --git a/block/blk-mq-pci.c b/block/blk-mq-pci.c index 40333d60a850..1dce18553984 100644 --- a/block/blk-mq-pci.c +++ b/block/blk-mq-pci.c @@ -43,7 +43,7 @@ int blk_mq_pci_map_queues(struct blk_mq_queue_map *qmap, struct pci_dev *pdev, goto fallback; for_each_cpu(cpu, mask) - qmap->mq_map[cpu] = queue; + qmap->mq_map[cpu] = qmap->queue_offset + queue; } return 0; diff --git a/block/blk-mq-virtio.c b/block/blk-mq-virtio.c index 661fbfef480f..370827163835 100644 --- a/block/blk-mq-virtio.c +++ b/block/blk-mq-virtio.c @@ -44,7 +44,7 @@ int blk_mq_virtio_map_queues(struct blk_mq_queue_map *qmap, goto fallback; for_each_cpu(cpu, mask) - qmap->mq_map[cpu] = queue; + qmap->mq_map[cpu] = qmap->queue_offset + queue; } return 0; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 26768c8f5af5..8e80d5043079 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -78,10 +78,11 @@ struct blk_mq_hw_ctx { struct blk_mq_queue_map { unsigned int *mq_map; unsigned int nr_queues; + unsigned int queue_offset; }; enum { - HCTX_MAX_TYPES = 1, + HCTX_MAX_TYPES = 2, }; struct blk_mq_tag_set { From patchwork Tue Oct 30 18:32:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10661397 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A69D14DE for ; Tue, 30 Oct 2018 18:34:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 246252986B for ; Tue, 30 Oct 2018 18:34:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 188B2299B9; Tue, 30 Oct 2018 18:34:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 923902986B for ; Tue, 30 Oct 2018 18:34:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
From patchwork Tue Oct 30 18:32:49 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661397
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Jens Axboe, Thomas Gleixner
Subject: [PATCH 13/16] irq: add support for allocating (and affinitizing) sets of IRQs
Date: Tue, 30 Oct 2018 12:32:49 -0600
Message-Id: <20181030183252.17857-14-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

A driver may need to allocate multiple sets of MSI/MSI-X interrupts, and
have them appropriately affinitized. Add support for defining a number of
sets in the irq_affinity structure, of varying sizes, and get each set
affinitized correctly across the machine.

Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Hannes Reinecke
Reviewed-by: Ming Lei
Signed-off-by: Jens Axboe
Reviewed-by: Sagi Grimberg
---
 drivers/pci/msi.c         | 14 ++++++++++++++
 include/linux/interrupt.h |  4 ++++
 kernel/irq/affinity.c     | 40 ++++++++++++++++++++++++++++++---------
 3 files changed, 49 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index af24ed50a245..e6c6e10b9ceb 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1036,6 +1036,13 @@ static int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	/*
+	 * If the caller is passing in sets, we can't support a range of
+	 * vectors. The caller needs to handle that.
+	 */
+	if (affd->nr_sets && minvec != maxvec)
+		return -EINVAL;
+
 	if (WARN_ON_ONCE(dev->msi_enabled))
 		return -EINVAL;
 
@@ -1087,6 +1094,13 @@ static int __pci_enable_msix_range(struct pci_dev *dev,
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	/*
+	 * If the caller is passing in sets, we can't support a range of
+	 * vectors. The caller needs to handle that.
+	 */
+	if (affd->nr_sets && minvec != maxvec)
+		return -EINVAL;
+
 	if (WARN_ON_ONCE(dev->msix_enabled))
 		return -EINVAL;
 
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 1d6711c28271..ca397ff40836 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -247,10 +247,14 @@ struct irq_affinity_notify {
  *			the MSI(-X) vector space
  * @post_vectors:	Don't apply affinity to @post_vectors at end of
  *			the MSI(-X) vector space
+ * @nr_sets:		Number of entries in the *sets array
+ * @sets:		Array holding the size of each affinitized set
  */
 struct irq_affinity {
 	int	pre_vectors;
 	int	post_vectors;
+	int	nr_sets;
+	int	*sets;
 };
 
 #if defined(CONFIG_SMP)
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index f4f29b9d90ee..2046a0f0f0f1 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -180,6 +180,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	int curvec, usedvecs;
 	cpumask_var_t nmsk, npresmsk, *node_to_cpumask;
 	struct cpumask *masks = NULL;
+	int i, nr_sets;
 
 	/*
 	 * If there aren't any vectors left after applying the pre/post
@@ -210,10 +211,23 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	get_online_cpus();
 	build_node_to_cpumask(node_to_cpumask);
 
-	/* Spread on present CPUs starting from affd->pre_vectors */
-	usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
-					    node_to_cpumask, cpu_present_mask,
-					    nmsk, masks);
+	/*
+	 * Spread on present CPUs starting from affd->pre_vectors. If we
+	 * have multiple sets, build each set's affinity mask separately.
+	 */
+	nr_sets = affd->nr_sets;
+	if (!nr_sets)
+		nr_sets = 1;
+
+	for (i = 0, usedvecs = 0; i < nr_sets; i++) {
+		int this_vecs = affd->sets ? affd->sets[i] : affvecs;
+		int nr;
+
+		nr = irq_build_affinity_masks(affd, curvec, this_vecs,
+					      node_to_cpumask, cpu_present_mask,
+					      nmsk, masks + usedvecs);
+		usedvecs += nr;
+	}
 
 	/*
 	 * Spread on non present CPUs starting from the next vector to be
@@ -258,13 +272,21 @@ int irq_calc_affinity_vectors(int minvec, int maxvec, const struct irq_affinity
 {
 	int resv = affd->pre_vectors + affd->post_vectors;
 	int vecs = maxvec - resv;
-	int ret;
+	int set_vecs;
 
 	if (resv > minvec)
 		return 0;
 
-	get_online_cpus();
-	ret = min_t(int, cpumask_weight(cpu_possible_mask), vecs) + resv;
-	put_online_cpus();
-	return ret;
+	if (affd->nr_sets) {
+		int i;
+
+		for (i = 0, set_vecs = 0; i < affd->nr_sets; i++)
+			set_vecs += affd->sets[i];
+	} else {
+		get_online_cpus();
+		set_vecs = cpumask_weight(cpu_possible_mask);
+		put_online_cpus();
+	}
+
+	return resv + min(set_vecs, vecs);
 }
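
As a rough illustration of how the new nr_sets/sets fields feed into
vector accounting, here is a userspace model of the
irq_calc_affinity_vectors() math above (the CPU-count stand-in and the
6/2 set sizes are hypothetical):

/* Userspace model of the set-aware vector accounting, illustrative only. */
#include <stdio.h>

static int calc_vectors(int minvec, int maxvec, int pre, int post,
			int nr_sets, const int *sets)
{
	int resv = pre + post;		/* pre/post vectors get no spreading */
	int vecs = maxvec - resv;
	int set_vecs = 0;

	if (resv > minvec)
		return 0;

	if (nr_sets) {
		for (int i = 0; i < nr_sets; i++)
			set_vecs += sets[i];	/* sum of per-set sizes */
	} else {
		set_vecs = 16;	/* stand-in for cpumask_weight(cpu_possible_mask) */
	}
	return resv + (set_vecs < vecs ? set_vecs : vecs);
}

int main(void)
{
	int sets[2] = { 6, 2 };	/* e.g. 6 read vectors, 2 write vectors */

	/* one pre vector (admin queue), no post vectors, minvec == maxvec
	 * as the msi.c change above now requires for sets */
	printf("%d\n", calc_vectors(9, 9, 1, 0, 2, sets)); /* -> 1 + min(8, 8) = 9 */
	return 0;
}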
From patchwork Tue Oct 30 18:32:50 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661393
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 14/16] nvme: utilize two queue maps, one for reads and one for writes
Date: Tue, 30 Oct 2018 12:32:50 -0600
Message-Id: <20181030183252.17857-15-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

NVMe does round-robin between queues by default, which means that sharing
a queue map for both reads and writes can be problematic in terms of read
servicing: it's much easier to flood the queues with writes and starve
read servicing. Implement two queue maps, one for reads and one for
writes. The write queue count is configurable through the 'write_queues'
parameter.

By default, we retain the previous behavior of having a single queue set,
shared between reads and writes. Setting 'write_queues' to a non-zero
value will create two queue sets, one for reads and one for writes, the
latter using the configurable number of queues (hardware queue counts
permitting).

Reviewed-by: Hannes Reinecke
Reviewed-by: Keith Busch
Signed-off-by: Jens Axboe
---
 drivers/nvme/host/pci.c | 174 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 162 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e5d783cb6937..17170686105f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -74,11 +74,29 @@ static int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
 MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
 
+static int queue_count_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops queue_count_ops = {
+	.set = queue_count_set,
+	.get = param_get_int,
+};
+
+static int write_queues;
+module_param_cb(write_queues, &queue_count_ops, &write_queues, 0644);
+MODULE_PARM_DESC(write_queues,
+	"Number of queues to use for writes. If not set, reads and writes "
+	"will share a queue set.");
+
 struct nvme_dev;
 struct nvme_queue;
 
 static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
 
+enum {
+	NVMEQ_TYPE_READ,
+	NVMEQ_TYPE_WRITE,
+	NVMEQ_TYPE_NR,
+};
+
 /*
  * Represents an NVM Express device.  Each nvme_dev is a PCI function.
 */
@@ -92,6 +110,7 @@ struct nvme_dev {
 	struct dma_pool *prp_small_pool;
 	unsigned online_queues;
 	unsigned max_qid;
+	unsigned io_queues[NVMEQ_TYPE_NR];
 	unsigned int num_vecs;
 	int q_depth;
 	u32 db_stride;
@@ -134,6 +153,17 @@ static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
 	return param_set_int(val, kp);
 }
 
+static int queue_count_set(const char *val, const struct kernel_param *kp)
+{
+	int n = 0, ret;
+
+	ret = kstrtoint(val, 10, &n);
+	if (n > num_possible_cpus())
+		n = num_possible_cpus();
+
+	return param_set_int(val, kp);
+}
+
 static inline unsigned int sq_idx(unsigned int qid, u32 stride)
 {
 	return qid * 2 * stride;
@@ -218,9 +248,20 @@ static inline void _nvme_check_size(void)
 	BUILD_BUG_ON(sizeof(struct nvme_dbbuf) != 64);
 }
 
+static unsigned int max_io_queues(void)
+{
+	return num_possible_cpus() + write_queues;
+}
+
+static unsigned int max_queue_count(void)
+{
+	/* IO queues + admin queue */
+	return 1 + max_io_queues();
+}
+
 static inline unsigned int nvme_dbbuf_size(u32 stride)
 {
-	return ((num_possible_cpus() + 1) * 8 * stride);
+	return (max_queue_count() * 8 * stride);
 }
 
 static int nvme_dbbuf_dma_alloc(struct nvme_dev *dev)
@@ -431,12 +472,41 @@ static int nvme_init_request(struct blk_mq_tag_set *set, struct request *req,
 	return 0;
 }
 
+static int queue_irq_offset(struct nvme_dev *dev)
+{
+	/* if we have more than 1 vec, admin queue offsets us by 1 */
+	if (dev->num_vecs > 1)
+		return 1;
+
+	return 0;
+}
+
 static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 {
 	struct nvme_dev *dev = set->driver_data;
+	int i, qoff, offset;
+
+	offset = queue_irq_offset(dev);
+	for (i = 0, qoff = 0; i < set->nr_maps; i++) {
+		struct blk_mq_queue_map *map = &set->map[i];
 
-	return blk_mq_pci_map_queues(&set->map[0], to_pci_dev(dev->dev),
-				     dev->num_vecs > 1 ? 1 /* admin queue */ : 0);
+		map->nr_queues = dev->io_queues[i];
+		if (!map->nr_queues) {
+			BUG_ON(i == NVMEQ_TYPE_READ);
+
+			/* shared set, reuse read set parameters */
+			map->nr_queues = dev->io_queues[NVMEQ_TYPE_READ];
+			qoff = 0;
+			offset = queue_irq_offset(dev);
+		}
+
+		map->queue_offset = qoff;
+		blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
+		qoff += map->nr_queues;
+		offset += map->nr_queues;
+	}
+
+	return 0;
 }
 
 /**
@@ -849,6 +919,14 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
+static int nvme_flags_to_type(struct request_queue *q, unsigned int flags)
+{
+	if ((flags & REQ_OP_MASK) == REQ_OP_READ)
+		return NVMEQ_TYPE_READ;
+
+	return NVMEQ_TYPE_WRITE;
+}
+
 static void nvme_pci_complete_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -1476,6 +1554,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
+	.flags_to_type	= nvme_flags_to_type,
 	.complete	= nvme_pci_complete_rq,
 	.init_hctx	= nvme_init_hctx,
 	.init_request	= nvme_init_request,
@@ -1888,18 +1967,53 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 	return ret;
 }
 
+static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int nr_io_queues)
+{
+	unsigned int this_w_queues = write_queues;
+
+	/*
+	 * Setup read/write queue split
+	 */
+	if (nr_io_queues == 1) {
+		dev->io_queues[NVMEQ_TYPE_READ] = 1;
+		dev->io_queues[NVMEQ_TYPE_WRITE] = 0;
+		return;
+	}
+
+	/*
+	 * If 'write_queues' is set, ensure it leaves room for at least
+	 * one read queue
+	 */
+	if (this_w_queues >= nr_io_queues)
+		this_w_queues = nr_io_queues - 1;
+
+	/*
+	 * If 'write_queues' is set to zero, reads and writes will share
+	 * a queue set.
+	 */
+	if (!this_w_queues) {
+		dev->io_queues[NVMEQ_TYPE_WRITE] = 0;
+		dev->io_queues[NVMEQ_TYPE_READ] = nr_io_queues;
+	} else {
+		dev->io_queues[NVMEQ_TYPE_WRITE] = this_w_queues;
+		dev->io_queues[NVMEQ_TYPE_READ] = nr_io_queues - this_w_queues;
+	}
+}
+
 static int nvme_setup_io_queues(struct nvme_dev *dev)
 {
 	struct nvme_queue *adminq = &dev->queues[0];
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int result, nr_io_queues;
 	unsigned long size;
-
+	int irq_sets[2];
 	struct irq_affinity affd = {
-		.pre_vectors = 1
+		.pre_vectors = 1,
+		.nr_sets = ARRAY_SIZE(irq_sets),
+		.sets = irq_sets,
 	};
 
-	nr_io_queues = num_possible_cpus();
+	nr_io_queues = max_io_queues();
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
 	if (result < 0)
 		return result;
@@ -1934,13 +2048,48 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	 * setting up the full range we need.
 	 */
 	pci_free_irq_vectors(pdev);
-	result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
-			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
-	if (result <= 0)
-		return -EIO;
+
+	/*
+	 * For irq sets, we have to ask for minvec == maxvec. This passes
+	 * any reduction back to us, so we can adjust our queue counts and
+	 * IRQ vector needs.
+	 */
+	do {
+		nvme_calc_io_queues(dev, nr_io_queues);
+		irq_sets[0] = dev->io_queues[NVMEQ_TYPE_READ];
+		irq_sets[1] = dev->io_queues[NVMEQ_TYPE_WRITE];
+		if (!irq_sets[1])
+			affd.nr_sets = 1;
+
+		/*
+		 * Need IRQs for read+write queues, and one for the admin queue
+		 */
+		nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
+
+		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
+				nr_io_queues,
+				PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+
+		/*
+		 * Need to reduce our vec counts
+		 */
+		if (result == -ENOSPC) {
+			nr_io_queues--;
+			if (!nr_io_queues)
+				return result;
+			continue;
+		} else if (result <= 0)
+			return -EIO;
+		break;
+	} while (1);
+
 	dev->num_vecs = result;
 	dev->max_qid = max(result - 1, 1);
 
+	dev_info(dev->ctrl.device, "%d/%d read/write queues\n",
+		 dev->io_queues[NVMEQ_TYPE_READ],
+		 dev->io_queues[NVMEQ_TYPE_WRITE]);
+
 	/*
 	 * Should investigate if there's a performance win from allocating
 	 * more queues than interrupt vectors; it might allow the submission
@@ -2042,6 +2191,7 @@ static int nvme_dev_add(struct nvme_dev *dev)
 	if (!dev->ctrl.tagset) {
 		dev->tagset.ops = &nvme_mq_ops;
 		dev->tagset.nr_hw_queues = dev->online_queues - 1;
+		dev->tagset.nr_maps = NVMEQ_TYPE_NR;
 		dev->tagset.timeout = NVME_IO_TIMEOUT;
 		dev->tagset.numa_node = dev_to_node(dev->dev);
 		dev->tagset.queue_depth =
@@ -2489,8 +2639,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (!dev)
 		return -ENOMEM;
 
-	dev->queues = kcalloc_node(num_possible_cpus() + 1,
-			sizeof(struct nvme_queue), GFP_KERNEL, node);
+	dev->queues = kcalloc_node(max_queue_count(), sizeof(struct nvme_queue),
+			GFP_KERNEL, node);
 	if (!dev->queues)
 		goto free;
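
The read/write split logic in nvme_calc_io_queues() is easy to model in
isolation; this standalone sketch (queue counts are hypothetical) shows
how 'write_queues' carves the IO queues into the two maps:

/* Userspace model of the read/write queue split; numbers are illustrative. */
#include <stdio.h>

static void calc_io_queues(unsigned int nr_io_queues,
			   unsigned int write_queues,
			   unsigned int *reads, unsigned int *writes)
{
	unsigned int w = write_queues;

	if (nr_io_queues == 1) {
		*reads = 1;
		*writes = 0;
		return;
	}
	if (w >= nr_io_queues)	/* always leave at least one read queue */
		w = nr_io_queues - 1;
	if (!w) {		/* shared set: everything lives in the read map */
		*reads = nr_io_queues;
		*writes = 0;
	} else {
		*writes = w;
		*reads = nr_io_queues - w;
	}
}

int main(void)
{
	unsigned int r, w;

	calc_io_queues(8, 4, &r, &w);	/* -> 4 read, 4 write */
	printf("read=%u write=%u\n", r, w);
	calc_io_queues(8, 0, &r, &w);	/* -> shared: 8 read, 0 write */
	printf("read=%u write=%u\n", r, w);
	return 0;
}

The retry loop above then feeds these counts into irq_sets[], shrinking
nr_io_queues one at a time if the PCI core can't grant the full request.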
From patchwork Tue Oct 30 18:32:51 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661395
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 15/16] block: add REQ_HIPRI and inherit it from IOCB_HIPRI
Date: Tue, 30 Oct 2018 12:32:51 -0600
Message-Id: <20181030183252.17857-16-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

We use IOCB_HIPRI to poll for IO in the caller instead of scheduling, but
that information is not available at (or below) the IO submission layer.
The driver may make different queue choices based on the type of IO, so
make the fact that we will poll for this IO known to the lower layers as
well.

Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe
Reviewed-by: Sagi Grimberg
---
 fs/block_dev.c            | 2 ++
 fs/direct-io.c            | 2 ++
 fs/iomap.c                | 9 ++++++++-
 include/linux/blk_types.h | 4 +++-
 4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 38b8ce05cbc7..8bb8090c57a7 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -232,6 +232,8 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 		bio.bi_opf = dio_bio_write_op(iocb);
 		task_io_account_write(ret);
 	}
+	if (iocb->ki_flags & IOCB_HIPRI)
+		bio.bi_opf |= REQ_HIPRI;
 
 	qc = submit_bio(&bio);
 	for (;;) {
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 093fb54cd316..ffb46b7aa5f7 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1265,6 +1265,8 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 	} else {
 		dio->op = REQ_OP_READ;
 	}
+	if (iocb->ki_flags & IOCB_HIPRI)
+		dio->op_flags |= REQ_HIPRI;
 
 	/*
 	 * For AIO O_(D)SYNC writes we need to defer completions to a workqueue
diff --git a/fs/iomap.c b/fs/iomap.c
index ec15cf2ec696..50ad8c8d1dcb 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1554,6 +1554,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 		unsigned len)
 {
 	struct page *page = ZERO_PAGE(0);
+	int flags = REQ_SYNC | REQ_IDLE;
 	struct bio *bio;
 
 	bio = bio_alloc(GFP_KERNEL, 1);
@@ -1562,9 +1563,12 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
+	if (dio->iocb->ki_flags & IOCB_HIPRI)
+		flags |= REQ_HIPRI;
+
 	get_page(page);
 	__bio_add_page(bio, page, len, 0);
-	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
+	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
 
 	atomic_inc(&dio->ref);
 	return submit_bio(bio);
@@ -1663,6 +1667,9 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		bio_set_pages_dirty(bio);
 	}
 
+	if (dio->iocb->ki_flags & IOCB_HIPRI)
+		bio->bi_opf |= REQ_HIPRI;
+
 	iov_iter_advance(dio->submit.iter, n);
 
 	dio->size += n;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 093a818c5b68..d6c2558d6b73 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -322,6 +322,8 @@ enum req_flag_bits {
 	/* command specific flags for REQ_OP_WRITE_ZEROES: */
 	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
 
+	__REQ_HIPRI,
+
 	/* for driver use */
 	__REQ_DRV,
 	__REQ_SWAP,		/* swapping request. */
@@ -342,8 +344,8 @@ enum req_flag_bits {
 #define REQ_RAHEAD		(1ULL << __REQ_RAHEAD)
 #define REQ_BACKGROUND		(1ULL << __REQ_BACKGROUND)
 #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
-
 #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
+#define REQ_HIPRI		(1ULL << __REQ_HIPRI)
 
 #define REQ_DRV			(1ULL << __REQ_DRV)
 #define REQ_SWAP		(1ULL << __REQ_SWAP)
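
The flag inheritance here is a one-way propagation from the iocb down to
the bio. The toy model below captures the idea; the flag values are made
up for illustration, not the kernel's actual bit assignments:

/* Toy model: derive a bio-level poll flag from the iocb-level one so
 * layers below submission can see that the caller intends to poll. */
#include <stdio.h>

#define IOCB_HIPRI	(1u << 3)	/* hypothetical value */
#define REQ_HIPRI	(1u << 25)	/* hypothetical value */

static unsigned int bio_opf_from_iocb(unsigned int ki_flags,
				      unsigned int base_opf)
{
	unsigned int opf = base_opf;

	if (ki_flags & IOCB_HIPRI)	/* caller will poll for completion */
		opf |= REQ_HIPRI;
	return opf;
}

int main(void)
{
	unsigned int opf = bio_opf_from_iocb(IOCB_HIPRI, 0);

	printf("REQ_HIPRI set: %s\n", (opf & REQ_HIPRI) ? "yes" : "no");
	return 0;
}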
From patchwork Tue Oct 30 18:32:52 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10661391
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 16/16] nvme: add separate poll queue map
Date: Tue, 30 Oct 2018 12:32:52 -0600
Message-Id: <20181030183252.17857-17-axboe@kernel.dk>
In-Reply-To: <20181030183252.17857-1-axboe@kernel.dk>
References: <20181030183252.17857-1-axboe@kernel.dk>

Add support for defining a variable number of poll queues, currently
configurable with the 'poll_queues' module parameter. Defaults to a
single poll queue. And now we finally have poll support without
triggering interrupts!

Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe
---
 drivers/nvme/host/pci.c | 97 +++++++++++++++++++++++++++++++++--------
 include/linux/blk-mq.h  |  2 +-
 2 files changed, 81 insertions(+), 18 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 17170686105f..305d8d3826d7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -86,6 +86,10 @@ MODULE_PARM_DESC(write_queues,
 	"Number of queues to use for writes. If not set, reads and writes "
 	"will share a queue set.");
 
+static int poll_queues = 1;
+module_param_cb(poll_queues, &queue_count_ops, &poll_queues, 0644);
+MODULE_PARM_DESC(poll_queues, "Number of queues to use for polled IO.");
+
 struct nvme_dev;
 struct nvme_queue;
 
@@ -94,6 +98,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
 enum {
 	NVMEQ_TYPE_READ,
 	NVMEQ_TYPE_WRITE,
+	NVMEQ_TYPE_POLL,
 	NVMEQ_TYPE_NR,
 };
 
@@ -202,6 +207,7 @@ struct nvme_queue {
 	u16 last_cq_head;
 	u16 qid;
 	u8 cq_phase;
+	u8 polled;
 	u32 *dbbuf_sq_db;
 	u32 *dbbuf_cq_db;
 	u32 *dbbuf_sq_ei;
@@ -250,7 +256,7 @@ static inline void _nvme_check_size(void)
 
 static unsigned int max_io_queues(void)
 {
-	return num_possible_cpus() + write_queues;
+	return num_possible_cpus() + write_queues + poll_queues;
 }
 
 static unsigned int max_queue_count(void)
@@ -500,8 +506,15 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 			offset = queue_irq_offset(dev);
 		}
 
+		/*
+		 * The poll queue(s) doesn't have an IRQ (and hence IRQ
+		 * affinity), so use the regular blk-mq cpu mapping
+		 */
 		map->queue_offset = qoff;
-		blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
+		if (i != NVMEQ_TYPE_POLL)
+			blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
+		else
+			blk_mq_map_queues(map);
 		qoff += map->nr_queues;
 		offset += map->nr_queues;
 	}
@@ -892,7 +905,7 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	 * We should not need to do this, but we're still using this to
 	 * ensure we can drain requests on a dying queue.
 	 */
-	if (unlikely(nvmeq->cq_vector < 0))
+	if (unlikely(nvmeq->cq_vector < 0 && !nvmeq->polled))
 		return BLK_STS_IOERR;
 
 	ret = nvme_setup_cmd(ns, req, &cmnd);
@@ -921,6 +934,8 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 static int nvme_flags_to_type(struct request_queue *q, unsigned int flags)
 {
+	if (flags & REQ_HIPRI)
+		return NVMEQ_TYPE_POLL;
 	if ((flags & REQ_OP_MASK) == REQ_OP_READ)
 		return NVMEQ_TYPE_READ;
 
@@ -1094,7 +1109,10 @@ static int adapter_alloc_cq(struct nvme_dev *dev, u16 qid,
 		struct nvme_queue *nvmeq, s16 vector)
 {
 	struct nvme_command c;
-	int flags = NVME_QUEUE_PHYS_CONTIG | NVME_CQ_IRQ_ENABLED;
+	int flags = NVME_QUEUE_PHYS_CONTIG;
+
+	if (vector != -1)
+		flags |= NVME_CQ_IRQ_ENABLED;
 
 	/*
 	 * Note: we (ab)use the fact that the prp fields survive if no data
@@ -1106,7 +1124,10 @@ static int adapter_alloc_cq(struct nvme_dev *dev, u16 qid,
 	c.create_cq.cqid = cpu_to_le16(qid);
 	c.create_cq.qsize = cpu_to_le16(nvmeq->q_depth - 1);
 	c.create_cq.cq_flags = cpu_to_le16(flags);
-	c.create_cq.irq_vector = cpu_to_le16(vector);
+	if (vector != -1)
+		c.create_cq.irq_vector = cpu_to_le16(vector);
+	else
+		c.create_cq.irq_vector = 0;
 
 	return nvme_submit_sync_cmd(dev->ctrl.admin_q, &c, NULL, 0);
 }
@@ -1348,13 +1369,14 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 	int vector;
 
 	spin_lock_irq(&nvmeq->cq_lock);
-	if (nvmeq->cq_vector == -1) {
+	if (nvmeq->cq_vector == -1 && !nvmeq->polled) {
 		spin_unlock_irq(&nvmeq->cq_lock);
 		return 1;
 	}
 	vector = nvmeq->cq_vector;
 	nvmeq->dev->online_queues--;
 	nvmeq->cq_vector = -1;
+	nvmeq->polled = false;
 	spin_unlock_irq(&nvmeq->cq_lock);
 
 	/*
@@ -1366,7 +1388,8 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 	if (!nvmeq->qid && nvmeq->dev->ctrl.admin_q)
 		blk_mq_quiesce_queue(nvmeq->dev->ctrl.admin_q);
 
-	pci_free_irq(to_pci_dev(nvmeq->dev->dev), vector, nvmeq);
+	if (vector != -1)
+		pci_free_irq(to_pci_dev(nvmeq->dev->dev), vector, nvmeq);
 
 	return 0;
 }
@@ -1500,7 +1523,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
 	spin_unlock_irq(&nvmeq->cq_lock);
 }
 
-static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
+static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
 {
 	struct nvme_dev *dev = nvmeq->dev;
 	int result;
@@ -1510,7 +1533,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	 * A queue's vector matches the queue identifier unless the controller
 	 * has only one vector available.
 	 */
-	vector = dev->num_vecs == 1 ? 0 : qid;
+	if (!polled)
+		vector = dev->num_vecs == 1 ? 0 : qid;
+	else
+		vector = -1;
+
 	result = adapter_alloc_cq(dev, qid, nvmeq, vector);
 	if (result)
 		return result;
@@ -1527,15 +1554,20 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	 * xxx' warning if the create CQ/SQ command times out.
 	 */
 	nvmeq->cq_vector = vector;
+	nvmeq->polled = polled;
 	nvme_init_queue(nvmeq, qid);
-	result = queue_request_irq(nvmeq);
-	if (result < 0)
-		goto release_sq;
+
+	if (vector != -1) {
+		result = queue_request_irq(nvmeq);
+		if (result < 0)
+			goto release_sq;
+	}
 
 	return result;
 
 release_sq:
 	nvmeq->cq_vector = -1;
+	nvmeq->polled = false;
 	dev->online_queues--;
 	adapter_delete_sq(dev, qid);
 release_cq:
@@ -1686,7 +1718,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 
 static int nvme_create_io_queues(struct nvme_dev *dev)
 {
-	unsigned i, max;
+	unsigned i, max, rw_queues;
 	int ret = 0;
 
 	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
@@ -1697,8 +1729,17 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 	}
 
 	max = min(dev->max_qid, dev->ctrl.queue_count - 1);
+	if (max != 1 && dev->io_queues[NVMEQ_TYPE_POLL]) {
+		rw_queues = dev->io_queues[NVMEQ_TYPE_READ] +
+				dev->io_queues[NVMEQ_TYPE_WRITE];
+	} else {
+		rw_queues = max;
+	}
+
 	for (i = dev->online_queues; i <= max; i++) {
-		ret = nvme_create_queue(&dev->queues[i], i);
+		bool polled = i > rw_queues;
+
+		ret = nvme_create_queue(&dev->queues[i], i, polled);
 		if (ret)
 			break;
 	}
@@ -1970,6 +2011,7 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int nr_io_queues)
 {
 	unsigned int this_w_queues = write_queues;
+	unsigned int this_p_queues = poll_queues;
 
 	/*
 	 * Setup read/write queue split
@@ -1977,9 +2019,28 @@ static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int nr_io_queues)
 	if (nr_io_queues == 1) {
 		dev->io_queues[NVMEQ_TYPE_READ] = 1;
 		dev->io_queues[NVMEQ_TYPE_WRITE] = 0;
+		dev->io_queues[NVMEQ_TYPE_POLL] = 0;
 		return;
 	}
 
+	/*
+	 * Configure number of poll queues, if set
+	 */
+	if (this_p_queues) {
+		/*
+		 * We need at least one queue left. With just one queue, we'll
+		 * have a single shared read/write set.
+		 */
+		if (this_p_queues >= nr_io_queues) {
+			this_w_queues = 0;
+			this_p_queues = nr_io_queues - 1;
+		}
+
+		dev->io_queues[NVMEQ_TYPE_POLL] = this_p_queues;
+		nr_io_queues -= this_p_queues;
+	} else
+		dev->io_queues[NVMEQ_TYPE_POLL] = 0;
+
 	/*
 	 * If 'write_queues' is set, ensure it leaves room for at least
 	 * one read queue
@@ -2084,11 +2145,13 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	} while (1);
 
 	dev->num_vecs = result;
-	dev->max_qid = max(result - 1, 1);
+	result = max(result - 1, 1);
+	dev->max_qid = result + dev->io_queues[NVMEQ_TYPE_POLL];
 
-	dev_info(dev->ctrl.device, "%d/%d read/write queues\n",
+	dev_info(dev->ctrl.device, "%d/%d/%d read/write/poll queues\n",
 		 dev->io_queues[NVMEQ_TYPE_READ],
-		 dev->io_queues[NVMEQ_TYPE_WRITE]);
+		 dev->io_queues[NVMEQ_TYPE_WRITE],
+		 dev->io_queues[NVMEQ_TYPE_POLL]);
 
 	/*
 	 * Should investigate if there's a performance win from allocating
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8e80d5043079..b31f6f016621 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -82,7 +82,7 @@ struct blk_mq_queue_map {
 };
 
 enum {
-	HCTX_MAX_TYPES = 2,
+	HCTX_MAX_TYPES = 3,
 };
 
 struct blk_mq_tag_set {
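
With the poll map in place, queue selection becomes a three-way decision.
Here is a userspace sketch of the nvme_flags_to_type() routing above
(the flag encodings are illustrative, not the kernel's actual values):

/* Models the three-way queue-type routing: HIPRI requests go to the poll
 * map, reads to the read map, everything else to the write map. */
#include <stdio.h>

enum { NVMEQ_TYPE_READ, NVMEQ_TYPE_WRITE, NVMEQ_TYPE_POLL };

#define REQ_OP_READ	0u
#define REQ_OP_WRITE	1u
#define REQ_OP_MASK	0xffu
#define REQ_HIPRI	(1u << 25)	/* hypothetical bit */

static int flags_to_type(unsigned int flags)
{
	if (flags & REQ_HIPRI)
		return NVMEQ_TYPE_POLL;
	if ((flags & REQ_OP_MASK) == REQ_OP_READ)
		return NVMEQ_TYPE_READ;
	return NVMEQ_TYPE_WRITE;
}

int main(void)
{
	printf("%d %d %d\n",
	       flags_to_type(REQ_OP_READ),			/* 0: read map */
	       flags_to_type(REQ_OP_WRITE),			/* 1: write map */
	       flags_to_type(REQ_OP_READ | REQ_HIPRI));		/* 2: poll map */
	return 0;
}

Note the ordering: the REQ_HIPRI check comes first, so a polled read is
steered to the poll queues rather than the read queues, which is what lets
those queues run without an IRQ at all.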