From patchwork Thu Oct 25 21:16:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10656599
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Cc: Jens Axboe
Subject: [PATCH 01/14] blk-mq: kill q->mq_map
Date: Thu, 25 Oct 2018 15:16:13 -0600
Message-Id: <20181025211626.12692-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk>
References: <20181025211626.12692-1-axboe@kernel.dk>

It's just a pointer to set->mq_map, use that instead.

Signed-off-by: Jens Axboe Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke --- block/blk-mq.c | 13 ++++--------- block/blk-mq.h | 4 +++- include/linux/blkdev.h | 2 -- 3 files changed, 7 insertions(+), 12 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 21e4147c4810..22d5beaab5a0 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2321,7 +2321,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) * If the cpu isn't present, the cpu is mapped to first hctx. */ for_each_possible_cpu(i) { - hctx_idx = q->mq_map[i]; + hctx_idx = set->mq_map[i]; /* unmapped hw queue can be remapped after CPU topo changed */ if (!set->tags[hctx_idx] && !__blk_mq_alloc_rq_map(set, hctx_idx)) { @@ -2331,7 +2331,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) * case, remap the current ctx to hctx[0] which * is guaranteed to always have tags allocated */ - q->mq_map[i] = 0; + set->mq_map[i] = 0; } ctx = per_cpu_ptr(q->queue_ctx, i); @@ -2429,8 +2429,6 @@ static void blk_mq_del_queue_tag_set(struct request_queue *q) static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set, struct request_queue *q) { - q->tag_set = set; - mutex_lock(&set->tag_list_lock); /* @@ -2467,8 +2465,6 @@ void blk_mq_release(struct request_queue *q) kobject_put(&hctx->kobj); } - q->mq_map = NULL; - kfree(q->queue_hw_ctx); /* @@ -2588,7 +2584,7 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set, int node; struct blk_mq_hw_ctx *hctx; - node = blk_mq_hw_queue_to_node(q->mq_map, i); + node = blk_mq_hw_queue_to_node(set->mq_map, i); /* * If the hw queue has been mapped to another numa node, * we need to realloc the hctx. 
If allocation fails, fallback @@ -2665,8 +2661,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, if (!q->queue_hw_ctx) goto err_percpu; - q->mq_map = set->mq_map; - blk_mq_realloc_hw_ctxs(set, q); if (!q->nr_hw_queues) goto err_hctxs; @@ -2675,6 +2669,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ); q->nr_queues = nr_cpu_ids; + q->tag_set = set; q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT; diff --git a/block/blk-mq.h b/block/blk-mq.h index 9497b47e2526..9536be06d022 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -75,7 +75,9 @@ extern int blk_mq_hw_queue_to_node(unsigned int *map, unsigned int); static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, int cpu) { - return q->queue_hw_ctx[q->mq_map[cpu]]; + struct blk_mq_tag_set *set = q->tag_set; + + return q->queue_hw_ctx[set->mq_map[cpu]]; } /* diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 82b6cf45c6e0..6e506044a309 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -415,8 +415,6 @@ struct request_queue { const struct blk_mq_ops *mq_ops; - unsigned int *mq_map; - /* sw queues */ struct blk_mq_ctx __percpu *queue_ctx; unsigned int nr_queues;

From patchwork Thu Oct 25 21:16:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10656601
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Cc: Jens Axboe
Subject: [PATCH 02/14] blk-mq: abstract out queue map
Date: Thu, 25 Oct 2018 15:16:14 -0600
Message-Id: <20181025211626.12692-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk>
References: <20181025211626.12692-1-axboe@kernel.dk>

This is in preparation for allowing multiple sets of maps per queue, if so desired.

Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq-cpumap.c | 10 ++++---- block/blk-mq-pci.c | 10 ++++---- block/blk-mq-rdma.c | 4 ++-- block/blk-mq-virtio.c | 8 +++---- block/blk-mq.c | 34 ++++++++++++++------------- block/blk-mq.h | 8 +++---- drivers/block/virtio_blk.c | 2 +- drivers/nvme/host/pci.c | 2 +- drivers/scsi/qla2xxx/qla_os.c | 5 ++-- drivers/scsi/scsi_lib.c | 2 +- drivers/scsi/smartpqi/smartpqi_init.c | 3 ++- drivers/scsi/virtio_scsi.c | 3 ++- include/linux/blk-mq-pci.h | 4 ++-- include/linux/blk-mq-virtio.h | 4 ++-- include/linux/blk-mq.h | 13 ++++++++-- 15 files changed, 63 insertions(+), 49 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 3eb169f15842..6e6686c55984 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -30,10 +30,10 @@ static int get_first_sibling(unsigned int cpu) return cpu; } -int blk_mq_map_queues(struct blk_mq_tag_set *set) +int blk_mq_map_queues(struct blk_mq_queue_map *qmap) { - unsigned int *map = set->mq_map; - unsigned int nr_queues = set->nr_hw_queues; + unsigned int *map = qmap->mq_map; + unsigned int nr_queues = qmap->nr_queues; unsigned int cpu, first_sibling; for_each_possible_cpu(cpu) { @@ -62,12 +62,12 @@ EXPORT_SYMBOL_GPL(blk_mq_map_queues); * We have no quick way of doing reverse lookups. This is only used at * queue init time, so runtime isn't important. */ -int blk_mq_hw_queue_to_node(unsigned int *mq_map, unsigned int index) +int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int index) { int i; for_each_possible_cpu(i) { - if (index == mq_map[i]) + if (index == qmap->mq_map[i]) return local_memory_node(cpu_to_node(i)); } diff --git a/block/blk-mq-pci.c b/block/blk-mq-pci.c index db644ec624f5..40333d60a850 100644 --- a/block/blk-mq-pci.c +++ b/block/blk-mq-pci.c @@ -31,26 +31,26 @@ * that maps a queue to the CPUs that have irq affinity for the corresponding * vector. 
*/ -int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev, +int blk_mq_pci_map_queues(struct blk_mq_queue_map *qmap, struct pci_dev *pdev, int offset) { const struct cpumask *mask; unsigned int queue, cpu; - for (queue = 0; queue < set->nr_hw_queues; queue++) { + for (queue = 0; queue < qmap->nr_queues; queue++) { mask = pci_irq_get_affinity(pdev, queue + offset); if (!mask) goto fallback; for_each_cpu(cpu, mask) - set->mq_map[cpu] = queue; + qmap->mq_map[cpu] = queue; } return 0; fallback: - WARN_ON_ONCE(set->nr_hw_queues > 1); - blk_mq_clear_mq_map(set); + WARN_ON_ONCE(qmap->nr_queues > 1); + blk_mq_clear_mq_map(qmap); return 0; } EXPORT_SYMBOL_GPL(blk_mq_pci_map_queues); diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c index 996167f1de18..a71576aff3a5 100644 --- a/block/blk-mq-rdma.c +++ b/block/blk-mq-rdma.c @@ -41,12 +41,12 @@ int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set, goto fallback; for_each_cpu(cpu, mask) - set->mq_map[cpu] = queue; + set->map[0].mq_map[cpu] = queue; } return 0; fallback: - return blk_mq_map_queues(set); + return blk_mq_map_queues(&set->map[0]); } EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues); diff --git a/block/blk-mq-virtio.c b/block/blk-mq-virtio.c index c3afbca11299..661fbfef480f 100644 --- a/block/blk-mq-virtio.c +++ b/block/blk-mq-virtio.c @@ -29,7 +29,7 @@ * that maps a queue to the CPUs that have irq affinity for the corresponding * vector. 
*/ -int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set, +int blk_mq_virtio_map_queues(struct blk_mq_queue_map *qmap, struct virtio_device *vdev, int first_vec) { const struct cpumask *mask; @@ -38,17 +38,17 @@ int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set, if (!vdev->config->get_vq_affinity) goto fallback; - for (queue = 0; queue < set->nr_hw_queues; queue++) { + for (queue = 0; queue < qmap->nr_queues; queue++) { mask = vdev->config->get_vq_affinity(vdev, first_vec + queue); if (!mask) goto fallback; for_each_cpu(cpu, mask) - set->mq_map[cpu] = queue; + qmap->mq_map[cpu] = queue; } return 0; fallback: - return blk_mq_map_queues(set); + return blk_mq_map_queues(qmap); } EXPORT_SYMBOL_GPL(blk_mq_virtio_map_queues); diff --git a/block/blk-mq.c b/block/blk-mq.c index 22d5beaab5a0..fa2e5176966e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1974,7 +1974,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set, struct blk_mq_tags *tags; int node; - node = blk_mq_hw_queue_to_node(set->mq_map, hctx_idx); + node = blk_mq_hw_queue_to_node(&set->map[0], hctx_idx); if (node == NUMA_NO_NODE) node = set->numa_node; @@ -2030,7 +2030,7 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, size_t rq_size, left; int node; - node = blk_mq_hw_queue_to_node(set->mq_map, hctx_idx); + node = blk_mq_hw_queue_to_node(&set->map[0], hctx_idx); if (node == NUMA_NO_NODE) node = set->numa_node; @@ -2321,7 +2321,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) * If the cpu isn't present, the cpu is mapped to first hctx. 
*/ for_each_possible_cpu(i) { - hctx_idx = set->mq_map[i]; + hctx_idx = set->map[0].mq_map[i]; /* unmapped hw queue can be remapped after CPU topo changed */ if (!set->tags[hctx_idx] && !__blk_mq_alloc_rq_map(set, hctx_idx)) { @@ -2331,7 +2331,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) * case, remap the current ctx to hctx[0] which * is guaranteed to always have tags allocated */ - set->mq_map[i] = 0; + set->map[0].mq_map[i] = 0; } ctx = per_cpu_ptr(q->queue_ctx, i); @@ -2584,7 +2584,7 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set, int node; struct blk_mq_hw_ctx *hctx; - node = blk_mq_hw_queue_to_node(set->mq_map, i); + node = blk_mq_hw_queue_to_node(&set->map[0], i); /* * If the hw queue has been mapped to another numa node, * we need to realloc the hctx. If allocation fails, fallback @@ -2793,18 +2793,18 @@ static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) * for (queue = 0; queue < set->nr_hw_queues; queue++) { * mask = get_cpu_mask(queue) * for_each_cpu(cpu, mask) - * set->mq_map[cpu] = queue; + * set->map.mq_map[cpu] = queue; * } * * When we need to remap, the table has to be cleared for * killing stale mapping since one CPU may not be mapped * to any hw queue. 
*/ - blk_mq_clear_mq_map(set); + blk_mq_clear_mq_map(&set->map[0]); return set->ops->map_queues(set); } else - return blk_mq_map_queues(set); + return blk_mq_map_queues(&set->map[0]); } /* @@ -2859,10 +2859,12 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) return -ENOMEM; ret = -ENOMEM; - set->mq_map = kcalloc_node(nr_cpu_ids, sizeof(*set->mq_map), - GFP_KERNEL, set->numa_node); - if (!set->mq_map) + set->map[0].mq_map = kcalloc_node(nr_cpu_ids, + sizeof(*set->map[0].mq_map), + GFP_KERNEL, set->numa_node); + if (!set->map[0].mq_map) goto out_free_tags; + set->map[0].nr_queues = set->nr_hw_queues; ret = blk_mq_update_queue_map(set); if (ret) @@ -2878,8 +2880,8 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) return 0; out_free_mq_map: - kfree(set->mq_map); - set->mq_map = NULL; + kfree(set->map[0].mq_map); + set->map[0].mq_map = NULL; out_free_tags: kfree(set->tags); set->tags = NULL; @@ -2894,8 +2896,8 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set) for (i = 0; i < nr_cpu_ids; i++) blk_mq_free_map_and_requests(set, i); - kfree(set->mq_map); - set->mq_map = NULL; + kfree(set->map[0].mq_map); + set->map[0].mq_map = NULL; kfree(set->tags); set->tags = NULL; @@ -3056,7 +3058,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, pr_warn("Increasing nr_hw_queues to %d fails, fallback to %d\n", nr_hw_queues, prev_nr_hw_queues); set->nr_hw_queues = prev_nr_hw_queues; - blk_mq_map_queues(set); + blk_mq_map_queues(&set->map[0]); goto fallback; } blk_mq_map_swqueue(q); diff --git a/block/blk-mq.h b/block/blk-mq.h index 9536be06d022..889f0069dd80 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -70,14 +70,14 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, /* * CPU -> queue mappings */ -extern int blk_mq_hw_queue_to_node(unsigned int *map, unsigned int); +extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int); static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, 
int cpu) { struct blk_mq_tag_set *set = q->tag_set; - return q->queue_hw_ctx[set->mq_map[cpu]]; + return q->queue_hw_ctx[set->map[0].mq_map[cpu]]; } /* @@ -206,12 +206,12 @@ static inline void blk_mq_put_driver_tag(struct request *rq) __blk_mq_put_driver_tag(hctx, rq); } -static inline void blk_mq_clear_mq_map(struct blk_mq_tag_set *set) +static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) { int cpu; for_each_possible_cpu(cpu) - set->mq_map[cpu] = 0; + qmap->mq_map[cpu] = 0; } #endif diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 086c6bb12baa..6e869d05f91e 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -624,7 +624,7 @@ static int virtblk_map_queues(struct blk_mq_tag_set *set) { struct virtio_blk *vblk = set->driver_data; - return blk_mq_virtio_map_queues(set, vblk->vdev, 0); + return blk_mq_virtio_map_queues(&set->map[0], vblk->vdev, 0); } #ifdef CONFIG_VIRTIO_BLK_SCSI diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index f30031945ee4..e5d783cb6937 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -435,7 +435,7 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set) { struct nvme_dev *dev = set->driver_data; - return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), + return blk_mq_pci_map_queues(&set->map[0], to_pci_dev(dev->dev), dev->num_vecs > 1 ? 
1 /* admin queue */ : 0); } diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 3e2665c66bc4..ca9ac124f218 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -6934,11 +6934,12 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost) { int rc; scsi_qla_host_t *vha = (scsi_qla_host_t *)shost->hostdata; + struct blk_mq_queue_map *qmap = &shost->tag_set.map[0]; if (USER_CTRL_IRQ(vha->hw)) - rc = blk_mq_map_queues(&shost->tag_set); + rc = blk_mq_map_queues(qmap); else - rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev, 0); + rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, 0); return rc; } diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 77b0f10e1be1..44c1bbf6b302 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1778,7 +1778,7 @@ static int scsi_map_queues(struct blk_mq_tag_set *set) if (shost->hostt->map_queues) return shost->hostt->map_queues(shost); - return blk_mq_map_queues(set); + return blk_mq_map_queues(&set->map[0]); } void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index a25a07a0b7f0..bac084260d80 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -5319,7 +5319,8 @@ static int pqi_map_queues(struct Scsi_Host *shost) { struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost); - return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev, 0); + return blk_mq_pci_map_queues(&shost->tag_set.map[0], + ctrl_info->pci_dev, 0); } static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info, diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index 1c72db94270e..c3c95b314286 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -719,8 +719,9 @@ static void virtscsi_target_destroy(struct scsi_target *starget) static int virtscsi_map_queues(struct Scsi_Host 
*shost) { struct virtio_scsi *vscsi = shost_priv(shost); + struct blk_mq_queue_map *qmap = &shost->tag_set.map[0]; - return blk_mq_virtio_map_queues(&shost->tag_set, vscsi->vdev, 2); + return blk_mq_virtio_map_queues(qmap, vscsi->vdev, 2); } /* diff --git a/include/linux/blk-mq-pci.h b/include/linux/blk-mq-pci.h index 9f4c17f0d2d8..0b1f45c62623 100644 --- a/include/linux/blk-mq-pci.h +++ b/include/linux/blk-mq-pci.h @@ -2,10 +2,10 @@ #ifndef _LINUX_BLK_MQ_PCI_H #define _LINUX_BLK_MQ_PCI_H -struct blk_mq_tag_set; +struct blk_mq_queue_map; struct pci_dev; -int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev, +int blk_mq_pci_map_queues(struct blk_mq_queue_map *qmap, struct pci_dev *pdev, int offset); #endif /* _LINUX_BLK_MQ_PCI_H */ diff --git a/include/linux/blk-mq-virtio.h b/include/linux/blk-mq-virtio.h index 69b4da262c45..687ae287e1dc 100644 --- a/include/linux/blk-mq-virtio.h +++ b/include/linux/blk-mq-virtio.h @@ -2,10 +2,10 @@ #ifndef _LINUX_BLK_MQ_VIRTIO_H #define _LINUX_BLK_MQ_VIRTIO_H -struct blk_mq_tag_set; +struct blk_mq_queue_map; struct virtio_device; -int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set, +int blk_mq_virtio_map_queues(struct blk_mq_queue_map *qmap, struct virtio_device *vdev, int first_vec); #endif /* _LINUX_BLK_MQ_VIRTIO_H */ diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 2286dc12c6bc..c992069bb3ee 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -74,8 +74,17 @@ struct blk_mq_hw_ctx { struct srcu_struct srcu[0]; }; +struct blk_mq_queue_map { + unsigned int *mq_map; + unsigned int nr_queues; +}; + +enum { + HCTX_MAX_TYPES = 1, +}; + struct blk_mq_tag_set { - unsigned int *mq_map; + struct blk_mq_queue_map map[HCTX_MAX_TYPES]; const struct blk_mq_ops *ops; unsigned int nr_hw_queues; unsigned int queue_depth; /* max hw supported */ @@ -288,7 +297,7 @@ void blk_mq_freeze_queue_wait(struct request_queue *q); int blk_mq_freeze_queue_wait_timeout(struct request_queue *q, unsigned 
long timeout); -int blk_mq_map_queues(struct blk_mq_tag_set *set); +int blk_mq_map_queues(struct blk_mq_queue_map *qmap); void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues); void blk_mq_quiesce_queue_nowait(struct request_queue *q); From patchwork Thu Oct 25 21:16:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656603 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D6F7114BB for ; Thu, 25 Oct 2018 21:16:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CD5AF2C463 for ; Thu, 25 Oct 2018 21:16:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C1DEE2C64C; Thu, 25 Oct 2018 21:16:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7B17C2C463 for ; Thu, 25 Oct 2018 21:16:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726501AbeJZFuz (ORCPT ); Fri, 26 Oct 2018 01:50:55 -0400 Received: from mail-it1-f196.google.com ([209.85.166.196]:51471 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFuz (ORCPT ); Fri, 26 Oct 2018 01:50:55 -0400 Received: by mail-it1-f196.google.com with SMTP id 74-v6so3368395itw.1 for ; Thu, 25 Oct 2018 14:16:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Xz6uiw2CfAitE8jOfqVEiGr+oT/nI+U2TYd+jXcVaPk=; b=DBMZs8PLXoJJZRMZqiWFl1MYGVvm3bXjjdA1jV620AhJUtiQAEvoLDVPulpdRSNwu9 yDVtM6Mev4j9+DXVMDrUpiOWD29PQTJS5n5sIsmOtFwfbr922269YFs53d+Lu7hbAJy7 0rST9yEpbskZyav45b80Yr8ds2Mu2gyKNndxY3Q7evRaYJpjGqaslPqNkTxHNJUWd2ho isGdDPULbMeuJ9rcyXs1BPlERKiDAxCgLp4aU5HSmrcALhp94eJHbOabEX6m00wXoMpP XtHYvRiRuEhlujK9psOgTfoakRXMYG0yCCxtEFRKzvWu9E/fjjYuo8QWWKvXN3CiuSKz 9/fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Xz6uiw2CfAitE8jOfqVEiGr+oT/nI+U2TYd+jXcVaPk=; b=nHJPfv6rllbx4+fWPZOUmw1o3rtj+HAENGti/TAXm5bKZ47UfT4YDQ1F6jToq1x2OZ 7uvLzWd191xOnJaUKZz/ercae8o22PafjnyymxKl30WEkFJOTZgeHMW+SAXKfTq3UyoX Pq18RoLD5K0rKdSvHZbAu+fquI5kymBlDH5YcWH6BdLICDrVKS0RwqBTJ4p/iqNi76uG 1C1tG0nXsJ8blm60WqDKQ37DdPCu996jGpdBoNiXcSIzeKyGD1nEwlSH2AEHBVYCnkdx cUxquf2lGT3dCLjP4YA24vAGUa4o8ME/SuAVvbysR+aguEFcZLmliWgBdQ+KF/MTY3gf 2qbQ== X-Gm-Message-State: AGRZ1gLPquze4q3ouJgeoUbC7HeuqFv/p80kInc2ztLqF12h79mBb5Q9 1FSANbUqnwH4K26ZX3cZT4Tb292vr6rKCQ== X-Google-Smtp-Source: AJdET5fJHbqWAcggAP1uCnC8rfIsQWOVMl6Z3QXYMequUTYVrlmkypduHicCRe6N0gRr5v19sAFDLA== X-Received: by 2002:a24:a70e:: with SMTP id a14-v6mr2037157itf.74.1540502195644; Thu, 25 Oct 2018 14:16:35 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:34 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 03/14] blk-mq: provide dummy blk_mq_map_queue_type() helper Date: Thu, 25 Oct 2018 15:16:15 -0600 Message-Id: <20181025211626.12692-4-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: 
<20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Doesn't do anything right now, but it's needed as a prep patch to get the interfaces right. Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/block/blk-mq.h b/block/blk-mq.h index 889f0069dd80..79c300faa7ce 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -80,6 +80,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, return q->queue_hw_ctx[set->map[0].mq_map[cpu]]; } +static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q, + int type, int cpu) +{ + return blk_mq_map_queue(q, cpu); +} + /* * sysfs helpers */ From patchwork Thu Oct 25 21:16:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656605 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 40E9314DE for ; Thu, 25 Oct 2018 21:16:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 340572C463 for ; Thu, 25 Oct 2018 21:16:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 289362C64C; Thu, 25 Oct 2018 21:16:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2E61B2C463 for ; Thu, 25 Oct 2018 21:16:40 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726387AbeJZFu6 (ORCPT ); Fri, 26 Oct 2018 01:50:58 -0400 Received: from mail-it1-f195.google.com ([209.85.166.195]:38769 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFu6 (ORCPT ); Fri, 26 Oct 2018 01:50:58 -0400 Received: by mail-it1-f195.google.com with SMTP id i76-v6so3637045ita.3 for ; Thu, 25 Oct 2018 14:16:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pd1n6+pI4RCokU9GPAm1TQD2C6UxegpBgFB3WwT/xKY=; b=x/yQYWLrnpxEk6pMU2ciTF2XebKWqtNGPqahd/++rzxH9sHkxZTOSKna394eJoCxJ3 PyLzHVXf8s9RAfrZbAYtoDZL+2iM7hWj08OM9HJN+59avlx7GzxCYCyItJhCKabxqBYJ dM4MFpk2iaOixKuWVdIeNhOiuS+MRQQNA/4WBYT53s6emIkxg4Sz//OM7TNf0FNEePn1 2HO1Ux9lLy+9o2OGPjWwp3hH2htVTR7V5hc3vcrqtTGuhYPXxZXBopFyfxe7BgjSuloq xQZt8fPdvzAdFK3GHgXD82Q1OpHxukJKNQPyoSMieyQIbzzaiFE4qN6G0WiyZE2nA8nu k+nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=pd1n6+pI4RCokU9GPAm1TQD2C6UxegpBgFB3WwT/xKY=; b=FErqOdMONOpdM9oaY+CuLAYmQRMkCZty89YWmD4EdKYdT/1qlf8pSAEudifeZoP6CD JoLZBy5yy8s58UH6p5pcyHvsocchqDDtjxo/CiwO5oYs5IRZJMzZWap2SHUm1l3MVilb /ChNcGSaQ7J9zp2+r8anV9ymCFvyRnLaM9QkDZNEt6BeEdzL1FVU2NEg2drCmrZpgA7W Pb7tjWQPs3G0u2tWyJBkSWcpgyNKVnsK6RueVInpfNWYo5O9W7DKZGDHPv1irm01fC7o bIYBda/4kP2/GFVLvQxaDxTSTc2CAJDVPMAZEjS2/nVWNQyBc4xaN/NJprT3OksqOl+c aIOA== X-Gm-Message-State: AGRZ1gJlgymwcJ3OsUcD3RV6TchUftSAgDdqk7BDQqWMKybo168uBCHX A6r8oWn3LCLISK/Uu39hrk8TlI/5hgPW+w== X-Google-Smtp-Source: AJdET5f5PCCmjWiD7bzORJ+Ka2CJKJ+6kT5Yh95reYh5ys41TQT2Y0b8k0JgzbTS8lGUoerlkcqW4g== X-Received: by 2002:a24:5f15:: with SMTP id r21-v6mr2173334itb.6.1540502197565; Thu, 25 Oct 2018 14:16:37 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com 
with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.35 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:36 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 04/14] blk-mq: pass in request/bio flags to queue mapping Date: Thu, 25 Oct 2018 15:16:16 -0600 Message-Id: <20181025211626.12692-5-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Prep patch for being able to place requests based not just on CPU location, but also on the type of request. Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-flush.c | 7 +++--- block/blk-mq-debugfs.c | 4 +++- block/blk-mq-sched.c | 16 ++++++++++---- block/blk-mq-tag.c | 5 +++-- block/blk-mq.c | 50 +++++++++++++++++++++++------------------- block/blk-mq.h | 8 ++++--- block/blk.h | 6 ++--- 7 files changed, 58 insertions(+), 38 deletions(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index 9baa9a119447..7922dba81497 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -219,7 +219,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error) /* release the tag's ownership to the req cloned from */ spin_lock_irqsave(&fq->mq_flush_lock, flags); - hctx = blk_mq_map_queue(q, flush_rq->mq_ctx->cpu); + hctx = blk_mq_map_queue(q, flush_rq->cmd_flags, flush_rq->mq_ctx->cpu); if (!q->elevator) { blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq); flush_rq->tag = -1; @@ -307,7 +307,8 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq, if (!q->elevator) { fq->orig_rq = first_rq; flush_rq->tag = first_rq->tag; - hctx = blk_mq_map_queue(q, first_rq->mq_ctx->cpu); + hctx = blk_mq_map_queue(q,
first_rq->cmd_flags, + first_rq->mq_ctx->cpu); blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq); } else { flush_rq->internal_tag = first_rq->internal_tag; @@ -330,7 +331,7 @@ static void mq_flush_data_end_io(struct request *rq, blk_status_t error) unsigned long flags; struct blk_flush_queue *fq = blk_get_flush_queue(q, ctx); - hctx = blk_mq_map_queue(q, ctx->cpu); + hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); if (q->elevator) { WARN_ON(rq->tag < 0); diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index 9ed43a7c70b5..fac70c81b7de 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -427,8 +427,10 @@ struct show_busy_params { static void hctx_show_busy_rq(struct request *rq, void *data, bool reserved) { const struct show_busy_params *params = data; + struct blk_mq_hw_ctx *hctx; - if (blk_mq_map_queue(rq->q, rq->mq_ctx->cpu) == params->hctx) + hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu); + if (hctx == params->hctx) __blk_mq_debugfs_rq_show(params->m, list_entry_rq(&rq->queuelist)); } diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 29bfe8017a2d..8125e9393ec2 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -311,7 +311,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio) { struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = blk_mq_get_ctx(q); - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, bio->bi_opf, ctx->cpu); bool ret = false; if (e && e->type->ops.mq.bio_merge) { @@ -367,7 +367,9 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, struct request_queue *q = rq->q; struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); + struct blk_mq_hw_ctx *hctx; + + hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); /* flush rq in flush machinery need to be dispatched directly */ if 
(!(rq->rq_flags & RQF_FLUSH_SEQ) && op_is_flush(rq->cmd_flags)) { @@ -400,9 +402,15 @@ void blk_mq_sched_insert_requests(struct request_queue *q, struct blk_mq_ctx *ctx, struct list_head *list, bool run_queue_async) { - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); - struct elevator_queue *e = hctx->queue->elevator; + struct blk_mq_hw_ctx *hctx; + struct elevator_queue *e; + struct request *rq; + + /* For list inserts, requests better be on the same hw queue */ + rq = list_first_entry(list, struct request, queuelist); + hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); + e = hctx->queue->elevator; if (e && e->type->ops.mq.insert_requests) e->type->ops.mq.insert_requests(hctx, list, false); else { diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 4254e74c1446..478a959357f5 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -168,7 +168,8 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data) io_schedule(); data->ctx = blk_mq_get_ctx(data->q); - data->hctx = blk_mq_map_queue(data->q, data->ctx->cpu); + data->hctx = blk_mq_map_queue(data->q, data->cmd_flags, + data->ctx->cpu); tags = blk_mq_tags_from_data(data); if (data->flags & BLK_MQ_REQ_RESERVED) bt = &tags->breserved_tags; @@ -530,7 +531,7 @@ u32 blk_mq_unique_tag(struct request *rq) struct blk_mq_hw_ctx *hctx; int hwq = 0; - hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu); + hctx = blk_mq_map_queue(q, rq->cmd_flags, rq->mq_ctx->cpu); hwq = hctx->queue_num; return (hwq << BLK_MQ_UNIQUE_TAG_BITS) | diff --git a/block/blk-mq.c b/block/blk-mq.c index fa2e5176966e..e6ea7da99125 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -332,8 +332,8 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data, } static struct request *blk_mq_get_request(struct request_queue *q, - struct bio *bio, unsigned int op, - struct blk_mq_alloc_data *data) + struct bio *bio, + struct blk_mq_alloc_data *data) { struct elevator_queue *e = q->elevator; struct request *rq; @@ -347,8 
+347,9 @@ static struct request *blk_mq_get_request(struct request_queue *q, put_ctx_on_error = true; } if (likely(!data->hctx)) - data->hctx = blk_mq_map_queue(q, data->ctx->cpu); - if (op & REQ_NOWAIT) + data->hctx = blk_mq_map_queue(q, data->cmd_flags, + data->ctx->cpu); + if (data->cmd_flags & REQ_NOWAIT) data->flags |= BLK_MQ_REQ_NOWAIT; if (e) { @@ -359,9 +360,10 @@ static struct request *blk_mq_get_request(struct request_queue *q, * dispatch list. Don't include reserved tags in the * limiting, as it isn't useful. */ - if (!op_is_flush(op) && e->type->ops.mq.limit_depth && + if (!op_is_flush(data->cmd_flags) && + e->type->ops.mq.limit_depth && !(data->flags & BLK_MQ_REQ_RESERVED)) - e->type->ops.mq.limit_depth(op, data); + e->type->ops.mq.limit_depth(data->cmd_flags, data); } else { blk_mq_tag_busy(data->hctx); } @@ -376,8 +378,8 @@ static struct request *blk_mq_get_request(struct request_queue *q, return NULL; } - rq = blk_mq_rq_ctx_init(data, tag, op); - if (!op_is_flush(op)) { + rq = blk_mq_rq_ctx_init(data, tag, data->cmd_flags); + if (!op_is_flush(data->cmd_flags)) { rq->elv.icq = NULL; if (e && e->type->ops.mq.prepare_request) { if (e->type->icq_cache && rq_ioc(bio)) @@ -394,7 +396,7 @@ static struct request *blk_mq_get_request(struct request_queue *q, struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op, blk_mq_req_flags_t flags) { - struct blk_mq_alloc_data alloc_data = { .flags = flags }; + struct blk_mq_alloc_data alloc_data = { .flags = flags, .cmd_flags = op }; struct request *rq; int ret; @@ -402,7 +404,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op, if (ret) return ERR_PTR(ret); - rq = blk_mq_get_request(q, NULL, op, &alloc_data); + rq = blk_mq_get_request(q, NULL, &alloc_data); blk_queue_exit(q); if (!rq) @@ -420,7 +422,7 @@ EXPORT_SYMBOL(blk_mq_alloc_request); struct request *blk_mq_alloc_request_hctx(struct request_queue *q, unsigned int op, blk_mq_req_flags_t flags, unsigned int 
hctx_idx) { - struct blk_mq_alloc_data alloc_data = { .flags = flags }; + struct blk_mq_alloc_data alloc_data = { .flags = flags, .cmd_flags = op }; struct request *rq; unsigned int cpu; int ret; @@ -453,7 +455,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q, cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask); alloc_data.ctx = __blk_mq_get_ctx(q, cpu); - rq = blk_mq_get_request(q, NULL, op, &alloc_data); + rq = blk_mq_get_request(q, NULL, &alloc_data); blk_queue_exit(q); if (!rq) @@ -467,7 +469,7 @@ static void __blk_mq_free_request(struct request *rq) { struct request_queue *q = rq->q; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); const int sched_tag = rq->internal_tag; blk_pm_mark_last_busy(rq); @@ -484,7 +486,7 @@ void blk_mq_free_request(struct request *rq) struct request_queue *q = rq->q; struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->cmd_flags, ctx->cpu); if (rq->rq_flags & RQF_ELVPRIV) { if (e && e->type->ops.mq.finish_request) @@ -976,8 +978,9 @@ bool blk_mq_get_driver_tag(struct request *rq) { struct blk_mq_alloc_data data = { .q = rq->q, - .hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu), + .hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu), .flags = BLK_MQ_REQ_NOWAIT, + .cmd_flags = rq->cmd_flags, }; bool shared; @@ -1141,7 +1144,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list, rq = list_first_entry(list, struct request, queuelist); - hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu); + hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu); if (!got_budget && !blk_mq_get_dispatch_budget(hctx)) break; @@ -1572,7 +1575,8 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct 
request *rq, void blk_mq_request_bypass_insert(struct request *rq, bool run_queue) { struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, + ctx->cpu); spin_lock(&hctx->lock); list_add_tail(&rq->queuelist, &hctx->dispatch); @@ -1782,7 +1786,8 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq) int srcu_idx; blk_qc_t unused_cookie; struct blk_mq_ctx *ctx = rq->mq_ctx; - struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, + ctx->cpu); hctx_lock(hctx, &srcu_idx); ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true); @@ -1816,7 +1821,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { const int is_sync = op_is_sync(bio->bi_opf); const int is_flush_fua = op_is_flush(bio->bi_opf); - struct blk_mq_alloc_data data = { .flags = 0 }; + struct blk_mq_alloc_data data = { .flags = 0, .cmd_flags = bio->bi_opf }; struct request *rq; unsigned int request_count = 0; struct blk_plug *plug; @@ -1839,7 +1844,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) rq_qos_throttle(q, bio, NULL); - rq = blk_mq_get_request(q, bio, bio->bi_opf, &data); + rq = blk_mq_get_request(q, bio, &data); if (unlikely(!rq)) { rq_qos_cleanup(q, bio); if (bio->bi_opf & REQ_NOWAIT) @@ -1908,6 +1913,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) if (same_queue_rq) { data.hctx = blk_mq_map_queue(q, + same_queue_rq->cmd_flags, same_queue_rq->mq_ctx->cpu); blk_mq_try_issue_directly(data.hctx, same_queue_rq, &cookie); @@ -2262,7 +2268,7 @@ static void blk_mq_init_cpu_queues(struct request_queue *q, * Set local node, IFF we have more than one hw queue. 
If * not, we remain on the home node of the device */ - hctx = blk_mq_map_queue(q, i); + hctx = blk_mq_map_queue_type(q, 0, i); if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE) hctx->numa_node = local_memory_node(cpu_to_node(i)); } @@ -2335,7 +2341,7 @@ static void blk_mq_map_swqueue(struct request_queue *q) } ctx = per_cpu_ptr(q->queue_ctx, i); - hctx = blk_mq_map_queue(q, i); + hctx = blk_mq_map_queue_type(q, 0, i); cpumask_set_cpu(i, hctx->cpumask); ctx->index_hw = hctx->nr_ctx; diff --git a/block/blk-mq.h b/block/blk-mq.h index 79c300faa7ce..55428b92c019 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -73,7 +73,8 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int); static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, - int cpu) + unsigned int flags, + int cpu) { struct blk_mq_tag_set *set = q->tag_set; @@ -83,7 +84,7 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q, int type, int cpu) { - return blk_mq_map_queue(q, cpu); + return blk_mq_map_queue(q, type, cpu); } /* @@ -134,6 +135,7 @@ struct blk_mq_alloc_data { struct request_queue *q; blk_mq_req_flags_t flags; unsigned int shallow_depth; + unsigned int cmd_flags; /* input & output parameter */ struct blk_mq_ctx *ctx; @@ -208,7 +210,7 @@ static inline void blk_mq_put_driver_tag(struct request *rq) if (rq->tag == -1 || rq->internal_tag == -1) return; - hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu); + hctx = blk_mq_map_queue(rq->q, rq->cmd_flags, rq->mq_ctx->cpu); __blk_mq_put_driver_tag(hctx, rq); } diff --git a/block/blk.h b/block/blk.h index 2bf1cfeeb9c0..78ae94886acf 100644 --- a/block/blk.h +++ b/block/blk.h @@ -104,10 +104,10 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q) __clear_bit(flag, &q->queue_flags); } -static inline 
struct blk_flush_queue *blk_get_flush_queue( - struct request_queue *q, struct blk_mq_ctx *ctx) +static inline struct blk_flush_queue * +blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx) { - return blk_mq_map_queue(q, ctx->cpu)->fq; + return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx->cpu)->fq; } static inline void __blk_get_queue(struct request_queue *q) From patchwork Thu Oct 25 21:16:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656607 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 24E7214DE for ; Thu, 25 Oct 2018 21:16:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 19CDB2C463 for ; Thu, 25 Oct 2018 21:16:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0E38C2C64C; Thu, 25 Oct 2018 21:16:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8CA742C463 for ; Thu, 25 Oct 2018 21:16:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726448AbeJZFu7 (ORCPT ); Fri, 26 Oct 2018 01:50:59 -0400 Received: from mail-it1-f193.google.com ([209.85.166.193]:34474 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFu7 (ORCPT ); Fri, 26 Oct 2018 01:50:59 -0400 Received: by mail-it1-f193.google.com with SMTP id e81-v6so3054159itc.1 for ; Thu, 25 Oct 2018 14:16:40 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9JynGiYF6jE1B6dobU9g3isZq38bzidmWC1EEymc56M=; b=pF3fnpXtNTJddMktxezMPbGE8vIJPJSS9k05L4J/mN2JCAVPTP0YEslMIOfLn+sS3Z RofOnLz94L1N2tpHA5LxCcu67fKgjWIbYGxjY3naJY4EUUO/TTTNofCbd0+VWX8Ycw+B GG4WbjyWaDbubcQprNdA+KTjxcjYOXELQwgFIiH8WeFyluJaarU662eNrzyl58bs4T8N KHbqW2XDwSJOgW7fvkhjgKZFPjK77YBfWBD+wF5PHOJr0eNr3pzK8CvIxQbhdcF/WqH/ 3wHkVTzlrClWVRPoyS9VTfXmNARX9eN1APLcEPK40YmfF4HYGg0ggEP3pz4IX2YCw3mu EsOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9JynGiYF6jE1B6dobU9g3isZq38bzidmWC1EEymc56M=; b=YBl2CLkMTNBAmcKm6VU/u/a6EPOYiM/WVSBZMf2MjP+YVw3uqPTOg7GFJuG4YeV5EL jNBrY2oQHRDTufK2bPFB8p7ka4/a/kTgo5nabAhvjp32ncDuxGwOz+ul4CN0XQbpPO6x OOaaoDp8+uvc41BmWgfY7izT2EIYvmndbwax8XxMFowlNwainHj+HbV/Lf/YhgJ91AUt oRnxTF4oOfY3rCcxVoAHTMFT+s43AFhLb4GDz3GDK+1H6UHntsyX/Qg8u5pBBpt0xZgQ DkLzgpKn0prRZLVWbL4ayg8qYQtnGCxvEyBtgaPa82kGgH5qa04lUlZR7N11mnrfYLqQ S32Q== X-Gm-Message-State: AGRZ1gLf+yF7xSXYEJbf4ULgStWpg9hV12U2eW+DNjvj0ZuL1DmuH/QO fQS1cwMStTTJXo5TVRT0J1ApSfNz/BBpmA== X-Google-Smtp-Source: AJdET5fSYSZeliIAbjevWpeJcn5mdtyKQqHoKt+ki6PAukjDrZLeXHxcC0jsAbx7OsjLzJNZH0jTtA== X-Received: by 2002:a24:6fc6:: with SMTP id x189-v6mr2068494itb.149.1540502199395; Thu, 25 Oct 2018 14:16:39 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.37 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:38 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 05/14] blk-mq: allow software queue to map to multiple hardware queues Date: Thu, 25 Oct 2018 15:16:17 -0600 Message-Id: <20181025211626.12692-6-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 
In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The mapping used to be dependent on just the CPU location, but now it's a tuple of { type, cpu } instead. This is a prep patch for allowing a single software queue to map to multiple hardware queues. No functional changes in this patch. Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq-sched.c | 2 +- block/blk-mq.c | 18 ++++++++++++------ block/blk-mq.h | 2 +- block/kyber-iosched.c | 6 +++--- include/linux/blk-mq.h | 3 ++- 5 files changed, 19 insertions(+), 12 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 8125e9393ec2..d232ecf3290c 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -110,7 +110,7 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx) { - unsigned idx = ctx->index_hw; + unsigned short idx = ctx->index_hw[hctx->type]; if (++idx == hctx->nr_ctx) idx = 0; diff --git a/block/blk-mq.c b/block/blk-mq.c index e6ea7da99125..fab84c6bda18 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -75,14 +75,18 @@ static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx) static void blk_mq_hctx_mark_pending(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx) { - if (!sbitmap_test_bit(&hctx->ctx_map, ctx->index_hw)) - sbitmap_set_bit(&hctx->ctx_map, ctx->index_hw); + const int bit = ctx->index_hw[hctx->type]; + + if (!sbitmap_test_bit(&hctx->ctx_map, bit)) + sbitmap_set_bit(&hctx->ctx_map, bit); } static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx) { - sbitmap_clear_bit(&hctx->ctx_map, ctx->index_hw); + const int bit = ctx->index_hw[hctx->type]; + + sbitmap_clear_bit(&hctx->ctx_map, bit); } struct
mq_inflight { @@ -954,7 +958,7 @@ static bool dispatch_rq_from_ctx(struct sbitmap *sb, unsigned int bitnr, struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *start) { - unsigned off = start ? start->index_hw : 0; + unsigned off = start ? start->index_hw[hctx->type] : 0; struct dispatch_rq_data data = { .hctx = hctx, .rq = NULL, @@ -2342,10 +2346,12 @@ static void blk_mq_map_swqueue(struct request_queue *q) ctx = per_cpu_ptr(q->queue_ctx, i); hctx = blk_mq_map_queue_type(q, 0, i); - + hctx->type = 0; cpumask_set_cpu(i, hctx->cpumask); - ctx->index_hw = hctx->nr_ctx; + ctx->index_hw[hctx->type] = hctx->nr_ctx; hctx->ctxs[hctx->nr_ctx++] = ctx; + /* wrap */ + BUG_ON(!hctx->nr_ctx); } mutex_unlock(&q->sysfs_lock); diff --git a/block/blk-mq.h b/block/blk-mq.h index 55428b92c019..7b5a790acdbf 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -17,7 +17,7 @@ struct blk_mq_ctx { } ____cacheline_aligned_in_smp; unsigned int cpu; - unsigned int index_hw; + unsigned short index_hw[HCTX_MAX_TYPES]; /* incremented at dispatch time */ unsigned long rq_dispatched[2]; diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c index 728757a34fa0..b824a639d5d4 100644 --- a/block/kyber-iosched.c +++ b/block/kyber-iosched.c @@ -576,7 +576,7 @@ static bool kyber_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio) { struct kyber_hctx_data *khd = hctx->sched_data; struct blk_mq_ctx *ctx = blk_mq_get_ctx(hctx->queue); - struct kyber_ctx_queue *kcq = &khd->kcqs[ctx->index_hw]; + struct kyber_ctx_queue *kcq = &khd->kcqs[ctx->index_hw[hctx->type]]; unsigned int sched_domain = kyber_sched_domain(bio->bi_opf); struct list_head *rq_list = &kcq->rq_list[sched_domain]; bool merged; @@ -602,7 +602,7 @@ static void kyber_insert_requests(struct blk_mq_hw_ctx *hctx, list_for_each_entry_safe(rq, next, rq_list, queuelist) { unsigned int sched_domain = kyber_sched_domain(rq->cmd_flags); - struct kyber_ctx_queue *kcq = &khd->kcqs[rq->mq_ctx->index_hw]; + struct 
kyber_ctx_queue *kcq = &khd->kcqs[rq->mq_ctx->index_hw[hctx->type]]; struct list_head *head = &kcq->rq_list[sched_domain]; spin_lock(&kcq->lock); @@ -611,7 +611,7 @@ static void kyber_insert_requests(struct blk_mq_hw_ctx *hctx, else list_move_tail(&rq->queuelist, head); sbitmap_set_bit(&khd->kcq_map[sched_domain], - rq->mq_ctx->index_hw); + rq->mq_ctx->index_hw[hctx->type]); blk_mq_sched_request_inserted(rq); spin_unlock(&kcq->lock); } diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index c992069bb3ee..72b36faf182d 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -37,7 +37,8 @@ struct blk_mq_hw_ctx { struct blk_mq_ctx *dispatch_from; unsigned int dispatch_busy; - unsigned int nr_ctx; + unsigned short type; + unsigned short nr_ctx; struct blk_mq_ctx **ctxs; spinlock_t dispatch_wait_lock; From patchwork Thu Oct 25 21:16:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656609 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 81FB714BB for ; Thu, 25 Oct 2018 21:16:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 77A372C463 for ; Thu, 25 Oct 2018 21:16:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6BF752C64C; Thu, 25 Oct 2018 21:16:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 201892C463 for ; Thu, 25 Oct 2018 21:16:43 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726531AbeJZFvB (ORCPT ); Fri, 26 Oct 2018 01:51:01 -0400 Received: from mail-io1-f65.google.com ([209.85.166.65]:35288 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFvB (ORCPT ); Fri, 26 Oct 2018 01:51:01 -0400 Received: by mail-io1-f65.google.com with SMTP id 79-v6so6422532iou.2 for ; Thu, 25 Oct 2018 14:16:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=xZnC+BVSRuJNQgU8jEvRnKFlWPiEfUI4d/DjqRMHU1U=; b=pfNC0eIhV5/202rzWjREobYEsOeYnXHgZBGvhpNypRtgm/KnjaXXmexLPStGSs7cQx 5aAmjkFRVnNnJEMm0lWmCZtHzCF+ynGJuXhbO9EPiFp4XXhm1UwKKaG3dByKNRphHX/P /pjPOqqakSUvs83H5Z2D71Vs0FM0Dv9JpDty6epgPYDfI+QaoIXDEpKjxMjMRanZnx8/ beG6xt6yaDfQZZK7TaNxACjtCzuYlDsCfVBwKCziLMD11jsba1KnSgjIOQnH6yTHwKox JJhx9KHUw6M+ic8oJTc3QyYr+RDPl6IXwsnubkHNBEbjr3bMJPr0y9JIXA1NCA2c7x8i IyLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=xZnC+BVSRuJNQgU8jEvRnKFlWPiEfUI4d/DjqRMHU1U=; b=FT/iMgNeZvhPzZ2rs+wvZ+Osf9a5ZzCOq+fO8rci+muJOrE36ChjE23Y2LWZRDthgq FgSzydZ7jGdpgZu/dEWIEeyck/Ho2RI8LuHql4jCL95qR7f5b4TTmN50+lxoIlxWAJro 4Nl9dwWw0ycJ44v3aqoAqJPGOK9tewmx0Vxa1ffu+4TN0HewZvgCeMhN+hJ+Ff1O/j+k G+XwVcxK56t3gGcbQaNhva0SnyrBZq7JiS3t+BrdZTxBXBTdMukwHbgzUU0llQN7FV5k 5mIaOChrdxRcB8+SOeVMkKxAAq/WG3+KhHLZjrMSuwUqsMI4Pcq4l6+VcJl6QHWiGG1k O1/g== X-Gm-Message-State: AGRZ1gKjfdgxrQL87HG6rbTMr8sXhIFWUbHjeWy4y49RmI/9TKR7yIDi iUbje6SknNYezrDpfFFRW0odnixo6Qne/g== X-Google-Smtp-Source: AJdET5cApLdZL5bzAtAzd2ZnD1ujHT3jbLtV8MsQLhqfEKA6IF9Ckf9cH8iLHvJSv6kxA9HmPNuNSg== X-Received: by 2002:a6b:c6cc:: with SMTP id w195-v6mr566820iof.149.1540502201169; Thu, 25 Oct 2018 14:16:41 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with 
ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:39 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 06/14] blk-mq: add 'type' attribute to the sysfs hctx directory Date: Thu, 25 Oct 2018 15:16:18 -0600 Message-Id: <20181025211626.12692-7-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP It can be useful for a user to verify what type a given hardware queue is, so expose this information in sysfs. Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq-sysfs.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index aafb44224c89..2d737f9e7ba7 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -161,6 +161,11 @@ static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page) return ret; } +static ssize_t blk_mq_hw_sysfs_type_show(struct blk_mq_hw_ctx *hctx, char *page) +{ + return sprintf(page, "%u\n", hctx->type); +} + static struct attribute *default_ctx_attrs[] = { NULL, }; @@ -177,11 +182,16 @@ static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_cpus = { .attr = {.name = "cpu_list", .mode = 0444 }, .show = blk_mq_hw_sysfs_cpus_show, }; +static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_type = { + .attr = {.name = "type", .mode = 0444 }, + .show = blk_mq_hw_sysfs_type_show, +}; static struct attribute *default_hw_ctx_attrs[] = { &blk_mq_hw_sysfs_nr_tags.attr, &blk_mq_hw_sysfs_nr_reserved_tags.attr, &blk_mq_hw_sysfs_cpus.attr, + &blk_mq_hw_sysfs_type.attr, NULL, }; From patchwork Thu Oct 25 21:16:19 2018 Content-Type: text/plain;
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656611 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1178014BB for ; Thu, 25 Oct 2018 21:16:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 059BF2C463 for ; Thu, 25 Oct 2018 21:16:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EE0CA2C64C; Thu, 25 Oct 2018 21:16:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 458F32C463 for ; Thu, 25 Oct 2018 21:16:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726570AbeJZFvD (ORCPT ); Fri, 26 Oct 2018 01:51:03 -0400 Received: from mail-it1-f195.google.com ([209.85.166.195]:36875 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFvD (ORCPT ); Fri, 26 Oct 2018 01:51:03 -0400 Received: by mail-it1-f195.google.com with SMTP id e74-v6so3646076ita.2 for ; Thu, 25 Oct 2018 14:16:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=mqzn9XDBXtKf4KZ/4v5khLALvSNKmSDKvcEJg0ybLk4=; b=OjCbVbDtePQo3qTZ3js9B/UqPv/H4f7s+Cpoqbjl3KjY44IO57XnUvFdiXatnfraJo ydiFgtNAcegecQ3w9xp+3WfweLXmEMNa/chWEc5JvSDGxUkKd8bDw5z7S33WvxC13iY+ FgrhkPXOA9BzgCQdywZTWMUpvchnSq2hkZB/rBTJZ+LvoCobMEi/ojN7dKf2m46vLuW0 
/rgSXe9zbRgAsPZDI+TyvfD0HTFIw9AkK57fVpNF/mddRqSrM1XxqBaIXC1WkY/QGnb4 tJWdWYULIbXKOFSrGAyPZeuityx8vsukBXWi21xc6Z9Evx0EDqSWVWCRxfHcM5tKOOp3 sGog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=mqzn9XDBXtKf4KZ/4v5khLALvSNKmSDKvcEJg0ybLk4=; b=Wz5DAWEWJRxtrkoJb9rVUlWlu7pMf0GSgPkUipze5k/Ozu3WI8bzzEhkyscHr9Xtry nYbXz0YcYvbf9KVBJc0vrChfvKpuistncakQT2/xrnvs1IcVCLnQuOzAbykL2T4oe5D8 1YyZCL8qYplSn0ibGg6ovqn93pxGquYJ3mc0unrmHEUJ70/QDzFpUElbzBA1ePqkhdoe SYRd6ndNIzfD/yY6SdyZ/AWevfGKTVn31R4kfmelecWPLaBqmC2LekCDodB35I1QMKPj rIlnGbZG2WD1xg2/YIyVIzc7+jMvob/Fs9RYbiYdZIujUh88/WiZKHHMEtOXJd8Br0GN u4lg== X-Gm-Message-State: AGRZ1gKFFuEJQBDgnY2BLvuWVt270UvME/x7i3+3ei4CJ3WE9KsxSsK2 yIQdDxS8pfiseE23pXHvoNn+ExYiMH4FXg== X-Google-Smtp-Source: AJdET5flC5IZ7Tglb8EUe1sfYPUuMAXhzV0BxWo3WQj5xlQ9H94FluYiIrMk58oAcxgdJmVTBeIs0g== X-Received: by 2002:a02:6605:: with SMTP id k5-v6mr692228jac.96.1540502202943; Thu, 25 Oct 2018 14:16:42 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:41 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 07/14] blk-mq: support multiple hctx maps Date: Thu, 25 Oct 2018 15:16:19 -0600 Message-Id: <20181025211626.12692-8-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add support for the tag set carrying multiple queue maps, and for the driver to inform blk-mq how many it wishes to support through setting set->nr_maps. 
This adds an mq_ops helper, mq_ops->flags_to_type(), for drivers that support more than one map. The function takes the request/bio flags and the CPU, and returns the queue map index to use for them. We then use the type information in blk_mq_map_queue() to index the map set. Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq.c | 85 ++++++++++++++++++++++++++++-------------- block/blk-mq.h | 19 ++++++---- include/linux/blk-mq.h | 7 ++++ 3 files changed, 76 insertions(+), 35 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index fab84c6bda18..0fab36372ace 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2257,7 +2257,8 @@ static int blk_mq_init_hctx(struct request_queue *q, static void blk_mq_init_cpu_queues(struct request_queue *q, unsigned int nr_hw_queues) { - unsigned int i; + struct blk_mq_tag_set *set = q->tag_set; + unsigned int i, j; for_each_possible_cpu(i) { struct blk_mq_ctx *__ctx = per_cpu_ptr(q->queue_ctx, i); @@ -2272,9 +2273,11 @@ static void blk_mq_init_cpu_queues(struct request_queue *q, * Set local node, IFF we have more than one hw queue.
If * not, we remain on the home node of the device */ - hctx = blk_mq_map_queue_type(q, 0, i); - if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE) - hctx->numa_node = local_memory_node(cpu_to_node(i)); + for (j = 0; j < set->nr_maps; j++) { + hctx = blk_mq_map_queue_type(q, j, i); + if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE) + hctx->numa_node = local_memory_node(cpu_to_node(i)); + } } } @@ -2309,7 +2312,7 @@ static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set, static void blk_mq_map_swqueue(struct request_queue *q) { - unsigned int i, hctx_idx; + unsigned int i, j, hctx_idx; struct blk_mq_hw_ctx *hctx; struct blk_mq_ctx *ctx; struct blk_mq_tag_set *set = q->tag_set; @@ -2345,13 +2348,23 @@ static void blk_mq_map_swqueue(struct request_queue *q) } ctx = per_cpu_ptr(q->queue_ctx, i); - hctx = blk_mq_map_queue_type(q, 0, i); - hctx->type = 0; - cpumask_set_cpu(i, hctx->cpumask); - ctx->index_hw[hctx->type] = hctx->nr_ctx; - hctx->ctxs[hctx->nr_ctx++] = ctx; - /* wrap */ - BUG_ON(!hctx->nr_ctx); + for (j = 0; j < set->nr_maps; j++) { + hctx = blk_mq_map_queue_type(q, j, i); + hctx->type = j; + + /* + * If the CPU is already set in the mask, then we've + * mapped this one already. This can happen if + * devices share queues across queue maps. 
+ */ + if (cpumask_test_cpu(i, hctx->cpumask)) + continue; + cpumask_set_cpu(i, hctx->cpumask); + ctx->index_hw[hctx->type] = hctx->nr_ctx; + hctx->ctxs[hctx->nr_ctx++] = ctx; + /* wrap */ + BUG_ON(!hctx->nr_ctx); + } } mutex_unlock(&q->sysfs_lock); @@ -2519,6 +2532,7 @@ struct request_queue *blk_mq_init_sq_queue(struct blk_mq_tag_set *set, memset(set, 0, sizeof(*set)); set->ops = ops; set->nr_hw_queues = 1; + set->nr_maps = 1; set->queue_depth = queue_depth; set->numa_node = NUMA_NO_NODE; set->flags = set_flags; @@ -2798,6 +2812,8 @@ static int blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set) static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) { if (set->ops->map_queues) { + int i; + /* * transport .map_queues is usually done in the following * way: @@ -2805,18 +2821,21 @@ static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) * for (queue = 0; queue < set->nr_hw_queues; queue++) { * mask = get_cpu_mask(queue) * for_each_cpu(cpu, mask) - * set->map.mq_map[cpu] = queue; + * set->map[x].mq_map[cpu] = queue; * } * * When we need to remap, the table has to be cleared for * killing stale mapping since one CPU may not be mapped * to any hw queue. */ - blk_mq_clear_mq_map(&set->map[0]); + for (i = 0; i < set->nr_maps; i++) + blk_mq_clear_mq_map(&set->map[i]); return set->ops->map_queues(set); - } else + } else { + BUG_ON(set->nr_maps > 1); return blk_mq_map_queues(&set->map[0]); + } } /* @@ -2827,7 +2846,7 @@ static int blk_mq_update_queue_map(struct blk_mq_tag_set *set) */ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) { - int ret; + int i, ret; BUILD_BUG_ON(BLK_MQ_MAX_DEPTH > 1 << BLK_MQ_UNIQUE_TAG_BITS); @@ -2850,6 +2869,11 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) set->queue_depth = BLK_MQ_MAX_DEPTH; } + if (!set->nr_maps) + set->nr_maps = 1; + else if (set->nr_maps > HCTX_MAX_TYPES) + return -EINVAL; + /* * If a crashdump is active, then we are potentially in a very * memory constrained environment. 
Limit us to 1 queue and @@ -2871,12 +2895,14 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) return -ENOMEM; ret = -ENOMEM; - set->map[0].mq_map = kcalloc_node(nr_cpu_ids, - sizeof(*set->map[0].mq_map), - GFP_KERNEL, set->numa_node); - if (!set->map[0].mq_map) - goto out_free_tags; - set->map[0].nr_queues = set->nr_hw_queues; + for (i = 0; i < set->nr_maps; i++) { + set->map[i].mq_map = kcalloc_node(nr_cpu_ids, + sizeof(struct blk_mq_queue_map), + GFP_KERNEL, set->numa_node); + if (!set->map[i].mq_map) + goto out_free_mq_map; + set->map[i].nr_queues = set->nr_hw_queues; + } ret = blk_mq_update_queue_map(set); if (ret) @@ -2892,9 +2918,10 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) return 0; out_free_mq_map: - kfree(set->map[0].mq_map); - set->map[0].mq_map = NULL; -out_free_tags: + for (i = 0; i < set->nr_maps; i++) { + kfree(set->map[i].mq_map); + set->map[i].mq_map = NULL; + } kfree(set->tags); set->tags = NULL; return ret; @@ -2903,13 +2930,15 @@ EXPORT_SYMBOL(blk_mq_alloc_tag_set); void blk_mq_free_tag_set(struct blk_mq_tag_set *set) { - int i; + int i, j; for (i = 0; i < nr_cpu_ids; i++) blk_mq_free_map_and_requests(set, i); - kfree(set->map[0].mq_map); - set->map[0].mq_map = NULL; + for (j = 0; j < set->nr_maps; j++) { + kfree(set->map[j].mq_map); + set->map[j].mq_map = NULL; + } kfree(set->tags); set->tags = NULL; diff --git a/block/blk-mq.h b/block/blk-mq.h index 7b5a790acdbf..e27c6f8dc86c 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -72,19 +72,24 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, */ extern int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int); -static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, - unsigned int flags, - int cpu) +static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q, + int type, int cpu) { struct blk_mq_tag_set *set = q->tag_set; - return q->queue_hw_ctx[set->map[0].mq_map[cpu]]; + return 
q->queue_hw_ctx[set->map[type].mq_map[cpu]]; } -static inline struct blk_mq_hw_ctx *blk_mq_map_queue_type(struct request_queue *q, - int type, int cpu) +static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q, + unsigned int flags, + int cpu) { - return blk_mq_map_queue(q, type, cpu); + int type = 0; + + if (q->mq_ops->flags_to_type) + type = q->mq_ops->flags_to_type(q, flags); + + return blk_mq_map_queue_type(q, type, cpu); } /* diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 72b36faf182d..7e792ffb09bb 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -86,6 +86,7 @@ enum { struct blk_mq_tag_set { struct blk_mq_queue_map map[HCTX_MAX_TYPES]; + unsigned int nr_maps; const struct blk_mq_ops *ops; unsigned int nr_hw_queues; unsigned int queue_depth; /* max hw supported */ @@ -109,6 +110,7 @@ struct blk_mq_queue_data { typedef blk_status_t (queue_rq_fn)(struct blk_mq_hw_ctx *, const struct blk_mq_queue_data *); +typedef int (flags_to_type_fn)(struct request_queue *, unsigned int); typedef bool (get_budget_fn)(struct blk_mq_hw_ctx *); typedef void (put_budget_fn)(struct blk_mq_hw_ctx *); typedef enum blk_eh_timer_return (timeout_fn)(struct request *, bool); @@ -132,6 +134,11 @@ struct blk_mq_ops { */ queue_rq_fn *queue_rq; + /* + * Return a queue map type for the given request/bio flags + */ + flags_to_type_fn *flags_to_type; + /* * Reserve budget before queue request, once .queue_rq is * run, it is driver's responsibility to release the From patchwork Thu Oct 25 21:16:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656613 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C33BD14DE for ; Thu, 25 Oct 2018 21:16:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B898E2C463 for ; Thu, 25 Oct 2018 21:16:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AD22C2C64C; Thu, 25 Oct 2018 21:16:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4AEB02C463 for ; Thu, 25 Oct 2018 21:16:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726772AbeJZFvF (ORCPT ); Fri, 26 Oct 2018 01:51:05 -0400 Received: from mail-io1-f65.google.com ([209.85.166.65]:36596 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFvF (ORCPT ); Fri, 26 Oct 2018 01:51:05 -0400 Received: by mail-io1-f65.google.com with SMTP id o19-v6so6416193iod.3 for ; Thu, 25 Oct 2018 14:16:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=2AxXXjDvz7PdPK5/6oCpBd7E+v9TzQsK40+6be4tCM0=; b=m9lSPjQ5R/TBDJBoxpHQquLHiI6ZUbHkmOH+4aI63eJ398QOy/CfzkaTZtPgfylqCx somXJkZU5rwIhr7gP6ccEMAQB91zblLz6YrfPtvA5IcYVsmSerlxbhhKjBshVHAgpKMu jJ4DNxJBArojNm4xDyyJKRrCA8MJPbvDr9jDFcDv+KMLNhbKk2isAgHn2rWYnuSUla6r 27AHxjM83rBJBtpXApf9HLkG+H/Nwv3yUPet9yRl1P5TYi6pfTh0g/YKSXaAvDqMR8eX 2tenZk1WMyrW4LX7NVWezkoyOJPV8CuVw/1rhz7u25EwZgo7NV4vjqpVLzMsSo9WOoVs x1/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=2AxXXjDvz7PdPK5/6oCpBd7E+v9TzQsK40+6be4tCM0=; b=fqvGd1NTRAAkupn6dHWD+jJ25zZBEBPzCW8uIzBzXdksBTjsywDbd+hLQCyC5sa5hw 
V/LMdiHXqLcAtMYVp92jwd+xY4pcBahzqD39rcmGaDxIJ8GU9sUPjxxXgnNXUluR7xVr XBLKh24kvGH1aiM0lywYuvowkbOwENhcegBQKjv0GBPRT3rleNhhMcrWPBiSTzMIdztl a5uOVpaQU+DE0yzBBj/RJvkfT0U1NtjLdt6f2aQorcqkHtf73kBbetyl95MsqA3F4DMH BW2HqESSWWJ7LrTDH3l/FlVZc/AmF9dqsvllSd65qn63Vx2OB0AOZFDirESZxLL7sDLc 7f0w== X-Gm-Message-State: AGRZ1gKfkvFuXQ6Cjm77o173aFp0Z2pFU/YpIH1P9z8JVBkfBbhqeW/U Vd2JYU71VGnUR1rrDGSmrM85cnaDvT0HnA== X-Google-Smtp-Source: AJdET5dHbdGArOwqp60b7rm5KkJPbPBmNrJxJUGoA6QH28D5Qa6sIraKE/N7yU0JVaG9EnFv1+lQ8w== X-Received: by 2002:a6b:b487:: with SMTP id d129-v6mr560831iof.131.1540502205461; Thu, 25 Oct 2018 14:16:45 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:43 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 08/14] blk-mq: separate number of hardware queues from nr_cpu_ids Date: Thu, 25 Oct 2018 15:16:20 -0600 Message-Id: <20181025211626.12692-9-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP With multiple maps, nr_cpu_ids is no longer the maximum number of hardware queues we support on a given device. The initializer of the tag_set can have set ->nr_hw_queues larger than the available number of CPUs, since we can exceed that with multiple queue maps.
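For illustration only (not part of the patch), the sizing rule described above can be sketched in user space. The struct tag_set_sketch and the function name below are hypothetical stand-ins for the two blk_mq_tag_set fields involved, and nr_cpu_ids is passed in as a plain parameter rather than read from the kernel:

```c
#include <assert.h>

/* Hypothetical stand-in for the two blk_mq_tag_set fields used here. */
struct tag_set_sketch {
	unsigned int nr_maps;      /* number of queue maps the driver uses */
	unsigned int nr_hw_queues; /* hardware queues requested by the driver */
};

/*
 * With a single map there is no use for more hardware queues than CPUs,
 * so the CPU count is the upper bound. With several maps the driver may
 * have asked for more queues than CPUs, so size for the larger value.
 */
unsigned int max_hw_queues_sketch(const struct tag_set_sketch *set,
				  unsigned int nr_cpu_ids)
{
	assert(set->nr_maps >= 1);

	if (set->nr_maps == 1)
		return nr_cpu_ids;

	return set->nr_hw_queues > nr_cpu_ids ? set->nr_hw_queues : nr_cpu_ids;
}
```

On an 8-CPU machine, a single-map set is sized for 8 queue slots whatever nr_hw_queues says, while a two-map set with nr_hw_queues = 12 is sized for 12.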
Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq.c | 28 +++++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 0fab36372ace..60a951c4934c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2663,6 +2663,19 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set, mutex_unlock(&q->sysfs_lock); } +/* + * Maximum number of queues we support. For single sets, we'll never have + * more than the CPUs (software queues). For multiple sets, the tag_set + * user may have set ->nr_hw_queues larger. + */ +static unsigned int nr_hw_queues(struct blk_mq_tag_set *set) +{ + if (set->nr_maps == 1) + return nr_cpu_ids; + + return max(set->nr_hw_queues, nr_cpu_ids); +} + struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, struct request_queue *q) { @@ -2682,7 +2695,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, /* init q->mq_kobj and sw queues' kobjects */ blk_mq_sysfs_init(q); - q->queue_hw_ctx = kcalloc_node(nr_cpu_ids, sizeof(*(q->queue_hw_ctx)), + q->nr_queues = nr_hw_queues(set); + q->queue_hw_ctx = kcalloc_node(q->nr_queues, sizeof(*(q->queue_hw_ctx)), GFP_KERNEL, set->numa_node); if (!q->queue_hw_ctx) goto err_percpu; @@ -2694,7 +2708,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, INIT_WORK(&q->timeout_work, blk_mq_timeout_work); blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ); - q->nr_queues = nr_cpu_ids; q->tag_set = set; q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT; @@ -2884,12 +2897,13 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) set->queue_depth = min(64U, set->queue_depth); } /* - * There is no use for more h/w queues than cpus. 
+ * There is no use for more h/w queues than cpus if we just have + * a single map */ - if (set->nr_hw_queues > nr_cpu_ids) + if (set->nr_maps == 1 && set->nr_hw_queues > nr_cpu_ids) set->nr_hw_queues = nr_cpu_ids; - set->tags = kcalloc_node(nr_cpu_ids, sizeof(struct blk_mq_tags *), + set->tags = kcalloc_node(nr_hw_queues(set), sizeof(struct blk_mq_tags *), GFP_KERNEL, set->numa_node); if (!set->tags) return -ENOMEM; @@ -2932,7 +2946,7 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set) { int i, j; - for (i = 0; i < nr_cpu_ids; i++) + for (i = 0; i < nr_hw_queues(set); i++) blk_mq_free_map_and_requests(set, i); for (j = 0; j < set->nr_maps; j++) { @@ -3064,7 +3078,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, lockdep_assert_held(&set->tag_list_lock); - if (nr_hw_queues > nr_cpu_ids) + if (set->nr_maps == 1 && nr_hw_queues > nr_cpu_ids) nr_hw_queues = nr_cpu_ids; if (nr_hw_queues < 1 || nr_hw_queues == set->nr_hw_queues) return; From patchwork Thu Oct 25 21:16:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656615 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E440714DE for ; Thu, 25 Oct 2018 21:16:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D64192C463 for ; Thu, 25 Oct 2018 21:16:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C48EA2C656; Thu, 25 Oct 2018 21:16:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org 
[209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5850D2C463 for ; Thu, 25 Oct 2018 21:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727250AbeJZFvH (ORCPT ); Fri, 26 Oct 2018 01:51:07 -0400 Received: from mail-io1-f68.google.com ([209.85.166.68]:37442 "EHLO mail-io1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFvH (ORCPT ); Fri, 26 Oct 2018 01:51:07 -0400 Received: by mail-io1-f68.google.com with SMTP id k17-v6so6419050ioc.4 for ; Thu, 25 Oct 2018 14:16:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=+c7Pr+f64gmPa03Ptxfz2xO9kvQtiyGDeig20u/3rFI=; b=gMIVkhNhFdlzuoEHLIR/iDWckrNbx4BpeAQrC+8x/HFZljSfoBM0girz4TCWmav5XI c1492Ss5IR7dD2c3oyNLTNsoYBhh24u6uFewTkogJTh7ZyMPTi0Ezh6WDwnbDKyUV4Fm Bu6hEBtiZhal8o7b7KWYES/RErYkHJAe/cgboX5AHLDj1v++OrkqGnLEl6MaNCW4V5gi jpmrLQBvidJvtTrCTqySvvF4/wqZTbLvr+EgL3d9nsejDy2QICRBwqvS1JfvmZYCamCF PJp7PbpBSzyiJS4yF7uotzlGr8plBoM/u0j/b+7EkVk1vhtH5iCK9hxh38qCu1M66t6a FVcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=+c7Pr+f64gmPa03Ptxfz2xO9kvQtiyGDeig20u/3rFI=; b=QvqmFEeI00yu67V1EkawMAkM5zIoNvbbJk7j64DRKrB4bG1WAXHoxuO6q8rODCRm+Q vZS0Tt7ukrIlWQR7UAXprjbjhmnXj9fdyVERmyud4hYEfh9/pWfWbCKZ6mbpSiQLso0L HRyrw7Z1kNa/Ratl88nGKH/RGsXN8fnFPeUQkf7LvLjHP/MxbDIJHP9MymswAYFmLVmD 0+cGsqwrL6ngdPMdJhAF0HIC0D8G6LYNj446QbVZt7auuo2y9R3VdkhJizGunYvdXAfR j/NVocIgbwOWHLKynuDnsZywWLEhlzbe6ibR7x2U3TXRF+OiQFmiSn+v9FdCtFr4wBep iZ4g== X-Gm-Message-State: AGRZ1gLnSkv6YUB94u5YItsMhes80bfbXV9GWQ9M9r6KyGAVeJA7NPAk Az0MqE/KyaaUhUj4oGzdYMQusPXHMVQqqw== X-Google-Smtp-Source: AJdET5e2/U/xmsCLOoEf8QlNC784FscW7fyBCeyG/TxYQLuQdC5FUivTfTgPQ+rvpg1tdoibHUQgow== X-Received: by 2002:a6b:4117:: with SMTP id 
n23-v6mr599001ioa.150.1540502207188; Thu, 25 Oct 2018 14:16:47 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.45 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:45 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 09/14] blk-mq: ensure that plug lists don't straddle hardware queues Date: Thu, 25 Oct 2018 15:16:21 -0600 Message-Id: <20181025211626.12692-10-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Since we insert per hardware queue, we have to ensure that every request on the plug list being inserted belongs to the same hardware queue. Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq.c | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 60a951c4934c..52b07188b39a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1621,6 +1621,27 @@ static int plug_ctx_cmp(void *priv, struct list_head *a, struct list_head *b) blk_rq_pos(rqa) < blk_rq_pos(rqb))); } +/* + * Need to ensure that the hardware queue matches, so we don't submit + * a list of requests that end up on different hardware queues. 
+ */ +static bool ctx_match(struct request *req, struct blk_mq_ctx *ctx, + unsigned int flags) +{ + if (req->mq_ctx != ctx) + return false; + + /* + * If we just have one map, then we know the hctx will match + * if the ctx matches + */ + if (req->q->tag_set->nr_maps == 1) + return true; + + return blk_mq_map_queue(req->q, req->cmd_flags, ctx->cpu) == + blk_mq_map_queue(req->q, flags, ctx->cpu); +} + void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) { struct blk_mq_ctx *this_ctx; @@ -1628,7 +1649,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) struct request *rq; LIST_HEAD(list); LIST_HEAD(ctx_list); - unsigned int depth; + unsigned int depth, this_flags; list_splice_init(&plug->mq_list, &list); @@ -1636,13 +1657,14 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) this_q = NULL; this_ctx = NULL; + this_flags = 0; depth = 0; while (!list_empty(&list)) { rq = list_entry_rq(list.next); list_del_init(&rq->queuelist); BUG_ON(!rq->q); - if (rq->mq_ctx != this_ctx) { + if (!ctx_match(rq, this_ctx, this_flags)) { if (this_ctx) { trace_block_unplug(this_q, depth, !from_schedule); blk_mq_sched_insert_requests(this_q, this_ctx, @@ -1650,6 +1672,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) from_schedule); } + this_flags = rq->cmd_flags; this_ctx = rq->mq_ctx; this_q = rq->q; depth = 0; From patchwork Thu Oct 25 21:16:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656617 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1D3914BB for ; Thu, 25 Oct 2018 21:16:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E62E82C463 for ; Thu, 25 Oct 2018 21:16:51 +0000 (UTC) 
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D1C6E2C64C; Thu, 25 Oct 2018 21:16:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 70DAE2C463 for ; Thu, 25 Oct 2018 21:16:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726149AbeJZFvJ (ORCPT ); Fri, 26 Oct 2018 01:51:09 -0400 Received: from mail-io1-f65.google.com ([209.85.166.65]:40221 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbeJZFvJ (ORCPT ); Fri, 26 Oct 2018 01:51:09 -0400 Received: by mail-io1-f65.google.com with SMTP id a23-v6so6415144iod.7 for ; Thu, 25 Oct 2018 14:16:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9LrmkFD+aQRBlC5/q6dcnEeOWB1KK8o1CyOk8gYLMr4=; b=beIx9rWo/P0GtMI4nVlV7zUq1/Z02WJWbnX1AyigJ6+3NODSjAvHqTxwgLVb+rQlxg GpTJgpigZBq6n1NJr03QlOPOU/b2pUIu61S5rJx07PcXfsNoS5bjhzyXj7RcHd5ENBFn nXND9Z3oE4z1xoN5uk0lFfvzC89Qnti4d58iEq0FbsfT4rU1FQVMDOVT+PpYimGozzep 6Ivd53hrpyP0NLTswsQa9pCX74ncD5/7Dzu7kvoqEBfezqJRVVDme3B+DXIgdvFO7O9m LzpPC60QU1KufNKfmBXg3QowVo6N8f3o5L7bs7qxCVjNe7E8R87bpWM1lZjBPd81Yz2c dIFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9LrmkFD+aQRBlC5/q6dcnEeOWB1KK8o1CyOk8gYLMr4=; b=F7OzJCTvH+w+qpSmxj1Je+FvCoOXVTHjtS3FSNo242V4pBFvYkZm1voqx+pnSLVJOt KEl+IQKY7EJwk3PvTa5MfBOvMj7n4ledop6X+K0TMIFXKWnDagbyX3H/i7kdTsZ0Asg7 d92J5/d5xW/aQJhIO9kx2WqlTWb/2LAn8EgPS+vkzC2HZhXO2IWcmkyTPWb3+SNCi86D 
hxQiyA+i8H/VC+AUN6OIgKBRnxeCiUGeRdL2xdL2WH+OZKXwy52vpWMSzaKPiiqdkVbU DuPCuPH5Et8CnvkeZZHaxRj7m2cuNiBuva8Ikyui0tpawmgiiLu88T7Szv/ddQ2QHovy TAJQ== X-Gm-Message-State: AGRZ1gL8txTLqVPAi5H8VActCZaNl4Wvn/n0Rm4m+jeyqJbVYEDn7oAn QkZS7ARNp5DFxU5t8hSDCepNrv8TjkAEAA== X-Google-Smtp-Source: AJdET5fGMahKTlB7Hei8bxkje1fOErKDW1WCJuE2bfz89C6q3kqCjBfBaPgjwJRJvUFaxPN/58pAcA== X-Received: by 2002:a6b:710b:: with SMTP id q11-v6mr510345iog.138.1540502209126; Thu, 25 Oct 2018 14:16:49 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.47 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:47 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe Subject: [PATCH 10/14] blk-mq: initial support for multiple queue maps Date: Thu, 25 Oct 2018 15:16:22 -0600 Message-Id: <20181025211626.12692-11-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a queue offset to the tag map. This enables users to map iteratively, for each queue map type they support. Bump maximum number of supported maps to 2, we're now fully able to support more than 1 map. 
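For illustration only (not part of the patch), here is a user-space sketch of what the queue offset does when two maps are laid out back to back. The struct queue_map_sketch and the function name are hypothetical. Only the arithmetic is taken from the cpu_to_queue_index() change in this patch:

```c
#include <assert.h>

/* Hypothetical stand-in for the blk_mq_queue_map fields used here. */
struct queue_map_sketch {
	unsigned int nr_queues;    /* queues owned by this map */
	unsigned int queue_offset; /* index of this map's first hardware queue */
};

/*
 * Same arithmetic as cpu_to_queue_index() after this patch: spread CPUs
 * over the map's queues, then shift into the map's slice of the
 * hardware queue index space.
 */
unsigned int cpu_to_queue_index_sketch(const struct queue_map_sketch *qmap,
				       unsigned int cpu)
{
	assert(qmap->nr_queues > 0);

	return qmap->queue_offset + (cpu % qmap->nr_queues);
}
```

Assuming a first map with 4 queues at offset 0 and a second map with 2 queues at offset 4, CPU 5 lands on hardware queue 1 through the first map and on hardware queue 5 through the second, so the two maps never hand out the same queue index.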
Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- block/blk-mq-cpumap.c | 9 +++++---- block/blk-mq-pci.c | 2 +- block/blk-mq-virtio.c | 2 +- include/linux/blk-mq.h | 3 ++- 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 6e6686c55984..03a534820271 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -14,9 +14,10 @@ #include "blk.h" #include "blk-mq.h" -static int cpu_to_queue_index(unsigned int nr_queues, const int cpu) +static int cpu_to_queue_index(struct blk_mq_queue_map *qmap, + unsigned int nr_queues, const int cpu) { - return cpu % nr_queues; + return qmap->queue_offset + (cpu % nr_queues); } static int get_first_sibling(unsigned int cpu) @@ -44,11 +45,11 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap) * performace optimizations. */ if (cpu < nr_queues) { - map[cpu] = cpu_to_queue_index(nr_queues, cpu); + map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu); } else { first_sibling = get_first_sibling(cpu); if (first_sibling == cpu) - map[cpu] = cpu_to_queue_index(nr_queues, cpu); + map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu); else map[cpu] = map[first_sibling]; } diff --git a/block/blk-mq-pci.c b/block/blk-mq-pci.c index 40333d60a850..1dce18553984 100644 --- a/block/blk-mq-pci.c +++ b/block/blk-mq-pci.c @@ -43,7 +43,7 @@ int blk_mq_pci_map_queues(struct blk_mq_queue_map *qmap, struct pci_dev *pdev, goto fallback; for_each_cpu(cpu, mask) - qmap->mq_map[cpu] = queue; + qmap->mq_map[cpu] = qmap->queue_offset + queue; } return 0; diff --git a/block/blk-mq-virtio.c b/block/blk-mq-virtio.c index 661fbfef480f..370827163835 100644 --- a/block/blk-mq-virtio.c +++ b/block/blk-mq-virtio.c @@ -44,7 +44,7 @@ int blk_mq_virtio_map_queues(struct blk_mq_queue_map *qmap, goto fallback; for_each_cpu(cpu, mask) - qmap->mq_map[cpu] = queue; + qmap->mq_map[cpu] = qmap->queue_offset + queue; } return 0; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 
7e792ffb09bb..250b9ed86cd4 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -78,10 +78,11 @@ struct blk_mq_hw_ctx { struct blk_mq_queue_map { unsigned int *mq_map; unsigned int nr_queues; + unsigned int queue_offset; }; enum { - HCTX_MAX_TYPES = 1, + HCTX_MAX_TYPES = 2, }; struct blk_mq_tag_set { From patchwork Thu Oct 25 21:16:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656619 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A72A14DE for ; Thu, 25 Oct 2018 21:16:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4F2CB2C63D for ; Thu, 25 Oct 2018 21:16:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 42D5B2C64C; Thu, 25 Oct 2018 21:16:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D000E2C463 for ; Thu, 25 Oct 2018 21:16:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726065AbeJZFvL (ORCPT ); Fri, 26 Oct 2018 01:51:11 -0400 Received: from mail-io1-f65.google.com ([209.85.166.65]:43158 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727265AbeJZFvL (ORCPT ); Fri, 26 Oct 2018 01:51:11 -0400 Received: by mail-io1-f65.google.com with SMTP id y10-v6so6390135ioa.10 for ; Thu, 25 Oct 2018 14:16:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=RdTXcoUkTq5K7EWsZETMPuhG8fwGUygYDmWiHM5puJA=; b=Ly4bcscD6GbJCU7u76fx+8s8P4RCvOwcwJT5eRNpl6oZoKjfoLhBIM36L9RmsV1/6S iPLeVklwGnoYEVi74HNR2YbckDz4aeU6gGLWfPzNKFKl9sRHwFL47vQC5ce04fsxUU4E mfNZ4tfC8pmTplQnDNiThB5pGFWU1SwLKrdFyQKHuK7OxD4zOc9iUS7vq1Z6HpD+vHnD Vi1pKp/sonPsfbaIGSlcrGxiEx0NeNEzHx4tftVPJkCy0d6S2oj4Ah0aBjfPWCdelW+3 MsQbxdHRW8HPdhsfQFRz/jHKXcqY8t07sB54SaAqJLoum1WdU2nwnByi6E9jUn0bn2k6 ckpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=RdTXcoUkTq5K7EWsZETMPuhG8fwGUygYDmWiHM5puJA=; b=rW4kW+qiVOf1M4E2qrCVosCy/N8MbuKsg4blkkXr7d9mhYxWtbfKOyXmR/hJBI3hCl V1WX5kyy4mb4UXehxm/9Bo+ryRVly7k33EnbmixbG5S5kjJ8Qk7SAV6jsNBPcZahMY38 V675IWFlqoEth2clHwejkAYBesqTISsO2gGsd04jvRX4m56Rxs6DkVCExaGVMdeu39sk 8hguydsliXWnwDvKIGCI2cLvTQ+buQOBae+Jx8Jr7DWftsE3yulsQ4FHVmd9QRv5ghbv P9FuVv6Epqi3ZN75WVqYPrZAYtiHQ2P6AL6ieFl4+QLAVnfUd4qfHJ76qL39mUt7rK5Y uphA== X-Gm-Message-State: AGRZ1gLdJJKfm1I/FFtskLQtUXxmlmPqOm9+/109tdqHUX2BhW+LWKG6 VTS5680PmA918FmpoC6ORqVHJ+b37VbNzQ== X-Google-Smtp-Source: AJdET5cgnM6sQbxJI0Q4G2lSyYMQgtEy3+C0fQnrbJ+j+6BoQ6Hi9E5iJpj1WP7e6Qe0oRWlvhRqsg== X-Received: by 2002:a6b:6201:: with SMTP id f1-v6mr576095iog.11.1540502210964; Thu, 25 Oct 2018 14:16:50 -0700 (PDT) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id l23-v6sm2831890ioj.40.2018.10.25.14.16.49 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Oct 2018 14:16:49 -0700 (PDT) From: Jens Axboe To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org Cc: Jens Axboe , Thomas Gleixner , linux-kernel@vger.kernel.org Subject: [PATCH 11/14] irq: add support for allocating (and affinitizing) sets of IRQs Date: Thu, 25 Oct 2018 15:16:23 -0600 Message-Id: <20181025211626.12692-12-axboe@kernel.dk> X-Mailer: git-send-email 
2.17.1 In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk> References: <20181025211626.12692-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP A driver may have a need to allocate multiple sets of MSI/MSI-X interrupts, and have them appropriately affinitized. Add support for defining a number of sets in the irq_affinity structure, of varying sizes, and get each set affinitized correctly across the machine. Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Signed-off-by: Jens Axboe Reviewed-by: Hannes Reinecke --- include/linux/interrupt.h | 4 ++++ kernel/irq/affinity.c | 31 +++++++++++++++++++++++++------ 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index eeceac3376fc..9fce2131902c 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -247,10 +247,14 @@ struct irq_affinity_notify { * the MSI(-X) vector space * @post_vectors: Don't apply affinity to @post_vectors at end of * the MSI(-X) vector space + * @nr_sets: Length of passed in *sets array + * @sets: Number of affinitized sets */ struct irq_affinity { int pre_vectors; int post_vectors; + int nr_sets; + int *sets; }; #if defined(CONFIG_SMP) diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c index f4f29b9d90ee..0055e252e438 100644 --- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c @@ -180,6 +180,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd) int curvec, usedvecs; cpumask_var_t nmsk, npresmsk, *node_to_cpumask; struct cpumask *masks = NULL; + int i, nr_sets; /* * If there aren't any vectors left after applying the pre/post @@ -210,10 +211,23 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd) get_online_cpus(); build_node_to_cpumask(node_to_cpumask); - /* Spread on present CPUs starting from affd->pre_vectors */ - usedvecs = 
irq_build_affinity_masks(affd, curvec, affvecs, - node_to_cpumask, cpu_present_mask, - nmsk, masks); + /* + * Spread on present CPUs starting from affd->pre_vectors. If we + * have multiple sets, build each sets affinity mask separately. + */ + nr_sets = affd->nr_sets; + if (!nr_sets) + nr_sets = 1; + + for (i = 0, usedvecs = 0; i < nr_sets; i++) { + int this_vecs = affd->sets ? affd->sets[i] : affvecs; + int nr; + + nr = irq_build_affinity_masks(affd, curvec, this_vecs, + node_to_cpumask, cpu_present_mask, + nmsk, masks + usedvecs); + usedvecs += nr; + } /* * Spread on non present CPUs starting from the next vector to be @@ -258,13 +272,18 @@ int irq_calc_affinity_vectors(int minvec, int maxvec, const struct irq_affinity { int resv = affd->pre_vectors + affd->post_vectors; int vecs = maxvec - resv; + int i, set_vecs; int ret; if (resv > minvec) return 0; get_online_cpus(); - ret = min_t(int, cpumask_weight(cpu_possible_mask), vecs) + resv; + ret = min_t(int, cpumask_weight(cpu_possible_mask), vecs); put_online_cpus(); - return ret; + + for (i = 0, set_vecs = 0; i < affd->nr_sets; i++) + set_vecs += affd->sets[i]; + + return resv + max(ret, set_vecs); } From patchwork Thu Oct 25 21:16:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10656621 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1503614DE for ; Thu, 25 Oct 2018 21:16:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0A80F2C463 for ; Thu, 25 Oct 2018 21:16:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F362A2C64C; Thu, 25 Oct 2018 21:16:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: 
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Cc: Jens Axboe
Subject: [PATCH 12/14] nvme: utilize two queue maps, one for reads and one for writes
Date: Thu, 25 Oct 2018 15:16:24 -0600
Message-Id: <20181025211626.12692-13-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk>
References: <20181025211626.12692-1-axboe@kernel.dk>

NVMe does round-robin between queues by default, which means that
sharing a queue map for both reads and writes can be problematic
in terms of read servicing. It's much easier to flood the queue
with writes and reduce the read servicing.

Implement two queue maps, one for reads and one for writes. The
write queue count is configurable through the 'write_queues'
parameter.

By default, we retain the previous behavior of having a single
queue set, shared between reads and writes. Setting 'write_queues'
to a non-zero value will create two queue sets, one for reads and
one for writes, the latter using the configurable number of
queues (hardware queue counts permitting).
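
As a rough illustration of the split described above, here is a
standalone userspace sketch. It is not the driver code: the names
split_rw_queues and struct rw_split are made up for this example, and
the only rules it assumes are the ones stated in the text (a write
count of zero means a shared set, and reads always keep at least one
queue).

```c
#include <assert.h>

/*
 * Hypothetical result type for this sketch only. The driver itself
 * keeps per-type counts in an array rather than in a struct like this.
 */
struct rw_split {
	unsigned int read_queues;
	unsigned int write_queues;	/* 0 means writes share the read set */
};

/*
 * Split nr_io_queues (must be >= 1) between reads and writes.
 * want_write is the requested number of write queues; 0 requests a
 * single set shared by reads and writes.
 */
struct rw_split split_rw_queues(unsigned int nr_io_queues,
				unsigned int want_write)
{
	struct rw_split s;

	assert(nr_io_queues >= 1);

	/* Always leave at least one queue for reads. */
	if (want_write >= nr_io_queues)
		want_write = nr_io_queues - 1;

	s.write_queues = want_write;
	s.read_queues = nr_io_queues - want_write;
	return s;
}
```

With 8 hardware queues and a request for 3 write queues this gives 5
read queues and 3 write queues. Asking for more write queues than the
hardware provides is clamped so that one read queue remains.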

Signed-off-by: Jens Axboe
Reviewed-by: Hannes Reinecke
---
 drivers/nvme/host/pci.c | 139 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 131 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e5d783cb6937..658c9a2f4114 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -74,11 +74,29 @@ static int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
 MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
 
+static int queue_count_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops queue_count_ops = {
+	.set = queue_count_set,
+	.get = param_get_int,
+};
+
+static int write_queues;
+module_param_cb(write_queues, &queue_count_ops, &write_queues, 0644);
+MODULE_PARM_DESC(write_queues,
+	"Number of queues to use for writes. If not set, reads and writes "
+	"will share a queue set.");
+
 struct nvme_dev;
 struct nvme_queue;
 
 static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
 
+enum {
+	NVMEQ_TYPE_READ,
+	NVMEQ_TYPE_WRITE,
+	NVMEQ_TYPE_NR,
+};
+
 /*
  * Represents an NVM Express device. Each nvme_dev is a PCI function.
  */
@@ -92,6 +110,7 @@ struct nvme_dev {
 	struct dma_pool *prp_small_pool;
 	unsigned online_queues;
 	unsigned max_qid;
+	unsigned io_queues[NVMEQ_TYPE_NR];
 	unsigned int num_vecs;
 	int q_depth;
 	u32 db_stride;
@@ -134,6 +153,17 @@ static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
 	return param_set_int(val, kp);
 }
 
+static int queue_count_set(const char *val, const struct kernel_param *kp)
+{
+	int n = 0, ret;
+
+	ret = kstrtoint(val, 10, &n);
+	if (n > num_possible_cpus())
+		n = num_possible_cpus();
+
+	return param_set_int(val, kp);
+}
+
 static inline unsigned int sq_idx(unsigned int qid, u32 stride)
 {
 	return qid * 2 * stride;
@@ -218,9 +248,20 @@ static inline void _nvme_check_size(void)
 	BUILD_BUG_ON(sizeof(struct nvme_dbbuf) != 64);
 }
 
+static unsigned int max_io_queues(void)
+{
+	return num_possible_cpus() + write_queues;
+}
+
+static unsigned int max_queue_count(void)
+{
+	/* IO queues + admin queue */
+	return 1 + max_io_queues();
+}
+
 static inline unsigned int nvme_dbbuf_size(u32 stride)
 {
-	return ((num_possible_cpus() + 1) * 8 * stride);
+	return (max_queue_count() * 8 * stride);
 }
 
 static int nvme_dbbuf_dma_alloc(struct nvme_dev *dev)
@@ -431,12 +472,41 @@ static int nvme_init_request(struct blk_mq_tag_set *set, struct request *req,
 	return 0;
 }
 
+static int queue_irq_offset(struct nvme_dev *dev)
+{
+	/* if we have more than 1 vec, admin queue offsets us 1 */
+	if (dev->num_vecs > 1)
+		return 1;
+
+	return 0;
+}
+
 static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 {
 	struct nvme_dev *dev = set->driver_data;
+	int i, qoff, offset;
+
+	offset = queue_irq_offset(dev);
+	for (i = 0, qoff = 0; i < set->nr_maps; i++) {
+		struct blk_mq_queue_map *map = &set->map[i];
+
+		map->nr_queues = dev->io_queues[i];
+		if (!map->nr_queues) {
+			BUG_ON(i == NVMEQ_TYPE_READ);
 
-	return blk_mq_pci_map_queues(&set->map[0], to_pci_dev(dev->dev),
-			dev->num_vecs > 1 ? 1 /* admin queue */ : 0);
+			/* shared set, reuse read set parameters */
+			map->nr_queues = dev->io_queues[NVMEQ_TYPE_READ];
+			qoff = 0;
+			offset = queue_irq_offset(dev);
+		}
+
+		map->queue_offset = qoff;
+		blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
+		qoff += map->nr_queues;
+		offset += map->nr_queues;
+	}
+
+	return 0;
 }
 
 /**
@@ -849,6 +919,14 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
+static int nvme_flags_to_type(struct request_queue *q, unsigned int flags)
+{
+	if ((flags & REQ_OP_MASK) == REQ_OP_READ)
+		return NVMEQ_TYPE_READ;
+
+	return NVMEQ_TYPE_WRITE;
+}
+
 static void nvme_pci_complete_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -1476,6 +1554,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
+	.flags_to_type	= nvme_flags_to_type,
 	.complete	= nvme_pci_complete_rq,
 	.init_hctx	= nvme_init_hctx,
 	.init_request	= nvme_init_request,
@@ -1888,18 +1967,53 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 	return ret;
 }
 
+static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int nr_io_queues)
+{
+	unsigned int this_w_queues = write_queues;
+
+	/*
+	 * Setup read/write queue split
+	 */
+	if (nr_io_queues == 1) {
+		dev->io_queues[NVMEQ_TYPE_READ] = 1;
+		dev->io_queues[NVMEQ_TYPE_WRITE] = 0;
+		return;
+	}
+
+	/*
+	 * If 'write_queues' is set, ensure it leaves room for at least
+	 * one read queue
+	 */
+	if (this_w_queues >= nr_io_queues)
+		this_w_queues = nr_io_queues - 1;
+
+	/*
+	 * If 'write_queues' is set to zero, reads and writes will share
+	 * a queue set.
+	 */
+	if (!this_w_queues) {
+		dev->io_queues[NVMEQ_TYPE_WRITE] = 0;
+		dev->io_queues[NVMEQ_TYPE_READ] = nr_io_queues;
+	} else {
+		dev->io_queues[NVMEQ_TYPE_WRITE] = this_w_queues;
+		dev->io_queues[NVMEQ_TYPE_READ] = nr_io_queues - this_w_queues;
+	}
+}
+
 static int nvme_setup_io_queues(struct nvme_dev *dev)
 {
 	struct nvme_queue *adminq = &dev->queues[0];
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int result, nr_io_queues;
 	unsigned long size;
-
+	int irq_sets[2];
 	struct irq_affinity affd = {
-		.pre_vectors = 1
+		.pre_vectors = 1,
+		.nr_sets = ARRAY_SIZE(irq_sets),
+		.sets = irq_sets,
 	};
 
-	nr_io_queues = num_possible_cpus();
+	nr_io_queues = max_io_queues();
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
 	if (result < 0)
 		return result;
@@ -1929,6 +2043,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	/* Deregister the admin queue's interrupt */
 	pci_free_irq(pdev, 0, adminq);
 
+	nvme_calc_io_queues(dev, nr_io_queues);
+	irq_sets[0] = dev->io_queues[NVMEQ_TYPE_READ];
+	irq_sets[1] = dev->io_queues[NVMEQ_TYPE_WRITE];
+	if (!irq_sets[1])
+		affd.nr_sets = 1;
+
 	/*
 	 * If we enable msix early due to not intx, disable it again before
 	 * setting up the full range we need.
@@ -1941,6 +2061,8 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	dev->num_vecs = result;
 	dev->max_qid = max(result - 1, 1);
 
+	nvme_calc_io_queues(dev, dev->max_qid);
+
 	/*
 	 * Should investigate if there's a performance win from allocating
 	 * more queues than interrupt vectors; it might allow the submission
@@ -2042,6 +2164,7 @@ static int nvme_dev_add(struct nvme_dev *dev)
 	if (!dev->ctrl.tagset) {
 		dev->tagset.ops = &nvme_mq_ops;
 		dev->tagset.nr_hw_queues = dev->online_queues - 1;
+		dev->tagset.nr_maps = NVMEQ_TYPE_NR;
 		dev->tagset.timeout = NVME_IO_TIMEOUT;
 		dev->tagset.numa_node = dev_to_node(dev->dev);
 		dev->tagset.queue_depth =
@@ -2489,8 +2612,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (!dev)
 		return -ENOMEM;
 
-	dev->queues = kcalloc_node(num_possible_cpus() + 1,
-			sizeof(struct nvme_queue), GFP_KERNEL, node);
+	dev->queues = kcalloc_node(max_queue_count(), sizeof(struct nvme_queue),
+					GFP_KERNEL, node);
 	if (!dev->queues)
 		goto free;
 

From patchwork Thu Oct 25 21:16:25 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10656623
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Cc: Jens Axboe
Subject: [PATCH 13/14] block: add REQ_HIPRI and inherit it from IOCB_HIPRI
Date: Thu, 25 Oct 2018 15:16:25 -0600
Message-Id: <20181025211626.12692-14-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk>
References: <20181025211626.12692-1-axboe@kernel.dk>

We use IOCB_HIPRI to poll for IO in the caller instead of scheduling.
This information is not available for (or after) IO submission. The
driver may make different queue choices based on the type of IO, so
make the fact that we will poll for this IO known to the lower layers
as well.
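
The inheritance step itself is small. The following standalone sketch
shows the idea under stated assumptions: the EX_* bit values are
invented for the example and are not the kernel's definitions of
IOCB_HIPRI, REQ_SYNC or REQ_HIPRI, and ex_inherit_hipri is a
hypothetical helper, not a function added by this patch.

```c
/*
 * Illustrative bit values only. The real flags live in the kernel
 * headers and do not use these constants.
 */
#define EX_IOCB_HIPRI	(1u << 0)	/* caller will poll for completion */
#define EX_REQ_SYNC	(1u << 1)
#define EX_REQ_HIPRI	(1u << 2)	/* polling hint seen by the driver */

/*
 * Return op_flags with the polling hint added when the submitter's
 * ki_flags ask for polled completion; otherwise return it unchanged.
 */
unsigned int ex_inherit_hipri(unsigned int ki_flags, unsigned int op_flags)
{
	if (ki_flags & EX_IOCB_HIPRI)
		op_flags |= EX_REQ_HIPRI;
	return op_flags;
}
```

Each submission path touched by the diff performs this same test
right before handing the bio to the block layer.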

Signed-off-by: Jens Axboe
Reviewed-by: Hannes Reinecke
---
 fs/block_dev.c            | 2 ++
 fs/direct-io.c            | 2 ++
 fs/iomap.c                | 9 ++++++++-
 include/linux/blk_types.h | 4 +++-
 4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 38b8ce05cbc7..8bb8090c57a7 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -232,6 +232,8 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 		bio.bi_opf = dio_bio_write_op(iocb);
 		task_io_account_write(ret);
 	}
+	if (iocb->ki_flags & IOCB_HIPRI)
+		bio.bi_opf |= REQ_HIPRI;
 
 	qc = submit_bio(&bio);
 	for (;;) {
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 093fb54cd316..ffb46b7aa5f7 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1265,6 +1265,8 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 	} else {
 		dio->op = REQ_OP_READ;
 	}
+	if (iocb->ki_flags & IOCB_HIPRI)
+		dio->op_flags |= REQ_HIPRI;
 
 	/*
 	 * For AIO O_(D)SYNC writes we need to defer completions to a workqueue
diff --git a/fs/iomap.c b/fs/iomap.c
index ec15cf2ec696..50ad8c8d1dcb 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1554,6 +1554,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 		unsigned len)
 {
 	struct page *page = ZERO_PAGE(0);
+	int flags = REQ_SYNC | REQ_IDLE;
 	struct bio *bio;
 
 	bio = bio_alloc(GFP_KERNEL, 1);
@@ -1562,9 +1563,12 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
+	if (dio->iocb->ki_flags & IOCB_HIPRI)
+		flags |= REQ_HIPRI;
+
 	get_page(page);
 	__bio_add_page(bio, page, len, 0);
-	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
+	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
 
 	atomic_inc(&dio->ref);
 	return submit_bio(bio);
@@ -1663,6 +1667,9 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 				bio_set_pages_dirty(bio);
 		}
 
+		if (dio->iocb->ki_flags & IOCB_HIPRI)
+			bio->bi_opf |= REQ_HIPRI;
+
 		iov_iter_advance(dio->submit.iter, n);
 
 		dio->size += n;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 093a818c5b68..d6c2558d6b73 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -322,6 +322,8 @@ enum req_flag_bits {
 	/* command specific flags for REQ_OP_WRITE_ZEROES: */
 	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
 
+	__REQ_HIPRI,
+
 	/* for driver use */
 	__REQ_DRV,
 	__REQ_SWAP,		/* swapping request. */
@@ -342,8 +344,8 @@ enum req_flag_bits {
 #define REQ_RAHEAD		(1ULL << __REQ_RAHEAD)
 #define REQ_BACKGROUND		(1ULL << __REQ_BACKGROUND)
 #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
-
 #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
+#define REQ_HIPRI		(1ULL << __REQ_HIPRI)
 
 #define REQ_DRV			(1ULL << __REQ_DRV)
 #define REQ_SWAP		(1ULL << __REQ_SWAP)

From patchwork Thu Oct 25 21:16:26 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10656625
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Cc: Jens Axboe
Subject: [PATCH 14/14] nvme: add separate poll queue map
Date: Thu, 25 Oct 2018 15:16:26 -0600
Message-Id: <20181025211626.12692-15-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181025211626.12692-1-axboe@kernel.dk>
References: <20181025211626.12692-1-axboe@kernel.dk>

Adds support for defining a variable number of poll queues, currently
configurable with the 'poll_queues' module parameter. Defaults to
a single poll queue.

And now we finally have poll support without triggering interrupts!

Signed-off-by: Jens Axboe
Reviewed-by: Hannes Reinecke
---
 drivers/nvme/host/pci.c | 103 +++++++++++++++++++++++++++++++++-------
 include/linux/blk-mq.h  |   2 +-
 2 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 658c9a2f4114..cce5d06f11c5 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -86,6 +86,10 @@ MODULE_PARM_DESC(write_queues,
 	"Number of queues to use for writes. If not set, reads and writes "
 	"will share a queue set.");
 
+static int poll_queues = 1;
+module_param_cb(poll_queues, &queue_count_ops, &poll_queues, 0644);
+MODULE_PARM_DESC(poll_queues, "Number of queues to use for polled IO.");
+
 struct nvme_dev;
 struct nvme_queue;
 
@@ -94,6 +98,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
 enum {
 	NVMEQ_TYPE_READ,
 	NVMEQ_TYPE_WRITE,
+	NVMEQ_TYPE_POLL,
 	NVMEQ_TYPE_NR,
 };
 
@@ -202,6 +207,7 @@ struct nvme_queue {
 	u16 last_cq_head;
 	u16 qid;
 	u8 cq_phase;
+	u8 polled;
 	u32 *dbbuf_sq_db;
 	u32 *dbbuf_cq_db;
 	u32 *dbbuf_sq_ei;
@@ -250,7 +256,7 @@ static inline void _nvme_check_size(void)
 
 static unsigned int max_io_queues(void)
 {
-	return num_possible_cpus() + write_queues;
+	return num_possible_cpus() + write_queues + poll_queues;
 }
 
 static unsigned int max_queue_count(void)
@@ -500,8 +506,15 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 			offset = queue_irq_offset(dev);
 		}
 
+		/*
+		 * The poll queue(s) doesn't have an IRQ (and hence IRQ
+		 * affinity), so use the regular blk-mq cpu mapping
+		 */
 		map->queue_offset = qoff;
-		blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
+		if (i != NVMEQ_TYPE_POLL)
+			blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
+		else
+			blk_mq_map_queues(map);
 		qoff += map->nr_queues;
 		offset += map->nr_queues;
 	}
@@ -892,7 +905,7 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	 * We should not need to do this, but we're still using this to
 	 * ensure we can drain requests on a dying queue.
 	 */
-	if (unlikely(nvmeq->cq_vector < 0))
+	if (unlikely(nvmeq->cq_vector < 0 && !nvmeq->polled))
 		return BLK_STS_IOERR;
 
 	ret = nvme_setup_cmd(ns, req, &cmnd);
@@ -921,6 +934,8 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 static int nvme_flags_to_type(struct request_queue *q, unsigned int flags)
 {
+	if (flags & REQ_HIPRI)
+		return NVMEQ_TYPE_POLL;
 	if ((flags & REQ_OP_MASK) == REQ_OP_READ)
 		return NVMEQ_TYPE_READ;
 
@@ -1094,7 +1109,10 @@ static int adapter_alloc_cq(struct nvme_dev *dev, u16 qid,
 		struct nvme_queue *nvmeq, s16 vector)
 {
 	struct nvme_command c;
-	int flags = NVME_QUEUE_PHYS_CONTIG | NVME_CQ_IRQ_ENABLED;
+	int flags = NVME_QUEUE_PHYS_CONTIG;
+
+	if (vector != -1)
+		flags |= NVME_CQ_IRQ_ENABLED;
 
 	/*
 	 * Note: we (ab)use the fact that the prp fields survive if no data
@@ -1106,7 +1124,10 @@ static int adapter_alloc_cq(struct nvme_dev *dev, u16 qid,
 	c.create_cq.cqid = cpu_to_le16(qid);
 	c.create_cq.qsize = cpu_to_le16(nvmeq->q_depth - 1);
 	c.create_cq.cq_flags = cpu_to_le16(flags);
-	c.create_cq.irq_vector = cpu_to_le16(vector);
+	if (vector != -1)
+		c.create_cq.irq_vector = cpu_to_le16(vector);
+	else
+		c.create_cq.irq_vector = 0;
 
 	return nvme_submit_sync_cmd(dev->ctrl.admin_q, &c, NULL, 0);
 }
@@ -1348,13 +1369,14 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 	int vector;
 
 	spin_lock_irq(&nvmeq->cq_lock);
-	if (nvmeq->cq_vector == -1) {
+	if (nvmeq->cq_vector == -1 && !nvmeq->polled) {
 		spin_unlock_irq(&nvmeq->cq_lock);
 		return 1;
 	}
 	vector = nvmeq->cq_vector;
 	nvmeq->dev->online_queues--;
 	nvmeq->cq_vector = -1;
+	nvmeq->polled = false;
 	spin_unlock_irq(&nvmeq->cq_lock);
 
 	/*
@@ -1366,7 +1388,8 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 	if (!nvmeq->qid && nvmeq->dev->ctrl.admin_q)
 		blk_mq_quiesce_queue(nvmeq->dev->ctrl.admin_q);
 
-	pci_free_irq(to_pci_dev(nvmeq->dev->dev), vector, nvmeq);
+	if (vector != -1)
+		pci_free_irq(to_pci_dev(nvmeq->dev->dev), vector, nvmeq);
 
 	return 0;
 }
@@ -1500,7 +1523,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
 	spin_unlock_irq(&nvmeq->cq_lock);
 }
 
-static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
+static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
 {
 	struct nvme_dev *dev = nvmeq->dev;
 	int result;
@@ -1510,7 +1533,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	 * A queue's vector matches the queue identifier unless the controller
 	 * has only one vector available.
 	 */
-	vector = dev->num_vecs == 1 ? 0 : qid;
+	if (!polled)
+		vector = dev->num_vecs == 1 ? 0 : qid;
+	else
+		vector = -1;
+
 	result = adapter_alloc_cq(dev, qid, nvmeq, vector);
 	if (result)
 		return result;
@@ -1527,15 +1554,20 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	 * xxx' warning if the create CQ/SQ command times out.
 	 */
 	nvmeq->cq_vector = vector;
+	nvmeq->polled = polled;
 	nvme_init_queue(nvmeq, qid);
-	result = queue_request_irq(nvmeq);
-	if (result < 0)
-		goto release_sq;
+
+	if (vector != -1) {
+		result = queue_request_irq(nvmeq);
+		if (result < 0)
+			goto release_sq;
+	}
 
 	return result;
 
 release_sq:
 	nvmeq->cq_vector = -1;
+	nvmeq->polled = false;
 	dev->online_queues--;
 	adapter_delete_sq(dev, qid);
 release_cq:
@@ -1686,7 +1718,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 
 static int nvme_create_io_queues(struct nvme_dev *dev)
 {
-	unsigned i, max;
+	unsigned i, max, rw_queues;
 	int ret = 0;
 
 	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
@@ -1697,8 +1729,17 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 	}
 
 	max = min(dev->max_qid, dev->ctrl.queue_count - 1);
+	if (max != 1 && dev->io_queues[NVMEQ_TYPE_POLL]) {
+		rw_queues = dev->io_queues[NVMEQ_TYPE_READ] +
+				dev->io_queues[NVMEQ_TYPE_WRITE];
+	} else {
+		rw_queues = max;
+	}
+
 	for (i = dev->online_queues; i <= max; i++) {
-		ret = nvme_create_queue(&dev->queues[i], i);
+		bool polled = i > rw_queues;
+
+		ret = nvme_create_queue(&dev->queues[i], i, polled);
 		if (ret)
 			break;
 	}
@@ -1970,6 +2011,7 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int nr_io_queues)
 {
 	unsigned int this_w_queues = write_queues;
+	unsigned int this_p_queues = poll_queues;
 
 	/*
 	 * Setup read/write queue split
@@ -1977,9 +2019,28 @@ static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int nr_io_queues)
 	if (nr_io_queues == 1) {
 		dev->io_queues[NVMEQ_TYPE_READ] = 1;
 		dev->io_queues[NVMEQ_TYPE_WRITE] = 0;
+		dev->io_queues[NVMEQ_TYPE_POLL] = 0;
 		return;
 	}
 
+	/*
+	 * Configure number of poll queues, if set
+	 */
+	if (this_p_queues) {
+		/*
+		 * We need at least one queue left. With just one queue, we'll
+		 * have a single shared read/write set.
+		 */
+		if (this_p_queues >= nr_io_queues) {
+			this_w_queues = 0;
+			this_p_queues = nr_io_queues - 1;
+		}
+
+		dev->io_queues[NVMEQ_TYPE_POLL] = this_p_queues;
+		nr_io_queues -= this_p_queues;
+	} else
+		dev->io_queues[NVMEQ_TYPE_POLL] = 0;
+
 	/*
 	 * If 'write_queues' is set, ensure it leaves room for at least
 	 * one read queue
@@ -2049,19 +2110,29 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	if (!irq_sets[1])
 		affd.nr_sets = 1;
 
+	/*
+	 * Need IRQs for read+write queues, and one for the admin queue
+	 */
+	nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
+
 	/*
 	 * If we enable msix early due to not intx, disable it again before
 	 * setting up the full range we need.
 	 */
 	pci_free_irq_vectors(pdev);
-	result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
+	result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues,
 			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
 	if (result <= 0)
 		return -EIO;
 	dev->num_vecs = result;
-	dev->max_qid = max(result - 1, 1);
+	result = max(result - 1, 1);
+	dev->max_qid = result + dev->io_queues[NVMEQ_TYPE_POLL];
 
 	nvme_calc_io_queues(dev, dev->max_qid);
+	dev_info(dev->ctrl.device, "%d/%d/%d r/w/p queues\n",
+					dev->io_queues[NVMEQ_TYPE_READ],
+					dev->io_queues[NVMEQ_TYPE_WRITE],
+					dev->io_queues[NVMEQ_TYPE_POLL]);
 
 	/*
 	 * Should investigate if there's a performance win from allocating
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 250b9ed86cd4..7bb77868371d 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -82,7 +82,7 @@ struct blk_mq_queue_map {
 };
 
 enum {
-	HCTX_MAX_TYPES = 2,
+	HCTX_MAX_TYPES = 3,
 };
 
 struct blk_mq_tag_set {
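
For readers who want to experiment with the three-way split outside
the kernel, the queue accounting and the type selection in this patch
can be approximated by the standalone sketch below. Everything
prefixed with ex_ or EX_ is hypothetical and exists only for this
illustration; in particular the flag bit values are invented and do
not match REQ_OP_MASK or REQ_HIPRI.

```c
#include <assert.h>

/* Hypothetical flag layout for the sketch; not the kernel's values. */
#define EX_OP_MASK	0xffu
#define EX_OP_READ	0u
#define EX_OP_WRITE	1u
#define EX_HIPRI	(1u << 8)

enum ex_queue_type { EX_TYPE_READ, EX_TYPE_WRITE, EX_TYPE_POLL, EX_TYPE_NR };

/* Polled IO takes priority over the read/write distinction. */
enum ex_queue_type ex_flags_to_type(unsigned int flags)
{
	if (flags & EX_HIPRI)
		return EX_TYPE_POLL;
	if ((flags & EX_OP_MASK) == EX_OP_READ)
		return EX_TYPE_READ;
	return EX_TYPE_WRITE;
}

/*
 * Carve poll queues out of nr_io_queues (must be >= 1) first, then
 * split what is left between reads and writes. counts[] receives one
 * entry per queue type; a write count of 0 means a shared set.
 */
void ex_calc_io_queues(unsigned int nr_io_queues, unsigned int want_write,
		       unsigned int want_poll, unsigned int counts[EX_TYPE_NR])
{
	assert(nr_io_queues >= 1);

	/* Keep at least one non-polled queue; reads and writes share it. */
	if (want_poll >= nr_io_queues) {
		want_write = 0;
		want_poll = nr_io_queues - 1;
	}
	counts[EX_TYPE_POLL] = want_poll;
	nr_io_queues -= want_poll;

	/* Of the remaining queues, keep at least one for reads. */
	if (want_write >= nr_io_queues)
		want_write = nr_io_queues - 1;
	counts[EX_TYPE_WRITE] = want_write;
	counts[EX_TYPE_READ] = nr_io_queues - want_write;
}
```

For 8 queues with 2 write queues and 1 poll queue requested, the
sketch yields 5 read, 2 write and 1 poll queue, which corresponds to
the "r/w/p queues" line the driver prints at setup time.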