From patchwork Thu Dec 21 08:20:45 2017
X-Patchwork-Submitter: Sagi Grimberg
X-Patchwork-Id: 10127083
Subject: Re: [PATCH V2 6/6] nvme-pci: remove .init_request callback
To: Ming Lei, linux-nvme@lists.infradead.org, Christoph Hellwig,
    Jens Axboe, linux-block@vger.kernel.org
Cc: Bart Van Assche, Keith Busch, Yi Zhang, Johannes Thumshirn
References: <20171214023103.18272-1-ming.lei@redhat.com>
    <20171214023103.18272-7-ming.lei@redhat.com>
From: Sagi Grimberg
Message-ID: <0fcf3176-7864-8631-e293-2bfa7aacea38@grimberg.me>
In-Reply-To: <20171214023103.18272-7-ming.lei@redhat.com>
Date: Thu, 21 Dec 2017 10:20:45 +0200
List-ID: linux-block@vger.kernel.org

Ming,

I'd prefer that we make the pci driver match the rest of the drivers
in nvme. IMO it would be better to allocate the queues array once at
probe time and simply reuse it during the reset sequence.

Can this (untested) patch also fix the issue you're seeing?

--
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f26aaa393016..a8edb29dac16 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -75,7 +75,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
  * Represents an NVM Express device.  Each nvme_dev is a PCI function.
  */
 struct nvme_dev {
-	struct nvme_queue **queues;
+	struct nvme_queue *queues;
 	struct blk_mq_tag_set tagset;
 	struct blk_mq_tag_set admin_tagset;
 	u32 __iomem *dbs;
@@ -365,7 +365,7 @@ static int nvme_admin_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 				unsigned int hctx_idx)
 {
 	struct nvme_dev *dev = data;
-	struct nvme_queue *nvmeq = dev->queues[0];
+	struct nvme_queue *nvmeq = &dev->queues[0];
 
 	WARN_ON(hctx_idx != 0);
 	WARN_ON(dev->admin_tagset.tags[0] != hctx->tags);
@@ -387,7 +387,7 @@ static int nvme_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 			  unsigned int hctx_idx)
 {
 	struct nvme_dev *dev = data;
-	struct nvme_queue *nvmeq = dev->queues[hctx_idx + 1];
+	struct nvme_queue *nvmeq = &dev->queues[hctx_idx + 1];
 
 	if (!nvmeq->tags)
 		nvmeq->tags = &dev->tagset.tags[hctx_idx];
@@ -403,7 +403,7 @@ static int nvme_init_request(struct blk_mq_tag_set *set, struct request *req,
 	struct nvme_dev *dev = set->driver_data;
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	int queue_idx = (set == &dev->tagset) ? hctx_idx + 1 : 0;
-	struct nvme_queue *nvmeq = dev->queues[queue_idx];
+	struct nvme_queue *nvmeq = &dev->queues[queue_idx];
 
 	BUG_ON(!nvmeq);
 	iod->nvmeq = nvmeq;
@@ -1046,7 +1046,7 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, unsigned int tag)
 static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 {
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
-	struct nvme_queue *nvmeq = dev->queues[0];
+	struct nvme_queue *nvmeq = &dev->queues[0];
 	struct nvme_command c;
 
 	memset(&c, 0, sizeof(c));
@@ -1290,10 +1290,8 @@ static void nvme_free_queues(struct nvme_dev *dev, int lowest)
 	int i;
 
 	for (i = dev->ctrl.queue_count - 1; i >= lowest; i--) {
-		struct nvme_queue *nvmeq = dev->queues[i];
 		dev->ctrl.queue_count--;
-		dev->queues[i] = NULL;
-		nvme_free_queue(nvmeq);
+		nvme_free_queue(&dev->queues[i]);
 	}
 }
 
@@ -1325,7 +1323,7 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 
 static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
 {
-	struct nvme_queue *nvmeq = dev->queues[0];
+	struct nvme_queue *nvmeq = &dev->queues[0];
 
 	if (!nvmeq)
 		return;
@@ -1387,10 +1385,7 @@ static int nvme_alloc_sq_cmds(struct nvme_dev *dev, struct nvme_queue *nvmeq,
 static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid,
 							int depth, int node)
 {
-	struct nvme_queue *nvmeq = kzalloc_node(sizeof(*nvmeq), GFP_KERNEL,
-							node);
-	if (!nvmeq)
-		return NULL;
+	struct nvme_queue *nvmeq = &dev->queues[qid];
 
 	nvmeq->cqes = dma_zalloc_coherent(dev->dev, CQ_SIZE(depth),
 					  &nvmeq->cq_dma_addr, GFP_KERNEL);
@@ -1409,7 +1404,6 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid,
 	nvmeq->q_depth = depth;
 	nvmeq->qid = qid;
 	nvmeq->cq_vector = -1;
-	dev->queues[qid] = nvmeq;
 	dev->ctrl.queue_count++;
 
 	return nvmeq;
@@ -1592,7 +1586,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 	if (result < 0)
 		return result;
 
-	nvmeq = dev->queues[0];
+	nvmeq = &dev->queues[0];
 	if (!nvmeq) {
 		nvmeq = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH,
 					 dev_to_node(dev->dev));
@@ -1638,7 +1632,7 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 
 	max = min(dev->max_qid, dev->ctrl.queue_count - 1);
 	for (i = dev->online_queues; i <= max; i++) {
-		ret = nvme_create_queue(dev->queues[i], i);
+		ret = nvme_create_queue(&dev->queues[i], i);
 		if (ret)
 			break;
 	}
@@ -1894,7 +1888,7 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 
 static int nvme_setup_io_queues(struct nvme_dev *dev)
 {
-	struct nvme_queue *adminq = dev->queues[0];
+	struct nvme_queue *adminq = &dev->queues[0];
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int result, nr_io_queues;
 	unsigned long size;
@@ -2020,7 +2014,7 @@ static void nvme_disable_io_queues(struct nvme_dev *dev, int queues)
  retry:
 	timeout = ADMIN_TIMEOUT;
 	for (; i > 0; i--, sent++)
-		if (nvme_delete_queue(dev->queues[i], opcode))
+		if (nvme_delete_queue(&dev->queues[i], opcode))
 			break;
 
 	while (sent--) {
@@ -2209,7 +2203,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 
 	queues = dev->online_queues - 1;
 	for (i = dev->ctrl.queue_count - 1; i > 0; i--)
-		nvme_suspend_queue(dev->queues[i]);
+		nvme_suspend_queue(&dev->queues[i]);
 
 	if (dead) {
 		/* A device might become IO incapable very soon during
@@ -2217,7 +2211,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 		 * queue_count can be 0 here.
 		 */
 		if (dev->ctrl.queue_count)
-			nvme_suspend_queue(dev->queues[0]);
+			nvme_suspend_queue(&dev->queues[0]);
 	} else {
 		nvme_disable_io_queues(dev, queues);
 		nvme_disable_admin_queue(dev, shutdown);
@@ -2462,6 +2456,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	int node, result = -ENOMEM;
 	struct nvme_dev *dev;
 	unsigned long quirks = id->driver_data;
+	unsigned int alloc_size;
 
 	node = dev_to_node(&pdev->dev);
 	if (node == NUMA_NO_NODE)
@@ -2470,8 +2465,9 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	dev = kzalloc_node(sizeof(*dev), GFP_KERNEL, node);
 	if (!dev)
 		return -ENOMEM;
-	dev->queues = kzalloc_node((num_possible_cpus() + 1) * sizeof(void *),
-							GFP_KERNEL, node);
+
+	alloc_size = (num_possible_cpus() + 1) * sizeof(struct nvme_queue);
+	dev->queues = kzalloc_node(alloc_size, GFP_KERNEL, node);
 	if (!dev->queues)
 		goto free;
--
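The essence of the change above: instead of an array of pointers filled in
queue-by-queue with kzalloc_node(), the device keeps one flat array of queue
structs, sized for every possible queue and allocated once at probe. A minimal
standalone sketch of that pattern follows, in plain C with calloc() standing
in for kzalloc_node(); the types and function names (struct queue, struct dev,
dev_probe, queue_init) are invented for illustration, not the driver's own:

#include <stdlib.h>

struct queue {
	int qid;
	int depth;
};

struct dev {
	struct queue *queues;	/* one flat array of queue structs */
	int nr_queues;
};

/* Probe time: reserve a zeroed slot for every possible queue up front. */
static int dev_probe(struct dev *dev, int nr_cpus)
{
	dev->nr_queues = nr_cpus + 1;	/* admin queue + one I/O queue per CPU */
	dev->queues = calloc(dev->nr_queues, sizeof(*dev->queues));
	return dev->queues ? 0 : -1;
}

/* Reset/init time: take the slot's address instead of allocating a queue. */
static struct queue *queue_init(struct dev *dev, int qid, int depth)
{
	struct queue *q = &dev->queues[qid];

	q->qid = qid;
	q->depth = depth;
	return q;
}

Note the element size in the probe-time allocation: sizeof(*dev->queues) is a
full struct queue, not a pointer, since the slots now hold the queues
themselves. With this layout a queue's storage lives as long as the device, so
freeing a queue no longer needs to NULL its slot, and a reset can simply
re-initialize the same slot without reallocating.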