From patchwork Fri Mar 9 03:32:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10269605 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 77DB5602C8 for ; Fri, 9 Mar 2018 03:32:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6A5082992E for ; Fri, 9 Mar 2018 03:32:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5E8B829952; Fri, 9 Mar 2018 03:32:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 723D129932 for ; Fri, 9 Mar 2018 03:32:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750920AbeCIDc4 (ORCPT ); Thu, 8 Mar 2018 22:32:56 -0500 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:49796 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750909AbeCIDcz (ORCPT ); Thu, 8 Mar 2018 22:32:55 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B64B2BD9E; Fri, 9 Mar 2018 03:32:54 +0000 (UTC) Received: from localhost (ovpn-12-83.pek2.redhat.com [10.72.12.83]) by smtp.corp.redhat.com (Postfix) with ESMTP id C5FAE202322A; Fri, 9 Mar 2018 03:32:45 +0000 (UTC) From: Ming Lei To: James Bottomley , Jens Axboe , "Martin K . Petersen" Cc: Christoph Hellwig , linux-scsi@vger.kernel.org, linux-block@vger.kernel.org, Meelis Roos , Don Brace , Kashyap Desai , Laurence Oberman , Mike Snitzer , Ming Lei , Hannes Reinecke , James Bottomley , Artem Bityutskiy Subject: [PATCH V4 1/4] scsi: hpsa: fix selection of reply queue Date: Fri, 9 Mar 2018 11:32:15 +0800 Message-Id: <20180309033218.23042-2-ming.lei@redhat.com> In-Reply-To: <20180309033218.23042-1-ming.lei@redhat.com> References: <20180309033218.23042-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 09 Mar 2018 03:32:54 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 09 Mar 2018 03:32:54 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'ming.lei@redhat.com' RCPT:'' Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs), one msix vector can be created without any online CPU mapped, then one command's completion may not be notified. This patch setups mapping between cpu and reply queue according to irq affinity info retrived by pci_irq_get_affinity(), and uses this mapping table to choose reply queue for queuing one command. Then the chosen reply queue has to be active, and fixes IO hang caused by using inactive reply queue which doesn't have any online CPU mapped. Cc: Hannes Reinecke Cc: "Martin K. Petersen" , Cc: James Bottomley , Cc: Christoph Hellwig , Cc: Don Brace Cc: Kashyap Desai Cc: Laurence Oberman Cc: Meelis Roos Cc: Artem Bityutskiy Cc: Mike Snitzer Tested-by: Laurence Oberman Tested-by: Don Brace Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs") Signed-off-by: Ming Lei Tested-by: Artem Bityutskiy Acked-by: Don Brace Tested-by: Don Brace --- drivers/scsi/hpsa.c | 73 +++++++++++++++++++++++++++++++++++++++-------------- drivers/scsi/hpsa.h | 1 + 2 files changed, 55 insertions(+), 19 deletions(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 5293e6827ce5..3a9eca163db8 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -1045,11 +1045,7 @@ static void set_performant_mode(struct ctlr_info *h, struct CommandList *c, c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1); if (unlikely(!h->msix_vectors)) return; - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - c->Header.ReplyQueue = - raw_smp_processor_id() % h->nreply_queues; - else - c->Header.ReplyQueue = reply_queue % h->nreply_queues; + c->Header.ReplyQueue = reply_queue; } } @@ -1063,10 +1059,7 @@ static void set_ioaccel1_performant_mode(struct ctlr_info *h, * Tell the controller to post the reply to the queue for this * processor. This seems to give the best I/O throughput. */ - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - cp->ReplyQueue = smp_processor_id() % h->nreply_queues; - else - cp->ReplyQueue = reply_queue % h->nreply_queues; + cp->ReplyQueue = reply_queue; /* * Set the bits in the address sent down to include: * - performant mode bit (bit 0) @@ -1087,10 +1080,7 @@ static void set_ioaccel2_tmf_performant_mode(struct ctlr_info *h, /* Tell the controller to post the reply to the queue for this * processor. This seems to give the best I/O throughput. */ - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - cp->reply_queue = smp_processor_id() % h->nreply_queues; - else - cp->reply_queue = reply_queue % h->nreply_queues; + cp->reply_queue = reply_queue; /* Set the bits in the address sent down to include: * - performant mode bit not used in ioaccel mode 2 * - pull count (bits 0-3) @@ -1109,10 +1099,7 @@ static void set_ioaccel2_performant_mode(struct ctlr_info *h, * Tell the controller to post the reply to the queue for this * processor. This seems to give the best I/O throughput. */ - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - cp->reply_queue = smp_processor_id() % h->nreply_queues; - else - cp->reply_queue = reply_queue % h->nreply_queues; + cp->reply_queue = reply_queue; /* * Set the bits in the address sent down to include: * - performant mode bit not used in ioaccel mode 2 @@ -1157,6 +1144,8 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info *h, { dial_down_lockup_detection_during_fw_flash(h, c); atomic_inc(&h->commands_outstanding); + + reply_queue = h->reply_map[raw_smp_processor_id()]; switch (c->cmd_type) { case CMD_IOACCEL1: set_ioaccel1_performant_mode(h, c, reply_queue); @@ -7376,6 +7365,26 @@ static void hpsa_disable_interrupt_mode(struct ctlr_info *h) h->msix_vectors = 0; } +static void hpsa_setup_reply_map(struct ctlr_info *h) +{ + const struct cpumask *mask; + unsigned int queue, cpu; + + for (queue = 0; queue < h->msix_vectors; queue++) { + mask = pci_irq_get_affinity(h->pdev, queue); + if (!mask) + goto fallback; + + for_each_cpu(cpu, mask) + h->reply_map[cpu] = queue; + } + return; + +fallback: + for_each_possible_cpu(cpu) + h->reply_map[cpu] = 0; +} + /* If MSI/MSI-X is supported by the kernel we will try to enable it on * controllers that are capable. If not, we use legacy INTx mode. */ @@ -7771,6 +7780,10 @@ static int hpsa_pci_init(struct ctlr_info *h) err = hpsa_interrupt_mode(h); if (err) goto clean1; + + /* setup mapping between CPU and reply queue */ + hpsa_setup_reply_map(h); + err = hpsa_pci_find_memory_BAR(h->pdev, &h->paddr); if (err) goto clean2; /* intmode+region, pci */ @@ -8480,6 +8493,28 @@ static struct workqueue_struct *hpsa_create_controller_wq(struct ctlr_info *h, return wq; } +static void hpda_free_ctlr_info(struct ctlr_info *h) +{ + kfree(h->reply_map); + kfree(h); +} + +static struct ctlr_info *hpda_alloc_ctlr_info(void) +{ + struct ctlr_info *h; + + h = kzalloc(sizeof(*h), GFP_KERNEL); + if (!h) + return NULL; + + h->reply_map = kzalloc(sizeof(*h->reply_map) * nr_cpu_ids, GFP_KERNEL); + if (!h->reply_map) { + kfree(h); + return NULL; + } + return h; +} + static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) { int dac, rc; @@ -8517,7 +8552,7 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) * the driver. See comments in hpsa.h for more info. */ BUILD_BUG_ON(sizeof(struct CommandList) % COMMANDLIST_ALIGNMENT); - h = kzalloc(sizeof(*h), GFP_KERNEL); + h = hpda_alloc_ctlr_info(); if (!h) { dev_err(&pdev->dev, "Failed to allocate controller head\n"); return -ENOMEM; @@ -8916,7 +8951,7 @@ static void hpsa_remove_one(struct pci_dev *pdev) h->lockup_detected = NULL; /* init_one 2 */ /* (void) pci_disable_pcie_error_reporting(pdev); */ /* init_one 1 */ - kfree(h); /* init_one 1 */ + hpda_free_ctlr_info(h); /* init_one 1 */ } static int hpsa_suspend(__attribute__((unused)) struct pci_dev *pdev, diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h index 018f980a701c..fb9f5e7f8209 100644 --- a/drivers/scsi/hpsa.h +++ b/drivers/scsi/hpsa.h @@ -158,6 +158,7 @@ struct bmic_controller_parameters { #pragma pack() struct ctlr_info { + unsigned int *reply_map; int ctlr; char devname[8]; char *product_name;