From patchwork Thu Aug 30 03:26:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suganath Prabu S
X-Patchwork-Id: 10581175
From: Suganath Prabu S
To: linux-scsi@vger.kernel.org
Cc: Sathya.Prakash@broadcom.com, sreekanth.reddy@broadcom.com,
 Suganath Prabu S
Subject: [PATCH 2/7] mpt3sas: Add HBA hot plug watchdog thread.
Date: Wed, 29 Aug 2018 23:26:28 -0400
Message-Id: <1535599593-4739-3-git-send-email-suganath-prabu.subramani@broadcom.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1535599593-4739-1-git-send-email-suganath-prabu.subramani@broadcom.com>
References: <1535599593-4739-1-git-send-email-suganath-prabu.subramani@broadcom.com>
X-Mailing-List: linux-scsi@vger.kernel.org

During driver load, create an HBA hot unplug watchdog thread,
"_base_hba_hot_unplug_work".
This thread polls once per second for HBA hot unplug by reading the
vendor field in the IOC's PCI configuration space. If hot unplug is
detected, it terminates all outstanding I/Os, and the kernel's PCIe
hotplug module (i.e. pciehp) will then clear the instances of the
hot-unplugged PCI device.

The functions below start and stop the watchdog:
	mpt3sas_base_start_hba_unplug_watchdog
	mpt3sas_base_stop_hba_unplug_watchdog

The watchdog thread starts as soon as the IOC becomes operational.

Signed-off-by: Suganath Prabu S
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 92 +++++++++++++++++++++++++++++++++++-
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  6 +++
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  7 +++
 3 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 7ef8daf..97e9939 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -69,6 +69,7 @@ static MPT_CALLBACK mpt_callbacks[MPT_MAX_CALLBACKS];
 
 
 #define FAULT_POLLING_INTERVAL 1000 /* in milliseconds */
+#define HBA_HOTUNPLUG_POLLING_INTERVAL 1000 /* in milliseconds */
 
  /* maximum controller queue depth */
 #define MAX_HBA_QUEUE_DEPTH	30000
@@ -672,6 +673,46 @@ _base_fault_reset_work(struct work_struct *work)
 	spin_unlock_irqrestore(&ioc->ioc_reset_in_progress_lock, flags);
 }
 
+static void
+_base_hba_hot_unplug_work(struct work_struct *work)
+{
+	struct MPT3SAS_ADAPTER *ioc =
+		container_of(work, struct MPT3SAS_ADAPTER,
+			hba_hot_unplug_work.work);
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags);
+	if (ioc->shost_recovery || ioc->pci_error_recovery)
+		goto rearm_timer;
+
+	if (mpt3sas_base_pci_device_is_unplugged(ioc)) {
+		if (ioc->remove_host) {
+			pr_err(MPT3SAS_FMT
+			    "The IOC seems hot unplugged and the driver is "
+			    "waiting for pciehp module to remove the PCIe "
+			    "device instance associated with IOC!!!\n",
+			    ioc->name);
+			goto rearm_timer;
+		}
+
+		/* Set remove_host flag here, since kernel will invoke driver's
+		 * .remove() callback function one after the other for all hot
+		 * un-plugged devices, so it may take some time to call
+		 * .remove() function for subsequent hot un-plugged
+		 * PCI devices.
+		 */
+		ioc->remove_host = 1;
+	}
+
+rearm_timer:
+	if (ioc->hba_hot_unplug_work_q)
+		queue_delayed_work(ioc->hba_hot_unplug_work_q,
+		    &ioc->hba_hot_unplug_work,
+		    msecs_to_jiffies(HBA_HOTUNPLUG_POLLING_INTERVAL));
+	spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags);
+}
+
+
 /**
  * mpt3sas_base_start_watchdog - start the fault_reset_work_q
  * @ioc: per adapter object
@@ -730,6 +771,54 @@ mpt3sas_base_stop_watchdog(struct MPT3SAS_ADAPTER *ioc)
 	}
 }
 
+void
+mpt3sas_base_start_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+
+	if (ioc->hba_hot_unplug_work_q)
+		return;
+
+	/* Initialize hba hot unplug polling */
+	INIT_DELAYED_WORK(&ioc->hba_hot_unplug_work,
+		_base_hba_hot_unplug_work);
+	snprintf(ioc->hba_hot_unplug_work_q_name,
+	    sizeof(ioc->hba_hot_unplug_work_q_name), "poll_%s%d_hba_unplug",
+	    ioc->driver_name, ioc->id);
+	ioc->hba_hot_unplug_work_q =
+	    create_singlethread_workqueue(ioc->hba_hot_unplug_work_q_name);
+	if (!ioc->hba_hot_unplug_work_q) {
+		pr_err(MPT3SAS_FMT "%s: failed (line=%d)\n",
+		    ioc->name, __func__, __LINE__);
+		return;
+	}
+
+	spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags);
+	if (ioc->hba_hot_unplug_work_q)
+		queue_delayed_work(ioc->hba_hot_unplug_work_q,
+		    &ioc->hba_hot_unplug_work,
+		    msecs_to_jiffies(HBA_HOTUNPLUG_POLLING_INTERVAL));
+	spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags);
+}
+
+void
+mpt3sas_base_stop_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct workqueue_struct *wq;
+
+	spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags);
+	wq = ioc->hba_hot_unplug_work_q;
+	ioc->hba_hot_unplug_work_q = NULL;
+	spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags);
+
+	if (wq) {
+		if (!cancel_delayed_work_sync(&ioc->hba_hot_unplug_work))
+			flush_workqueue(wq);
+		destroy_workqueue(wq);
+	}
+}
+
 /**
  * mpt3sas_base_fault_info - verbose translation of firmware FAULT code
  * @ioc: per adapter object
@@ -6458,7 +6547,7 @@ _base_make_ioc_operational(struct MPT3SAS_ADAPTER *ioc)
 	}
 
  skip_init_reply_post_host_index:
-
+	mpt3sas_base_start_hba_unplug_watchdog(ioc);
 	_base_unmask_interrupts(ioc);
 
 	if (ioc->hba_mpi_version_belonged != MPI2_VERSION) {
@@ -6789,6 +6878,7 @@ mpt3sas_base_detach(struct MPT3SAS_ADAPTER *ioc)
 		__func__));
 
 	mpt3sas_base_stop_watchdog(ioc);
+	mpt3sas_base_stop_hba_unplug_watchdog(ioc);
 	mpt3sas_base_free_resources(ioc);
 	_base_release_memory_pools(ioc);
 	mpt3sas_free_enclosure_list(ioc);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 8ee3ba7..4186bc9 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -1140,8 +1140,11 @@ struct MPT3SAS_ADAPTER {
 
 	/* fw fault handler */
 	char fault_reset_work_q_name[20];
+	char hba_hot_unplug_work_q_name[20];
 	struct workqueue_struct *fault_reset_work_q;
+	struct workqueue_struct *hba_hot_unplug_work_q;
 	struct delayed_work fault_reset_work;
+	struct delayed_work hba_hot_unplug_work;
 
 	/* fw event handler */
 	char firmware_event_name[20];
@@ -1158,6 +1161,7 @@ struct MPT3SAS_ADAPTER {
 
 	struct mutex reset_in_progress_mutex;
 	spinlock_t ioc_reset_in_progress_lock;
+	spinlock_t hba_hot_unplug_lock;
 	u8 ioc_link_reset_in_progress;
 
 	u8 ignore_loginfos;
@@ -1482,6 +1486,8 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc);
 
 u8 mpt3sas_base_check_cmd_timeout(struct MPT3SAS_ADAPTER *ioc,
 	u8 status, void *mpi_request, int sz);
+void mpt3sas_base_start_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc);
+void mpt3sas_base_stop_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc);
 /* scsih shared API */
 struct scsi_cmnd *mpt3sas_scsih_scsi_lookup_get(struct MPT3SAS_ADAPTER *ioc,
 	u16 smid);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index eeee9da..7e0c4ec 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -9828,9 +9828,11 @@ static void scsih_remove(struct pci_dev *pdev)
 	ioc->remove_host = 1;
 
 	mpt3sas_wait_for_commands_to_complete(ioc);
+	spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags);
 	_scsih_flush_running_cmds(ioc);
 
 	_scsih_fw_event_cleanup_queue(ioc);
+	spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags);
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
 	wq = ioc->firmware_event_thread;
@@ -10724,6 +10726,7 @@ scsih_suspend(struct pci_dev *pdev, pm_message_t state)
 	pci_power_t device_state;
 
 	mpt3sas_base_stop_watchdog(ioc);
+	mpt3sas_base_stop_hba_unplug_watchdog(ioc);
 	flush_scheduled_work();
 	scsi_block_requests(shost);
 	device_state = pci_choose_state(pdev, state);
@@ -10766,6 +10769,7 @@ scsih_resume(struct pci_dev *pdev)
 	mpt3sas_base_hard_reset_handler(ioc, SOFT_RESET);
 	scsi_unblock_requests(shost);
 	mpt3sas_base_start_watchdog(ioc);
+	mpt3sas_base_start_hba_unplug_watchdog(ioc);
 	return 0;
 }
 #endif /* CONFIG_PM */
@@ -10796,12 +10800,14 @@ scsih_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
 		ioc->pci_error_recovery = 1;
 		scsi_block_requests(ioc->shost);
 		mpt3sas_base_stop_watchdog(ioc);
+		mpt3sas_base_stop_hba_unplug_watchdog(ioc);
 		mpt3sas_base_free_resources(ioc);
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
 		/* Permanent error, prepare for device removal */
 		ioc->pci_error_recovery = 1;
 		mpt3sas_base_stop_watchdog(ioc);
+		mpt3sas_base_stop_hba_unplug_watchdog(ioc);
 		_scsih_flush_running_cmds(ioc);
 		return PCI_ERS_RESULT_DISCONNECT;
 	}
@@ -10862,6 +10868,7 @@ scsih_pci_resume(struct pci_dev *pdev)
 
 	pci_cleanup_aer_uncorrect_error_status(pdev);
 	mpt3sas_base_start_watchdog(ioc);
+	mpt3sas_base_start_hba_unplug_watchdog(ioc);
 	scsi_unblock_requests(ioc->shost);
 }
 
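
Note on the unplug check: the commit message says the watchdog detects
hot unplug by reading the vendor field in PCI configuration space, but
mpt3sas_base_pci_device_is_unplugged() is not defined in this patch.
The userspace sketch below only illustrates the idea, assuming an
absent device reads back all ones (0xFFFF) in the vendor field. The
names fake_pci_dev, fake_read_vendor_id, hba_is_unplugged and
watchdog_tick are hypothetical and are not part of the driver; the
vendor ID used when exercising it is an arbitrary example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a config read with no responding device returns all ones. */
#define PCI_VENDOR_ID_ABSENT 0xFFFFu

/* Hypothetical stand-in for a PCI device; not a kernel type. */
struct fake_pci_dev {
	bool present;       /* false once the card has been pulled */
	uint16_t vendor_id; /* value stored in the vendor field */
};

/* Hypothetical stand-in for a config-space read of the vendor field. */
static uint16_t fake_read_vendor_id(const struct fake_pci_dev *dev)
{
	assert(dev != NULL);
	return dev->present ? dev->vendor_id : PCI_VENDOR_ID_ABSENT;
}

/* True when the vendor field reads back as all ones. */
static bool hba_is_unplugged(const struct fake_pci_dev *dev)
{
	return fake_read_vendor_id(dev) == PCI_VENDOR_ID_ABSENT;
}

/*
 * One polling tick: latch remove_host the first time unplug is seen.
 * The flag is never cleared here, mirroring how the work function in
 * the patch only ever sets ioc->remove_host.
 */
static void watchdog_tick(const struct fake_pci_dev *dev, bool *remove_host)
{
	if (hba_is_unplugged(dev))
		*remove_host = true;
}
```

In the patch itself the equivalent of watchdog_tick() runs from a
delayed work item that re-queues itself every
HBA_HOTUNPLUG_POLLING_INTERVAL milliseconds while holding
hba_hot_unplug_lock; the sketch leaves out locking and re-arming.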