From patchwork Fri Apr 28 17:40:02 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 9705175 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 140EE60225 for ; Fri, 28 Apr 2017 17:40:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0F9E528684 for ; Fri, 28 Apr 2017 17:40:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0312028696; Fri, 28 Apr 2017 17:40:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3574028684 for ; Fri, 28 Apr 2017 17:40:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S642181AbdD1RkE (ORCPT ); Fri, 28 Apr 2017 13:40:04 -0400 Received: from mga01.intel.com ([192.55.52.88]:10311 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S642175AbdD1RkD (ORCPT ); Fri, 28 Apr 2017 13:40:03 -0400 Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Apr 2017 10:40:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.37,388,1488873600"; d="scan'208";a="962314187" Received: from scymds02.sc.intel.com ([10.82.195.37]) by orsmga003.jf.intel.com with ESMTP; 28 Apr 2017 10:40:02 -0700 Received: from scvm10.sc.intel.com (scvm10.sc.intel.com [10.82.195.27]) by scymds02.sc.intel.com with ESMTP id v3SHe2m7030230; Fri, 28 Apr 2017 10:40:02 -0700 Received: from scvm10.sc.intel.com (localhost [127.0.0.1]) by scvm10.sc.intel.com with ESMTP id v3SHe24I025585; Fri, 28 Apr 2017 10:40:02 -0700 Subject: [PATCH v2 11/19] IB/hfi1: Fix softlockup issue From: Dennis Dalessandro To: dledford@redhat.com Cc: linux-rdma@vger.kernel.org, "Michael J. Ruhl" , Mike Marciszyn , Tadeusz Struk Date: Fri, 28 Apr 2017 10:40:02 -0700 Message-ID: <20170428174001.23960.46352.stgit@scvm10.sc.intel.com> In-Reply-To: <20170428173443.23960.32602.stgit@scvm10.sc.intel.com> References: <20170428173443.23960.32602.stgit@scvm10.sc.intel.com> User-Agent: StGit/0.17.1-18-g2e886-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Tadeusz Struk Soft lockups can occur because the mad processing on different CPUs acquire the spin lock dc8051_lock: [534552.835870] [] ? read_dev_port_cntr.isra.37+0x23/0x160 [hfi1] [534552.835880] [] read_dev_cntr+0x4f/0x60 [hfi1] [534552.835893] [] pma_get_opa_portstatus+0x64d/0x8c0 [hfi1] [534552.835904] [] hfi1_process_mad+0x48d/0x18c0 [hfi1] [534552.835908] [] ? __slab_free+0x81/0x2f0 [534552.835936] [] ? ib_mad_recv_done+0x21e/0xa30 [ib_core] [534552.835939] [] ? __kmalloc+0x1f3/0x240 [534552.835947] [] ib_mad_recv_done+0x2cb/0xa30 [ib_core] [534552.835955] [] __ib_process_cq+0x55/0xd0 [ib_core] [534552.835962] [] ib_cq_poll_work+0x20/0x60 [ib_core] [534552.835964] [] process_one_work+0x17b/0x470 [534552.835966] [] worker_thread+0x126/0x410 [534552.835969] [] ? rescuer_thread+0x460/0x460 [534552.835971] [] kthread+0xcf/0xe0 [534552.835974] [] ? kthread_create_on_node+0x140/0x140 [534552.835977] [] ret_from_fork+0x58/0x90 [534552.835980] [] ? kthread_create_on_node+0x140/0x140 This issue is made worse when the 8051 is busy and the reads take longer. Fix by using a non-spinning lock procure. Reviewed-by: Michael J. Ruhl Reviewed-by: Mike Marciszyn Signed-off-by: Tadeusz Struk Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/chip.c | 86 ++++++++++++++++++++++--------------- drivers/infiniband/hw/hfi1/hfi.h | 7 ++- drivers/infiniband/hw/hfi1/init.c | 2 - 3 files changed, 57 insertions(+), 38 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c index 79a316a..6b70d1d 100644 --- a/drivers/infiniband/hw/hfi1/chip.c +++ b/drivers/infiniband/hw/hfi1/chip.c @@ -6381,18 +6381,17 @@ static void lcb_shutdown(struct hfi1_devdata *dd, int abort) * * The expectation is that the caller of this routine would have taken * care of properly transitioning the link into the correct state. + * NOTE: the caller needs to acquire the dd->dc8051_lock lock + * before calling this function. */ -static void dc_shutdown(struct hfi1_devdata *dd) +static void _dc_shutdown(struct hfi1_devdata *dd) { - unsigned long flags; + lockdep_assert_held(&dd->dc8051_lock); - spin_lock_irqsave(&dd->dc8051_lock, flags); - if (dd->dc_shutdown) { - spin_unlock_irqrestore(&dd->dc8051_lock, flags); + if (dd->dc_shutdown) return; - } + dd->dc_shutdown = 1; - spin_unlock_irqrestore(&dd->dc8051_lock, flags); /* Shutdown the LCB */ lcb_shutdown(dd, 1); /* @@ -6403,35 +6402,45 @@ static void dc_shutdown(struct hfi1_devdata *dd) write_csr(dd, DC_DC8051_CFG_RST, 0x1); } +static void dc_shutdown(struct hfi1_devdata *dd) +{ + mutex_lock(&dd->dc8051_lock); + _dc_shutdown(dd); + mutex_unlock(&dd->dc8051_lock); +} + /* * Calling this after the DC has been brought out of reset should not * do any damage. + * NOTE: the caller needs to acquire the dd->dc8051_lock lock + * before calling this function. */ -static void dc_start(struct hfi1_devdata *dd) +static void _dc_start(struct hfi1_devdata *dd) { - unsigned long flags; - int ret; + lockdep_assert_held(&dd->dc8051_lock); - spin_lock_irqsave(&dd->dc8051_lock, flags); if (!dd->dc_shutdown) - goto done; - spin_unlock_irqrestore(&dd->dc8051_lock, flags); + return; + /* Take the 8051 out of reset */ write_csr(dd, DC_DC8051_CFG_RST, 0ull); /* Wait until 8051 is ready */ - ret = wait_fm_ready(dd, TIMEOUT_8051_START); - if (ret) { + if (wait_fm_ready(dd, TIMEOUT_8051_START)) dd_dev_err(dd, "%s: timeout starting 8051 firmware\n", __func__); - } + /* Take away reset for LCB and RX FPE (set in lcb_shutdown). */ write_csr(dd, DCC_CFG_RESET, 0x10); /* lcb_shutdown() with abort=1 does not restore these */ write_csr(dd, DC_LCB_ERR_EN, dd->lcb_err_en); - spin_lock_irqsave(&dd->dc8051_lock, flags); dd->dc_shutdown = 0; -done: - spin_unlock_irqrestore(&dd->dc8051_lock, flags); +} + +static void dc_start(struct hfi1_devdata *dd) +{ + mutex_lock(&dd->dc8051_lock); + _dc_start(dd); + mutex_unlock(&dd->dc8051_lock); } /* @@ -8475,16 +8484,11 @@ static int do_8051_command( { u64 reg, completed; int return_code; - unsigned long flags; unsigned long timeout; hfi1_cdbg(DC8051, "type %d, data 0x%012llx", type, in_data); - /* - * Alternative to holding the lock for a long time: - * - keep busy wait - have other users bounce off - */ - spin_lock_irqsave(&dd->dc8051_lock, flags); + mutex_lock(&dd->dc8051_lock); /* We can't send any commands to the 8051 if it's in reset */ if (dd->dc_shutdown) { @@ -8510,10 +8514,8 @@ static int do_8051_command( return_code = -ENXIO; goto fail; } - spin_unlock_irqrestore(&dd->dc8051_lock, flags); - dc_shutdown(dd); - dc_start(dd); - spin_lock_irqsave(&dd->dc8051_lock, flags); + _dc_shutdown(dd); + _dc_start(dd); } /* @@ -8594,8 +8596,7 @@ static int do_8051_command( write_csr(dd, DC_DC8051_CFG_HOST_CMD_0, 0); fail: - spin_unlock_irqrestore(&dd->dc8051_lock, flags); - + mutex_unlock(&dd->dc8051_lock); return return_code; } @@ -11969,6 +11970,10 @@ static void free_cntrs(struct hfi1_devdata *dd) dd->scntrs = NULL; kfree(dd->cntrnames); dd->cntrnames = NULL; + if (dd->update_cntr_wq) { + destroy_workqueue(dd->update_cntr_wq); + dd->update_cntr_wq = NULL; + } } static u64 read_dev_port_cntr(struct hfi1_devdata *dd, struct cntr_entry *entry, @@ -12124,7 +12129,7 @@ u64 write_port_cntr(struct hfi1_pportdata *ppd, int index, int vl, u64 data) return write_dev_port_cntr(ppd->dd, entry, sval, ppd, vl, data); } -static void update_synth_timer(unsigned long opaque) +static void do_update_synth_timer(struct work_struct *work) { u64 cur_tx; u64 cur_rx; @@ -12133,8 +12138,8 @@ static void update_synth_timer(unsigned long opaque) int i, j, vl; struct hfi1_pportdata *ppd; struct cntr_entry *entry; - - struct hfi1_devdata *dd = (struct hfi1_devdata *)opaque; + struct hfi1_devdata *dd = container_of(work, struct hfi1_devdata, + update_cntr_work); /* * Rather than keep beating on the CSRs pick a minimal set that we can @@ -12217,7 +12222,13 @@ static void update_synth_timer(unsigned long opaque) } else { hfi1_cdbg(CNTR, "[%d] No update necessary", dd->unit); } +} +static void update_synth_timer(unsigned long opaque) +{ + struct hfi1_devdata *dd = (struct hfi1_devdata *)opaque; + + queue_work(dd->update_cntr_wq, &dd->update_cntr_work); mod_timer(&dd->synth_stats_timer, jiffies + HZ * SYNTH_CNT_TIME); } @@ -12453,6 +12464,13 @@ static int init_cntrs(struct hfi1_devdata *dd) if (init_cpu_counters(dd)) goto bail; + dd->update_cntr_wq = alloc_ordered_workqueue("hfi1_update_cntr_%d", + WQ_MEM_RECLAIM, dd->unit); + if (!dd->update_cntr_wq) + goto bail; + + INIT_WORK(&dd->update_cntr_work, do_update_synth_timer); + mod_timer(&dd->synth_stats_timer, jiffies + HZ * SYNTH_CNT_TIME); return 0; bail: diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h index 79c4289..c6b6853 100644 --- a/drivers/infiniband/hw/hfi1/hfi.h +++ b/drivers/infiniband/hw/hfi1/hfi.h @@ -474,7 +474,7 @@ struct hfi1_packet { #define HFI1_PART_ENFORCE_OUT 0x2 /* how often we check for synthetic counter wrap around */ -#define SYNTH_CNT_TIME 2 +#define SYNTH_CNT_TIME 3 /* Counter flags */ #define CNTR_NORMAL 0x0 /* Normal counters, just read register */ @@ -926,8 +926,9 @@ struct hfi1_devdata { spinlock_t rcvctrl_lock; /* protect changes to RcvCtrl */ /* around rcd and (user ctxts) ctxt_cnt use (intr vs free) */ spinlock_t uctxt_lock; /* rcd and user context changes */ - /* exclusive access to 8051 */ - spinlock_t dc8051_lock; + struct mutex dc8051_lock; /* exclusive access to 8051 */ + struct workqueue_struct *update_cntr_wq; + struct work_struct update_cntr_work; /* exclusive access to 8051 memory */ spinlock_t dc8051_memlock; int dc8051_timed_out; /* remember if the 8051 timed out */ diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c index 9bfb8eb..55a11a6 100644 --- a/drivers/infiniband/hw/hfi1/init.c +++ b/drivers/infiniband/hw/hfi1/init.c @@ -1078,11 +1078,11 @@ struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev, size_t extra) spin_lock_init(&dd->uctxt_lock); spin_lock_init(&dd->hfi1_diag_trans_lock); spin_lock_init(&dd->sc_init_lock); - spin_lock_init(&dd->dc8051_lock); spin_lock_init(&dd->dc8051_memlock); seqlock_init(&dd->sc2vl_lock); spin_lock_init(&dd->sde_map_lock); spin_lock_init(&dd->pio_map_lock); + mutex_init(&dd->dc8051_lock); init_waitqueue_head(&dd->event_queue); dd->int_counter = alloc_percpu(u64);