From patchwork Thu Nov 16 12:26:56 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Vignesh Raghavendra
X-Patchwork-Id: 10061077
X-Patchwork-Delegate: bhelgaas@google.com
Subject: Re: xhci_hcd HC died; cleaning up with TUSB7340 and µPD720201
To: "Quadros, Roger"
References: <3dd7a4fc-da86-03cc-9b01-a0d29dd73230@ti.com>
From: Vignesh R
CC: Chris Welch, "linux-usb@vger.kernel.org", Joao Pinto, KISHON VIJAY ABRAHAM
Date: Thu, 16 Nov 2017 17:56:56 +0530
In-Reply-To: <3dd7a4fc-da86-03cc-9b01-a0d29dd73230@ti.com>
X-Mailing-List: linux-pci@vger.kernel.org

+linux-pci

Hi Chris,

On Thursday 16 November 2017 05:20 PM, Quadros, Roger wrote:
> +Vignesh
>
> On 13/09/17 17:26, Chris Welch wrote:
>> We are developing a product based on the TI AM5728 EVM. The product utilizes a TUSB7340 PCIe USB host for additional ports.
>> The TUSB7340 is detected and set up properly and works OK with low data rate devices. However, hot plugging a Realtek USB network adapter and doing Ethernet transfer bandwidth testing using iperf3 causes the TUSB7340 host to be locked out. The TUSB7340 host appears to no longer communicate and the logging indicates xhci_hcd 0000:01:00.0: HC died; cleaning up. Same issue occurs with another USB Ethernet adapter I tried (Asus).
>>
>> We looked at using another host and found a mini PCIe card that utilizes the µPD720201 and can be directly installed on the TI AM5728 EVM. The card is detected properly and we reran the transfer test. The µPD720201 gets locked out with the same problem.
>>
>> The AM5728 testing was performed using the TI SD card stock am57xx-evm-linux-04.00.00.04.img, kernel am57xx-evm 4.9.28-geed43d1050, and it reports that it is using the TI AM572x EVM Rev A3 device tree.
>>
>> It shows the following logging when it fails (this is with the TI EVM and µPD720201).
>>
>> [  630.400899] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
>> [  630.408769] xhci_hcd 0000:01:00.0: Assuming host is dying, halting host.
>> [  630.420849] r8152 2-4:1.0 enp1s0u4: Tx status -108
>> [  630.425667] r8152 2-4:1.0 enp1s0u4: Tx status -108
>> [  630.430483] r8152 2-4:1.0 enp1s0u4: Tx status -108
>> [  630.435297] r8152 2-4:1.0 enp1s0u4: Tx status -108
>> [  630.440122] xhci_hcd 0000:01:00.0: HC died; cleaning up
>> [  630.453961] usb 2-4: USB disconnect, device number 2
>>
>> The problem appears to be a general driver issue given we get the same problem with both the TUSB7340 and the µPD720201.

It seems the PCIe driver is missing MSI IRQs, which leads to the stall. Reading the xHCI registers via PCIe memory space confirms this.

I see two problems with MSI handling:

First, since commit 8c934095fa2f3 ("PCI: dwc: Clear MSI interrupt status after it is handled, not before"), dwc clears the MSI status after calling the EP's IRQ handler.
However, another MSI interrupt can be raised just at the end of the EP's IRQ handler, before the MSI status is cleared. That new MSI IRQ is then lost, because we clear its status without handling it.

The second problem appears to be in the dra7xx PCIe wrapper: PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI does not seem to catch MSI IRQs unless it is ensured that a read of the PCIE_MSI_INTR0_STATUS register returns 0.

So, could you try reverting commit 8c934095fa2f3 and also applying the patch below, and let me know if that fixes the issue?

-----------

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index e77a4ceed74c..8280abc56f30 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -259,10 +259,17 @@ static irqreturn_t dra7xx_pcie_msi_irq_handler(int irq, void *arg)
 	u32 reg;
 
 	reg = dra7xx_pcie_readl(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI);
+	dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI, reg);
 
 	switch (reg) {
 	case MSI:
-		dw_handle_msi_irq(pp);
+		/*
+		 * Need to make sure no MSI IRQs are pending before
+		 * exiting handler, else the wrapper will not catch new
+		 * IRQs. So loop around till dw_handle_msi_irq() returns
+		 * IRQ_NONE
+		 */
+		while (dw_handle_msi_irq(pp) != IRQ_NONE);
 		break;
 	case INTA:
 	case INTB:
@@ -273,8 +280,6 @@ static irqreturn_t dra7xx_pcie_msi_irq_handler(int irq, void *arg)
 		break;
 	}
 
-	dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI, reg);
-
 	return IRQ_HANDLED;
 }