From patchwork Thu May 6 15:31:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Pali_Roh=C3=A1r?= X-Patchwork-Id: 12242483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-22.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E90EC433ED for ; Thu, 6 May 2021 15:35:38 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A42EA610C8 for ; Thu, 6 May 2021 15:35:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A42EA610C8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:Cc:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=0+77uKVk/4ULIxWlDboy7IZsVvECtA1brHj6JU0y/zw=; b=DEMv1Wy3sa4okfWt0WdGSPWZi p57fenocMTPOc2Wj7t0Wpr+OihfW+43aYW+3r3tHZeLfi5TAKitOO6bQfWqvU0OGYw9w0RCkFaKbP iBsWrrKP0E31/H88qYdifPm6DhDhEMojCHUVsJ4ppx9ifcb1krWKgrSVB2IOZaWlU3ysOJS3nJmhY MD2H2ZePf/38WO1gEzqAiTYdX5mhBAcKYaeazZ2QROV4c+PNwd1DOM9UYeBep8xMBKRXobX9gCK+o aJgf10Z9FduRTEI0WjvK/UQUFPSTB8RhCEWDjMCIOcD0Uoy9lhhiPQAuySUYl1ZDPF0+DLFxff0no RwoW332Zg==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1leg0n-004aPI-Cp; Thu, 06 May 2021 15:34:01 +0000 Received: from bombadil.infradead.org ([2607:7c80:54:e::133]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lefzg-004a2r-L9 for linux-arm-kernel@desiato.infradead.org; Thu, 06 May 2021 15:32:52 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: Content-Type:MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc: To:From:Sender:Reply-To:Content-ID:Content-Description; bh=+u31kIFm5zZJgYRpI5nPFxm4N80JqTvM466qjbl87Dc=; b=MHxGWv1DaYy094iLR/MJZuYsqc KfPi1i08wmgzdsVBQfeAfQYDsS6B3eoiR9bY0pcVwM8jLCK927+7UItDdlMr18udWI7PclnNpDRT1 7kEqQ7VG+2Il3OHUnEKEcEQP9hNBsfti61C00N5m2dwjs7fxDYlYo8+pR8nGAt6ZfRMaLoaSlfjVt WrbpFl5W5Licep1YfV3zBzF80WCEhkMl/3B/Ag5z45Nlcvh77r4kmn1ReO4lmEGOC+yJ8fP8ye2CR BbJYH+2hc2MkZ/+qNklWOtQiV6FPoS8d4Y70UBW3cgS5PdXz0TBkAWOmtfry3wGbvMAxxYYeB8HfO HZq9j2qA==; Received: from mail.kernel.org ([198.145.29.99]) by bombadil.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lefzb-0069f3-W3 for linux-arm-kernel@lists.infradead.org; Thu, 06 May 2021 15:32:51 +0000 Received: by mail.kernel.org (Postfix) with ESMTPSA id 22EC66101A; Thu, 6 May 2021 15:32:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1620315167; bh=s/YJwMrqvoOsPJSEvD5WAEweomBui7Dy2w8ikfwL2rE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jMcfh4hFHod7LXZBOUUtvmvy4wKrI7+Jptr9J5rR2eVvG5FFNHPHSOEEb6liiPu8s ne7QsyByM7KFV6dkVidzC6nn0oLthCWzodUR0UaW0zmSs+bGLH6IUkw4g9Qvr9maQ9 gLYST0fQlC+n9u2zAgGDodvZY4OjNlkaeMpRrsgLHZrlGacTTn2odt2/k4WEAigRow g447gIpekZ5d9Mf5ju7AbcR3h4pLGUkjpAMjRy3j6jDlY10CUqm7EtYRgfx1FKvmcv lL402eKQINe/Rdi2kYIKPgrEi2CdHF5NqIegO7TwoWYoEfNMmAQj8aa4uGrHUo+Y4z bz5q802AXJzWA== Received: by pali.im (Postfix) id 32F3D89A; Thu, 6 May 2021 17:32:44 +0200 (CEST) From: =?utf-8?q?Pali_Roh=C3=A1r?= To: Lorenzo Pieralisi , Thomas Petazzoni , Rob Herring , Bjorn Helgaas Cc: Russell King , =?utf-8?q?Marek_Beh=C3=BAn?= , Remi Pommarel , Xogium , Tomasz Maciej Nowak , Marc Zyngier , linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [PATCH 01/42] PCI: aardvark: Fix kernel panic during PIO transfer Date: Thu, 6 May 2021 17:31:12 +0200 Message-Id: <20210506153153.30454-2-pali@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210506153153.30454-1-pali@kernel.org> References: <20210506153153.30454-1-pali@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210506_083248_152764_95AF65AF X-CRM114-Status: GOOD ( 21.97 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Trying to start a new PIO transfer by writing value 0 in PIO_START register when previous transfer has not yet completed (which is indicated by value 1 in PIO_START) causes an External Abort on CPU, which results in kernel panic: SError Interrupt on CPU0, code 0xbf000002 -- SError Kernel panic - not syncing: Asynchronous SError Interrupt To prevent kernel panic, it is required to reject a new PIO transfer when previous one has not finished yet. If previous PIO transfer is not finished yet, the kernel may issue a new PIO request only if the previous PIO transfer timed out. In the past the root cause of this issue was incorrectly identified (as it often happens during link retraining or after link down event) and special hack was implemented in Trusted Firmware to catch all SError events in EL3, to ignore errors with code 0xbf000002 and not forwarding any other errors to kernel and instead throw panic from EL3 Trusted Firmware handler. Links to discussion and patches about this issue: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50 https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@triplefau.lt/ https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@www.loen.fr/ https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541 But the real cause was the fact that during link retraning or after link down event the PIO transfer may take longer time, up to the 1.44s until it times out. This increased probability that a new PIO transfer would be issued by kernel while previous one has not finished yet. After applying this change into the kernel, it is possible to revert the mentioned TF-A hack and SError events do not have to be caught in TF-A EL3. Signed-off-by: Pali Rohár Reviewed-by: Marek Behún Cc: stable@vger.kernel.org # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock") --- drivers/pci/controller/pci-aardvark.c | 49 ++++++++++++++++++++++----- 1 file changed, 40 insertions(+), 9 deletions(-) diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c index 051b48bd7985..e3f5e7ab7606 100644 --- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -514,7 +514,7 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie) udelay(PIO_RETRY_DELAY); } - dev_err(dev, "config read/write timed out\n"); + dev_err(dev, "PIO read/write transfer time out\n"); return -ETIMEDOUT; } @@ -657,6 +657,35 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, return true; } +static bool advk_pcie_pio_is_running(struct advk_pcie *pcie) +{ + struct device *dev = &pcie->pdev->dev; + + /* + * Trying to start a new PIO transfer when previous has not completed + * cause External Abort on CPU which results in kernel panic: + * + * SError Interrupt on CPU0, code 0xbf000002 -- SError + * Kernel panic - not syncing: Asynchronous SError Interrupt + * + * Functions advk_pcie_rd_conf() and advk_pcie_wr_conf() are protected + * by raw_spin_lock_irqsave() at pci_lock_config() level to prevent + * concurrent calls at the same time. But because PIO transfer may take + * about 1.5s when link is down or card is disconnected, it means that + * advk_pcie_wait_pio() does not always have to wait for completion. + * + * Some versions of ARM Trusted Firmware handles this External Abort at + * EL3 level and mask it to prevent kernel panic. Relevant TF-A commit: + * https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50 + */ + if (advk_readl(pcie, PIO_START)) { + dev_err(dev, "Previous PIO read/write transfer is still running\n"); + return true; + } + + return false; +} + static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn, int where, int size, u32 *val) { @@ -673,9 +702,10 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn, return pci_bridge_emul_conf_read(&pcie->bridge, where, size, val); - /* Start PIO */ - advk_writel(pcie, 0, PIO_START); - advk_writel(pcie, 1, PIO_ISR); + if (advk_pcie_pio_is_running(pcie)) { + *val = 0xffffffff; + return PCIBIOS_SET_FAILED; + } /* Program the control register */ reg = advk_readl(pcie, PIO_CTRL); @@ -694,7 +724,8 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn, /* Program the data strobe */ advk_writel(pcie, 0xf, PIO_WR_DATA_STRB); - /* Start the transfer */ + /* Clear PIO DONE ISR and start the transfer */ + advk_writel(pcie, 1, PIO_ISR); advk_writel(pcie, 1, PIO_START); ret = advk_pcie_wait_pio(pcie); @@ -734,9 +765,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn, if (where % size) return PCIBIOS_SET_FAILED; - /* Start PIO */ - advk_writel(pcie, 0, PIO_START); - advk_writel(pcie, 1, PIO_ISR); + if (advk_pcie_pio_is_running(pcie)) + return PCIBIOS_SET_FAILED; /* Program the control register */ reg = advk_readl(pcie, PIO_CTRL); @@ -763,7 +793,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn, /* Program the data strobe */ advk_writel(pcie, data_strobe, PIO_WR_DATA_STRB); - /* Start the transfer */ + /* Clear PIO DONE ISR and start the transfer */ + advk_writel(pcie, 1, PIO_ISR); advk_writel(pcie, 1, PIO_START); ret = advk_pcie_wait_pio(pcie);