From patchwork Wed Feb 1 04:30:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Huacai Chen X-Patchwork-Id: 13123764 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 496C4C05027 for ; Wed, 1 Feb 2023 04:31:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229574AbjBAEbT (ORCPT ); Tue, 31 Jan 2023 23:31:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230519AbjBAEbP (ORCPT ); Tue, 31 Jan 2023 23:31:15 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E121564B1 for ; Tue, 31 Jan 2023 20:31:13 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8E75360E26 for ; Wed, 1 Feb 2023 04:31:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8C6F5C433D2; Wed, 1 Feb 2023 04:31:09 +0000 (UTC) From: Huacai Chen To: Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= Cc: linux-pci@vger.kernel.org, Jianmin Lv , Xuefeng Li , Huacai Chen , Jiaxun Yang , Huacai Chen Subject: [PATCH V4 1/2] PCI: Omit pci_disable_device() in .shutdown() Date: Wed, 1 Feb 2023 12:30:17 +0800 Message-Id: <20230201043018.778499-2-chenhuacai@loongson.cn> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230201043018.778499-1-chenhuacai@loongson.cn> References: <20230201043018.778499-1-chenhuacai@loongson.cn> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org This patch has a long story. After cc27b735ad3a7557 ("PCI/portdrv: Turn off PCIe services during shutdown") we observe poweroff/reboot failures on systems with LS7A chipset. We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER" in do_pci_disable_device(), it can work well. The hardware engineer says that the root cause is that CPU is still accessing PCIe devices while poweroff/reboot, and if we disable the Bus Master Bit at this time, the PCIe controller doesn't forward requests to downstream devices, and also does not send TIMEOUT to CPU, which causes CPU wait forever (hardware deadlock). To be clear, the sequence is like this: - CPU issues MMIO read to device below Root Port - LS7A Root Port fails to forward transaction to secondary bus because of LS7A Bus Master defect - CPU hangs waiting for response to MMIO read Then how is userspace able to use a device after the device is removed? To give more details, let's take the graphics driver (e.g. amdgpu) as an example. The userspace programs call printf() to display "shutting down xxx service" during shutdown/reboot, or the kernel calls printk() to display something during shutdown/reboot. These can happen at any time, even after we call pcie_port_device_remove() to disable the pcie port on the graphic card. The call stack is: printk() --> call_console_drivers() --> con->write() --> vt_console_print() --> fbcon_putcs() This scenario happens because userspace programs (or the kernel itself) don't know whether a device is 'usable', they just use it, at any time. This hardware behavior is a PCIe protocol violation (Bus Master should not be involved in CPU MMIO transactions), and it will be fixed in new revisions of hardware (add timeout mechanism for CPU read request, whether or not Bus Master bit is cleared). On some x86 platforms, radeon/amdgpu devices can cause similar problems [1][2]. Once before I add a quirk to solve the LS7A problem but looks ugly. After long time discussions, Bjorn Helgaas suggest simply remove the pci_disable_device() in pcie_portdrv_shutdown() and this patch do it exactly. [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980 [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638 Signed-off-by: Huacai Chen --- drivers/pci/pcie/portdrv.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c index 2cc2e60bcb39..46fad0d813b2 100644 --- a/drivers/pci/pcie/portdrv.c +++ b/drivers/pci/pcie/portdrv.c @@ -501,7 +501,6 @@ static void pcie_port_device_remove(struct pci_dev *dev) { device_for_each_child(&dev->dev, NULL, remove_iter); pci_free_irq_vectors(dev); - pci_disable_device(dev); } /** @@ -727,6 +726,19 @@ static void pcie_portdrv_remove(struct pci_dev *dev) } pcie_port_device_remove(dev); + + pci_disable_device(dev); +} + +static void pcie_portdrv_shutdown(struct pci_dev *dev) +{ + if (pci_bridge_d3_possible(dev)) { + pm_runtime_forbid(&dev->dev); + pm_runtime_get_noresume(&dev->dev); + pm_runtime_dont_use_autosuspend(&dev->dev); + } + + pcie_port_device_remove(dev); } static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev, @@ -777,7 +789,7 @@ static struct pci_driver pcie_portdriver = { .probe = pcie_portdrv_probe, .remove = pcie_portdrv_remove, - .shutdown = pcie_portdrv_remove, + .shutdown = pcie_portdrv_shutdown, .err_handler = &pcie_portdrv_err_handler, From patchwork Wed Feb 1 04:30:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Huacai Chen X-Patchwork-Id: 13123765 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D417C05027 for ; Wed, 1 Feb 2023 04:31:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230031AbjBAEbp (ORCPT ); Tue, 31 Jan 2023 23:31:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41802 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230156AbjBAEbo (ORCPT ); Tue, 31 Jan 2023 23:31:44 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3C5011655 for ; Tue, 31 Jan 2023 20:31:41 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2107960DEF for ; Wed, 1 Feb 2023 04:31:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D514C433EF; Wed, 1 Feb 2023 04:31:37 +0000 (UTC) From: Huacai Chen To: Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= Cc: linux-pci@vger.kernel.org, Jianmin Lv , Xuefeng Li , Huacai Chen , Jiaxun Yang , Huacai Chen Subject: [PATCH V4 2/2] PCI: loongson: Improve the MRRS quirk for LS7A Date: Wed, 1 Feb 2023 12:30:18 +0800 Message-Id: <20230201043018.778499-3-chenhuacai@loongson.cn> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230201043018.778499-1-chenhuacai@loongson.cn> References: <20230201043018.778499-1-chenhuacai@loongson.cn> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org In new revision of LS7A, some PCIe ports support larger value than 256, but their maximum supported MRRS values are not detectable. Moreover, the current loongson_mrrs_quirk() cannot avoid devices increasing its MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169) will actually set a big value in its driver. So the only possible way is configure MRRS of all devices in BIOS, and add a pci host bridge bit flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations. However, according to PCIe Spec, it is legal for an OS to program any value for MRRS, and it is also legal for an endpoint to generate a Read Request with any size up to its MRRS. As the hardware engineers say, the root cause here is LS7A doesn't break up large read requests. In detail, LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read request with a size that's "too big" ("too big" means larger than the PCIe ports can handle, which means 256 for some ports and 4096 for the others, and of course this is a problem in the LS7A's hardware design). Link: Link: https://bugzilla.kernel.org/show_bug.cgi?id=216884 Signed-off-by: Huacai Chen --- drivers/pci/controller/pci-loongson.c | 44 +++++++++------------------ drivers/pci/pci.c | 6 ++++ include/linux/pci.h | 1 + 3 files changed, 22 insertions(+), 29 deletions(-) diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c index 05c50408f13b..759ec211c17b 100644 --- a/drivers/pci/controller/pci-loongson.c +++ b/drivers/pci/controller/pci-loongson.c @@ -75,37 +75,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, DEV_LS7A_LPC, system_bus_quirk); -static void loongson_mrrs_quirk(struct pci_dev *dev) +static void loongson_mrrs_quirk(struct pci_dev *pdev) { - struct pci_bus *bus = dev->bus; - struct pci_dev *bridge; - static const struct pci_device_id bridge_devids[] = { - { PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) }, - { PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) }, - { PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) }, - { 0, }, - }; - - /* look for the matching bridge */ - while (!pci_is_root_bus(bus)) { - bridge = bus->self; - bus = bus->parent; - /* - * Some Loongson PCIe ports have a h/w limitation of - * 256 bytes maximum read request size. They can't handle - * anything larger than this. So force this limit on - * any devices attached under these ports. - */ - if (pci_match_id(bridge_devids, bridge)) { - if (pcie_get_readrq(dev) > 256) { - pci_info(dev, "limiting MRRS to 256\n"); - pcie_set_readrq(dev, 256); - } - break; - } - } + /* + * Some Loongson PCIe ports have h/w limitations of maximum read + * request size. They can't handle anything larger than this. So + * force this limit on any devices attached under these ports. + */ + struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus); + + bridge->no_inc_mrrs = 1; } -DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk); +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, + DEV_PCIE_PORT_0, loongson_mrrs_quirk); +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, + DEV_PCIE_PORT_1, loongson_mrrs_quirk); +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, + DEV_PCIE_PORT_2, loongson_mrrs_quirk); static void loongson_pci_pin_quirk(struct pci_dev *pdev) { diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index fba95486caaf..ae88210a12c7 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -6033,6 +6033,7 @@ int pcie_set_readrq(struct pci_dev *dev, int rq) { u16 v; int ret; + struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus); if (rq < 128 || rq > 4096 || !is_power_of_2(rq)) return -EINVAL; @@ -6051,6 +6052,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq) v = (ffs(rq) - 8) << 12; + if (bridge->no_inc_mrrs) { + if (rq > pcie_get_readrq(dev)) + return -EINVAL; + } + ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_READRQ, v); diff --git a/include/linux/pci.h b/include/linux/pci.h index adffd65e84b4..3df2049ec4a8 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -572,6 +572,7 @@ struct pci_host_bridge { void *release_data; unsigned int ignore_reset_delay:1; /* For entire hierarchy */ unsigned int no_ext_tags:1; /* No Extended Tags */ + unsigned int no_inc_mrrs:1; /* No Increase MRRS */ unsigned int native_aer:1; /* OS may use PCIe AER */ unsigned int native_pcie_hotplug:1; /* OS may use PCIe hotplug */ unsigned int native_shpc_hotplug:1; /* OS may use SHPC hotplug */