From patchwork Fri Jan 10 13:40:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiwei X-Patchwork-Id: 13934676 X-Patchwork-Delegate: bhelgaas@google.com Received: from out162-62-58-211.mail.qq.com (out162-62-58-211.mail.qq.com [162.62.58.211]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 643462063C9; Fri, 10 Jan 2025 13:46:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=162.62.58.211 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736516805; cv=none; b=a0wEphDdzr5Ms0rDVxUh08/CroNXG8+sa6i4CB5yEd8HzzEIfla7nx0F9PMTXPnxon/cEvSRvWamP+cmNgdZ6AgqC0jgbRCSKziEIYNgdw5WQOYjudhrgY9P4gacQ4wipg+gqXzbhEaBDeewPeWlvWOkRqqbUy4jVRxiwL+tKNg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736516805; c=relaxed/simple; bh=oZawplHCNRT8BruZ7AHnDuMCVbBUTqmmU1w72cdFPW8=; h=Message-ID:From:To:Cc:Subject:Date:MIME-Version; b=cmXU6J/wEzCiSmbz1NzYYOWMa6hp3XgBBjcC+nhAB6SyyFT8dtL78fLae5WZe4yl1I52oml1K9x+eTv2Zeb94F/fDOjpbXDsAV9+3TEuFIrchhqufPwABzGhjg0Osruu4e469VYDgWCxbISF3xnL/8reN6TeALrgZvx2zqaMOvs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=qq.com; spf=pass smtp.mailfrom=qq.com; dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b=JPyBWRZG; arc=none smtp.client-ip=162.62.58.211 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=qq.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=qq.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b="JPyBWRZG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201512; t=1736516792; bh=sv735Zju32zbddEntk9l6cBvbBc4hC1Divb3zrT5+Pk=; 
h=From:To:Cc:Subject:Date; b=JPyBWRZGvX8dq1p0TkA5La1VbTuNfrd/wZTK18OFCZg0alCh8jb4n5FWXR7kByYEh wuWbixcNayrfT95ouiM6XJ+gfhiDuGWV+MiXE4CM7mgFo084qI960JdmrZpDHzubHW DrnrWqTNWV+C0Z21jUfgir37yeF7SDY1ATcSMbxs= Received: from jiwei-VirtualBox.lenovo.com ([120.244.62.107]) by newxmesmtplogicsvrszgpua8-0.qq.com (NewEsmtp) with SMTP id A0E03ABA; Fri, 10 Jan 2025 21:40:14 +0800 X-QQ-mid: xmsmtpt1736516414t5m8cmw5w Message-ID: X-QQ-XMAILINFO: M1rD3f8svNzn/roit6RvG8nk1osforKZTegtUm3EH8/PLOXQhK9iCzAql5LsEV peeOO+Gz37CklgAjiiuG/xc+WnWOQXfdwFiRFJfv0PaV8A6ATafjF2pjOBk9fRleCYnue06z1BNJ qezvP27nVicxwFsKRakPUNlHxwKwAkK35rXrZmxu8ZAZS/hWwvOB6BeGXMlYEBeD7Du1NdtM7btZ lLyXbJePXNe4jWLNJCHdbiUACXBK4LeCx8tJT71KUOT1+ffZma2b7uMk5R24jXRut5m3z1952qQZ tWPBpF9Xi6j00v7QETSZJSxO01qiwItnURFaYfWzTuoXLGbM3T3+Dil6gKac76Rzos8qLtgzR6ZA sirCHE3a0ljgV5quH057w9npW4GwFtHrDzxIbXA4mIhQEMseKfTgRXxZyFnSoYMbAaDLlAavdxsk HGcZHTsiJhWqImswAp4V811xkGkGLaDPWnyjnss1Ixc7JO2n3OUbHFM7vXKDe7ObRBO1UG204Kvp c8K+zHBfxmdzRVNbjJK4HEyjkMfpCEpkvW6UlN5WdoeywUshRHG8/lCK9bNTdFjesM6xzsIZxdoC ECDPipv6DY6JNLGHfF7BBMVfGrXk6KzxbUORP/TAU3c4ViUpsu1wLaagoahE47DMg899a8IWcaye U2ssqRKazAze7mIiZPVZaZuIFmMRJOcTleD/XKc87Hl7Lv0PFDiWBw+FXgrVlyUcD15zmLjsT1oc yZrfRwQK1Fs/Jckr38uFy0HOlNlUEFwDRHTXyJIGjjxNATdHRTqfhAAORv89S/gJcXXnqDjiWUy6 vdfk2nFkhv7cuUbiq6FCWgfj0h1pAL9J56GMRgaz6reIXgW/Cd0jJKzGNcLnbFug6FmAZWdIGRj2 eUHc5qanWPZk6Sid2ohCsWp9lFGseGT+Plz1AYnUW0lguG6K3kkA8OHOteqSjFU7d6CI7mKTDguC nE4J0tekemH6i6SOBFlrX902ar0Qq+W/xCF579Ad/YBjfp215brTbv4IY/mMCSKvMDEpMiWZFEOg xLyXI2XA/rsc2agCiYWxRIaS4XbuahSJSmEKI8TYAc/LjFeQlX0RKPIvMJecMnwdOF9aUU/g== X-QQ-XMRINFO: NI4Ajvh11aEj8Xl/2s1/T8w= From: Jiwei Sun To: macro@orcam.me.uk, ilpo.jarvinen@linux.intel.com, bhelgaas@google.com Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, guojinhui.liam@bytedance.com, helgaas@kernel.org, lukas@wunner.de, ahuang12@lenovo.com, sunjw10@lenovo.com, jiwei.sun.bj@qq.com Subject: [PATCH 1/2] PCI: Fix the wrong reading of register fields Date: Fri, 10 Jan 2025 21:40:06 +0800 
X-OQ-MSGID: <20250110134006.33527-1-jiwei.sun.bj@qq.com> X-Mailer: git-send-email 2.34.1 Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0

From: Jiwei Sun

Since commit de9a6c8d5dbf ("PCI/bwctrl: Add pcie_set_target_speed() to
set PCIe Link Speed"), there are two potential issues in
pcie_failed_link_retrain():

(1) The macros PCIE_LNKCTL2_TLS2SPEED() and PCIE_LNKCAP_SLS2SPEED() only
    interpret the link speed field of their register. However, the Link
    Control 2 Register and the Link Capabilities Register contain many
    other fields. If a raw register value is passed to either macro, the
    result may be a wrong link speed (PCI_SPEED_UNKNOWN).

(2) In pcie_failed_link_retrain(), the local variable lnkctl2 is not
    updated after the initial read of PCI_EXP_LNKCTL2. As a result, the
    code that removes the 2.5GT/s downstream link speed restriction may
    not be executed.

To avoid these issues, keep only the link speed field of the two
registers before they are used for pcie_set_target_speed(), and reread
the Link Control 2 Register before it is tested.
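As an illustration of issue (1), and not as part of the patch, the following stand-alone sketch shows why an unmasked register value defeats an exact-match speed lookup. The DEMO_* names, the demo_speed enum and the sample value 0x0045 are hypothetical stand-ins. Only the field position (Target Link Speed in bits 3:0 of Link Control 2, with 0x1 for 2.5GT/s and 0x5 for 32GT/s) follows the PCIe register layout, and the kernel macros may be implemented differently.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define DEMO_LNKCTL2_TLS         0x000f /* Target Link Speed field, bits 3:0 */
#define DEMO_LNKCTL2_TLS_2_5GT   0x0001
#define DEMO_LNKCTL2_TLS_32_0GT  0x0005

enum demo_speed {
	DEMO_SPEED_UNKNOWN,
	DEMO_SPEED_2_5GT,
	DEMO_SPEED_32_0GT,
};

/*
 * Exact-match lookup: it only recognizes a value that consists of the
 * speed field and nothing else, which is the situation issue (1) describes.
 */
static enum demo_speed demo_tls2speed(uint16_t val)
{
	switch (val) {
	case DEMO_LNKCTL2_TLS_2_5GT:
		return DEMO_SPEED_2_5GT;
	case DEMO_LNKCTL2_TLS_32_0GT:
		return DEMO_SPEED_32_0GT;
	default:
		return DEMO_SPEED_UNKNOWN;
	}
}

/* Same lookup, but with the other register fields masked off first. */
static enum demo_speed demo_tls2speed_masked(uint16_t val)
{
	return demo_tls2speed(val & DEMO_LNKCTL2_TLS);
}
```

With a sample Link Control 2 value of 0x0045 (speed field 0x5 plus one unrelated bit set), demo_tls2speed() yields DEMO_SPEED_UNKNOWN, while demo_tls2speed_masked() yields DEMO_SPEED_32_0GT.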
Fixes: de9a6c8d5dbf ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed")
Signed-off-by: Jiwei Sun
---
 drivers/pci/quirks.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 76f4df75b08a..605628c810a5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -118,11 +118,13 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 		ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
 		if (ret) {
 			pci_info(dev, "retraining failed\n");
+			oldlnkctl2 &= PCI_EXP_LNKCTL2_TLS;
 			pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
 					      true);
 			return ret;
 		}
 
+		pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
 		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
 	}
 
@@ -133,6 +135,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 
 		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
 		pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+		lnkcap &= PCI_EXP_LNKCAP_SLS;
 		ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
 		if (ret) {
 			pci_info(dev, "retraining failed\n");

From patchwork Fri Jan 10 13:44:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiwei X-Patchwork-Id: 13934675 X-Patchwork-Delegate: bhelgaas@google.com Received: from xmbghk7.mail.qq.com (xmbghk7.mail.qq.com [43.163.128.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6B3720E6FD; Fri, 10 Jan 2025 13:44:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=43.163.128.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736516678; cv=none; b=lpHoU53wjnSCEinfANwQkcpsk25eH5XmYzWynwQ592LDvQa4/Jo/ijMuKafTY6qeJNAeD5al1eYQ2b6mJGFf5qGFP0roi/1NzbXK1AIaiiodaWbfeFgLNk8bD/p/62C/D/FHwcOqCCdiPolQCtTCIP+POwhXmcPJfXGj1HozFlc= ARC-Message-Signature: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736516678; c=relaxed/simple; bh=zcQi3cuOjY7oReka/rBrEqA89SqAZdFQ6huk/mtxaEQ=; h=Message-ID:From:To:Cc:Subject:Date:MIME-Version; b=f0OI0NLvKVDSsgL/RUwt/SRrChD6TbwzOvtC/E81n3eM1ddQTDKVtNrF1NkiGLfiJUrEzW5kOfhMPxUA7eOsG23g3GtrCRL6lAHXCvl6b6Lb1odjLO1lTvvzuetdKXjbq4h74VuaAmdagFLFeILyE6rmkDcmuMl1lVOEjhXuH/g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=qq.com; spf=pass smtp.mailfrom=qq.com; dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b=CGHVal+M; arc=none smtp.client-ip=43.163.128.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=qq.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=qq.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b="CGHVal+M" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201512; t=1736516652; bh=P6tCNUHde1VgBvC9lRDHDfls7Ec2MDCLShQjnJL9+A4=; h=From:To:Cc:Subject:Date; b=CGHVal+Mn5xZ+AFsWRMn2NB5S/yd67B6w+BDub/EhpWpEvVUGNB/vRYPali1GhmLK KfQ7/g3sz73yCLs9qWxnRzRaZFQUxMag293lnfxR3kmO1QbtzgeElBYM8yBpJE6nK6 LpCWQ/8HCx/uaBQF1aO5JB5vQYGMx+cGSl93G8Xk= Received: from jiwei-VirtualBox.lenovo.com ([120.244.62.107]) by newxmesmtplogicsvrsza36-0.qq.com (NewEsmtp) with SMTP id B0917AA7; Fri, 10 Jan 2025 21:44:09 +0800 X-QQ-mid: xmsmtpt1736516649tiqcnmc0i Message-ID: X-QQ-XMAILINFO: NHgT31LhP5vDVpQVMa1QnVviXB6yxDexdxjEMrfVy/qUCxjbm/I8heaHhEWOP4 +Bj9y/FS18EVNdUBsNiTszpl8Y0rhWgUuVxJNewzD3kI7fGWR1917aY4EZ6RlVvPMbapiHToQDDY 2lclmMHcDHzJO9aLsNzLS/MfgxkWhLFTmaj3Artokq5VJfzJd3rQNgxio9j/6xygDrbLvoMXYJuf xNY4RzlRzhTvboEqzSCH3UCRL0H2x/eMMU9muGBmqtUMog23+yiONs8t8v4mV2GnMYOjGQFGm0i9 U0oMn6VfLWxIRWlBgrsrxbUZ2Jzzhfi4RJaIL1zPt9i0iyGLPpuCHpQEvGFfpkMYRHHvxPEeJnRR BzUSZ9xwY9IvRtidNIpsjyYh601VWvSmFwel5eOC75WoY+M1NwYJ9HjRHlrSmy+RhADVwHAhvPPW 
MKGXDDYxQIeD4QVC03BYhnGzQQRZGqET3VE4BRjDHS+Snk8VM2YAGe93mF4O3Dv9u+wURgo4NjZH 7Kda7ZW3iATiFVQSeGA1hw4KTZsRgD5/VBampea6o27YNtuhUVry0x3TaQ+IEbrJyu5Z+woWeokZ tzNbJHqbfIGYbPaOspmDNyra682+nHrsyRBik7+ZgErWNWQxSAS4vsXNweI47EBfknURy2BmqXEo njRUVAXOt+MDiJ0/lFdIzsc2p9fJSnbKMN1+xrT6GUPs8EI9Vp8Z5HkJNbnkXL5kRVgtbnyc2LIo H0mdVhKbNpO8NgITHn2jpEnL3KL2AYoD/sdsZ1YWGh8CTarWKXISqve0esYGIqCkhlPy+4buFhRm 7PzWqy9wQMgAYRRJRyMw5X28KZDKU9Ggg23jiHlAM3a4t9R9gFiRKuiLAFNt14Wa0Njp7U9HCyOO r3CU2XJp7lpNFCDUqZE6C6xgBcBjHQ5QXbK4Jx7pwmbHjHx2qVopSmt9la5tCH/NcoWydTnpXBWp YYCHygMHkscX9Y/sfNWL3bK4pJR3+1OFuYoTZ5N3A3rK5ZFrb0krSiInLhEa8e6d4/pYreplFeXk TD07XAtqX+il3dpVmlsSwxszVozUZanSQpsdzii88wxi6dYtmvxPk7EiVsGGsvX0qSy4wvKWhtiz ktIw0cGkIa8TNASM1IdVb8PdJj33VNwlIucHVL X-QQ-XMRINFO: OWPUhxQsoeAVDbp3OJHYyFg= From: Jiwei Sun To: macro@orcam.me.uk, ilpo.jarvinen@linux.intel.com, bhelgaas@google.com Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, guojinhui.liam@bytedance.com, helgaas@kernel.org, lukas@wunner.de, ahuang12@lenovo.com, sunjw10@lenovo.com, jiwei.sun.bj@qq.com Subject: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Date: Fri, 10 Jan 2025 21:44:01 +0800 X-OQ-MSGID: <20250110134401.34536-1-jiwei.sun.bj@qq.com> X-Mailer: git-send-email 2.34.1 Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0

From: Jiwei Sun

When we run a quick hot-add/hot-remove test (within 1 second) with a
PCIe Gen 5 NVMe disk, the PCIe bridge may drop from 32GT/s to 2.5GT/s:

pcieport 10002:00:04.0: pciehp: Slot(75): Link Down
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
...
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: broken device, retraining non-functional downstream link at 2.5GT/s
pcieport 10002:00:04.0: pciehp: Slot(75): No link
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): Link Up
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pci 10002:02:00.0: [144d:a826] type 00 class 0x010802 PCIe Endpoint
pci 10002:02:00.0: BAR 0 [mem 0x00000000-0x00007fff 64bit]
pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x00007fff 64bit]
pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x001fffff 64bit]: contains BAR 0 for 64 VFs
pci 10002:02:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 10002:00:04.0 (capable of 126.028 Gb/s with 32.0 GT/s PCIe x4 link)

If an NVMe disk is hot-removed, the pciehp interrupt is triggered and
the pciehp_ist kernel thread is woken up. pcie_failed_link_retrain() is
then called, as the following call trace shows:

irq/87-pciehp-2524 [121] ..... 152046.006765: pcie_failed_link_retrain <-pcie_wait_for_link
irq/87-pciehp-2524 [121] ..... 152046.006782:
 => [FTRACE TRAMPOLINE]
 => pcie_failed_link_retrain
 => pcie_wait_for_link
 => pciehp_check_link_status
 => pciehp_enable_slot
 => pciehp_handle_presence_or_link_change
 => pciehp_ist
 => irq_thread_fn
 => irq_thread
 => kthread
 => ret_from_fork
 => ret_from_fork_asm

According to our investigation, the issue is caused by the following
scenario:

NVMe disk              pciehp hardirq
hot-remove             top-half                 pciehp irq kernel thread
======================================================================
pciehp hardirq
will be triggered
                       cpu handle pciehp
                       hardirq
                       pciehp irq kthread will
                       be woken up
                                                pciehp_ist
                                                ...
                                                  pcie_failed_link_retrain
                                                    read PCI_EXP_LNKCTL2 register
                                                    read PCI_EXP_LNKSTA register
If NVMe disk
hot-add before
calling pcie_retrain_link()
                                                    set target speed to 2_5GT
                                                      pcie_bwctrl_change_speed
                                                        pcie_retrain_link
                                                        : the retrain succeeds, but
                                                          because pci_match_id()
                                                          returns 0 in
                                                          pcie_failed_link_retrain(),
                                                          the Target Link Speed
                                                          field of the Link Control
                                                          2 Register stays at 0x1.

To fix the issue, skip the retraining work for all devices except the
ASMedia ASM2824.
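The effect of moving the device ID check can be sketched with a heavily simplified model. This is an illustration under stated assumptions, not the kernel code: struct demo_port and both functions are hypothetical, the link status handling is reduced to a single stale "link looked down" flag, and the retrain at 2.5GT/s is assumed to succeed because the disk has been re-inserted in the meantime.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified model; not the kernel code. */
struct demo_port {
	bool on_quirk_list;   /* stands in for a non-NULL pci_match_id() result */
	bool saw_link_down;   /* stale status snapshot: the link looked broken */
	unsigned int tls;     /* Target Link Speed: 0x1 = 2.5GT/s, 0x5 = 32GT/s */
	unsigned int max_tls; /* highest supported speed code */
};

/* Ordering before the patch: the ID check guards only the second step. */
static void demo_retrain_before(struct demo_port *p)
{
	if (p->saw_link_down)
		p->tls = 0x1;           /* clamp to 2.5GT/s and retrain */

	/* Assumed: the disk is back by now, so the 2.5GT/s retrain succeeded. */
	if (p->tls == 0x1 && p->on_quirk_list)
		p->tls = p->max_tls;    /* lift the restriction */
}

/* Ordering after the patch: ports not on the list are left untouched. */
static void demo_retrain_after(struct demo_port *p)
{
	if (!p->on_quirk_list)
		return;

	if (p->saw_link_down)
		p->tls = 0x1;

	if (p->tls == 0x1)
		p->tls = p->max_tls;
}
```

In this model, a port that is not on the list and starts with tls = 0x5 ends up stuck at 0x1 after demo_retrain_before(), which corresponds to the 2.5GT/s limit seen in the log, while demo_retrain_after() leaves it at 0x5.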
Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Reported-by: Adrian Huang
Signed-off-by: Jiwei Sun
---
 drivers/pci/quirks.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 605628c810a5..ff04ebd9ae16 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -104,6 +104,9 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 	u16 lnksta, lnkctl2;
 	int ret = -ENOTTY;
 
+	if (!pci_match_id(ids, dev))
+		return 0;
+
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
 	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
 		return ret;
@@ -129,8 +132,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 	}
 
 	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
-	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
-	    pci_match_id(ids, dev)) {
+	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT) {
 		u32 lnkcap;
 
 		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");