Message ID:    fe03941e3e1cc42fb9bf4395e302bff53ee2198b.1734428762.git.lukas@wunner.de (mailing list archive)
State:         Accepted
Delegated to:  Krzysztof Wilczyński
Series:        [for-linus,v3,1/2] PCI: Honor Max Link Speed when determining supported speeds
On Tue, 17 Dec 2024, Lukas Wunner wrote:

> The Supported Link Speeds Vector in the Link Capabilities 2 Register
> indicates the *supported* link speeds. The Max Link Speed field in the
> Link Capabilities Register indicates the *maximum* of those speeds.
>
> pcie_get_supported_speeds() neglects to honor the Max Link Speed field and
> will thus incorrectly deem higher speeds as supported. Fix it.
>
> One user-visible issue addressed here is an incorrect value in the sysfs
> attribute "max_link_speed".
>
> But the main motivation is a boot hang reported by Niklas: Intel JHL7540
> "Titan Ridge 2018" Thunderbolt controllers support 2.5-8 GT/s speeds,
> but indicate 2.5 GT/s as maximum. Ilpo recalls seeing this on more
> devices. It can be explained by the controller's Downstream Ports
> supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s
> if the port interfaces to a PCIe Adapter, in accordance with USB4 v2
> sec 11.2.1:
>
> "This section defines the functionality of an Internal PCIe Port that
> interfaces to a PCIe Adapter. [...]
> The Logical sub-block shall update the PCIe configuration registers
> with the following characteristics: [...]
> Max Link Speed field in the Link Capabilities Register set to 0001b
> (data rate of 2.5 GT/s only).
> Note: These settings do not represent actual throughput. Throughput
> is implementation specific and based on the USB4 Fabric performance."
>
> The present commit is not sufficient on its own to fix Niklas' boot hang,
> but it is a prerequisite: A subsequent commit will fix the boot hang by
> enabling bandwidth control only if more than one speed is supported.
>
> The GENMASK() macro used herein specifies 0 as lowest bit, even though
> the Supported Link Speeds Vector ends at bit 1. This is done on purpose
> to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro
> would be invalid as the lowest bit is greater than the highest bit.
> Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
> Endpoints in particular, so it does occur in practice.

Thanks for adding this extra information.

I'd also add a reference to r6.2 section 7.5.3, which states those registers
are required for RPs, Switch Ports, Bridges, and Endpoints _that are not
RCiEPs_. My reading is that this implies they're not required of RCiEPs.

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

--
 i.

> Fixes: d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds")
> Reported-by: Niklas Schnelle <niks@kernel.org>
> Tested-by: Niklas Schnelle <niks@kernel.org>
> Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org/
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/pci/pci.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 35dc9f2..b730560 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6240,12 +6240,14 @@ u8 pcie_get_supported_speeds(struct pci_dev *dev)
>  	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
>  	speeds = lnkcap2 & PCI_EXP_LNKCAP2_SLS;
>
> +	/* Ignore speeds higher than Max Link Speed */
> +	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> +	speeds &= GENMASK(lnkcap & PCI_EXP_LNKCAP_SLS, 0);
> +
>  	/* PCIe r3.0-compliant */
>  	if (speeds)
>  		return speeds;
>
> -	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> -
>  	/* Synthesize from the Max Link Speed field */
>  	if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
>  		speeds = PCI_EXP_LNKCAP2_SLS_5_0GB | PCI_EXP_LNKCAP2_SLS_2_5GB;
Hello,

> > One user-visible issue addressed here is an incorrect value in the sysfs
> > attribute "max_link_speed".
> >
> > But the main motivation is a boot hang reported by Niklas: Intel JHL7540
> > "Titan Ridge 2018" Thunderbolt controllers supports 2.5-8 GT/s speeds,
> > but indicate 2.5 GT/s as maximum. Ilpo recalls seeing this on more
> > devices. It can be explained by the controller's Downstream Ports
> > supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s
> > if the port interfaces to a PCIe Adapter, in accordance with USB4 v2
> > sec 11.2.1:
> >
> > "This section defines the functionality of an Internal PCIe Port that
> > interfaces to a PCIe Adapter. [...]
> > The Logical sub-block shall update the PCIe configuration registers
> > with the following characteristics: [...]
> > Max Link Speed field in the Link Capabilities Register set to 0001b
> > (data rate of 2.5 GT/s only).
> > Note: These settings do not represent actual throughput. Throughput
> > is implementation specific and based on the USB4 Fabric performance."
> >
> > The present commit is not sufficient on its own to fix Niklas' boot hang,
> > but it is a prerequisite: A subsequent commit will fix the boot hang by
> > enabling bandwidth control only if more than one speed is supported.
> >
> > The GENMASK() macro used herein specifies 0 as lowest bit, even though
> > the Supported Link Speeds Vector ends at bit 1. This is done on purpose
> > to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro
> > would be invalid as the lowest bit is greater than the highest bit.
> > Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
> > Endpoints in particular, so it does occur in practice.
>
> Thanks for adding this extra information.
>
> I'd also add reference to r6.2 section 7.5.3 which states those registers
> are required for RPs, Switch Ports, Bridges, and Endpoints _that are not
> RCiEPs_. My reading is that implies they're not required from RCiEPs.

Let me know how you would like to update the commit message. I will do it
directly on the branch.

Thank you!

Krzysztof
On Thu, Dec 19, 2024 at 08:43:57AM +0900, Krzysztof Wilczyński wrote:
> > > The GENMASK() macro used herein specifies 0 as lowest bit, even though
> > > the Supported Link Speeds Vector ends at bit 1. This is done on purpose
> > > to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro
> > > would be invalid as the lowest bit is greater than the highest bit.
> > > Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
> > > Endpoints in particular, so it does occur in practice.
> >
> > Thanks for adding this extra information.
> >
> > I'd also add reference to r6.2 section 7.5.3 which states those registers
> > are required for RPs, Switch Ports, Bridges, and Endpoints _that are not
> > RCiEPs_. My reading is that implies they're not required from RCiEPs.
>
> Let me know how you would like to update the commit message. I will do it
> directly on the branch.

FWIW, I edited the commit message like this on my local branch:

-Endpoints in particular, so it does occur in practice.
+Endpoints in particular, so it does occur in practice. (The Link
+Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.)

In other words, I just added the sentence in parentheses.
But maybe Ilpo has another wording preference... :)

Thanks,

Lukas
On Thu, 19 Dec 2024, Lukas Wunner wrote:

> On Thu, Dec 19, 2024 at 08:43:57AM +0900, Krzysztof Wilczyński wrote:
> > > > The GENMASK() macro used herein specifies 0 as lowest bit, even though
> > > > the Supported Link Speeds Vector ends at bit 1. This is done on purpose
> > > > to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro
> > > > would be invalid as the lowest bit is greater than the highest bit.
> > > > Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
> > > > Endpoints in particular, so it does occur in practice.
> > >
> > > Thanks for adding this extra information.
> > >
> > > I'd also add reference to r6.2 section 7.5.3 which states those registers
> > > are required for RPs, Switch Ports, Bridges, and Endpoints _that are not
> > > RCiEPs_. My reading is that implies they're not required from RCiEPs.
> >
> > Let me know how you would like to update the commit message. I will do it
> > directly on the branch.
>
> FWIW, I edited the commit message like this on my local branch:
>
> -Endpoints in particular, so it does occur in practice.
> +Endpoints in particular, so it does occur in practice. (The Link
> +Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.)
>
> In other words, I just added the sentence in parentheses.
> But maybe Ilpo has another wording preference... :)

Your wording is a good summary of the real substance, which is the spec
itself. :-)
Hello,

[...]

> > > > Thanks for adding this extra information.
> > > >
> > > > I'd also add reference to r6.2 section 7.5.3 which states those registers
> > > > are required for RPs, Switch Ports, Bridges, and Endpoints _that are not
> > > > RCiEPs_. My reading is that implies they're not required from RCiEPs.
> > >
> > > Let me know how you would like to update the commit message. I will do it
> > > directly on the branch.
> >
> > FWIW, I edited the commit message like this on my local branch:
> >
> > -Endpoints in particular, so it does occur in practice.
> > +Endpoints in particular, so it does occur in practice. (The Link
> > +Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.)
> >
> > In other words, I just added the sentence in parentheses.
> > But maybe Ilpo has another wording preference... :)
>
> Your wording is good summary for the real substance that is the spec
> itself. :-)

Updated. Thank you both!

Krzysztof
On Thu, Dec 19, 2024 at 12:50:59PM -0500, Bjorn Helgaas wrote:
> On Thu, Dec 19, 2024, 11:37AM Krzysztof Wilczynski <kw@linux.com> wrote:
> > > > > > I'd also add reference to r6.2 section 7.5.3 which states those
> > > > > > registers are required for RPs, Switch Ports, Bridges, and
> > > > > > Endpoints _that are not RCiEPs_. My reading is that implies
> > > > > > they're not required from RCiEPs.
>
> Don't have the spec with me, but I don't know what link-related registers
> would even mean for RCiEPs. Why would we look at them at all?

We don't: pcie_capability_read_dword() checks whether the register being
read is actually implemented by the device:

  pcie_capability_read_dword()
    pcie_capability_reg_implemented()
      pcie_cap_has_lnkctl()

And pcie_cap_has_lnkctl() returns false for PCI_EXP_TYPE_RC_END, in which
case pcie_capability_read_dword() just returns zero without accessing
Config Space.

Likewise accesses to PCI_EXP_LNKCAP2_SLS are short-circuited to zero if
the device only conforms to PCIe r1.1 or earlier and thus doesn't
implement the Link Capabilities 2 Register. (Recognizable by
PCI_EXP_FLAGS_VERS being 1 instead of 2.)

So pcie_get_supported_speeds() returns zero for such devices and that's
the value assigned to dev->supported_speeds for RCiEPs on probe.

Thanks,

Lukas
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 35dc9f2..b730560 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6240,12 +6240,14 @@ u8 pcie_get_supported_speeds(struct pci_dev *dev)
 	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
 	speeds = lnkcap2 & PCI_EXP_LNKCAP2_SLS;
 
+	/* Ignore speeds higher than Max Link Speed */
+	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+	speeds &= GENMASK(lnkcap & PCI_EXP_LNKCAP_SLS, 0);
+
 	/* PCIe r3.0-compliant */
 	if (speeds)
 		return speeds;
 
-	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
-
 	/* Synthesize from the Max Link Speed field */
 	if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
 		speeds = PCI_EXP_LNKCAP2_SLS_5_0GB | PCI_EXP_LNKCAP2_SLS_2_5GB;