diff mbox series

[v3,06/10] PCI: Cache PCIe device's Supported Speed Vector

Message ID 20230929115723.7864-7-ilpo.jarvinen@linux.intel.com (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Series Add PCIe Bandwidth Controller

Commit Message

Ilpo Järvinen Sept. 29, 2023, 11:57 a.m. UTC
The Supported Link Speeds Vector in the Link Capabilities Register 2
corresponds to the bus below on Root Ports and Downstream Ports,
whereas it corresponds to the bus above on Upstream Ports and
Endpoints. Only the former is currently cached in pcie_bus_speeds in
the struct pci_bus. The supported link speeds are the
intersection of these two.

Store the device's Supported Link Speeds Vector in struct pci_bus
when Function 0 is enumerated (Multi-Function Devices must support the
same speeds on all Functions) so that the intersection of Supported
Link Speeds is easy to calculate.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/probe.c  | 10 ++++++++++
 drivers/pci/remove.c |  2 ++
 include/linux/pci.h  |  1 +
 3 files changed, 13 insertions(+)

Comments

Lukas Wunner Dec. 30, 2023, 3:19 p.m. UTC | #1
On Fri, Sep 29, 2023 at 02:57:19PM +0300, Ilpo Järvinen wrote:
> The Supported Link Speeds Vector in the Link Capabilities Register 2
> corresponds to the bus below on Root Ports and Downstream Ports,
> whereas it corresponds to the bus above on Upstream Ports and
> Endpoints.

It would be good to add a pointer to the spec here.  I think the
relevant section is PCIe r6.1 sec 7.5.3.18 which says:

 "Supported Link Speeds Vector - This field indicates the supported
  Link speed(s) of the associated Port."
                       ^^^^^^^^^^^^^^^

Obviously the associated port is upstream on a Switch Upstream Port
or Endpoint, whereas it is downstream on a Switch Downstream Port
or Root Port.

Come to think of it, what about edge cases such as RCiEPs?


> Only the former is currently cached in pcie_bus_speeds in
> the struct pci_bus. The link speeds that are supported is the
> intersection of these two.

I'm wondering if caching both is actually necessary.  Why not cache
just the intersection?  Do we need either of the two somewhere?


> Store the device's Supported Link Speeds Vector into the struct pci_bus
> when the Function 0 is enumerated (the Multi-Function Devices must have
> same speeds the same for all Functions) to be easily able to calculate
> the intersection of Supported Link Speeds.

Might want to add an explanation of what you're going to need this for,
I assume it's accessed frequently by the bandwidth throttling driver
in a subsequent patch?

Thanks,

Lukas

Ilpo Järvinen Jan. 1, 2024, 6:31 p.m. UTC | #2
On Sat, 30 Dec 2023, Lukas Wunner wrote:

> On Fri, Sep 29, 2023 at 02:57:19PM +0300, Ilpo Järvinen wrote:
> > The Supported Link Speeds Vector in the Link Capabilities Register 2
> > corresponds to the bus below on Root Ports and Downstream Ports,
> > whereas it corresponds to the bus above on Upstream Ports and
> > Endpoints.
> 
> It would be good to add a pointer to the spec here.  I think the
> relevant section is PCIe r6.1 sec 7.5.3.18 which says:
> 
>  "Supported Link Speeds Vector - This field indicates the supported
>   Link speed(s) of the associated Port."
>                        ^^^^^^^^^^^^^^^
> 
> Obviously the associated port is upstream on a Switch Upstream Port
> or Endpoint, whereas it is downstream on a Switch Downstream Port
> or Root Port.
> 
> Come to think of it, what about edge cases such as RCiEPs?

On real HW I've seen, RCiEPs don't seem to have these speeds at all 
(PCIe r6.1, sec 7.5.3):

"The Link Capabilities, Link Status, and Link Control registers are 
required for all Root Ports, Switch Ports, Bridges, and Endpoints that are 
not RCiEPs. For Functions that do not implement the Link Capabilities, 
Link Status, and Link Control registers, these spaces must be hardwired to 
0. Link Capabilities 2, Link Status 2, and Link Control 2 registers are
required for all Root Ports, Switch Ports, Bridges, and Endpoints (except 
for RCiEPs) that implement capabilities requiring those registers. For 
Functions that do not implement the Link Capabilities 2, Link Status 2, 
and Link Control 2 registers, these spaces must be hardwired to 0b."
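
[Editorial sketch, not part of the thread] The port-direction rule discussed above, together with the RCiEP exception from the quoted spec text, can be modelled as a small classifier. This is a standalone illustration rather than kernel code: the enum names and the slsv_link_side() helper are hypothetical, and the numeric encodings are assumed to match the Device/Port Type field values the kernel exposes as PCI_EXP_TYPE_*. Bridge types are deliberately left unclassified because the thread does not cover them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Device/Port Type encodings (PCI Express Capabilities register). */
enum port_type {
	TYPE_ENDPOINT   = 0x0,
	TYPE_LEG_END    = 0x1,	/* Legacy Endpoint */
	TYPE_ROOT_PORT  = 0x4,
	TYPE_UPSTREAM   = 0x5,	/* Switch Upstream Port */
	TYPE_DOWNSTREAM = 0x6,	/* Switch Downstream Port */
	TYPE_RC_END     = 0x9,	/* Root Complex Integrated Endpoint */
};

/* Which link, relative to the function, its speed vector describes. */
enum link_side { LINK_NONE, LINK_ABOVE, LINK_BELOW, LINK_UNKNOWN };

static enum link_side slsv_link_side(unsigned int type)
{
	assert(type <= 0xf);	/* the type field is four bits wide */

	switch (type) {
	case TYPE_ROOT_PORT:
	case TYPE_DOWNSTREAM:
		return LINK_BELOW;	/* vector describes the bus below */
	case TYPE_UPSTREAM:
	case TYPE_ENDPOINT:
	case TYPE_LEG_END:
		return LINK_ABOVE;	/* vector describes the bus above */
	case TYPE_RC_END:
		return LINK_NONE;	/* link registers hardwired to 0 */
	default:
		return LINK_UNKNOWN;	/* not discussed in this thread */
	}
}
```

With this model, an RCiEP never contributes a speed vector, which is consistent with the observation that such devices do not seem to report these speeds at all.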

> > Only the former is currently cached in pcie_bus_speeds in
> > the struct pci_bus. The link speeds that are supported is the
> > intersection of these two.
> 
> I'm wondering if caching both is actually necessary.  Why not cache
> just the intersection?  Do we need either of the two somewhere?

Intersection is enough at least for bwctrl. The only downside that is 
barely worth mentioning is that the bus SLSV has to be re-read when
function 0 sets the intersection.

I can think of somebody wanting to expose both lists of supported speeds 
to userspace through sysfs (not done by this patch series), but they could 
be read from the registers in that case so that use case doesn't really 
matter much, IMO.
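
[Editorial sketch, not part of the thread] To make the trade-off concrete, here is a minimal userspace model of keeping both vectors and computing the intersection on demand. It is not the bwctrl implementation: struct bus_model, common_speeds() and highest_common_speed() are hypothetical names, and the SLS_* bit values assume the Link Capabilities 2 layout in which bit 0 is reserved and bit 1 means 2.5 GT/s.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Supported Link Speeds Vector bits; bit 0 is reserved. */
#define SLS_2_5GT	0x02
#define SLS_5_0GT	0x04
#define SLS_8_0GT	0x08
#define SLS_16_0GT	0x10
#define SLS_MASK	0xfe

/* Hypothetical stand-in for the two fields cached in struct pci_bus. */
struct bus_model {
	uint8_t pcie_bus_speeds;	/* from the port above the bus */
	uint8_t pcie_dev_speeds;	/* from Function 0 on the bus */
};

/* Speeds both ends of the link support; 0 if either side is unknown. */
static uint8_t common_speeds(const struct bus_model *bus)
{
	assert(bus);
	return bus->pcie_bus_speeds & bus->pcie_dev_speeds & SLS_MASK;
}

/* Highest speed bit set in the intersection, or 0 if it is empty. */
static uint8_t highest_common_speed(const struct bus_model *bus)
{
	uint8_t speeds = common_speeds(bus);
	uint8_t bit = 0x80;

	while (bit && !(speeds & bit))
		bit >>= 1;
	return bit;
}
```

Keeping both fields means the intersection is a single AND with no config space access. Caching only the intersection would save one byte but, as noted above, would require re-reading the bus-side vector whenever Function 0 sets it.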

> > Store the device's Supported Link Speeds Vector into the struct pci_bus
> > when the Function 0 is enumerated (the Multi-Function Devices must have
> > same speeds the same for all Functions) to be easily able to calculate
> > the intersection of Supported Link Speeds.
> 
> Might want to add an explanation what you're going to need this for,
> I assume it's accessed frequently by the bandwidth throttling driver
> in a subsequent patch?

Yes. I tend to try to avoid forward references because some maintainers 
complain about them (leading to minimal changes where true motivations 
have to be hidden because "future" cannot be used to motivate a change 
even if that's often the truest motivation within a patch series). But 
I'll add a fwd ref here to make it more obvious. :-)

Lukas Wunner Jan. 3, 2024, 4:51 p.m. UTC | #3
On Mon, Jan 01, 2024 at 08:31:39PM +0200, Ilpo Järvinen wrote:
> On Sat, 30 Dec 2023, Lukas Wunner wrote:
> > On Fri, Sep 29, 2023 at 02:57:19PM +0300, Ilpo Järvinen wrote:
> > > Only the former is currently cached in pcie_bus_speeds in
> > > the struct pci_bus. The link speeds that are supported is the
> > > intersection of these two.
> > 
> > I'm wondering if caching both is actually necessary.  Why not cache
> > just the intersection?  Do we need either of the two somewhere?
> 
> Intersection is enough at least for bwctrl. The only downside that is 
> barely worth mentioning is that the bus SLSV has to be re-read when
> function 0 sets the intersection.
>
> I can think of somebody wanting to expose the list of both supported speed 
> to userspace though sysfs (not done by this patch series), but they could 
> be read from the registers in that case so that use case doesn't really 
> matter much, IMO.

Yes, that would be a reasonable argument to keep both values instead
of storing just the intersection.


> > > Store the device's Supported Link Speeds Vector into the struct pci_bus
> > > when the Function 0 is enumerated (the Multi-Function Devices must have
> > > same speeds the same for all Functions) to be easily able to calculate
> > > the intersection of Supported Link Speeds.
> > 
> > Might want to add an explanation what you're going to need this for,
> > I assume it's accessed frequently by the bandwidth throttling driver
> > in a subsequent patch?
> 
> Yes. I tend to try to avoid forward references because some maintainers 
> complain about them (leading to minimal changes where true motivations 
> have to be hidden because "future" cannot be used to motivate a change 
> even if that's often the truest motivation within a patch series). But 
> I'll add a fwd ref here to make it more obvious. :-)

Bjorn has used phrases such as "We're about to ..." a couple of times
in commit messages to convey that a particular change in the present
patch will be taken advantage of by a subsequent patch.

I've used the same phrase but got criticized (in other subsystems)
for using "we".

So I use phrases such as:

 "An upcoming commit will create DOE mailboxes upon device enumeration by
  the PCI core.  Their lifetime shall not be limited by a driver.
  Therefore rework..." (see 022b66f38195)

Can also reference the past:

 "The PCI core has just been amended to create a pci_doe_mb struct for
  every DOE instance on device enumeration.  [...]  That leaves [...]
  without any callers, so drop them." (see 74e491e5d1bc)

If someone finds your commit e.g. through git blame, it may help them
enormously if you provide context in the commit message.  If maintainers
in other subsystems tell you otherwise, they're wrong. ;)

Thanks,

Lukas

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ca1d797a30cb..a9408f2420e5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2564,6 +2564,7 @@  static void pci_set_msi_domain(struct pci_dev *dev)
 
 void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 {
+	u8 dev_speeds = 0;
 	int ret;
 
 	pci_configure_device(dev);
@@ -2590,11 +2591,20 @@  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 
 	pci_init_capabilities(dev);
 
+	if (pci_is_pcie(dev) && PCI_FUNC(dev->devfn) == 0) {
+		u32 linkcap, linkcap2;
+
+		pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);
+		pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &linkcap2);
+		dev_speeds = pcie_get_supported_speeds(linkcap, linkcap2);
+	}
 	/*
 	 * Add the device to our list of discovered devices
 	 * and the bus list for fixup functions, etc.
 	 */
 	down_write(&pci_bus_sem);
+	if (dev_speeds)
+		bus->pcie_dev_speeds = dev_speeds;
 	list_add_tail(&dev->bus_list, &bus->devices);
 	up_write(&pci_bus_sem);
 
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index d749ea8250d6..656784cfb291 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -36,6 +36,8 @@  static void pci_destroy_dev(struct pci_dev *dev)
 	device_del(&dev->dev);
 
 	down_write(&pci_bus_sem);
+	if (pci_is_pcie(dev) && PCI_FUNC(dev->devfn) == 0)
+		dev->bus->pcie_dev_speeds = 0;
 	list_del(&dev->bus_list);
 	up_write(&pci_bus_sem);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index cb03f3ff9d23..b8bd3dc92032 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -665,6 +665,7 @@  struct pci_bus {
 	unsigned char	max_bus_speed;	/* enum pci_bus_speed */
 	unsigned char	cur_bus_speed;	/* enum pci_bus_speed */
 	u8		pcie_bus_speeds;/* Supported Link Speeds Vector (+ reserved 0 at LSB) */
+	u8		pcie_dev_speeds;/* Device's Supported Link Speeds Vector (+ 0 at LSB) */
 #ifdef CONFIG_PCI_DOMAINS_GENERIC
 	int		domain_nr;
 #endif
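
[Editorial sketch, not part of the patch] The caching logic in the hunks above can be exercised outside the kernel with a small model. Everything here is an assumption made for illustration: supported_speeds_sketch() is only a guess at what a helper such as pcie_get_supported_speeds() might do (prefer the Link Capabilities 2 vector, fall back to the Link Capabilities speed field on devices that predate it), the register masks are assumed to mirror the kernel's PCI_EXP_LNKCAP* definitions, and the pci_is_pcie() check and pci_bus_sem locking are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register layouts, mirroring PCI_EXP_LNKCAP* in the kernel. */
#define LNKCAP_SLS		0x0000000f	/* speed field in Link Capabilities */
#define LNKCAP_SLS_2_5GB	0x1
#define LNKCAP_SLS_5_0GB	0x2
#define LNKCAP2_SLS		0x000000fe	/* Supported Link Speeds Vector */
#define LNKCAP2_SLS_2_5GB	0x02
#define LNKCAP2_SLS_5_0GB	0x04

#define FUNC(devfn)	((devfn) & 0x07)	/* same idea as PCI_FUNC() */

/* Hypothetical stand-in for the new field in struct pci_bus. */
struct bus_model {
	uint8_t pcie_dev_speeds;
};

/* Guess at the helper: use LNKCAP2 if populated, else derive from LNKCAP. */
static uint8_t supported_speeds_sketch(uint32_t linkcap, uint32_t linkcap2)
{
	uint8_t speeds = linkcap2 & LNKCAP2_SLS;

	if (speeds)
		return speeds;

	switch (linkcap & LNKCAP_SLS) {
	case LNKCAP_SLS_5_0GB:
		return LNKCAP2_SLS_5_0GB | LNKCAP2_SLS_2_5GB;
	case LNKCAP_SLS_2_5GB:
		return LNKCAP2_SLS_2_5GB;
	default:
		return 0;
	}
}

/* Model of the pci_device_add() hunk: only Function 0 updates the cache. */
static void device_add_sketch(struct bus_model *bus, unsigned int devfn,
			      uint32_t linkcap, uint32_t linkcap2)
{
	uint8_t dev_speeds = 0;

	assert(bus);
	if (FUNC(devfn) == 0)
		dev_speeds = supported_speeds_sketch(linkcap, linkcap2);
	if (dev_speeds)
		bus->pcie_dev_speeds = dev_speeds;
}

/* Model of the pci_destroy_dev() hunk: removing Function 0 clears it. */
static void device_remove_sketch(struct bus_model *bus, unsigned int devfn)
{
	assert(bus);
	if (FUNC(devfn) == 0)
		bus->pcie_dev_speeds = 0;
}
```

In this model, adding Function 1 of a device leaves the cached vector untouched, adding Function 0 stores its vector, and removing Function 0 resets the field to 0 so that a stale value is never combined with the bus-side vector.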