[2/2] PCI: Ignore PCIe ports used for tunneling in pcie_bandwidth_available()

Message ID 20231031133438.5299-2-mario.limonciello@amd.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Series [1/2] PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header

Commit Message

Mario Limonciello Oct. 31, 2023, 1:34 p.m. UTC
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.

In reality the speed of these ports is controlled by the fabric implementation.

Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
to program the device will always find the PCIe ports used for
tunneling as a limiting factor and may make incorrect decisions.

To prevent problems in downstream drivers check explicitly for ports
being used for PCIe tunneling and skip them when looking for bandwidth
limitations.

2 types of devices are detected:
1) PCIe root port used for PCIe tunneling
2) Intel Thunderbolt 3 bridge

Downstream drivers could make this change on their own but then they
wouldn't be able to detect other potential speed bottlenecks.

Link: https://lore.kernel.org/linux-pci/7ad4b2ce-4ee4-429d-b5db-3dfc360f4c3e@amd.com/
Link: https://www.usb.org/document-library/usb4r-specification-v20
      USB4 V2 with Errata and ECN through June 2023 - CLEAN p710
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/pci/pci.c | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
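
As a rough illustration of the approach described in the commit message, here is a minimal userspace sketch of the upstream walk. It is not the kernel code: `struct fake_port`, `fake_bandwidth_available()` and the `skip_tunneling` switch are hypothetical names made up for this example, and the per-lane rates are supplied by hand rather than read from PCI_EXP_LNKSTA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel type. */
struct fake_port {
	const char *name;
	uint32_t lane_mbps;         /* per-lane rate after encoding, e.g. 2000 for 2.5GT/s */
	uint32_t width;             /* negotiated link width in lanes */
	bool tunneling;             /* TBT3 bridge or root port linked to a USB4 router */
	struct fake_port *upstream; /* next port toward the root, NULL at the top */
};

/*
 * Walk from dev toward the root and return the smallest link bandwidth
 * seen, in Mb/s.  When skip_tunneling is set, ports flagged as tunneling
 * are ignored because their advertised speed is not the real one.
 */
static uint32_t fake_bandwidth_available(struct fake_port *dev,
					 struct fake_port **limiting,
					 bool skip_tunneling)
{
	uint32_t bw = 0;

	assert(dev != NULL);
	if (limiting)
		*limiting = NULL;

	for (; dev; dev = dev->upstream) {
		uint32_t next_bw;

		/* skip root ports and bridges used for tunneling */
		if (skip_tunneling && dev->tunneling)
			continue;

		next_bw = dev->lane_mbps * dev->width;

		/* remember the slowest link found so far */
		if (!bw || next_bw <= bw) {
			bw = next_bw;
			if (limiting)
				*limiting = dev;
		}
	}

	return bw;
}
```

With an 8GT/s x4 device below a tunneling port that advertises 2.5GT/s x4, the walk reports the tunneling port as the limit unless it is skipped, which is the behaviour the patch is trying to avoid.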

Comments

kernel test robot Oct. 31, 2023, 11:02 p.m. UTC | #1
Hi Mario,

kernel test robot noticed the following build warnings:

[auto build test WARNING on pci/for-linus]
[also build test WARNING on westeri-thunderbolt/next linus/master v6.6 next-20231031]
[cannot apply to pci/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello/PCI-Ignore-PCIe-ports-used-for-tunneling-in-pcie_bandwidth_available/20231031-224221
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git for-linus
patch link:    https://lore.kernel.org/r/20231031133438.5299-2-mario.limonciello%40amd.com
patch subject: [PATCH 2/2] PCI: Ignore PCIe ports used for tunneling in pcie_bandwidth_available()
config: arc-randconfig-002-20231101 (https://download.01.org/0day-ci/archive/20231101/202311010646.KCczSLIW-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231101/202311010646.KCczSLIW-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311010646.KCczSLIW-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/pci/pci.c:6234: warning: Function parameter or member 'pdev' not described in 'pcie_is_tunneling_port'
>> drivers/pci/pci.c:6234: warning: Excess function parameter 'dev' description in 'pcie_is_tunneling_port'

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for VIDEO_OV7670
   Depends on [n]: MEDIA_SUPPORT [=y] && VIDEO_DEV [=y] && VIDEO_CAMERA_SENSOR [=n]
   Selected by [y]:
   - VIDEO_CAFE_CCIC [=y] && MEDIA_SUPPORT [=y] && MEDIA_PLATFORM_SUPPORT [=y] && MEDIA_PLATFORM_DRIVERS [=y] && V4L_PLATFORM_DRIVERS [=y] && PCI [=y] && I2C [=y] && VIDEO_DEV [=y] && COMMON_CLK [=y]


vim +6234 drivers/pci/pci.c

  6225	
  6226	/**
  6227	 * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
  6228	 * @dev: PCI device to check
  6229	 *
  6230	 * Returns true if the device is used for PCIe tunneling, false otherwise.
  6231	 */
  6232	static bool
  6233	pcie_is_tunneling_port(struct pci_dev *pdev)
> 6234	{
  6235		struct device_link *link;
  6236		struct pci_dev *supplier;
  6237	
  6238		/* Intel TBT3 bridge */
  6239		if (pdev->is_thunderbolt)
  6240			return true;
  6241	
  6242		if (!pci_is_pcie(pdev))
  6243			return false;
  6244	
  6245		if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
  6246			return false;
  6247	
  6248		/* PCIe root port used for tunneling linked to USB4 router */
  6249		list_for_each_entry(link, &pdev->dev.links.suppliers, c_node) {
  6250			supplier = to_pci_dev(link->supplier);
  6251			if (!supplier)
  6252				continue;
  6253			if (supplier->class == PCI_CLASS_SERIAL_USB_USB4)
  6254				return true;
  6255		}
  6256	
  6257		return false;
  6258	}
  6259
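
Both warnings come from the kernel-doc comment naming the parameter `@dev` while the function takes `pdev`. A minimal sketch of a comment that matches the signature could look like the following. The `struct pci_dev` here is a hypothetical one-field stub so that the fragment compiles outside the kernel, and only the TBT3 check is kept in the body.

```c
#include <stdbool.h>

/* Hypothetical stub standing in for the kernel's struct pci_dev. */
struct pci_dev {
	unsigned int is_thunderbolt:1;
};

/**
 * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
 * @pdev: PCI device to check
 *
 * Returns true if the device is used for PCIe tunneling, false otherwise.
 */
static bool pcie_is_tunneling_port(struct pci_dev *pdev)
{
	/* Only the Intel TBT3 check is kept; the USB4 link walk is omitted here. */
	return pdev->is_thunderbolt;
}
```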
Bjorn Helgaas Nov. 1, 2023, 10:52 p.m. UTC | #2
On Tue, Oct 31, 2023 at 08:34:38AM -0500, Mario Limonciello wrote:
> The USB4 spec specifies that PCIe ports that are used for tunneling
> PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.
> 
> In reality these ports speed is controlled by the fabric implementation.

So I guess you're saying the speed advertised by PCI_EXP_LNKSTA is not
the actual speed?  And we don't have a generic way to find the actual
speed?

> Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
> to program the device will always find the PCIe ports used for
> tunneling as a limiting factor and may make incorrect decisions.
> 
> To prevent problems in downstream drivers check explicitly for ports
> being used for PCIe tunneling and skip them when looking for bandwidth
> limitations.
> 
> 2 types of devices are detected:
> 1) PCIe root port used for PCIe tunneling
> 2) Intel Thunderbolt 3 bridge
> 
> Downstream drivers could make this change on their own but then they
> wouldn't be able to detect other potential speed bottlenecks.

Is the implication that a tunneling port can *never* be a speed
bottleneck?  That seems to be how this patch would work in practice.

> Link: https://lore.kernel.org/linux-pci/7ad4b2ce-4ee4-429d-b5db-3dfc360f4c3e@amd.com/
> Link: https://www.usb.org/document-library/usb4r-specification-v20
>       USB4 V2 with Errata and ECN through June 2023 - CLEAN p710

I guess this is sec 11.2.1 ("PCIe Physical Layer Logical Sub-block")
on PDF p710 (labeled "666" on the printed page).  How annoying that
the PDF page numbers don't match the printed ones; do the section
numbers at least stay stable in new spec revisions?

> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925

This issue says the external GPU doesn't work at all.  Does this patch
fix that?  This patch looks like it might improve GPU performance, but
wouldn't fix something that didn't work at all.

> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  drivers/pci/pci.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 59c01d68c6d5..4a7dc9c2b8f4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6223,6 +6223,40 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
>  }
>  EXPORT_SYMBOL(pcie_set_mps);
>  
> +/**
> + * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
> + * @dev: PCI device to check
> + *
> + * Returns true if the device is used for PCIe tunneling, false otherwise.
> + */
> +static bool
> +pcie_is_tunneling_port(struct pci_dev *pdev)

Use usual function signature styling (all on one line).

> +{
> +	struct device_link *link;
> +	struct pci_dev *supplier;
> +
> +	/* Intel TBT3 bridge */
> +	if (pdev->is_thunderbolt)
> +		return true;
> +
> +	if (!pci_is_pcie(pdev))
> +		return false;
> +
> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
> +		return false;
> +
> +	/* PCIe root port used for tunneling linked to USB4 router */
> +	list_for_each_entry(link, &pdev->dev.links.suppliers, c_node) {
> +		supplier = to_pci_dev(link->supplier);
> +		if (!supplier)
> +			continue;
> +		if (supplier->class == PCI_CLASS_SERIAL_USB_USB4)
> +			return true;

Since this is in drivers/pci, and this USB4/Thunderbolt routing is not
covered by the PCIe specs, this is basically black magic.  Is there a
reference to the USB4 spec we could include to help make it less
magical?

Lukas' brief intro in
https://lore.kernel.org/all/20230925141930.GA21033@wunner.de/ really
helped me connect a few dots, because things like
Documentation/admin-guide/thunderbolt.rst assume we already know those
details.

> +	}
> +
> +	return false;
> +}
> +
>  /**
>   * pcie_bandwidth_available - determine minimum link settings of a PCIe
>   *			      device and its bandwidth limitation
> @@ -6236,6 +6270,8 @@ EXPORT_SYMBOL(pcie_set_mps);
>   * limiting_dev, speed, and width pointers are supplied) information about
>   * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
>   * raw bandwidth.
> + *
> + * This function excludes root ports and bridges used for USB4 and TBT3 tunneling.
>   */
>  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  			     enum pci_bus_speed *speed,
> @@ -6254,6 +6290,10 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  	bw = 0;
>  
>  	while (dev) {
> +		/* skip root ports and bridges used for tunneling */
> +		if (pcie_is_tunneling_port(dev))
> +			goto skip;
> +
>  		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
>  
>  		next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
> @@ -6274,6 +6314,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  				*width = next_width;
>  		}
>  
> +skip:
>  		dev = pci_upstream_bridge(dev);
>  	}
>  
> -- 
> 2.34.1
>
Mario Limonciello Nov. 2, 2023, 1:14 a.m. UTC | #3
On 11/1/2023 17:52, Bjorn Helgaas wrote:
> On Tue, Oct 31, 2023 at 08:34:38AM -0500, Mario Limonciello wrote:
>> The USB4 spec specifies that PCIe ports that are used for tunneling
>> PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.
>>
>> In reality these ports speed is controlled by the fabric implementation.
> 
> So I guess you're saying the speed advertised by PCI_EXP_LNKSTA is not
> the actual speed?  And we don't have a generic way to find the actual
> speed?

Correct.
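
For reference, here is a rough userspace sketch of how the value advertised in PCI_EXP_LNKSTA turns into a bandwidth figure. The field masks follow the PCIe Link Status register layout (speed in bits 3:0, width in bits 9:4); the helper names are made up for this example and only speeds up to 16GT/s are modeled.

```c
#include <stdint.h>

#define LNKSTA_CLS       0x000f  /* Current Link Speed, bits 3:0 */
#define LNKSTA_NLW       0x03f0  /* Negotiated Link Width, bits 9:4 */
#define LNKSTA_NLW_SHIFT 4

/* Per-lane Mb/s after encoding overhead, indexed by the CLS field. */
static uint32_t cls_to_lane_mbps(unsigned int cls)
{
	switch (cls) {
	case 1: return 2500 * 8 / 10;     /* 2.5GT/s, 8b/10b */
	case 2: return 5000 * 8 / 10;     /* 5GT/s, 8b/10b */
	case 3: return 8000 * 128 / 130;  /* 8GT/s, 128b/130b */
	case 4: return 16000 * 128 / 130; /* 16GT/s, 128b/130b */
	default: return 0;                /* not modeled in this sketch */
	}
}

/* Raw link bandwidth in Mb/s as advertised by a Link Status value. */
static uint32_t lnksta_to_mbps(uint16_t lnksta)
{
	unsigned int cls = lnksta & LNKSTA_CLS;
	unsigned int width = (lnksta & LNKSTA_NLW) >> LNKSTA_NLW_SHIFT;

	return cls_to_lane_mbps(cls) * width;
}
```

A tunneling port that is hardcoded to 2.5GT/s therefore advertises 2000 Mb/s per lane no matter what the fabric actually delivers.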

> 
>> Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
>> to program the device will always find the PCIe ports used for
>> tunneling as a limiting factor and may make incorrect decisions.
>>
>> To prevent problems in downstream drivers check explicitly for ports
>> being used for PCIe tunneling and skip them when looking for bandwidth
>> limitations.
>>
>> 2 types of devices are detected:
>> 1) PCIe root port used for PCIe tunneling
>> 2) Intel Thunderbolt 3 bridge
>>
>> Downstream drivers could make this change on their own but then they
>> wouldn't be able to detect other potential speed bottlenecks.
> 
> Is the implication that a tunneling port can *never* be a speed
> bottleneck?  That seems to be how this patch would work in practice.

I think that's a stretch we should avoid concluding.

IIUC the fabric can be hosting other traffic and it's entirely possible 
the traffic over the tunneling port runs more slowly at times.

Perhaps that's why the USB4 spec decided to advertise it this way? 
I don't know.

> 
>> Link: https://lore.kernel.org/linux-pci/7ad4b2ce-4ee4-429d-b5db-3dfc360f4c3e@amd.com/
>> Link: https://www.usb.org/document-library/usb4r-specification-v20
>>        USB4 V2 with Errata and ECN through June 2023 - CLEAN p710
> 
> I guess this is sec 11.2.1 ("PCIe Physical Layer Logical Sub-block")
> on PDF p710 (labeled "666" on the printed page).  How annoying that
> the PDF page numbers don't match the printed ones; do the section
> numbers at least stay stable in new spec revisions?

I'd hope so.  I'll change it to section numbers in the next revision.

> 
>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
> 
> This issue says the external GPU doesn't work at all.  Does this patch
> fix that?  This patch looks like it might improve GPU performance, but
> wouldn't fix something that didn't work at all.

The issue actually identified 4 distinct problems.  The first 3, which 
are functional, will be fixed in amdgpu.

This performance one was from later in the ticket after some back and 
forth identifying proper solutions for the first 3.

> 
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>   drivers/pci/pci.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 41 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 59c01d68c6d5..4a7dc9c2b8f4 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6223,6 +6223,40 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
>>   }
>>   EXPORT_SYMBOL(pcie_set_mps);
>>   
>> +/**
>> + * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
>> + * @dev: PCI device to check
>> + *
>> + * Returns true if the device is used for PCIe tunneling, false otherwise.
>> + */
>> +static bool
>> +pcie_is_tunneling_port(struct pci_dev *pdev)
> 
> Use usual function signature styling (all on one line).

OK.

> 
>> +{
>> +	struct device_link *link;
>> +	struct pci_dev *supplier;
>> +
>> +	/* Intel TBT3 bridge */
>> +	if (pdev->is_thunderbolt)
>> +		return true;
>> +
>> +	if (!pci_is_pcie(pdev))
>> +		return false;
>> +
>> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
>> +		return false;
>> +
>> +	/* PCIe root port used for tunneling linked to USB4 router */
>> +	list_for_each_entry(link, &pdev->dev.links.suppliers, c_node) {
>> +		supplier = to_pci_dev(link->supplier);
>> +		if (!supplier)
>> +			continue;
>> +		if (supplier->class == PCI_CLASS_SERIAL_USB_USB4)
>> +			return true;
> 
> Since this is in drivers/pci, and this USB4/Thunderbolt routing is not
> covered by the PCIe specs, this is basically black magic.  Is there a
> reference to the USB4 spec we could include to help make it less
> magical?

The "magic" part is that there is an ACPI construct to indicate a PCIe 
port is linked to a USB4 router.

Here is a link to the page where it is explained:
https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/usb4-acpi-requirements#port-mapping-_dsd-for-usb-3x-and-pcie

On the Linux side this link is created in the 'thunderbolt' driver.

Thinking about this again, this does actually mean we could have a 
different result based on whether pcie_bandwidth_available() is called 
before or after the 'thunderbolt' driver has loaded.

For example if a GPU driver that called pcie_bandwidth_available() was 
in the initramfs but 'thunderbolt' was in the rootfs we might end up 
with the wrong result again.

Considering this I think it's a good idea to move that creation of the 
device link into drivers/pci/pci-acpi.c and store a bit in struct 
pci_dev to indicate it's a tunneled port.

Then 'thunderbolt' can look for this directly instead of walking all the 
FW nodes.

pcie_bandwidth_available() can just look at the tunneled port bit 
instead of the existence of the device link.
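
To make the ordering concern concrete, here is a small userspace model. Everything in it is hypothetical (`struct model_port`, the `fw_usb4_mapping` field, and the helper names); it only contrasts a check that depends on the device link with a check that depends on a bit set at enumeration time.

```c
#include <stdbool.h>

/* Hypothetical model of a root port; none of this is a kernel type. */
struct model_port {
	bool fw_usb4_mapping; /* firmware describes a USB4 port mapping for this port */
	bool has_usb4_link;   /* device link, created only when 'thunderbolt' probes */
	bool is_tunneled;     /* proposed bit, set while the port is enumerated */
};

/* Models enumeration: the bit can be set from firmware data right away. */
static void model_enumerate(struct model_port *p, bool fw_usb4_mapping)
{
	p->fw_usb4_mapping = fw_usb4_mapping;
	p->has_usb4_link = false;
	p->is_tunneled = fw_usb4_mapping;
}

/* Models the 'thunderbolt' driver creating the device link at probe time. */
static void model_thunderbolt_probe(struct model_port *p)
{
	if (p->fw_usb4_mapping)
		p->has_usb4_link = true;
}

/* Detection as in this patch: depends on the link already existing. */
static bool model_detect_by_link(const struct model_port *p)
{
	return p->has_usb4_link;
}

/* Detection as proposed: depends only on the enumeration-time bit. */
static bool model_detect_by_bit(const struct model_port *p)
{
	return p->is_tunneled;
}
```

In this model a caller that runs between enumeration and the 'thunderbolt' probe gets a different answer from the link-based check than from the bit-based one, which is the initramfs/rootfs case described above.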

> 
> Lukas' brief intro in
> https://lore.kernel.org/all/20230925141930.GA21033@wunner.de/ really
> helped me connect a few dots, because things like
> Documentation/admin-guide/thunderbolt.rst assume we already know those
> details.

Thanks for sharing that.  If I move the detection mechanism as I 
suggested above I'll reference some of that as well in the commit 
message to explain what exactly a tunneled port is.

> 
>> +	}
>> +
>> +	return false;
>> +}
>> +
>>   /**
>>    * pcie_bandwidth_available - determine minimum link settings of a PCIe
>>    *			      device and its bandwidth limitation
>> @@ -6236,6 +6270,8 @@ EXPORT_SYMBOL(pcie_set_mps);
>>    * limiting_dev, speed, and width pointers are supplied) information about
>>    * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
>>    * raw bandwidth.
>> + *
>> + * This function excludes root ports and bridges used for USB4 and TBT3 tunneling.
>>    */
>>   u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>>   			     enum pci_bus_speed *speed,
>> @@ -6254,6 +6290,10 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>>   	bw = 0;
>>   
>>   	while (dev) {
>> +		/* skip root ports and bridges used for tunneling */
>> +		if (pcie_is_tunneling_port(dev))
>> +			goto skip;
>> +
>>   		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
>>   
>>   		next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
>> @@ -6274,6 +6314,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>>   				*width = next_width;
>>   		}
>>   
>> +skip:
>>   		dev = pci_upstream_bridge(dev);
>>   	}
>>   
>> -- 
>> 2.34.1
>>
Mika Westerberg Nov. 2, 2023, 10:31 a.m. UTC | #4
On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> On 11/1/2023 17:52, Bjorn Helgaas wrote:
> > On Tue, Oct 31, 2023 at 08:34:38AM -0500, Mario Limonciello wrote:
> > > The USB4 spec specifies that PCIe ports that are used for tunneling
> > > PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.
> > > 
> > > In reality these ports speed is controlled by the fabric implementation.
> > 
> > So I guess you're saying the speed advertised by PCI_EXP_LNKSTA is not
> > the actual speed?  And we don't have a generic way to find the actual
> > speed?
> 
> Correct.
> 
> > 
> > > Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
> > > to program the device will always find the PCIe ports used for
> > > tunneling as a limiting factor and may make incorrect decisions.
> > > 
> > > To prevent problems in downstream drivers check explicitly for ports
> > > being used for PCIe tunneling and skip them when looking for bandwidth
> > > limitations.
> > > 
> > > 2 types of devices are detected:
> > > 1) PCIe root port used for PCIe tunneling
> > > 2) Intel Thunderbolt 3 bridge
> > > 
> > > Downstream drivers could make this change on their own but then they
> > > wouldn't be able to detect other potential speed bottlenecks.
> > 
> > Is the implication that a tunneling port can *never* be a speed
> > bottleneck?  That seems to be how this patch would work in practice.
> 
> I think that's a stretch we should avoid concluding.
> 
> IIUC the fabric can be hosting other traffic and it's entirely possible the
> traffic over the tunneling port runs more slowly at times.
> 
> Perhaps that's why the the USB4 spec decided to advertise it this way? I
> don't know.
> 
> > 
> > > Link: https://lore.kernel.org/linux-pci/7ad4b2ce-4ee4-429d-b5db-3dfc360f4c3e@amd.com/
> > > Link: https://www.usb.org/document-library/usb4r-specification-v20
> > >        USB4 V2 with Errata and ECN through June 2023 - CLEAN p710
> > 
> > I guess this is sec 11.2.1 ("PCIe Physical Layer Logical Sub-block")
> > on PDF p710 (labeled "666" on the printed page).  How annoying that
> > the PDF page numbers don't match the printed ones; do the section
> > numbers at least stay stable in new spec revisions?
> 
> I'd hope so.  I'll change it to section numbers in the next revision.
> 
> > 
> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
> > 
> > This issue says the external GPU doesn't work at all.  Does this patch
> > fix that?  This patch looks like it might improve GPU performance, but
> > wouldn't fix something that didn't work at all.
> 
> The issue actually identified 4 distinct different problems.  The 3 problems
> will be fixed in amdgpu which are functional.
> 
> This performance one was from later in the ticket after some back and forth
> identifying proper solutions for the first 3.
> 
> > 
> > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > ---
> > >   drivers/pci/pci.c | 41 +++++++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 41 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 59c01d68c6d5..4a7dc9c2b8f4 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -6223,6 +6223,40 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
> > >   }
> > >   EXPORT_SYMBOL(pcie_set_mps);
> > > +/**
> > > + * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
> > > + * @dev: PCI device to check
> > > + *
> > > + * Returns true if the device is used for PCIe tunneling, false otherwise.
> > > + */
> > > +static bool
> > > +pcie_is_tunneling_port(struct pci_dev *pdev)
> > 
> > Use usual function signature styling (all on one line).
> 
> OK.
> 
> > 
> > > +{
> > > +	struct device_link *link;
> > > +	struct pci_dev *supplier;
> > > +
> > > +	/* Intel TBT3 bridge */
> > > +	if (pdev->is_thunderbolt)
> > > +		return true;
> > > +
> > > +	if (!pci_is_pcie(pdev))
> > > +		return false;
> > > +
> > > +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
> > > +		return false;
> > > +
> > > +	/* PCIe root port used for tunneling linked to USB4 router */
> > > +	list_for_each_entry(link, &pdev->dev.links.suppliers, c_node) {
> > > +		supplier = to_pci_dev(link->supplier);
> > > +		if (!supplier)
> > > +			continue;
> > > +		if (supplier->class == PCI_CLASS_SERIAL_USB_USB4)
> > > +			return true;
> > 
> > Since this is in drivers/pci, and this USB4/Thunderbolt routing is not
> > covered by the PCIe specs, this is basically black magic.  Is there a
> > reference to the USB4 spec we could include to help make it less
> > magical?
> 
> The "magic" part is that there is an ACPI construct to indicate a PCIe port
> is linked to a USB4 router.
> 
> Here is a link to the page that is explained:
> https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/usb4-acpi-requirements#port-mapping-_dsd-for-usb-3x-and-pcie
> 
> In the Linux side this link is created in the 'thunderbolt' driver.
> 
> Thinking about this again, this does actually mean we could have a different
> result based on whether pcie_bandwidth_available() is called before or after
> the 'thunderbolt' driver has loaded.
> 
> For example if a GPU driver that called pcie_bandwidth_available() was in
> the initramfs but 'thunderbolt' was in the rootfs we might end up with the
> wrong result again.

Right, that's possible if the boot firmware has support for a connection
manager. Although we do reset the whole topology with the USB4 v2 host
routers this is kept as is for v1.

> Considering this I think it's a good idea to move that creation of the
> device link into drivers/pci/pci-acpi.c and store a bit in struct pci_device
> to indicate it's a tunneled port.

Note it currently is setting the link between xHCI and the
USB4/Thunderbolt host controller but we may want to change it later to
link between USB 3.x port and the USB4/Thunderbolt host to allow more
fine grained power management, this is especially true with the new USB
Gen T tunneling. So for now it is only PCI but we may need to touch the
USB stack too (perhaps put it in drivers/acpi/ instead).

> Then 'thunderbolt' can look for this directly instead of walking all the FW
> nodes.
> 
> pcie_bandwidth_available() can just look at the tunneled port bit instead of
> the existence of the device link.
> 
> > 
> > Lukas' brief intro in
> > https://lore.kernel.org/all/20230925141930.GA21033@wunner.de/ really
> > helped me connect a few dots, because things like
> > Documentation/admin-guide/thunderbolt.rst assume we already know those
> > details.
> 
> Thanks for sharing that.  If I move the detection mechanism as I suggested
> above I'll reference some of that as well in the commit message to explain
> what exactly a tunneled port is.

I'm not sure it makes sense to explain from scratch all this stuff that
people can easily look up in the corresponding spec, such as PCIe or
USB.

There is a good picture in USB4 v2 ch 2.2.3 about paths crossing USB4
fabric, perhaps reference that one? Or ch 2.2.10.3 that shows how this
works with PCIe tunneling instead (although they are similar).
Bjorn Helgaas Nov. 2, 2023, 12:07 p.m. UTC | #5
On Thu, Nov 02, 2023 at 12:31:08PM +0200, Mika Westerberg wrote:
> On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> > On 11/1/2023 17:52, Bjorn Helgaas wrote:

> > > Lukas' brief intro in
> > > https://lore.kernel.org/all/20230925141930.GA21033@wunner.de/ really
> > > helped me connect a few dots, because things like
> > > Documentation/admin-guide/thunderbolt.rst assume we already know those
> > > details.
> > 
> > Thanks for sharing that.  If I move the detection mechanism as I suggested
> > above I'll reference some of that as well in the commit message to explain
> > what exactly a tunneled port is.
> 
> I'm not sure it makes sense to explain from the zero all this stuff that
> people can easily look up from the corresponding spec, such as PCIe or
> USB.

I don't know if it needs to be in the commit log.

I mentioned thunderbolt.rst because the text at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/thunderbolt.rst?id=v6.6#n6
assumes that we know the terms "host router", "host controller",
"router", "tunnel", "connection manager", and I don't think that's a
good assumption in that documentation.

A little bit of introduction based on Lukas' text could improve that.

> There is a good picture in USB4 v2 ch 2.2.3 about paths crossing USB4
> fabric, perhaps reference that one? Or ch 2.2.10.3 that shows how this
> works with PCIe tunneling instead (although they are similar).

Thanks for these!

Bjorn
Mika Westerberg Nov. 2, 2023, 12:17 p.m. UTC | #6
On Thu, Nov 02, 2023 at 07:07:39AM -0500, Bjorn Helgaas wrote:
> On Thu, Nov 02, 2023 at 12:31:08PM +0200, Mika Westerberg wrote:
> > On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> > > On 11/1/2023 17:52, Bjorn Helgaas wrote:
> 
> > > > Lukas' brief intro in
> > > > https://lore.kernel.org/all/20230925141930.GA21033@wunner.de/ really
> > > > helped me connect a few dots, because things like
> > > > Documentation/admin-guide/thunderbolt.rst assume we already know those
> > > > details.
> > > 
> > > Thanks for sharing that.  If I move the detection mechanism as I suggested
> > > above I'll reference some of that as well in the commit message to explain
> > > what exactly a tunneled port is.
> > 
> > I'm not sure it makes sense to explain from the zero all this stuff that
> > people can easily look up from the corresponding spec, such as PCIe or
> > USB.
> 
> I don't know if it needs to be in the commit log.
> 
> I mentioned thunderbolt.rst because the text at
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/thunderbolt.rst?id=v6.6#n6
> assumes that we know the terms "host router", "host controller",
> "router", "tunnel", "connection manager", and I don't think that's a
> good assumption in that documentation.
> 
> A little bit of introduction based on Lukas' text could improve that.

All these are explained in the USB4 spec, I wonder if we should just
link that in the document rather than explaining all of them there.
Anyway, point taken, thanks for the feedback!
Lukas Wunner Nov. 2, 2023, 3:21 p.m. UTC | #7
On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> Considering this I think it's a good idea to move that creation of the
> device link into drivers/pci/pci-acpi.c and store a bit in struct pci_device
> to indicate it's a tunneled port.
> 
> Then 'thunderbolt' can look for this directly instead of walking all the FW
> nodes.
> 
> pcie_bandwidth_available() can just look at the tunneled port bit instead of
> the existence of the device link.

pci_is_thunderbolt_attached() should already be doing exactly what
you want to achieve with the new bit.  It tells you whether a PCI
device is behind a Thunderbolt tunnel.  So I don't think a new bit
is actually needed.

Thanks,

Lukas
Mario Limonciello Nov. 2, 2023, 3:26 p.m. UTC | #8
On 11/2/2023 10:21, Lukas Wunner wrote:
> On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
>> Considering this I think it's a good idea to move that creation of the
>> device link into drivers/pci/pci-acpi.c and store a bit in struct pci_device
>> to indicate it's a tunneled port.
>>
>> Then 'thunderbolt' can look for this directly instead of walking all the FW
>> nodes.
>>
>> pcie_bandwidth_available() can just look at the tunneled port bit instead of
>> the existence of the device link.
> 
> pci_is_thunderbolt_attached() should already be doing exactly what
> you want to achieve with the new bit.  It tells you whether a PCI
> device is behind a Thunderbolt tunnel.  So I don't think a new bit
> is actually needed.
> 
> Thanks,
> 
> Lukas

It's only for a device connected to an Intel TBT3 controller though; it 
won't apply to USB4.
Lukas Wunner Nov. 2, 2023, 3:33 p.m. UTC | #9
On Thu, Nov 02, 2023 at 10:26:31AM -0500, Mario Limonciello wrote:
> On 11/2/2023 10:21, Lukas Wunner wrote:
> > On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> > > Considering this I think it's a good idea to move that creation of the
> > > device link into drivers/pci/pci-acpi.c and store a bit in struct
> > > pci_device to indicate it's a tunneled port.
> > > 
> > > Then 'thunderbolt' can look for this directly instead of walking all
> > > the FW nodes.
> > > 
> > > pcie_bandwidth_available() can just look at the tunneled port bit
> > > instead of the existence of the device link.
> > 
> > pci_is_thunderbolt_attached() should already be doing exactly what
> > you want to achieve with the new bit.  It tells you whether a PCI
> > device is behind a Thunderbolt tunnel.  So I don't think a new bit
> > is actually needed.
> 
> It's only for a device connected to an Intel TBT3 controller though; it
> won't apply to USB4.

Time to resurrect this patch here...? :)

https://lore.kernel.org/all/20220204182820.130339-3-mario.limonciello@amd.com/
Bjorn Helgaas Nov. 2, 2023, 3:47 p.m. UTC | #10
On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> On 11/1/2023 17:52, Bjorn Helgaas wrote:
> > On Tue, Oct 31, 2023 at 08:34:38AM -0500, Mario Limonciello wrote:
> > > The USB4 spec specifies that PCIe ports that are used for tunneling
> > > PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s.
> > > 
> > > In reality these ports speed is controlled by the fabric implementation.
> > 
> > So I guess you're saying the speed advertised by PCI_EXP_LNKSTA is not
> > the actual speed?  And we don't have a generic way to find the actual
> > speed?
> 
> Correct.
> 
> > > Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
> > > to program the device will always find the PCIe ports used for
> > > tunneling as a limiting factor and may make incorrect decisions.
> > > 
> > > To prevent problems in downstream drivers check explicitly for ports
> > > being used for PCIe tunneling and skip them when looking for bandwidth
> > > limitations.
> > > 
> > > 2 types of devices are detected:
> > > 1) PCIe root port used for PCIe tunneling
> > > 2) Intel Thunderbolt 3 bridge
> > > 
> > > Downstream drivers could make this change on their own but then they
> > > wouldn't be able to detect other potential speed bottlenecks.
> > 
> > Is the implication that a tunneling port can *never* be a speed
> > bottleneck?  That seems to be how this patch would work in practice.
> 
> I think that's a stretch we should avoid concluding.

I'm just reading the description and the patch, which seem to say that
pcie_bandwidth_available() will never report a tunneling port as the
limiting port.

Maybe this can be rectified with a comment about how we can't tell the
actual bandwidth of a tunneled port, and it may be a hidden unreported
bottleneck, so pcie_bandwidth_available() can't actually return a
reliable result.  Seems sort of unsatisfactory, but ... I dunno, maybe
it's the best we can do.
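
To illustrate the trade-off, here is a minimal user-space sketch (not the
kernel code) of a bandwidth walk that skips tunneled ports but also records
that it did so, so that a caller could tell the result may hide a bottleneck.
`struct port`, `min_bandwidth()` and the `unreliable` flag are hypothetical
names used only for illustration; pcie_bandwidth_available() has no such flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of one hop in the PCIe hierarchy. */
struct port {
	const struct port *upstream;	/* NULL at the top of the hierarchy */
	uint32_t bw_mbps;		/* bandwidth implied by advertised speed and width */
	bool tunneled;			/* advertised speed is not the real speed */
};

/*
 * Walk from @dev towards the root and return the smallest advertised
 * bandwidth, ignoring tunneled ports.  *unreliable is set when at least
 * one port was skipped.  Returns 0 if every port was skipped.
 */
static uint32_t min_bandwidth(const struct port *dev, bool *unreliable)
{
	uint32_t bw = 0;

	assert(unreliable != NULL);
	*unreliable = false;

	for (; dev; dev = dev->upstream) {
		if (dev->tunneled) {
			/* real speed unknown: may be a hidden bottleneck */
			*unreliable = true;
			continue;
		}
		if (!bw || dev->bw_mbps < bw)
			bw = dev->bw_mbps;
	}

	return bw;
}
```

With a tunneled root port advertising 2.5GT/s x1 at the top of the chain, the
sketch reports the next slowest link and sets the flag instead of reporting
the root port as the limit.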

> IIUC the fabric can be hosting other traffic and it's entirely possible the
> traffic over the tunneling port runs more slowly at times.
> 
> Perhaps that's why the USB4 spec decided to advertise it this way? I
> don't know.

Maybe, although the same happens on shared PCIe links above switches.

> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
> > 
> > This issue says the external GPU doesn't work at all.  Does this patch
> > fix that?  This patch looks like it might improve GPU performance, but
> > wouldn't fix something that didn't work at all.
> 
> The issue actually identified 4 distinct problems.  The first 3, which are
> functional, will be fixed in amdgpu.
> 
> This performance one was from later in the ticket after some back and forth
> identifying proper solutions for the first 3.

There's a lot of material in that report.  Is there a way to link to
the specific part related to performance?

> > > + * This function excludes root ports and bridges used for USB4 and TBT3 tunneling.

Wrap to fit in 80 columns like the rest of the file.

> > >    */
> > >   u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > >   			     enum pci_bus_speed *speed,
> > > @@ -6254,6 +6290,10 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > >   	bw = 0;
> > >   	while (dev) {
> > > +		/* skip root ports and bridges used for tunneling */
> > > +		if (pcie_is_tunneling_port(dev))
> > > +			goto skip;
> > > +
> > >   		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> > >   		next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
> > > @@ -6274,6 +6314,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > >   				*width = next_width;
> > >   		}
> > > +skip:
> > >   		dev = pci_upstream_bridge(dev);
> > >   	}
Mario Limonciello Nov. 2, 2023, 4:22 p.m. UTC | #11
On 11/2/2023 10:33, Lukas Wunner wrote:
> On Thu, Nov 02, 2023 at 10:26:31AM -0500, Mario Limonciello wrote:
>> On 11/2/2023 10:21, Lukas Wunner wrote:
>>> On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
>>>> Considering this I think it's a good idea to move that creation of the
>>>> device link into drivers/pci/pci-acpi.c and store a bit in struct
>>>> pci_dev to indicate it's a tunneled port.
>>>>
>>>> Then 'thunderbolt' can look for this directly instead of walking all
>>>> the FW nodes.
>>>>
>>>> pcie_bandwidth_available() can just look at the tunneled port bit
>>>> instead of the existence of the device link.
>>>
>>> pci_is_thunderbolt_attached() should already be doing exactly what
>>> you want to achieve with the new bit.  It tells you whether a PCI
>>> device is behind a Thunderbolt tunnel.  So I don't think a new bit
>>> is actually needed.
>>
>> It's only for a device connected to an Intel TBT3 controller though; it
>> won't apply to USB4.
> 
> Time to resurrect this patch here...? :)
> 
> https://lore.kernel.org/all/20220204182820.130339-3-mario.limonciello@amd.com/

That thought crossed my mind, but I don't think it's actually correct.
That's the major reason I didn't resurrect that series.

The PCIe topology looks like this:

├─PCIe tunneled root port
|  └─PCIe bridge/switch (TBT3 or USB4 hub)
|    └─PCIe device
└─PCIe root port
   └─USB 4 Router

In this topology the USB4 PCIe class device is going to be the USB4 
router.  This *isn't* a tunneled device.

The two problematic devices are going to be that PCIe bridge (TBT or 
USB4 hub) and PCIe tunneled root port.
Looking for the class is going to mark the wrong device: the "USB 4
Router".

I looked through the USB4 spec again and I don't see any way that such a 
port can be distinguished.

I feel the correct way to identify it is via the relationship specified 
in ACPI.
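
The difference between the two approaches can be sketched with a small
user-space model of the topology above.  This is only an illustration: the
`struct node` type, the `supplier` pointer (a stand-in for the relationship
described in ACPI) and both helper names are hypothetical, and the class
values are written as they would appear in the 24-bit class field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* 24-bit class codes: base class, sub-class, programming interface. */
#define CLASS_BRIDGE_PCI	0x060400u	/* PCI-to-PCI bridge / root port */
#define CLASS_USB4		0x0c0340u	/* PCI_CLASS_SERIAL_USB_USB4 */

struct node {
	const char *name;
	unsigned int class_code;
	const struct node *supplier;	/* hypothetical stand-in for the ACPI link */
};

static const struct node usb4_router = {
	"USB 4 Router", CLASS_USB4, NULL
};
static const struct node plain_root_port = {
	"PCIe root port", CLASS_BRIDGE_PCI, NULL
};
static const struct node tunneled_root_port = {
	"PCIe tunneled root port", CLASS_BRIDGE_PCI, &usb4_router
};

/* Approach 1: mark any device carrying the USB4 class code. */
static bool marked_by_class(const struct node *n)
{
	assert(n != NULL);
	return n->class_code == CLASS_USB4;
}

/* Approach 2: mark a device whose supplier is a USB4 router. */
static bool marked_by_link(const struct node *n)
{
	assert(n != NULL);
	return n->supplier && n->supplier->class_code == CLASS_USB4;
}
```

In this model the class check marks only the router, which carries no
tunneled PCIe traffic, while the link check marks only the tunneled root port.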

FWIW I also think that all the kernel users of 
pci_is_thunderbolt_attached() *should* be using dev_is_removable().

amdgpu is going to be switching over to this as one of the fixes I 
mentioned for that bug:
https://patchwork.freedesktop.org/patch/564738/

If nouveau and radeon also switch over we should probably axe the 
function pci_is_thunderbolt_attached() altogether.

If you guys agree I can send out a separate series for this to go after 
the amdgpu patch merges.
Lukas Wunner Nov. 2, 2023, 5:28 p.m. UTC | #12
On Thu, Nov 02, 2023 at 10:47:48AM -0500, Bjorn Helgaas wrote:
> On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> > On 11/1/2023 17:52, Bjorn Helgaas wrote:
> > > Is the implication that a tunneling port can *never* be a speed
> > > bottleneck?  That seems to be how this patch would work in practice.
> > 
> > I think that's a stretch we should avoid concluding.
> 
> I'm just reading the description and the patch, which seem to say that
> pcie_bandwidth_available() will never report a tunneling port as the
> limiting port.

If the Thunderbolt host controller is a discrete chip attached with PCIe,
the bandwidth is capped by its Switch Upstream Port.

E.g. the "Light Ridge" Thunderbolt 1 controller's Switch Upstream Port
supports 5 GT/s at x4 width.

In contemporary systems, the Thunderbolt controller is often part of the
CPU SoC, so attached Thunderbolt devices appear below a Root Port.
In that case, there's no such limitation.

Additionally the bandwidth is limited by the Thunderbolt generation:
Thunderbolt 1 has two bidirectional 10 GBit/s channels,
Thunderbolt 2 has 20 GBit/s total, Thunderbolt 3 & 4 have 40 GBit/s total:

https://en.wikipedia.org/wiki/Thunderbolt_(interface)

Hence assuming "unlimited" capacity for Thunderbolt wouldn't be accurate.
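
As a rough worked example of these numbers, the following user-space sketch
computes the raw bandwidth of a PCIe link after line encoding and compares it
with a fabric rate.  The helper names are hypothetical, the encoding factors
(8b/10b up to 5 GT/s, 128b/130b for 8 to 32 GT/s) are the usual PCIe ones,
and the fabric figure is a nominal rate that ignores other tunneled traffic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Raw PCIe link bandwidth in Mb/s after line encoding.
 * @mt_per_s: transfer rate in MT/s (2500, 5000, 8000, ...)
 * @lanes:    link width
 * Rates above 32 GT/s are not modelled here.
 */
static uint32_t pcie_link_mbps(uint32_t mt_per_s, uint32_t lanes)
{
	uint32_t per_lane;

	assert(lanes > 0);

	if (mt_per_s <= 5000)
		per_lane = mt_per_s * 8 / 10;		/* 8b/10b */
	else
		per_lane = mt_per_s * 128 / 130;	/* 128b/130b */

	return per_lane * lanes;
}

/* Smaller of the PCIe uplink and the nominal fabric rate, in Mb/s. */
static uint32_t effective_cap_mbps(uint32_t pcie_mbps, uint32_t fabric_mbps)
{
	return pcie_mbps < fabric_mbps ? pcie_mbps : fabric_mbps;
}
```

For the Light Ridge case, pcie_link_mbps(5000, 4) gives 16000 Mb/s, so a
single 10 GBit/s channel would be the tighter limit, whereas a port advertising
2.5GT/s x1 would be reported as only 2000 Mb/s.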

Thanks,

Lukas
Mika Westerberg Nov. 3, 2023, 5:48 a.m. UTC | #13
On Thu, Nov 02, 2023 at 11:22:05AM -0500, Mario Limonciello wrote:
> On 11/2/2023 10:33, Lukas Wunner wrote:
> > On Thu, Nov 02, 2023 at 10:26:31AM -0500, Mario Limonciello wrote:
> > > On 11/2/2023 10:21, Lukas Wunner wrote:
> > > > On Wed, Nov 01, 2023 at 08:14:31PM -0500, Mario Limonciello wrote:
> > > > > Considering this I think it's a good idea to move that creation of the
> > > > > device link into drivers/pci/pci-acpi.c and store a bit in struct
> > > > > > pci_dev to indicate it's a tunneled port.
> > > > > 
> > > > > Then 'thunderbolt' can look for this directly instead of walking all
> > > > > the FW nodes.
> > > > > 
> > > > > pcie_bandwidth_available() can just look at the tunneled port bit
> > > > > instead of the existence of the device link.
> > > > 
> > > > pci_is_thunderbolt_attached() should already be doing exactly what
> > > > you want to achieve with the new bit.  It tells you whether a PCI
> > > > device is behind a Thunderbolt tunnel.  So I don't think a new bit
> > > > is actually needed.
> > > 
> > > It's only for a device connected to an Intel TBT3 controller though; it
> > > won't apply to USB4.
> > 
> > Time to resurrect this patch here...? :)
> > 
> > https://lore.kernel.org/all/20220204182820.130339-3-mario.limonciello@amd.com/
> 
> That thought crossed my mind, but I don't think it's actually correct.
> That's the major reason I didn't resurrect that series.
> 
> The PCIe topology looks like this:
> 
> ├─PCIe tunneled root port
> |  └─PCIe bridge/switch (TBT3 or USB4 hub)
> |    └─PCIe device
> └─PCIe root port
>   └─USB 4 Router
> 
> In this topology the USB4 PCIe class device is going to be the USB4 router.
> This *isn't* a tunneled device.
> 
> The two problematic devices are going to be that PCIe bridge (TBT or USB4
> hub) and PCIe tunneled root port.
> Looking for the class is going to mark the wrong device: the "USB 4
> Router".
> 
> I looked through the USB4 spec again and I don't see any way that such a
> port can be distinguished.
> 
> I feel the correct way to identify it is via the relationship specified in
> ACPI.

Just to add here, for discrete (e.g. add-in) USB4 host controllers the
USB4 spec defines a DVSEC capability that can be used to figure out the
same information as the ACPI above, so at least we should make the code
work so that adding the DVSEC support later on is still possible with
minimal effort :)

> FWIW I also think that all the kernel users of
> pci_is_thunderbolt_attached() *should* be using dev_is_removable().

I tend to agree with this. There can be other "mediums" than
USB4/Thunderbolt that can carry PCIe packets.

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59c01d68c6d5..4a7dc9c2b8f4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6223,6 +6223,40 @@  int pcie_set_mps(struct pci_dev *dev, int mps)
 }
 EXPORT_SYMBOL(pcie_set_mps);
 
+/**
+ * pcie_is_tunneling_port - Check if a PCI device is used for TBT3/USB4 tunneling
+ * @pdev: PCI device to check
+ *
+ * Returns true if the device is used for PCIe tunneling, false otherwise.
+ */
+static bool
+pcie_is_tunneling_port(struct pci_dev *pdev)
+{
+	struct device_link *link;
+	struct pci_dev *supplier;
+
+	/* Intel TBT3 bridge */
+	if (pdev->is_thunderbolt)
+		return true;
+
+	if (!pci_is_pcie(pdev))
+		return false;
+
+	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
+		return false;
+
+	/* PCIe root port used for tunneling linked to USB4 router */
+	list_for_each_entry(link, &pdev->dev.links.suppliers, c_node) {
+		if (!dev_is_pci(link->supplier))
+			continue;
+		supplier = to_pci_dev(link->supplier);
+		if (supplier->class == PCI_CLASS_SERIAL_USB_USB4)
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * pcie_bandwidth_available - determine minimum link settings of a PCIe
  *			      device and its bandwidth limitation
@@ -6236,6 +6270,8 @@  EXPORT_SYMBOL(pcie_set_mps);
  * limiting_dev, speed, and width pointers are supplied) information about
  * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
  * raw bandwidth.
+ *
+ * This function excludes root ports and bridges used for USB4 and TBT3 tunneling.
  */
 u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 			     enum pci_bus_speed *speed,
@@ -6254,6 +6290,10 @@  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 	bw = 0;
 
 	while (dev) {
+		/* skip root ports and bridges used for tunneling */
+		if (pcie_is_tunneling_port(dev))
+			goto skip;
+
 		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
 
 		next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
@@ -6274,6 +6314,7 @@  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 				*width = next_width;
 		}
 
+skip:
 		dev = pci_upstream_bridge(dev);
 	}