[v3,00/10] arm64: dts: qcom: sc8280xp: PCIe fixes and GICv3 ITS enable

Message ID 20240305081105.11912-1-johan+linaro@kernel.org (mailing list archive)

Message

Johan Hovold March 5, 2024, 8:10 a.m. UTC
This series addresses a few problems with the sc8280xp PCIe
implementation.

The DWC PCIe controller can either use its internal MSI controller or an
external one such as the GICv3 ITS. Enabling the latter allows for
assigning affinity to individual interrupts, but results in a large
number of Correctable Errors being logged on both the Lenovo ThinkPad
X13s and the sc8280xp-crd reference design.

It turns out that these errors are always generated, but for some yet to
be determined reason, the AER interrupts are never received when using
the internal MSI controller, which makes the link errors harder to
notice.

On the X13s, there is a large number of errors generated when bringing
up the link on boot. This is related to the fact that UEFI firmware has
already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
link at Gen3 generates a massive number of errors until the Wi-Fi
firmware is restarted. This has now also been shown to cause the Wi-Fi
to sometimes not start at all on boot for some users.

A recent commit enabling ASPM on certain Qualcomm platforms introduced
further errors when using the Wi-Fi on the X13s as well as when
accessing the NVMe on the CRD. The exact reason for this has not yet
been identified, but disabling ASPM L0s makes the errors go away. This
could suggest that either the current ASPM implementation is incomplete
or that L0s is not supported with these devices.
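
For illustration, one way to keep L0s from being used is to stop
advertising it in the ASPM Support field of the PCIe Link Capabilities
register. The sketch below is a standalone userspace model of that bit
manipulation, not the code from this series: the helper names are
hypothetical, and only the register bit values are taken from
include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_LNKCAP_ASPMS	0x00000c00u	/* ASPM Support field */
#define PCI_EXP_LNKCAP_ASPM_L0S	0x00000400u	/* ASPM L0s Support */
#define PCI_EXP_LNKCAP_ASPM_L1	0x00000800u	/* ASPM L1 Support */

/* Hypothetical helper: does this Link Capabilities value advertise L0s? */
static bool lnkcap_supports_l0s(uint32_t lnkcap)
{
	return (lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) != 0;
}

/*
 * Hypothetical helper: return the Link Capabilities value with L0s
 * support masked out. All other bits, including L1 support, are left
 * untouched.
 */
static uint32_t lnkcap_clear_l0s(uint32_t lnkcap)
{
	uint32_t val = lnkcap & ~PCI_EXP_LNKCAP_ASPM_L0S;

	assert(!lnkcap_supports_l0s(val));

	return val;
}
```

With both L0s and L1 advertised (ASPM Support field 0xc00), the helper
leaves only L1 (0x800), so an OS that honours the capability bits would
only ever consider L1 for that link.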

Note that the X13s and CRD use the same Wi-Fi controller, but the errors
are only generated on the X13s. The NVMe controller on my X13s does not
support L0s so there are no issues there, unlike on the CRD which uses a
different controller. The modem on the CRD does not generate any errors,
but both the NVMe and modem keep bouncing in and out of L0s/L1 even
when not used, which could indicate that there are bigger problems with
the ASPM implementation. I don't have a modem on my X13s so I have not
been able to test whether L0s causes any trouble there.

Enabling AER error reporting on sc8280xp could similarly also reveal
existing problems with the related sa8295p and sa8540p platforms as they
share the base dtsi.

After discussing this with Bjorn Andersson at Qualcomm we have decided
to go ahead and disable L0s for all controllers on the CRD and the
X13s.

Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
significant impact on the power consumption (and there are indications
that this applies generally for L0s on these platforms).

***

As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
binding rework in linux-next so that the whole series can be merged for
6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
stable backport anyway).

The DT bindings and PCI patch are expected to go through the PCI tree,
while Bjorn A takes the devicetree updates through the Qualcomm tree.

Johan


Changes in v3
 - drop the two wifi link speed patches which have been picked up for
   6.8
 - rebase on binding rework in linux-next and add the properties also to
   the new qcom,pcie-common.yaml
   - https://lore.kernel.org/linux-pci/20240126-dt-bindings-pci-qcom-split-v3-0-f23cda4d74c0@linaro.org/
 - fix an 'L0s' typo in one commit message

Changes in v2
 - drop RFC from ASPM patches and add stable tags
 - reorder patches and move ITS patch last
 - fix s/GB/MB/ typo in Gen2 speed commit messages
 - fix an incorrect Fixes tag
 - amend commit message of X13s wifi link speed patch after user
   confirmation that this fixes the wifi startup issue
 - disable L0s also for modem and wifi on CRD
 - disable L0s also for nvme and modem on X13s


Johan Hovold (10):
  dt-bindings: PCI: qcom: Allow 'required-opps'
  dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
  dt-bindings: PCI: qcom: Allow 'aspm-no-l0s'
  PCI: qcom: Add support for disabling ASPM L0s in devicetree
  arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
  arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe
  arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for modem and Wi-Fi
  arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi
  arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for NVMe and modem
  arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

 .../bindings/pci/qcom,pcie-common.yaml        |  6 +++++-
 .../devicetree/bindings/pci/qcom,pcie.yaml    |  6 +++++-
 arch/arm64/boot/dts/qcom/sc8280xp-crd.dts     |  5 +++++
 .../qcom/sc8280xp-lenovo-thinkpad-x13s.dts    |  5 +++++
 arch/arm64/boot/dts/qcom/sc8280xp.dtsi        | 17 +++++++++++++++-
 drivers/pci/controller/dwc/pcie-qcom.c        | 20 +++++++++++++++++++
 6 files changed, 56 insertions(+), 3 deletions(-)

Comments

Manivannan Sadhasivam March 6, 2024, 6:33 a.m. UTC | #1
On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> This series addresses a few problems with the sc8280xp PCIe
> implementation.
> 
> The DWC PCIe controller can either use its internal MSI controller or an
> external one such as the GICv3 ITS. Enabling the latter allows for
> assigning affinity to individual interrupts, but results in a large
> amount of Correctable Errors being logged on both the Lenovo ThinkPad
> X13s and the sc8280xp-crd reference design.
> 
> It turns out that these errors are always generated, but for some yet to
> be determined reason, the AER interrupts are never received when using
> the internal MSI controller, which makes the link errors harder to
> notice.
> 
> On the X13s, there is a large number of errors generated when bringing
> up the link on boot. This is related to the fact that UEFI firmware has
> already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
> link at Gen3 generates a massive amount of errors until the Wi-Fi
> firmware is restarted. This has now also been shown to cause the Wi-Fi
> to sometimes not start at all on boot for some users.
> 
> A recent commit enabling ASPM on certain Qualcomm platforms introduced
> further errors when using the Wi-Fi on the X13s as well as when
> accessing the NVMe on the CRD. The exact reason for this has not yet
> been identified, but disabling ASPM L0s makes the errors go away. This
> could suggest that either the current ASPM implementation is incomplete
> or that L0s is not supported with these devices.
> 
> Note that the X13s and CRD use the same Wi-Fi controller, but the errors
> are only generated on the X13s. The NVMe controller on my X13s does not
> support L0s so there are no issues there, unlike on the CRD which uses a
> different controller. The modem on the CRD does not generate any errors,
> but both the NVMe and modem keeps bouncing in and out of L0s/L1 also
> when not used, which could indicate that there are bigger problems with
> the ASPM implementation. I don't have a modem on my X13s so I have not
> been able to test whether L0s causes any trouble there.
> 
> Enabling AER error reporting on sc8280xp could similarly also reveal
> existing problems with the related sa8295p and sa8540p platforms as they
> share the base dtsi.
> 
> After discussing this with Bjorn Andersson at Qualcomm we have decided
> to go ahead and disable L0s for all controllers on the CRD and the
> X13s.
> 

Just received confirmation from Qcom that L0s is not supported for any of the
PCIe instances in sc8280xp (and its derivatives). Please move the property to
the SoC dtsi.

- Mani

> Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
> significant impact on the power consumption (and there are indications
> that this applies generally for L0s on these platforms).
> 
> ***
> 
> As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> binding rework in linux-next so that the whole series can be merged for
> 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> stable backport anyway).
> 
> The DT bindings and PCI patch are expected to go through the PCI tree,
> while Bjorn A takes the devicetree updates through the Qualcomm tree.
> 
> Johan
> 
> 
> Changes in v3
>  - drop the two wifi link speed patches which have been picked up for
>    6.8
>  - rebase on binding rework in linux-next and add the properties also to
>    the new qcom,pcie-common.yaml
>    - https://lore.kernel.org/linux-pci/20240126-dt-bindings-pci-qcom-split-v3-0-f23cda4d74c0@linaro.org/
>  - fix an 'L0s' typo in one commit message
> 
> Changes in v2
>  - drop RFC from ASPM patches and add stable tags
>  - reorder patches and move ITS patch last
>  - fix s/GB/MB/ typo in Gen2 speed commit messages
>  - fix an incorrect Fixes tag
>  - amend commit message X13 wifi link speed patch after user
>    confirmation that this fixes the wifi startup issue
>  - disable L0s also for modem and wifi on CRD
>  - disable L0s also for nvme and modem on X13s
> 
> 
> Johan Hovold (10):
>   dt-bindings: PCI: qcom: Allow 'required-opps'
>   dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
>   dt-bindings: PCI: qcom: Allow 'aspm-no-l0s'
>   PCI: qcom: Add support for disabling ASPM L0s in devicetree
>   arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
>   arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe
>   arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for modem and Wi-Fi
>   arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi
>   arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for NVMe and modem
>   arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe
> 
>  .../bindings/pci/qcom,pcie-common.yaml        |  6 +++++-
>  .../devicetree/bindings/pci/qcom,pcie.yaml    |  6 +++++-
>  arch/arm64/boot/dts/qcom/sc8280xp-crd.dts     |  5 +++++
>  .../qcom/sc8280xp-lenovo-thinkpad-x13s.dts    |  5 +++++
>  arch/arm64/boot/dts/qcom/sc8280xp.dtsi        | 17 +++++++++++++++-
>  drivers/pci/controller/dwc/pcie-qcom.c        | 20 +++++++++++++++++++
>  6 files changed, 56 insertions(+), 3 deletions(-)
> 
> -- 
> 2.43.0
>
Johan Hovold March 6, 2024, 7:20 a.m. UTC | #2
On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > This series addresses a few problems with the sc8280xp PCIe
> > implementation.
> > 
> > The DWC PCIe controller can either use its internal MSI controller or an
> > external one such as the GICv3 ITS. Enabling the latter allows for
> > assigning affinity to individual interrupts, but results in a large
> > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > X13s and the sc8280xp-crd reference design.
> > 
> > It turns out that these errors are always generated, but for some yet to
> > be determined reason, the AER interrupts are never received when using
> > the internal MSI controller, which makes the link errors harder to
> > notice.

> > Enabling AER error reporting on sc8280xp could similarly also reveal
> > existing problems with the related sa8295p and sa8540p platforms as they
> > share the base dtsi.
> > 
> > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > to go ahead and disable L0s for all controllers on the CRD and the
> > X13s.
 
> Just received confirmation from Qcom that L0s is not supported for any of the
> PCIe instances in sc8280xp (and its derivatives). Please move the property to
> SoC dtsi.

Ok, thanks for confirming. But then the devicetree property is not the
right way to handle this, and we should disable L0s based on the
compatible string instead.
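
As a rough sketch of what matching on the compatible string could look
like: each compatible maps to a per-SoC configuration carrying a flag
that says whether L0s must be disabled. This is a standalone model, not
the actual driver change; the struct, table and function names are
hypothetical, and the compatible strings are assumed here purely for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC configuration; the field name is illustrative. */
struct pcie_cfg {
	bool no_l0s;	/* true if ASPM L0s must not be used */
};

struct pcie_match {
	const char *compatible;
	struct pcie_cfg cfg;
};

/*
 * Assumed compatible strings. Only sc8280xp and its derivatives are
 * discussed in this thread; the last entry is a made-up placeholder
 * for a platform where L0s is left enabled.
 */
static const struct pcie_match pcie_match_table[] = {
	{ "qcom,pcie-sc8280xp",	{ .no_l0s = true } },
	{ "qcom,pcie-sa8540p",	{ .no_l0s = true } },
	{ "vendor,pcie-other",	{ .no_l0s = false } },
};

/* Return the configuration for a compatible string, or NULL if unknown. */
static const struct pcie_cfg *pcie_lookup_cfg(const char *compatible)
{
	size_t n = sizeof(pcie_match_table) / sizeof(pcie_match_table[0]);
	size_t i;

	assert(compatible != NULL);

	for (i = 0; i < n; i++) {
		if (strcmp(pcie_match_table[i].compatible, compatible) == 0)
			return &pcie_match_table[i].cfg;
	}

	return NULL;
}
```

The point of this design is that nothing needs to be added to the
devicetree: the limitation follows from the SoC itself, so it can be
derived from the compatible that the node already has.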

> > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > binding rework in linux-next so that the whole series can be merged for
> > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > stable backport anyway).

I'll respin the series. Looks like we've already missed the chance to
enable ITS in 6.9 anyway.

Johan
Manivannan Sadhasivam March 6, 2024, 8:39 a.m. UTC | #3
On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > > This series addresses a few problems with the sc8280xp PCIe
> > > implementation.
> > > 
> > > The DWC PCIe controller can either use its internal MSI controller or an
> > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > assigning affinity to individual interrupts, but results in a large
> > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > X13s and the sc8280xp-crd reference design.
> > > 
> > > It turns out that these errors are always generated, but for some yet to
> > > be determined reason, the AER interrupts are never received when using
> > > the internal MSI controller, which makes the link errors harder to
> > > notice.
> 
> > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > existing problems with the related sa8295p and sa8540p platforms as they
> > > share the base dtsi.
> > > 
> > > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > > to go ahead and disable L0s for all controllers on the CRD and the
> > > X13s.
>  
> > Just received confirmation from Qcom that L0s is not supported for any of the
> > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > SoC dtsi.
> 
> Ok, thanks for confirming. But then the devicetree property is not the
> right way to handle this, and we should disable L0s based on the
> compatible string instead.
> 

Hmm. I checked further and got the info that there is no change in the IP, but
the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
there will be AERs when L0s is enabled on any controller instance. And there
will be no updated PHY sequence for this chipset in the future either.

So yeah, let's disable it in the driver instead.

> > > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > > binding rework in linux-next so that the whole series can be merged for
> > > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > > stable backport anyway).
> 
> I'll respin the series. Looks like we've already missed the chance to
> enable ITS in 6.9 anyway.
> 

Sounds good, thanks!

- Mani
Dmitry Baryshkov March 6, 2024, 8:48 a.m. UTC | #4
On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
<manivannan.sadhasivam@linaro.org> wrote:
>
> On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > > > This series addresses a few problems with the sc8280xp PCIe
> > > > implementation.
> > > >
> > > > The DWC PCIe controller can either use its internal MSI controller or an
> > > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > > assigning affinity to individual interrupts, but results in a large
> > > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > > X13s and the sc8280xp-crd reference design.
> > > >
> > > > It turns out that these errors are always generated, but for some yet to
> > > > be determined reason, the AER interrupts are never received when using
> > > > the internal MSI controller, which makes the link errors harder to
> > > > notice.
> >
> > > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > > existing problems with the related sa8295p and sa8540p platforms as they
> > > > share the base dtsi.
> > > >
> > > > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > > > to go ahead and disable L0s for all controllers on the CRD and the
> > > > X13s.
> >
> > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > SoC dtsi.
> >
> > Ok, thanks for confirming. But then the devicetree property is not the
> > right way to handle this, and we should disable L0s based on the
> > compatible string instead.
> >
>
> Hmm. I checked further and got the info that there is no change in the IP, but
> the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> there will be AERs when L0s is enabled on any controller instance. And there
> will be no updated PHY sequence in the future also for this chipset.

Why? If it is a bug in the PHY driver, it should be fixed there
instead of adding workarounds.

>
> So yeah, let's disable it in the driver instead.
>
> > > > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > > > binding rework in linux-next so that the whole series can be merged for
> > > > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > > > stable backport anyway).
> >
> > I'll respin the series. Looks like we've already missed the chance to
> > enable ITS in 6.9 anyway.
> >
>
> Sounds good, thanks!
>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்
>
Johan Hovold March 6, 2024, 9:12 a.m. UTC | #5
On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:

> > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > SoC dtsi.

> > > Ok, thanks for confirming. But then the devicetree property is not the
> > > right way to handle this, and we should disable L0s based on the
> > > compatible string instead.

> > Hmm. I checked further and got the info that there is no change in the IP, but
> > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > there will be AERs when L0s is enabled on any controller instance. And there
> > will be no updated PHY sequence in the future also for this chipset.
> 
> Why? If it is a bug in the PHY driver, it should be fixed there
> instead of adding workarounds.

ASPM L0s is currently broken on these platforms and, as far as I
understand, both under Windows and Linux. Since Qualcomm hasn't been
able to come up with the necessary PHY init sequences for these
platforms yet, I doubt they will suddenly appear in the near future.

So we need to disable L0s for now. If an updated PHY init sequence later
appears, we can always enable it again.

> > So yeah, let's disable it in the driver instead.

Johan
Dmitry Baryshkov March 6, 2024, 9:19 a.m. UTC | #6
On Wed, 6 Mar 2024 at 11:12, Johan Hovold <johan@kernel.org> wrote:
>
> On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> > On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> > <manivannan.sadhasivam@linaro.org> wrote:
> > > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
>
> > > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > > SoC dtsi.
>
> > > > Ok, thanks for confirming. But then the devicetree property is not the
> > > > right way to handle this, and we should disable L0s based on the
> > > > compatible string instead.
>
> > > Hmm. I checked further and got the info that there is no change in the IP, but
> > > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > > there will be AERs when L0s is enabled on any controller instance. And there
> > > will be no updated PHY sequence in the future also for this chipset.
> >
> > Why? If it is a bug in the PHY driver, it should be fixed there
> > instead of adding workarounds.
>
> ASPM L0s is currently broken on these platforms and, as far as I
> understand, both under Windows and Linux. Since Qualcomm hasn't been
> able to come up with the necessary PHY init sequences for these
> platforms yet, I doubt they will suddenly appear in the near future.

I see. Ok, I retract my comment.

>
> So we need to disable L0s for now. If an updated PHY init sequence later
> appears, we can always enable it again.
>
> > > So yeah, let's disable it in the driver instead.
>
> Johan
Manivannan Sadhasivam March 6, 2024, 9:38 a.m. UTC | #7
On Wed, Mar 06, 2024 at 10:12:31AM +0100, Johan Hovold wrote:
> On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> > On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> > <manivannan.sadhasivam@linaro.org> wrote:
> > > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> 
> > > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > > SoC dtsi.
> 
> > > > Ok, thanks for confirming. But then the devicetree property is not the
> > > > right way to handle this, and we should disable L0s based on the
> > > > compatible string instead.
> 
> > > Hmm. I checked further and got the info that there is no change in the IP, but
> > > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > > there will be AERs when L0s is enabled on any controller instance. And there
> > > will be no updated PHY sequence in the future also for this chipset.
> > 
> > Why? If it is a bug in the PHY driver, it should be fixed there
> > instead of adding workarounds.
> 
> ASPM L0s is currently broken on these platforms and, as far as I
> understand, both under Windows and Linux. Since Qualcomm hasn't been
> able to come up with the necessary PHY init sequences for these
> platforms yet, I doubt they will suddenly appear in the near future.
> 
> So we need to disable L0s for now. If an updated PHY init sequence later
> appears, we can always enable it again.
> 

It could be the same case for all 'non-mobile' chipsets (automotive, compute,
modem). So instead of using the compatible, please add a flag and set that for
all non-mobile SoCs, such as the ones starting with SAxxx, SCxxx, SDxxx.

- Mani
Manivannan Sadhasivam March 6, 2024, 9:45 a.m. UTC | #8
On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
> > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> > > > On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > > > > This series addresses a few problems with the sc8280xp PCIe
> > > > > implementation.
> > > > >
> > > > > The DWC PCIe controller can either use its internal MSI controller or an
> > > > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > > > assigning affinity to individual interrupts, but results in a large
> > > > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > > > X13s and the sc8280xp-crd reference design.
> > > > >
> > > > > It turns out that these errors are always generated, but for some yet to
> > > > > be determined reason, the AER interrupts are never received when using
> > > > > the internal MSI controller, which makes the link errors harder to
> > > > > notice.
> > >
> > > > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > > > existing problems with the related sa8295p and sa8540p platforms as they
> > > > > share the base dtsi.
> > > > >
> > > > > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > > > > to go ahead and disable L0s for all controllers on the CRD and the
> > > > > X13s.
> > >
> > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > SoC dtsi.
> > >
> > > Ok, thanks for confirming. But then the devicetree property is not the
> > > right way to handle this, and we should disable L0s based on the
> > > compatible string instead.
> > >
> >
> > Hmm. I checked further and got the info that there is no change in the IP, but
> > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > there will be AERs when L0s is enabled on any controller instance. And there
> > will be no updated PHY sequence in the future also for this chipset.
> 
> Why? If it is a bug in the PHY driver, it should be fixed there
> instead of adding workarounds.
> 

Fixing the L0s support requires the expertise of the PHY team and they will only
do so if there is real demand (like in the case of mobile chipsets). For
compute chipsets, they didn't do it because most of the NVMe devices out there
in the market only support L1 and L1ss.

So we have to live with this limitation for now.

- Mani

> >
> > So yeah, let's disable it in the driver instead.
> >
> > > > > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > > > > binding rework in linux-next so that the whole series can be merged for
> > > > > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > > > > stable backport anyway).
> > >
> > > I'll respin the series. Looks like we've already missed the chance to
> > > enable ITS in 6.9 anyway.
> > >
> >
> > Sounds good, thanks!
> >
> > - Mani
> >
> > --
> > மணிவண்ணன் சதாசிவம்
> >
> 
> 
> -- 
> With best wishes
> Dmitry
Johan Hovold March 6, 2024, 9:54 a.m. UTC | #9
On Wed, Mar 06, 2024 at 03:08:57PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Mar 06, 2024 at 10:12:31AM +0100, Johan Hovold wrote:
> > On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> > > On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> > > <manivannan.sadhasivam@linaro.org> wrote:
> > > > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:

> > > > > Ok, thanks for confirming. But then the devicetree property is not the
> > > > > right way to handle this, and we should disable L0s based on the
> > > > > compatible string instead.
> > 
> > > > Hmm. I checked further and got the info that there is no change in the IP, but
> > > > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > > > there will be AERs when L0s is enabled on any controller instance. And there
> > > > will be no updated PHY sequence in the future also for this chipset.
> > > 
> > > Why? If it is a bug in the PHY driver, it should be fixed there
> > > instead of adding workarounds.
> > 
> > ASPM L0s is currently broken on these platforms and, as far as I
> > understand, both under Windows and Linux. Since Qualcomm hasn't been
> > able to come up with the necessary PHY init sequences for these
> > platforms yet, I doubt they will suddenly appear in the near future.
> > 
> > So we need to disable L0s for now. If an updated PHY init sequence later
> > appears, we can always enable it again.
> 
> It could be the same case for all 'non-mobile' chipsets (automotive, compute,
> modem). So instead of using the compatible, please add a flag and set that for
> all non-mobile SoCs. Like the ones starting with SAxxx, SCxxx, SDxxx.

I've already updated the series and was just about to post it. Disabling
for further platforms would also require matching on the compatible
string and we can easily do that in a follow-up patch once we have some
confirmation that it is needed.

Johan