diff mbox series

[RFC] ASPM L1 link latencies

Message ID CAA85sZvJQge6ETwF1GkdvK1Mpwazh_cYJcmeZVAohmt0FjbMZg@mail.gmail.com (mailing list archive)
State Not Applicable, archived
Delegated to: Bjorn Helgaas
Headers show
Series [RFC] ASPM L1 link latencies | expand

Commit Message

Ian Kumlien July 25, 2020, 7:47 p.m. UTC
Hi,

A while ago I realised that I was having all kinds off issues with my
connection, ~933 mbit had become ~40 mbit

This only applied on links to the internet (via a linux fw running
NAT) however while debugging with the help of Alexander Duyck
we realised that ASPM could be the culprit (at least disabling ASPM on
the nic it self made things work just fine)...

So while trying to understand PCIe and such things, I found this:

The calculations of the max delay looked at "that node" + start latency * "hops"

But one hop might have a larger latency and break the acceptable delay...

So after a lot playing around with the code, i ended up with this, and
it seems to fix my problem and does
set two pcie bridges to ASPM Disabled that didn't happen before.

I do however have questions.... Shouldn't the change be applied to the endpoint?
Or should it be applied recursively along the path to the endpoint?

Also, the L0S checks are only done on the local links, is this correct?

Comments

Bjorn Helgaas July 29, 2020, 10:47 p.m. UTC | #1
On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote:
> Hi,
> 
> A while ago I realised that I was having all kinds off issues with my
> connection, ~933 mbit had become ~40 mbit
> 
> This only applied on links to the internet (via a linux fw running
> NAT) however while debugging with the help of Alexander Duyck
> we realised that ASPM could be the culprit (at least disabling ASPM on
> the nic it self made things work just fine)...
> 
> So while trying to understand PCIe and such things, I found this:
> 
> The calculations of the max delay looked at "that node" + start latency * "hops"
> 
> But one hop might have a larger latency and break the acceptable delay...
> 
> So after a lot playing around with the code, i ended up with this, and
> it seems to fix my problem and does
> set two pcie bridges to ASPM Disabled that didn't happen before.
> 
> I do however have questions.... Shouldn't the change be applied to
> the endpoint?  Or should it be applied recursively along the path to
> the endpoint?

I don't understand this very well, but I think we do need to consider
the latencies along the entire path.  PCIe r5.0, sec 5.4.1.3, contains
this:

  Power management software, using the latency information reported by
  all components in the Hierarchy, can enable the appropriate level of
  ASPM by comparing exit latency for each given path from Root to
  Endpoint against the acceptable latency that each corresponding
  Endpoint can withstand.

Also this:

  5.4.1.3.1 Software Flow for Enabling or Disabling ASPM

  Following is an example software algorithm that highlights how to
  enable or disable ASPM in a component.

  - PCI Express components power up with an appropriate value in their
    Slot Clock Configuration bit. The method by which they initialize
    this bit is device-specific.

  - PCI Express system software scans the Slot Clock Configuration bit
    in the components on both ends of each Link to determine if both
    are using the same reference clock source or reference clocks from
    separate sources.  If the Slot Clock Configuration bits in both
    devices are Set, they are both using the same reference clock
    source, otherwise they're not.

  - PCI Express software updates the Common Clock Configuration bits
    in the components on both ends of each Link to indicate if those
    devices share the same reference clock and triggers Link
    retraining by writing 1b to the Retrain Link bit in the Link
    Control register of the Upstream component.

  - Devices must reflect the appropriate L0s/L1 exit latency in their
    L0s /L1 Exit Latency fields, per the setting of the Common Clock
    Configuration bit.

  - PCI Express system software then reads and calculates the L0s/L1
    exit latency for each Endpoint based on the latencies reported by
    each Port. Refer to Section 5.4.1.2.2 for an example.

  - For each component with one or more Endpoint Functions, PCI
    Express system software examines the Endpoint L0s/L1 Acceptable
    Latency, as reported by each Endpoint Function in its Link
    Capabilities register, and enables or disables L0s/L1 entry (via
    the ASPM Control field in the Link Control register) accordingly
    in some or all of the intervening device Ports on that hierarchy.

> Also, the L0S checks are only done on the local links, is this
> correct?

ASPM configuration is done on both ends of a link.  I'm not sure it
makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends
of the link support it.  In particular, sec 5.4.1.3 says:

  Software must not enable L0s in either direction on a given Link
  unless components on both sides of the Link each support L0s;
  otherwise, the result is undefined.

But I think we do need to consider the entire path when enabling L0s;
from sec 7.5.3.3:

  Endpoint L0s Acceptable Latency - This field indicates the
  acceptable total latency that an Endpoint can withstand due to the
  transition from L0s state to the L0 state. It is essentially an
  indirect measure of the Endpoint’s internal buffering.  Power
  management software uses the reported L0s Acceptable Latency number
  to compare against the L0s exit latencies reported by all components
  comprising the data path from this Endpoint to the Root Complex Root
  Port to determine whether ASPM L0s entry can be used with no loss of
  performance.

Does any of that help answer your question?

Bjorn

> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index b17e5ffd31b1..bd53fba7f382 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -434,7 +434,7 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> 
>  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
>  {
> -       u32 latency, l1_switch_latency = 0;
> +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
>         struct aspm_latency *acceptable;
>         struct pcie_link_state *link;
> 
> @@ -470,8 +470,9 @@ static void pcie_aspm_check_latency(struct pci_dev
> *endpoint)
>                  * substate latencies (and hence do not do any check).
>                  */
>                 latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> +               l1_max_latency = max_t(u32, latency, l1_max_latency);
>                 if ((link->aspm_capable & ASPM_STATE_L1) &&
> -                   (latency + l1_switch_latency > acceptable->l1))
> +                   (l1_max_latency + l1_switch_latency > acceptable->l1))
>                         link->aspm_capable &= ~ASPM_STATE_L1;
>                 l1_switch_latency += 1000;
Ian Kumlien July 29, 2020, 11:02 p.m. UTC | #2
On Thu, Jul 30, 2020 at 12:47 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote:
> > Hi,
> >
> > A while ago I realised that I was having all kinds off issues with my
> > connection, ~933 mbit had become ~40 mbit
> >
> > This only applied on links to the internet (via a linux fw running
> > NAT) however while debugging with the help of Alexander Duyck
> > we realised that ASPM could be the culprit (at least disabling ASPM on
> > the nic it self made things work just fine)...
> >
> > So while trying to understand PCIe and such things, I found this:
> >
> > The calculations of the max delay looked at "that node" + start latency * "hops"
> >
> > But one hop might have a larger latency and break the acceptable delay...
> >
> > So after a lot playing around with the code, i ended up with this, and
> > it seems to fix my problem and does
> > set two pcie bridges to ASPM Disabled that didn't happen before.
> >
> > I do however have questions.... Shouldn't the change be applied to
> > the endpoint?  Or should it be applied recursively along the path to
> > the endpoint?
>
> I don't understand this very well, but I think we do need to consider
> the latencies along the entire path.  PCIe r5.0, sec 5.4.1.3, contains
> this:
>
>   Power management software, using the latency information reported by
>   all components in the Hierarchy, can enable the appropriate level of
>   ASPM by comparing exit latency for each given path from Root to
>   Endpoint against the acceptable latency that each corresponding
>   Endpoint can withstand.

One of the questions is this:
They say from root to endpoint while we walk from endpoint to root

So, is that more optimal in some way? or should latencies always be
considered from root to endpoint?
In that case, should the link ASPM be disabled somewhere else?
(I tried to disable them on the "endpoint" and it didn't help for some reason)

> Also this:

[--8<--]

>   - For each component with one or more Endpoint Functions, PCI
>     Express system software examines the Endpoint L0s/L1 Acceptable
>     Latency, as reported by each Endpoint Function in its Link
>     Capabilities register, and enables or disables L0s/L1 entry (via
>     the ASPM Control field in the Link Control register) accordingly
>     in some or all of the intervening device Ports on that hierarchy.

> > Also, the L0S checks are only done on the local links, is this
> > correct?
>
> ASPM configuration is done on both ends of a link.  I'm not sure it
> makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends
> of the link support it.  In particular, sec 5.4.1.3 says:
>
>   Software must not enable L0s in either direction on a given Link
>   unless components on both sides of the Link each support L0s;
>   otherwise, the result is undefined.
>
> But I think we do need to consider the entire path when enabling L0s;
> from sec 7.5.3.3:
>
>   Endpoint L0s Acceptable Latency - This field indicates the
>   acceptable total latency that an Endpoint can withstand due to the
>   transition from L0s state to the L0 state. It is essentially an
>   indirect measure of the Endpoint’s internal buffering.  Power
>   management software uses the reported L0s Acceptable Latency number
>   to compare against the L0s exit latencies reported by all components
>   comprising the data path from this Endpoint to the Root Complex Root
>   Port to determine whether ASPM L0s entry can be used with no loss of
>   performance.
>
> Does any of that help answer your question?

Yes! It's exactly what I wanted to know, :)

So now the question is should I group the fixes into one patch or
separate them for easier bisecting?

> Bjorn
>
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index b17e5ffd31b1..bd53fba7f382 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -434,7 +434,7 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> >
> >  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >  {
> > -       u32 latency, l1_switch_latency = 0;
> > +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
> >         struct aspm_latency *acceptable;
> >         struct pcie_link_state *link;
> >
> > @@ -470,8 +470,9 @@ static void pcie_aspm_check_latency(struct pci_dev
> > *endpoint)
> >                  * substate latencies (and hence do not do any check).
> >                  */
> >                 latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> > +               l1_max_latency = max_t(u32, latency, l1_max_latency);
> >                 if ((link->aspm_capable & ASPM_STATE_L1) &&
> > -                   (latency + l1_switch_latency > acceptable->l1))
> > +                   (l1_max_latency + l1_switch_latency > acceptable->l1))
> >                         link->aspm_capable &= ~ASPM_STATE_L1;
> >                 l1_switch_latency += 1000;
Ian Kumlien July 29, 2020, 11:12 p.m. UTC | #3
On Thu, Jul 30, 2020 at 1:02 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> On Thu, Jul 30, 2020 at 12:47 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote:
> > > Hi,
> > >
> > > A while ago I realised that I was having all kinds off issues with my
> > > connection, ~933 mbit had become ~40 mbit
> > >
> > > This only applied on links to the internet (via a linux fw running
> > > NAT) however while debugging with the help of Alexander Duyck
> > > we realised that ASPM could be the culprit (at least disabling ASPM on
> > > the nic it self made things work just fine)...
> > >
> > > So while trying to understand PCIe and such things, I found this:
> > >
> > > The calculations of the max delay looked at "that node" + start latency * "hops"
> > >
> > > But one hop might have a larger latency and break the acceptable delay...
> > >
> > > So after a lot playing around with the code, i ended up with this, and
> > > it seems to fix my problem and does
> > > set two pcie bridges to ASPM Disabled that didn't happen before.
> > >
> > > I do however have questions.... Shouldn't the change be applied to
> > > the endpoint?  Or should it be applied recursively along the path to
> > > the endpoint?
> >
> > I don't understand this very well, but I think we do need to consider
> > the latencies along the entire path.  PCIe r5.0, sec 5.4.1.3, contains
> > this:
> >
> >   Power management software, using the latency information reported by
> >   all components in the Hierarchy, can enable the appropriate level of
> >   ASPM by comparing exit latency for each given path from Root to
> >   Endpoint against the acceptable latency that each corresponding
> >   Endpoint can withstand.
>
> One of the questions is this:
> They say from root to endpoint while we walk from endpoint to root
>
> So, is that more optimal in some way? or should latencies always be
> considered from root to endpoint?
> In that case, should the link ASPM be disabled somewhere else?
> (I tried to disable them on the "endpoint" and it didn't help for some reason)
>
> > Also this:
>
> [--8<--]
>
> >   - For each component with one or more Endpoint Functions, PCI
> >     Express system software examines the Endpoint L0s/L1 Acceptable
> >     Latency, as reported by each Endpoint Function in its Link
> >     Capabilities register, and enables or disables L0s/L1 entry (via
> >     the ASPM Control field in the Link Control register) accordingly
> >     in some or all of the intervening device Ports on that hierarchy.
>
> > > Also, the L0S checks are only done on the local links, is this
> > > correct?
> >
> > ASPM configuration is done on both ends of a link.  I'm not sure it
> > makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends
> > of the link support it.  In particular, sec 5.4.1.3 says:
> >
> >   Software must not enable L0s in either direction on a given Link
> >   unless components on both sides of the Link each support L0s;
> >   otherwise, the result is undefined.
> >
> > But I think we do need to consider the entire path when enabling L0s;
> > from sec 7.5.3.3:
> >
> >   Endpoint L0s Acceptable Latency - This field indicates the
> >   acceptable total latency that an Endpoint can withstand due to the
> >   transition from L0s state to the L0 state. It is essentially an
> >   indirect measure of the Endpoint’s internal buffering.  Power
> >   management software uses the reported L0s Acceptable Latency number
> >   to compare against the L0s exit latencies reported by all components
> >   comprising the data path from this Endpoint to the Root Complex Root
> >   Port to determine whether ASPM L0s entry can be used with no loss of
> >   performance.
> >
> > Does any of that help answer your question?
>
> Yes! It's exactly what I wanted to know, :)
>
> So now the question is should I group the fixes into one patch or
> separate them for easier bisecting?

Actually this raises a few questions...

It does sound like this is sum(link->latency_up.l0s) +
sum(link->latency_dw.l0s) of the link vs acceptable->l0s

But, would that mean that we walk the link backwards? so it's both sides?

Currently they are separated - and they are not diaabled as a whole...

How should we handle the difference between up and down to keep the
finer grained control we have?

> > Bjorn
> >
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index b17e5ffd31b1..bd53fba7f382 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -434,7 +434,7 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> > >
> > >  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> > >  {
> > > -       u32 latency, l1_switch_latency = 0;
> > > +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
> > >         struct aspm_latency *acceptable;
> > >         struct pcie_link_state *link;
> > >
> > > @@ -470,8 +470,9 @@ static void pcie_aspm_check_latency(struct pci_dev
> > > *endpoint)
> > >                  * substate latencies (and hence do not do any check).
> > >                  */
> > >                 latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> > > +               l1_max_latency = max_t(u32, latency, l1_max_latency);
> > >                 if ((link->aspm_capable & ASPM_STATE_L1) &&
> > > -                   (latency + l1_switch_latency > acceptable->l1))
> > > +                   (l1_max_latency + l1_switch_latency > acceptable->l1))
> > >                         link->aspm_capable &= ~ASPM_STATE_L1;
> > >                 l1_switch_latency += 1000;
Ian Kumlien July 29, 2020, 11:26 p.m. UTC | #4
On Thu, Jul 30, 2020 at 1:12 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 1:02 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > On Thu, Jul 30, 2020 at 12:47 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote:
> > > > Hi,
> > > >
> > > > A while ago I realised that I was having all kinds off issues with my
> > > > connection, ~933 mbit had become ~40 mbit
> > > >
> > > > This only applied on links to the internet (via a linux fw running
> > > > NAT) however while debugging with the help of Alexander Duyck
> > > > we realised that ASPM could be the culprit (at least disabling ASPM on
> > > > the nic it self made things work just fine)...
> > > >
> > > > So while trying to understand PCIe and such things, I found this:
> > > >
> > > > The calculations of the max delay looked at "that node" + start latency * "hops"
> > > >
> > > > But one hop might have a larger latency and break the acceptable delay...
> > > >
> > > > So after a lot playing around with the code, i ended up with this, and
> > > > it seems to fix my problem and does
> > > > set two pcie bridges to ASPM Disabled that didn't happen before.
> > > >
> > > > I do however have questions.... Shouldn't the change be applied to
> > > > the endpoint?  Or should it be applied recursively along the path to
> > > > the endpoint?
> > >
> > > I don't understand this very well, but I think we do need to consider
> > > the latencies along the entire path.  PCIe r5.0, sec 5.4.1.3, contains
> > > this:
> > >
> > >   Power management software, using the latency information reported by
> > >   all components in the Hierarchy, can enable the appropriate level of
> > >   ASPM by comparing exit latency for each given path from Root to
> > >   Endpoint against the acceptable latency that each corresponding
> > >   Endpoint can withstand.
> >
> > One of the questions is this:
> > They say from root to endpoint while we walk from endpoint to root
> >
> > So, is that more optimal in some way? or should latencies always be
> > considered from root to endpoint?
> > In that case, should the link ASPM be disabled somewhere else?
> > (I tried to disable them on the "endpoint" and it didn't help for some reason)
> >
> > > Also this:
> >
> > [--8<--]
> >
> > >   - For each component with one or more Endpoint Functions, PCI
> > >     Express system software examines the Endpoint L0s/L1 Acceptable
> > >     Latency, as reported by each Endpoint Function in its Link
> > >     Capabilities register, and enables or disables L0s/L1 entry (via
> > >     the ASPM Control field in the Link Control register) accordingly
> > >     in some or all of the intervening device Ports on that hierarchy.
> >
> > > > Also, the L0S checks are only done on the local links, is this
> > > > correct?
> > >
> > > ASPM configuration is done on both ends of a link.  I'm not sure it
> > > makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends
> > > of the link support it.  In particular, sec 5.4.1.3 says:
> > >
> > >   Software must not enable L0s in either direction on a given Link
> > >   unless components on both sides of the Link each support L0s;
> > >   otherwise, the result is undefined.
> > >
> > > But I think we do need to consider the entire path when enabling L0s;
> > > from sec 7.5.3.3:
> > >
> > >   Endpoint L0s Acceptable Latency - This field indicates the
> > >   acceptable total latency that an Endpoint can withstand due to the
> > >   transition from L0s state to the L0 state. It is essentially an
> > >   indirect measure of the Endpoint’s internal buffering.  Power
> > >   management software uses the reported L0s Acceptable Latency number
> > >   to compare against the L0s exit latencies reported by all components
> > >   comprising the data path from this Endpoint to the Root Complex Root
> > >   Port to determine whether ASPM L0s entry can be used with no loss of
> > >   performance.
> > >
> > > Does any of that help answer your question?
> >
> > Yes! It's exactly what I wanted to know, :)
> >
> > So now the question is should I group the fixes into one patch or
> > separate them for easier bisecting?
>
> Actually this raises a few questions...
>
> It does sound like this is sum(link->latency_up.l0s) +
> sum(link->latency_dw.l0s) of the link vs acceptable->l0s
>
> But, would that mean that we walk the link backwards? so it's both sides?
>
> Currently they are separated - and they are not diaabled as a whole...
>
> How should we handle the difference between up and down to keep the
> finer grained control we have?

so, this:
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index bd53fba7f382..18b659e6d3e8 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,

 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
+       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+               l0s_max_latency = 0;
        struct aspm_latency *acceptable;
        struct pcie_link_state *link;

@@ -448,14 +449,18 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)

        while (link) {
                /* Check upstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-                   (link->latency_up.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
+               if (link->aspm_capable & ASPM_STATE_L0S_UP) {
+                       l0s_max_latency += link->latency_up.l0s;
+                       if (l0s_max_latency > acceptable->l0s)
+                               link->aspm_capable &= ~ASPM_STATE_L0S_UP;
+               }

                /* Check downstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-                   (link->latency_dw.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+               if (link->aspm_capable & ASPM_STATE_L0S_DW) {
+                       l0s_max_latency += link->latency_dw.l0s;
+                       if (l0s_max_latency > acceptable->l0s)
+                               link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+               }
                /*
                 * Check L1 latency.
                 * Every switch on the path to root complex need 1

--- vs ----

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index bd53fba7f382..74dee09e788b 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,

 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
+       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+               l0s_max_latency = 0;
        struct aspm_latency *acceptable;
        struct pcie_link_state *link;

@@ -448,14 +449,17 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)

        while (link) {
                /* Check upstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-                   (link->latency_up.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
+               if (link->aspm_capable & ASPM_STATE_L0S_UP)
+                       l0s_max_latency += link->latency_up.l0s;

                /* Check downstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-                   (link->latency_dw.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+               if (link->aspm_capable & ASPM_STATE_L0S_DW)
+                       l0s_max_latency += link->latency_dw.l0s;
+
+               /* If the latency exceeds, disable both */
+               if (l0s_max_latency > acceptable->l0s)
+                       link->aspm_capable &= ~ASPM_STATE_L0S;
+
                /*
                 * Check L1 latency.
                 * Every switch on the path to root complex need 1

----

Currently we don't make a difference between them and I don't quite
understand the upstream and downstream terminology since it's a serial
bus ;)
Ian Kumlien Aug. 3, 2020, 9:54 a.m. UTC | #5
On Mon, Aug 3, 2020 at 11:44 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 1:26 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> >
> > On Thu, Jul 30, 2020 at 1:12 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > >
> > > On Thu, Jul 30, 2020 at 1:02 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > > > On Thu, Jul 30, 2020 at 12:47 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote:
> > > > > > Hi,
> > > > > >
> > > > > > A while ago I realised that I was having all kinds off issues with my
> > > > > > connection, ~933 mbit had become ~40 mbit
> > > > > >
> > > > > > This only applied on links to the internet (via a linux fw running
> > > > > > NAT) however while debugging with the help of Alexander Duyck
> > > > > > we realised that ASPM could be the culprit (at least disabling ASPM on
> > > > > > the nic it self made things work just fine)...
> > > > > >
> > > > > > So while trying to understand PCIe and such things, I found this:
> > > > > >
> > > > > > The calculations of the max delay looked at "that node" + start latency * "hops"
> > > > > >
> > > > > > But one hop might have a larger latency and break the acceptable delay...
> > > > > >
> > > > > > So after a lot playing around with the code, i ended up with this, and
> > > > > > it seems to fix my problem and does
> > > > > > set two pcie bridges to ASPM Disabled that didn't happen before.
> > > > > >
> > > > > > I do however have questions.... Shouldn't the change be applied to
> > > > > > the endpoint?  Or should it be applied recursively along the path to
> > > > > > the endpoint?
> > > > >
> > > > > I don't understand this very well, but I think we do need to consider
> > > > > the latencies along the entire path.  PCIe r5.0, sec 5.4.1.3, contains
> > > > > this:
> > > > >
> > > > >   Power management software, using the latency information reported by
> > > > >   all components in the Hierarchy, can enable the appropriate level of
> > > > >   ASPM by comparing exit latency for each given path from Root to
> > > > >   Endpoint against the acceptable latency that each corresponding
> > > > >   Endpoint can withstand.
> > > >
> > > > One of the questions is this:
> > > > They say from root to endpoint while we walk from endpoint to root
> > > >
> > > > So, is that more optimal in some way? or should latencies always be
> > > > considered from root to endpoint?
> > > > In that case, should the link ASPM be disabled somewhere else?
> > > > (I tried to disable them on the "endpoint" and it didn't help for some reason)
> > > >
> > > > > Also this:
> > > >
> > > > [--8<--]
> > > >
> > > > >   - For each component with one or more Endpoint Functions, PCI
> > > > >     Express system software examines the Endpoint L0s/L1 Acceptable
> > > > >     Latency, as reported by each Endpoint Function in its Link
> > > > >     Capabilities register, and enables or disables L0s/L1 entry (via
> > > > >     the ASPM Control field in the Link Control register) accordingly
> > > > >     in some or all of the intervening device Ports on that hierarchy.
> > > >
> > > > > > Also, the L0S checks are only done on the local links, is this
> > > > > > correct?
> > > > >
> > > > > ASPM configuration is done on both ends of a link.  I'm not sure it
> > > > > makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends
> > > > > of the link support it.  In particular, sec 5.4.1.3 says:
> > > > >
> > > > >   Software must not enable L0s in either direction on a given Link
> > > > >   unless components on both sides of the Link each support L0s;
> > > > >   otherwise, the result is undefined.
> > > > >
> > > > > But I think we do need to consider the entire path when enabling L0s;
> > > > > from sec 7.5.3.3:
> > > > >
> > > > >   Endpoint L0s Acceptable Latency - This field indicates the
> > > > >   acceptable total latency that an Endpoint can withstand due to the
> > > > >   transition from L0s state to the L0 state. It is essentially an
> > > > >   indirect measure of the Endpoint’s internal buffering.  Power
> > > > >   management software uses the reported L0s Acceptable Latency number
> > > > >   to compare against the L0s exit latencies reported by all components
> > > > >   comprising the data path from this Endpoint to the Root Complex Root
> > > > >   Port to determine whether ASPM L0s entry can be used with no loss of
> > > > >   performance.
> > > > >
> > > > > Does any of that help answer your question?
> > > >
> > > > Yes! It's exactly what I wanted to know, :)
> > > >
> > > > So now the question is should I group the fixes into one patch or
> > > > separate them for easier bisecting?
> > >
> > > Actually this raises a few questions...
> > >
> > > It does sound like this is sum(link->latency_up.l0s) +
> > > sum(link->latency_dw.l0s) of the link vs acceptable->l0s
> > >
> > > But, would that mean that we walk the link backwards? so it's both sides?
> > >
> > > Currently they are separated - and they are not diaabled as a whole...
> > >
> > > How should we handle the difference between up and down to keep the
> > > finer grained control we have?
> >
> > so, this:
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index bd53fba7f382..18b659e6d3e8 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> >
> >  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >  {
> > -       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
> > +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
> > +               l0s_max_latency = 0;
> >         struct aspm_latency *acceptable;
> >         struct pcie_link_state *link;
> >
> > @@ -448,14 +449,18 @@ static void pcie_aspm_check_latency(struct
> > pci_dev *endpoint)
> >
> >         while (link) {
> >                 /* Check upstream direction L0s latency */
> > -               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
> > -                   (link->latency_up.l0s > acceptable->l0s))
> > -                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
> > +               if (link->aspm_capable & ASPM_STATE_L0S_UP) {
> > +                       l0s_max_latency += link->latency_up.l0s;
> > +                       if (l0s_max_latency > acceptable->l0s)
> > +                               link->aspm_capable &= ~ASPM_STATE_L0S_UP;
> > +               }
> >
> >                 /* Check downstream direction L0s latency */
> > -               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
> > -                   (link->latency_dw.l0s > acceptable->l0s))
> > -                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
> > +               if (link->aspm_capable & ASPM_STATE_L0S_DW) {
> > +                       l0s_max_latency += link->latency_dw.l0s;
> > +                       if (l0s_max_latency > acceptable->l0s)
> > +                               link->aspm_capable &= ~ASPM_STATE_L0S_DW;
> > +               }
> >                 /*
> >                  * Check L1 latency.
> >                  * Every switch on the path to root complex need 1
> >
> > --- vs ----
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index bd53fba7f382..74dee09e788b 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> >
> >  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >  {
> > -       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
> > +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
> > +               l0s_max_latency = 0;
> >         struct aspm_latency *acceptable;
> >         struct pcie_link_state *link;
> >
> > @@ -448,14 +449,17 @@ static void pcie_aspm_check_latency(struct
> > pci_dev *endpoint)
> >
> >         while (link) {
> >                 /* Check upstream direction L0s latency */
> > -               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
> > -                   (link->latency_up.l0s > acceptable->l0s))
> > -                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
> > +               if (link->aspm_capable & ASPM_STATE_L0S_UP)
> > +                       l0s_max_latency += link->latency_up.l0s;
> >
> >                 /* Check downstream direction L0s latency */
> > -               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
> > -                   (link->latency_dw.l0s > acceptable->l0s))
> > -                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
> > +               if (link->aspm_capable & ASPM_STATE_L0S_DW)
> > +                       l0s_max_latency += link->latency_dw.l0s;
> > +
> > +               /* If the latency exceeds, disable both */
> > +               if (l0s_max_latency > acceptable->l0s)
> > +                       link->aspm_capable &= ~ASPM_STATE_L0S;
> > +
> >                 /*
> >                  * Check L1 latency.
> >                  * Every switch on the path to root complex need 1
> >
> > ----
> >
> > Currently we don't make a difference between them and I don't quite
> > understand the upstream and downstream terminology since it's a serial
> > bus ;)
>
> So, played around a bit on my old macbook pro (since it's a intel
> system with different controllers)
>
> I did one baseline on 5.8, one with the l1 patch applied and one with
> a variant of L0s applied.
>
> Did a lspci at each boot, and here is the results:
> diff -u lspci-base.txt lspci-l1.txt
> --- lspci-base.txt 2020-08-03 11:20:05.328814478 +0200
> +++ lspci-l1.txt 2020-08-03 11:31:38.997572430 +0200
> @@ -200,14 +200,13 @@
>
>  00:1a.0 USB controller: Intel Corporation 7 Series/C216 Chipset
> Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
>   Subsystem: Intel Corporation 7 Series/C216 Chipset Family USB
> Enhanced Host Controller
> - Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx-
> + Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx-
>   Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
> - Latency: 0
>   Interrupt: pin A routed to IRQ 23
>   Region 0: Memory at a0a16c00 (32-bit, non-prefetchable) [size=1K]
>   Capabilities: [50] Power Management version 2
>   Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> - Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> + Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
>   Capabilities: [58] Debug port: BAR=1 offset=00a0
>   Capabilities: [98] PCI Advanced Features
>   AFCap: TP+ FLR+
> @@ -722,7 +721,7 @@
>   DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>   LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1, Exit Latency
> L0s unlimited, L1 unlimited
>   ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp-
> - LnkCtl: ASPM Disabled; Disabled- CommClk+
> + LnkCtl: ASPM L1 Enabled; Disabled- CommClk+
>   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (ok), Width x4 (ok)
>   TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> @@ -1088,7 +1087,7 @@
>   DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
>   LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1, Exit Latency
> L0s unlimited, L1 unlimited
>   ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
> - LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
> + LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
>   ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (ok), Width x4 (ok)
>   TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt
> ---
>
> And:
> --- lspci-l1.txt 2020-08-03 11:31:38.997572430 +0200
> +++ lspci-l0s.txt 2020-08-03 11:37:54.875843204 +0200
> @@ -274,7 +274,7 @@
>   DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>   LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency
> L0s <512ns, L1 <16us
>   ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
> - LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+
> + LnkCtl: ASPM L0s Enabled; RCB 64 bytes, Disabled- CommClk+
>   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (downgraded), Width x1 (ok)
>   TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
> @@ -455,7 +455,7 @@
>   DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
>   LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency
> L0s <2us, L1 <64us
>   ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
> - LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+
> + LnkCtl: ASPM L0s Enabled; RCB 64 bytes, Disabled- CommClk+
>   ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>   TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> @@ -517,7 +517,7 @@
>   DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
>   LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency
> L0s <2us, L1 <64us
>   ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
> - LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+
> + LnkCtl: ASPM L0s Enabled; RCB 64 bytes, Disabled- CommClk+
>   ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>   TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> @@ -721,7 +721,7 @@
>   DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>   LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1, Exit Latency
> L0s unlimited, L1 unlimited
>   ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp-
> - LnkCtl: ASPM L1 Enabled; Disabled- CommClk+
> + LnkCtl: ASPM Disabled; Disabled- CommClk+
>   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (ok), Width x4 (ok)
>   TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> @@ -1087,7 +1087,7 @@
>   DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
>   LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1, Exit Latency
> L0s unlimited, L1 unlimited
>   ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
> - LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
> + LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
>   ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>   LnkSta: Speed 2.5GT/s (ok), Width x4 (ok)
>   TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> ----
>
> The actual total path is now:
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index b17e5ffd31b1..e16c5712bef0 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
>
>  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
>  {
> -       u32 latency, l1_switch_latency = 0;
> +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
> +               l0s_max_latency = 0, l0s_up = 0, l0s_dw = 0;
>         struct aspm_latency *acceptable;
>         struct pcie_link_state *link;
>
> @@ -448,14 +449,19 @@ static void pcie_aspm_check_latency(struct
> pci_dev *endpoint)
>
>         while (link) {
>                 /* Check upstream direction L0s latency */
> -               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
> -                   (link->latency_up.l0s > acceptable->l0s))
> -                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
> +               if (link->aspm_capable & ASPM_STATE_L0S_UP)
> +                       l0s_up = link->latency_up.l0s;
>
>                 /* Check downstream direction L0s latency */
> -               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
> -                   (link->latency_dw.l0s > acceptable->l0s))
> -                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
> +               if (link->aspm_capable & ASPM_STATE_L0S_DW)
> +                       l0s_dw = link->latency_dw.l0s;
> +
> +               l0s_max_latency += max_t(u32, l0s_up, l0s_dw);
> +
> +               /* If the latency exceeds, disable both */
> +               if (l0s_max_latency > acceptable->l0s)
> +                       link->aspm_capable &= ~ASPM_STATE_L0S;
> +
>                 /*
>                  * Check L1 latency.
>                  * Every switch on the path to root complex need 1
> @@ -469,9 +475,9 @@ static void pcie_aspm_check_latency(struct pci_dev
> *endpoint)
>                  * L1 exit latencies advertised by a device include L1
>                  * substate latencies (and hence do not do any check).
>                  */
> -               latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> +               l1_max_latency = max_t(u32, latency, l1_max_latency);
>                 if ((link->aspm_capable & ASPM_STATE_L1) &&
> -                   (latency + l1_switch_latency > acceptable->l1))
> +                   (l1_max_latency + l1_switch_latency > acceptable->l1))
>                         link->aspm_capable &= ~ASPM_STATE_L1;
>                 l1_switch_latency += 1000;
> ---
>
> There is a lot that can be incorrect in this case, will try a
> different L0s path and see if it changes behaviour
>
> Info about the machine:
> lspci -t
> -[0000:00]-+-00.0
>            +-01.0-[01]--
>            +-01.1-[04-9a]----00.0-[05-6a]--+-00.0-[06]----00.0
>            |                               +-03.0-[07-37]--
>            |                               +-04.0-[38-68]--
>            |                               +-05.0-[69]--
>            |                               \-06.0-[6a]--
>            +-02.0
>            +-14.0
>            +-16.0
>            +-1a.0
>            +-1b.0
>            +-1c.0-[02]--+-00.0
>            |            \-00.1
>            +-1c.1-[03]----00.0
>            +-1d.0
>            +-1f.0
>            +-1f.2
>            \-1f.3

And sometimes your suspicions is confirmed...

diff -u lspci-l0s.txt lspci-l0s-fixed.txt
--- lspci-l0s.txt 2020-08-03 11:37:54.875843204 +0200
+++ lspci-l0s-fixed.txt 2020-08-03 11:52:01.522670482 +0200
@@ -105,10 +105,10 @@
  DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
  LnkCap: Port #3, Speed 8GT/s, Width x8, ASPM L0s L1, Exit Latency
L0s <256ns, L1 <8us
  ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
- LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
+ LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
  ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 5GT/s (downgraded), Width x4 (downgraded)
- TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt-
+ TrErr- Train+ SlotClk+ DLActive- BWMgmt+ ABWMgmt-
  SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
  Slot #2, PowerLimit 75.000W; Interlock- NoCompl+
  SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
@@ -654,7 +654,7 @@
  DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
  LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Exit Latency
L0s unlimited, L1 unlimited
  ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
- LnkCtl: ASPM Disabled; Disabled- CommClk+
+ LnkCtl: ASPM L1 Enabled; Disabled- CommClk+
  ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 5GT/s (ok), Width x4 (ok)
  TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

And patch is:
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index b17e5ffd31b1..d926929c6103 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,

 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-       u32 latency, l1_switch_latency = 0;
+       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+               l0s_max_latency = 0;
        struct aspm_latency *acceptable;
        struct pcie_link_state *link;

@@ -447,15 +448,23 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)
        acceptable = &link->acceptable[PCI_FUNC(endpoint->devfn)];

        while (link) {
-               /* Check upstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-                   (link->latency_up.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
-
-               /* Check downstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-                   (link->latency_dw.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+               if (link->aspm_capable & ASPM_STATE_L0S) {
+                       u32 l0s_up = 0, l0s_dw = 0;
+                       /* Check upstream direction L0s latency */
+                       if (link->aspm_capable & ASPM_STATE_L0S_UP)
+                               l0s_up = link->latency_up.l0s;
+
+                       /* Check downstream direction L0s latency */
+                       if (link->aspm_capable & ASPM_STATE_L0S_DW)
+                               l0s_dw = link->latency_dw.l0s;
+
+                       l0s_max_latency += max_t(u32, l0s_up, l0s_dw);
+
+                       /* If the latency exceeds, disable both */
+                       if (l0s_max_latency > acceptable->l0s)
+                               link->aspm_capable &= ~ASPM_STATE_L0S;
+               }
+
                /*
                 * Check L1 latency.
                 * Every switch on the path to root complex need 1
@@ -469,9 +478,9 @@ static void pcie_aspm_check_latency(struct pci_dev
*endpoint)
                 * L1 exit latencies advertised by a device include L1
                 * substate latencies (and hence do not do any check).
                 */
-               latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
+               l1_max_latency = max_t(u32, latency, l1_max_latency);
                if ((link->aspm_capable & ASPM_STATE_L1) &&
-                   (latency + l1_switch_latency > acceptable->l1))
+                   (l1_max_latency + l1_switch_latency > acceptable->l1))
                        link->aspm_capable &= ~ASPM_STATE_L1;
                l1_switch_latency += 1000;
----

Sorry about the noise
diff mbox series

Patch

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index b17e5ffd31b1..bd53fba7f382 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,7 @@  static void pcie_get_aspm_reg(struct pci_dev *pdev,

 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-       u32 latency, l1_switch_latency = 0;
+       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
        struct aspm_latency *acceptable;
        struct pcie_link_state *link;

@@ -470,8 +470,9 @@  static void pcie_aspm_check_latency(struct pci_dev
*endpoint)
                 * substate latencies (and hence do not do any check).
                 */
                latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
+               l1_max_latency = max_t(u32, latency, l1_max_latency);
                if ((link->aspm_capable & ASPM_STATE_L1) &&
-                   (latency + l1_switch_latency > acceptable->l1))
+                   (l1_max_latency + l1_switch_latency > acceptable->l1))
                        link->aspm_capable &= ~ASPM_STATE_L1;
                l1_switch_latency += 1000;