[v6,0/5] PCI: brcmstb: Configure appropriate HW CLKREQ# mode

Message ID 20230623144100.34196-1-james.quinlan@broadcom.com (mailing list archive)

Message

Jim Quinlan June 23, 2023, 2:40 p.m. UTC
v6 -- No code has been changed.
   -- Changed commit subject and comment in "PERST#" commit (Bjorn, Cyril)
   -- Changed sign-off and author email address for all commits.
      This was due to a change in Broadcom's upstreaming policy.

v5 -- Remove DT property "brcm,completion-timeout-us" from
      "DT bindings" commit.  Although this error may be reported
      as a completion timeout, its cause was traced to an
      internal bus timeout which may occur even when there is
      no PCIe access being processed.  We set a timeout of four
      seconds only if we are operating in "L1SS CLKREQ#" mode.
   -- Correct CEM 2.0 reference provided by HW engineer,
      s/3.2.5.2.5/3.2.5.2.2/ (Bjorn)
   -- Add newline to dev_info() string (Stefan)
   -- Change variable rval to unsigned (Stefan)
   -- s/implementaion/implementation/ (Bjorn)
   -- s/superpowersave/powersupersave/ (Bjorn)
   -- Slightly modify message on "PERST#" commit.
   -- Rebase to torvalds master

v4 -- New commit that asserts PERST# for 2711/RPi SOCs at PCIe RC
      driver probe() time.  This is done in Raspbian Linux and its
      absence may be the cause of a failing test case.
   -- New commit that removes stale comment.

v3 -- Rewrote commit msgs and comments referring to panics if L1SS
      is enabled/disabled; the code snippet that unadvertises L1SS
      eliminates the panic scenario. (Bjorn)
   -- Add reference for "400ns of CLKREQ# assertion" blurb (Bjorn)
   -- Put binding names in DT commit Subject (Bjorn)
   -- Add a verb to a commit's subject line (Bjorn)
   -- s/accomodat(\w+)/accommodat$1/g (Bjorn)

v2 -- Changed binding property 'brcm,completion-timeout-msec' to
      'brcm,completion-timeout-us'.  (StefanW for standard suffix).
   -- Warn when clamping timeout value, and include clamped
      region in message. Also add min and max in YAML. (StefanW)
   -- Qualify description of "brcm,completion-timeout-us" so that
      it refers to PCIe transactions. (StefanW)
   -- Remove mention of Linux specifics in binding description. (StefanW)
   -- s/clkreq#/CLKREQ#/g (Bjorn)
   -- Refactor completion-timeout-us code to compare max and min to
      value given by the property (as opposed to the computed value).

v1 -- The current driver assumes the downstream devices can
      provide CLKREQ# for ASPM.  These commits accommodate devices
      w/ or w/o CLKREQ# and also handle L1SS-capable devices.

   -- The Raspbian Linux folks have already been using a PCIe RC
      property "brcm,enable-l1ss".  These commits use the same
      property, in a backward-compatible manner, and the implementation
      adds more detail and also automatically identifies devices w/o
      a CLKREQ# signal, i.e. most devices plugged into an RPi CM4
      IO board.
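
As an illustration of the v2 note above about clamping (compare the value
given by the property against min and max, and warn with the allowed range
when clamping occurs), here is a minimal user-space C sketch.  It is not
taken from the patches: the function name clamp_timeout_us() and both limit
values are hypothetical, and the property itself was dropped again in v5.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limits, for illustration only; not the driver's values. */
#define TIMEOUT_US_MIN 1000u
#define TIMEOUT_US_MAX 4000000u

/*
 * Clamp the value read from the property itself (not a value computed
 * from it) and warn, naming the allowed range, whenever clamping occurs.
 */
static uint32_t clamp_timeout_us(uint32_t requested_us)
{
	uint32_t clamped_us = requested_us;

	if (clamped_us < TIMEOUT_US_MIN)
		clamped_us = TIMEOUT_US_MIN;
	else if (clamped_us > TIMEOUT_US_MAX)
		clamped_us = TIMEOUT_US_MAX;

	if (clamped_us != requested_us)
		fprintf(stderr,
			"timeout %u us clamped to %u us (allowed range %u..%u us)\n",
			(unsigned int)requested_us, (unsigned int)clamped_us,
			(unsigned int)TIMEOUT_US_MIN, (unsigned int)TIMEOUT_US_MAX);

	return clamped_us;
}
```

With these assumed limits, a request inside the range is returned unchanged
and silently, while a request outside it is pulled to the nearest limit and
reported once on stderr.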


Jim Quinlan (5):
  dt-bindings: PCI: brcmstb: Add brcm,enable-l1ss property
  PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream
    device
  PCI: brcmstb: Set higher value for internal bus timeout
  PCI: brcmstb: Assert PERST# on BCM2711
  PCI: brcmstb: Remove stale comment

 .../bindings/pci/brcm,stb-pcie.yaml           |  9 ++
 drivers/pci/controller/pcie-brcmstb.c         | 91 ++++++++++++++++---
 2 files changed, 89 insertions(+), 11 deletions(-)


base-commit: 8a28a0b6f1a1dcbf5a834600a9acfbe2ba51e5eb

Comments

Cyril Brulebois June 29, 2023, 1:59 a.m. UTC | #1
Hi Jim,

Jim Quinlan <james.quinlan@broadcom.com> (2023-06-23):
> v6 -- No code has been changed.
>    -- Changed commit subject and comment in "PERST#" commit (Bjorn, Cyril)
>    -- Changed sign-off and author email address for all commits.
>       This was due to a change in Broadcom's upstreaming policy.

I've just run some more tests to be on the safe side, and I can confirm
everything is still looking good with the updated series and the updated
base commit.

Test setup:
-----------

 - using a $CM with the 20230111 EEPROM
 - on the same CM4 IO Board
 - with a $PCIE board (PCIe to multiple USB ports)
 - and the same Samsung USB flash drive.

where $CM is one of:

 - CM4 Lite Rev 1.0
 - CM4 8/32 Rev 1.0
 - CM4 4/32 Rev 1.1

and $PCIE is one of:

 - SupaHub PCE6U1C-R02, VER 006
 - SupaHub PCE6U1C-R02, VER 006S


Results:
--------

 1. With an unpatched kernel, I'm getting the dreaded SError for all
    $CM/$PCIE combinations. That's reproducible with:
     - the 6.1.y kernel shipped in Debian 12;
     - a locally-built v6.4-rc7-194-g8a28a0b6f1a1d kernel.

 2. With a patched kernel (v6.4-rc7-194-g8a28a0b6f1a1d + this series),
    for all $CM/$PCIE combinations, I'm getting a system that boots,
    sees the flash drive, and gives decent read performance on the USB
    flash drive (200+ MB/s on the CM4 Lite, 220+ MB/s on the non-Lite
    versions).


In passing, since that looks like it could be merged finally: I suppose
it's fair to say this series adds support for hardware that wasn't
working before, which means it's not a candidate for inclusion via
stable@ (even if it gets rid of a nasty failure to boot depending on
what hardware is plugged in at that time)?

In other words, downstream distributions should be expected to either
adjust their build systems to pick some future Linux release or consider
backporting this series on their own, to each base Linux version they
support?


Thanks again for all the help figuring this out.


Cheers,
Lorenzo Pieralisi Aug. 21, 2023, 8:34 a.m. UTC | #2
On Fri, Jun 23, 2023 at 10:40:53AM -0400, Jim Quinlan wrote:
> v6 -- No code has been changed.
>    -- Changed commit subject and comment in "PERST#" commit (Bjorn, Cyril)
>    -- Changed sign-off and author email address for all commits.
>       This was due to a change in Broadcom's upstreaming policy.
> 
> [...]
> 
> 
> Jim Quinlan (5):
>   dt-bindings: PCI: brcmstb: Add brcm,enable-l1ss property
>   PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream
>     device

I am not merging the first two patches since the discussion thread
is still open and I'd like to understand better what can/should be
done, sorry.

>   PCI: brcmstb: Set higher value for internal bus timeout
>   PCI: brcmstb: Assert PERST# on BCM2711
>   PCI: brcmstb: Remove stale comment

Is it OK to apply these three on their own? Overall it would be
great to avoid mixing patches with different end goals in a single
series.

Thanks,
Lorenzo

Jim Quinlan Aug. 21, 2023, 12:15 p.m. UTC | #3
On Mon, Aug 21, 2023 at 4:35 AM Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
>
> On Fri, Jun 23, 2023 at 10:40:53AM -0400, Jim Quinlan wrote:
> > [...]
> >
> >
> > Jim Quinlan (5):
> >   dt-bindings: PCI: brcmstb: Add brcm,enable-l1ss property
> >   PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream
> >     device
>
> I am not merging the first two patches since the discussion thread
> is still open and I'd like to understand better what can/should be
> done, sorry.

Hello Lorenzo,

This patch set has been stable for months: v5 went out in early May, and
the v6 changes did not involve code.  I'm a little surprised that you are
voicing concern at this stage.

The previous discussions covered all aspects of these commits AFAICT.
Please review them and the commit messages, and let me know which issues
you do not understand or which topics were not considered.

Are you concerned about the Broadcom STB/CM community or the RPi community?
For the former, I have direct communication w/ our customers and none of
them are even close to using upstream (they may backport my commits).  For
the latter, I have tested these commits on the official RPi4 and CM4 IO
platforms, and Cyril has also put in an admirable amount of testing.

Note that I have on my desk a CM4 IO board w/ a conventional PCIe device,
and it does not boot upstream master Linux until these patches are applied.

Further, Raspbian OS has already introduced the "brcm,enable-l1ss" property
but did not upstream it, and my commits are backwards compatible with this.

>
> >   PCI: brcmstb: Set higher value for internal bus timeout
> >   PCI: brcmstb: Assert PERST# on BCM2711
> >   PCI: brcmstb: Remove stale comment
>
> Is it OK to apply these three on their own? Overall it would be
> great to avoid mixing patches with different end goals in a single
> series.

Well, they are related for one customer who wants to use L1SS power
savings AND requires a long period for the internal timeout.  But, yes,
these commits are fine to apply independently.

Regards,
Jim Quinlan
Broadcom STB

Lorenzo Pieralisi Aug. 21, 2023, 2:42 p.m. UTC | #4
On Mon, Aug 21, 2023 at 08:15:02AM -0400, Jim Quinlan wrote:
> On Mon, Aug 21, 2023 at 4:35 AM Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
> >
> > On Fri, Jun 23, 2023 at 10:40:53AM -0400, Jim Quinlan wrote:
> > > [...]
> > >
> > >
> > > Jim Quinlan (5):
> > >   dt-bindings: PCI: brcmstb: Add brcm,enable-l1ss property
> > >   PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream
> > >     device
> >
> > I am not merging the first two patches since the discussion thread
> > is still open and I'd like to understand better what can/should be
> > done, sorry.
> 
> Hello Lorenzo,
> 
> This patch set has been stable for months: v5 went out in early May, and
> the v6 changes did not involve code.  I'm a little surprised that you are
> voicing concern at this stage.
> 
> The previous discussions covered all aspects of these commits AFAICT.
> Please review them and the commit messages, and let me know which issues
> you do not understand or which topics were not considered.

I disagree with the reasoning behind "brcm,enable-l1ss" property usage
instead of a command line option - at least I would like to get a
comment from DT maintainers about it.

I think Bjorn made the point consistently and I also think he is right.

I would like to get Rob's opinion on this. I know he acked the DT
bindings (I have a comment on those too) but regardless, it is clearly a
property used for what is a command line configuration parameter,
no two ways about it.

Thanks,
Lorenzo

Lorenzo Pieralisi Aug. 24, 2023, 3:36 p.m. UTC | #5
On Fri, 23 Jun 2023 10:40:53 -0400, Jim Quinlan wrote:
> v6 -- No code has been changed.
>    -- Changed commit subject and comment in "PERST#" commit (Bjorn, Cyril)
>    -- Changed sign-off and author email address for all commits.
>       This was due to a change in Broadcom's upstreaming policy.
> 
> v5 -- Remove DT property "brcm,completion-timeout-us" from
>       "DT bindings" commit.  Although this error may be reported
>       as a completion timeout, its cause was traced to an
>       internal bus timeout which may occur even when there is
>       no PCIe access being processed.  We set a timeout of four
>       seconds only if we are operating in "L1SS CLKREQ#" mode.
>    -- Correct CEM 2.0 reference provided by HW engineer,
>       s/3.2.5.2.5/3.2.5.2.2/ (Bjorn)
>    -- Add newline to dev_info() string (Stefan)
>    -- Change variable rval to unsigned (Stefan)
>    -- s/implementaion/implementation/ (Bjorn)
>    -- s/superpowersave/powersupersave/ (Bjorn)
>    -- Slightly modify message on "PERST#" commit.
>    -- Rebase to torvalds master
> 
> [...]

Applied to controller/brcmstb, thanks!

[4/5] PCI: brcmstb: Assert PERST# on BCM2711
      https://git.kernel.org/pci/pci/c/8eb8c2735306
[5/5] PCI: brcmstb: Remove stale comment
      https://git.kernel.org/pci/pci/c/6dac1507a654

Thanks,
Lorenzo