diff mbox series

arm64: dts: rockchip: fix rk3399 pcie speed

Message ID 20200423150510.6216-1-pgwipeout@gmail.com (mailing list archive)
State New, archived
Headers show
Series arm64: dts: rockchip: fix rk3399 pcie speed | expand

Commit Message

Peter Geis April 23, 2020, 3:05 p.m. UTC
The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
The device-tree incorrectly limits us to gen 1.

Correctly set the maximum link speed to <2>.

Tested on the rockpro64.

Signed-off-by: Peter Geis <pgwipeout@gmail.com>
---
 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Peter Geis April 23, 2020, 3:09 p.m. UTC | #1
On Thu, Apr 23, 2020 at 11:05 AM Peter Geis <pgwipeout@gmail.com> wrote:
>
> The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
> The device-tree incorrectly limits us to gen 1.
>
> Correctly set the maximum link speed to <2>.
>
> Tested on the rockpro64.

Note, this was tested on the rockpro64 after I performed the hardware
fixes as delineated at
https://forum.pine64.org/showthread.php?tid=8374

We probably will have to drop this back to <1> on board specific dts
files if issues are discovered.

>
> Signed-off-by: Peter Geis <pgwipeout@gmail.com>
> ---
>  arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> index 74f2c3d49095..e9efd330810b 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> @@ -248,7 +248,7 @@
>                                 <0 0 0 3 &pcie0_intc 2>,
>                                 <0 0 0 4 &pcie0_intc 3>;
>                 linux,pci-domain = <0>;
> -               max-link-speed = <1>;
> +               max-link-speed = <2>;
>                 msi-map = <0x0 &its 0x0 0x1000>;
>                 phys = <&pcie_phy 0>, <&pcie_phy 1>,
>                        <&pcie_phy 2>, <&pcie_phy 3>;
> --
> 2.20.1
>
Robin Murphy April 23, 2020, 3:49 p.m. UTC | #2
On 2020-04-23 4:09 pm, Peter Geis wrote:
> On Thu, Apr 23, 2020 at 11:05 AM Peter Geis <pgwipeout@gmail.com> wrote:
>>
>> The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
>> The device-tree incorrectly limits us to gen 1.
>>
>> Correctly set the maximum link speed to <2>.
>>
>> Tested on the rockpro64.
> 
> Note, this was tested on the rockpro64 after I performed the hardware
> fixes as delineated at
> https://forum.pine64.org/showthread.php?tid=8374
> 
> We probably will have to drop this back to <1> on board specific dts
> files if issues are discovered.

I'd say commit 712fa1777207 and the fact that the current rev 1.8 
datasheet only mentions 2.5GT/s rather weaken that argument. It would 
seem safer to leave the default as-is, and only override it for boards 
where Gen2 really is proven to work reliably. Which, er, is already the 
case ;)

That said, "proven to work reliably" is itself a bit doubtful - my 
NanoPC-T4 has always been rock-solid at Gen2 with a Samsung Evo 960 
NVMe, yet I've seen plenty of reports of other NVMe models being 
unusable with mainline due to failing link training ~90% of the time. 
It's a grey area for sure.

Robin.

>>
>> Signed-off-by: Peter Geis <pgwipeout@gmail.com>
>> ---
>>   arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> index 74f2c3d49095..e9efd330810b 100644
>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> @@ -248,7 +248,7 @@
>>                                  <0 0 0 3 &pcie0_intc 2>,
>>                                  <0 0 0 4 &pcie0_intc 3>;
>>                  linux,pci-domain = <0>;
>> -               max-link-speed = <1>;
>> +               max-link-speed = <2>;
>>                  msi-map = <0x0 &its 0x0 0x1000>;
>>                  phys = <&pcie_phy 0>, <&pcie_phy 1>,
>>                         <&pcie_phy 2>, <&pcie_phy 3>;
>> --
>> 2.20.1
>>
Peter Geis April 23, 2020, 4:13 p.m. UTC | #3
On Thu, Apr 23, 2020 at 11:49 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-04-23 4:09 pm, Peter Geis wrote:
> > On Thu, Apr 23, 2020 at 11:05 AM Peter Geis <pgwipeout@gmail.com> wrote:
> >>
> >> The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
> >> The device-tree incorrectly limits us to gen 1.
> >>
> >> Correctly set the maximum link speed to <2>.
> >>
> >> Tested on the rockpro64.
> >
> > Note, this was tested on the rockpro64 after I performed the hardware
> > fixes as delineated at
> > https://forum.pine64.org/showthread.php?tid=8374
> >
> > We probably will have to drop this back to <1> on board specific dts
> > files if issues are discovered.
>
> I'd say commit 712fa1777207 and the fact that the current rev 1.8
> datasheet only mentions 2.5GT/s rather weaken that argument. It would
> seem safer to leave the default as-is, and only override it for boards
> where Gen2 really is proven to work reliably. Which, er, is already the
> case ;)

Do we have a copy of this errata?
I can't seem to find it.
The write up in that commit is extremely vague.

As the tegra mailing list often points out, the device-tree describes
the hardware as it is.
As:
The rk3399 itself supports PCIe gen 2.
The board specific implementations determine if we need to limit that to gen 1.
The rk3399 should be set to 2, and any board that requires that to be
redefined should do that via an override in their device-tree.
This is similar to the gmac overrides for timing.

Do we have a list of the boards that require pulling back down to gen 1?

>
> That said, "proven to work reliably" is itself a bit doubtful - my
> NanoPC-T4 has always been rock-solid at Gen2 with a Samsung Evo 960
> NVMe, yet I've seen plenty of reports of other NVMe models being
> unusable with mainline due to failing link training ~90% of the time.
> It's a grey area for sure.
>
> Robin.
>
> >>
> >> Signed-off-by: Peter Geis <pgwipeout@gmail.com>
> >> ---
> >>   arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> >> index 74f2c3d49095..e9efd330810b 100644
> >> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> >> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> >> @@ -248,7 +248,7 @@
> >>                                  <0 0 0 3 &pcie0_intc 2>,
> >>                                  <0 0 0 4 &pcie0_intc 3>;
> >>                  linux,pci-domain = <0>;
> >> -               max-link-speed = <1>;
> >> +               max-link-speed = <2>;
> >>                  msi-map = <0x0 &its 0x0 0x1000>;
> >>                  phys = <&pcie_phy 0>, <&pcie_phy 1>,
> >>                         <&pcie_phy 2>, <&pcie_phy 3>;
> >> --
> >> 2.20.1
> >>
Rob Herring April 23, 2020, 4:26 p.m. UTC | #4
On Thu, Apr 23, 2020 at 11:13 AM Peter Geis <pgwipeout@gmail.com> wrote:
>
> On Thu, Apr 23, 2020 at 11:49 AM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> > On 2020-04-23 4:09 pm, Peter Geis wrote:
> > > On Thu, Apr 23, 2020 at 11:05 AM Peter Geis <pgwipeout@gmail.com> wrote:
> > >>
> > >> The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
> > >> The device-tree incorrectly limits us to gen 1.
> > >>
> > >> Correctly set the maximum link speed to <2>.
> > >>
> > >> Tested on the rockpro64.
> > >
> > > Note, this was tested on the rockpro64 after I performed the hardware
> > > fixes as delineated at
> > > https://forum.pine64.org/showthread.php?tid=8374

Are you going to fix everyone's board?

> > >
> > > We probably will have to drop this back to <1> on board specific dts
> > > files if issues are discovered.
> >
> > I'd say commit 712fa1777207 and the fact that the current rev 1.8
> > datasheet only mentions 2.5GT/s rather weaken that argument. It would
> > seem safer to leave the default as-is, and only override it for boards
> > where Gen2 really is proven to work reliably. Which, er, is already the
> > case ;)
>
> Do we have a copy of this errata?
> I can't seem to find it.
> The write up in that commit is extremely vague.
>
> As the tegra mailing list often points out, the device-tree describes
> the hardware as it is.

I think that's DT describes the h/w not settings for the Linux kernel
which is different from what's discussed here.

> As:
> The rk3399 itself supports PCIe gen 2.
> The board specific implementations determine if we need to limit that to gen 1.
> The rk3399 should be set to 2, and any board that requires that to be
> redefined should do that via an override in their device-tree.
> This is similar to the gmac overrides for timing.
>
> Do we have a list of the boards that require pulling back down to gen 1?
>
> >
> > That said, "proven to work reliably" is itself a bit doubtful - my
> > NanoPC-T4 has always been rock-solid at Gen2 with a Samsung Evo 960
> > NVMe, yet I've seen plenty of reports of other NVMe models being
> > unusable with mainline due to failing link training ~90% of the time.
> > It's a grey area for sure.

That seems pretty clear to me as to what the default should be. What's
most likely to work OOTB for users.

Rob
Robin Murphy April 23, 2020, 5:40 p.m. UTC | #5
On 2020-04-23 5:26 pm, Rob Herring wrote:
> On Thu, Apr 23, 2020 at 11:13 AM Peter Geis <pgwipeout@gmail.com> wrote:
>>
>> On Thu, Apr 23, 2020 at 11:49 AM Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>>> On 2020-04-23 4:09 pm, Peter Geis wrote:
>>>> On Thu, Apr 23, 2020 at 11:05 AM Peter Geis <pgwipeout@gmail.com> wrote:
>>>>>
>>>>> The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
>>>>> The device-tree incorrectly limits us to gen 1.
>>>>>
>>>>> Correctly set the maximum link speed to <2>.
>>>>>
>>>>> Tested on the rockpro64.
>>>>
>>>> Note, this was tested on the rockpro64 after I performed the hardware
>>>> fixes as delineated at
>>>> https://forum.pine64.org/showthread.php?tid=8374
> 
> Are you going to fix everyone's board?
> 
>>>>
>>>> We probably will have to drop this back to <1> on board specific dts
>>>> files if issues are discovered.
>>>
>>> I'd say commit 712fa1777207 and the fact that the current rev 1.8
>>> datasheet only mentions 2.5GT/s rather weaken that argument. It would
>>> seem safer to leave the default as-is, and only override it for boards
>>> where Gen2 really is proven to work reliably. Which, er, is already the
>>> case ;)
>>
>> Do we have a copy of this errata?
>> I can't seem to find it.
>> The write up in that commit is extremely vague.

I'm not a Rockchip customer, so I don't have access to any more RK3399 
documentation than you do. However, I have been involved in plenty of 
errata discussions, writeups, and workarounds for Arm Ltd. IP, so based 
on general experience if I see a patch like that coming directly from 
the silicon vendor, I'm inclined to trust it at face value.

>> As the tegra mailing list often points out, the device-tree describes
>> the hardware as it is.
> 
> I think that's DT describes the h/w not settings for the Linux kernel
> which is different from what's discussed here.

Seconded - this is not a matter of software policy, it's still a 
property of the hardware regardless of what exact route it takes into 
the final DTB. If anyone wants to get into the philosophical argument of 
how accurately a SoC dtsi should describe the SoC in isolation, then I'd 
push for the correct default speed actually being 0GT/s, since if you 
don't consider the board then the link layer isn't even powered ;)

>> As:
>> The rk3399 itself supports PCIe gen 2.

That's what's in question here - clearly RK3399 *was designed* to 
support Gen2, but Rockchip have since decided that they will only 
officially support using it at Gen1 speeds.

>> The board specific implementations determine if we need to limit that to gen 1.
>> The rk3399 should be set to 2, and any board that requires that to be
>> redefined should do that via an override in their device-tree.
>> This is similar to the gmac overrides for timing.

Note that the GMAC delay settings are not "overrides", they are 
board-specific parameters based the physical trace lengths between the 
SoC and the external phy.

AFAICS this would be far more like putting the 2GHz/1.5GHz OPPs into the 
base dtsi - on the basis that that was also an original design target[1] 
and does work on some RK3399s - then expecting board authors to remember 
to downgrade them to the 1.8GHz/1.4GHz that ended up being the 
officially-supported maximum for all the chips that weren't lucky enough 
to be special "OP1"s.

>> Do we have a list of the boards that require pulling back down to gen 1?

Any that are likely to suffer wobbly signal integrity by exporting the 
PCIe signals on random pin headers instead of proper connectors would 
probably be a fair starting point, but most of that list would consist 
of any number of current and future boards that are not yet supported 
upstream, and would potentially not be supported upstream for even 
longer if some poor engineer wastes time chasing random PCI errors 
because someone set the default to an unsupported value.

Robin.

[1] 
https://www.cnx-software.com/2016/02/22/more-details-about-rockchip-rk3399-cortex-a72-soc-4k-h-264h-265vp9-usb-3-0-pcie-and-displayport/

>>> That said, "proven to work reliably" is itself a bit doubtful - my
>>> NanoPC-T4 has always been rock-solid at Gen2 with a Samsung Evo 960
>>> NVMe, yet I've seen plenty of reports of other NVMe models being
>>> unusable with mainline due to failing link training ~90% of the time.
>>> It's a grey area for sure.
> 
> That seems pretty clear to me as to what the default should be. What's
> most likely to work OOTB for users.
> 
> Rob
>
Peter Geis April 23, 2020, 6:16 p.m. UTC | #6
On Thu, Apr 23, 2020 at 1:40 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-04-23 5:26 pm, Rob Herring wrote:
> > On Thu, Apr 23, 2020 at 11:13 AM Peter Geis <pgwipeout@gmail.com> wrote:
> >>
> >> On Thu, Apr 23, 2020 at 11:49 AM Robin Murphy <robin.murphy@arm.com> wrote:
> >>>
> >>> On 2020-04-23 4:09 pm, Peter Geis wrote:
> >>>> On Thu, Apr 23, 2020 at 11:05 AM Peter Geis <pgwipeout@gmail.com> wrote:
> >>>>>
> >>>>> The rk3399 is capable of operating at PCIe gen 2 as per the TRM.
> >>>>> The device-tree incorrectly limits us to gen 1.
> >>>>>
> >>>>> Correctly set the maximum link speed to <2>.
> >>>>>
> >>>>> Tested on the rockpro64.
> >>>>
> >>>> Note, this was tested on the rockpro64 after I performed the hardware
> >>>> fixes as delineated at
> >>>> https://forum.pine64.org/showthread.php?tid=8374
> >
> > Are you going to fix everyone's board?
> >
> >>>>
> >>>> We probably will have to drop this back to <1> on board specific dts
> >>>> files if issues are discovered.
> >>>
> >>> I'd say commit 712fa1777207 and the fact that the current rev 1.8
> >>> datasheet only mentions 2.5GT/s rather weaken that argument. It would
> >>> seem safer to leave the default as-is, and only override it for boards
> >>> where Gen2 really is proven to work reliably. Which, er, is already the
> >>> case ;)
> >>
> >> Do we have a copy of this errata?
> >> I can't seem to find it.
> >> The write up in that commit is extremely vague.
>
> I'm not a Rockchip customer, so I don't have access to any more RK3399
> documentation than you do. However, I have been involved in plenty of
> errata discussions, writeups, and workarounds for Arm Ltd. IP, so based
> on general experience if I see a patch like that coming directly from
> the silicon vendor, I'm inclined to trust it at face value.
>
> >> As the tegra mailing list often points out, the device-tree describes
> >> the hardware as it is.
> >
> > I think that's DT describes the h/w not settings for the Linux kernel
> > which is different from what's discussed here.
>
> Seconded - this is not a matter of software policy, it's still a
> property of the hardware regardless of what exact route it takes into
> the final DTB. If anyone wants to get into the philosophical argument of
> how accurately a SoC dtsi should describe the SoC in isolation, then I'd
> push for the correct default speed actually being 0GT/s, since if you
> don't consider the board then the link layer isn't even powered ;)
>
> >> As:
> >> The rk3399 itself supports PCIe gen 2.
>
> That's what's in question here - clearly RK3399 *was designed* to
> support Gen2, but Rockchip have since decided that they will only
> officially support using it at Gen1 speeds.
>
> >> The board specific implementations determine if we need to limit that to gen 1.
> >> The rk3399 should be set to 2, and any board that requires that to be
> >> redefined should do that via an override in their device-tree.
> >> This is similar to the gmac overrides for timing.
>
> Note that the GMAC delay settings are not "overrides", they are
> board-specific parameters based the physical trace lengths between the
> SoC and the external phy.
>
> AFAICS this would be far more like putting the 2GHz/1.5GHz OPPs into the
> base dtsi - on the basis that that was also an original design target[1]
> and does work on some RK3399s - then expecting board authors to remember
> to downgrade them to the 1.8GHz/1.4GHz that ended up being the
> officially-supported maximum for all the chips that weren't lucky enough
> to be special "OP1"s.
>
> >> Do we have a list of the boards that require pulling back down to gen 1?
>
> Any that are likely to suffer wobbly signal integrity by exporting the
> PCIe signals on random pin headers instead of proper connectors would
> probably be a fair starting point, but most of that list would consist
> of any number of current and future boards that are not yet supported
> upstream, and would potentially not be supported upstream for even
> longer if some poor engineer wastes time chasing random PCI errors
> because someone set the default to an unsupported value.
>
> Robin.
>
> [1]
> https://www.cnx-software.com/2016/02/22/more-details-about-rockchip-rk3399-cortex-a72-soc-4k-h-264h-265vp9-usb-3-0-pcie-and-displayport/
>
> >>> That said, "proven to work reliably" is itself a bit doubtful - my
> >>> NanoPC-T4 has always been rock-solid at Gen2 with a Samsung Evo 960
> >>> NVMe, yet I've seen plenty of reports of other NVMe models being
> >>> unusable with mainline due to failing link training ~90% of the time.
> >>> It's a grey area for sure.
> >
> > That seems pretty clear to me as to what the default should be. What's
> > most likely to work OOTB for users.
> >
> > Rob
> >

Very well, I am properly chastised.
This will have to remain something an individual user will do, until
someone comes up with something better.
diff mbox series

Patch

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 74f2c3d49095..e9efd330810b 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -248,7 +248,7 @@ 
 				<0 0 0 3 &pcie0_intc 2>,
 				<0 0 0 4 &pcie0_intc 3>;
 		linux,pci-domain = <0>;
-		max-link-speed = <1>;
+		max-link-speed = <2>;
 		msi-map = <0x0 &its 0x0 0x1000>;
 		phys = <&pcie_phy 0>, <&pcie_phy 1>,
 		       <&pcie_phy 2>, <&pcie_phy 3>;