
[v2] arm64: dts: rockchip: rk356x: Fix PCIe register map and ranges

Message ID 20221005085439.740992-1-megi@xff.cz (mailing list archive)
State New, archived
Series [v2] arm64: dts: rockchip: rk356x: Fix PCIe register map and ranges

Commit Message

Ondřej Jirman Oct. 5, 2022, 8:54 a.m. UTC
I have two Realtek PCIe wifi cards connected through a 4-port PCIe switch
to Quartz64-A. The cards fail to work when an NVMe SSD is connected to the
bridge at the same time. Without the NVMe connected, the cards work fine.
The issue seems to be related to mixing devices that use I/O ranges with
devices that use memory ranges.

This patch changes the I/O, MEM and config mappings so that the config and
I/O mappings use the 0xf4000000 outbound address space, and the MEM range
uses the whole 0x300000000 outbound space.

This is similar to how the BSP does the mappings.
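
As a sketch for readers less familiar with the PCI "ranges" encoding (cells
are <flags pci.hi pci.lo cpu.hi cpu.lo size.hi size.lo>, with flag 0x01000000
selecting I/O space and 0x02000000 a 32-bit MEM window), the new mapping
decodes as:

ranges = <0x01000000 0x0 0x00000000  0x0 0xf5f00000  0x0 0x00100000   /* 1 MiB of I/O at CPU 0xf5f00000 */
          0x02000000 0x0 0x40000000  0x3 0x00000000  0x0 0x40000000>; /* 1 GiB of MEM at CPU 0x300000000, PCI 0x40000000 */

with the third "reg" entry providing the config window at 0xf4000000.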

I changed num-ob-windows to the value detected by the kernel, so that if
for whatever reason the kernel ever starts respecting this DT property, it
would not switch to sharing the I/O and CFG spaces via a single iATU mapping
for no reason.

This change to the regs/ranges makes the issue go away, and both the NVMe
and the wifi cards work when connected to the bridge at the same time. I
tested the NVMe with a large amount of reads/writes, both behind the PCIe
bridge and when directly connected to the Quartz64-A board.

Signed-off-by: Ondrej Jirman <megi@xff.cz>
---
BSP for reference: https://github.com/rockchip-linux/kernel/blob/develop-4.19/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2370

v2:
- change ranges to use 0x300000000 fully for MEM and make use of
  the 0xf4000000 outbound range for IO and config
- full retest with/without the switch
- if lspci/dmesg is useful in the future for comparison, see:
  https://xff.cz/kernels/random/quartz64a-pcie/

I used this script for the tests:

#!/bin/bash

# Destination on the NVMe-backed filesystem and number of copies to write.
OUT=/mnt/data
n=8

# Create a 1 GiB random reference file (once) and record its checksum.
test -f /tmp/test.dat || \
    dd if=/dev/urandom of=/tmp/test.dat bs=1M count=1024
md5sum /tmp/test.dat

# Write $n copies to the device under test, bypassing the page cache.
i=0
while test $i -lt $n
do
    dd if=/tmp/test.dat of=$OUT/test$i.dat bs=4M oflag=direct

    i=$(($i+1))
done

# Read the copies back (again with O_DIRECT) and compare their checksums
# against the reference.
i=0
while test $i -lt $n
do
    dd if=$OUT/test$i.dat bs=4M iflag=direct | md5sum

    i=$(($i+1))
done


 arch/arm64/boot/dts/rockchip/rk356x.dtsi | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Peter Geis Oct. 5, 2022, 11:42 a.m. UTC | #1
On Wed, Oct 5, 2022 at 4:54 AM Ondrej Jirman <megi@xff.cz> wrote:
>

Good Morning,

> I have two Realtek PCIe wifi cards connected over the 4 port PCIe swtich
> to Quartz64-A. The cards fail to work, when nvme SSD is connected at the
> same time to the bridge. Without nvme connected, cards work fine. The
> issue seems to be related to mixed use of devices which make use of I/O
> ranges and memory ranges.
>
> This patch changes I/O, MEM and config mappings so that config and I/O
> mappings use the 0xf4000000 outbound address space, and MEM range uses
> the whole 0x300000000 outbound space.
>
> This is simialar to how BSP does the mappings.

This change is quite recent in the BSP (Jan 2022):
https://github.com/rockchip-linux/kernel/commit/cfab7abefc4093daa379fbd90a1e7ac1a484332b
There are a few other interesting changes there as well. They added a 32-bit
window in the lower range and made the entire upper range a 64-bit
relocatable (why?) and prefetchable window. They also set the viewport
number to 8. The dt-binding says this is autodetected, but I wonder if
the value is being detected correctly.

In the BSP it looks like this depends on a backported change from mainline:
https://github.com/rockchip-linux/kernel/commit/50a01d3c10a6212f66364575a3c8f66c07f41591

Can someone weigh in why the dw core has config in the reg node
instead of ranges?

>
> I changed num-ob-windows to value detected by the kernel so if for whatever
> reason the kernel ever starts respecting this DT property, it would not
> switch to sharing I/O and CFG spaces via a single iATU mapping for
> no reason.

This worries me, because the value may be getting detected incorrectly;
they set it to this for a reason. It's not unheard of for Rockchip to
need to override what they encode in the silicon.

Very Respectfully,
Peter Geis

>
> This change to the regs/ranges makes the issue go away and both nvme and
> wifi cards work when connected at the same time to the bridge. I tested
> the nvme with large amount of reads/writes, both behind the PCIe bridge
> and when directly connected to Quartz64-A board.
>
> Signed-off-by: Ondrej Jirman <megi@xff.cz>
> ---
> BSP for reference: https://github.com/rockchip-linux/kernel/blob/develop-4.19/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2370
>
> v2:
> - change ranges to use 0x300000000 fully for MEM and make use of
>   the 0xf4000000 outbound range for IO and config
> - full retest with/without the switch
> - if lscpi/dmesg is useful in the future for comparison, see:
>   https://xff.cz/kernels/random/quartz64a-pcie/
>
> I used this script for the tests:
>
> #!/bin/bash
>
> OUT=/mnt/data
> n=8
>
> test -f /tmp/test.dat || \
>     dd if=/dev/urandom of=/tmp/test.dat bs=1M count=1024
> md5sum /tmp/test.dat
>
> i=0
> while test $i -lt $n
> do
>     dd if=/tmp/test.dat of=$OUT/test$i.dat bs=4M oflag=direct
>
>     i=$(($i+1))
> done
>
> i=0
> while test $i -lt $n
> do
>     dd if=$OUT/test$i.dat bs=4M iflag=direct | md5sum
>
>     i=$(($i+1))
> done
>
>
>  arch/arm64/boot/dts/rockchip/rk356x.dtsi | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> index 319981c3e9f7..99fd9543fc6f 100644
> --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> @@ -855,7 +855,8 @@ pcie2x1: pcie@fe260000 {
>                 compatible = "rockchip,rk3568-pcie";
>                 reg = <0x3 0xc0000000 0x0 0x00400000>,
>                       <0x0 0xfe260000 0x0 0x00010000>,
> -                     <0x3 0x3f000000 0x0 0x01000000>;
> +                     <0x0 0xf4000000 0x0 0x01f00000>;
> +
>                 reg-names = "dbi", "apb", "config";
>                 interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
>                              <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
> @@ -877,15 +878,15 @@ pcie2x1: pcie@fe260000 {
>                                 <0 0 0 4 &pcie_intc 3>;
>                 linux,pci-domain = <0>;
>                 num-ib-windows = <6>;
> -               num-ob-windows = <2>;
> +               num-ob-windows = <8>;
>                 max-link-speed = <2>;
>                 msi-map = <0x0 &gic 0x0 0x1000>;
>                 num-lanes = <1>;
>                 phys = <&combphy2 PHY_TYPE_PCIE>;
>                 phy-names = "pcie-phy";
>                 power-domains = <&power RK3568_PD_PIPE>;
> -               ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
> -                         0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
> +               ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
> +                         0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
>                 resets = <&cru SRST_PCIE20_POWERUP>;
>                 reset-names = "pipe";
>                 #address-cells = <3>;
> --
> 2.37.3
>
Ondřej Jirman Oct. 5, 2022, 12:49 p.m. UTC | #2
Hi,

On Wed, Oct 05, 2022 at 07:42:54AM -0400, Peter Geis wrote:
> On Wed, Oct 5, 2022 at 4:54 AM Ondrej Jirman <megi@xff.cz> wrote:
> >
> 
> Good Morning,
> 
> > I have two Realtek PCIe wifi cards connected over the 4 port PCIe swtich
> > to Quartz64-A. The cards fail to work, when nvme SSD is connected at the
> > same time to the bridge. Without nvme connected, cards work fine. The
> > issue seems to be related to mixed use of devices which make use of I/O
> > ranges and memory ranges.
> >
> > This patch changes I/O, MEM and config mappings so that config and I/O
> > mappings use the 0xf4000000 outbound address space, and MEM range uses
> > the whole 0x300000000 outbound space.
> >
> > This is simialar to how BSP does the mappings.
> 
> This change was very recent in the BSP stuff (Jan 2022):
> https://github.com/rockchip-linux/kernel/commit/cfab7abefc4093daa379fbd90a1e7ac1a484332b
> A few other interesting changes there as well. They added a 32 bit
> window in the lower range and made the entire upper range a 64 bit
> relocatable (why?) and prefetchable window. They also set the viewport
> number to 8. The dt-binding says this is autodetected, but I wonder if
> the value is being detected correctly.

That number in the BSP is equivalent in meaning to num-ob-windows in
mainline. At least it's used in an identical manner in the code.

> It looks like it is dependent in BSP on a backported change from mainline:
> https://github.com/rockchip-linux/kernel/commit/50a01d3c10a6212f66364575a3c8f66c07f41591
> 
> Can someone weigh in why the dw core has config in the reg node
> instead of ranges?
> 
> >
> > I changed num-ob-windows to value detected by the kernel so if for whatever
> > reason the kernel ever starts respecting this DT property, it would not
> > switch to sharing I/O and CFG spaces via a single iATU mapping for
> > no reason.
> 
> This worries me that this value may be being detected incorrectly,
> they set it to this for a reason. It's not unheard of for Rockchip to
> need to override what they encode in the silicon.

num-ob-windows in mainline == viewport-num in the BSP, for whatever reason
(it's used to decide whether there are enough windows to share config with
I/O or not). So it's identical to the BSP.

kind regards,
	o.

> Very Respectfully,
> Peter Geis
> 
> >
> > This change to the regs/ranges makes the issue go away and both nvme and
> > wifi cards work when connected at the same time to the bridge. I tested
> > the nvme with large amount of reads/writes, both behind the PCIe bridge
> > and when directly connected to Quartz64-A board.
> >
> > Signed-off-by: Ondrej Jirman <megi@xff.cz>
> > ---
> > BSP for reference: https://github.com/rockchip-linux/kernel/blob/develop-4.19/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2370
> >
> > v2:
> > - change ranges to use 0x300000000 fully for MEM and make use of
> >   the 0xf4000000 outbound range for IO and config
> > - full retest with/without the switch
> > - if lscpi/dmesg is useful in the future for comparison, see:
> >   https://xff.cz/kernels/random/quartz64a-pcie/
> >
> > I used this script for the tests:
> >
> > #!/bin/bash
> >
> > OUT=/mnt/data
> > n=8
> >
> > test -f /tmp/test.dat || \
> >     dd if=/dev/urandom of=/tmp/test.dat bs=1M count=1024
> > md5sum /tmp/test.dat
> >
> > i=0
> > while test $i -lt $n
> > do
> >     dd if=/tmp/test.dat of=$OUT/test$i.dat bs=4M oflag=direct
> >
> >     i=$(($i+1))
> > done
> >
> > i=0
> > while test $i -lt $n
> > do
> >     dd if=$OUT/test$i.dat bs=4M iflag=direct | md5sum
> >
> >     i=$(($i+1))
> > done
> >
> >
> >  arch/arm64/boot/dts/rockchip/rk356x.dtsi | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > index 319981c3e9f7..99fd9543fc6f 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > @@ -855,7 +855,8 @@ pcie2x1: pcie@fe260000 {
> >                 compatible = "rockchip,rk3568-pcie";
> >                 reg = <0x3 0xc0000000 0x0 0x00400000>,
> >                       <0x0 0xfe260000 0x0 0x00010000>,
> > -                     <0x3 0x3f000000 0x0 0x01000000>;
> > +                     <0x0 0xf4000000 0x0 0x01f00000>;
> > +
> >                 reg-names = "dbi", "apb", "config";
> >                 interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
> >                              <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
> > @@ -877,15 +878,15 @@ pcie2x1: pcie@fe260000 {
> >                                 <0 0 0 4 &pcie_intc 3>;
> >                 linux,pci-domain = <0>;
> >                 num-ib-windows = <6>;
> > -               num-ob-windows = <2>;
> > +               num-ob-windows = <8>;
> >                 max-link-speed = <2>;
> >                 msi-map = <0x0 &gic 0x0 0x1000>;
> >                 num-lanes = <1>;
> >                 phys = <&combphy2 PHY_TYPE_PCIE>;
> >                 phy-names = "pcie-phy";
> >                 power-domains = <&power RK3568_PD_PIPE>;
> > -               ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
> > -                         0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
> > +               ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
> > +                         0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> >                 resets = <&cru SRST_PCIE20_POWERUP>;
> >                 reset-names = "pipe";
> >                 #address-cells = <3>;
> > --
> > 2.37.3
> >
Ondřej Jirman Oct. 5, 2022, 10:08 p.m. UTC | #3
On Wed, Oct 05, 2022 at 07:42:54AM -0400, Peter Geis wrote:
> On Wed, Oct 5, 2022 at 4:54 AM Ondrej Jirman <megi@xff.cz> wrote:
> >
> 
> Good Morning,
> 
> > I have two Realtek PCIe wifi cards connected over the 4 port PCIe swtich
> > to Quartz64-A. The cards fail to work, when nvme SSD is connected at the
> > same time to the bridge. Without nvme connected, cards work fine. The
> > issue seems to be related to mixed use of devices which make use of I/O
> > ranges and memory ranges.
> >
> > This patch changes I/O, MEM and config mappings so that config and I/O
> > mappings use the 0xf4000000 outbound address space, and MEM range uses
> > the whole 0x300000000 outbound space.
> >
> > This is simialar to how BSP does the mappings.
> 
> This change was very recent in the BSP stuff (Jan 2022):
> https://github.com/rockchip-linux/kernel/commit/cfab7abefc4093daa379fbd90a1e7ac1a484332b
> A few other interesting changes there as well. They added a 32 bit
> window in the lower range and made the entire upper range a 64 bit
> relocatable (why?) and prefetchable window. They also set the viewport
> number to 8. The dt-binding says this is autodetected, but I wonder if
> the value is being detected correctly.
> 
> It looks like it is dependent in BSP on a backported change from mainline:
> https://github.com/rockchip-linux/kernel/commit/50a01d3c10a6212f66364575a3c8f66c07f41591
> 
> Can someone weigh in why the dw core has config in the reg node
> instead of ranges?
> 
> >
> > I changed num-ob-windows to value detected by the kernel so if for whatever
> > reason the kernel ever starts respecting this DT property, it would not
> > switch to sharing I/O and CFG spaces via a single iATU mapping for
> > no reason.
> 
> This worries me that this value may be being detected incorrectly,
> they set it to this for a reason. It's not unheard of for Rockchip to
> need to override what they encode in the silicon.

I just noticed that you may be thinking that the BSP does some detection.
It does not. It just uses either the value from DT or a hardcoded value of 2
in the code.

https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/pci/controller/dwc/pcie-designware-host.c#L450

regards,
	o.

> Very Respectfully,
> Peter Geis
> 
> >
> > This change to the regs/ranges makes the issue go away and both nvme and
> > wifi cards work when connected at the same time to the bridge. I tested
> > the nvme with large amount of reads/writes, both behind the PCIe bridge
> > and when directly connected to Quartz64-A board.
> >
> > Signed-off-by: Ondrej Jirman <megi@xff.cz>
> > ---
> > BSP for reference: https://github.com/rockchip-linux/kernel/blob/develop-4.19/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2370
> >
> > v2:
> > - change ranges to use 0x300000000 fully for MEM and make use of
> >   the 0xf4000000 outbound range for IO and config
> > - full retest with/without the switch
> > - if lscpi/dmesg is useful in the future for comparison, see:
> >   https://xff.cz/kernels/random/quartz64a-pcie/
> >
> > I used this script for the tests:
> >
> > #!/bin/bash
> >
> > OUT=/mnt/data
> > n=8
> >
> > test -f /tmp/test.dat || \
> >     dd if=/dev/urandom of=/tmp/test.dat bs=1M count=1024
> > md5sum /tmp/test.dat
> >
> > i=0
> > while test $i -lt $n
> > do
> >     dd if=/tmp/test.dat of=$OUT/test$i.dat bs=4M oflag=direct
> >
> >     i=$(($i+1))
> > done
> >
> > i=0
> > while test $i -lt $n
> > do
> >     dd if=$OUT/test$i.dat bs=4M iflag=direct | md5sum
> >
> >     i=$(($i+1))
> > done
> >
> >
> >  arch/arm64/boot/dts/rockchip/rk356x.dtsi | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > index 319981c3e9f7..99fd9543fc6f 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > @@ -855,7 +855,8 @@ pcie2x1: pcie@fe260000 {
> >                 compatible = "rockchip,rk3568-pcie";
> >                 reg = <0x3 0xc0000000 0x0 0x00400000>,
> >                       <0x0 0xfe260000 0x0 0x00010000>,
> > -                     <0x3 0x3f000000 0x0 0x01000000>;
> > +                     <0x0 0xf4000000 0x0 0x01f00000>;
> > +
> >                 reg-names = "dbi", "apb", "config";
> >                 interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
> >                              <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
> > @@ -877,15 +878,15 @@ pcie2x1: pcie@fe260000 {
> >                                 <0 0 0 4 &pcie_intc 3>;
> >                 linux,pci-domain = <0>;
> >                 num-ib-windows = <6>;
> > -               num-ob-windows = <2>;
> > +               num-ob-windows = <8>;
> >                 max-link-speed = <2>;
> >                 msi-map = <0x0 &gic 0x0 0x1000>;
> >                 num-lanes = <1>;
> >                 phys = <&combphy2 PHY_TYPE_PCIE>;
> >                 phy-names = "pcie-phy";
> >                 power-domains = <&power RK3568_PD_PIPE>;
> > -               ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
> > -                         0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
> > +               ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
> > +                         0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> >                 resets = <&cru SRST_PCIE20_POWERUP>;
> >                 reset-names = "pipe";
> >                 #address-cells = <3>;
> > --
> > 2.37.3
> >
Heiko Stuebner Oct. 17, 2022, 11:56 a.m. UTC | #4
Hi,

On Thu, Oct 06, 2022 at 00:08:12 CEST, Ondřej Jirman wrote:
> On Wed, Oct 05, 2022 at 07:42:54AM -0400, Peter Geis wrote:
> > On Wed, Oct 5, 2022 at 4:54 AM Ondrej Jirman <megi@xff.cz> wrote:
> > >
> > 
> > Good Morning,
> > 
> > > I have two Realtek PCIe wifi cards connected over the 4 port PCIe swtich
> > > to Quartz64-A. The cards fail to work, when nvme SSD is connected at the
> > > same time to the bridge. Without nvme connected, cards work fine. The
> > > issue seems to be related to mixed use of devices which make use of I/O
> > > ranges and memory ranges.
> > >
> > > This patch changes I/O, MEM and config mappings so that config and I/O
> > > mappings use the 0xf4000000 outbound address space, and MEM range uses
> > > the whole 0x300000000 outbound space.
> > >
> > > This is simialar to how BSP does the mappings.
> > 
> > This change was very recent in the BSP stuff (Jan 2022):
> > https://github.com/rockchip-linux/kernel/commit/cfab7abefc4093daa379fbd90a1e7ac1a484332b
> > A few other interesting changes there as well. They added a 32 bit
> > window in the lower range and made the entire upper range a 64 bit
> > relocatable (why?) and prefetchable window. They also set the viewport
> > number to 8. The dt-binding says this is autodetected, but I wonder if
> > the value is being detected correctly.
> > 
> > It looks like it is dependent in BSP on a backported change from mainline:
> > https://github.com/rockchip-linux/kernel/commit/50a01d3c10a6212f66364575a3c8f66c07f41591
> > 
> > Can someone weigh in why the dw core has config in the reg node
> > instead of ranges?
> > 
> > >
> > > I changed num-ob-windows to value detected by the kernel so if for whatever
> > > reason the kernel ever starts respecting this DT property, it would not
> > > switch to sharing I/O and CFG spaces via a single iATU mapping for
> > > no reason.
> > 
> > This worries me that this value may be being detected incorrectly,
> > they set it to this for a reason. It's not unheard of for Rockchip to
> > need to override what they encode in the silicon.
> 
> I just noticed that you may be thinking that BSP does some detection. It does
> not. It just uses either value from DT or hardcoded value 2 in the code.
> 
> https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/pci/controller/dwc/pcie-designware-host.c#L450

@Peter or other people in the recipient list with more PCIe
experience than me, can someone provide some more judgement
on this topic?

Thanks
Heiko


> > Very Respectfully,
> > Peter Geis
> > 
> > >
> > > This change to the regs/ranges makes the issue go away and both nvme and
> > > wifi cards work when connected at the same time to the bridge. I tested
> > > the nvme with large amount of reads/writes, both behind the PCIe bridge
> > > and when directly connected to Quartz64-A board.
> > >
> > > Signed-off-by: Ondrej Jirman <megi@xff.cz>
> > > ---
> > > BSP for reference: https://github.com/rockchip-linux/kernel/blob/develop-4.19/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2370
> > >
> > > v2:
> > > - change ranges to use 0x300000000 fully for MEM and make use of
> > >   the 0xf4000000 outbound range for IO and config
> > > - full retest with/without the switch
> > > - if lscpi/dmesg is useful in the future for comparison, see:
> > >   https://xff.cz/kernels/random/quartz64a-pcie/
> > >
> > > I used this script for the tests:
> > >
> > > #!/bin/bash
> > >
> > > OUT=/mnt/data
> > > n=8
> > >
> > > test -f /tmp/test.dat || \
> > >     dd if=/dev/urandom of=/tmp/test.dat bs=1M count=1024
> > > md5sum /tmp/test.dat
> > >
> > > i=0
> > > while test $i -lt $n
> > > do
> > >     dd if=/tmp/test.dat of=$OUT/test$i.dat bs=4M oflag=direct
> > >
> > >     i=$(($i+1))
> > > done
> > >
> > > i=0
> > > while test $i -lt $n
> > > do
> > >     dd if=$OUT/test$i.dat bs=4M iflag=direct | md5sum
> > >
> > >     i=$(($i+1))
> > > done
> > >
> > >
> > >  arch/arm64/boot/dts/rockchip/rk356x.dtsi | 9 +++++----
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > > index 319981c3e9f7..99fd9543fc6f 100644
> > > --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > > +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > > @@ -855,7 +855,8 @@ pcie2x1: pcie@fe260000 {
> > >                 compatible = "rockchip,rk3568-pcie";
> > >                 reg = <0x3 0xc0000000 0x0 0x00400000>,
> > >                       <0x0 0xfe260000 0x0 0x00010000>,
> > > -                     <0x3 0x3f000000 0x0 0x01000000>;
> > > +                     <0x0 0xf4000000 0x0 0x01f00000>;
> > > +
> > >                 reg-names = "dbi", "apb", "config";
> > >                 interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
> > >                              <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
> > > @@ -877,15 +878,15 @@ pcie2x1: pcie@fe260000 {
> > >                                 <0 0 0 4 &pcie_intc 3>;
> > >                 linux,pci-domain = <0>;
> > >                 num-ib-windows = <6>;
> > > -               num-ob-windows = <2>;
> > > +               num-ob-windows = <8>;
> > >                 max-link-speed = <2>;
> > >                 msi-map = <0x0 &gic 0x0 0x1000>;
> > >                 num-lanes = <1>;
> > >                 phys = <&combphy2 PHY_TYPE_PCIE>;
> > >                 phy-names = "pcie-phy";
> > >                 power-domains = <&power RK3568_PD_PIPE>;
> > > -               ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
> > > -                         0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
> > > +               ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
> > > +                         0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> > >                 resets = <&cru SRST_PCIE20_POWERUP>;
> > >                 reset-names = "pipe";
> > >                 #address-cells = <3>;
> > > --
> > > 2.37.3
> > >
>
Peter Geis Oct. 21, 2022, 1:07 p.m. UTC | #5
On Mon, Oct 17, 2022 at 7:56 AM Heiko Stuebner <heiko@sntech.de> wrote:
>
> Hi,
>
> Am Donnerstag, 6. Oktober 2022, 00:08:12 CEST schrieb Ondřej Jirman:
> > On Wed, Oct 05, 2022 at 07:42:54AM -0400, Peter Geis wrote:
> > > On Wed, Oct 5, 2022 at 4:54 AM Ondrej Jirman <megi@xff.cz> wrote:
> > > >
> > >
> > > Good Morning,
> > >
> > > > I have two Realtek PCIe wifi cards connected over the 4 port PCIe swtich
> > > > to Quartz64-A. The cards fail to work, when nvme SSD is connected at the
> > > > same time to the bridge. Without nvme connected, cards work fine. The
> > > > issue seems to be related to mixed use of devices which make use of I/O
> > > > ranges and memory ranges.
> > > >
> > > > This patch changes I/O, MEM and config mappings so that config and I/O
> > > > mappings use the 0xf4000000 outbound address space, and MEM range uses
> > > > the whole 0x300000000 outbound space.
> > > >
> > > > This is simialar to how BSP does the mappings.
> > >
> > > This change was very recent in the BSP stuff (Jan 2022):
> > > https://github.com/rockchip-linux/kernel/commit/cfab7abefc4093daa379fbd90a1e7ac1a484332b
> > > A few other interesting changes there as well. They added a 32 bit
> > > window in the lower range and made the entire upper range a 64 bit
> > > relocatable (why?) and prefetchable window. They also set the viewport
> > > number to 8. The dt-binding says this is autodetected, but I wonder if
> > > the value is being detected correctly.
> > >
> > > It looks like it is dependent in BSP on a backported change from mainline:
> > > https://github.com/rockchip-linux/kernel/commit/50a01d3c10a6212f66364575a3c8f66c07f41591
> > >
> > > Can someone weigh in why the dw core has config in the reg node
> > > instead of ranges?
> > >
> > > >
> > > > I changed num-ob-windows to value detected by the kernel so if for whatever
> > > > reason the kernel ever starts respecting this DT property, it would not
> > > > switch to sharing I/O and CFG spaces via a single iATU mapping for
> > > > no reason.
> > >
> > > This worries me that this value may be being detected incorrectly,
> > > they set it to this for a reason. It's not unheard of for Rockchip to
> > > need to override what they encode in the silicon.
> >
> > I just noticed that you may be thinking that BSP does some detection. It does
> > not. It just uses either value from DT or hardcoded value 2 in the code.
> >
> > https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/pci/controller/dwc/pcie-designware-host.c#L450
>
> @Peter or other people in the recipient list with more PCIe
> experience than me, can someone provide some more judgement
> on this topic?

Good Morning Heiko,

Apologies for only just getting to this; I'm still in the middle of moving
and just got my lab set back up.

I've tested this patch series and it leads to the same regression with
NVMe drives. A loop of md5sum on two identical 4GB random files
produces the following results:
d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
fad97e91da8d4fd554c895cafa89809b  test-rand2.img
2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
b9637505bf88ed725f6d03deb7065dab  test-rand.img
f7437e88d524ea92e097db51dce1c60d  test-rand2.img

Before this patch series:
d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img

Though I do love where this patch is going and would like to see if it
can be made to work, in its current form it does not.

Very Respectfully,
Peter Geis

>
> Thanks
> Heiko
>
>
> > > Very Respectfully,
> > > Peter Geis
> > >
> > > >
> > > > This change to the regs/ranges makes the issue go away and both nvme and
> > > > wifi cards work when connected at the same time to the bridge. I tested
> > > > the nvme with large amount of reads/writes, both behind the PCIe bridge
> > > > and when directly connected to Quartz64-A board.
> > > >
> > > > Signed-off-by: Ondrej Jirman <megi@xff.cz>
> > > > ---
> > > > BSP for reference: https://github.com/rockchip-linux/kernel/blob/develop-4.19/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2370
> > > >
> > > > v2:
> > > > - change ranges to use 0x300000000 fully for MEM and make use of
> > > >   the 0xf4000000 outbound range for IO and config
> > > > - full retest with/without the switch
> > > > - if lscpi/dmesg is useful in the future for comparison, see:
> > > >   https://xff.cz/kernels/random/quartz64a-pcie/
> > > >
> > > > I used this script for the tests:
> > > >
> > > > #!/bin/bash
> > > >
> > > > OUT=/mnt/data
> > > > n=8
> > > >
> > > > test -f /tmp/test.dat || \
> > > >     dd if=/dev/urandom of=/tmp/test.dat bs=1M count=1024
> > > > md5sum /tmp/test.dat
> > > >
> > > > i=0
> > > > while test $i -lt $n
> > > > do
> > > >     dd if=/tmp/test.dat of=$OUT/test$i.dat bs=4M oflag=direct
> > > >
> > > >     i=$(($i+1))
> > > > done
> > > >
> > > > i=0
> > > > while test $i -lt $n
> > > > do
> > > >     dd if=$OUT/test$i.dat bs=4M iflag=direct | md5sum
> > > >
> > > >     i=$(($i+1))
> > > > done
> > > >
> > > >
> > > >  arch/arm64/boot/dts/rockchip/rk356x.dtsi | 9 +++++----
> > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > > > index 319981c3e9f7..99fd9543fc6f 100644
> > > > --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > > > +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> > > > @@ -855,7 +855,8 @@ pcie2x1: pcie@fe260000 {
> > > >                 compatible = "rockchip,rk3568-pcie";
> > > >                 reg = <0x3 0xc0000000 0x0 0x00400000>,
> > > >                       <0x0 0xfe260000 0x0 0x00010000>,
> > > > -                     <0x3 0x3f000000 0x0 0x01000000>;
> > > > +                     <0x0 0xf4000000 0x0 0x01f00000>;
> > > > +
> > > >                 reg-names = "dbi", "apb", "config";
> > > >                 interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
> > > >                              <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
> > > > @@ -877,15 +878,15 @@ pcie2x1: pcie@fe260000 {
> > > >                                 <0 0 0 4 &pcie_intc 3>;
> > > >                 linux,pci-domain = <0>;
> > > >                 num-ib-windows = <6>;
> > > > -               num-ob-windows = <2>;
> > > > +               num-ob-windows = <8>;
> > > >                 max-link-speed = <2>;
> > > >                 msi-map = <0x0 &gic 0x0 0x1000>;
> > > >                 num-lanes = <1>;
> > > >                 phys = <&combphy2 PHY_TYPE_PCIE>;
> > > >                 phy-names = "pcie-phy";
> > > >                 power-domains = <&power RK3568_PD_PIPE>;
> > > > -               ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
> > > > -                         0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
> > > > +               ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
> > > > +                         0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> > > >                 resets = <&cru SRST_PCIE20_POWERUP>;
> > > >                 reset-names = "pipe";
> > > >                 #address-cells = <3>;
> > > > --
> > > > 2.37.3
> > > >
> >
>
>
>
>
Ondřej Jirman Oct. 21, 2022, 3:39 p.m. UTC | #6
On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> Good Morning Heiko,
> 
> Apologies for just getting to this, I'm still in the middle of moving
> and just got my lab set back up.
> 
> I've tested this patch series and it leads to the same regression with
> NVMe drives. A loop of md5sum on two identical 4GB random files
> produces the following results:
> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> b9637505bf88ed725f6d03deb7065dab  test-rand.img
> f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> 
> Before this patch series:
> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> 
> Though I do love where this patch is going and would like to see if it
> can be made to work, in its current form it does not.

Thanks for the test. Can you please also test v1? Also please share lspci -vvv
of your nvme drive, so that we can see allocated address ranges, etc.

kind regards,
	o.

> Very Respectfully,
> Peter Geis
Peter Geis Oct. 21, 2022, 4:48 p.m. UTC | #7
On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
>
> On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> > Good Morning Heiko,
> >
> > Apologies for just getting to this, I'm still in the middle of moving
> > and just got my lab set back up.
> >
> > I've tested this patch series and it leads to the same regression with
> > NVMe drives. A loop of md5sum on two identical 4GB random files
> > produces the following results:
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> > 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > b9637505bf88ed725f6d03deb7065dab  test-rand.img
> > f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> >
> > Before this patch series:
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >
> > Though I do love where this patch is going and would like to see if it
> > can be made to work, in its current form it does not.
>
> Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> of your nvme drive, so that we can see allocated address ranges, etc.

Good catch. With your patch as-is, the following issue crops up:
Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
Region 2: I/O ports at 1000 [disabled] [size=256]

However, with a simple fix, we can get this:
Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
Region 2: I/O ports at 1000 [virtual] [size=256]

and with it a working NVMe drive.

Change the following range:
0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
to
0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
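
For clarity, a sketch of the full ranges property with just that one change
applied on top of the v2 patch (the I/O row stays the same):

ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
          0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;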

I still haven't tested this with other cards yet, and I've also successfully
tested another patch that does similar work with NVMe drives. I'll have to
get back to you on the results of further testing.

Very Respectfully,
Peter Geis

>
> kind regards,
>         o.
>
> > Very Respectfully,
> > Peter Geis
Ondřej Jirman Oct. 21, 2022, 7:32 p.m. UTC | #8
On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
> On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
> >
> > On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> > > Good Morning Heiko,
> > >
> > > Apologies for just getting to this, I'm still in the middle of moving
> > > and just got my lab set back up.
> > >
> > > I've tested this patch series and it leads to the same regression with
> > > NVMe drives. A loop of md5sum on two identical 4GB random files
> > > produces the following results:
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> > > 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > b9637505bf88ed725f6d03deb7065dab  test-rand.img
> > > f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> > >
> > > Before this patch series:
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > >
> > > Though I do love where this patch is going and would like to see if it
> > > can be made to work, in its current form it does not.
> >
> > Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> > of your nvme drive, so that we can see allocated address ranges, etc.
> 
> Good catch, with your patch as is, the following issue crops up:
> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
> Region 2: I/O ports at 1000 [disabled] [size=256]
> 
> However, with a simple fix, we can get this:
> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
> Region 2: I/O ports at 1000 [virtual] [size=256]
> 
> and with it a working NVMe drive.
> 
> Change the following range:
> 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> to
> 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;

I've already tried this, but it unfortunately breaks the wifi cards
(those only use the I/O space). Maybe it's because the I/O and memory
address spaces now overlap, I don't know. That's why I used the 1 GiB
offset for the memory space.

kind regards,
	o.

> I still haven't tested this with other cards yet, and another patch
> that does similar work I've tested successfully as well with NVMe
> drives. I'll have to get back to you on the results of greater
> testing.
> 
> Very Respectfully,
> Peter Geis
> 
> >
> > kind regards,
> >         o.
> >
> > > Very Respectfully,
> > > Peter Geis
Mark Kettenis Oct. 21, 2022, 8:52 p.m. UTC | #9
> Date: Fri, 21 Oct 2022 21:32:48 +0200
> From: Ondřej Jirman <megi@xff.cz>
> 
> On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
> > On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
> > >
> > > On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> > > > Good Morning Heiko,
> > > >
> > > > Apologies for just getting to this, I'm still in the middle of moving
> > > > and just got my lab set back up.
> > > >
> > > > I've tested this patch series and it leads to the same regression with
> > > > NVMe drives. A loop of md5sum on two identical 4GB random files
> > > > produces the following results:
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> > > > 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> > > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > > 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> > > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > > b9637505bf88ed725f6d03deb7065dab  test-rand.img
> > > > f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> > > >
> > > > Before this patch series:
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > >
> > > > Though I do love where this patch is going and would like to see if it
> > > > can be made to work, in its current form it does not.
> > >
> > > Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> > > of your nvme drive, so that we can see allocated address ranges, etc.
> > 
> > Good catch, with your patch as is, the following issue crops up:
> > Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
> > Region 2: I/O ports at 1000 [disabled] [size=256]
> > 
> > However, with a simple fix, we can get this:
> > Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
> > Region 2: I/O ports at 1000 [virtual] [size=256]
> > 
> > and with it a working NVMe drive.
> > 
> > Change the following range:
> > 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> > to
> > 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
> 
> I've already tried this, but this unfrotunately breaks the wifi cards.
> (those only use the I/O space) Maybe because I/O and memory address spaces
> now overlap, I don't know. That's why I used the 1GiB offset for memory
> space.

Meanwhile, I have an NVMe drive that only works if mmio is completely
untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
Motion SM2260 controller.

So for me, a working configuration has the following "ranges":

ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
         <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
         <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;

This also needs changes to the "reg" property:

reg = <0x3 0xc0000000 0x0 0x00400000>,
      <0x0 0xfe260000 0x0 0x00010000>,
      <0x3 0x00000000 0x0 0x10000000>;

Now admittedly, this is with OpenBSD running on EDK2 UEFI firmware
from

  https://github.com/jaredmcneill/quartz64_uefi

that I modified to pass through the device tree and modify the ranges
as above.  But the way my OpenBSD driver sets up the address
translation windows matches what the mainline Linux driver does.

I picked the ranges above to match the EDK2 configuration.  But it is
a setup that maximizes the 32-bit mmio window.

Cheers,

Mark

> > I still haven't tested this with other cards yet, and another patch
> > that does similar work I've tested successfully as well with NVMe
> > drives. I'll have to get back to you on the results of greater
> > testing.
> > 
> > Very Respectfully,
> > Peter Geis
> > 
> > >
> > > kind regards,
> > >         o.
> > >
> > > > Very Respectfully,
> > > > Peter Geis
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Peter Geis Oct. 22, 2022, 12:19 p.m. UTC | #10
On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>
> > Date: Fri, 21 Oct 2022 21:32:48 +0200
> > From: Ondřej Jirman <megi@xff.cz>
> >
> > On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
> > > On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
> > > >
> > > > On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> > > > > Good Morning Heiko,
> > > > >
> > > > > Apologies for just getting to this, I'm still in the middle of moving
> > > > > and just got my lab set back up.
> > > > >
> > > > > I've tested this patch series and it leads to the same regression with
> > > > > NVMe drives. A loop of md5sum on two identical 4GB random files
> > > > > produces the following results:
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> > > > > 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> > > > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > > > 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> > > > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > > > b9637505bf88ed725f6d03deb7065dab  test-rand.img
> > > > > f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> > > > >
> > > > > Before this patch series:
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > >
> > > > > Though I do love where this patch is going and would like to see if it
> > > > > can be made to work, in its current form it does not.
> > > >
> > > > Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> > > > of your nvme drive, so that we can see allocated address ranges, etc.
> > >
> > > Good catch, with your patch as is, the following issue crops up:
> > > Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
> > > Region 2: I/O ports at 1000 [disabled] [size=256]
> > >
> > > However, with a simple fix, we can get this:
> > > Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
> > > Region 2: I/O ports at 1000 [virtual] [size=256]
> > >
> > > and with it a working NVMe drive.
> > >
> > > Change the following range:
> > > 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> > > to
> > > 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
> >
> > I've already tried this, but this unfrotunately breaks the wifi cards.
> > (those only use the I/O space) Maybe because I/O and memory address spaces
> > now overlap, I don't know. That's why I used the 1GiB offset for memory
> > space.
>
> Meanwhile, I have an NVMe drive that only works if mmio is completely
> untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
> Motion SM2260 controller.
>
> So for me, a working configuration has the following "ranges":
>
> ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
>          <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
>          <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;
>
> This also needs changes to the "reg" propery:
>
> reg = <0x3 0xc0000000 0x0 0x00400000>,
>       <0x0 0xfe260000 0x0 0x00010000>,
>       <0x3 0x00000000 0x0 0x10000000>;

Now this is interesting. I've been reading up on PCIe ranges and what
is necessary for things to work properly, and I found this article
from ARM:
https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions

TL;DR: we need a low region (below 4 GB) and a high region.

From other articles I've gleaned that the config / io should probably
also be in the low range. As such, I believe the other patch that was
sent to me may be the correct way to go. Could both of you try the
following reg / ranges:

reg = <0x3 0xc0000000 0x0 0x00400000>,
      <0x0 0xfe260000 0x0 0x00010000>,
      <0x0 0xf4000000 0x0 0x00100000>;

ranges = <0x01000000 0x0 0xf4100000 0x0 0xf4100000 0x0 0x00100000>,
<0x02000000 0x0 0xf4200000 0x0 0xf4200000 0x0 0x01e00000>,
<0x03000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
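
As a sketch, decoding those rows under the standard PCI DT binding (flag
0x01000000 selects I/O space, 0x02000000 a 32-bit MEM window, and 0x03000000
a 64-bit MEM window), this proposal comes out as:

ranges = <0x01000000 0x0 0xf4100000 0x0 0xf4100000 0x0 0x00100000   /* 1 MiB I/O, identity-mapped at 0xf4100000 */
          0x02000000 0x0 0xf4200000 0x0 0xf4200000 0x0 0x01e00000   /* 30 MiB 32-bit MEM, identity-mapped at 0xf4200000 */
          0x03000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>; /* 1 GiB 64-bit MEM at CPU 0x300000000, PCI 0x40000000 */

with the third reg entry providing a 1 MiB config window at 0xf4000000.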

Very Respectfully,
Peter Geis

>
> Now admittedly, this is with OpenBSD running on EDK2 UEFI firmware
> from
>
>   https://github.com/jaredmcneill/quartz64_uefi
>
> that I modified to pass through the device tree and modify the ranges
> as above.  But the way my OpenBSD driver sets up the address
> translation windows matches what the mainline Linux driver does.
>
> I picked the ranges above to match the EDK2 configuration.  But it is
> a setup that maximizes the 32-bit mmio window.
>
> Cheers,
>
> Mark
>
> > > I still haven't tested this with other cards yet, and another patch
> > > that does similar work I've tested successfully as well with NVMe
> > > drives. I'll have to get back to you on the results of greater
> > > testing.
> > >
> > > Very Respectfully,
> > > Peter Geis
> > >
> > > >
> > > > kind regards,
> > > >         o.
> > > >
> > > > > Very Respectfully,
> > > > > Peter Geis
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ondřej Jirman Oct. 22, 2022, 2:36 p.m. UTC | #11
On Sat, Oct 22, 2022 at 08:19:57AM -0400, Peter Geis wrote:
> On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>
> [...]
> 
> Now this is interesting. I've been reading up on PCIe ranges and what
> is necessary for things to work properly, and I found this interesting
> article from ARM:
> https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions

Thanks for the research and the link. :)

> TLDR: We need a low region (below 4g) and a high region.
> 
> From other articles I've gleaned that the config / io should probably
> also be in the low range. As such I believe the other patch that was
> sent to me may be the correct way to go. If both of you would try the
> following reg / ranges:
> 
> reg = <0x3 0xc0000000 0x0 0x00400000>,
>       <0x0 0xfe260000 0x0 0x00010000>,
>       <0x0 0xf4000000 0x0 0x00100000>;
> 
> ranges = <0x01000000 0x0 0xf4100000 0x0 0xf4100000 0x0 0x00100000>,
> <0x02000000 0x0 0xf4200000 0x0 0xf4200000 0x0 0x01e00000>,
> <0x03000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;

Tested, and it works. 

One thing to note, though, is that this results in Linux allocating address
space for all my devices in the 32-bit range, so while it works for me, the
address space above 0x3_0000_0000 has not been tested in my experiments yet.

I'll try this with some other PCIe devices, too, after I get my other
Quartz64 + PCIe bridge setup to work. Hopefully some combination of them
will hit the 64-bit MEM range.

See:

00:00.0 PCI bridge: Rockchip Electronics Co., Ltd RK3568 Remote Signal Processor (rev 01) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 1000-2fff [size=8K] [16-bit]
	Memory behind bridge: f4200000-f44fffff [size=3M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Expansion ROM at f4500000 [virtual] [disabled] [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
	Capabilities: [70] Express Root Port (Slot-), MSI 00
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Secondary PCI Express
	Capabilities: [160] L1 PM Substates
	Capabilities: [170] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
lspci: Unable to load libkmod resources: error -2

01:00.0 PCI bridge: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
	Subsystem: ASMedia Technology Inc. Device 118f
	Flags: bus master, fast devsel, latency 0, IRQ 69
	Bus: primary=01, secondary=02, subordinate=06, sec-latency=0
	I/O behind bridge: 1000-2fff [size=8K] [16-bit]
	Memory behind bridge: f4200000-f44fffff [size=3M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Power Management version 3
	Capabilities: [80] Express Upstream Port, MSI 00
	Capabilities: [c0] Subsystem: ASMedia Technology Inc. Device 118f
	Capabilities: [100] Virtual Channel
	Capabilities: [200] Advanced Error Reporting
	Capabilities: [300] Vendor Specific Information: ID=0000 Rev=0 Len=c00 <?>
	Kernel driver in use: pcieport

02:01.0 PCI bridge: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
	Subsystem: ASMedia Technology Inc. Device 118f
	Flags: bus master, fast devsel, latency 0, IRQ 70
	Bus: primary=02, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: [disabled] [32-bit]
	Memory behind bridge: f4200000-f42fffff [size=1M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Power Management version 3
	Capabilities: [80] Express Downstream Port (Slot+), MSI 00
	Capabilities: [c0] Subsystem: ASMedia Technology Inc. Device 118f
	Capabilities: [100] Virtual Channel
	Capabilities: [200] Advanced Error Reporting
	Kernel driver in use: pcieport

02:03.0 PCI bridge: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
	Subsystem: ASMedia Technology Inc. Device 118f
	Flags: bus master, fast devsel, latency 0, IRQ 71
	Bus: primary=02, secondary=04, subordinate=04, sec-latency=0
	I/O behind bridge: 1000-1fff [size=4K] [16-bit]
	Memory behind bridge: f4300000-f43fffff [size=1M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Power Management version 3
	Capabilities: [80] Express Downstream Port (Slot+), MSI 00
	Capabilities: [c0] Subsystem: ASMedia Technology Inc. Device 118f
	Capabilities: [100] Virtual Channel
	Capabilities: [200] Advanced Error Reporting
	Kernel driver in use: pcieport

02:05.0 PCI bridge: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
	Subsystem: ASMedia Technology Inc. Device 118f
	Flags: bus master, fast devsel, latency 0, IRQ 72
	Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: [disabled] [32-bit]
	Memory behind bridge: [disabled] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Power Management version 3
	Capabilities: [80] Express Downstream Port (Slot+), MSI 00
	Capabilities: [c0] Subsystem: ASMedia Technology Inc. Device 118f
	Capabilities: [100] Virtual Channel
	Capabilities: [200] Advanced Error Reporting
	Kernel driver in use: pcieport

02:07.0 PCI bridge: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
	Subsystem: ASMedia Technology Inc. Device 118f
	Flags: bus master, fast devsel, latency 0, IRQ 73
	Bus: primary=02, secondary=06, subordinate=06, sec-latency=0
	I/O behind bridge: 2000-2fff [size=4K] [16-bit]
	Memory behind bridge: f4400000-f44fffff [size=1M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Power Management version 3
	Capabilities: [80] Express Downstream Port (Slot+), MSI 00
	Capabilities: [c0] Subsystem: ASMedia Technology Inc. Device 118f
	Capabilities: [100] Virtual Channel
	Capabilities: [200] Advanced Error Reporting
	Kernel driver in use: pcieport

03:00.0 Non-Volatile memory controller: Phison Electronics Corporation PS5013 E13 NVMe Controller (rev 01) (prog-if 02 [NVM Express])
	Subsystem: Phison Electronics Corporation PS5013 E13 NVMe Controller
	Flags: bus master, fast devsel, latency 0, NUMA node 0
	Memory at f4200000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [80] Express Endpoint, MSI 00
	Capabilities: [d0] MSI-X: Enable+ Count=9 Masked-
	Capabilities: [e0] MSI: Enable- Count=1/8 Maskable+ 64bit+
	Capabilities: [f8] Power Management version 3
	Capabilities: [100] Latency Tolerance Reporting
	Capabilities: [110] L1 PM Substates
	Capabilities: [200] Advanced Error Reporting
	Capabilities: [300] Secondary PCI Express
	Kernel driver in use: nvme

04:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8822CE 802.11ac PCIe Wireless Network Adapter
	Subsystem: Hewlett-Packard Company Device 85f7
	Flags: bus master, fast devsel, latency 0, IRQ 75
	I/O ports at 1000 [size=256]
	Memory at f4300000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-e0-4c-ff-fe-c8-22-01
	Capabilities: [158] Latency Tolerance Reporting
	Capabilities: [160] L1 PM Substates
	Kernel driver in use: rtw_8822ce

06:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8852AE 802.11ax PCIe Wireless Network Adapter
	Subsystem: Hewlett-Packard Company Device 88e1
	Flags: bus master, fast devsel, latency 0, IRQ 76
	I/O ports at 2000 [size=256]
	Memory at f4400000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-e0-4c-ff-fe-88-52-01
	Capabilities: [158] Latency Tolerance Reporting
	Capabilities: [160] L1 PM Substates
	Kernel driver in use: rtw89_8852ae

kind regards,
	o.


> Very Respectfully,
> Peter Geis
> 
> >
> > Now admittedly, this is with OpenBSD running on EDK2 UEFI firmware
> > from
> >
> >   https://github.com/jaredmcneill/quartz64_uefi
> >
> > that I modified to pass through the device tree and modify the ranges
> > as above.  But the way my OpenBSD driver sets up the address
> > translation windows matches what the mainline Linux driver does.
> >
> > I picked the ranges above to match the EDK2 configuration.  But it is
> > a setup that maximizes the 32-bit mmio window.
> >
> > Cheers,
> >
> > Mark
> >
> > > > I still haven't tested this with other cards yet, and another patch
> > > > that does similar work I've tested successfully as well with NVMe
> > > > drives. I'll have to get back to you on the results of greater
> > > > testing.
> > > >
> > > > Very Respectfully,
> > > > Peter Geis
> > > >
> > > > >
> > > > > kind regards,
> > > > >         o.
> > > > >
> > > > > > Very Respectfully,
> > > > > > Peter Geis
> > >
> > > _______________________________________________
> > > linux-arm-kernel mailing list
> > > linux-arm-kernel@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Ondřej Jirman Oct. 22, 2022, 4:41 p.m. UTC | #12
On Sat, Oct 22, 2022 at 08:19:57AM -0400, Peter Geis wrote:
>
> [...]
> 
> reg = <0x3 0xc0000000 0x0 0x00400000>,
>       <0x0 0xfe260000 0x0 0x00010000>,
>       <0x0 0xf4000000 0x0 0x00100000>;
> 
> ranges = <0x01000000 0x0 0xf4100000 0x0 0xf4100000 0x0 0x00100000>,
> <0x02000000 0x0 0xf4200000 0x0 0xf4200000 0x0 0x01e00000>,
> <0x03000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;

... more data:

Diff of my v2 vs the above ranges for my 2xwifi card + nvme setup:

--- switch.lspciv	2022-10-05 10:36:33.924838688 +0200
+++ switch-pg.lspciv	2022-10-22 18:30:33.412025097 +0200
@@ -5 +5 @@
-	Memory behind bridge: 00000000-002fffff [size=3M] [32-bit]
+	Memory behind bridge: f4200000-f44fffff [size=3M] [32-bit]
@@ -7 +7 @@
-	Expansion ROM at 300300000 [virtual] [disabled] [size=64K]
+	Expansion ROM at f4500000 [virtual] [disabled] [size=64K]
@@ -22 +22 @@
-	Memory behind bridge: 00000000-002fffff [size=3M] [32-bit]
+	Memory behind bridge: f4200000-f44fffff [size=3M] [32-bit]
@@ -38 +38 @@
-	Memory behind bridge: 00000000-000fffff [size=1M] [32-bit]
+	Memory behind bridge: f4200000-f42fffff [size=1M] [32-bit]
@@ -53 +53 @@
-	Memory behind bridge: 00100000-001fffff [size=1M] [32-bit]
+	Memory behind bridge: f4300000-f43fffff [size=1M] [32-bit]
@@ -83 +83 @@
-	Memory behind bridge: 00200000-002fffff [size=1M] [32-bit]
+	Memory behind bridge: f4400000-f44fffff [size=1M] [32-bit]
@@ -96 +96 @@
-	Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
+	Memory at f4200000 (64-bit, non-prefetchable) [size=16K]
@@ -111 +111 @@
-	Memory at 300100000 (64-bit, non-prefetchable) [size=64K]
+	Memory at f4300000 (64-bit, non-prefetchable) [size=64K]
@@ -123 +123 @@
-	Flags: bus master, fast devsel, latency 0, IRQ 80
+	Flags: bus master, fast devsel, latency 0, IRQ 76
@@ -125 +125 @@
-	Memory at 300200000 (64-bit, non-prefetchable) [size=1M]
+	Memory at f4400000 (64-bit, non-prefetchable) [size=1M]

(not so dramatic differences)

But for SATA card + USB card + 2-port intel ethernet card, it's
massively better:

--- fullpci-my.lspciv	2022-10-15 17:16:55.002000065 +0200
+++ fullpci-pg.lspciv	2022-10-15 17:15:09.837000015 +0200
@@ -5 +5 @@
-	Memory behind bridge: [disabled] [32-bit]
+	Memory behind bridge: f4200000-f55fffff [size=20M] [32-bit]
@@ -7 +7 @@
-	Expansion ROM at 300000000 [virtual] [disabled] [size=64K]
+	Expansion ROM at f5600000 [virtual] [disabled] [size=64K]
@@ -22 +22 @@
-	Memory behind bridge: [disabled] [32-bit]
+	Memory behind bridge: f4200000-f55fffff [size=20M] [32-bit]
@@ -38 +38 @@
-	Memory behind bridge: [disabled] [32-bit]
+	Memory behind bridge: f5400000-f54fffff [size=1M] [32-bit]
@@ -53 +53 @@
-	Memory behind bridge: [disabled] [32-bit]
+	Memory behind bridge: f5500000-f55fffff [size=1M] [32-bit]
@@ -83 +83 @@
-	Memory behind bridge: [disabled] [32-bit]
+	Memory behind bridge: f4200000-f53fffff [size=18M] [32-bit]
@@ -95 +95,4 @@
-	Flags: fast devsel
+	Flags: bus master, fast devsel, latency 0, IRQ 74
+	Memory at f5480000 (32-bit, non-prefetchable) [size=8K]
+	Memory at f5482000 (32-bit, non-prefetchable) [size=8K]
+	Expansion ROM at f5400000 [virtual] [disabled] [size=512K]
@@ -97 +100 @@
-	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
+	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
@@ -100,0 +104 @@
+	Kernel driver in use: ahci
@@ -104 +108,2 @@
-	Flags: fast devsel
+	Flags: bus master, fast devsel, latency 0, IRQ 75
+	Memory at f5500000 (64-bit, non-prefetchable) [size=4K]
@@ -106 +111 @@
-	Capabilities: [90] MSI: Enable- Count=1/4 Maskable- 64bit+
+	Capabilities: [90] MSI: Enable+ Count=1/4 Maskable- 64bit+
@@ -108,0 +114 @@
+	Kernel driver in use: xhci_hcd
@@ -112 +118,3 @@
-	Flags: fast devsel
+	Flags: bus master, fast devsel, latency 0
+	Memory at f4200000 (32-bit, non-prefetchable) [size=128K]
+	Memory at f4400000 (32-bit, non-prefetchable) [size=4M]
@@ -113,0 +122,2 @@
+	Memory at f4240000 (32-bit, non-prefetchable) [size=16K]
+	Expansion ROM at f4800000 [virtual] [disabled] [size=4M]
@@ -116 +126 @@
-	Capabilities: [70] MSI-X: Enable- Count=10 Masked-
+	Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
@@ -121,0 +132 @@
+	Kernel driver in use: igb
@@ -125,2 +136,6 @@
-	Flags: fast devsel
-	I/O ports at 1020 [disabled] [size=32]
+	Flags: bus master, fast devsel, latency 0, IRQ 85
+	Memory at f4220000 (32-bit, non-prefetchable) [size=128K]
+	Memory at f4c00000 (32-bit, non-prefetchable) [size=4M]
+	I/O ports at 1020 [size=32]
+	Memory at f4284000 (32-bit, non-prefetchable) [size=16K]
+	Expansion ROM at f5000000 [virtual] [disabled] [size=4M]
@@ -128 +143 @@
-	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
+	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
@@ -134,0 +150 @@
+	Kernel driver in use: igb

(Full output https://megous.com/dl/tmp/fullpci-pg.lspciv)

So it's still not testing the 0x3_0000_0000 range, but as far as I'm
concerned, it works with whatever I can throw at it (the 7 different
PCIe devices I have, including combinations of them behind a 4-port PCIe switch).

The best reg/ranges combination so far. ;)

Tested-by: Ondrej Jirman <megi@xff.cz>

kind regards,
	o.

> Very Respectfully,
> Peter Geis
Mark Kettenis Oct. 22, 2022, 5:24 p.m. UTC | #13
> From: Peter Geis <pgwipeout@gmail.com>
> Date: Sat, 22 Oct 2022 08:19:57 -0400

Hello Peter,

> On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >
> > > Date: Fri, 21 Oct 2022 21:32:48 +0200
> > > From: Ondřej Jirman <megi@xff.cz>
> > >
> > > On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
> > > > On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
> > > > >
> > > > > On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> > > > > > Good Morning Heiko,
> > > > > >
> > > > > > Apologies for just getting to this, I'm still in the middle of moving
> > > > > > and just got my lab set back up.
> > > > > >
> > > > > > I've tested this patch series and it leads to the same regression with
> > > > > > NVMe drives. A loop of md5sum on two identical 4GB random files
> > > > > > produces the following results:
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > > fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> > > > > > 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> > > > > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > > > > 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> > > > > > 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> > > > > > b9637505bf88ed725f6d03deb7065dab  test-rand.img
> > > > > > f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> > > > > >
> > > > > > Before this patch series:
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> > > > > > d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> > > > > >
> > > > > > Though I do love where this patch is going and would like to see if it
> > > > > > can be made to work, in its current form it does not.
> > > > >
> > > > > Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> > > > > of your nvme drive, so that we can see allocated address ranges, etc.
> > > >
> > > > Good catch, with your patch as is, the following issue crops up:
> > > > Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
> > > > Region 2: I/O ports at 1000 [disabled] [size=256]
> > > >
> > > > However, with a simple fix, we can get this:
> > > > Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
> > > > Region 2: I/O ports at 1000 [virtual] [size=256]
> > > >
> > > > and with it a working NVMe drive.
> > > >
> > > > Change the following range:
> > > > 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> > > > to
> > > > 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
> > >
> > > I've already tried this, but this unfrotunately breaks the wifi cards.
> > > (those only use the I/O space) Maybe because I/O and memory address spaces
> > > now overlap, I don't know. That's why I used the 1GiB offset for memory
> > > space.
> >
> > Meanwhile, I have an NVMe drive that only works if mmio is completely
> > untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
> > Motion SM2260 controller.
> >
> > So for me, a working configuration has the following "ranges":
> >
> > ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
> >          <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
> >          <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;
> >
> > This also needs changes to the "reg" propery:
> >
> > reg = <0x3 0xc0000000 0x0 0x00400000>,
> >       <0x0 0xfe260000 0x0 0x00010000>,
> >       <0x3 0x00000000 0x0 0x10000000>;
> 
> Now this is interesting. I've been reading up on PCIe ranges and what
> is necessary for things to work properly, and I found this interesting
> article from ARM:
> https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions
> 
> TLDR: We need a low region (below 4g) and a high region.

Well, that description applies to a specific ARM reference design.
And it appears that the PCIe-RC used in that reference design does not
support address translation.

The Synopsys DesignWare PCIe-RC implementation used on the RockChip
RK35xx SoCs does support address translation.  But some of the results
we're seeing suggest that this feature is subtly broken for the
RockChip implementation.

> From other articles I've gleaned that the config / io should probably
> also be in the low range. As such I believe the other patch that was
> sent to me may be the correct way to go. If both of you would try the
> following reg / ranges:
> 
> reg = <0x3 0xc0000000 0x0 0x00400000>,
>       <0x0 0xfe260000 0x0 0x00010000>,
>       <0x0 0xf4000000 0x0 0x00100000>;
> 
> ranges = <0x01000000 0x0 0xf4100000 0x0 0xf4100000 0x0 0x00100000>,
> <0x02000000 0x0 0xf4200000 0x0 0xf4200000 0x0 0x01e00000>,
> <0x03000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;

So that matches the configuration used by RockChip in their downstream
kernel and u-boot:

  https://github.com/rockchip-linux/kernel/blob/develop-5.10/arch/arm64/boot/dts/rockchip/rk3568.dtsi#L2382

That probably means this config has received testing in the wild.

I tried this configuration on my board during my earlier experiments,
and it works.

One downside of this configuration is that it uses 32-bit IO
addresses.  Support for 32-bit IO addresses is not universal since the
x86 INB and OUTB instructions only support a 16-bit address space.
But if translation is indeed broken for IO in the same way as MMIO,
that might be the best you can do.
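
To spell out the difference in DT terms, here are the two I/O entries
already quoted above, side by side (just a restatement for comparison,
not a new proposal).  The first keeps PCI I/O bus addresses within the
16-bit 0x0000-0xffff range; the second puts them at 32-bit bus
addresses:

    /* 64 KiB of I/O space at PCI bus address 0x0000 (16-bit friendly) */
    ranges = <0x01000000 0x0 0x00000000  0x3 0x3fff0000  0x0 0x00010000>;

    /* 1 MiB of I/O space at PCI bus address 0xf4100000 (32-bit only) */
    ranges = <0x01000000 0x0 0xf4100000  0x0 0xf4100000  0x0 0x00100000>;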

> > Now admittedly, this is with OpenBSD running on EDK2 UEFI firmware
> > from
> >
> >   https://github.com/jaredmcneill/quartz64_uefi
> >
> > that I modified to pass through the device tree and modify the ranges
> > as above.  But the way my OpenBSD driver sets up the address
> > translation windows matches what the mainline Linux driver does.
> >
> > I picked the ranges above to match the EDK2 configuration.  But it is
> > a setup that maximizes the 32-bit mmio window.
> >
> > Cheers,
> >
> > Mark
> >
> > > > I still haven't tested this with other cards yet, and another patch
> > > > that does similar work I've tested successfully as well with NVMe
> > > > drives. I'll have to get back to you on the results of greater
> > > > testing.
> > > >
> > > > Very Respectfully,
> > > > Peter Geis
> > > >
> > > > >
> > > > > kind regards,
> > > > >         o.
> > > > >
> > > > > > Very Respectfully,
> > > > > > Peter Geis
> > >
> > > _______________________________________________
> > > linux-arm-kernel mailing list
> > > linux-arm-kernel@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Robin Murphy Oct. 24, 2022, 11:05 a.m. UTC | #14
On 2022-10-22 18:24, Mark Kettenis wrote:
>> From: Peter Geis <pgwipeout@gmail.com>
>> Date: Sat, 22 Oct 2022 08:19:57 -0400
> 
> Hello Peter,
> 
>> On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>
>>>> Date: Fri, 21 Oct 2022 21:32:48 +0200
>>>> From: Ondřej Jirman <megi@xff.cz>
>>>>
>>>> On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
>>>>> On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
>>>>>>
>>>>>> On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
>>>>>>> Good Morning Heiko,
>>>>>>>
>>>>>>> Apologies for just getting to this, I'm still in the middle of moving
>>>>>>> and just got my lab set back up.
>>>>>>>
>>>>>>> I've tested this patch series and it leads to the same regression with
>>>>>>> NVMe drives. A loop of md5sum on two identical 4GB random files
>>>>>>> produces the following results:
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>> fad97e91da8d4fd554c895cafa89809b  test-rand2.img
>>>>>>> 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
>>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
>>>>>>> 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
>>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
>>>>>>> b9637505bf88ed725f6d03deb7065dab  test-rand.img
>>>>>>> f7437e88d524ea92e097db51dce1c60d  test-rand2.img
>>>>>>>
>>>>>>> Before this patch series:
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>>
>>>>>>> Though I do love where this patch is going and would like to see if it
>>>>>>> can be made to work, in its current form it does not.
>>>>>>
>>>>>> Thanks for the test. Can you please also test v1? Also please share lspci -vvv
>>>>>> of your nvme drive, so that we can see allocated address ranges, etc.
>>>>>
>>>>> Good catch, with your patch as is, the following issue crops up:
>>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
>>>>> Region 2: I/O ports at 1000 [disabled] [size=256]
>>>>>
>>>>> However, with a simple fix, we can get this:
>>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
>>>>> Region 2: I/O ports at 1000 [virtual] [size=256]
>>>>>
>>>>> and with it a working NVMe drive.
>>>>>
>>>>> Change the following range:
>>>>> 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
>>>>> to
>>>>> 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
>>>>
>>>> I've already tried this, but this unfrotunately breaks the wifi cards.
>>>> (those only use the I/O space) Maybe because I/O and memory address spaces
>>>> now overlap, I don't know. That's why I used the 1GiB offset for memory
>>>> space.
>>>
>>> Meanwhile, I have an NVMe drive that only works if mmio is completely
>>> untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
>>> Motion SM2260 controller.
>>>
>>> So for me, a working configuration has the following "ranges":
>>>
>>> ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
>>>           <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
>>>           <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;
>>>
>>> This also needs changes to the "reg" propery:
>>>
>>> reg = <0x3 0xc0000000 0x0 0x00400000>,
>>>        <0x0 0xfe260000 0x0 0x00010000>,
>>>        <0x3 0x00000000 0x0 0x10000000>;
>>
>> Now this is interesting. I've been reading up on PCIe ranges and what
>> is necessary for things to work properly, and I found this interesting
>> article from ARM:
>> https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions
>>
>> TLDR: We need a low region (below 4g) and a high region.
> 
> Well, that description applies to a specific ARM reference design.
> And it appears that the PCIe-RC used in that reference design does not
> support address translation.

Indeed, that's not an "interesting article", it's just documentation for 
some other system that isn't this one. In fact it's a system that 
strictly doesn't even *have* PCIe; the reference designs are not 
complete SoCs, and all that is being described there is the interconnect 
address map for the parts which are in place ready for a customer to 
stitch their choice of PCIe implementation to.

The equivalent for RK3568 is that you *do* have "low" and "high" PCIe 
windows at 0xfx000000 and 0x3xxx00000 respectively in the system 
interconnect address map. How the PCIe controllers choose to relate 
those system MMIO addresses to PCI Memory, I/O and Config space
addresses is another matter entirely.
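
For anyone following along who is less familiar with the binding: each
"ranges" entry in the snippets quoted above decodes as <phys.hi>
<PCI bus address> <CPU address> <size>, where phys.hi selects the space
(0x01000000 = I/O, 0x02000000 = 32-bit memory, 0x03000000 = 64-bit
memory, with bit 30 marking a window prefetchable).  Annotating one of
the entries from Mark's configuration quoted above, purely as a reading
aid:

    ranges = <0x02000000        /* 32-bit non-prefetchable memory space */
              0x0 0xf4000000    /* PCI bus address */
              0x0 0xf4000000    /* CPU (system) address in the low window */
              0x0 0x02000000>;  /* size: 32 MiB */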

Robin.
Peter Geis Oct. 24, 2022, 8:16 p.m. UTC | #15
On Mon, Oct 24, 2022 at 7:05 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2022-10-22 18:24, Mark Kettenis wrote:
> >> From: Peter Geis <pgwipeout@gmail.com>
> >> Date: Sat, 22 Oct 2022 08:19:57 -0400
> >
> > Hello Peter,
> >
> >> On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >>>
> >>>> Date: Fri, 21 Oct 2022 21:32:48 +0200
> >>>> From: Ondřej Jirman <megi@xff.cz>
> >>>>
> >>>> On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
> >>>>> On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
> >>>>>>
> >>>>>> On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> >>>>>>> Good Morning Heiko,
> >>>>>>>
> >>>>>>> Apologies for just getting to this, I'm still in the middle of moving
> >>>>>>> and just got my lab set back up.
> >>>>>>>
> >>>>>>> I've tested this patch series and it leads to the same regression with
> >>>>>>> NVMe drives. A loop of md5sum on two identical 4GB random files
> >>>>>>> produces the following results:
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>> fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> >>>>>>> 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> >>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> >>>>>>> 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> >>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> >>>>>>> b9637505bf88ed725f6d03deb7065dab  test-rand.img
> >>>>>>> f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> >>>>>>>
> >>>>>>> Before this patch series:
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>>
> >>>>>>> Though I do love where this patch is going and would like to see if it
> >>>>>>> can be made to work, in its current form it does not.
> >>>>>>
> >>>>>> Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> >>>>>> of your nvme drive, so that we can see allocated address ranges, etc.
> >>>>>
> >>>>> Good catch, with your patch as is, the following issue crops up:
> >>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
> >>>>> Region 2: I/O ports at 1000 [disabled] [size=256]
> >>>>>
> >>>>> However, with a simple fix, we can get this:
> >>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
> >>>>> Region 2: I/O ports at 1000 [virtual] [size=256]
> >>>>>
> >>>>> and with it a working NVMe drive.
> >>>>>
> >>>>> Change the following range:
> >>>>> 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> >>>>> to
> >>>>> 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
> >>>>
> >>>> I've already tried this, but this unfrotunately breaks the wifi cards.
> >>>> (those only use the I/O space) Maybe because I/O and memory address spaces
> >>>> now overlap, I don't know. That's why I used the 1GiB offset for memory
> >>>> space.
> >>>
> >>> Meanwhile, I have an NVMe drive that only works if mmio is completely
> >>> untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
> >>> Motion SM2260 controller.
> >>>
> >>> So for me, a working configuration has the following "ranges":
> >>>
> >>> ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
> >>>           <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
> >>>           <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;
> >>>
> >>> This also needs changes to the "reg" propery:
> >>>
> >>> reg = <0x3 0xc0000000 0x0 0x00400000>,
> >>>        <0x0 0xfe260000 0x0 0x00010000>,
> >>>        <0x3 0x00000000 0x0 0x10000000>;
> >>
> >> Now this is interesting. I've been reading up on PCIe ranges and what
> >> is necessary for things to work properly, and I found this interesting
> >> article from ARM:
> >> https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions
> >>
> >> TLDR: We need a low region (below 4g) and a high region.
> >
> > Well, that description applies to a specific ARM reference design.
> > And it appears that the PCIe-RC used in that reference design does not
> > support address translation.
>
> Indeed, that's not an "interesting article", it's just documentation for
> some other system that isn't this one. In fact it's a system that
> strictly doesn't even *have* PCIe; the reference designs are not
> complete SoCs, and all that is being described there is the interconnect
> address map for the parts which are in place ready for a customer to
> stitch their choice of PCIe implementation to.
>
> The equivalent for RK3568 is that you *do* have "low" and "high" PCIe
> windows at 0xfx000000 and 0x3xxx00000 respectively in the system
> interconnect address map. How the PCIe controllers choose to relate
> those system MMIO addresses to those to PCI Memory, I/O and Config space
> addresses is another matter entirely.

Unfortunately we are working with insufficient documentation and
without the detailed understanding of a system integrator here. I'm
fully aware that the Neoverse N2 is not the rk3568, however
significant chunks of the rk3568 are based on ARM IP. Looking at how
ARM expects things to work by comparing their reference documents to
the hardware we have on hand is helpful in determining what we are
lacking.

The specific portions of the documentation that I found useful are not
the memory maps, but the generic descriptions of expected PCIe
regions. Combining those with other reference documents (unfortunately
most x86 based, but we have the unfortunate reality that PCIe has a
lot of x86isms to deal with) is quite enlightening. I've been pinging
various representatives of the IP and implementation on the mailing
list about these issues for about a year now with no responses from
the Designware folk. You have been pretty much one of the only individuals
with the level of knowledge we need to respond, and I thank you for
that.

Based on what I've read I suspect that at least one of the two
following statements is true:
a. Mark is correct that translation is broken in Rockchip's
implementation (unknown if this is a SoC or a driver issue)
b. We do in fact require IO and Config to be 32 bit addressable to be
fully compatible.

These issues are compounded in rk3588 where we have much smaller
regions in the 32bit space for PCIe, so a definite answer on the true
requirements and limitations would be quite helpful.

As always, thank you for your time,
Peter

>
> Robin.
Robin Murphy Oct. 25, 2022, 10:29 a.m. UTC | #16
On 2022-10-24 21:16, Peter Geis wrote:
> On Mon, Oct 24, 2022 at 7:05 AM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 2022-10-22 18:24, Mark Kettenis wrote:
>>>> From: Peter Geis <pgwipeout@gmail.com>
>>>> Date: Sat, 22 Oct 2022 08:19:57 -0400
>>>
>>> Hello Peter,
>>>
>>>> On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>>
>>>>>> Date: Fri, 21 Oct 2022 21:32:48 +0200
>>>>>> From: Ondřej Jirman <megi@xff.cz>
>>>>>>
>>>>>> On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
>>>>>>> On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
>>>>>>>>
>>>>>>>> On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
>>>>>>>>> Good Morning Heiko,
>>>>>>>>>
>>>>>>>>> Apologies for just getting to this, I'm still in the middle of moving
>>>>>>>>> and just got my lab set back up.
>>>>>>>>>
>>>>>>>>> I've tested this patch series and it leads to the same regression with
>>>>>>>>> NVMe drives. A loop of md5sum on two identical 4GB random files
>>>>>>>>> produces the following results:
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>>>> fad97e91da8d4fd554c895cafa89809b  test-rand2.img
>>>>>>>>> 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
>>>>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
>>>>>>>>> 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
>>>>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
>>>>>>>>> b9637505bf88ed725f6d03deb7065dab  test-rand.img
>>>>>>>>> f7437e88d524ea92e097db51dce1c60d  test-rand2.img
>>>>>>>>>
>>>>>>>>> Before this patch series:
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
>>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
>>>>>>>>>
>>>>>>>>> Though I do love where this patch is going and would like to see if it
>>>>>>>>> can be made to work, in its current form it does not.
>>>>>>>>
>>>>>>>> Thanks for the test. Can you please also test v1? Also please share lspci -vvv
>>>>>>>> of your nvme drive, so that we can see allocated address ranges, etc.
>>>>>>>
>>>>>>> Good catch, with your patch as is, the following issue crops up:
>>>>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
>>>>>>> Region 2: I/O ports at 1000 [disabled] [size=256]
>>>>>>>
>>>>>>> However, with a simple fix, we can get this:
>>>>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
>>>>>>> Region 2: I/O ports at 1000 [virtual] [size=256]
>>>>>>>
>>>>>>> and with it a working NVMe drive.
>>>>>>>
>>>>>>> Change the following range:
>>>>>>> 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
>>>>>>> to
>>>>>>> 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
>>>>>>
>>>>>> I've already tried this, but this unfrotunately breaks the wifi cards.
>>>>>> (those only use the I/O space) Maybe because I/O and memory address spaces
>>>>>> now overlap, I don't know. That's why I used the 1GiB offset for memory
>>>>>> space.
>>>>>
>>>>> Meanwhile, I have an NVMe drive that only works if mmio is completely
>>>>> untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
>>>>> Motion SM2260 controller.
>>>>>
>>>>> So for me, a working configuration has the following "ranges":
>>>>>
>>>>> ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
>>>>>            <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
>>>>>            <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;
>>>>>
>>>>> This also needs changes to the "reg" propery:
>>>>>
>>>>> reg = <0x3 0xc0000000 0x0 0x00400000>,
>>>>>         <0x0 0xfe260000 0x0 0x00010000>,
>>>>>         <0x3 0x00000000 0x0 0x10000000>;
>>>>
>>>> Now this is interesting. I've been reading up on PCIe ranges and what
>>>> is necessary for things to work properly, and I found this interesting
>>>> article from ARM:
>>>> https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions
>>>>
>>>> TLDR: We need a low region (below 4g) and a high region.
>>>
>>> Well, that description applies to a specific ARM reference design.
>>> And it appears that the PCIe-RC used in that reference design does not
>>> support address translation.
>>
>> Indeed, that's not an "interesting article", it's just documentation for
>> some other system that isn't this one. In fact it's a system that
>> strictly doesn't even *have* PCIe; the reference designs are not
>> complete SoCs, and all that is being described there is the interconnect
>> address map for the parts which are in place ready for a customer to
>> stitch their choice of PCIe implementation to.
>>
>> The equivalent for RK3568 is that you *do* have "low" and "high" PCIe
>> windows at 0xfx000000 and 0x3xxx00000 respectively in the system
>> interconnect address map. How the PCIe controllers choose to relate
>> those system MMIO addresses to those to PCI Memory, I/O and Config space
>> addresses is another matter entirely.
> 
> Unfortunately we are working with insufficient documentation and
> without the detailed understanding of a system integrator here. I'm
> fully aware that the Neoverse N2 is not the rk3568, however
> significant chunks of the rk3568 are based on ARM IP. Looking at how
> ARM expects things to work by comparing their reference documents to
> the hardware we have on hand is helpful in determining what we are
> lacking.
> 
> The specific portions of the documentation that I found useful are not
> the memory maps, but the generic descriptions of expected PCIe
> regions. Combining those with other reference documents (unfortunately
> most x86 based, but we have the unfortunate reality that PCIe has a
> lot of x86isms to deal with) is quite enlightening.

OK, but you're looking at the wrong place for that. The only actual 
relevant reference would be rule PCI_MM_06 in the BSA[1], which says 
that PCI memory space should not be translated relative to the system 
address map. It is hopefully obvious that 32-bit devices need 32-bit PCI 
mem space to assign to their BARs, thus it falls out that if there is no 
translation, that requires a 32-bit window in system address space too.
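
In DT terms the untranslated case is simply the one where the PCI bus
address cells and the CPU address cells of a memory entry are
identical.  Mark's working configuration quoted earlier in this thread
has exactly that shape; repeated here only to illustrate "no
translation", not as a recommendation:

    /* 32-bit memory window, identity-mapped below 4 GiB */
    ranges = <0x02000000 0x0 0xf4000000  0x0 0xf4000000  0x0 0x02000000>,
    /* 64-bit memory window, identity-mapped in the high region */
             <0x03000000 0x3 0x10000000  0x3 0x10000000  0x0 0x2fff0000>;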

That is of course speaking of a BSA-compliant system. Vendors are still 
free to not care about BSA and do whatever the heck they want.

Thanks,
Robin.

[1] https://developer.arm.com/documentation/den0094/latest/

> I've been pinging
> various representatives of the IP and implementation on the mailing
> list about these issues for about a year now with no responses from
> the Designware folk. You have been pretty one of the only individuals
> with the level of knowledge we need to respond and I thank you for
> that.
> 
> Based on what I've read I suspect that at least one of the two
> following statements is true:
> a. Mark is correct that translation is broken in Rockchip's
> implementation (unknown if this is a SoC or a driver issue)
> b. We do in fact require IO and Config to be 32 bit addressable to be
> fully compatible.
> 
> These issues are compounded in rk3588 where we have much smaller
> regions in the 32bit space for PCIe, so a definite answer on the true
> requirements and limitations would be quite helpful.
> 
> As always, thank you for your time,
> Peter
> 
>>
>> Robin.
Mark Kettenis March 12, 2023, 8:13 p.m. UTC | #17
> Date: Tue, 25 Oct 2022 11:29:12 +0100
> From: Robin Murphy <robin.murphy@arm.com>

Reviving this old thread as rk3566 support in mainline U-Boot is
getting usable, which prompted me to look at the PCIe address map
issues again.

> On 2022-10-24 21:16, Peter Geis wrote:
> > On Mon, Oct 24, 2022 at 7:05 AM Robin Murphy <robin.murphy@arm.com> wrote:
> >>
> >> On 2022-10-22 18:24, Mark Kettenis wrote:
> >>>> From: Peter Geis <pgwipeout@gmail.com>
> >>>> Date: Sat, 22 Oct 2022 08:19:57 -0400
> >>>
> >>> Hello Peter,
> >>>
> >>>> On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >>>>>
> >>>>>> Date: Fri, 21 Oct 2022 21:32:48 +0200
> >>>>>> From: Ondřej Jirman <megi@xff.cz>
> >>>>>>
> >>>>>> On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
> >>>>>>> On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xff.cz> wrote:
> >>>>>>>>
> >>>>>>>> On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
> >>>>>>>>> Good Morning Heiko,
> >>>>>>>>>
> >>>>>>>>> Apologies for just getting to this, I'm still in the middle of moving
> >>>>>>>>> and just got my lab set back up.
> >>>>>>>>>
> >>>>>>>>> I've tested this patch series and it leads to the same regression with
> >>>>>>>>> NVMe drives. A loop of md5sum on two identical 4GB random files
> >>>>>>>>> produces the following results:
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>>>> fad97e91da8d4fd554c895cafa89809b  test-rand2.img
> >>>>>>>>> 2d56a7baa05c38535f4c19a2b371f90a  test-rand.img
> >>>>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> >>>>>>>>> 25cfcfecf4dd529e4e9fbbe2be482053  test-rand.img
> >>>>>>>>> 74e8e6f93d7c3dc3ad250e91176f5901  test-rand2.img
> >>>>>>>>> b9637505bf88ed725f6d03deb7065dab  test-rand.img
> >>>>>>>>> f7437e88d524ea92e097db51dce1c60d  test-rand2.img
> >>>>>>>>>
> >>>>>>>>> Before this patch series:
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand.img
> >>>>>>>>> d11cf0caa541b72551ca22dc5bef2de0  test-rand2.img
> >>>>>>>>>
> >>>>>>>>> Though I do love where this patch is going and would like to see if it
> >>>>>>>>> can be made to work, in its current form it does not.
> >>>>>>>>
> >>>>>>>> Thanks for the test. Can you please also test v1? Also please share lspci -vvv
> >>>>>>>> of your nvme drive, so that we can see allocated address ranges, etc.
> >>>>>>>
> >>>>>>> Good catch, with your patch as is, the following issue crops up:
> >>>>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
> >>>>>>> Region 2: I/O ports at 1000 [disabled] [size=256]
> >>>>>>>
> >>>>>>> However, with a simple fix, we can get this:
> >>>>>>> Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
> >>>>>>> Region 2: I/O ports at 1000 [virtual] [size=256]
> >>>>>>>
> >>>>>>> and with it a working NVMe drive.
> >>>>>>>
> >>>>>>> Change the following range:
> >>>>>>> 0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
> >>>>>>> to
> >>>>>>> 0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;
> >>>>>>
> >>>>>> I've already tried this, but this unfrotunately breaks the wifi cards.
> >>>>>> (those only use the I/O space) Maybe because I/O and memory address spaces
> >>>>>> now overlap, I don't know. That's why I used the 1GiB offset for memory
> >>>>>> space.
> >>>>>
> >>>>> Meanwhile, I have an NVMe drive that only works if mmio is completely
> >>>>> untranslated.  This is an ADATA SX8000NP drive, which uses a Silicon
> >>>>> Motion SM2260 controller.
> >>>>>
> >>>>> So for me, a working configuration has the following "ranges":
> >>>>>
> >>>>> ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
> >>>>>            <0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
> >>>>>            <0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;
> >>>>>
> >>>>> This also needs changes to the "reg" propery:
> >>>>>
> >>>>> reg = <0x3 0xc0000000 0x0 0x00400000>,
> >>>>>         <0x0 0xfe260000 0x0 0x00010000>,
> >>>>>         <0x3 0x00000000 0x0 0x10000000>;
> >>>>
> >>>> Now this is interesting. I've been reading up on PCIe ranges and what
> >>>> is necessary for things to work properly, and I found this interesting
> >>>> article from ARM:
> >>>> https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions
> >>>>
> >>>> TLDR: We need a low region (below 4g) and a high region.
> >>>
> >>> Well, that description applies to a specific ARM reference design.
> >>> And it appears that the PCIe-RC used in that reference design does not
> >>> support address translation.
> >>
> >> Indeed, that's not an "interesting article", it's just documentation for
> >> some other system that isn't this one. In fact it's a system that
> >> strictly doesn't even *have* PCIe; the reference designs are not
> >> complete SoCs, and all that is being described there is the interconnect
> >> address map for the parts which are in place ready for a customer to
> >> stitch their choice of PCIe implementation to.
> >>
> >> The equivalent for RK3568 is that you *do* have "low" and "high" PCIe
> >> windows at 0xfx000000 and 0x3xxx00000 respectively in the system
> >> interconnect address map. How the PCIe controllers choose to relate
> >> those system MMIO addresses to those to PCI Memory, I/O and Config space
> >> addresses is another matter entirely.
> > 
> > Unfortunately we are working with insufficient documentation and
> > without the detailed understanding of a system integrator here. I'm
> > fully aware that the Neoverse N2 is not the rk3568, however
> > significant chunks of the rk3568 are based on ARM IP. Looking at how
> > ARM expects things to work by comparing their reference documents to
> > the hardware we have on hand is helpful in determining what we are
> > lacking.
> > 
> > The specific portions of the documentation that I found useful are not
> > the memory maps, but the generic descriptions of expected PCIe
> > regions. Combining those with other reference documents (unfortunately
> > most x86 based, but we have the unfortunate reality that PCIe has a
> > lot of x86isms to deal with) is quite enlightening.
> 
> OK, but you're looking at the wrong place for that. The only actual 
> relevant reference would be rule PCI_MM_06 in the BSA[1], which says 
> that PCI memory space should not be translated relative to the system 
> address map. It is hopefully obvious that 32-bit devices need 32-bit PCI 
> mem space to assign to their BARs, thus it falls out that if there is no 
> translation, that requires a 32-bit window in system address space too.
> 
> That is of course speaking of a BSA-compliant system. Vendors are still 
> free to not care about BSA and do whatever the heck they want.
> 
> Thanks,
> Robin.
> 
> [1] https://developer.arm.com/documentation/den0094/latest/
> 
> > I've been pinging
> > various representatives of the IP and implementation on the mailing
> > list about these issues for about a year now with no responses from
> > the Designware folk. You have been pretty one of the only individuals
> > with the level of knowledge we need to respond and I thank you for
> > that.
> > 
> > Based on what I've read I suspect that at least one of the two
> > following statements is true:
> > a. Mark is correct that translation is broken in Rockchip's
> > implementation (unknown if this is a SoC or a driver issue)

It seems translation isn't actually broken.  At least I got it to work
with a slight twist.  What seems to be happening is that reads (and
writes?) to the first 64 MB of the PCIe memory address space
(0x00000000-0x03ffffff) don't make it out to the PCIe device.  I
suspect they are somehow claimed by the RC, maybe because the BAR for
the root complex isn't properly disabled.

If I change the PCIe bus address of the mmio window from 0x00000000 to
0x40000000 like so:

    ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
              0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x3ef00000>;

my NVMe drive seems to work just fine.  I picked 0x40000000 here
because it is nicely aligned on a 1GB boundary, which matches the size
of the region.  Diff against a recent linux-next at the end of this
mail.

So what I think is happening is that Linux is allocating resources
from the top of the region, so it only ends up allocating from the
low 64 MB and running into the issue if you have a more complicated
PCIe hierarchy.  OpenBSD on the other hand allocates from the bottom,
which pretty much guarantees that I hit the issue.

Now this could be a driver bug.  As far as I can tell BAR0/1 is
properly disabled, but maybe there is some additional bit that we need
to set.  But I don't think there are any downsides of my workaround.
We can still provide a ~1GB mmio range; it just starts at 0x40000000
instead of 0x00000000.

Maybe somebody can test this on Linux?

> > b. We do in fact require IO and Config to be 32 bit addressable to be
> > fully compatible.
> > 
> > These issues are compounded in rk3588 where we have much smaller
> > regions in the 32bit space for PCIe, so a definite answer on the true
> > requirements and limitations would be quite helpful.
> > 
> > As always, thank you for your time,
> > Peter

diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
index eed0059a68b8..218e51f41852 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
@@ -983,7 +983,7 @@ pcie2x1: pcie@fe260000 {
 		phy-names = "pcie-phy";
 		power-domains = <&power RK3568_PD_PIPE>;
 		ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
-			  0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
+			  0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x3ef00000>;
 		resets = <&cru SRST_PCIE20_POWERUP>;
 		reset-names = "pipe";
 		#address-cells = <3>;
Ondřej Jirman March 12, 2023, 8:46 p.m. UTC | #18
Hi,

On Sun, Mar 12, 2023 at 09:13:12PM +0100, Mark Kettenis wrote:
> [...]
>
> It seems translation isn't actually broken.  At least I got it to work
> with a slight twist.  What seems to be happening is that reads (and
> writes?) to the first 64 MB of the PCIe memory address space
> (0x00000000-0x03ffffff) don't make it out to the PCIe device.  I
> suspect they are somehow claimed by the RC, maybe because the BAR for
> the root complex isn't properly disabled.
> 
> If I change the PCIe bus addess of the mmio window from 0x00000000 to
> 0x40000000 like so:
> 
>     ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
>               0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x3ef00000>;
> 
> my NVMe drive seems to work just fine.  I picked 0x40000000 here
> because it is nicely aligned on a 1GB boundary, which matches the size
> of the region.  Diff against a recent linux-next at the end of this
> mail.
> 
> So what I think is happening is that Linux is allocating resources
> from the top of the region.  So only if you have a more complicated
> PCIe hierarchy it ends up allocating from the low 64 MB and runs into
> the issue.  OpenBSD on the other hand allocates from the bottom, which
> pretty much guarantees that I hit the issue.
> 
> Now this could be a driver bug.  As far as I can tell BAR0/1 is
> properly disabled, but maybe there is some additional bit that we need
> to set.  But I don't think there are any downsides of my workaround.
> We can still provide a ~1GB mmio range; it just starts at 0x40000000
> instead of 0x00000000.
> 
> Maybe somebody can test this on Linux?

There were other discussions and patches posted, and further testing happened
since this discussion (just by looking at the dates...).

The result was: 
https://lore.kernel.org/lkml/20221112114125.1637543-1-aholmes@omnom.net/
https://lore.kernel.org/lkml/20221112114125.1637543-2-aholmes@omnom.net/

The changes for pcie2x1 in that patch make PCIe work on Linux under many
different device combinations, incl. with various combinations of devices
behind a PCIe switch, etc.

That patch includes your change, too.

(The patch is not correct for pcie3, see discussion.)

kind regards,
	o.

> > > b. We do in fact require IO and Config to be 32 bit addressable to be
> > > fully compatible.
> > > 
> > > These issues are compounded in rk3588 where we have much smaller
> > > regions in the 32bit space for PCIe, so a definite answer on the true
> > > requirements and limitations would be quite helpful.
> > > 
> > > As always, thank you for your time,
> > > Peter
> 
> diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> index eed0059a68b8..218e51f41852 100644
> --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> @@ -983,7 +983,7 @@ pcie2x1: pcie@fe260000 {
>  		phy-names = "pcie-phy";
>  		power-domains = <&power RK3568_PD_PIPE>;
>  		ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
> -			  0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
> +			  0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x3ef00000>;
>  		resets = <&cru SRST_PCIE20_POWERUP>;
>  		reset-names = "pipe";
>  		#address-cells = <3>;
diff mbox series

Patch

diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
index 319981c3e9f7..99fd9543fc6f 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
@@ -855,7 +855,8 @@  pcie2x1: pcie@fe260000 {
 		compatible = "rockchip,rk3568-pcie";
 		reg = <0x3 0xc0000000 0x0 0x00400000>,
 		      <0x0 0xfe260000 0x0 0x00010000>,
-		      <0x3 0x3f000000 0x0 0x01000000>;
+		      <0x0 0xf4000000 0x0 0x01f00000>;
+
 		reg-names = "dbi", "apb", "config";
 		interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
@@ -877,15 +878,15 @@  pcie2x1: pcie@fe260000 {
 				<0 0 0 4 &pcie_intc 3>;
 		linux,pci-domain = <0>;
 		num-ib-windows = <6>;
-		num-ob-windows = <2>;
+		num-ob-windows = <8>;
 		max-link-speed = <2>;
 		msi-map = <0x0 &gic 0x0 0x1000>;
 		num-lanes = <1>;
 		phys = <&combphy2 PHY_TYPE_PCIE>;
 		phy-names = "pcie-phy";
 		power-domains = <&power RK3568_PD_PIPE>;
-		ranges = <0x01000000 0x0 0x3ef00000 0x3 0x3ef00000 0x0 0x00100000
-			  0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x3ef00000>;
+		ranges = <0x01000000 0x0 0x00000000 0x0 0xf5f00000 0x0 0x00100000
+			  0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
 		resets = <&cru SRST_PCIE20_POWERUP>;
 		reset-names = "pipe";
 		#address-cells = <3>;