diff mbox series

[04/11] of: address: Preserve the flags portion on 1:1 dma-ranges mapping

Message ID 5ca13a5b01c6c737f07416be53eb05b32811da21.1724159867.git.andrea.porta@suse.com (mailing list archive)
State New, archived
Headers show
Series Add support for RaspberryPi RP1 PCI device using a DT overlay | expand

Commit Message

Andrea della Porta Aug. 20, 2024, 2:36 p.m. UTC
A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
translations. In this specific case, rhe current behaviour is to zero out
the entire specifier so that the translation could be carried on as an
offset from zero.  This includes address specifier that has flags (e.g.
PCI ranges).
Once the flags portion has been zeroed, the translation chain is broken
since the mapping functions will check the upcoming address specifier
against mismatching flags, always failing the 1:1 mapping and its entire
purpose of always succeeding.
Set to zero only the address portion while passing the flags through.

Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
---
 drivers/of/address.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Rob Herring Aug. 21, 2024, 12:16 a.m. UTC | #1
On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> translations. In this specific case, rhe current behaviour is to zero out

typo

> the entire specifier so that the translation could be carried on as an
> offset from zero.  This includes address specifier that has flags (e.g.
> PCI ranges).
> Once the flags portion has been zeroed, the translation chain is broken
> since the mapping functions will check the upcoming address specifier

What does "upcoming address" mean?

> against mismatching flags, always failing the 1:1 mapping and its entire
> purpose of always succeeding.
> Set to zero only the address portion while passing the flags through.

Can you point me to what the failing DT looks like. I'm puzzled how 
things would have worked for anyone.


> 
> Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
> ---
>  drivers/of/address.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index d669ce25b5f9..5a6d55a67aa8 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -443,7 +443,8 @@ static int of_translate_one(struct device_node *parent, struct of_bus *bus,
>  	}
>  	if (ranges == NULL || rlen == 0) {
>  		offset = of_read_number(addr, na);
> -		memset(addr, 0, pna * 4);
> +		/* copy the address while preserving the flags */
> +		memset(addr + pbus->flag_cells, 0, (pna - pbus->flag_cells) * 4);
>  		pr_debug("empty ranges; 1:1 translation\n");
>  		goto finish;
>  	}
> -- 
> 2.35.3
>
Andrea della Porta Aug. 21, 2024, 8:18 a.m. UTC | #2
Hi Rob,

On 19:16 Tue 20 Aug     , Rob Herring wrote:
> On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > translations. In this specific case, rhe current behaviour is to zero out
> 
> typo

Fixed, thanks!

> 
> > the entire specifier so that the translation could be carried on as an
> > offset from zero.  This includes address specifier that has flags (e.g.
> > PCI ranges).
> > Once the flags portion has been zeroed, the translation chain is broken
> > since the mapping functions will check the upcoming address specifier
> 
> What does "upcoming address" mean?

Sorry for the confusion, this means "address specifier (with valid flags) fed
to the translating functions and for which we are looking for a translation".
While this address has some valid flags set, it will fail the translation step
since the ranges it is matched against have flags zeroed out by the 1:1 mapping
condition.

> 
> > against mismatching flags, always failing the 1:1 mapping and its entire
> > purpose of always succeeding.
> > Set to zero only the address portion while passing the flags through.
> 
> Can you point me to what the failing DT looks like. I'm puzzled how 
> things would have worked for anyone.
> 

The following is a simplified and lightly edited) version of the resulting DT
from RPi5:

 pci@0,0 {
	#address-cells = <0x03>;
	#size-cells = <0x02>;
	......
	device_type = "pci";
	compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
	ranges = <0x82000000 0x00 0x00   0x82000000 0x00 0x00   0x00 0x600000>;
	reg = <0x00 0x00 0x00   0x00 0x00>;

	......

	rp1@0 {
		#address-cells = <0x02>;
		#size-cells = <0x02>;
		compatible = "simple-bus";
		ranges = <0xc0 0x40000000   0x01 0x00 0x00   0x00 0x400000>;
		dma-ranges = <0x10 0x00   0x43000000 0x10 0x00   0x10 0x00>;
		......
	};
 };

The pci@0,0 bridge node is automatically created by virtue of
CONFIG_PCI_DYNAMIC_OF_NODES, and has no dma-ranges, hence it implies 1:1 dma
mappings (flags for this mapping are set to zero).  The rp1@0 node has
dma-ranges with flags set (0x43000000). Since 0x43000000 != 0x00 any translation
will fail.
Regarding why no one has really complained about that: AFAIK this could
very well be an unusual scenario that is arising now that we have real use
case for platform devices behind a PCI endpoint and devices populated
dynamically from dtb overlay.

Many thanks,
Andrea

> 
> > 
> > Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
> > ---
> >  drivers/of/address.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index d669ce25b5f9..5a6d55a67aa8 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -443,7 +443,8 @@ static int of_translate_one(struct device_node *parent, struct of_bus *bus,
> >  	}
> >  	if (ranges == NULL || rlen == 0) {
> >  		offset = of_read_number(addr, na);
> > -		memset(addr, 0, pna * 4);
> > +		/* copy the address while preserving the flags */
> > +		memset(addr + pbus->flag_cells, 0, (pna - pbus->flag_cells) * 4);
> >  		pr_debug("empty ranges; 1:1 translation\n");
> >  		goto finish;
> >  	}
> > -- 
> > 2.35.3
> >
Rob Herring Aug. 26, 2024, 9:29 p.m. UTC | #3
On Wed, Aug 21, 2024 at 3:19 AM Andrea della Porta
<andrea.porta@suse.com> wrote:
>
> Hi Rob,
>
> On 19:16 Tue 20 Aug     , Rob Herring wrote:
> > On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > > translations. In this specific case, rhe current behaviour is to zero out
> >
> > typo
>
> Fixed, thanks!
>
> >
> > > the entire specifier so that the translation could be carried on as an
> > > offset from zero.  This includes address specifier that has flags (e.g.
> > > PCI ranges).
> > > Once the flags portion has been zeroed, the translation chain is broken
> > > since the mapping functions will check the upcoming address specifier
> >
> > What does "upcoming address" mean?
>
> Sorry for the confusion, this means "address specifier (with valid flags) fed
> to the translating functions and for which we are looking for a translation".
> While this address has some valid flags set, it will fail the translation step
> since the ranges it is matched against have flags zeroed out by the 1:1 mapping
> condition.
>
> >
> > > against mismatching flags, always failing the 1:1 mapping and its entire
> > > purpose of always succeeding.
> > > Set to zero only the address portion while passing the flags through.
> >
> > Can you point me to what the failing DT looks like. I'm puzzled how
> > things would have worked for anyone.
> >
>
> The following is a simplified and lightly edited) version of the resulting DT
> from RPi5:
>
>  pci@0,0 {
>         #address-cells = <0x03>;
>         #size-cells = <0x02>;
>         ......
>         device_type = "pci";
>         compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
>         ranges = <0x82000000 0x00 0x00   0x82000000 0x00 0x00   0x00 0x600000>;
>         reg = <0x00 0x00 0x00   0x00 0x00>;
>
>         ......
>
>         rp1@0 {

What does 0 represent here? There's no 0 address in 'ranges' below.
Since you said the parent is a PCI-PCI bridge, then the unit-address
would have to be the PCI devfn and you are missing 'reg' (or omitted
it).

>                 #address-cells = <0x02>;
>                 #size-cells = <0x02>;
>                 compatible = "simple-bus";

The parent is a PCI-PCI bridge. Child nodes have to be PCI devices and
"simple-bus" is not a PCI device.

The assumption so far with all of this is that you have some specific
PCI device (and therefore a driver). The simple-buses under it are
defined per BAR. Not really certain if that makes sense in all cases,
but since the address assignment is dynamic, it may have to. I'm also
not completely convinced we should reuse 'simple-bus' here or define
something specific like 'pci-bar-bus' or something.

>                 ranges = <0xc0 0x40000000   0x01 0x00 0x00   0x00 0x400000>;
>                 dma-ranges = <0x10 0x00   0x43000000 0x10 0x00   0x10 0x00>;
>                 ......
>         };
>  };
>
> The pci@0,0 bridge node is automatically created by virtue of
> CONFIG_PCI_DYNAMIC_OF_NODES, and has no dma-ranges, hence it implies 1:1 dma
> mappings (flags for this mapping are set to zero).  The rp1@0 node has
> dma-ranges with flags set (0x43000000). Since 0x43000000 != 0x00 any translation
> will fail.

It's possible that we should fill in 'dma-ranges' when making these
nodes rather than supporting missing dma-ranges here.

Rob
Andrea della Porta Aug. 29, 2024, 10:13 a.m. UTC | #4
Hi Rob,

On 16:29 Mon 26 Aug     , Rob Herring wrote:
> On Wed, Aug 21, 2024 at 3:19 AM Andrea della Porta
> <andrea.porta@suse.com> wrote:
> >
> > Hi Rob,
> >
> > On 19:16 Tue 20 Aug     , Rob Herring wrote:
> > > On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > > > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > > > translations. In this specific case, rhe current behaviour is to zero out
> > >
> > > typo
> >
> > Fixed, thanks!
> >
> > >
> > > > the entire specifier so that the translation could be carried on as an
> > > > offset from zero.  This includes address specifier that has flags (e.g.
> > > > PCI ranges).
> > > > Once the flags portion has been zeroed, the translation chain is broken
> > > > since the mapping functions will check the upcoming address specifier
> > >
> > > What does "upcoming address" mean?
> >
> > Sorry for the confusion, this means "address specifier (with valid flags) fed
> > to the translating functions and for which we are looking for a translation".
> > While this address has some valid flags set, it will fail the translation step
> > since the ranges it is matched against have flags zeroed out by the 1:1 mapping
> > condition.
> >
> > >
> > > > against mismatching flags, always failing the 1:1 mapping and its entire
> > > > purpose of always succeeding.
> > > > Set to zero only the address portion while passing the flags through.
> > >
> > > Can you point me to what the failing DT looks like. I'm puzzled how
> > > things would have worked for anyone.
> > >
> >
> > The following is a simplified and lightly edited) version of the resulting DT
> > from RPi5:
> >
> >  pci@0,0 {
> >         #address-cells = <0x03>;
> >         #size-cells = <0x02>;
> >         ......
> >         device_type = "pci";
> >         compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
> >         ranges = <0x82000000 0x00 0x00   0x82000000 0x00 0x00   0x00 0x600000>;
> >         reg = <0x00 0x00 0x00   0x00 0x00>;
> >
> >         ......
> >
> >         rp1@0 {
> 
> What does 0 represent here? There's no 0 address in 'ranges' below.
> Since you said the parent is a PCI-PCI bridge, then the unit-address
> would have to be the PCI devfn and you are missing 'reg' (or omitted
> it).

There's no reg property because the registers for RP1 are addressed
starting at 0x40108000 offset from BAR1. The devicetree specs says
that a missing reg node should not have any unit address specified
(and AFAIK there's no other special directives for simple-bus specified
in dt-bindings). 
I've added @0 just to get rid of the following warning:

 Warning (unit_address_vs_reg): /fragment@0/__overlay__/rp1: node has
 a reg or ranges property, but no unit name 

coming from make W=1 CHECK_DTBS=y broadcom/rp1.dtbo.
This is the exact same approach used by Bootlin patchset from:

https://lore.kernel.org/all/20240808154658.247873-2-herve.codina@bootlin.com/

replied here below for convenience:

+	pci-ep-bus@0 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+
+		/*
+		 * map @0xe2000000 (32MB) to BAR0 (CPU)
+		 * map @0xe0000000 (16MB) to BAR1 (AMBA)
+		 */
+		ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
+		          0xe0000000 0x01 0x00 0x00 0x1000000>;

Also, I think it's not possible to know the devfn in advance, since the
DT part is pre-compiled as an overlay while the devfn number is coming from
bus enumeration.
Since the registers for sub-peripherals will start (as stated in ranges
property) from 0xc040000000, I'd be inclined to use rp1@c040000000 as the
node name and address unit. Is it feasible?

> 
> >                 #address-cells = <0x02>;
> >                 #size-cells = <0x02>;
> >                 compatible = "simple-bus";
> 
> The parent is a PCI-PCI bridge. Child nodes have to be PCI devices and
> "simple-bus" is not a PCI device.

The simple-bus is needed to automatically traverse and create platform
devices in of_platform_populate(). It's true that RP1 is a PCI device,
but sub-peripherals of RP1 are platform devices so I guess this is
unavoidable right now.

> 
> The assumption so far with all of this is that you have some specific
> PCI device (and therefore a driver). The simple-buses under it are
> defined per BAR. Not really certain if that makes sense in all cases,
> but since the address assignment is dynamic, it may have to. I'm also
> not completely convinced we should reuse 'simple-bus' here or define
> something specific like 'pci-bar-bus' or something.

Good point. Labeling a new bus for this kind of 'appliance' could be
beneficial to unify the dt overlay approach, and I guess it could be
adopted by the aforementioned Bootlin's Microchip patchset too.
However, since the difference with simple-bus would be basically non
existent, I believe that this could be done in a future patch due to
the fact that the dtbo is contained into the driver itself, so we do
not suffer from the proliferation that happens when dtb are managed
outside.

> 
> >                 ranges = <0xc0 0x40000000   0x01 0x00 0x00   0x00 0x400000>;
> >                 dma-ranges = <0x10 0x00   0x43000000 0x10 0x00   0x10 0x00>;
> >                 ......
> >         };
> >  };
> >
> > The pci@0,0 bridge node is automatically created by virtue of
> > CONFIG_PCI_DYNAMIC_OF_NODES, and has no dma-ranges, hence it implies 1:1 dma
> > mappings (flags for this mapping are set to zero).  The rp1@0 node has
> > dma-ranges with flags set (0x43000000). Since 0x43000000 != 0x00 any translation
> > will fail.
> 
> It's possible that we should fill in 'dma-ranges' when making these
> nodes rather than supporting missing dma-ranges here.

I really think that filling dma-ranges for dynamically created pci
nodes would be the correct approach.
However, IMHO this does not imply that we could let inconsistent
address (64 bit addr with 32 flag bit set) laying around the 
translation chain, and fixing that is currently working fine. I'd
be then inclined to say the proposed change is outside the scope
of the present patchset and to postpone it to a future patch.

Many thanks,
Andrea

> 
> Rob
Rob Herring Aug. 29, 2024, 1:18 p.m. UTC | #5
On Thu, Aug 29, 2024 at 5:13 AM Andrea della Porta
<andrea.porta@suse.com> wrote:
>
> Hi Rob,

BTW, I noticed your email replies set "reply-to" to everyone in To and
Cc. The result (with Gmail) is my reply lists everyone twice (in both
To and Cc). "reply-to" is just supposed to be the 1 address you want
replies sent to instead of the "from" address.

> On 16:29 Mon 26 Aug     , Rob Herring wrote:
> > On Wed, Aug 21, 2024 at 3:19 AM Andrea della Porta
> > <andrea.porta@suse.com> wrote:
> > >
> > > Hi Rob,
> > >
> > > On 19:16 Tue 20 Aug     , Rob Herring wrote:
> > > > On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > > > > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > > > > translations. In this specific case, rhe current behaviour is to zero out
> > > >
> > > > typo
> > >
> > > Fixed, thanks!
> > >
> > > >
> > > > > the entire specifier so that the translation could be carried on as an
> > > > > offset from zero.  This includes address specifier that has flags (e.g.
> > > > > PCI ranges).
> > > > > Once the flags portion has been zeroed, the translation chain is broken
> > > > > since the mapping functions will check the upcoming address specifier
> > > >
> > > > What does "upcoming address" mean?
> > >
> > > Sorry for the confusion, this means "address specifier (with valid flags) fed
> > > to the translating functions and for which we are looking for a translation".
> > > While this address has some valid flags set, it will fail the translation step
> > > since the ranges it is matched against have flags zeroed out by the 1:1 mapping
> > > condition.
> > >
> > > >
> > > > > against mismatching flags, always failing the 1:1 mapping and its entire
> > > > > purpose of always succeeding.
> > > > > Set to zero only the address portion while passing the flags through.
> > > >
> > > > Can you point me to what the failing DT looks like. I'm puzzled how
> > > > things would have worked for anyone.
> > > >
> > >
> > > The following is a simplified and lightly edited) version of the resulting DT
> > > from RPi5:
> > >
> > >  pci@0,0 {
> > >         #address-cells = <0x03>;
> > >         #size-cells = <0x02>;
> > >         ......
> > >         device_type = "pci";
> > >         compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
> > >         ranges = <0x82000000 0x00 0x00   0x82000000 0x00 0x00   0x00 0x600000>;
> > >         reg = <0x00 0x00 0x00   0x00 0x00>;
> > >
> > >         ......
> > >
> > >         rp1@0 {
> >
> > What does 0 represent here? There's no 0 address in 'ranges' below.
> > Since you said the parent is a PCI-PCI bridge, then the unit-address
> > would have to be the PCI devfn and you are missing 'reg' (or omitted
> > it).
>
> There's no reg property because the registers for RP1 are addressed
> starting at 0x40108000 offset from BAR1. The devicetree specs says
> that a missing reg node should not have any unit address specified
> (and AFAIK there's no other special directives for simple-bus specified
> in dt-bindings).
> I've added @0 just to get rid of the following warning:
>
>  Warning (unit_address_vs_reg): /fragment@0/__overlay__/rp1: node has
>  a reg or ranges property, but no unit name

It's still wrong as dtc only checks the unit-address is correct in a
few cases with known bus types.

> coming from make W=1 CHECK_DTBS=y broadcom/rp1.dtbo.
> This is the exact same approach used by Bootlin patchset from:
>
> https://lore.kernel.org/all/20240808154658.247873-2-herve.codina@bootlin.com/

It is not. First, that has a node for the PCI device (i.e. the
LAN966x). You do not. You only have a PCI-PCI bridge and that is
wrong.

BTW, you should Cc Herve and others that are working on this feature.
It is by no means fully sorted as you have found.

> replied here below for convenience:
>
> +       pci-ep-bus@0 {
> +               compatible = "simple-bus";
> +               #address-cells = <1>;
> +               #size-cells = <1>;
> +
> +               /*
> +                * map @0xe2000000 (32MB) to BAR0 (CPU)
> +                * map @0xe0000000 (16MB) to BAR1 (AMBA)
> +                */
> +               ranges = <0xe2000000 0x00 0x00 0x00 0x2000000

The 0 parent address here matches the unit-address, so all good in this case.

> +                         0xe0000000 0x01 0x00 0x00 0x1000000>;
>
> Also, I think it's not possible to know the devfn in advance, since the
> DT part is pre-compiled as an overlay while the devfn number is coming from
> bus enumeration.

No. devfn is fixed unless you are plugging in a card in different
slots. The bus number is the part that is not known and assigned by
the OS, but you'll notice that is omitted.

In any case, the RP1 node should be generated, so its devfn is irrelevant.

> Since the registers for sub-peripherals will start (as stated in ranges
> property) from 0xc040000000, I'd be inclined to use rp1@c040000000 as the
> node name and address unit. Is it feasible?

Yes, but that would be in nodes underneath ranges. Above, it is the
parent bus we are talking about.

> > >                 #address-cells = <0x02>;
> > >                 #size-cells = <0x02>;
> > >                 compatible = "simple-bus";
> >
> > The parent is a PCI-PCI bridge. Child nodes have to be PCI devices and
> > "simple-bus" is not a PCI device.
>
> The simple-bus is needed to automatically traverse and create platform
> devices in of_platform_populate(). It's true that RP1 is a PCI device,
> but sub-peripherals of RP1 are platform devices so I guess this is
> unavoidable right now.

You are missing the point. A PCI-PCI bridge does not have a
simple-bus. However, I think it's just what you pasted here that's
wrong. From the looks of the RP1 driver and the overlay, it should be
correct.

It would also help if you dumped out what "lspci -tvnn" prints.

> > The assumption so far with all of this is that you have some specific
> > PCI device (and therefore a driver). The simple-buses under it are
> > defined per BAR. Not really certain if that makes sense in all cases,
> > but since the address assignment is dynamic, it may have to. I'm also
> > not completely convinced we should reuse 'simple-bus' here or define
> > something specific like 'pci-bar-bus' or something.
>
> Good point. Labeling a new bus for this kind of 'appliance' could be
> beneficial to unify the dt overlay approach, and I guess it could be
> adopted by the aforementioned Bootlin's Microchip patchset too.
> However, since the difference with simple-bus would be basically non
> existent, I believe that this could be done in a future patch due to
> the fact that the dtbo is contained into the driver itself, so we do
> not suffer from the proliferation that happens when dtb are managed
> outside.

It's an ABI, so we really need to decide first.

> > >                 ranges = <0xc0 0x40000000   0x01 0x00 0x00   0x00 0x400000>;
> > >                 dma-ranges = <0x10 0x00   0x43000000 0x10 0x00   0x10 0x00>;
> > >                 ......
> > >         };
> > >  };
> > >
> > > The pci@0,0 bridge node is automatically created by virtue of
> > > CONFIG_PCI_DYNAMIC_OF_NODES, and has no dma-ranges, hence it implies 1:1 dma
> > > mappings (flags for this mapping are set to zero).  The rp1@0 node has
> > > dma-ranges with flags set (0x43000000). Since 0x43000000 != 0x00 any translation
> > > will fail.
> >
> > It's possible that we should fill in 'dma-ranges' when making these
> > nodes rather than supporting missing dma-ranges here.
>
> I really think that filling dma-ranges for dynamically created pci
> nodes would be the correct approach.
> However, IMHO this does not imply that we could let inconsistent
> address (64 bit addr with 32 flag bit set) laying around the
> translation chain, and fixing that is currently working fine. I'd
> be then inclined to say the proposed change is outside the scope
> of the present patchset and to postpone it to a future patch.

Okay, but let's fix it with a test case. There's already a test case
for all this in the DT unittest which can be extended.

Rob
Andrea della Porta Aug. 29, 2024, 4:26 p.m. UTC | #6
Hi Rob,

On 08:18 Thu 29 Aug     , Rob Herring wrote:
> On Thu, Aug 29, 2024 at 5:13 AM Andrea della Porta
> <andrea.porta@suse.com> wrote:
> >
> > Hi Rob,
> 
> BTW, I noticed your email replies set "reply-to" to everyone in To and
> Cc. The result (with Gmail) is my reply lists everyone twice (in both
> To and Cc). "reply-to" is just supposed to be the 1 address you want
> replies sent to instead of the "from" address.

IIUC you're probably referring to Mail-Reply-To, that address only one recipient
at a time (i.e. you want to reply only to the author). Reply-To is a catch-all
that will work as a fallback when you hit either reply or reply-all in your client.
In fact, neither Reply-To nor Mail-Reply-To are included in my emails, the
interesting one being Mail-Followup-To, that should override any of the above
mentioned headers (including To: and Cc:) for the reply all function. How these
headers are interpreted depends solely on the mail client, I'm afraid.
Is it possible that your client is mistakenly merging both Mail-Followup-To 
plus To and Cc lists?
Anyway, I've disabled Mail-followup-To as it's added by mutt, can you please
confirm that this mail now works for you? Hopefully it will not clobber the
recipent list too much, since AFAIK that header was purposely invented to avoid
such inconsistencies.

> 
> > On 16:29 Mon 26 Aug     , Rob Herring wrote:
> > > On Wed, Aug 21, 2024 at 3:19 AM Andrea della Porta
> > > <andrea.porta@suse.com> wrote:
> > > >
> > > > Hi Rob,
> > > >
> > > > On 19:16 Tue 20 Aug     , Rob Herring wrote:
> > > > > On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > > > > > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > > > > > translations. In this specific case, rhe current behaviour is to zero out
> > > > >
> > > > > typo
> > > >
> > > > Fixed, thanks!
> > > >
> > > > >
> > > > > > the entire specifier so that the translation could be carried on as an
> > > > > > offset from zero.  This includes address specifier that has flags (e.g.
> > > > > > PCI ranges).
> > > > > > Once the flags portion has been zeroed, the translation chain is broken
> > > > > > since the mapping functions will check the upcoming address specifier
> > > > >
> > > > > What does "upcoming address" mean?
> > > >
> > > > Sorry for the confusion, this means "address specifier (with valid flags) fed
> > > > to the translating functions and for which we are looking for a translation".
> > > > While this address has some valid flags set, it will fail the translation step
> > > > since the ranges it is matched against have flags zeroed out by the 1:1 mapping
> > > > condition.
> > > >
> > > > >
> > > > > > against mismatching flags, always failing the 1:1 mapping and its entire
> > > > > > purpose of always succeeding.
> > > > > > Set to zero only the address portion while passing the flags through.
> > > > >
> > > > > Can you point me to what the failing DT looks like. I'm puzzled how
> > > > > things would have worked for anyone.
> > > > >
> > > >
> > > > The following is a simplified and lightly edited) version of the resulting DT
> > > > from RPi5:
> > > >
> > > >  pci@0,0 {
> > > >         #address-cells = <0x03>;
> > > >         #size-cells = <0x02>;
> > > >         ......
> > > >         device_type = "pci";
> > > >         compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
> > > >         ranges = <0x82000000 0x00 0x00   0x82000000 0x00 0x00   0x00 0x600000>;
> > > >         reg = <0x00 0x00 0x00   0x00 0x00>;
> > > >
> > > >         ......
> > > >
> > > >         rp1@0 {
> > >
> > > What does 0 represent here? There's no 0 address in 'ranges' below.
> > > Since you said the parent is a PCI-PCI bridge, then the unit-address
> > > would have to be the PCI devfn and you are missing 'reg' (or omitted
> > > it).
> >
> > There's no reg property because the registers for RP1 are addressed
> > starting at 0x40108000 offset from BAR1. The devicetree specs says
> > that a missing reg node should not have any unit address specified
> > (and AFAIK there's no other special directives for simple-bus specified
> > in dt-bindings).
> > I've added @0 just to get rid of the following warning:
> >
> >  Warning (unit_address_vs_reg): /fragment@0/__overlay__/rp1: node has
> >  a reg or ranges property, but no unit name
> 
> It's still wrong as dtc only checks the unit-address is correct in a
> few cases with known bus types.

Sorry, I don't follow you on this, I'm probably missing something. Could
you please add some details?

> 
> > coming from make W=1 CHECK_DTBS=y broadcom/rp1.dtbo.
> > This is the exact same approach used by Bootlin patchset from:
> >
> > https://lore.kernel.org/all/20240808154658.247873-2-herve.codina@bootlin.com/
> 
> It is not. First, that has a node for the PCI device (i.e. the
> LAN966x). You do not. You only have a PCI-PCI bridge and that is
> wrong.

I'm a little confused now, but I think the confusion is generated by
the label node names I've chosen that are, admittedly, a bit sloppy.
I'll try to make some clarity, please see below.

> 
> BTW, you should Cc Herve and others that are working on this feature.
> It is by no means fully sorted as you have found.

Sure, just added, thanks for the heads up.

> 
> > replied here below for convenience:
> >
> > +       pci-ep-bus@0 {
> > +               compatible = "simple-bus";
> > +               #address-cells = <1>;
> > +               #size-cells = <1>;
> > +
> > +               /*
> > +                * map @0xe2000000 (32MB) to BAR0 (CPU)
> > +                * map @0xe0000000 (16MB) to BAR1 (AMBA)
> > +                */
> > +               ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
> 
> The 0 parent address here matches the unit-address, so all good in this case.

Just to be sure, the parent address being the triple zeros in the ranges property,
right?

> 
> > +                         0xe0000000 0x01 0x00 0x00 0x1000000>;
> >
> > Also, I think it's not possible to know the devfn in advance, since the
> > DT part is pre-compiled as an overlay while the devfn number is coming from
> > bus enumeration.
> 
> No. devfn is fixed unless you are plugging in a card in different
> slots. The bus number is the part that is not known and assigned by
> the OS, but you'll notice that is omitted.
> 
> In any case, the RP1 node should be generated, so its devfn is irrelevant.

Which is a possibility, since the driver should possibly work also with RP1
mounted on a PCI card, one day. But as you pointed out, since this is automatically
generated, should not be a concern.

> 
> > Since the registers for sub-peripherals will start (as stated in ranges
> > property) from 0xc040000000, I'd be inclined to use rp1@c040000000 as the
> > node name and address unit. Is it feasible?
> 
> Yes, but that would be in nodes underneath ranges. Above, it is the
> parent bus we are talking about.

Right.

> 
> > > >                 #address-cells = <0x02>;
> > > >                 #size-cells = <0x02>;
> > > >                 compatible = "simple-bus";
> > >
> > > The parent is a PCI-PCI bridge. Child nodes have to be PCI devices and
> > > "simple-bus" is not a PCI device.
> >
> > The simple-bus is needed to automatically traverse and create platform
> > devices in of_platform_populate(). It's true that RP1 is a PCI device,
> > but sub-peripherals of RP1 are platform devices so I guess this is
> > unavoidable right now.
> 
> You are missing the point. A PCI-PCI bridge does not have a
> simple-bus. However, I think it's just what you pasted here that's
> wrong. From the looks of the RP1 driver and the overlay, it should be
> correct.

Trying to clarify: at first I thought of my rp1 node (in the dtso) as the pci
endpoint device, but I now see that it should be intended as just the bus 
attached to the real endpoint device (which is the dynamically generated dev@0,0).
In this sense, rp1 is probably a really wrong name, let's say we use the same 
name from Bootlin, i.e. pci-ep-bus. The DT tree would then be:

pci@0,0         <- auto generated, this represent the pci bridge
  dev@0,0       <- auto generated, this represent the pci ednpoint device, a.k.a. the RP1
    pci-ep-bus  <- added from dtbo, this is the simple-bus to which peripherals are attached

this view is much like Bootlin's approach, also my pci-ep-bus node now would look
like this:
 ...
 pci-ep-bus@0 {
	ranges = <0xc0 0x40000000
                  0x01 0x00 0x00000000
                  0x00 0x00400000>;
	...
 };

and also the correct unit address here is 0 again, since the parent address in
ranges is 0x01 0x00 0x00000000 (0x01 is the flags and in this case represent
BAR1, I assume that for the unit address I should use only the address part that
is 0, right?).

> 
> It would also help if you dumped out what "lspci -tvnn" prints.
> 

Here it is:

localhost:~ # lspci -tvnn
-[0002:00]---00.0-[01]----00.0  Raspberry Pi Ltd RP1 PCIe 2.0 South Bridge [1de4:0001]

> > > The assumption so far with all of this is that you have some specific
> > > PCI device (and therefore a driver). The simple-buses under it are
> > > defined per BAR. Not really certain if that makes sense in all cases,
> > > but since the address assignment is dynamic, it may have to. I'm also
> > > not completely convinced we should reuse 'simple-bus' here or define
> > > something specific like 'pci-bar-bus' or something.
> >
> > Good point. Labeling a new bus for this kind of 'appliance' could be
> > beneficial to unify the dt overlay approach, and I guess it could be
> > adopted by the aforementioned Bootlin's Microchip patchset too.
> > However, since the difference with simple-bus would be basically non
> > existent, I believe that this could be done in a future patch due to
> > the fact that the dtbo is contained into the driver itself, so we do
> > not suffer from the proliferation that happens when dtb are managed
> > outside.
> 
> It's an ABI, so we really need to decide first.

Okay. How should we proceed?

> 
> > > >                 ranges = <0xc0 0x40000000   0x01 0x00 0x00   0x00 0x400000>;
> > > >                 dma-ranges = <0x10 0x00   0x43000000 0x10 0x00   0x10 0x00>;
> > > >                 ......
> > > >         };
> > > >  };
> > > >
> > > > The pci@0,0 bridge node is automatically created by virtue of
> > > > CONFIG_PCI_DYNAMIC_OF_NODES, and has no dma-ranges, hence it implies 1:1 dma
> > > > mappings (flags for this mapping are set to zero).  The rp1@0 node has
> > > > dma-ranges with flags set (0x43000000). Since 0x43000000 != 0x00 any translation
> > > > will fail.
> > >
> > > It's possible that we should fill in 'dma-ranges' when making these
> > > nodes rather than supporting missing dma-ranges here.
> >
> > I really think that filling dma-ranges for dynamically created pci
> > nodes would be the correct approach.
> > However, IMHO this does not imply that we could let inconsistent
> > address (64 bit addr with 32 flag bit set) laying around the
> > translation chain, and fixing that is currently working fine. I'd
> > be then inclined to say the proposed change is outside the scope
> > of the present patchset and to postpone it to a future patch.
> 
> Okay, but let's fix it with a test case. There's already a test case
> for all this in the DT unittest which can be extended.

I'm taking a look at the unittest framework right now.

Many thanks,
Andrea

> 
> Rob
Rob Herring Aug. 30, 2024, 7:37 p.m. UTC | #7
On Thu, Aug 29, 2024 at 11:26 AM Andrea della Porta
<andrea.porta@suse.com> wrote:
>
> Hi Rob,
>
> On 08:18 Thu 29 Aug     , Rob Herring wrote:
> > On Thu, Aug 29, 2024 at 5:13 AM Andrea della Porta
> > <andrea.porta@suse.com> wrote:
> > >
> > > Hi Rob,
> >
> > BTW, I noticed your email replies set "reply-to" to everyone in To and
> > Cc. The result (with Gmail) is my reply lists everyone twice (in both
> > To and Cc). "reply-to" is just supposed to be the 1 address you want
> > replies sent to instead of the "from" address.
>
> IIUC you're probably referring to Mail-Reply-To, that address only one recipient
> at a time (i.e. you want to reply only to the author). Reply-To is a catch-all
> that will work as a fallback when you hit either reply or reply-all in your client.
> In fact, neither Reply-To nor Mail-Reply-To are included in my emails, the
> interesting one being Mail-Followup-To, that should override any of the above
> mentioned headers (including To: and Cc:) for the reply all function. How these
> headers are interpreted depends solely on the mail client, I'm afraid.
> Is it possible that your client is mistakenly merging both Mail-Followup-To
> plus To and Cc lists?

Indeed, the UI displays 'reply-to' but the source shows
Mail-Followup-To which I'd never heard of. From reading up on it, I
agree the client, Gmail web interface, handles it incorrectly. Not
really anything I can do to fix Gmail though...

> Anyway, I've disabled Mail-followup-To as it's added by mutt, can you please
> confirm that this mail now works for you? Hopefully it will not clobber the
> recipent list too much, since AFAIK that header was purposely invented to avoid
> such inconsistencies.

Seems fine now.

My brief read of the header is it is to avoid receiving multiple
copies depending if you are subscribed to a list or not. But listing
out every other address except your own seems like an odd way to
implement "omit From" in replies. You're still going to get N copies
from N lists as well. I guess it made some sense to some people...

> > > On 16:29 Mon 26 Aug     , Rob Herring wrote:
> > > > On Wed, Aug 21, 2024 at 3:19 AM Andrea della Porta
> > > > <andrea.porta@suse.com> wrote:
> > > > >
> > > > > Hi Rob,
> > > > >
> > > > > On 19:16 Tue 20 Aug     , Rob Herring wrote:
> > > > > > On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > > > > > > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > > > > > > translations. In this specific case, rhe current behaviour is to zero out
> > > > > >
> > > > > > typo
> > > > >
> > > > > Fixed, thanks!
> > > > >
> > > > > >
> > > > > > > the entire specifier so that the translation could be carried on as an
> > > > > > > offset from zero.  This includes address specifier that has flags (e.g.
> > > > > > > PCI ranges).
> > > > > > > Once the flags portion has been zeroed, the translation chain is broken
> > > > > > > since the mapping functions will check the upcoming address specifier
> > > > > >
> > > > > > What does "upcoming address" mean?
> > > > >
> > > > > Sorry for the confusion, this means "address specifier (with valid flags) fed
> > > > > to the translating functions and for which we are looking for a translation".
> > > > > While this address has some valid flags set, it will fail the translation step
> > > > > since the ranges it is matched against have flags zeroed out by the 1:1 mapping
> > > > > condition.
> > > > >
> > > > > >
> > > > > > > against mismatching flags, always failing the 1:1 mapping and its entire
> > > > > > > purpose of always succeeding.
> > > > > > > Set to zero only the address portion while passing the flags through.
> > > > > >
> > > > > > Can you point me to what the failing DT looks like. I'm puzzled how
> > > > > > things would have worked for anyone.
> > > > > >
> > > > >
> > > > > The following is a simplified and lightly edited) version of the resulting DT
> > > > > from RPi5:
> > > > >
> > > > >  pci@0,0 {
> > > > >         #address-cells = <0x03>;
> > > > >         #size-cells = <0x02>;
> > > > >         ......
> > > > >         device_type = "pci";
> > > > >         compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
> > > > >         ranges = <0x82000000 0x00 0x00   0x82000000 0x00 0x00   0x00 0x600000>;
> > > > >         reg = <0x00 0x00 0x00   0x00 0x00>;
> > > > >
> > > > >         ......
> > > > >
> > > > >         rp1@0 {
> > > >
> > > > What does 0 represent here? There's no 0 address in 'ranges' below.
> > > > Since you said the parent is a PCI-PCI bridge, then the unit-address
> > > > would have to be the PCI devfn and you are missing 'reg' (or omitted
> > > > it).
> > >
> > > There's no reg property because the registers for RP1 are addressed
> > > starting at 0x40108000 offset from BAR1. The devicetree specs says
> > > that a missing reg node should not have any unit address specified
> > > (and AFAIK there's no other special directives for simple-bus specified
> > > in dt-bindings).
> > > I've added @0 just to get rid of the following warning:
> > >
> > >  Warning (unit_address_vs_reg): /fragment@0/__overlay__/rp1: node has
> > >  a reg or ranges property, but no unit name
> >
> > It's still wrong as dtc only checks the unit-address is correct in a
> > few cases with known bus types.
>
> Sorry, I don't follow you on this, I'm probably missing something. Could
> you please add some details?

dtc only checks unit-address matches reg/ranges for simple-bus, pci,
i2c, and spi. Since this case is none of them, there is no warning and
it is left to reviewers to check. The warnings are often a clue
something is wrong and the easy fix is often not the right one. This
is still an area the tooling needs improvements on.

> >
> > > coming from make W=1 CHECK_DTBS=y broadcom/rp1.dtbo.
> > > This is the exact same approach used by Bootlin patchset from:
> > >
> > > https://lore.kernel.org/all/20240808154658.247873-2-herve.codina@bootlin.com/
> >
> > It is not. First, that has a node for the PCI device (i.e. the
> > LAN966x). You do not. You only have a PCI-PCI bridge and that is
> > wrong.
>
> I'm a little confused now, but I think the confusion is generated by
> the label node names I've chosen that are, admittedly, a bit sloppy.
> I'll try to make some clarity, please see below.
>
> >
> > BTW, you should Cc Herve and others that are working on this feature.
> > It is by no means fully sorted as you have found.
>
> Sure, just added, thanks for the heads up.
>
> >
> > > replied here below for convenience:
> > >
> > > +       pci-ep-bus@0 {
> > > +               compatible = "simple-bus";
> > > +               #address-cells = <1>;
> > > +               #size-cells = <1>;
> > > +
> > > +               /*
> > > +                * map @0xe2000000 (32MB) to BAR0 (CPU)
> > > +                * map @0xe0000000 (16MB) to BAR1 (AMBA)
> > > +                */
> > > +               ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
> >
> > The 0 parent address here matches the unit-address, so all good in this case.
>
> Just to be sure, the parent address being the triple zeros in the ranges property,
> right?

Yes.

>
> >
> > > +                         0xe0000000 0x01 0x00 0x00 0x1000000>;
> > >
> > > Also, I think it's not possible to know the devfn in advance, since the
> > > DT part is pre-compiled as an overlay while the devfn number is coming from
> > > bus enumeration.
> >
> > No. devfn is fixed unless you are plugging in a card in different
> > slots. The bus number is the part that is not known and assigned by
> > the OS, but you'll notice that is omitted.
> >
> > In any case, the RP1 node should be generated, so its devfn is irrelevant.
>
> Which is a possibility, since the driver should possibly work also with RP1
> mounted on a PCI card, one day. But as you pointed out, since this is automatically
> generated, should not be a concern.
>
> >
> > > Since the registers for sub-peripherals will start (as stated in ranges
> > > property) from 0xc040000000, I'd be inclined to use rp1@c040000000 as the
> > > node name and address unit. Is it feasible?
> >
> > Yes, but that would be in nodes underneath ranges. Above, it is the
> > parent bus we are talking about.
>
> Right.
>
> >
> > > > >                 #address-cells = <0x02>;
> > > > >                 #size-cells = <0x02>;
> > > > >                 compatible = "simple-bus";
> > > >
> > > > The parent is a PCI-PCI bridge. Child nodes have to be PCI devices and
> > > > "simple-bus" is not a PCI device.
> > >
> > > The simple-bus is needed to automatically traverse and create platform
> > > devices in of_platform_populate(). It's true that RP1 is a PCI device,
> > > but sub-peripherals of RP1 are platform devices so I guess this is
> > > unavoidable right now.
> >
> > You are missing the point. A PCI-PCI bridge does not have a
> > simple-bus. However, I think it's just what you pasted here that's
> > wrong. From the looks of the RP1 driver and the overlay, it should be
> > correct.
>
> Trying to clarify: at first I thought of my rp1 node (in the dtso) as the pci
> endpoint device, but I now see that it should be intended as just the bus
> attached to the real endpoint device (which is the dynamically generated dev@0,0).
> In this sense, rp1 is probably a really wrong name, let's say we use the same
> name from Bootlin, i.e. pci-ep-bus. The DT tree would then be:
>
> pci@0,0         <- auto generated, this represent the pci bridge

Or root port specifically.

>   dev@0,0       <- auto generated, this represent the pci ednpoint device, a.k.a. the RP1
>     pci-ep-bus  <- added from dtbo, this is the simple-bus to which peripherals are attached
>
> this view is much like Bootlin's approach, also my pci-ep-bus node now would look
> like this:
>  ...
>  pci-ep-bus@0 {
>         ranges = <0xc0 0x40000000
>                   0x01 0x00 0x00000000
>                   0x00 0x00400000>;
>         ...
>  };
>
> and also the correct unit address here is 0 again, since the parent address in
> ranges is 0x01 0x00 0x00000000 (0x01 is the flags and in this case represent
> BAR1, I assume that for the unit address I should use only the address part that
> is 0, right?).

No, it should be 1 for BAR1. It's 1 node per BAR.

>
> >
> > It would also help if you dumped out what "lspci -tvnn" prints.
> >
>
> Here it is:
>
> localhost:~ # lspci -tvnn
> -[0002:00]---00.0-[01]----00.0  Raspberry Pi Ltd RP1 PCIe 2.0 South Bridge [1de4:0001]

Right, so that matches what you now have above.

> > > > The assumption so far with all of this is that you have some specific
> > > > PCI device (and therefore a driver). The simple-buses under it are
> > > > defined per BAR. Not really certain if that makes sense in all cases,
> > > > but since the address assignment is dynamic, it may have to. I'm also
> > > > not completely convinced we should reuse 'simple-bus' here or define
> > > > something specific like 'pci-bar-bus' or something.
> > >
> > > Good point. Labeling a new bus for this kind of 'appliance' could be
> > > beneficial to unify the dt overlay approach, and I guess it could be
> > > adopted by the aforementioned Bootlin's Microchip patchset too.
> > > However, since the difference with simple-bus would be basically non
> > > existent, I believe that this could be done in a future patch due to
> > > the fact that the dtbo is contained into the driver itself, so we do
> > > not suffer from the proliferation that happens when dtb are managed
> > > outside.
> >
> > It's an ABI, so we really need to decide first.
>
> Okay. How should we proceed?

I think simple-bus where you have it is fine. It is really 1 level up
that needs to be specified. Basically something that's referenced from
the specific PCI device's schema (e.g. the RP1 schema (which you are
missing)).

That schema needs to roughly look like this:

properties:
  "#address-cells":
    const: 3
  "#size-cells":
    const: 2
  ranges:
    minItems: 1
    maxItems: 6
    items:
      additionalItems: true
      items:
        - maximum: 5  # The BAR number
        - const: 0
        - const: 0
        - # TODO: valid PCI memory flags

patternProperties:
  "^bar-bus@[0-5]$":
    type: object
    additionalProperties: true
    properties:
      compatible:
        const: simple-bus
      ranges: true

There were some discussions around interrupt handling that might also
factor into this.

Rob
Herve Codina Sept. 3, 2024, 9:09 a.m. UTC | #8
Hi,

On Fri, 30 Aug 2024 14:37:54 -0500
Rob Herring <robh@kernel.org> wrote:

...

> > this view is much like Bootlin's approach, also my pci-ep-bus node now would look
> > like this:
> >  ...
> >  pci-ep-bus@0 {
> >         ranges = <0xc0 0x40000000
> >                   0x01 0x00 0x00000000
> >                   0x00 0x00400000>;
> >         ...
> >  };
> >
> > and also the correct unit address here is 0 again, since the parent address in
> > ranges is 0x01 0x00 0x00000000 (0x01 is the flags and in this case represent
> > BAR1, I assume that for the unit address I should use only the address part that
> > is 0, right?).  
> 
> No, it should be 1 for BAR1. It's 1 node per BAR.

It should be 1 node per BAR but in some cases it is not.

Indeed, in the LAN966x case, the pci-ep-bus need to have access to several
BARs and we have:
	...
	pci-ep-bus@0 {
		compatible = "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;

		/*
		 * map @0xe2000000 (32MB) to BAR0 (CPU)
		 * map @0xe0000000 (16MB) to BAR1 (AMBA)
		 */
		ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
		          0xe0000000 0x01 0x00 0x00 0x1000000>;
	...

Some devices under this bus need to use both BARs and use two regs values
in their reg properties to access BAR0 and BAR1.


> > > > > The assumption so far with all of this is that you have some specific
> > > > > PCI device (and therefore a driver). The simple-buses under it are
> > > > > defined per BAR. Not really certain if that makes sense in all cases,
> > > > > but since the address assignment is dynamic, it may have to. I'm also
> > > > > not completely convinced we should reuse 'simple-bus' here or define
> > > > > something specific like 'pci-bar-bus' or something.  
> > > >
> > > > Good point. Labeling a new bus for this kind of 'appliance' could be
> > > > beneficial to unify the dt overlay approach, and I guess it could be
> > > > adopted by the aforementioned Bootlin's Microchip patchset too.
> > > > However, since the difference with simple-bus would be basically non
> > > > existent, I believe that this could be done in a future patch due to
> > > > the fact that the dtbo is contained into the driver itself, so we do
> > > > not suffer from the proliferation that happens when dtb are managed
> > > > outside.  
> > >
> > > It's an ABI, so we really need to decide first.  
> >
> > Okay. How should we proceed?  
> 
> I think simple-bus where you have it is fine. It is really 1 level up
> that needs to be specified. Basically something that's referenced from
> the specific PCI device's schema (e.g. the RP1 schema (which you are
> missing)).
> 
> That schema needs to roughly look like this:
> 
> properties:
>   "#address-cells":
>     const: 3
>   "#size-cells":
>     const: 2
>   ranges:
>     minItems: 1
>     maxItems: 6
>     items:
>       additionalItems: true
>       items:
>         - maximum: 5  # The BAR number
>         - const: 0
>         - const: 0
>         - # TODO: valid PCI memory flags
> 
> patternProperties:
>   "^bar-bus@[0-5]$":
>     type: object
>     additionalProperties: true
>     properties:
>       compatible:
>         const: simple-bus
>       ranges: true
> 

IMHO, the node should not have 'bar' in the name.
In the LAN966x PCI use case, multiple BARs have to be accessed by devices
under this simple-bus. That's why I choose pci-ep-bus for this node name.

Best regards,
Hervé
Andrea della Porta Sept. 3, 2024, 9:33 a.m. UTC | #9
Hi Herve,

On 11:09 Tue 03 Sep     , Herve Codina wrote:
> Hi,
> 
> On Fri, 30 Aug 2024 14:37:54 -0500
> Rob Herring <robh@kernel.org> wrote:
> 
> ...
> 
> > > this view is much like Bootlin's approach, also my pci-ep-bus node now would look
> > > like this:
> > >  ...
> > >  pci-ep-bus@0 {
> > >         ranges = <0xc0 0x40000000
> > >                   0x01 0x00 0x00000000
> > >                   0x00 0x00400000>;
> > >         ...
> > >  };
> > >
> > > and also the correct unit address here is 0 again, since the parent address in
> > > ranges is 0x01 0x00 0x00000000 (0x01 is the flags and in this case represent
> > > BAR1, I assume that for the unit address I should use only the address part that
> > > is 0, right?).  
> > 
> > No, it should be 1 for BAR1. It's 1 node per BAR.
> 
> It should be 1 node per BAR but in some cases it is not.
> 
> Indeed, in the LAN966x case, the pci-ep-bus need to have access to several
> BARs and we have:

I second this, on RP1 there are multiple BARs too, but for this minimal
implementation we need only one. Splitting them in one bus per BAR or
merging them with multiple ranges entries depend on whether the peripherals
can access different BARs simultaneously. Besides this contraint, I would
say both approach are viable.

> 	...
> 	pci-ep-bus@0 {
> 		compatible = "simple-bus";
> 		#address-cells = <1>;
> 		#size-cells = <1>;
> 
> 		/*
> 		 * map @0xe2000000 (32MB) to BAR0 (CPU)
> 		 * map @0xe0000000 (16MB) to BAR1 (AMBA)
> 		 */
> 		ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
> 		          0xe0000000 0x01 0x00 0x00 0x1000000>;
> 	...
> 
> Some devices under this bus need to use both BARs and use two regs values
> in their reg properties to access BAR0 and BAR1.
> 
> 
> > > > > > The assumption so far with all of this is that you have some specific
> > > > > > PCI device (and therefore a driver). The simple-buses under it are
> > > > > > defined per BAR. Not really certain if that makes sense in all cases,
> > > > > > but since the address assignment is dynamic, it may have to. I'm also
> > > > > > not completely convinced we should reuse 'simple-bus' here or define
> > > > > > something specific like 'pci-bar-bus' or something.  
> > > > >
> > > > > Good point. Labeling a new bus for this kind of 'appliance' could be
> > > > > beneficial to unify the dt overlay approach, and I guess it could be
> > > > > adopted by the aforementioned Bootlin's Microchip patchset too.
> > > > > However, since the difference with simple-bus would be basically non
> > > > > existent, I believe that this could be done in a future patch due to
> > > > > the fact that the dtbo is contained into the driver itself, so we do
> > > > > not suffer from the proliferation that happens when dtb are managed
> > > > > outside.  
> > > >
> > > > It's an ABI, so we really need to decide first.  
> > >
> > > Okay. How should we proceed?  
> > 
> > I think simple-bus where you have it is fine. It is really 1 level up
> > that needs to be specified. Basically something that's referenced from
> > the specific PCI device's schema (e.g. the RP1 schema (which you are
> > missing)).
> > 
> > That schema needs to roughly look like this:
> > 
> > properties:
> >   "#address-cells":
> >     const: 3
> >   "#size-cells":
> >     const: 2
> >   ranges:
> >     minItems: 1
> >     maxItems: 6
> >     items:
> >       additionalItems: true
> >       items:
> >         - maximum: 5  # The BAR number
> >         - const: 0
> >         - const: 0
> >         - # TODO: valid PCI memory flags
> > 
> > patternProperties:
> >   "^bar-bus@[0-5]$":
> >     type: object
> >     additionalProperties: true
> >     properties:
> >       compatible:
> >         const: simple-bus
> >       ranges: true
> > 
> 
> IMHO, the node should not have 'bar' in the name.
> In the LAN966x PCI use case, multiple BARs have to be accessed by devices
> under this simple-bus. That's why I choose pci-ep-bus for this node name.
>

Agreed for your scenario. Anyway, since the dtbo and driver are shipped together,
we are free to change the name anytime without impacting anything.

Many thanks,
Andrea
 
> Best regards,
> Hervé
Andrea della Porta Sept. 3, 2024, 4:15 p.m. UTC | #10
Hi Rob,

On 14:37 Fri 30 Aug     , Rob Herring wrote:
> On Thu, Aug 29, 2024 at 11:26 AM Andrea della Porta
> <andrea.porta@suse.com> wrote:
> >
> > Hi Rob,
> >

...

> 
> I think simple-bus where you have it is fine. It is really 1 level up
> that needs to be specified. Basically something that's referenced from
> the specific PCI device's schema (e.g. the RP1 schema (which you are
> missing)).
> 
> That schema needs to roughly look like this:
> 
> properties:
>   "#address-cells":
>     const: 3
>   "#size-cells":
>     const: 2
>   ranges:
>     minItems: 1
>     maxItems: 6
>     items:
>       additionalItems: true
>       items:
>         - maximum: 5  # The BAR number
>         - const: 0
>         - const: 0
>         - # TODO: valid PCI memory flags
> 
> patternProperties:
>   "^bar-bus@[0-5]$":
>     type: object
>     additionalProperties: true
>     properties:
>       compatible:
>         const: simple-bus
>       ranges: true
>

Hmmm.. not sure how this is going to work. The PCI device (RP1) will
havei, at runtime, a compatible like this:

compatible = "pci1de4,1\0pciclass,0200000\0pciclass,0200";

that is basically generated automatically by the OF framework. So, in the
schema you proposed above, I can put something like:

properties:
  compatible:
    contains:
      pattern: '^pci1de4,1'

or maybe I could omit the compatible entirely, like in:

https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/pci/pci-iommu.yaml

that seems to refer to generic compatible values.
In both cases though, I don't see how these binding could work with
make dt_binding_check, since there's no compatible known at compile
time (for the first approach), or no compatible at all (the second
approach).
Is it intended only as a loose documentation?
Or are you proposing that for a future new bus (hence with a new, specific,
compatible) that could be described by the schema above?

Many thanks,
Andrea
 
> There were some discussions around interrupt handling that might also
> factor into this.
> 
> Rob
Rob Herring Sept. 3, 2024, 6:46 p.m. UTC | #11
On Tue, Sep 3, 2024 at 11:15 AM Andrea della Porta
<andrea.porta@suse.com> wrote:
>
> Hi Rob,
>
> On 14:37 Fri 30 Aug     , Rob Herring wrote:
> > On Thu, Aug 29, 2024 at 11:26 AM Andrea della Porta
> > <andrea.porta@suse.com> wrote:
> > >
> > > Hi Rob,
> > >
>
> ...
>
> >
> > I think simple-bus where you have it is fine. It is really 1 level up
> > that needs to be specified. Basically something that's referenced from
> > the specific PCI device's schema (e.g. the RP1 schema (which you are
> > missing)).
> >
> > That schema needs to roughly look like this:
> >
> > properties:
> >   "#address-cells":
> >     const: 3
> >   "#size-cells":
> >     const: 2
> >   ranges:
> >     minItems: 1
> >     maxItems: 6
> >     items:
> >       additionalItems: true
> >       items:
> >         - maximum: 5  # The BAR number
> >         - const: 0
> >         - const: 0
> >         - # TODO: valid PCI memory flags
> >
> > patternProperties:
> >   "^bar-bus@[0-5]$":
> >     type: object
> >     additionalProperties: true
> >     properties:
> >       compatible:
> >         const: simple-bus
> >       ranges: true
> >
>
> Hmmm.. not sure how this is going to work. The PCI device (RP1) will
> havei, at runtime, a compatible like this:
>
> compatible = "pci1de4,1\0pciclass,0200000\0pciclass,0200";
>
> that is basically generated automatically by the OF framework. So, in the
> schema you proposed above, I can put something like:
>
> properties:
>   compatible:
>     contains:
>       pattern: '^pci1de4,1'

No, it should be like this:

compatible:
  items:
    - const: pci1de4,1
    - const: pciclass,0200000
    - const: pciclass,0200

or

compatible:
  addtionalItems: true
  maxItems: 3
  items:
    - const: pci1de4,1


Alternatively, we could instead only generate 'pciclass' compatibles
for bridge nodes. The reason being that being an ethernet controller
doesn't really tell us anything. There's no standard interface
associated with that class.

> or maybe I could omit the compatible entirely, like in:

No.

> https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/pci/pci-iommu.yaml

That's not a device node, but just part of pci-host-bridge.yaml.

> that seems to refer to generic compatible values.
> In both cases though, I don't see how these binding could work with
> make dt_binding_check, since there's no compatible known at compile
> time (for the first approach), or no compatible at all (the second
> approach).
> Is it intended only as a loose documentation?

No, schemas define exactly what a binding can and can't contain. But
they are divided into device schemas and common schemas. The latter
are incomplete and are included by the former. Generally, "compatible"
goes in device schemas.

> Or are you proposing that for a future new bus (hence with a new, specific,
> compatible) that could be described by the schema above?

The above schema would be the common schema included by a RP1 schema,
LAN966x schema, or any other device doing the same thing.
Rob
Rob Herring Sept. 3, 2024, 6:55 p.m. UTC | #12
On Tue, Sep 3, 2024 at 4:33 AM Andrea della Porta <andrea.porta@suse.com> wrote:
>
> Hi Herve,
>
> On 11:09 Tue 03 Sep     , Herve Codina wrote:
> > Hi,
> >
> > On Fri, 30 Aug 2024 14:37:54 -0500
> > Rob Herring <robh@kernel.org> wrote:
> >
> > ...
> >
> > > > this view is much like Bootlin's approach, also my pci-ep-bus node now would look
> > > > like this:
> > > >  ...
> > > >  pci-ep-bus@0 {
> > > >         ranges = <0xc0 0x40000000
> > > >                   0x01 0x00 0x00000000
> > > >                   0x00 0x00400000>;
> > > >         ...
> > > >  };
> > > >
> > > > and also the correct unit address here is 0 again, since the parent address in
> > > > ranges is 0x01 0x00 0x00000000 (0x01 is the flags and in this case represent
> > > > BAR1, I assume that for the unit address I should use only the address part that
> > > > is 0, right?).
> > >
> > > No, it should be 1 for BAR1. It's 1 node per BAR.
> >
> > It should be 1 node per BAR but in some cases it is not.
> >
> > Indeed, in the LAN966x case, the pci-ep-bus need to have access to several
> > BARs and we have:

Might not be the last time I forget this...

> I second this, on RP1 there are multiple BARs too, but for this minimal
> implementation we need only one. Splitting them in one bus per BAR or
> merging them with multiple ranges entries depend on whether the peripherals
> can access different BARs simultaneously. Besides this contraint, I would
> say both approach are viable.

Okay. You all define what you need as you understand the devices better than me.


> >       ...
> >       pci-ep-bus@0 {
> >               compatible = "simple-bus";
> >               #address-cells = <1>;
> >               #size-cells = <1>;
> >
> >               /*
> >                * map @0xe2000000 (32MB) to BAR0 (CPU)
> >                * map @0xe0000000 (16MB) to BAR1 (AMBA)
> >                */
> >               ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
> >                         0xe0000000 0x01 0x00 0x00 0x1000000>;
> >       ...
> >
> > Some devices under this bus need to use both BARs and use two regs values
> > in their reg properties to access BAR0 and BAR1.
> >
> >
> > > > > > > The assumption so far with all of this is that you have some specific
> > > > > > > PCI device (and therefore a driver). The simple-buses under it are
> > > > > > > defined per BAR. Not really certain if that makes sense in all cases,
> > > > > > > but since the address assignment is dynamic, it may have to. I'm also
> > > > > > > not completely convinced we should reuse 'simple-bus' here or define
> > > > > > > something specific like 'pci-bar-bus' or something.
> > > > > >
> > > > > > Good point. Labeling a new bus for this kind of 'appliance' could be
> > > > > > beneficial to unify the dt overlay approach, and I guess it could be
> > > > > > adopted by the aforementioned Bootlin's Microchip patchset too.
> > > > > > However, since the difference with simple-bus would be basically non
> > > > > > existent, I believe that this could be done in a future patch due to
> > > > > > the fact that the dtbo is contained into the driver itself, so we do
> > > > > > not suffer from the proliferation that happens when dtb are managed
> > > > > > outside.
> > > > >
> > > > > It's an ABI, so we really need to decide first.
> > > >
> > > > Okay. How should we proceed?
> > >
> > > I think simple-bus where you have it is fine. It is really 1 level up
> > > that needs to be specified. Basically something that's referenced from
> > > the specific PCI device's schema (e.g. the RP1 schema (which you are
> > > missing)).
> > >
> > > That schema needs to roughly look like this:
> > >
> > > properties:
> > >   "#address-cells":
> > >     const: 3
> > >   "#size-cells":
> > >     const: 2
> > >   ranges:
> > >     minItems: 1
> > >     maxItems: 6
> > >     items:
> > >       additionalItems: true
> > >       items:
> > >         - maximum: 5  # The BAR number
> > >         - const: 0
> > >         - const: 0
> > >         - # TODO: valid PCI memory flags
> > >
> > > patternProperties:
> > >   "^bar-bus@[0-5]$":
> > >     type: object
> > >     additionalProperties: true
> > >     properties:
> > >       compatible:
> > >         const: simple-bus
> > >       ranges: true
> > >
> >
> > IMHO, the node should not have 'bar' in the name.
> > In the LAN966x PCI use case, multiple BARs have to be accessed by devices
> > under this simple-bus. That's why I choose pci-ep-bus for this node name.
> >
>
> Agreed for your scenario. Anyway, since the dtbo and driver are shipped together,
> we are free to change the name anytime without impacting anything.

Indeed, no one should care what the nodename is. However, that doesn't
mean you don't have to define something. Really, just 'bus' should be
enough as node names should only be what class a node is. If it is not
enough, then really you need some sort of compatible to identify the
kind of 'bus'. If you want 'pci-ep-bus', then that's fine.


Rob
Andrea della Porta Sept. 4, 2024, 8:33 a.m. UTC | #13
Hi Rob,

On 13:46 Tue 03 Sep     , Rob Herring wrote:
> On Tue, Sep 3, 2024 at 11:15 AM Andrea della Porta
> <andrea.porta@suse.com> wrote:
> >
> > Hi Rob,
> >
> > On 14:37 Fri 30 Aug     , Rob Herring wrote:
> > > On Thu, Aug 29, 2024 at 11:26 AM Andrea della Porta
> > > <andrea.porta@suse.com> wrote:
> > > >
> > > > Hi Rob,
> > > >
> >
> > ...
> >
> > >
> > > I think simple-bus where you have it is fine. It is really 1 level up
> > > that needs to be specified. Basically something that's referenced from
> > > the specific PCI device's schema (e.g. the RP1 schema (which you are
> > > missing)).
> > >
> > > That schema needs to roughly look like this:
> > >
> > > properties:
> > >   "#address-cells":
> > >     const: 3
> > >   "#size-cells":
> > >     const: 2
> > >   ranges:
> > >     minItems: 1
> > >     maxItems: 6
> > >     items:
> > >       additionalItems: true
> > >       items:
> > >         - maximum: 5  # The BAR number
> > >         - const: 0
> > >         - const: 0
> > >         - # TODO: valid PCI memory flags
> > >
> > > patternProperties:
> > >   "^bar-bus@[0-5]$":
> > >     type: object
> > >     additionalProperties: true
> > >     properties:
> > >       compatible:
> > >         const: simple-bus
> > >       ranges: true
> > >
> >
> > Hmmm.. not sure how this is going to work. The PCI device (RP1) will
> > havei, at runtime, a compatible like this:
> >
> > compatible = "pci1de4,1\0pciclass,0200000\0pciclass,0200";
> >
> > that is basically generated automatically by the OF framework. So, in the
> > schema you proposed above, I can put something like:
> >
> > properties:
> >   compatible:
> >     contains:
> >       pattern: '^pci1de4,1'
> 
> No, it should be like this:
> 
> compatible:
>   items:
>     - const: pci1de4,1
>     - const: pciclass,0200000
>     - const: pciclass,0200
> 
> or
> 
> compatible:
>   addtionalItems: true
>   maxItems: 3
>   items:
>     - const: pci1de4,1
>

Ack.
 
> 
> Alternatively, we could instead only generate 'pciclass' compatibles
> for bridge nodes. The reason being that being an ethernet controller
> doesn't really tell us anything. There's no standard interface
> associated with that class.

I'd avoid this one, since the class is not representative in this case. RP1
is an MFD and not an Ethernet controller. Also, it would prevent other similar
PCI devices with differnt class from using this schema.

> 
> > or maybe I could omit the compatible entirely, like in:
> 
> No.
> 
> > https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/pci/pci-iommu.yaml
> 
> That's not a device node, but just part of pci-host-bridge.yaml.
> 
> > that seems to refer to generic compatible values.
> > In both cases though, I don't see how these binding could work with
> > make dt_binding_check, since there's no compatible known at compile
> > time (for the first approach), or no compatible at all (the second
> > approach).
> > Is it intended only as a loose documentation?
> 
> No, schemas define exactly what a binding can and can't contain. But
> they are divided into device schemas and common schemas. The latter
> are incomplete and are included by the former. Generally, "compatible"
> goes in device schemas.

Ack.

> 
> > Or are you proposing that for a future new bus (hence with a new, specific,
> > compatible) that could be described by the schema above?
> 
> The above schema would be the common schema included by a RP1 schema,
> LAN966x schema, or any other device doing the same thing.

Many thanks, I believe I've got it now :)

Cheers,
Andrea

> Rob
diff mbox series

Patch

diff --git a/drivers/of/address.c b/drivers/of/address.c
index d669ce25b5f9..5a6d55a67aa8 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -443,7 +443,8 @@  static int of_translate_one(struct device_node *parent, struct of_bus *bus,
 	}
 	if (ranges == NULL || rlen == 0) {
 		offset = of_read_number(addr, na);
-		memset(addr, 0, pna * 4);
+		/* copy the address while preserving the flags */
+		memset(addr + pbus->flag_cells, 0, (pna - pbus->flag_cells) * 4);
 		pr_debug("empty ranges; 1:1 translation\n");
 		goto finish;
 	}