Message ID: 20131127172806.GC2291@e103592.cambridge.arm.com (mailing list archive)
State: New, archived
On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> Hi all,
>
> SoC architectures are getting increasingly complex in ways that are not
> transparent to software.
>
> A particular emerging issue is that of multi-master SoCs, which may have
> different address views, IOMMUs, and coherency behaviour from one master
> to the next.
>
> DT can't describe multi-master systems today except for PCI DMA and
> similar. This comes with constraints and assumptions that won't work
> for emerging SoC bus architectures. On-SoC, a device's interface to the
> system can't be described in terms of a single interface to a single
> "bus".
>
> Different masters may have different views of the system too. Software
> needs to understand the true topology in order to do address mapping,
> coherency management etc., in any generic way.
>
> One piece of the puzzle is to define how to describe these topologies in
> DT.
>
> The other is how to get the right abstractions in the kernel to drive
> these systems in a generic way.
>
> The following proposal (originally from Will) begins to address the DT
> part.
>
> Comments encouraged -- I anticipate it may take some discussion to
> reach a consensus here.
>
> Cheers
> ---Dave
>
>
> >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> Date: Wed, 20 Nov 2013 12:06:13 +0000
> Subject: [PATCH RFC v2] Documentation: devicetree: add description for generic bus properties
>
> This patch documents properties that can be used as part of bus and
> device bindings in order to describe their linkages within the system
> topology.
>
> Use of these properties allows topological parsing to occur in generic
> library code, making it easier for bus drivers to parse information
> regarding their upstream masters and potentially allows us to treat
> the slave and master interfaces separately for a given device.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>
> A number of discussion points remain to be resolved:
>
>   - Use of the ranges property and describing slave vs master bus
>     address ranges. In the latter case, we actually want to describe our
>     address space with respect to the bus on which the bus masters,
>     rather than the parent. This could potentially be achieved by adding
>     properties such as dma-parent and dma-ranges (already used by PPC?)
>
>   - Describing masters that master through multiple different buses
>
>   - How on Earth this fits in with the Linux device model (it doesn't)

How does this _not_ fit into the Linux device model? What am I missing
here that precludes the use of the "driver/device/bus" model we have
today?

>   - Interaction with IOMMU bindings (currently under discussion)
>
> Cheers,
>
> Will
>
>  .../devicetree/bindings/arm/coherent-bus.txt | 110 +++++++++++++++++++++

Why "arm"?

What makes it ARM specific?

thanks,

greg k-h
Hi Greg,

On Wed, Nov 27, 2013 at 11:06:50PM +0000, Greg KH wrote:
> On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> > >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> > A number of discussion points remain to be resolved:
> >
> >   - Use of the ranges property and describing slave vs master bus
> >     address ranges. In the latter case, we actually want to describe our
> >     address space with respect to the bus on which the bus masters,
> >     rather than the parent. This could potentially be achieved by adding
> >     properties such as dma-parent and dma-ranges (already used by PPC?)
> >
> >   - Describing masters that master through multiple different buses
> >
> >   - How on Earth this fits in with the Linux device model (it doesn't)
>
> How does this _not_ fit into the Linux device model? What am I missing
> here that precludes the use of the "driver/device/bus" model we have
> today?

The main problem is that we have devices which slave on one bus and master
on another. That then complicates probing, power-management, IOMMU
configuration, address mapping (e.g. I walk the slave buses to figure out
where the slave registers live, but then I need a way to work out where
exactly I master on a different bus) and dynamic coherency, amongst other
things.

If we try to use the current infrastructure then we end up with one bus per
device, which usually ends up being a fake bus representing both the slave
and master buses (which is how the platform bus gets abused) and then device
drivers having their own idea of the system topology where it's required.
This is fairly horrible and doesn't work for anything other than the trivial
case, where one or both of the buses are `dumb' and don't require any work
from Linux.

> >  .../devicetree/bindings/arm/coherent-bus.txt | 110 +++++++++++++++++++++
>
> Why "arm"?
>
> What makes it ARM specific?

This is just an RFC, so I'd be happy to put the binding somewhere more
broad. I'm not sure how much of an issue this is outside of the SoC space,
though.

Will
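[Editorial note: to make the slave-vs-master split Will describes concrete, here is a minimal, hypothetical device tree sketch of such a device: its registers slave on a peripheral bus, while its DMA traffic masters through a separate interconnect. `ranges` and `dma-ranges` are standard properties; `dma-parent` is only the speculative property floated in the patch notes above, not an accepted binding, and all node names and addresses are invented for illustration.]

```
/* Hypothetical example: a DMA engine whose register (slave) interface
 * hangs off a peripheral bus, but whose master interface goes through
 * a separate system interconnect. "dma-parent" is the property
 * speculated about in the RFC, not an existing binding.
 */
periph_bus: bus@10000000 {
	compatible = "simple-bus";
	#address-cells = <1>;
	#size-cells = <1>;
	ranges = <0x0 0x10000000 0x100000>;	/* slave-side (CPU) view */

	dma: dma-controller@40000 {
		reg = <0x40000 0x1000>;		/* registers, via the slave port */
		dma-parent = <&sys_bus>;	/* speculative: where it masters */
	};
};

sys_bus: interconnect {
	/* The master-side address translation would be described here,
	 * e.g. with dma-ranges relative to this bus rather than to the
	 * device's parent. */
	dma-ranges = <0x0 0x80000000 0x40000000>;
};
```

The point of the sketch is that walking the `reg`/`ranges` tree alone tells you nothing about where the device's DMA lands -- exactly the gap the thread is discussing.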
[Resending -- apologies for any duplicates received. Real reply below.

My lame excuse:

It turns out that Mutt's decode-copy command (Esc-C) will lose most
headers unless you invoke it from the message viewer *and* you have full
header display enabled at the time. Otherwise, or if invoked from the
index, many headers may disappear, including headers not filtered by
normal header weeding, and everything needed for threading.
copy-message (C) always saves full headers though. Maybe it's a bug.
Or not. Go figure.

This + my habit of saving messages I want to reply to in a separate mbox
+ mindlessly using decode-save instead of save-message even when I'm not
going to run git am on the result = facepalm.]

On Thu, Nov 28, 2013 at 10:28:45AM +0000, Will Deacon wrote:
> Hi Greg,
>
> On Wed, Nov 27, 2013 at 11:06:50PM +0000, Greg KH wrote:
> > On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> > > >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> > > A number of discussion points remain to be resolved:
> > >
> > >   - Use of the ranges property and describing slave vs master bus
> > >     address ranges. In the latter case, we actually want to describe our
> > >     address space with respect to the bus on which the bus masters,
> > >     rather than the parent. This could potentially be achieved by adding
> > >     properties such as dma-parent and dma-ranges (already used by PPC?)
> > >
> > >   - Describing masters that master through multiple different buses
> > >
> > >   - How on Earth this fits in with the Linux device model (it doesn't)
> >
> > How does this _not_ fit into the Linux device model? What am I missing
> > here that precludes the use of the "driver/device/bus" model we have
> > today?

The physical-sockets history of buses like PCI tends to force a simple
tree-like topology as a natural consequence. You also end up with
closely linked topologies for power, clocks, interrupts etc., because
those all pass through the same sockets, so it's difficult to have a
different structure.

On SoC, those constraints have never existed and are not followed. A
device's interface to the system is almost always split into multiple
connections, not covered by a single design or standard. The problem
now is that increasing complexity means that the sometimes bizarre
topology features of SoCs are becoming less and less transparent for
software.

The device model currently seems to assume that certain things (power,
DMA and MMIO accessibility) follow the tree (which may not work for many
SoCs), and some other things (clocks, regulators, interrupts etc.) are
not incorporated at all -- making them independent, but it may make some
abstractions impossible today.

How much this matters for actual systems is hard to foresee yet, since
not _all_ possible insanities find their way into silicon. The
onus should certainly be on us (i.e., the ARM/SoC community) to
demonstrate if the device model needs to change, and to find practical
ways to change it that minimise the resulting churn.

> The main problem is that we have devices which slave on one bus and master
> on another. That then complicates probing, power-management, IOMMU
> configuration, address mapping (e.g. I walk the slave buses to figure out
> where the slave registers live, but then I need a way to work out where
> exactly I master on a different bus) and dynamic coherency, amongst other
> things.
>
> If we try to use the current infrastructure then we end up with one bus per
> device, which usually ends up being a fake bus representing both the slave
> and master buses (which is how the platform bus gets abused) and then device
> drivers having their own idea of the system topology where it's required.
> This is fairly horrible and doesn't work for anything other than the trivial
> case, where one or both of the buses are `dumb' and don't require any work
> from Linux.

If we can come up with some generic bus type that is just a container for
a load of hooks that know how to deal with various aspects of each device's
interface to the system, on a per-device basis, that may be a start.

The platform bus kinda serves that role, but the trouble with that is that
it doesn't encourage any abstraction at all. In the face of increasing
complexity, abstraction is desperately needed.

> > >  .../devicetree/bindings/arm/coherent-bus.txt | 110 +++++++++++++++++++++
> >
> > Why "arm"?
> >
> > What makes it ARM specific?
>
> This is just an RFC, so I'd be happy to put the binding somewhere more
> broad. I'm not sure how much of an issue this is outside of the SoC space,
> though.

I think that the ARM community are the ones who care the most today, so
are likely to make the most noise about it. The binding is entirely
generic in concept, so we should certainly push for it to be
non-ARM-specific. Non-ARM SoCs will likely need to solve this problem
too at some point.

Cheers
---Dave
On Thu, Nov 28, 2013 at 10:28:45AM +0000, Will Deacon wrote:
> Hi Greg,
>
> On Wed, Nov 27, 2013 at 11:06:50PM +0000, Greg KH wrote:
> > On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> > > >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> > > A number of discussion points remain to be resolved:
> > >
> > >   - Use of the ranges property and describing slave vs master bus
> > >     address ranges. In the latter case, we actually want to describe our
> > >     address space with respect to the bus on which the bus masters,
> > >     rather than the parent. This could potentially be achieved by adding
> > >     properties such as dma-parent and dma-ranges (already used by PPC?)
> > >
> > >   - Describing masters that master through multiple different buses
> > >
> > >   - How on Earth this fits in with the Linux device model (it doesn't)
> >
> > How does this _not_ fit into the Linux device model? What am I missing
> > here that precludes the use of the "driver/device/bus" model we have
> > today?
>
> The main problem is that we have devices which slave on one bus and master
> on another. That then complicates probing, power-management, IOMMU
> configuration, address mapping (e.g. I walk the slave buses to figure out
> where the slave registers live, but then I need a way to work out where
> exactly I master on a different bus) and dynamic coherency, amongst other
> things.
>
> If we try to use the current infrastructure then we end up with one bus per
> device, which usually ends up being a fake bus representing both the slave
> and master buses (which is how the platform bus gets abused) and then device
> drivers having their own idea of the system topology where it's required.
> This is fairly horrible and doesn't work for anything other than the trivial
> case, where one or both of the buses are `dumb' and don't require any work
> from Linux.

Then just put everything on a single "bus", there's nothing in the
driver core that requires a bus to work in a specific way.

> > >  .../devicetree/bindings/arm/coherent-bus.txt | 110 +++++++++++++++++++++
> >
> > Why "arm"?
> >
> > What makes it ARM specific?
>
> This is just an RFC, so I'd be happy to put the binding somewhere more
> broad. I'm not sure how much of an issue this is outside of the SoC space,
> though.

There aren't "SoC"s on other architectures? :)

thanks,

greg k-h
On Thu, Nov 28, 2013 at 05:33:39PM +0000, Dave Martin wrote:
> On Thu, Nov 28, 2013 at 10:28:45AM +0000, Will Deacon wrote:
> > Hi Greg,
> >
> > On Wed, Nov 27, 2013 at 11:06:50PM +0000, Greg KH wrote:
> > > On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> > > > >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> > > > A number of discussion points remain to be resolved:
> > > >
> > > >   - Use of the ranges property and describing slave vs master bus
> > > >     address ranges. In the latter case, we actually want to describe our
> > > >     address space with respect to the bus on which the bus masters,
> > > >     rather than the parent. This could potentially be achieved by adding
> > > >     properties such as dma-parent and dma-ranges (already used by PPC?)
> > > >
> > > >   - Describing masters that master through multiple different buses
> > > >
> > > >   - How on Earth this fits in with the Linux device model (it doesn't)
> > >
> > > How does this _not_ fit into the Linux device model? What am I missing
> > > here that precludes the use of the "driver/device/bus" model we have
> > > today?
>
> The physical-sockets history of buses like PCI tends to force a simple
> tree-like topology as a natural consequence. You also end up with
> closely linked topologies for power, clocks, interrupts etc., because
> those all pass through the same sockets, so it's difficult to have a
> different structure.

There's nothing in the driver core that enforces such a topology.

> On SoC, those constraints have never existed and are not followed. A
> device's interface to the system is almost always split into multiple
> connections, not covered by a single design or standard. The problem
> now is that increasing complexity means that the sometimes bizarre
> topology features of SoCs are becoming less and less transparent for
> software.
>
> The device model currently seems to assume that certain things (power,
> DMA and MMIO accessibility) follow the tree (which may not work for many
> SoCs), and some other things (clocks, regulators, interrupts etc.) are
> not incorporated at all -- making them independent, but it may make some
> abstractions impossible today.
>
> How much this matters for actual systems is hard to foresee yet, since
> not _all_ possible insanities find their way into silicon. The
> onus should certainly be on us (i.e., the ARM/SoC community) to
> demonstrate if the device model needs to change, and to find practical
> ways to change it that minimise the resulting churn.

Yes it is, you all are the ones tasked with implementing the crazy crap
the hardware people have created, best of luck with that :)

> > The main problem is that we have devices which slave on one bus and master
> > on another. That then complicates probing, power-management, IOMMU
> > configuration, address mapping (e.g. I walk the slave buses to figure out
> > where the slave registers live, but then I need a way to work out where
> > exactly I master on a different bus) and dynamic coherency, amongst other
> > things.
> >
> > If we try to use the current infrastructure then we end up with one bus per
> > device, which usually ends up being a fake bus representing both the slave
> > and master buses (which is how the platform bus gets abused) and then device
> > drivers having their own idea of the system topology where it's required.
> >
> > This is fairly horrible and doesn't work for anything other than the trivial
> > case, where one or both of the buses are `dumb' and don't require any work
> > from Linux.
>
> If we can come up with some generic bus type that is just a container for
> a load of hooks that know how to deal with various aspects of each device's
> interface to the system, on a per-device basis, that may be a start.
>
> The platform bus kinda serves that role, but the trouble with that is that
> it doesn't encourage any abstraction at all. In the face of increasing
> complexity, abstraction is desperately needed.

Then create a different abstraction, the normal solution to any problem
in programming :)

thanks,

greg k-h
On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote:
> On Thu, Nov 28, 2013 at 05:33:39PM +0000, Dave Martin wrote:
> > On Thu, Nov 28, 2013 at 10:28:45AM +0000, Will Deacon wrote:
> > > Hi Greg,
> > >
> > > On Wed, Nov 27, 2013 at 11:06:50PM +0000, Greg KH wrote:
> > > > On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> > > > > >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> > > > > A number of discussion points remain to be resolved:
> > > > >
> > > > >   - Use of the ranges property and describing slave vs master bus
> > > > >     address ranges. In the latter case, we actually want to describe our
> > > > >     address space with respect to the bus on which the bus masters,
> > > > >     rather than the parent. This could potentially be achieved by adding
> > > > >     properties such as dma-parent and dma-ranges (already used by PPC?)
> > > > >
> > > > >   - Describing masters that master through multiple different buses
> > > > >
> > > > >   - How on Earth this fits in with the Linux device model (it doesn't)
> > > >
> > > > How does this _not_ fit into the Linux device model? What am I missing
> > > > here that precludes the use of the "driver/device/bus" model we have
> > > > today?
> >
> > The physical-sockets history of buses like PCI tends to force a simple
> > tree-like topology as a natural consequence. You also end up with
> > closely linked topologies for power, clocks, interrupts etc., because
> > those all pass through the same sockets, so it's difficult to have a
> > different structure.
>
> There's nothing in the driver core that enforces such a topology.

Maybe not ... I have to wrap my head around that stuff a bit more.

> > On SoC, those constraints have never existed and are not followed. A
> > device's interface to the system is almost always split into multiple
> > connections, not covered by a single design or standard. The problem
> > now is that increasing complexity means that the sometimes bizarre
> > topology features of SoCs are becoming less and less transparent for
> > software.
> >
> > The device model currently seems to assume that certain things (power,
> > DMA and MMIO accessibility) follow the tree (which may not work for many
> > SoCs), and some other things (clocks, regulators, interrupts etc.) are
> > not incorporated at all -- making them independent, but it may make some
> > abstractions impossible today.
> >
> > How much this matters for actual systems is hard to foresee yet, since
> > not _all_ possible insanities find their way into silicon. The
> > onus should certainly be on us (i.e., the ARM/SoC community) to
> > demonstrate if the device model needs to change, and to find practical
> > ways to change it that minimise the resulting churn.
>
> Yes it is, you all are the ones tasked with implementing the crazy crap
> the hardware people have created, best of luck with that :)

Agreed. The first assumption should be that we can fit in with the
existing device model -- we should only reconsider if we find that
to be impossible.

> > > The main problem is that we have devices which slave on one bus and master
> > > on another. That then complicates probing, power-management, IOMMU
> > > configuration, address mapping (e.g. I walk the slave buses to figure out
> > > where the slave registers live, but then I need a way to work out where
> > > exactly I master on a different bus) and dynamic coherency, amongst other
> > > things.
> > >
> > > If we try to use the current infrastructure then we end up with one bus per
> > > device, which usually ends up being a fake bus representing both the slave
> > > and master buses (which is how the platform bus gets abused) and then device
> > > drivers having their own idea of the system topology where it's required.
> > >
> > > This is fairly horrible and doesn't work for anything other than the trivial
> > > case, where one or both of the buses are `dumb' and don't require any work
> > > from Linux.
> >
> > If we can come up with some generic bus type that is just a container for
> > a load of hooks that know how to deal with various aspects of each device's
> > interface to the system, on a per-device basis, that may be a start.
> >
> > The platform bus kinda serves that role, but the trouble with that is that
> > it doesn't encourage any abstraction at all. In the face of increasing
> > complexity, abstraction is desperately needed.
>
> Then create a different abstraction, the normal solution to any problem
> in programming :)

That's certainly the first step. It might end up looking a lot like a
kludge layer which duplicates core functionality -- if so, we should
then consider whether there is a better way, but we shouldn't judge it
prematurely.

It would be great to get some comments [hint to everyone] on the
proposed DT binding so that we can start to explore this properly.

Cheers
---Dave
On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
[...]
> From will.deacon@arm.com Wed Nov 20 12:06:22 2013
[...]
> A number of discussion points remain to be resolved:
>
>   - Use of the ranges property and describing slave vs master bus
>     address ranges. In the latter case, we actually want to describe our
>     address space with respect to the bus on which the bus masters,
>     rather than the parent. This could potentially be achieved by adding
>     properties such as dma-parent and dma-ranges (already used by PPC?)
>
>   - Describing masters that master through multiple different buses
>
>   - How on Earth this fits in with the Linux device model (it doesn't)
>
>   - Interaction with IOMMU bindings (currently under discussion)

This is all very vague. Perhaps everyone else knows what this is all
about, in which case it'd be great if somebody could clue me in. In
particular I'm not sure what exact problem this solves. Perhaps a
somewhat more concrete example would help. Or perhaps pointers to
documentation that can help fill in the gaps.

>  .../devicetree/bindings/arm/coherent-bus.txt | 110 +++++++++++++++++++++
>  1 file changed, 110 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/coherent-bus.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/coherent-bus.txt b/Documentation/devicetree/bindings/arm/coherent-bus.txt
> new file mode 100644
> index 000000000000..e3fbc2e491c7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/coherent-bus.txt
> @@ -0,0 +1,110 @@
> +* Generic binding to describe a coherent bus
> +
> +In some systems, devices (peripherals and/or CPUs) do not share
> +coherent views of memory, while on other systems sets of devices may
> +share a coherent view of memory depending on the static bus topology
> +and/or dynamic configuration of both the bus and device. Establishing
> +such dynamic configurations requires appropriate topological information
> +to be communicated to the operating system.
> +
> +This binding document attempts to define a set of generic properties
> +which can be used to encode topological information in bus and device
> +nodes.
> +
> +
> +* Terminology
> +
> + - Port        : An interface over which memory transactions
> +                 can propagate. A port may act as a master,
> +                 slave or both (see below).
> +
> + - Master port : A port capable of issuing memory transactions
> +                 to a slave. For example, a port connecting a
> +                 DMA controller to main memory.
> +
> + - Slave port  : A port capable of responding to memory
> +                 transactions received by a master. For
> +                 example, a port connecting the control
> +                 registers of an MMIO device to a peripheral
> +                 bus.

"Port" sounds awfully generic. Other bindings (such as those for V4L2,
aka media) use ports for something completely different. Perhaps we can
come up with a more specific term that matches the use-case better? What
exactly does this map to in hardware?

Thierry
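[Editorial note: for readers trying to picture what the terminology section is driving at, a device node using this style of binding might look roughly like the following. The property names (`slave-ports`, `master-ports`) and all node names are invented purely for illustration; the RFC text quoted above defines the terminology but the property encoding is not shown in this excerpt.]

```
/* Illustrative only: property names are made up, not part of the RFC
 * excerpt quoted above. The point is that one device can expose a
 * slave port (control registers) and a master port (DMA) on two
 * different buses, and each needs describing separately.
 */
gpu: gpu@50000000 {
	reg = <0x50000000 0x10000>;	/* reached via its slave port */
	slave-ports = <&apb>;		/* hypothetical: where it is programmed */
	master-ports = <&axi_hp0>;	/* hypothetical: where its DMA issues */
};
```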
On Thu, Nov 28, 2013 at 09:33:23PM +0100, Thierry Reding wrote:
> >   - Describing masters that master through multiple different buses
> >
> >   - How on Earth this fits in with the Linux device model (it doesn't)
> >
> >   - Interaction with IOMMU bindings (currently under discussion)
>
> This is all very vague. Perhaps everyone else knows what this is all
> about, in which case it'd be great if somebody could clue me in.

It looks like an approach to describe an AXI physical bus topology in
DT..

AFAIK the issue is that the AXI toolkit ARM provides encourages a
single IP block to have several AXI ports - control, DMA, high speed
MMIO, for instance. Each of those ports is hooked up into an AXI bus
DAG that has little to do with the CPU address map.

Contrasted with something like PCI, where each IP has exactly one bus
port into the system, so the MMIO register access address range
directly implies the bus master DMA path.

To my mind, a sensible modeling would be to have the DT tree represent
the AXI DAG flattened into a tree rooted at the CPU vertex. Separately
in the DT would be the full AXI DAG represented with phandle
connections.

Nodes in the DT would use phandles to indicate their connections into
the AXI DAG.

Hugely roughly:

soc
{
	ranges = <Some quasi-real ranges indicating IP to CPU mapping>;
	ip_block
	{
		reg = <...>
		axi-ports = <mmio = &axi_low_speed_port0, dma = &axi_dma_port1, .. >;
	}
}

axi
{
	/* Describe a DAG of AXI connections here. */
	cpu { downstream = &axi_switch; }
	axi_switch { downstream = &memory, &low_speed; }
	memory { }
	dma { downstream = &memory; }
	low_speed { }
}

I was just reading the Zynq manual which gives a pretty good
description of what one vendor did using the ARM AXI toolkits..

http://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf
Figure 5-1 pg 122

You can see it is a complex DAG of AXI busses. For instance if you
want to master from a 'High Performance Port M0' to 'On Chip RAM' you
follow the path AXI_HP[M0] -> Switch1[M2] -> OCM.

But you can't master from 'High Performance Port M0' to internal
slaves, as there is no routing path.

Each switch block is an opportunity for the designer to provide
address remapping/IOMMU hardware that needs configuring :)

Which is why I think encoding the AXI DAG directly in DT is probably
the most future proof way to model this stuff - it sticks close to the
tools ARM provides to the SoC designers, so it is very likely to be
able to model arbitrary SoC designs.

Regards,
Jason
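[Editorial note: Jason's Zynq routing example can be spelled out in the same hypothetical `downstream` notation as his sketch. None of these node or property names are an accepted binding, and the topology is only the fragment of the Zynq DAG that the path he describes passes through, not the full Figure 5-1.]

```
/* Hypothetical notation, following the "downstream" sketch above.
 * AXI_HP[M0] can reach the on-chip RAM via switch 1, but has no edge
 * towards the internal peripheral slaves, so that mastering path
 * simply does not exist in the DAG.
 */
axi {
	axi_hp_m0: port-hp-m0 {
		downstream = <&switch1>;
	};
	switch1: switch-1 {
		downstream = <&ocm>;		/* via master interface M2 */
	};
	ocm: on-chip-ram { };

	internal_slaves: peripheral-slaves { };	/* no edge from &axi_hp_m0 */
};
```

Encoded this way, "can X master to Y?" becomes a reachability question over the phandle graph, which is the property the thread argues software needs.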
On Thu, Nov 28, 2013 at 07:39:17PM +0000, Dave Martin wrote:
> On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote:
> > On Thu, Nov 28, 2013 at 05:33:39PM +0000, Dave Martin wrote:
> > > On Thu, Nov 28, 2013 at 10:28:45AM +0000, Will Deacon wrote:
> > > > Hi Greg,
> > > >
> > > > On Wed, Nov 27, 2013 at 11:06:50PM +0000, Greg KH wrote:
> > > > > On Wed, Nov 27, 2013 at 05:28:06PM +0000, Dave Martin wrote:
> > > > > > >From will.deacon@arm.com Wed Nov 20 12:06:22 2013
> > > > > > A number of discussion points remain to be resolved:
> > > > > >
> > > > > >   - Use of the ranges property and describing slave vs master bus
> > > > > >     address ranges. In the latter case, we actually want to describe our
> > > > > >     address space with respect to the bus on which the bus masters,
> > > > > >     rather than the parent. This could potentially be achieved by adding
> > > > > >     properties such as dma-parent and dma-ranges (already used by PPC?)
> > > > > >
> > > > > >   - Describing masters that master through multiple different buses
> > > > > >
> > > > > >   - How on Earth this fits in with the Linux device model (it doesn't)
> > > > >
> > > > > How does this _not_ fit into the Linux device model? What am I missing
> > > > > here that precludes the use of the "driver/device/bus" model we have
> > > > > today?
> > >
> > > The physical-sockets history of buses like PCI tends to force a simple
> > > tree-like topology as a natural consequence. You also end up with
> > > closely linked topologies for power, clocks, interrupts etc., because
> > > those all pass through the same sockets, so it's difficult to have a
> > > different structure.
> >
> > There's nothing in the driver core that enforces such a topology.
>
> Maybe not ... I have to wrap my head around that stuff a bit more.
>
> > > On SoC, those constraints have never existed and are not followed. A
> > > device's interface to the system is almost always split into multiple
> > > connections, not covered by a single design or standard. The problem
> > > now is that increasing complexity means that the sometimes bizarre
> > > topology features of SoCs are becoming less and less transparent for
> > > software.
> > >
> > > The device model currently seems to assume that certain things (power,
> > > DMA and MMIO accessibility) follow the tree (which may not work for many
> > > SoCs), and some other things (clocks, regulators, interrupts etc.) are
> > > not incorporated at all -- making them independent, but it may make some
> > > abstractions impossible today.
> > >
> > > How much this matters for actual systems is hard to foresee yet, since
> > > not _all_ possible insanities find their way into silicon. The
> > > onus should certainly be on us (i.e., the ARM/SoC community) to
> > > demonstrate if the device model needs to change, and to find practical
> > > ways to change it that minimise the resulting churn.
> >
> > Yes it is, you all are the ones tasked with implementing the crazy crap
> > the hardware people have created, best of luck with that :)
>
> Agreed. The first assumption should be that we can fit in with the
> existing device model -- we should only reconsider if we find that
> to be impossible.

Let me know if you think it is somehow impossible, but you all should
really push back on the insane hardware designers that are forcing you
all to do this work. I find it "interesting" how this all becomes your
workload for their crazy ideas.

Best of luck,

greg k-h
On Thu, Nov 28, 2013 at 02:10:09PM -0700, Jason Gunthorpe wrote:
> On Thu, Nov 28, 2013 at 09:33:23PM +0100, Thierry Reding wrote:
> > >   - Describing masters that master through multiple different buses
> > >
> > >   - How on Earth this fits in with the Linux device model (it doesn't)
> > >
> > >   - Interaction with IOMMU bindings (currently under discussion)
> >
> > This is all very vague. Perhaps everyone else knows what this is all
> > about, in which case it'd be great if somebody could clue me in.
>
> It looks like an approach to describe an AXI physical bus topology in
> DT..

Thanks for explaining this. It makes a whole lot more sense now.

> AFAIK the issue is that the AXI toolkit ARM provides encourages a
> single IP block to have several AXI ports - control, DMA, high speed
> MMIO, for instance. Each of those ports is hooked up into an AXI bus
> DAG that has little to do with the CPU address map.
>
> Contrasted with something like PCI, where each IP has exactly one bus
> port into the system, so the MMIO register access address range
> directly implies the bus master DMA path.
>
> To my mind, a sensible modeling would be to have the DT tree represent
> the AXI DAG flattened into a tree rooted at the CPU vertex. Separately
> in the DT would be the full AXI DAG represented with phandle
> connections.
>
> Nodes in the DT would use phandles to indicate their connections into
> the AXI DAG.
>
> Hugely roughly:
>
> soc
> {
> 	ranges = <Some quasi-real ranges indicating IP to CPU mapping>;
> 	ip_block
> 	{
> 		reg = <...>
> 		axi-ports = <mmio = &axi_low_speed_port0, dma = &axi_dma_port1, .. >;
> 	}
> }
>
> axi
> {
> 	/* Describe a DAG of AXI connections here. */
> 	cpu { downstream = &axi_switch; }
> 	axi_switch { downstream = &memory, &low_speed; }
> 	memory { }
> 	dma { downstream = &memory; }
> 	low_speed { }
> }

Correct me if I'm wrong, but the switch would be what the specification
refers to as "interconnect", while a port would correspond to what is
called an "interface" in the specification?

> I was just reading the Zynq manual which gives a pretty good
> description of what one vendor did using the ARM AXI toolkits..
>
> http://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf
> Figure 5-1 pg 122
>
> You can see it is a complex DAG of AXI busses. For instance if you
> want to master from a 'High Performance Port M0' to 'On Chip RAM' you
> follow the path AXI_HP[M0] -> Switch1[M2] -> OCM.
>
> But you can't master from 'High Performance Port M0' to internal
> slaves, as there is no routing path.
>
> Each switch block is an opportunity for the designer to provide
> address remapping/IOMMU hardware that needs configuring :)
>
> Which is why I think encoding the AXI DAG directly in DT is probably
> the most future proof way to model this stuff - it sticks close to the
> tools ARM provides to the SoC designers, so it is very likely to be
> able to model arbitrary SoC designs.

I'm not sure I agree with you fully here. At least I think that if what
we want to describe is an AXI bus topology, then we should be describing
it in terms of the AXI specification. On the other hand I fear that this
will lead to very many nodes and properties that we need to add, with
potentially no immediate gain. So I think we should be cautious about
what we do add, and restrict ourselves to what we really need.

I mean, even though device tree is supposed to describe hardware, there
needs to be a limit to the amount of detail we put into it. After all it
isn't a hardware description language, but rather a language to describe
the hardware in a way that makes sense for operating system software to
use it.

Perhaps this is just another way of saying what Greg has already said.
If we continue down this road, we'll eventually end up having to
describe all sorts of nitty-gritty details. And we'll need even more
code to deal with those descriptions and the hardware they represent. At
some point we need to start pushing some of the complexity back into
hardware so that we can keep a sane code-base.

Thierry
On Thu, Nov 28, 2013 at 11:22:33PM +0100, Thierry Reding wrote: > On Thu, Nov 28, 2013 at 02:10:09PM -0700, Jason Gunthorpe wrote: > > On Thu, Nov 28, 2013 at 09:33:23PM +0100, Thierry Reding wrote: > > > > > > - Describing masters that master through multiple different buses > > > > > > > > - How on Earth this fits in with the Linux device model (it doesn't) > > > > > > > > - Interaction with IOMMU bindings (currently under discussion) > > > > > > This is all very vague. Perhaps everyone else knows what this is all > > > about, in which case it'd be great if somebody could clue me in. > > > > It looks like an approach to describe an AXI physical bus topology in > > DT.. > > Thanks for explaining this. It makes a whole lot more sense now. Hopefully the ARM guys concur, this was just my impression from reviewing their patches and having recently done some design work with AXI.. > > axi > > { > > /* Describe a DAG of AXI connections here. */ > > cpu { downstream = &ax_switch,} > > axi_switch {downstream = &memory,&low_speed} > > memory {} > > dma {downstream = &memory} > > low_speed {} > > } > > Correct me if I'm wrong, but the switch would be what the specification > refers to as "interconnect", while a port would correspond to what is > called an "interface" in the specification? That seems correct, but for this purpose we are not interested in boring dumb interconnect but fancy interconnect with address remapping capabilities, or cache coherency (eg the SCU/L2 cache is modeled as switch/interconnect in an AXI DAG). I called it a switch because the job of the interconnect block is to take an AXI input packet on a slave interface and route it to the proper master interface with internal arbitration between slave interfaces. In my world that is called a switch ;) AXI is basically an on-chip point-to-point switched fabric like PCI-E, and the stuff that travels on AXI looks fairly similar to PCI-E TLPs..
If you refer to the PDF I linked I broadly modeled the above DT fragment on that diagram, each axi sub node (vertex) represents an 'interconnect' and 'downstream' is a master->slave interface pair (edge). Fundamentally AXI is inherently a DAG, but unlike what we are used to in other platforms you don't have to go through a fused CPU/cache/memory controller unit to access memory, so there are software visible asymmetries depending on how the DMA flows through the AXI DAG. > > Which is why I think encoding the AXI DAG directly in DT is probably > > the most future proof way to model this stuff - it sticks close to the > > tools ARM provides to the SOC designers, so it is very likely to be > > able to model arbitary SOC designs. > > I'm not sure I agree with you fully here. At least I think that if what > we want to describe is an AXI bus topology, then we should be describing > it in terms of the AXI specification. Right, that was what I was trying to describe :) The DAG would be vertexes that are 'interconnect' and directed edges that are 'master -> slave interface' pairs. This would be an addendum/side-table dataset to the standard 'soc' CPU address map tree, that would only be needed to program address mapping/iommu hardware. And it isn't really AXI specific, x86 style platforms can have a DAG too, it is just much simpler, as there is only 1 vertex - the IOMMU. > I mean, even though device tree is supposed to describe hardware, there > needs to be a limit to the amount of detail we put into it. After all it > isn't a hardware description language, but rather a language to describe > the hardware in a way that makes sense for operating system software to > use it. Right - which is why I said the usual 'soc' node should remain as-is typical today - a tree formed by viewing the AXI DAG from the CPU vertex. That 100% matches the OS perspective of the system for CPU originated MMIO. 
The AXI DAG side-table would be used to resolve weirdness with 'bus master' DMA programming. The OS can detect all the required configuration and properties by tracing a path through the DAG from the source of the DMA to the target - that tells you what IOMMUs are involved, if the path is cache coherent, etc. > Perhaps this is just another way of saying what Greg has already said. > If we continue down this road, we'll eventually end up having to > describe all sorts of nitty gritty details. And we'll need even more Greg's point makes sense, but the HW guys are not designing things this way for kicks - there are real physics based reasons for some of these choices... eg An all-to-all bus cross bar (eg like Intel's ring bus) is energy expensive compared to a purpose built muxed bus tree. Doing coherency look ups on DMA traffic costs energy, etc. > code to deal with those descriptions and the hardware they represent. At > some point we need to start pushing some of the complexity back into > hardware so that we can keep a sane code-base. Some of this is a consequence of the push to have the firmware minimal. As soon as you say the kernel has to configure the address map you've created a big complexity for it.. Jason
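[Editorial aside: Jason's path-tracing idea above can be sketched in a few lines. The sketch below is purely illustrative -- the vertex names and edge properties are hypothetical, loosely following the DT fragment quoted earlier in the thread, and this is not a proposed kernel API. It models the bus DAG as an adjacency list, then walks from a DMA source to its target and collects the IOMMUs and coherency attributes seen along the way.]

```python
# Illustrative only: a toy model of the "trace a path through the DAG"
# idea from this thread. Vertex names and properties are invented.

# Each vertex maps to a list of (downstream vertex, edge properties).
DAG = {
    "cpu":        [("axi_switch", {})],
    "axi_switch": [("memory", {}), ("low_speed", {})],
    "dma":        [("smmu0", {})],   # this master routes through an IOMMU
    "smmu0":      [("memory", {"iommu": "smmu0"})],
    "memory":     [],
    "low_speed":  [],
}

def trace_path(src, dst, path=()):
    """Depth-first search; returns the edge-property dicts along the
    first src -> dst path found, or None if no routing path exists."""
    if src == dst:
        return list(path)
    for nxt, props in DAG.get(src, []):
        found = trace_path(nxt, dst, path + (props,))
        if found is not None:
            return found
    return None

def dma_config(master, target):
    """Summarise what the OS would need to program DMA on this path."""
    edges = trace_path(master, target)
    if edges is None:
        return None   # no routing path, like the HP port -> slaves case
    return {
        "iommus": [e["iommu"] for e in edges if "iommu" in e],
        # Simplification: treat the path as coherent only if some edge
        # on it advertises coherency (none do in this toy DAG).
        "coherent": any(e.get("coherent") for e in edges),
    }
```

With this toy data, `dma_config("dma", "memory")` reports the single IOMMU on the path, while `dma_config("low_speed", "memory")` returns None because that vertex has no master route at all -- which is exactly the kind of asymmetry the side-table is meant to expose.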
On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote: > > Perhaps this is just another way of saying what Greg has already said. > > If we continue down this road, we'll eventually end up having to > > describe all sorts of nitty gritty details. And we'll need even more > > Greg's point makes sense, but the HW guys are not designing things > this way for kicks - there are real physics based reasons for some of > these choices... > > eg An all-to-all bus cross bar (eg like Intel's ring bus) is engery > expensive compared to a purpose built muxed bus tree. Doing coherency > look ups on DMA traffic costs energy, etc. Really? How much power exactly does it take / save? Yes, hardware people think "software is free", but when you can't actually control the hardware in the software properly, well, you end up with something like itanium... > > code to deal with those descriptions and the hardware they represent. At > > some point we need to start pushing some of the complexity back into > > hardware so that we can keep a sane code-base. > > Some of this is a consequence of the push to have the firmware > minimal. As soon as you say the kernel has to configure the address > map you've created a big complexity for it.. Why the push to make firmware "minimal"? What is that "saving"? You just push the complexity from one place to the other, just because ARM doesn't seem to have good firmware engineers, doesn't mean they should punish their kernel developers :) greg k-h
On Thu, Nov 28, 2013 at 06:35:54PM -0800, Greg KH wrote: > On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote: > > > Perhaps this is just another way of saying what Greg has already said. > > > If we continue down this road, we'll eventually end up having to > > > describe all sorts of nitty gritty details. And we'll need even more > > > > Greg's point makes sense, but the HW guys are not designing things > > this way for kicks - there are real physics based reasons for some of > > these choices... > > > > eg An all-to-all bus cross bar (eg like Intel's ring bus) is engery > > expensive compared to a purpose built muxed bus tree. Doing coherency > > look ups on DMA traffic costs energy, etc. > > Really? How much power exactly does it take / save? Yes, hardware > people think "software is free", but when you can't actually control the > hardware in the software properly, well, you end up with something like > itanium... > > > > code to deal with those descriptions and the hardware they represent. At > > > some point we need to start pushing some of the complexity back into > > > hardware so that we can keep a sane code-base. > > > > Some of this is a consequence of the push to have the firmware > > minimal. As soon as you say the kernel has to configure the address > > map you've created a big complexity for it.. > > Why the push to make firmware "minimal"? What is that "saving"? You > just push the complexity from one place to the other, just because ARM > doesn't seem to have good firmware engineers, doesn't mean they should > punish their kernel developers :) In my experience the biggest problem here is that people working on upstream kernels and therefore confronted with these issues are seldom able to track the latest developments of new chips. When the time comes to upstream support, most of the functionality has been implemented downstream already, so it actually works and there's no apparent reason why things should change. 
Now I know that that's not an ideal situation and upstreaming should start a whole lot earlier, but even if that were the case, once the silicon tapes out there's not a whole lot you can do about it anymore. Starting with upstreaming even before that would have to be a solution, but I don't think that's realistic at the current pace of development. There's a large gap between how fast new SoCs are supposed to tape out and the rate at which new code can be merged upstream. Perhaps some of that could be mitigated by putting more of the complexity into firmware and that's already happening to some degree for ARMv8. But I suspect there's a limit to what you can hide away in firmware while at the same time giving the kernel enough information to do the right thing. I am completely convinced that our goal should be to do upstreaming early and ideally there shouldn't be any downstream development in the first place. The reason why we're not there yet is because it isn't practical to do so currently, so I'm very interested in suggestions or finding ways to improve the situation. Thierry
On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > There's a large gap between how fast new SoCs are supposed to tape out > and the rate at which new code can be merged upstream. Perhaps some of > that could be mitigated by putting more of the complexity into firmware > and that's already happening to some degree for ARMv8. But I suspect > there's a limit to what you can hide away in firmware while at the same > time giving the kernel enough information to do the right thing. One of the bigger issues which stands in the way of companies caring about mainstream support is closed source IPs like VPUs and GPUs. If you have one of those on your chip, even if the kernel side code is already under the GPL, normally that code is not "mainline worthy". Also, as the userspace code may not be open source, some people object to having the open source part in the kernel. So for customers to be able to get the performance out of the chip, they have to stick with having non-mainline kernel. At that point, why bother spending too much time getting mainline support for the device. It's never going to be fully functional in mainline. It doesn't make sense for these SoC companies.
On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote: > On Thu, Nov 28, 2013 at 11:22:33PM +0100, Thierry Reding wrote: > > On Thu, Nov 28, 2013 at 02:10:09PM -0700, Jason Gunthorpe wrote: > > > On Thu, Nov 28, 2013 at 09:33:23PM +0100, Thierry Reding wrote: > > > > > > > > - Describing masters that master through multiple different buses > > > > > > > > > > - How on Earth this fits in with the Linux device model (it doesn't) > > > > > > > > > > - Interaction with IOMMU bindings (currently under discussion) > > > > > > > > This is all very vague. Perhaps everyone else knows what this is all > > > > about, in which case it'd be great if somebody could clue me in. > > > > > > It looks like an approach to describe an AXI physical bus topology in > > > DT.. > > > > Thanks for explaining this. It makes a whole lot more sense now. > > Hopefully the ARM guys concur, this was just my impression from > reviewing their patches and having recently done some design work with > AXI.. > > > > axi > > > { > > > /* Describe a DAG of AXI connections here. */ > > > cpu { downstream = &ax_switch,} > > > axi_switch {downstream = &memory,&low_speed} > > > memory {} > > > dma {downstream = &memory} > > > low_speed {} > > > } > > > > Correct me if I'm wrong, but the switch would be what the specification > > refers to as "interconnect", while a port would correspond to what is > > called an "interface" in the specification? > > That seems correct, but for this purpose we are not interested in > boring dumb interconnect but fancy interconnect with address remapping > capabilities, or cache coherency (eg the SCU/L2 cache is modeled as > switch/interconnect in a AXI DAG). > > I called it a switch because the job of the interconnect block is to > take an AXI input packet on a slave interface and route it to the > proper master interface with internal arbitration between slave > interfaces. 
In my world that is a called a switch ;) > > AXI is basically an on-chip point-to-point switched fabric like PCI-E, > and the stuff that travels on AXI looks fairly similar to PCI-E TLPs.. > > If you refer to the PDF I linked I broadly modeled the above DT > fragment on that diagram, each axi sub node (vertex) represents an > 'interconnect' and 'downstream' is a master->slave interface pair (edge). > > Fundamentally AXI is inherently a DAG, but unlike what we are used to > in other platforms you don't have to go through a fused > CPU/cache/memory controller unit to access memory, so there are > software visible asymmetries depending on how the DMA flows through > the AXI DAG. > > > > Which is why I think encoding the AXI DAG directly in DT is probably > > > the most future proof way to model this stuff - it sticks close to the > > > tools ARM provides to the SOC designers, so it is very likely to be > > > able to model arbitary SOC designs. > > > > I'm not sure I agree with you fully here. At least I think that if what > > we want to describe is an AXI bus topology, then we should be describing > > it in terms of the AXI specification. > > Right, that was what I was trying to describe :) > > The DAG would be vertexes that are 'interconnect' and directed edges > that are 'master -> slave interface' pairs. > > This would be an addendum/side-table dataset to the standard 'soc' CPU > address map tree, that would only be needed to program address > mapping/iommu hardware. > > And it isn't really AXI specific, x86 style platforms can have a DAG > too, it is just much simpler, as there is only 1 vertex - the IOMMU. > > > I mean, even though device tree is supposed to describe hardware, there > > needs to be a limit to the amount of detail we put into it. After all it > > isn't a hardware description language, but rather a language to describe > > the hardware in a way that makes sense for operating system software to > > use it. 
> > Right - which is why I said the usual 'soc' node should remain as-is > typical today - a tree formed by viewing the AXI DAG from the CPU > vertex. That 100% matches the OS perspective of the system for CPU > originated MMIO. > > The AXI DAG side-table would be used to resolve weirdness with 'bus > master' DMA programming. The OS can detect all the required > configuration and properties by tracing a path through the DAG from > the source of the DMA to the target - that tells you what IOMMUs are > involved, if the path is cache coherent, etc. That all sounds like an awful amount of data to wade through. Do we really need all of it to do what we want? Perhaps it can be simplified a bit. For instance it seems like the majority of hardware where this is actually required will have to go through one IOMMU (or a cascade of IOMMUs) and the path isn't cache coherent. IOMMUs typically require additional parameters to properly map devices to virtual address spaces, so we'll need to hook them up with masters in DT anyway. If we further assume that all masters use non-cache-coherent paths, then the problem becomes much simpler. Of course that would only work for a specific case and not solve the more general case. But perhaps it'll be good enough to cover the majority of uses. > > Perhaps this is just another way of saying what Greg has already said. > > If we continue down this road, we'll eventually end up having to > > describe all sorts of nitty gritty details. And we'll need even more > > Greg's point makes sense, but the HW guys are not designing things > this way for kicks - there are real physics based reasons for some of > these choices... > > eg An all-to-all bus cross bar (eg like Intel's ring bus) is engery > expensive compared to a purpose built muxed bus tree. Doing coherency > look ups on DMA traffic costs energy, etc. I understand that these may all contribute to saving power. 
However what good is a system that's very power-efficient if it's so complex that the software can no longer control it? Thierry
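[Editorial aside: the simplification Thierry suggests -- record, per master, only the IOMMU it sits behind and whether its path to memory is coherent, instead of describing the whole interconnect -- amounts to a flat table. A hypothetical sketch, with invented master names and values:]

```python
# Hypothetical flat per-master table replacing a full bus-topology
# description. All entries are invented for illustration.
MASTERS = {
    # master   (iommu,   coherent path to memory?)
    "gpu":     ("smmu0", False),
    "usb":     ("smmu0", False),
    "dma0":    (None,    False),  # no IOMMU between this master and memory
}

def needs_iommu_mapping(master):
    """True if DMA buffers for this master must be mapped through an IOMMU."""
    iommu, _coherent = MASTERS[master]
    return iommu is not None

def needs_cache_maintenance(master):
    """True if the CPU must clean/invalidate caches around DMA transfers."""
    _iommu, coherent = MASTERS[master]
    return not coherent
```

Under the thread's working assumption that all master paths are non-coherent, `needs_cache_maintenance` is always True; the table only earns its keep once coherent masters appear, which is the general case the full-DAG proposal is trying to cover.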
On Fri, Nov 29, 2013 at 09:57:03AM +0000, Russell King - ARM Linux wrote: > On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > > There's a large gap between how fast new SoCs are supposed to tape out > > and the rate at which new code can be merged upstream. Perhaps some of > > that could be mitigated by putting more of the complexity into firmware > > and that's already happening to some degree for ARMv8. But I suspect > > there's a limit to what you can hide away in firmware while at the same > > time giving the kernel enough information to do the right thing. > > One of the bigger issues which stands in the way of companies caring > about mainstream support is closed source IPs like VPUs and GPUs. > > If you have one of those on your chip, even if the kernel side code > is already under the GPL, normally that code is not "mainline worthy". > Also, as the userspace code may not be open source, some people object > to having the open source part in the kernel. > > So for customers to be able to get the performance out of the chip, > they have to stick with having non-mainline kernel. > > At that point, why bother spending too much time getting mainline > support for the device. It's never going to be fully functional in > mainline. It doesn't make sense for these SoC companies. Well, there are advantages to having large parts, even if not all, of an SoC supported in the mainline kernel. The closer you are to mainline, the easier it becomes for customers and users to use a mainline kernel. The better an SoC is supported upstream the fewer vendor-specific patches are required to get feature parity, which in turn makes it easier for customers to forward-port (and back-port for that matter) those patches to whatever kernel they want. It also allows vendors to concentrate on the more contentious patches and spend time on making them worthy of mainline. Customers aren't only end-users but also embedded partners that want to use the SoC in their products. 
It's no secret that many vendor trees lag behind upstream and that causes all kinds of pain such as having to port drivers for new hardware that's not supported in whatever vendor tree you happen to get. Tracking upstream is invaluable because it makes it almost trivial to support new hardware and it helps in turn with getting your own changes merged. So upstreaming SoC support isn't only for the benefit of SoC vendors. It also is very convenient for all users of the SoC. It's true that ideally an SoC would be fully functional in mainline. That's not the case today, but that doesn't mean it will always be true. Thierry
On Thu, Nov 28, 2013 at 09:25:28PM +0000, Greg KH wrote: > On Thu, Nov 28, 2013 at 07:39:17PM +0000, Dave Martin wrote: > > On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote: > > > Yes it is, you all are the ones tasked with implementing the crazy crap > > > the hardware people have created, best of luck with that :) > > > > Agreed. The first assumption should be that we can fit in with the > > existing device model -- we should only reconsider if we find that > > to be impossible. > > Let me know if you think it is somehow impossible, but you all should > really push back on the insane hardware designers that are forcing you > all to do this work. I find it "interesting" how this all becomes your > workload for their crazy ideas. Oh, I don't think we're claiming anything is impossible here :) It's more that we will probably want to make some changes to the device model to allow, for example, a device to be associated with multiple buses of potentially different types. Step one is to get the DT binding sorted, then we can try and get Linux to make use of it. This goes hand-in-hand with the IOMMU discussion going on here: http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/210401.html which is one of the issues that is hitting us right now. Cheers, Will
On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote: > On Thu, Nov 28, 2013 at 11:22:33PM +0100, Thierry Reding wrote: > > On Thu, Nov 28, 2013 at 02:10:09PM -0700, Jason Gunthorpe wrote: > > > On Thu, Nov 28, 2013 at 09:33:23PM +0100, Thierry Reding wrote: > > > > > > > > - Describing masters that master through multiple different buses > > > > > > > > > > - How on Earth this fits in with the Linux device model (it doesn't) > > > > > > > > > > - Interaction with IOMMU bindings (currently under discussion) > > > > > > > > This is all very vague. Perhaps everyone else knows what this is all > > > > about, in which case it'd be great if somebody could clue me in. > > > > > > It looks like an approach to describe an AXI physical bus topology in > > > DT.. > > > > Thanks for explaining this. It makes a whole lot more sense now. > > Hopefully the ARM guys concur, this was just my impression from > reviewing their patches and having recently done some design work with > AXI.. Yes and no. We are trying to describe a real topology here, but only because there are salient features that the kernel genuinely does need to know about if we want to be able to abstract this kind of thing. It's not just about AXI. Things like CCI-400 ("cache coherent interconnect") and its successors have real run-time control requirements on each connection to the bus. (The "port" terminology is used in the CCI documentation, but in any case the concept of a link between a device and a bus should be a pretty generic concept, not tied to a specific name or a specific interconnect.) If I need to turn on the bus interface for device X, I need to know how to tell the bus which interface to poke -- hence the need for a "port ID". Of course, we're free to choose other names. The master-slave link concept is not supposed to be a new concept at all: DT already has this concept. All we are aiming to add here is the ability to describe cross-links that ePAPR cannot describe directly.
> > > > axi > > > { > > > /* Describe a DAG of AXI connections here. */ > > > cpu { downstream = &ax_switch,} > > > axi_switch {downstream = &memory,&low_speed} > > > memory {} > > > dma {downstream = &memory} > > > low_speed {} > > > } > > > > Correct me if I'm wrong, but the switch would be what the specification > > refers to as "interconnect", while a port would correspond to what is > > called an "interface" in the specification? > > That seems correct, but for this purpose we are not interested in > boring dumb interconnect but fancy interconnect with address remapping > capabilities, or cache coherency (eg the SCU/L2 cache is modeled as > switch/interconnect in a AXI DAG). Bear in mind that "fancy interconnect with address remapping capabilities" probably means at least two independent components on an ARM SoC. To avoid excessive code fragmentation we'd want a driver for each, not a driver per every possible pairing. The pairing could be different on every port even in a single SoC, though I hope we will never see that. > I called it a switch because the job of the interconnect block is to > take an AXI input packet on a slave interface and route it to the > proper master interface with internal arbitration between slave > interfaces. In my world that is a called a switch ;) In axi { axi_switch {} }, are you describing two levels of bus, or one? I'm guessing one, but then the nested node looks a bit weird. > AXI is basically an on-chip point-to-point switched fabric like PCI-E, > and the stuff that travels on AXI looks fairly similar to PCI-E TLPs.. > > If you refer to the PDF I linked I broadly modeled the above DT > fragment on that diagram, each axi sub node (vertex) represents an > 'interconnect' and 'downstream' is a master->slave interface pair (edge). 
> > Fundamentally AXI is inherently a DAG, but unlike what we are used to > in other platforms you don't have to go through a fused > CPU/cache/memory controller unit to access memory, so there are > software visible asymmetries depending on how the DMA flows through > the AXI DAG. Just to call this out, the linkage is *not* guaranteed to be acyclic. If you connect pass-through devices (i.e., buses) round in a cycle, you may get transactions going round and round forever, so we should never see that in a system. However, there's nothing to stop a DMA controller's master side being looped back so that it can access its own slave interface. This is the normal situation for coherent DMA, since the whole point there is that the DMA controller should share its system view closely with the CPUs, including some levels of cache. (This does mean that the DMA may be able to program itself -- but I don't claim that this is useful. Rather, it's a side-effect of providing a coherent system view.) > > > Which is why I think encoding the AXI DAG directly in DT is probably > > > the most future proof way to model this stuff - it sticks close to the > > > tools ARM provides to the SOC designers, so it is very likely to be > > > able to model arbitary SOC designs. > > > > I'm not sure I agree with you fully here. At least I think that if what > > we want to describe is an AXI bus topology, then we should be describing > > it in terms of the AXI specification. > > Right, that was what I was trying to describe :) > > The DAG would be vertexes that are 'interconnect' and directed edges > that are 'master -> slave interface' pairs. > > This would be an addendum/side-table dataset to the standard 'soc' CPU > address map tree, that would only be needed to program address > mapping/iommu hardware. > > And it isn't really AXI specific, x86 style platforms can have a DAG > too, it is just much simpler, as there is only 1 vertex - the IOMMU.
Agree -- this concept of a master/slave link is a really generic concept. The complete set of properties associated with each link will be specific to each different interconnect, and possibly from port to port: _that_ stuff would be described by separate, non-generic properties defined per interconnect type. But to do DMA mapping, you should only need to know what master/slave links exist, and any associated mappings. > > > I mean, even though device tree is supposed to describe hardware, there > > needs to be a limit to the amount of detail we put into it. After all it > > isn't a hardware description language, but rather a language to describe > > the hardware in a way that makes sense for operating system software to > > use it. > > Right - which is why I said the usual 'soc' node should remain as-is > typical today - a tree formed by viewing the AXI DAG from the CPU > vertex. That 100% matches the OS perspective of the system for CPU > originated MMIO. Do you mean the top-level bus node in the DT and its contents, or something else? If so, agreed ... > The AXI DAG side-table would be used to resolve weirdness with 'bus > master' DMA programming. The OS can detect all the required > configuration and properties by tracing a path through the DAG from > the source of the DMA to the target - that tells you what IOMMUs are > involved, if the path is cache coherent, etc. ... that could work, although putting the links in the natural places in the DT directly feels cleaner than stashing a crib table elsewhere in the DT. That's partly cosmetic, I think both could work? > > > Perhaps this is just another way of saying what Greg has already said. > > If we continue down this road, we'll eventually end up having to > > describe all sorts of nitty gritty details. And we'll need even more > > Greg's point makes sense, but the HW guys are not designing things > this way for kicks - there are real physics based reasons for some of > these choices...
> > eg An all-to-all bus cross bar (eg like Intel's ring bus) is engery > expensive compared to a purpose built muxed bus tree. Doing coherency > look ups on DMA traffic costs energy, etc. > > > code to deal with those descriptions and the hardware they represent. At > > some point we need to start pushing some of the complexity back into > > hardware so that we can keep a sane code-base. > > Some of this is a consequence of the push to have the firmware > minimal. As soon as you say the kernel has to configure the address > map you've created a big complexity for it.. > > Jason
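[Editorial aside: Dave's point that the link graph is *not* guaranteed acyclic -- a coherent DMA controller's master side can loop back to its own slave interface -- matters for any traversal code, since a naive walk could loop forever. A visited set fixes that. The topology below is hypothetical:]

```python
# Hypothetical topology with a deliberate cycle: the DMA controller
# masters through the coherent interconnect, and the SoC bus routes
# back to the DMA controller's own slave interface.
LINKS = {
    "dma":     ["cci"],
    "cci":     ["memory", "soc_bus"],
    "soc_bus": ["dma"],     # loops back: DMA can reach its own slave port
    "memory":  [],
}

def reachable(src, dst):
    """Cycle-safe reachability over the master/slave link graph."""
    seen, stack = set(), [src]
    while stack:
        vertex = stack.pop()
        if vertex == dst:
            return True
        if vertex in seen:   # already expanded: don't chase the cycle
            continue
        seen.add(vertex)
        stack.extend(LINKS.get(vertex, []))
    return False
```

Here `reachable("dma", "memory")` terminates despite the cycle, and `reachable("memory", "cci")` is False since slaves don't master anything in this sketch -- the same one-way routing asymmetry seen in the Zynq diagram referenced earlier.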
On Fri, Nov 29, 2013 at 09:57:03AM +0000, Russell King - ARM Linux wrote: > On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > > There's a large gap between how fast new SoCs are supposed to tape out > > and the rate at which new code can be merged upstream. Perhaps some of > > that could be mitigated by putting more of the complexity into firmware > > and that's already happening to some degree for ARMv8. But I suspect > > there's a limit to what you can hide away in firmware while at the same > > time giving the kernel enough information to do the right thing. > > One of the bigger issues which stands in the way of companies caring > about mainstream support is closed source IPs like VPUs and GPUs. > > If you have one of those on your chip, even if the kernel side code > is already under the GPL, normally that code is not "mainline worthy". > Also, as the userspace code may not be open source, some people object > to having the open source part in the kernel. > > So for customers to be able to get the performance out of the chip, > they have to stick with having non-mainline kernel. > > At that point, why bother spending too much time getting mainline > support for the device. It's never going to be fully functional in > mainline. It doesn't make sense for these SoC companies. Putting effort into upstream support for something that is only relevant to GPUs (or VPUs) does look less valuable for us right now, unless it encourages people to start posting more GPU/VPU code upstream, and we know there are other blockers there. DMA and IOMMU we definitely care about, though. Cheers ---Dave
On Fri, Nov 29, 2013 at 01:13:59PM +0000, Dave Martin wrote: > On Fri, Nov 29, 2013 at 09:57:03AM +0000, Russell King - ARM Linux wrote: > > On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > > > There's a large gap between how fast new SoCs are supposed to tape out > > > and the rate at which new code can be merged upstream. Perhaps some of > > > that could be mitigated by putting more of the complexity into firmware > > > and that's already happening to some degree for ARMv8. But I suspect > > > there's a limit to what you can hide away in firmware while at the same > > > time giving the kernel enough information to do the right thing. > > > > One of the bigger issues which stands in the way of companies caring > > about mainstream support is closed source IPs like VPUs and GPUs. > > > > If you have one of those on your chip, even if the kernel side code > > is already under the GPL, normally that code is not "mainline worthy". > > Also, as the userspace code may not be open source, some people object > > to having the open source part in the kernel. > > > > So for customers to be able to get the performance out of the chip, > > they have to stick with having non-mainline kernel. > > > > At that point, why bother spending too much time getting mainline > > support for the device. It's never going to be fully functional in > > mainline. It doesn't make sense for these SoC companies. > > Putting effort into upstream support for something that is only relevant > to GPUs (or VPUs) does look less valuable for us right now, unless it > encourages people to start posting more GPU/VPU code upstream, and we > know there are other blockers there. I think you miss my point. Manufacturers want their chips to be useful to people, and they want all the features on their chip to be usable. They don't want something which sort-of works but has chunks of support for various IP that they spent time integrating not supported. 
So they have two options: either they develop a kernel out of mainline which supports everything, which isn't subject to the whims of mainline kernel developers breaking it all the time because of lack of testing, or they decide that they're not going to support everything and work on mainline only. The problem with the latter is they're explicitly saying to some customers that they're on their own as far as that's concerned, and they're not prepared to do that: customers are the people who pay the bills, remember, and you don't turn them away without good reason. For example, I doubt that SolidRun would've picked Freescale's IMX6 for their next board unless there was support for the GPU and VPU in some kernel somewhere. Remember, not everyone is interested in producing yet another toy board which can only really be used as a NAS. Some want accelerated graphics and hardware assisted video decode today too, and want to sell products based on those features.
On Fri, Nov 29, 2013 at 09:57:12AM +0000, Thierry Reding wrote:
> On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote:

[...]

> > The AXI DAG side-table would be used to resolve weirdness with 'bus
> > master' DMA programming. The OS can detect all the required
> > configuration and properties by tracing a path through the DAG from
> > the source of the DMA to the target - that tells you what IOMMUs are
> > involved, if the path is cache coherent, etc.
> 
> That all sounds like an awful amount of data to wade through. Do we
> really need all of it to do what we want? Perhaps it can be simplified
> a bit. For instance it seems like the majority of hardware where this is
> actually required will have to go through one IOMMU (or a cascade of
> IOMMUs) and the path isn't cache coherent.

The DT should describe the hardware, but only those aspects that a sane
OS should need to care about.  Some judgment is needed.  Figuring out
exactly which info we ought to care about is part of the purpose of this
discussion.  There are certainly lots of hardware integration and
configuration parameters that we don't need to know.

I think that figuring out the path capabilities ought to be a one-off
step, done when the DMA client device is probed.  We need to retain
enough information to do the mapping each time a buffer needs to be set
up, but we shouldn't have to re-scan the DT each time.

In more general cases, there are still some things that really can't be
pushed into firmware for which Linux needs a fair amount of topology
information.  Our current example is things like MSIs from PCIe devices
in systems with newer GICs and an SMMU.  Particularly for guests under
KVM, the way all these link together is needed for configuring and
routing MSIs to guest CPUs.

[...]

> > eg An all-to-all bus cross bar (eg like Intel's ring bus) is energy
> > expensive compared to a purpose built muxed bus tree. Doing coherency
> > look ups on DMA traffic costs energy, etc.
> 
> I understand that these may all contribute to saving power. However what
> good is a system that's very power-efficient if it's so complex that the
> software can no longer control it?

Not a lot of good.  However, that's the extreme case.  We have to deal
with some pain for sure -- and some parts of that pain may turn out to
be too ridiculous (either too useless, or too unworkable) for it to be
worth supporting them in Linux.

This thread focuses on one of the less ridiculous things: the aim is not
to describe the hardware bus architecture in full detail, just the
aspects the OS needs to know about for important, abstractable things
like DMA and IOMMU topology.

If we model the description around the actual topology, there seems less
chance of needing to bodge the bindings in the future when some
previously non-relevant aspect of the topology becomes important.

Cheers
---Dave
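Dave's suggestion above -- resolve the path capabilities once, when the DMA client device is probed, and cache the result rather than re-scanning the DT per buffer -- can be sketched as a walk over the bus graph. This is purely illustrative Python, not kernel code; the topology, the node names and the "coherent" attribute are invented for the example:

```python
# Illustrative sketch: resolve a DMA master's path properties once, at
# probe time, by walking the bus graph from the master towards memory.
# Node names and attributes are hypothetical, not from any real binding.

TOPOLOGY = {
    # node: (downstream nodes, properties acquired by going through it)
    "dma-engine": (["central-ic"], {}),
    "central-ic": (["memory"], {}),
    "acp-port":   (["scu"], {"coherent": True}),
    "scu":        (["memory"], {}),
    "memory":     ([], {}),
}

def resolve_path(master, target="memory"):
    """Depth-first walk from master to target, accumulating properties."""
    def walk(node, props):
        downstream, attrs = TOPOLOGY[node]
        props = {**props, **attrs}      # pick up this hop's properties
        if node == target:
            return props
        for nxt in downstream:
            found = walk(nxt, props)
            if found is not None:
                return found
        return None
    return walk(master, {})

# Done once per device at probe time; the result is what gets cached.
path_props = resolve_path("dma-engine")
coherent = path_props.get("coherent", False)
```

A path through the plain DMA port picks up nothing, while a path entering via the (hypothetical) ACP-style port picks up coherency; only the final, flattened property set needs to be kept around for mapping buffers.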
On Fri, Nov 29, 2013 at 11:44:53AM +0000, Will Deacon wrote: > On Thu, Nov 28, 2013 at 09:25:28PM +0000, Greg KH wrote: > > On Thu, Nov 28, 2013 at 07:39:17PM +0000, Dave Martin wrote: > > > On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote: > > > > Yes it is, you all are the ones tasked with implementing the crazy crap > > > > the hardware people have created, best of luck with that :) > > > > > > Agreed. The first assumption should be that we can fit in with the > > > existing device model -- we should only reconsider if we find that > > > to be impossible. > > > > Let me know if you think it is somehow impossible, but you all should > > really push back on the insane hardware designers that are forcing you > > all to do this work. I find it "interesting" how this all becomes your > > workload for their crazy ideas. > > Oh, I don't think we're claiming anything is impossible here :) It's more > that we will probably want to make some changes to the device model to allow, > for example, a device to be associated with multiple buses of potentially > different types. Why would you want that? What good would that help with? > Step one is to get the DT binding sorted, then we can try and get Linux to > make use of it. This goes hand-in-hand with the IOMMU discussion going on > here: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/210401.html > > which is one of the issues that is hitting us right now. Interesting how people seem to not know how to cc: the needed maintainers when they touch core code :(
On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > > > Some of this is a consequence of the push to have the firmware > > > minimal. As soon as you say the kernel has to configure the address > > > map you've created a big complexity for it.. > > > > Why the push to make firmware "minimal"? What is that "saving"? You > > just push the complexity from one place to the other, just because ARM > > doesn't seem to have good firmware engineers, doesn't mean they should > > punish their kernel developers :) > > In my experience the biggest problem here is that people working on > upstream kernels and therefore confronted with these issues are seldom > able to track the latest developments of new chips. > > When the time comes to upstream support, most of the functionality has > been implemented downstream already, so it actually works and there's no > apparent reason why things should change. That's a failure of the companies involved. > Now I know that that's not an ideal situation and upstreaming should > start a whole lot earlier, but even if that were the case, once the > silicon tapes out there's not a whole lot you can do about it anymore. > Starting with upstreaming even before that would have to be a solution, > but I don't think that's realistic at the current pace of development. For other companies it is realistic. I have a whole presentation on this, and why it even makes good business sense to do it properly (hint, saves you time and money, who doesn't like that?) > There's a large gap between how fast new SoCs are supposed to tape out > and the rate at which new code can be merged upstream. Perhaps some of > that could be mitigated by putting more of the complexity into firmware > and that's already happening to some degree for ARMv8. But I suspect > there's a limit to what you can hide away in firmware while at the same > time giving the kernel enough information to do the right thing. 
> > I am completely convinced that our goal should be to do upstreaming > early and ideally there shouldn't be any downstream development in the > first place. The reason why we're not there yet is because it isn't > practical to do so currently, so I'm very interested in suggestions or > finding ways to improve the situation. "Practical"? Heh, other companies know how to do this properly, and because of that, they will succeed, sorry. It can be done, the fact that ARM and its licensees don't want to do it, doesn't mean it isn't "practical" at all, it's just a failure on their part to do things in the "correct" way, wasting time and money in the process. Oh well, I guess you all have tons of time and money, best of luck with that :) greg k-h
On Fri, Nov 29, 2013 at 01:13:59PM +0000, Dave Martin wrote: > On Fri, Nov 29, 2013 at 09:57:03AM +0000, Russell King - ARM Linux wrote: > > On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > > > There's a large gap between how fast new SoCs are supposed to tape out > > > and the rate at which new code can be merged upstream. Perhaps some of > > > that could be mitigated by putting more of the complexity into firmware > > > and that's already happening to some degree for ARMv8. But I suspect > > > there's a limit to what you can hide away in firmware while at the same > > > time giving the kernel enough information to do the right thing. > > > > One of the bigger issues which stands in the way of companies caring > > about mainstream support is closed source IPs like VPUs and GPUs. > > > > If you have one of those on your chip, even if the kernel side code > > is already under the GPL, normally that code is not "mainline worthy". > > Also, as the userspace code may not be open source, some people object > > to having the open source part in the kernel. > > > > So for customers to be able to get the performance out of the chip, > > they have to stick with having non-mainline kernel. > > > > At that point, why bother spending too much time getting mainline > > support for the device. It's never going to be fully functional in > > mainline. It doesn't make sense for these SoC companies. > > Putting effort into upstream support for something that is only relevant > to GPUs (or VPUs) does look less valuable for us right now, unless it > encourages people to start posting more GPU/VPU code upstream, and we > know there are other blockers there. What are these "other blockers"?
On Fri, Nov 29, 2013 at 05:37:01PM +0000, Greg KH wrote: > On Fri, Nov 29, 2013 at 11:44:53AM +0000, Will Deacon wrote: > > On Thu, Nov 28, 2013 at 09:25:28PM +0000, Greg KH wrote: > > > On Thu, Nov 28, 2013 at 07:39:17PM +0000, Dave Martin wrote: > > > > On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote: > > > > > Yes it is, you all are the ones tasked with implementing the crazy crap > > > > > the hardware people have created, best of luck with that :) > > > > > > > > Agreed. The first assumption should be that we can fit in with the > > > > existing device model -- we should only reconsider if we find that > > > > to be impossible. > > > > > > Let me know if you think it is somehow impossible, but you all should > > > really push back on the insane hardware designers that are forcing you > > > all to do this work. I find it "interesting" how this all becomes your > > > workload for their crazy ideas. > > > > Oh, I don't think we're claiming anything is impossible here :) It's more > > that we will probably want to make some changes to the device model to allow, > > for example, a device to be associated with multiple buses of potentially > > different types. > > Why would you want that? What good would that help with? It would help with devices which have their slave interface on one bus, but master to another. We need a way to configure the master side of things (IOMMU, coherency, MSI routing, etc) on one bus and configure the slave side (device probing, power management, clocks, etc) on another. > > Step one is to get the DT binding sorted, then we can try and get Linux to > > make use of it. This goes hand-in-hand with the IOMMU discussion going on > > here: > > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/210401.html > > > > which is one of the issues that is hitting us right now. 
> > Interesting how people seem to not know how to cc: the needed > maintainers when they touch core code :( To be fair, I don't think that code was intended to be merged, and ended up sparking a discussion about what we need in the DT to represent these topologies. DT people were on CC iirc. Will
On Fri, Nov 29, 2013 at 06:01:10PM +0000, Will Deacon wrote: > On Fri, Nov 29, 2013 at 05:37:01PM +0000, Greg KH wrote: > > On Fri, Nov 29, 2013 at 11:44:53AM +0000, Will Deacon wrote: > > > On Thu, Nov 28, 2013 at 09:25:28PM +0000, Greg KH wrote: > > > > On Thu, Nov 28, 2013 at 07:39:17PM +0000, Dave Martin wrote: > > > > > On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote: > > > > > > Yes it is, you all are the ones tasked with implementing the crazy crap > > > > > > the hardware people have created, best of luck with that :) > > > > > > > > > > Agreed. The first assumption should be that we can fit in with the > > > > > existing device model -- we should only reconsider if we find that > > > > > to be impossible. > > > > > > > > Let me know if you think it is somehow impossible, but you all should > > > > really push back on the insane hardware designers that are forcing you > > > > all to do this work. I find it "interesting" how this all becomes your > > > > workload for their crazy ideas. > > > > > > Oh, I don't think we're claiming anything is impossible here :) It's more > > > that we will probably want to make some changes to the device model to allow, > > > for example, a device to be associated with multiple buses of potentially > > > different types. > > > > Why would you want that? What good would that help with? > > It would help with devices which have their slave interface on one bus, but > master to another. > > We need a way to configure the master side of things (IOMMU, coherency, MSI > routing, etc) on one bus and configure the slave side (device probing, power > management, clocks, etc) on another. Make this two "devices" and have each "device" have a pointer or a way to "find" the other one.
On Fri, Nov 29, 2013 at 06:11:23PM +0000, Greg KH wrote: > On Fri, Nov 29, 2013 at 06:01:10PM +0000, Will Deacon wrote: > > On Fri, Nov 29, 2013 at 05:37:01PM +0000, Greg KH wrote: > > > On Fri, Nov 29, 2013 at 11:44:53AM +0000, Will Deacon wrote: > > > > On Thu, Nov 28, 2013 at 09:25:28PM +0000, Greg KH wrote: > > > > > On Thu, Nov 28, 2013 at 07:39:17PM +0000, Dave Martin wrote: > > > > > > On Thu, Nov 28, 2013 at 11:13:31AM -0800, Greg KH wrote: > > > > > > > Yes it is, you all are the ones tasked with implementing the crazy crap > > > > > > > the hardware people have created, best of luck with that :) > > > > > > > > > > > > Agreed. The first assumption should be that we can fit in with the > > > > > > existing device model -- we should only reconsider if we find that > > > > > > to be impossible. > > > > > > > > > > Let me know if you think it is somehow impossible, but you all should > > > > > really push back on the insane hardware designers that are forcing you > > > > > all to do this work. I find it "interesting" how this all becomes your > > > > > workload for their crazy ideas. > > > > > > > > Oh, I don't think we're claiming anything is impossible here :) It's more > > > > that we will probably want to make some changes to the device model to allow, > > > > for example, a device to be associated with multiple buses of potentially > > > > different types. > > > > > > Why would you want that? What good would that help with? > > > > It would help with devices which have their slave interface on one bus, but > > master to another. > > > > We need a way to configure the master side of things (IOMMU, coherency, MSI > > routing, etc) on one bus and configure the slave side (device probing, power > > management, clocks, etc) on another. > > Make this two "devices" and have each "device" have a pointer or a way > to "find" the other one. That's certainly one possibility, and one that I'd also toyed around with. 
The risk is that we're just spreading the problem around (e.g. into the dmaengine API), but it's definitely a starting point. As I said, we need to sort out the DT bindings first then we can see exactly what we need to fit into Linux. Will
On Fri, Nov 29, 2013 at 11:58:15AM +0000, Dave Martin wrote:
> > Hopefully the ARM guys concur, this was just my impression from
> > reviewing their patches and having recently done some design work with
> > AXI..
> 
> Yes and no.  We are trying to describe a real topology here, but only
> because there are salient features that the kernel genuinely does
> need to know about if we want to be able to abstract this kind of thing.
> 
> It's not just about AXI.

Right, I brought up AXI because it is public, well documented and easy
to talk about - every bus/interconnect (PCI, PCI-E, RapidIO,
HyperTransport, etc) I've ever seen works in essentially the same way
- links and 'switches'.

> The master-slave link concept is not supposed to be a new concept at
> all: DT already has this concept.  All we are aiming to add here is
> the ability to describe cross-links that ePAPR cannot describe
> directly.

The main issue seems to be how to merge the standard DT CPU-centric
tree with a bus graph that isn't CPU-centric - eg like in that Zynq
diagram I mentioned.

All the existing DT cases I'm aware of are able to capture the DMA bus
topology within the CPU tree - because they are the same :)

> In axi { axi_switch {} }, are you describing two levels of bus, or
> one?  I'm guessing one, but then the nested node looks a bit weird.

So, my attempt was to sketch a vertex list and adjacency matrix in DT.

'axi' is the container for the graph, 'axi_switch' is a vertex and
then 'downstream' encodes the adjacency list.

We can't use the natural DT tree hierarchy here because there is no
natural graph root - referring to the Zynq diagram there is no vertex
you can start at and then reach every other vertex - so a tree can't
work, and there is no such thing as a 'bus level'.

> However, there's nothing to stop a DMA controller's master side being
> looped back so that it can access its own slave interface.  This is the
> normal situation for coherent DMA, since the whole point there is
> that the DMA controller shares its system view closely with
> the CPUs, including some levels of cache.

The DAG would only have vertexes for switches and distinct vertexes
for 'end-ports'.  So if an IP block has a master interface and a slave
interface then it would have two DAG end-port vertexes and the DAG can
remain acyclic.

The only way to create cycles is to connect switches in loops, and you
can always model a group of looped switches as a single switch vertex
to remove cycles.

If cycles really are required then it just makes the kernel's job
harder, it doesn't break the DT representation.

> > Right - which is why I said the usual 'soc' node should remain as-is
> > typical today - a tree formed by viewing the AXI DAG from the CPU
> > vertex.  That 100% matches the OS perspective of the system for CPU
> > originated MMIO.
> 
> Do you mean the top-level bus node in the DT and its contents, or
> something else?
> 
> If so, agreed ...

Right, the DT, and the 'reg' properties should present a tree that is
the MMIO path for the CPU.  That tree should be a subset of the full
bus graph.

If the bus is 'sane' then that tree matches the DMA graph as well,
which is where most implementations are today.

> ... that could work, although putting the links in the natural places
> in the DT directly feels cleaner than stashing a crib table elsewhere
> in the DT.  That's partly cosmetic, I think both could work?

I chose to talk about this as a side table for a few reasons (touched
on above) but perhaps the most important is: where do you put switches
that the CPU's MMIO path doesn't flow through?  What is the natural
place in the DT tree?

Again referring to the Zynq diagram, you could have a SOC node like
this:

soc
{
	// Start at Cortex A9
	scu {
		OCM {}
		l2cache {
			memory {}
			slave {
				on_chip0 {reg = {}}
				on_chip1 {reg = {}}
				on_chip2 {reg = {}}
				...
			}
		}
	}
}

Where do I put a node for the 'memory interconnect' switch?  How do I
model DMA connected to the 'Cache Coherent AXI port'?

MMIO config registers for these blocks are going to fit into the MMIO
tree someplace, but that doesn't really tell you anything about how
they fit into the dma graph.

Easy to do with a side table:

axi {
	cpu {downstream = scu}
	scu {downstream = OCM,l2cache}
	l2cache {downstream = memory,slave interconnect}
	slave interconnect {}	// No more switches

	hp axi {downstream = memory interconnect}
	memory interconnect {downstream = memory,OCM}

	coherent acp {downstream = scu}

	gp axi {downstream = master interconnect}
	master interconnect {downstream = central interconnect}
	central interconnect {downstream = ocm,slave interconnect,memory}

	dma engine {downstream = central interconnect}
}

Which captures the switch vertex list and adjacency list.

Then you have to connect the device nodes into the AXI graph:

Perhaps:

on_chip0
{
	reg = {}
	axi_mmio_slave_port = <&slave interconnect, M0>  // Vertex and edge
	axi_bus_master_port = <&hp axi, S1>
}

Or maybe back reference from the graph table is better:

axi {
	hp axi {
		downstream = memory interconnect
		controller = &...;
		S1 {
			bus_master = &on_chip0;	// Vertex and edge
			axi,arbitration-priority = 10;
		}
	}
	slave interconnect {
		M0 {
			mmio_slave = &on_chip0;
		}
	}
}

It sort of feels natural that you could describe the interconnect
under its own tidy node and the main tree is left alone...

This might be easier to parse as well, since you know everything under
'axi' is related to interconnect and not jumbled with other stuff.

Cheers,
Jason
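Jason's point that a group of looped switches can always be modelled as a single switch vertex to remove cycles is exactly a strongly-connected-components condensation. A rough sketch of the idea, in illustrative Python rather than anything kernel-side (the switch names are made up):

```python
# Collapse cycles of interconnect switches into single vertices so the
# bus graph becomes a DAG, as suggested above.  Plain Tarjan SCC;
# switch names are invented for illustration.

def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph,
    given as {vertex: [downstream vertices]}."""
    index, low, on_stack, stack = {}, {}, set(), []
    sccs, counter = [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs

# Two switches connected in a loop collapse into one component; the
# remaining vertices stay distinct, so the condensed graph is acyclic.
bus = {
    "switch-a": ["switch-b", "memory"],
    "switch-b": ["switch-a"],
    "dma":      ["switch-a"],
    "memory":   [],
}
sccs = tarjan_scc(bus)
```

Each multi-vertex component becomes one "switch" in the condensed graph, which keeps the DT-described topology acyclic without losing any reachability information.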
On Fri, Nov 29, 2013 at 09:42:23AM -0800, Greg KH wrote: > On Fri, Nov 29, 2013 at 10:37:14AM +0100, Thierry Reding wrote: > > > > Some of this is a consequence of the push to have the firmware > > > > minimal. As soon as you say the kernel has to configure the address > > > > map you've created a big complexity for it.. > > > > > > Why the push to make firmware "minimal"? What is that "saving"? You > > > just push the complexity from one place to the other, just because ARM > > > doesn't seem to have good firmware engineers, doesn't mean they should > > > punish their kernel developers :) > > > > In my experience the biggest problem here is that people working on > > upstream kernels and therefore confronted with these issues are seldom > > able to track the latest developments of new chips. > > > > When the time comes to upstream support, most of the functionality has > > been implemented downstream already, so it actually works and there's no > > apparent reason why things should change. > > That's a failure of the companies involved. Yes, I know. > > Now I know that that's not an ideal situation and upstreaming should > > start a whole lot earlier, but even if that were the case, once the > > silicon tapes out there's not a whole lot you can do about it anymore. > > Starting with upstreaming even before that would have to be a solution, > > but I don't think that's realistic at the current pace of development. > > For other companies it is realistic. I have a whole presentation on > this, and why it even makes good business sense to do it properly (hint, > saves you time and money, who doesn't like that?) I've seen a recording of that presentation. Twice. =) > > There's a large gap between how fast new SoCs are supposed to tape out > > and the rate at which new code can be merged upstream. Perhaps some of > > that could be mitigated by putting more of the complexity into firmware > > and that's already happening to some degree for ARMv8. 
But I suspect > > there's a limit to what you can hide away in firmware while at the same > > time giving the kernel enough information to do the right thing. > > > > I am completely convinced that our goal should be to do upstreaming > > early and ideally there shouldn't be any downstream development in the > > first place. The reason why we're not there yet is because it isn't > > practical to do so currently, so I'm very interested in suggestions or > > finding ways to improve the situation. > > "Practical"? Heh, other companies know how to do this properly, and > because of that, they will succeed, sorry. > > It can be done, the fact that ARM and its licensees don't want to do > it, doesn't mean it isn't "practical" at all, it's just a failure on > their part to do things in the "correct" way, wasting time and money in > the process. Well, I can't really argue with that, so I'll stop with the whining and go back to work. Thierry
On Fri, Nov 29, 2013 at 11:43:41AM -0700, Jason Gunthorpe wrote:
> On Fri, Nov 29, 2013 at 11:58:15AM +0000, Dave Martin wrote:
> > > Hopefully the ARM guys concur, this was just my impression from
> > > reviewing their patches and having recently done some design work with
> > > AXI..
> > 
> > Yes and no.  We are trying to describe a real topology here, but only
> > because there are salient features that the kernel genuinely does
> > need to know about if we want to be able to abstract this kind of thing.
> > 
> > It's not just about AXI.
> 
> Right, I brought up AXI because it is public, well documented and easy
> to talk about - every bus/interconnect (PCI, PCI-E, RapidIO,
> HyperTransport, etc) I've ever seen works in essentially the same way
> - links and 'switches'.
> 
> > The master-slave link concept is not supposed to be a new concept at
> > all: DT already has this concept.  All we are aiming to add here is
> > the ability to describe cross-links that ePAPR cannot describe
> > directly.
> 
> The main issue seems to be how to merge the standard DT CPU-centric
> tree with a bus graph that isn't CPU-centric - eg like in that Zynq
> diagram I mentioned.
> 
> All the existing DT cases I'm aware of are able to capture the DMA bus
> topology within the CPU tree - because they are the same :)
> 
> > In axi { axi_switch {} }, are you describing two levels of bus, or
> > one?  I'm guessing one, but then the nested node looks a bit weird.
> 
> So, my attempt was to sketch a vertex list and adjacency matrix in DT.
> 
> 'axi' is the container for the graph, 'axi_switch' is a vertex and
> then 'downstream' encodes the adjacency list.
> 
> We can't use the natural DT tree hierarchy here because there is no
> natural graph root - referring to the Zynq diagram there is no vertex
> you can start at and then reach every other vertex - so a tree can't
> work, and there is no such thing as a 'bus level'.
> 
> > However, there's nothing to stop a DMA controller's master side being
> > looped back so that it can access its own slave interface.  This is the
> > normal situation for coherent DMA, since the whole point there is
> > that the DMA controller shares its system view closely with
> > the CPUs, including some levels of cache.
> 
> The DAG would only have vertexes for switches and distinct vertexes
> for 'end-ports'.  So if an IP block has a master interface and a slave
> interface then it would have two DAG end-port vertexes and the DAG can
> remain acyclic.
> 
> The only way to create cycles is to connect switches in loops, and you
> can always model a group of looped switches as a single switch vertex
> to remove cycles.
> 
> If cycles really are required then it just makes the kernel's job
> harder, it doesn't break the DT representation.
> 
> > > Right - which is why I said the usual 'soc' node should remain as-is
> > > typical today - a tree formed by viewing the AXI DAG from the CPU
> > > vertex.  That 100% matches the OS perspective of the system for CPU
> > > originated MMIO.
> > 
> > Do you mean the top-level bus node in the DT and its contents, or
> > something else?
> > 
> > If so, agreed ...
> 
> Right, the DT, and the 'reg' properties should present a tree that is
> the MMIO path for the CPU.  That tree should be a subset of the full
> bus graph.
> 
> If the bus is 'sane' then that tree matches the DMA graph as well,
> which is where most implementations are today.
> 
> > ... that could work, although putting the links in the natural places
> > in the DT directly feels cleaner than stashing a crib table elsewhere
> > in the DT.  That's partly cosmetic, I think both could work?
> 
> I chose to talk about this as a side table for a few reasons (touched
> on above) but perhaps the most important is: where do you put switches
> that the CPU's MMIO path doesn't flow through?  What is the natural
> place in the DT tree?
> 
> Again referring to the Zynq diagram, you could have a SOC node like
> this:
> 
> soc
> {
> 	// Start at Cortex A9
> 	scu {
> 		OCM {}
> 		l2cache {
> 			memory {}
> 			slave {
> 				on_chip0 {reg = {}}
> 				on_chip1 {reg = {}}
> 				on_chip2 {reg = {}}
> 				...
> 			}
> 		}
> 	}
> }
> 
> Where do I put a node for the 'memory interconnect' switch?  How do I
> model DMA connected to the 'Cache Coherent AXI port'?
> 
> MMIO config registers for these blocks are going to fit into the MMIO
> tree someplace, but that doesn't really tell you anything about how
> they fit into the dma graph.
> 
> Easy to do with a side table:
> 
> axi {
> 	cpu {downstream = scu}
> 	scu {downstream = OCM,l2cache}
> 	l2cache {downstream = memory,slave interconnect}
> 	slave interconnect {}	// No more switches
> 
> 	hp axi {downstream = memory interconnect}
> 	memory interconnect {downstream = memory,OCM}
> 
> 	coherent acp {downstream = scu}
> 
> 	gp axi {downstream = master interconnect}
> 	master interconnect {downstream = central interconnect}
> 	central interconnect {downstream = ocm,slave interconnect,memory}
> 
> 	dma engine {downstream = central interconnect}
> }
> 
> Which captures the switch vertex list and adjacency list.
> > Then you have to connect the device nodes into the AXI graph: > > Perhaps: > > on_chip0 > { > reg = {} > axi_mmio_slave_port = <&slave interconnect, M0> // Vertex and edge > axi_bus_master_port = <&hp axi, S1> > } > > Or maybe back reference from the graph table is better: > > axi { > hp axi { > downstream = memory interconnect > controller = &...; > S1 { > bus_master = &on_chip0; // Vertex and edge > axi,arbitration-priority = 10; > } > } > slave interconnect { > M0 { > mmio_slave = &on_chip0; > } > } > } > > It sort of feels natural that you could describe the interconnect in > under its own tidy node and the main tree remains left alone... > > This might be easier to parse as well, since you know everything under > 'axi' is related to interconnect and not jumbled with other stuff. That is true, but I do have a concern that bolting more and more info onto the side of DT may leave us with a mess, while the "main" tree becomes increasingly fictional. You make a lot of good points -- apologies for not responding in detail to all of them yet, but I tie myself in knots trying to say too many different things at the same time. For comparison, here's what I think is the main alternative approach. My discussion touches obliquely on some of the issues you raise... My basic idea is that DT allows us to express master/slave relationships like this already: master_device { slave_device { reg = < ... >; ranges = < ... >; dma-ranges = < ... >; }; }; (dma-ranges is provided by ePAPR to describe the reverse mappings for slave_device to master back on master_device). In a multi-master system this isn't enough, because a node might have to have multiple parents in order to express all the master/slave relationships. 
In that case, we can choose one of the parents as the canonical one (e.g., the immediate master on the path from the coherent CPUs), or if there is no obvious canonical parent the child node can be a freestanding node in the tree (i.e., with no reg or ranges properties, either in the / { } junkyard, or in some location that makes topological sense for the device in question). The DT herarchy retains real meaning since the direction of master/slave relationships is fixed, but the DT becomes a tree of connected trees, rather than a single tree. So, some master/slave relationships will not be child/parent any more, and we need another way to express the linkage. My idea now, building on Will's suggestion is to keep the existing abstractions unchanged, but create an alternate "detached" representation of each relationship, to use in multi-master situations. In the following, the #address-cells and #size-cells properties of a and b (where applicable) control the parsing of REG or RANGE in precisely the same way for both forms. SLAVE-PORT is required for slave buses with multiple masters, but it is only needed in systems where the ports need to be identified distinctly. DT has no traditional way to describe this, so we'd need to add another property ("parent-master" in my example) if we want to describe that in tree form. For now, I assume that a device (i.e., a slave that is not a bus) does not need to need to distinguish between different slave ports -- if it did need to do so, additional properties could be added to describe that, but we have no example of this today(?) Drivers for buses that use SLAVE-PORT could still work with a DT that does not provide the SLAVE-PORT information, but might have restricted functionality in that case (no per-port control, profiling, etc.) 
Note also that:

	{ #slave-cells = <0>; }		is equivalent to	{ // no #slave-cells }

	{ parent-master = <>; }		is equivalent to	{ // no parent-master }

...which gives the correct interpretation for traditional DTs.

We then have the following transformations:

Slave device mapping, tree form:

	a {
		b {
			reg = < REG >;
		};
	};

Slave device mapping, detached form:

	a {
		slave-reg = < &b REG >;
	};

	b {
	};

Slave passthrough bus mapping, tree form:

	a {
		b {
			#slave-cells = < SLAVE-PORT-CELLS >;
			parent-master = < SLAVE-PORT >;
			ranges;
		};
	};

Slave passthrough bus mapping, detached form:

	a {
		slave = < &b SLAVE-PORT >;
	};

	b {
		#slave-cells = < SLAVE-PORT-CELLS >;
	};

Remapped slave bus mapping, tree form:

	a {
		b {
			#slave-cells = < SLAVE-PORT-CELLS >;
			parent-master = < SLAVE-PORT >;
			ranges = < RANGE >;
		};
	};

Remapped slave bus mapping, detached form:

	a {
		slave = < &b SLAVE-PORT RANGE >;
	};

	b {
		#slave-cells = < SLAVE-PORT-CELLS >;
	};

There are no dma-ranges properties here, because in this context "DMA"
is just a master/slave relationship where the master isn't a CPU. The
added properties actually allow us to describe that just fine, but in a
more explicit and general way.

Some new code will be required to parse this, but it is just a new,
more flexible _mechanism_ for expressing an old relationship, extended
with a straightforward slave-port identifier (a simple array of cells),
so we should have a good chance of burying the change under abstracted
interfaces.

Existing DTs should not need to change, because we have a well-defined
mapping between the two representations in the non-multi-master case.

I may try to come up with a partial description of the Zynq SoC, but
I was getting myself confused when I tried it earlier ;)

Cheers
---Dave
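As a sanity check on the claim that the two representations carry the same information, here is a toy model (Python, purely illustrative -- the dicts, helper names and example numbers are all invented, and this is not real DT parsing) showing both forms reducing to the same master/slave edge list:

```python
# Toy model of the tree-form vs detached-form encodings above.
# Both reduce to the same (master, slave, range) edge list, which is
# the information the OS actually needs.

def edges_tree_form(name, node, master=None):
    """Walk nested children; each child node is a slave of its parent."""
    edges = []
    if master is not None:
        edges.append((master, name, node.get("ranges")))
    for child_name, child in node.get("children", {}).items():
        edges += edges_tree_form(child_name, child, master=name)
    return edges

def edges_detached_form(nodes):
    """Read 'slave' properties: (referenced-node, range) pairs."""
    edges = []
    for name, node in nodes.items():
        for slave_ref, rng in node.get("slave", []):
            edges.append((name, slave_ref, rng))
    return edges

RANGE = (0x0, 0x80000000, 0x10000)  # child addr, parent addr, size

tree = {"children": {"b": {"ranges": RANGE}}}         # a { b { ... }; };
detached = {"a": {"slave": [("b", RANGE)]}, "b": {}}  # a { slave = <&b RANGE>; };

assert edges_tree_form("a", tree) == edges_detached_form(detached)
```

The well-defined mapping between the two forms in the non-multi-master case is exactly what makes this round-trip possible.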
On Mon, Dec 02, 2013 at 08:25:43PM +0000, Dave Martin wrote: > > This might be easier to parse as well, since you know everything under > > 'axi' is related to interconnect and not jumbled with other stuff. > > That is true, but I do have a concern that bolting more and more > info onto the side of DT may leave us with a mess, while the "main" > tree becomes increasingly fictional. > > You make a lot of good points -- apologies for not responding in detail > to all of them yet, but I tie myself in knots trying to say too many > different things at the same time. I think the main point is to observe that we are encoding a directed graph onto DT, so long as the original graph can be extracted the DT encoding can be whatever people like :) > In a multi-master system this isn't enough, because a node might have > to have multiple parents in order to express all the master/slave > relationships. Right, DT is a tree, not a graph - and this is already a minor problem we've seen modeling some IP blocks on the Marvell chips. They also have multiple ports into the various system busses. > In that case, we can choose one of the parents as the canonical one > (e.g., the immediate master on the path from the coherent CPUs), or if > there is no obvious canonical parent the child node can be a > freestanding node in the tree (i.e., with no reg or ranges properties, > either in the / { } junkyard, or in some location that makes topological > sense for the device in question). The DT herarchy retains real > meaning since the direction of master/slave relationships is fixed, > but the DT becomes a tree of connected trees, rather than a single > tree. I'm not sure this will really be a problem in practice: Consider: - All IP blocks we care about are going to have a CPU MMIO port for control. 
- The 'soc' tree is the MMIO hierarchy from the CPU perspective
- IP blocks should be modelled in DT as a single node when possible

In that case, the location of a DT node for a multiport IP is now well
defined: It is the path from the CPU to the MMIO port, expressed in
DT.

Further, every 'switch' is going to have MMIO to control the switch,
so the switch node DT locations are also well defined.

Basically, I think the main 'soc' tree's layout is mostly unambiguous
and covers all the relevant blocks.

You won't get a forest of DT trees because every block must be MMIO
reachable.

It is also the same core DT tree with my suggestion or yours.

Your edge encoding also makes sense, but I think this is where I would
disagree the most:

> Slave device mapping, tree form:
>
>	a {
>		b {
>			reg = < REG >;
>		};
>	};
>
> Slave device mapping, detached form:
>
>	a {
>		slave-reg = < &b REG >;
>	};
>
>	b {
>	};

This now requires the OS to parse this dataset just to access standard
MMIO, and you have to change the standard existing code that parses
ranges and reg to support this extended format.

Both of those reasons seem like major downsides to me. If the OS
doesn't support advanced features (IOMMU, power management, etc) it
should not require DT parsing beyond the standard items. This may
become relevant when re-using a kernel DT in uboot for instance.

On the other hand, this is a great way to actually express the correct
address mapping path for every reg window - but isn't that a separate
issue from the IOMMU/DMA problem? You still need to describe the DMA
bus mastering ports on IP directly.

The side-table concept would keep the parsing completely contained
within the IOMMU/etc drivers, and not have it leak out into existing
core DT code, but it doesn't completely tidy multiple slave ports.

Also, I was thinking after I sent the last email that this is a good
time to be thinking about a future need for describing NUMA affinities
in DT.
That is basically the same directed graph we are talking about here.
Trying some modeling samples with that in mind would be a good idea..

You should also think about places to encode parameters like
master/slave QOS and other edge-specific tunables..

> I may try to come up with a partial description of the Zynq SoC, but
> I was getting myself confused when I tried it earlier ;)

The Zynq is interesting because all the information is public - and it
is a good example of the various AXI building blocks. Imagine some
IOMMUs in there and you have a complete scenario to talk about..

It even has a coherent AXI port available for IP to hook up to. :)

Regards,
Jason
On Mon, Dec 02, 2013 at 05:07:40PM -0700, Jason Gunthorpe wrote: > On Mon, Dec 02, 2013 at 08:25:43PM +0000, Dave Martin wrote: > > > This might be easier to parse as well, since you know everything under > > > 'axi' is related to interconnect and not jumbled with other stuff. > > > > That is true, but I do have a concern that bolting more and more > > info onto the side of DT may leave us with a mess, while the "main" > > tree becomes increasingly fictional. > > > > You make a lot of good points -- apologies for not responding in detail > > to all of them yet, but I tie myself in knots trying to say too many > > different things at the same time. > > I think the main point is to observe that we are encoding a directed > graph onto DT, so long as the original graph can be extracted the > DT encoding can be whatever people like :) Sure, we're just juggling different descriptions for the same thing here. The fact that our different representations do seem to agree on that is reassuring... > > In a multi-master system this isn't enough, because a node might have > > to have multiple parents in order to express all the master/slave > > relationships. > > Right, DT is a tree, not a graph - and this is already a minor problem > we've seen modeling some IP blocks on the Marvell chips. They also > have multiple ports into the various system busses. > > > In that case, we can choose one of the parents as the canonical one > > (e.g., the immediate master on the path from the coherent CPUs), or if > > there is no obvious canonical parent the child node can be a > > freestanding node in the tree (i.e., with no reg or ranges properties, > > either in the / { } junkyard, or in some location that makes topological > > sense for the device in question). The DT herarchy retains real > > meaning since the direction of master/slave relationships is fixed, > > but the DT becomes a tree of connected trees, rather than a single > > tree. 
> I'm not sure this will really be a problem in practice:
>
> Consider:
>  - All IP blocks we care about are going to have a CPU MMIO port for
>    control.
>  - The 'soc' tree is the MMIO hierarchy from the CPU perspective
>  - IP blocks should be modelled in DT as a single node when possible
>
> In that case, the location of a DT node for a multiport IP is now well
> defined: It is the path from the CPU to the MMIO port, expressed in
> DT.
>
> Further, every 'switch' is going to have MMIO to control the switch,
> so the switch node DT locations are also well defined.
>
> Basically, I think the main 'soc' tree's layout is mostly unambiguous
> and covers all the relevant blocks.
>
> You won't get a forest of DT trees because every block must be MMIO
> reachable.
>
> It is also the same core DT tree with my suggestion or yours.

Absolutely: I didn't argue this very well. The CPU's-eye view of the
system determines a natural hierarchy for almost everything.

It's possible that there is some bus or switch that only non-CPUs can
see. But if the CPU has no control interface for it, that suggests the
bus is transparent enough that it needs no control -- and might not
need to be represented in the DT at all. As a rule, we should never
put anything in the DT that does not need to be described. But if we
end up with deviations from this rule, floating nodes give us an
escape route.

Suppose you have a cluster of DSPs used to implement a GPU. They might
have their own front-side bus which they control themselves. In this
situation, it might be more natural to represent that whole side
cluster as a separate floating subtree within /. But that's all very
hypothetical. In most cases, you just call that monstrosity "gpu" and
make it look like a device -- even in the hardware.
> Your edge encoding also makes sense, but I think this is where I would
> disagree the most:
>
> > Slave device mapping, tree form:
> >
> >	a {
> >		b {
> >			reg = < REG >;
> >		};
> >	};
> >
> > Slave device mapping, detached form:
> >
> >	a {
> >		slave-reg = < &b REG >;
> >	};
> >
> >	b {
> >	};
>
> This now requires the OS to parse this dataset just to access standard
> MMIO, and you have to change the standard existing code that parses
> ranges and reg to support this extended format.
>
> Both of those reasons seem like major downsides to me. If the OS
> doesn't support advanced features (IOMMU, power management, etc) it
> should not require DT parsing beyond the standard items. This may
> become relevant when re-using a kernel DT in uboot for instance.

You're right that this is a change. However, I think that no existing
DT needs to change, and few DTs will use it -- similar to the argument
about why DT will normally look like a single tree.

In real systems, I think multi-master slaves which are accessed
directly and not via some multi-master shared bus are not that common.

A partial dodge would be to introduce a dummy bus node:

	a {
		b {
			reg = < REG >;
		};
	};

becomes

	a {
		slave-ranges = < &b_bus RANGE >;
	};

	b_bus {
		// #slave-cells = <0> is the default
		compatible = "simple-bus";

		b {
			reg = < REG' >;
		};
	};

Where RANGE maps REG into a's address space, and REG' is REG rebased to
address 0. Now, we can refer indirectly to b as many times as we like,
without using the slave-reg thing.

This still needs special parsing though -- but again, only for cases
that we already can't describe with DT.

The common situation will be that all shared slaves are really under
some shared bus, and that bus really has some natural location in the
DT. So for cases simple enough not to require these extensions, I
think there would still be no change.
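The REG / REG' / RANGE relationship in the dummy-bus dodge is just standard ranges arithmetic. A quick sketch (Python, with made-up addresses -- purely to illustrate the rebasing, not real DT code):

```python
# Illustration of the dummy-bus rebasing: REG' is REG with the bus base
# subtracted, and translating REG' through RANGE (child addr, parent
# addr, size -- the usual ranges layout) recovers the original REG.

def rebase(reg, bus_base):
    """Rebase a (addr, size) reg so the bus window starts at 0."""
    addr, size = reg
    return (addr - bus_base, size)

def translate(reg, rng):
    """Map a child-space (addr, size) into the parent address space."""
    child_base, parent_base, _length = rng
    addr, size = reg
    return (addr - child_base + parent_base, size)

REG = (0x40001000, 0x1000)           # b's registers in a's address space
RANGE = (0x0, 0x40000000, 0x10000)   # b_bus child 0x0 -> parent 0x40000000
REG_PRIME = rebase(REG, 0x40000000)  # b's registers rebased to address 0

assert translate(REG_PRIME, RANGE) == REG
```

The point of the round-trip assertion is that no information is lost by moving b under the dummy bus: the RANGE on the reference restores exactly the mapping the original reg expressed.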
> On the other hand, this is a great way to actually express the correct
> address mapping path for every reg window - but isn't that a separate
> issue from the IOMMU/DMA problem? You still need to describe the DMA
> bus mastering ports on IP directly.

Those problems aren't identical, but they seem closely related. My
thought was that this gives us most of the language required to
describe the mastering links for bus-mastering devices.

> The side-table concept would keep the parsing completely contained
> within the IOMMU/etc drivers, and not have it leak out into existing
> core DT code, but it doesn't completely tidy multiple slave ports.
>
> Also, I was thinking after I sent the last email that this is a good
> time to be thinking about a future need for describing NUMA affinities
> in DT. That is basically the same directed graph we are talking about
> here. Trying some modeling samples with that in mind would be a good
> idea..
>
> You should also think about places to encode parameters like
> master/slave QOS and other edge-specific tunables..

The idea that some things are properties of an edge or link, not a
node or device, overlaps with my thinking about IOMMU. This may apply
to any device that behaves like some adaptor or passthrough.

One option is to create subnodes for these links. I did not elaborate
this previously, but I think allowing "slave" to be a node for the
more complex cases where we need to add more info might help here:

	dma {
		slave {
			compatible = "slave-link", "simple-bus";
			ranges = < ... >;
			iommu-foo = < ... >;
			slave = < &shared_bus SLAVE-PORT >;
		}
	};

The best way to describe multiple master ports on the DMA controller
would need some thought. One option would be to extend the address
space on the slave node with additional cell(s) to carry port
identifiers. ePAPR already does things in this sort of way in the
interrupt-map and interrupt-map-mask properties, to describe PCI
interrupt routing.
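To make the shape of the data concrete: the "properties live on edges" idea is just a directed graph whose edges carry attribute sets. A Python sketch (the node names and the axi,arbitration-priority property are borrowed from the invented example earlier in the thread, not from any real binding):

```python
# Sketch: master -> slave edges carrying per-link tunables (QOS,
# arbitration priority, ...) rather than hanging everything off the
# device nodes themselves. All names here are illustrative.

edges = {
    ("cpu", "interconnect"): {},
    ("dma", "interconnect"): {"axi,arbitration-priority": 10},
    ("interconnect", "memory"): {},
}

def masters_of(slave):
    """All masters with an edge into the given slave."""
    return sorted(m for (m, s) in edges if s == slave)

def edge_prop(master, slave, prop, default=None):
    """Look up a tunable on a specific master/slave link."""
    return edges.get((master, slave), {}).get(prop, default)

assert masters_of("interconnect") == ["cpu", "dma"]
assert edge_prop("dma", "interconnect", "axi,arbitration-priority") == 10
```

Whether the edge attributes are encoded as slave-link subnodes, as extra cells in the slave reference, or in a side table is then purely a question of DT surface syntax; the extracted graph is the same.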
> > I may try to come up with a partial description of the Zynq SoC, but
> > I was getting myself confused when I tried it earlier ;)
>
> The Zynq is interesting because all the information is public - and it
> is a good example of the various AXI building blocks. Imagine some
> IOMMUs in there and you have a complete scenario to talk about..

Indeed. I was impressed to see a non-trivial block diagram that wasn't
pasted straight out of some marketing powerpoint :)

It's a good example for discussion here, particularly if we add some
IOMMUs to the mix.

> It even has a coherent AXI port available for IP to hook up to. :)

You mean the ACP port connecting the PL Fabric back to the CPU
cluster? I'm guessing the PL Fabric is the interface to the FPGA
logic.

Now I need to go back to your proposal and the IOMMU thread and try to
understand better how the approaches map onto each other.

Cheers
---Dave
On Thu, Nov 28, 2013 at 06:35:54PM -0800, Greg KH wrote:
> On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote:

> > Greg's point makes sense, but the HW guys are not designing things
> > this way for kicks - there are real physics based reasons for some of
> > these choices...

> > eg An all-to-all bus cross bar (eg like Intel's ring bus) is energy
> > expensive compared to a purpose built muxed bus tree. Doing coherency
> > look ups on DMA traffic costs energy, etc.

> Really? How much power exactly does it take / save? Yes, hardware
> people think "software is free", but when you can't actually control the
> hardware in the software properly, well, you end up with something like
> itanium...

If you look at the hardware design decisions this stuff tends to be
totally sensible; there's a bunch of factors at play (complexity, area
and isolation tend to be other ones). There's a lot of the stuff that
we're complaining about where they can reasonably question why this is
so complex for us. That doesn't mean that everything that it's possible
to do is sensible but there's definitely limitations on the kernel side
here.

> > > code to deal with those descriptions and the hardware they represent. At
> > > some point we need to start pushing some of the complexity back into
> > > hardware so that we can keep a sane code-base.

> > Some of this is a consequence of the push to have the firmware
> > minimal. As soon as you say the kernel has to configure the address
> > map you've created a big complexity for it..

> Why the push to make firmware "minimal"? What is that "saving"?
> You just push the complexity from one place to the other, just because
> ARM doesn't seem to have good firmware engineers, doesn't mean they
> should punish their kernel developers :)

These firmwares have tended to be ROMed or otherwise require expensive
validation to change for sometimes sensible reasons; keeping the amount
of code that's painful to change low will tend to make people happier if
a change is needed. Most people like the risk mitigation.
On Wed, Dec 04, 2013 at 06:43:45PM +0000, Mark Brown wrote: > On Thu, Nov 28, 2013 at 06:35:54PM -0800, Greg KH wrote: > > On Thu, Nov 28, 2013 at 04:31:47PM -0700, Jason Gunthorpe wrote: > > > > Greg's point makes sense, but the HW guys are not designing things > > > this way for kicks - there are real physics based reasons for some of > > > these choices... > > > > eg An all-to-all bus cross bar (eg like Intel's ring bus) is engery > > > expensive compared to a purpose built muxed bus tree. Doing coherency > > > look ups on DMA traffic costs energy, etc. > > > Really? How much power exactly does it take / save? Yes, hardware > > people think "software is free", but when you can't actually control the > > hardware in the software properly, well, you end up with something like > > itanium... > > If you look at the hardware design decisions this stuff tends to be > totally sensible; there's a bunch of factors at play (complexity, area > and isolation tend to be other ones). There's a lot of the stuff that > we're complaining about where they can reasonably question why this is > so complex for us. That doesn't mean that everything that it's possible > to do is sensible but there's definitely limitations on the kernel side > here. The main reason it's so "complex" is the drive for people to have a "one kernel image for multiple systems" and hence the need to have DT handle all of this. I don't think that requirement has been pushed back on the hardware engineers yet, and they are still thinking that a custom image per chip is ok. If the hardware designers don't have that goal, this is just going to get harder and harder over time, as your systems get more and more complex. > > > > code to deal with those descriptions and the hardware they represent. At > > > > some point we need to start pushing some of the complexity back into > > > > hardware so that we can keep a sane code-base. > > > > Some of this is a consequence of the push to have the firmware > > > minimal. 
As soon as you say the kernel has to configure the address > > > map you've created a big complexity for it.. > > > Why the push to make firmware "minimal"? What is that "saving"? You > > just push the complexity from one place to the other, just because ARM > > doesn't seem to have good firmware engineers, doesn't mean they should > > punish their kernel developers :) > > These firmwares have tended to be ROMed or otherwise require expensive > validation to change for sometimes sensible reasons, keeping the amount > of code that's painful to change low will tend to make people happier if > a change is needed. Most people like the risk mitigation. I love it how it's so easy to make the kernel be the part of the whole system stack that is simpler to change than any other :) I'm all for making Linux be the "firmware" and deal with these very low-level issues directly, but again, this drives in the face of your self-stated goal of "one image per architecture" for the kernel. You kind of can't have it both ways it seems, so someone needs to make up their mind as to what it's going to be... Best of luck with this. greg k-h
On Wed, Dec 04, 2013 at 11:03:12AM -0800, Greg KH wrote: > On Wed, Dec 04, 2013 at 06:43:45PM +0000, Mark Brown wrote: > > If you look at the hardware design decisions this stuff tends to be > > totally sensible; there's a bunch of factors at play (complexity, area > > and isolation tend to be other ones). There's a lot of the stuff that > > we're complaining about where they can reasonably question why this is > > so complex for us. That doesn't mean that everything that it's possible > > to do is sensible but there's definitely limitations on the kernel side > > here. > The main reason it's so "complex" is the drive for people to have a "one > kernel image for multiple systems" and hence the need to have DT handle > all of this. I don't think that requirement has been pushed back on the > hardware engineers yet, and they are still thinking that a custom image > per chip is ok. No, the single system image stuff is orthogonal here and is basically irrelevant for many use cases - it's essential for servers and so on but most consumer electronics guys pretty much don't care and this stuff is as much a problem for them as anyone else. We've always struggled with these things even when building for specific hardware, DT is just another way to write the data structure here. When people are talking about figuring out the DT first here what they're taking about is as much working out what we need to abstract first as anything else. This isn't a million miles away from the stuff we've dealt with using probe deferral in terms of fitting into the device model, at least at a high level, though it is harder to sidestep the issues here. > > These firmwares have tended to be ROMed or otherwise require expensive > > validation to change for sometimes sensible reasons, keeping the amount > > of code that's painful to change low will tend to make people happier if > > a change is needed. Most people like the risk mitigation. 
> I love it how it's so easy to make the kernel be the part of the whole
> system stack that is simpler to change than any other :)

Damn open source license :)

> I'm all for making Linux be the "firmware" and deal with these very
> low-level issues directly, but again, this drives in the face of your
> self-stated goal of "one image per architecture" for the kernel. You
> kind of can't have it both ways it seems, so someone needs to make up
> their mind as to what it's going to be...

I think you're confusing me with someone else... in any case, I don't
see why there should be any conflict here.
diff --git a/Documentation/devicetree/bindings/arm/coherent-bus.txt b/Documentation/devicetree/bindings/arm/coherent-bus.txt
new file mode 100644
index 000000000000..e3fbc2e491c7
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/coherent-bus.txt
@@ -0,0 +1,110 @@
+* Generic binding to describe a coherent bus
+
+In some systems, devices (peripherals and/or CPUs) do not share
+coherent views of memory, while on other systems sets of devices may
+share a coherent view of memory depending on the static bus topology
+and/or dynamic configuration of both the bus and device. Establishing
+such dynamic configurations requires appropriate topological information
+to be communicated to the operating system.
+
+This binding document attempts to define a set of generic properties
+which can be used to encode topological information in bus and device
+nodes.
+
+
+* Terminology
+
+ - Port               : An interface over which memory transactions
+                        can propagate. A port may act as a master,
+                        slave or both (see below).
+
+ - Master port        : A port capable of issuing memory transactions
+                        to a slave. For example, a port connecting a
+                        DMA controller to main memory.
+
+ - Slave port         : A port capable of responding to memory
+                        transactions received by a master. For
+                        example, a port connecting the control
+                        registers of an MMIO device to a peripheral
+                        bus.
+
+   **Note** The ports on a bus to which masters are connected are
+   referred to as slave ports on that bus.
+
+
+* Properties
+
+ - #slave-port-cells  : A property of the bus, describing the number
+                        of cells required for an upstream master
+                        device to encode a single slave port on the
+                        bus. The actual encoding is defined by the
+                        bus binding.
+
+ - slave-ports        : A property of a device mastering through a
+                        downstream bus, describing the set of slave
+                        ports on the bus to which the device is
+                        connected.
+                        The property takes the form of a
+                        list of pairs, where each pair contains a
+                        phandle to the bus node as its first element
+                        and #slave-port-cells cells (for the bus
+                        referred to in the first element) as the
+                        second element.
+
+
+* Example
+
+	my-coherent-bus {
+		compatible = "acme,coherent-bus-9000";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		reg = <0xba5e0000 0x10000>;
+
+		[...]	/* More bus-specific properties */
+
+		/*
+		 * Slave ports on this bus can be identified with a
+		 * single cell.
+		 */
+		#slave-port-cells = <1>;
+
+		/* 1:1 address space mapping with our parent bus. */
+		ranges;
+
+		/*
+		 * These devices all have at least their *slave*
+		 * interfaces on the coherent bus.
+		 */
+		dma0@0xfff00000 {
+			compatible = "acme,coherent-dma-9000";
+			reg = <0xfff00000 0x10000>;
+
+			[...]	/* More dma-specific properties */
+
+			/*
+			 * The DMA controller can master through two
+			 * ports on the coherent bus, using port
+			 * identifiers '0' and '1'.
+			 */
+			slave-ports = <&my-coherent-bus 0>,
+				      <&my-coherent-bus 1>;
+		};
+
+		[...]	/* More devices */
+	};
+
+	/*
+	 * A device that can master through the coherent bus, but has
+	 * its slave interface elsewhere.
+	 */
+	dma1@0xfff80000 {
+		compatible = "acme,coherent-dma-9000";
+		reg = <0xfff80000 0x10000>;
+
+		[...]	/* More dma-specific properties */
+
+		/*
+		 * The DMA controller can master through a single port
+		 * on the coherent bus above, using port identifier
+		 * '8'.
+		 */
+		slave-ports = <&my-coherent-bus 8>;
+	};
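A generic parser for slave-ports would walk the flattened cell list much the way interrupt specifiers are walked: read a phandle, look up #slave-port-cells on the referenced bus, consume that many port-identifier cells, repeat. A toy version (Python; the phandle values and bus table are invented for illustration, this is not the kernel's OF code):

```python
# Toy walk of a flattened 'slave-ports' cell list: each entry is a
# phandle followed by #slave-port-cells port-identifier cells, where
# the cell count comes from the referenced bus node.

buses = {
    1: {"#slave-port-cells": 1},  # e.g. my-coherent-bus above
    2: {"#slave-port-cells": 2},  # a bus needing two cells per port
}

def parse_slave_ports(cells):
    """Return a list of (phandle, port-identifier-cells) entries."""
    entries, i = [], 0
    while i < len(cells):
        phandle = cells[i]
        n = buses[phandle]["#slave-port-cells"]
        entries.append((phandle, tuple(cells[i + 1 : i + 1 + n])))
        i += 1 + n
    return entries

# dma0's 'slave-ports = <&my-coherent-bus 0>, <&my-coherent-bus 1>;'
assert parse_slave_ports([1, 0, 1, 1]) == [(1, (0,)), (1, (1,))]
# a mixed list: one port on bus 1, one two-cell port on bus 2
assert parse_slave_ports([1, 8, 2, 0, 3]) == [(1, (8,)), (2, (0, 3))]
```

Because the entry width is dictated by the referenced bus, a single device can name ports on buses with different #slave-port-cells values in one property, which is the flexibility the binding is after.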
Hi all, SoC architectures are getting increasingly complex in ways that are not transparent to software. A particular emerging issue is that of multi-master SoCs, which may have different address views, IOMMUs, and coherency behaviour from one master to the next. DT can't describe multi-master systems today except for PCI DMA and similar. This comes with constraints and assumptions that won't work for emerging SoC bus architectures. On-SoC, a device's interface to the system can't be described in terms of a single interface to a single "bus". Different masters may have different views of the system too. Software needs to understand the true topology in order to do address mapping, coherency management etc., in any generic way. One piece of the puzzle is to define how to describe these topologies in DT. The other is how to get the right abstractions in the kernel to drive these systems in a generic way. The following proposal (originally from Will) begins to address the DT part. Comments encouraged -- I anticipate it may take some discussion to reach a consensus here. Cheers ---Dave From will.deacon@arm.com Wed Nov 20 12:06:22 2013 Date: Wed, 20 Nov 2013 12:06:13 +0000 Subject: [PATCH RFC v2] Documentation: devicetree: add description for generic bus properties This patch documents properties that can be used as part of bus and device bindings in order to describe their linkages within the system topology. Use of these properties allows topological parsing to occur in generic library code, making it easier for bus drivers to parse information regarding their upstream masters and potentially allows us to treat the slave and master interfaces separately for a given device. Signed-off-by: Will Deacon <will.deacon@arm.com> --- A number of discussion points remain to be resolved: - Use of the ranges property and describing slave vs master bus address ranges. 
In the latter case, we actually want to describe our address space with respect to the bus on which the bus masters, rather than the parent. This could potentially be achieved by adding properties such as dma-parent and dma-ranges (already used by PPC?) - Describing masters that master through multiple different buses - How on Earth this fits in with the Linux device model (it doesn't) - Interaction with IOMMU bindings (currently under discussion) Cheers, Will .../devicetree/bindings/arm/coherent-bus.txt | 110 +++++++++++++++++++++ 1 file changed, 110 insertions(+) create mode 100644 Documentation/devicetree/bindings/arm/coherent-bus.txt