Message ID | 1362577426-12804-5-git-send-email-thomas.petazzoni@free-electrons.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Mar 06, 2013 at 02:43:40PM +0100, Thomas Petazzoni wrote: > The Marvell EBU SoCs have a configurable physical address space > layout: the physical ranges of memory used to address PCI(e) > interfaces, NOR flashes, SRAM and various other types of memory are > configurable by software, through a mechanism of so-called 'address > decoding windows'. This is a nice forward step improvement, but I think it would be nicer to read the HW specific information from the DT rather than encoding it in tables in the driver. It would make the driver smaller and simpler.. > +static const struct mvebu_mbus_mapping armada_370_map[] = { > + MAPDEF("bootrom", 1, 0xe0, MAPDEF_NOMASK), > + MAPDEF("devbus-boot", 1, 0x2f, MAPDEF_NOMASK), > + MAPDEF("devbus-cs0", 1, 0x3e, MAPDEF_NOMASK), > + MAPDEF("devbus-cs1", 1, 0x3d, MAPDEF_NOMASK), > + MAPDEF("devbus-cs2", 1, 0x3b, MAPDEF_NOMASK), > + MAPDEF("devbus-cs3", 1, 0x37, MAPDEF_NOMASK), > + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), > + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), > + {}, > +}; These tables could be encoded using the approach Arnd described: /* 2 adress cell value with MAPDEF value (target attributes) in the first cell and 0 in the 2nd. */ #define MAPDEF(t,a,m) ((t<<16) | (a<<8) | m) 0 mbus { compatible = "marvell,mbus"; #address-cells = <2>; #size-cells = <1>; regs = <...> windows = <0 1 2 3 4>; remappable-windows = <7 8 9>; internal-window = 5; // Names and numbers made up for illistration // The 2nd column is where to place the MBUS target in the CPU address map ranges < MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs [..] // MBUS target for internal registers internal_regs { compatible = "simple-bus"; ranges <MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0 0x10000>; #address-cells = <1>; #size-cells = <1>; // Serial controller at offset 0x3000 from the start of internal registers serial@0x3000 { reg = <0x3000 0x100>; } } // MBUS target for boot rom bootrom { compatible = "simple-bus"; ranges <MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0 0x10000>; #address-cells = <1>; #size-cells = <1>; } } Where: - The top ranges is the table 'mvebu_mbus_mapping', combined with the board specific code that sets the target CPU address. The mbus driver can directly setup mappings without requiring board support. - Something like the 'bootrom' bus is where devices under that window would be placed. This also replaces code like this from the board files: mvebu_mbus_add_window("nand", KIRKWOOD_NAND_MEM_PHYS_BASE, KIRKWOOD_NAND_MEM_SIZE); mvebu_mbus_add_window("sram", KIRKWOOD_SRAM_PHYS_BASE, KIRKWOOD_SRAM_SIZE); And lets us make the DT self-consistent with the value of KIRKWOOD_SRAM_PHYS_BASE and others > + * Some variants of Orion5x have 4 remappable windows, some other have > + * only two of them. > + */ > +static const struct mvebu_mbus_soc_data orion5x_4win_mbus_data = { > + .num_wins = 8, > + .num_remappable_wins = 4, These values also seem like something worth reading from the DT as well.. See above for an idea. Jason
Dear Jason Gunthorpe, On Wed, 6 Mar 2013 12:08:21 -0700, Jason Gunthorpe wrote: > This is a nice forward step improvement, but I think it would be nicer > to read the HW specific information from the DT rather than encoding > it in tables in the driver. It would make the driver smaller and > simpler.. > > > +static const struct mvebu_mbus_mapping armada_370_map[] = { > > + MAPDEF("bootrom", 1, 0xe0, MAPDEF_NOMASK), > > + MAPDEF("devbus-boot", 1, 0x2f, MAPDEF_NOMASK), > > + MAPDEF("devbus-cs0", 1, 0x3e, MAPDEF_NOMASK), > > + MAPDEF("devbus-cs1", 1, 0x3d, MAPDEF_NOMASK), > > + MAPDEF("devbus-cs2", 1, 0x3b, MAPDEF_NOMASK), > > + MAPDEF("devbus-cs3", 1, 0x37, MAPDEF_NOMASK), > > + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), > > + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), > > + {}, > > +}; > > These tables could be encoded using the approach Arnd described: > > /* 2 adress cell value with MAPDEF value (target attributes) in the > first cell and 0 in the 2nd. */ > #define MAPDEF(t,a,m) ((t<<16) | (a<<8) | m) 0 > > mbus { > compatible = "marvell,mbus"; > #address-cells = <2>; > #size-cells = <1>; > regs = <...> > > windows = <0 1 2 3 4>; > remappable-windows = <7 8 9>; > internal-window = 5; > > // Names and numbers made up for illistration > // The 2nd column is where to place the MBUS target in the CPU address map > ranges < MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom > MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot > MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs > [..] > > // MBUS target for internal registers > internal_regs { > compatible = "simple-bus"; > ranges <MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0 0x10000>; > #address-cells = <1>; > #size-cells = <1>; > > // Serial controller at offset 0x3000 from the start of internal registers > serial@0x3000 { > reg = <0x3000 0x100>; > } > } > > // MBUS target for boot rom > bootrom { > compatible = "simple-bus"; > ranges <MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0 0x10000>; > #address-cells = <1>; > #size-cells = <1>; > } > } > > Where: > - The top ranges is the table 'mvebu_mbus_mapping', combined with > the board specific code that sets the target CPU address. The > mbus driver can directly setup mappings without requiring board > support. > - Something like the 'bootrom' bus is where devices under that > window would be placed. > > This also replaces code like this from the board files: > > mvebu_mbus_add_window("nand", KIRKWOOD_NAND_MEM_PHYS_BASE, > KIRKWOOD_NAND_MEM_SIZE); > mvebu_mbus_add_window("sram", KIRKWOOD_SRAM_PHYS_BASE, > KIRKWOOD_SRAM_SIZE); > > And lets us make the DT self-consistent with the value of > KIRKWOOD_SRAM_PHYS_BASE and others I personally dislike this proposal quite a lot. I believe it puts way too many things into the Device Tree. Seems like these days, people want to write thousands of lines of data in Device Trees rather than having drivers properly abstract the differences between SoC. Also, I don't believe that windows should be described in the Device Tree. My long term plan is rather to make the allocation of the base address of each window entirely dynamic, because there is absolutely no reason for those addresses to be fixed in stone in the Device Tree. > > > + * Some variants of Orion5x have 4 remappable windows, some other have > > + * only two of them. > > + */ > > +static const struct mvebu_mbus_soc_data orion5x_4win_mbus_data = { > > + .num_wins = 8, > > + .num_remappable_wins = 4, > > These values also seem like something worth reading from the DT as > well.. See above for an idea. No: there will anyway be different functions to be called depending on the SoC family. So the driver will anyway have to have different compatible strings for the different SoC families, because we can't put code into the Device Tree. So rather than having some of the SoC-specific data encoded in the DT and some of the SoC-specific code in the driver, I'd rather have all the SoC-specific things (data *and* code) into the driver, and leave the DT only the responsibility of telling which SoC variant we're running on, and where the window registers are. Another reason is that the DT bindings become part of the kernel ABI, so they can't be changed once they are defined and have started to be used seriously on production devices. This leads to a very simple choice: limit the amount of things we put in the DT, and instead put more of the internal logic into the driver. So this was for the fundamental disagreement. No, I also have a temporary disagreement: I think what you're asking is way too far away from the current code. I've already been asked to make gazillions of cleanups and changes to get the PCIe driver in, and I don't think it's honest and reasonable to ask me to redo the entire Marvell world just to get the PCIe driver in. I've already refactored this address decoding stuff, which should make future evolutions much easier. I've also taken care of migrating all the legacy SoC families. So now asking me to rework the entire thing in one step, as a requirement to get the PCIe stuff in, it's really going beyond what's reasonable, I feel. Best regards, Thomas
On Wed, Mar 06, 2013 at 08:27:10PM +0100, Thomas Petazzoni wrote: > I personally dislike this proposal quite a lot. I believe it puts way > too many things into the Device Tree. Seems like these days, people > want to write thousands of lines of data in Device Trees rather than > having drivers properly abstract the differences between SoC. Well, in this case I think it is fairly reasonable because it means the mbus driver could work with all current and future Marvell chips that are based on this architecture, rather than having to fiddle with the driver every time.. Part of the point of all this DT stuff is to give HW designers some reasonable flexability in their HW design without requiring code changes to Linux to boot it. So if Marvell makes a new Armada varient that re-uses all the same pre existing IP, just with different mbus targets, they should be able to make Linux run on it simply by writing a SOC specific firmware and DT. > Also, I don't believe that windows should be described in the Device > Tree. My long term plan is rather to make the allocation of the base > address of each window entirely dynamic, because there is absolutely no > reason for those addresses to be fixed in stone in the Device Tree. How will that work? Device tree necessarily must have CPU addresses for the things it describes. Today we have the problem that those addresses must match the values hard coded into Linux, ie DT is describing what Linux expects, not what the HW provides.. OF was targetted at the case where the firmware would set the windows and then the DT is simply the way to communicate what the window settings are to the consumer. This idea that Linux should setup windows on its own has moved away from what OF was intended for, I'm not sure there is an overarching plan on how to handle those cases - but hardcoding addresses in Linux and then requiring the DT to match them exactly certainly seems wrong. As for dynamic allocation, they way to do it via DT is through the ranges property, similar to how I noted. The DT can be modified in-memory, so a dyanmic mbus driver would ignore the CPU target address in the ranges, select its own CPU address and then update the in-memory DT with the corrected value. Everything else would then work properly. > > > + * Some variants of Orion5x have 4 remappable windows, some other have > > > + * only two of them. > > > + */ > > > +static const struct mvebu_mbus_soc_data orion5x_4win_mbus_data = { > > > + .num_wins = 8, > > > + .num_remappable_wins = 4, > > > > These values also seem like something worth reading from the DT as > > well.. See above for an idea. > > No: there will anyway be different functions to be called depending on > the SoC family. So the driver will anyway have to have different > compatible strings for the different SoC families, because we can't > put Well I wrote the 'mbus_win_offset' function concept with DT lists: > windows = <0 1 2 3 4>; > remappable-windows = <7 8 9>; > internal-window = 5; And the two varients of the SDRAM function still have to exist, but would be best handled by separating the SDRAM description from the MBUS description in DT - they are orthogonal concepts. > Another reason is that the DT bindings become part of the kernel ABI, > so they can't be changed once they are defined and have started to be > used seriously on production devices. This leads to a very simple > choice: limit the amount of things we put in the DT, and instead put > more of the internal logic into the driver. Well, this is certainly a fair notion - but things like the MAPDEF values, and window offsets are properties of the SOC and are never, ever going to change. So selecting a sane way to communicate those values through DT makes lots of sense to me. > No, I also have a temporary disagreement: I think what you're asking is > way too far away from the current code. I've already been asked to > make I certainly agree with this. If it wasn't for the idea that the DT is a stable ABI and thus must be perfect from the start, I wouldn't have mentioned any of this right now, making fine tuning adjustments as a follow on seems better to me. It is absolutely way too much work to try and fix everything in one go! I would be happy to see your patch go in as-is, but mildly unhappy to see us stuck with this DT layout forever... Especially since I really want to see your PCI stuff go in .... Jason
Dear Jason Gunthorpe, On Wed, 6 Mar 2013 13:24:47 -0700, Jason Gunthorpe wrote: > Well, in this case I think it is fairly reasonable because it means > the mbus driver could work with all current and future Marvell chips > that are based on this architecture, rather than having to fiddle with > the driver every time.. > > Part of the point of all this DT stuff is to give HW designers some > reasonable flexability in their HW design without requiring code > changes to Linux to boot it. So if Marvell makes a new Armada varient > that re-uses all the same pre existing IP, just with different mbus > targets, they should be able to make Linux run on it simply by writing > a SOC specific firmware and DT. That simply doesn't work for real. HW designers often introduce a subtle change in the registers, and bing, your driver no longer works, and you have to fix it anyway. The idea that making changes only to the DT will allow porting the kernel to a new family of SoC is just a dream. I know a lot of people are dreaming about this, but it doesn't make sense: you can't imagine, when designing a DT binding, what kind of crazy bizarre change the HW designers will come up with in a newer revision of a given IP. And the good way to handle that, IMO, is to put as little logic in the DT as possible (because DT bindings have to be stable), and handle the crazy little differences between SoCs in the driver. > > Also, I don't believe that windows should be described in the Device > > Tree. My long term plan is rather to make the allocation of the base > > address of each window entirely dynamic, because there is absolutely no > > reason for those addresses to be fixed in stone in the Device Tree. > > How will that work? Device tree necessarily must have CPU addresses > for the things it describes. Today we have the problem that those > addresses must match the values hard coded into Linux, ie DT is > describing what Linux expects, not what the HW provides.. I don't follow you. For the moment we have to say in the DT, I have a NOR of 16 MB, and it's mapped at 0xCD000000. This 0xCD000000 is completely useless: the NOR driver or whatever driver sets up the NOR can do an allocate_resource(), which returns an available fragment of the physical address space, and create the address decoding window at this address. The base address of address decoding windows is not a description of the hardware. Those base addresses can be dynamically allocated during the probing of devices. > OF was targetted at the case where the firmware would set the windows > and then the DT is simply the way to communicate what the window > settings are to the consumer. > > This idea that Linux should setup windows on its own has moved away > from what OF was intended for, I'm not sure there is an overarching > plan on how to handle those cases - but hardcoding addresses in Linux > and then requiring the DT to match them exactly certainly seems wrong. Where have you seen that I want to hardcode addresses in Linux? That's what is done for now, because it has been done this way since a long time in the Linux support for Marvell SoCs, and I don't want to move away from that in one step. > > No: there will anyway be different functions to be called depending on > > the SoC family. So the driver will anyway have to have different > > compatible strings for the different SoC families, because we can't > > put > > Well I wrote the 'mbus_win_offset' function concept with DT lists: > > > windows = <0 1 2 3 4>; > > remappable-windows = <7 8 9>; > > internal-window = 5; > > And the two varients of the SDRAM function still have to exist, but > would be best handled by separating the SDRAM description from the > MBUS description in DT - they are orthogonal concepts. You've solved the problem for those two particular cases, and then the next SoC shows up, and has a little difference in the register layout, or something... and your DT binding no longer works. Really, I don't see any point or any useful feature brought by adding this enormous additional amount of information in the DT. > > Another reason is that the DT bindings become part of the kernel ABI, > > so they can't be changed once they are defined and have started to be > > used seriously on production devices. This leads to a very simple > > choice: limit the amount of things we put in the DT, and instead put > > more of the internal logic into the driver. > > Well, this is certainly a fair notion - but things like the MAPDEF > values, and window offsets are properties of the SOC and are never, > ever going to change. So selecting a sane way to communicate those > values through DT makes lots of sense to me. Never going to change? They change from one SoC family to another... > > No, I also have a temporary disagreement: I think what you're asking is > > way too far away from the current code. I've already been asked to > > make > > I certainly agree with this. If it wasn't for the idea that the DT is > a stable ABI and thus must be perfect from the start, I wouldn't have > mentioned any of this right now, making fine tuning adjustments as a > follow on seems better to me. > > It is absolutely way too much work to try and fix everything in one > go! I would be happy to see your patch go in as-is, but mildly unhappy > to see us stuck with this DT layout forever... Again, I don't see the point of making the DT binding more complex than the one being proposed here, and you haven't given any really compelling arguments except that it is not to your taste. I believe we're reaching a point where things start to be a matter of taste, and so unless you're writing the code and doing the work yourself, you also have to accept that other people might have different visions of how things should be handled, and therefore have different tastes. How can we make progress on this, in a *reasonable* way? Best regards, Thomas
On Wed, Mar 06, 2013 at 09:40:36PM +0100, Thomas Petazzoni wrote: > > Part of the point of all this DT stuff is to give HW designers some > > reasonable flexability in their HW design without requiring code > > changes to Linux to boot it. So if Marvell makes a new Armada varient > > that re-uses all the same pre existing IP, just with different mbus > > targets, they should be able to make Linux run on it simply by writing > > a SOC specific firmware and DT. > > That simply doesn't work for real. HW designers often introduce a > subtle change in the registers, and bing, your driver no longer works, > and you have to fix it anyway. The goal would be to have HW architects cognisant of what capabilites are available in the DT and Kernel and stay within those limits. Fundamentally this is how x86 maintains its compatibility and to achive this dream on ARM the same must be true. This process must go from both sides, the DT must allow enough flexibility and the HW architect must care to minimize the software impact. Today there is no real incentive for HW to say the same because they have to make Linux patches for every new SOC anyhow. The introduction of DT will be the first time there is an actual tangible economic advantage to developing/considering DT bindings for your IP during the HW architecture phase. I hope that will change things for the better :) > > > Also, I don't believe that windows should be described in the Device > > > Tree. My long term plan is rather to make the allocation of the base > > > address of each window entirely dynamic, because there is absolutely no > > > reason for those addresses to be fixed in stone in the Device Tree. > > > > How will that work? Device tree necessarily must have CPU addresses > > for the things it describes. Today we have the problem that those > > addresses must match the values hard coded into Linux, ie DT is > > describing what Linux expects, not what the HW provides.. > > I don't follow you. For the moment we have to say in the DT, I have a > NOR of 16 MB, and it's mapped at 0xCD000000. This 0xCD000000 is > completely useless: the NOR driver or whatever driver sets up the NOR > can do an allocate_resource(), which returns an available fragment of > the physical address space, and create the address decoding window at > this address. As I understand it (from other conversations), the preferred way to do this in a DT world is with the layout I described. The mbus driver would allocate the regions and update the DT in memory with the results of allocate_resource(). There are a couple ways to make a DT layout that allows this, but the basic idea is the same. This keeps the window allocation encapsulated within the mbus driver, the NOR driver doesn't know or care about these details, and all the OF machinery still works because the window base address in DT is valid. > > This idea that Linux should setup windows on its own has moved away > > from what OF was intended for, I'm not sure there is an overarching > > plan on how to handle those cases - but hardcoding addresses in Linux > > and then requiring the DT to match them exactly certainly seems wrong. > > Where have you seen that I want to hardcode addresses in Linux? That's > what is done for now, because it has been done this way since a long > time in the Linux support for Marvell SoCs, and I don't want to move > away from that in one step. I say this because I don't see how to go from this DT binding to a future that has full dynamic window address assignment. What do you imagine a NOR DT stanza would look like for dynamic assignment? How do you tell which DT stanza is associated with which target attributes? How do you keep mbus specific window setup code out of the NOR/NAND/SPI/MTD/etc drivers? > You've solved the problem for those two particular cases, and then the > next SoC shows up, and has a little difference in the register layout, > or something... and your DT binding no longer works. There are now about 5 SOCs with the same layout for the MBUS window registers, and the idea to specify the base offset of each of the 3 types of window register types works for all of them. So to me, this is a prime thing to store in DT. If a future HW uses a totally different register layout then should this driver even work on it? But, this is orthogonal to using 'ranges' and target id's in the DT to associate DT stanzas with MBUS targets. > > Well, this is certainly a fair notion - but things like the MAPDEF > > values, and window offsets are properties of the SOC and are never, > > ever going to change. So selecting a sane way to communicate those > > values through DT makes lots of sense to me. > > Never going to change? They change from one SoC family to another... I ment never change within the SOC. If they were the same on every SOC then they could live in the driver just fine. It is the fact that every SOC that uses the MBUS driver uses different target attributes that makes them a good candidate to live in DT. > > It is absolutely way too much work to try and fix everything in one > > go! I would be happy to see your patch go in as-is, but mildly unhappy > > to see us stuck with this DT layout forever... > > Again, I don't see the point of making the DT binding more complex than > the one being proposed here, and you haven't given any really > compelling arguments except that it is not to your taste. I believe > we're reaching a point where things start to be a matter of taste, and > so unless you're writing the code and doing the work yourself, you also > have to accept that other people might have different visions of how > things should be handled, and therefore have different tastes. > > How can we make progress on this, in a *reasonable* way? Well, as you say, we've pretty much completed presenting arguments for each view on this subject, so it is really up to the gatekeepers of the stable DT 'ABI' to decide on how they want this all to look in the end, IMHO. Jason C/Andrew L/Arnd? Regards, Jason
On Wed, Mar 06, 2013 at 02:50:31PM -0700, Jason Gunthorpe wrote: > On Wed, Mar 06, 2013 at 09:40:36PM +0100, Thomas Petazzoni wrote: ... > > Again, I don't see the point of making the DT binding more complex than > > the one being proposed here, and you haven't given any really > > compelling arguments except that it is not to your taste. I believe > > we're reaching a point where things start to be a matter of taste, and > > so unless you're writing the code and doing the work yourself, you also > > have to accept that other people might have different visions of how > > things should be handled, and therefore have different tastes. > > > > How can we make progress on this, in a *reasonable* way? > > Well, as you say, we've pretty much completed presenting arguments for > each view on this subject, so it is really up to the gatekeepers of > the stable DT 'ABI' to decide on how they want this all to look in the > end, IMHO. > > Jason C/Andrew L/Arnd? I'm inclined to agree with Thomas. Since the the first Kirkwood DT board was added to the tree, it's been pounded into me that DT *describes hardware*. Not how random bootloader X happened to configure it. However, we also have the example of local-mac-address. Which is describing how the hardware is *configured* to present a consistent id to a network (from bootloader to kernel to userspace). This is certainly a valid deviation from the rule. If a bootloader setup some windows before handing off control to the kernel (keep in mind, this could be *BSD, windows, BeOS, etc), is there any /need/ to keep that configuration? If we are to include Jason's suggestion, I think it should be listed as optional properties typically inserted at boot by the bootloader as an FYI. And, for Thomas' sanity, presented as a future patch on top of current work. ;-) At this point, we have at least a few release cycles to shake out a standard DT binding for Marvell SoCs. Currently, not a single product ships with a DT-enabled bootloader. Not even the development boards created by Marvell. So I'm fine with Thomas' bindings as they stand *and* fine with adjusting them over the next few releases as needed. It'll be a different story once products start being released with DT-enabled bootloaders. thx, Jason.
On Wed, Mar 06, 2013 at 05:27:12PM -0500, Jason Cooper wrote: > On Wed, Mar 06, 2013 at 02:50:31PM -0700, Jason Gunthorpe wrote: > > On Wed, Mar 06, 2013 at 09:40:36PM +0100, Thomas Petazzoni wrote: > ... > > > Again, I don't see the point of making the DT binding more complex than > > > the one being proposed here, and you haven't given any really > > > compelling arguments except that it is not to your taste. I believe > > > we're reaching a point where things start to be a matter of taste, and > > > so unless you're writing the code and doing the work yourself, you also > > > have to accept that other people might have different visions of how > > > things should be handled, and therefore have different tastes. > > > > > > How can we make progress on this, in a *reasonable* way? > > > > Well, as you say, we've pretty much completed presenting arguments for > > each view on this subject, so it is really up to the gatekeepers of > > the stable DT 'ABI' to decide on how they want this all to look in the > > end, IMHO. > > > > Jason C/Andrew L/Arnd? > > I'm inclined to agree with Thomas. Since the the first Kirkwood DT > board was added to the tree, it's been pounded into me that DT > *describes hardware*. Not how random bootloader X happened to configure > it. I'm not sure this is what we are discussing, certainly it isn't what I was talking about... The discussion was around the SOC specific tables in the driver and if they should be in DT or C code. These tables very much describe the hardware, and are copied from similar tables in the Marvell manuals. For instance I repeated this idea: ranges =<MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs [..] At the stanza of MBUS driver's bus. Each line represents a MBUS target (this is a physical HW IP block). The MAPDEF is a SOC specific 'target id' that is currently living in tables in the driver. This value is directly compatible with the mbus window register and is defined by Marvell. The 2nd number is the CPU base address of that window, and the last is the size. Per OF conventions the base address should be the value the bootloader left it at - but that is not important, the value could be 0. A dynamic MBUS driver would go through each item in the ranges, find an address range and program a window with the MAPDEF from the DT. It would then write the base address back into the DT (at runtime!) so that the entire DT remains conistent with the current state of the hardware. An initial implementation without dynamic address allocation would just have the mbus driver directly use the 2nd value as the base address. The attraction to this model is that the association of later DT stanza's to MBUS targets is extremely clear: nor@0 { compatible = 'linux mtd nor driver'; regs = <MAPDEF(1, 0xab, MAPDEF_NOMASK) 0x10000>; } We can now tell directly that 'linux mtd nor driver' is behind MBUS target 0xab, and the OF code already knows the base for this MBUS target from the ranges property, as modified by the MBUS driver. The MBUS driver also knows which targets to configure windows for immediately at boot without having to consult internal tables. Plus we don't need to modify the NOR driver to call out to MBUS window functions to fetch an address. There are a few other variants on how to do this in DT, but they all come down to modeling the MBUS target using a ranges property and having the mbus driver adjust the range mapping at runtime. > At this point, we have at least a few release cycles to shake out a > standard DT binding for Marvell SoCs. Currently, not a single product > ships with a DT-enabled bootloader. Not even the development boards > created by Marvell. So I'm fine with Thomas' bindings as they stand > *and* fine with adjusting them over the next few releases as needed. I think this is a fine idea. Thomas's driver is a much better starting place to do stuff like the above! Regards, Jason
On Wednesday 06 March 2013, Jason Gunthorpe wrote: > On Wed, Mar 06, 2013 at 08:27:10PM +0100, Thomas Petazzoni wrote: > Part of the point of all this DT stuff is to give HW designers some > reasonable flexability in their HW design without requiring code > changes to Linux to boot it. So if Marvell makes a new Armada varient > that re-uses all the same pre existing IP, just with different mbus > targets, they should be able to make Linux run on it simply by writing > a SOC specific firmware and DT. Agreed. Further, the device tree should accurately describe the mapping of addresses, and we have a lot of infrastructure to do exactly that. > > Also, I don't believe that windows should be described in the Device > > Tree. My long term plan is rather to make the allocation of the base > > address of each window entirely dynamic, because there is absolutely no > > reason for those addresses to be fixed in stone in the Device Tree. > > How will that work? Device tree necessarily must have CPU addresses > for the things it describes. Today we have the problem that those > addresses must match the values hard coded into Linux, ie DT is > describing what Linux expects, not what the HW provides.. Exactly. > This idea that Linux should setup windows on its own has moved away > from what OF was intended for, I'm not sure there is an overarching > plan on how to handle those cases - but hardcoding addresses in Linux > and then requiring the DT to match them exactly certainly seems wrong. > > As for dynamic allocation, they way to do it via DT is through the > ranges property, similar to how I noted. The DT can be modified > in-memory, so a dyanmic mbus driver would ignore the CPU target > address in the ranges, select its own CPU address and then update the > in-memory DT with the corrected value. Everything else would then work > properly. Yes. We can even use the index of the "ranges" property to refer directly to the number of the window, so ranges replaces the kernel internal struct mvebu_mbus_mapping. I would also suggest not making the mbus compatible with "simple_bus", but instead probing /only/ the mbus device at bootup, which can then set up the hardware based on the ranges property and then call of_platform_populate on its children. > > > > + * Some variants of Orion5x have 4 remappable windows, some other have > > > > + * only two of them. > > > > + */ > > > > +static const struct mvebu_mbus_soc_data orion5x_4win_mbus_data = { > > > > + .num_wins = 8, > > > > + .num_remappable_wins = 4, > > > > > > These values also seem like something worth reading from the DT as > > > well.. See above for an idea. > > > > No: there will anyway be different functions to be called depending on > > the SoC family. So the driver will anyway have to have different > > compatible strings for the different SoC families, because we can't > > put > > Well I wrote the 'mbus_win_offset' function concept with DT lists: > > > windows = <0 1 2 3 4>; > > remappable-windows = <7 8 9>; > > internal-window = 5; > > And the two varients of the SDRAM function still have to exist, but > would be best handled by separating the SDRAM description from the > MBUS description in DT - they are orthogonal concepts. *nod* > > No, I also have a temporary disagreement: I think what you're asking is > > way too far away from the current code. I've already been asked to > > make > > I certainly agree with this. If it wasn't for the idea that the DT is > a stable ABI and thus must be perfect from the start, I wouldn't have > mentioned any of this right now, making fine tuning adjustments as a > follow on seems better to me. > > It is absolutely way too much work to try and fix everything in one > go! I would be happy to see your patch go in as-is, but mildly unhappy > to see us stuck with this DT layout forever... > > Especially since I really want to see your PCI stuff go in .... I basically agree with everything you write in this mail, Jason, and I was about to make exactly the same comments myself. Getting the binding right is certainly a priority here, but I also think that we don't need to rush the code to use that binding (although having the code work would make me more comfortable thinking that the binding actually works). Arnd
On Wednesday 06 March 2013, Thomas Petazzoni wrote: > I don't follow you. For the moment we have to say in the DT, I have a > NOR of 16 MB, and it's mapped at 0xCD000000. This 0xCD000000 is > completely useless: the NOR driver or whatever driver sets up the NOR > can do an allocate_resource(), which returns an available fragment of > the physical address space, and create the address decoding window at > this address. > > The base address of address decoding windows is not a description of > the hardware. Those base addresses can be dynamically allocated during > the probing of devices. I think you just described precisely why the binding that Jason has suggested is able to describe what the hardware does, while the binding you proposed cannot do that. If we want to dynamically assign the address of the NOR flash, we need a way to describe the hardware address that we want to remap to a CPU visible logical address using MBUS. When each device behind MBUS has its own bus space in hardware, a ranges property is exactly what we should use to describe how that bus space is mapped into the parent bus that is visible to the CPU. The ranges property that is passed by the boot loader can describe how the address is currently configured at boot time, or how you would like the kernel to configure it to get a sensible system view. But if you want to dynamically remap it in MBUS, that should /not/ require changing the "reg" property of the device, because the device itself does not change, only the mapping changes. There is a little deficiency here in the way that Linux probes the platform devices: We allocate a "struct resource" at the time the platform_device is created, and that resource will be incorrect if we remap the window. However, I think it's easy enough to work around that by using 'of_iomap' in drivers that are children of the MBUS and can get remapped, rather than using 'devm_ioremap_resource'. Platform devices are Linux specific implementation detail that should not impact the DT binding in this case. Arnd
On Thu, Mar 07, 2013 at 03:37:43AM +0000, Arnd Bergmann wrote: > Yes. We can even use the index of the "ranges" property to refer directly > to the number of the window, so ranges replaces the kernel internal > struct mvebu_mbus_mapping. I think we are thinking the same thing, but, s/index/bus address/ and s/number of the window/target id/ ? > I would also suggest not making the mbus compatible with "simple_bus", but > instead probing /only/ the mbus device at bootup, which can then set > up the hardware based on the ranges property and then call > of_platform_populate on its children. Yeah, something is needed to make startup ordering work. Things like the timer and main interrupt controller live behind the special internal regs window, so in a perfect world finding the timer would automatically setup the enclosing bus structures ... > > I certainly agree with this. If it wasn't for the idea that the DT is > > a stable ABI and thus must be perfect from the start, I wouldn't have > > mentioned any of this right now, making fine tuning adjustments as a > > follow on seems better to me. > > > > It is absolutely way too much work to try and fix everything in one > > go! I would be happy to see your patch go in as-is, but mildly unhappy > > to see us stuck with this DT layout forever... > > > > Especially since I really want to see your PCI stuff go in .... > > I basically agree with everything you write in this mail, Jason, and > I was about to make exactly the same comments myself. > > Getting the binding right is certainly a priority here, but I also think > that we don't need to rush the code to use that binding (although having > the code work would make me more comfortable thinking that the binding > actually works). Do you think Jason C's notion to take it as is and then be OK with altering the binding later pending agreement is acceptable for now? I agree with Thomas that this is too much to ask just to get the already huge Marvell PCI-E patchset into mainline. Jason
Dear Jason Gunthorpe, On Wed, 6 Mar 2013 16:04:12 -0700, Jason Gunthorpe wrote: > I'm not sure this is what we are discussing, certainly it isn't what I > was talking about... > > The discussion was around the SOC specific tables in the driver and if > they should be in DT or C code. These tables very much describe the > hardware, and are copied from similar tables in the Marvell manuals. > > For instance I repeated this idea: > > ranges =<MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom > MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot > MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs > [..] This doesn't work for PCIe interfaces. The PCIe windows cannot be described in the DT, because they are inherently dynamic, and depend on which PCIe devices are plugged into the PCIe slots, and how much I/O and memory space each of those devices require. This means that regardless of what you think, *some* windows will have to be created dynamically during the system initialization, and we will *not* be able to describe all windows in the Device Tree. And to me, it seems completely stupid to have some windows defined statically in the Device Tree, and some other windows set up dynamically when the system initializes. > At the stanza of MBUS driver's bus. Each line represents a MBUS target > (this is a physical HW IP block). The MAPDEF is a SOC specific 'target > id' that is currently living in tables in the driver. This value is > directly compatible with the mbus window register and is defined by > Marvell. The 2nd number is the CPU base address of that window, and > the last is the size. Please explain how you handle PCIe windows. > Per OF conventions the base address should be the value the bootloader > left it at - but that is not important, the value could be 0. A > dynamic MBUS driver would go through each item in the ranges, find > an address range and program a window with the MAPDEF from the DT. It > would then write the base address back into the DT (at runtime!) so > that the entire DT remains conistent with the current state of the > hardware. Why the heck would the kernel need to rewrite the DT ? Just like a driver does an ioremap() to create a virtual -> physical mapping, the driver can just as well do mvebu_mbus_add_window() to create a physical -> device mapping. It doesn't have to be hardcoded into the Device Tree. > An initial implementation without dynamic address allocation would > just have the mbus driver directly use the 2nd value as the base > address. > > The attraction to this model is that the association of later DT > stanza's to MBUS targets is extremely clear: > > nor@0 { > compatible = 'linux mtd nor driver'; > regs = <MAPDEF(1, 0xab, MAPDEF_NOMASK) 0x10000>; > } > > We can now tell directly that 'linux mtd nor driver' is behind MBUS > target 0xab, and the OF code already knows the base for this MBUS > target from the ranges property, as modified by the MBUS driver. The > MBUS driver also knows which targets to configure windows for > immediately at boot without having to consult internal tables. Plus we > don't need to modify the NOR driver to call out to MBUS window > functions to fetch an address. Please take the PCIe example instead. Thanks. Best regards, Thomas
Dear Arnd Bergmann, On Thu, 7 Mar 2013 03:37:43 +0000, Arnd Bergmann wrote: > On Wednesday 06 March 2013, Jason Gunthorpe wrote: > > On Wed, Mar 06, 2013 at 08:27:10PM +0100, Thomas Petazzoni wrote: > > > Part of the point of all this DT stuff is to give HW designers some > > reasonable flexability in their HW design without requiring code > > changes to Linux to boot it. So if Marvell makes a new Armada varient > > that re-uses all the same pre existing IP, just with different mbus > > targets, they should be able to make Linux run on it simply by writing > > a SOC specific firmware and DT. > > Agreed. Further, the device tree should accurately describe the mapping > of addresses, and we have a lot of infrastructure to do exactly that. Why should the Device Tree describe something that isn't a description of the hardware? The mapping of addresses used for address decoding windows is inherently dynamic. Do I have to, again, repeat the PCIe discussion? We *cannot* describe the PCIe address decoding windows in the Device Tree. It is NOT possible. We have too many PCIe interfaces (up to 10), too few windows (only 20), and not enough physical address space, to cover, statically, all possible combinations of PCIe devices. In addition, the PCIe bus is very nice because it is dynamic: you discover devices, how much memory and I/O space they require, and you set up the windows accordingly. Please always think about this PCIe use case when proposing any sort of solution about address decoding windows. Any solution that doesn't take into account the dynamic nature of PCIe windows is simply useless. > > > > Also, I don't believe that windows should be described in the Device > > > Tree. My long term plan is rather to make the allocation of the base > > > address of each window entirely dynamic, because there is absolutely no > > > reason for those addresses to be fixed in stone in the Device Tree. > > > > How will that work? Device tree necessarily must have CPU addresses > > for the things it describes. Today we have the problem that those > > addresses must match the values hard coded into Linux, ie DT is > > describing what Linux expects, not what the HW provides.. > > Exactly. No, the Device Tree does not need to have CPU addresses. The hardware description of a NOR is : * It is connected to chip-select X * Its size is 16 MB * Its model/description/characteristics are <foo> So the NOR would be: reg = <0 0x1000000>; cs = <3>; We encode this information in the Device Tree, because they describe the hardware. Then, in the driver, we do: struct resource nor_resource; /* Get the size of the NOR from DT */ of_address_to_resource(np, 0, &nor_resource); /* Find an available physical range to map the NOR */ allocate_resource(&iomem_res, &nor_resource, resource_size(&nor_resource), 0, 0, 0, NULL, NULL); /* Create the address decoding window */ mvebu_mbus_add_window(nor, nor_resource.start, resource_size(&nor_resource)); And that's it. The physical address at which the NOR is mapped is *NOT* a description of the hardware. We are not hardcoding virtual addresses in the DT, right? So for the same reason, we should not hardcode the address decoding windows and their base address into the DT. Please explain why you think hardcoding in the Device Tree things that is not a hardware description is a good idea. > > This idea that Linux should setup windows on its own has moved away > > from what OF was intended for, I'm not sure there is an overarching > > plan on how to handle those cases - but hardcoding addresses in Linux > > and then requiring the DT to match them exactly certainly seems wrong. > > > > As for dynamic allocation, they way to do it via DT is through the > > ranges property, similar to how I noted. The DT can be modified > > in-memory, so a dyanmic mbus driver would ignore the CPU target > > address in the ranges, select its own CPU address and then update the > > in-memory DT with the corrected value. Everything else would then work > > properly. > > Yes. We can even use the index of the "ranges" property to refer directly > to the number of the window, so ranges replaces the kernel internal > struct mvebu_mbus_mapping. Doesn't work with PCIe windows that have to be dynamically created. It really strikes me that in the name of the Device Tree, we would have to hardcode in stone things that are inherently dynamic. > Getting the binding right is certainly a priority here, but I also think > that we don't need to rush the code to use that binding (although having > the code work would make me more comfortable thinking that the binding > actually works). The current binding is just a compatible string, not more. So it can easily be extended in the future to have more properties below it giving more configuration details. Of course, I continue to fundamentally disagree with the proposal you and Jason are making, but if you want to implement something that does that, it will always be possible to extend the DT binding in the future. Also remember that the mvebu-mbus driver has to be compatible with non-DT platforms. We haven't migrated yet all the SoC families to the DT, and it will take a bit more time to convert all of them to the Device Tree. Would it be possible to agree that this driver is a first step that consolidates the existing address decoding code, which doesn't prevent further improvements, and get the driver code merged in this current state (of course, after fixing all the bugs, issues, that will be discovered during review and testing) ? Thanks, Thomas
On Thu, Mar 07, 2013 at 11:58:17AM +0100, Thomas Petazzoni wrote: > Please always think about this PCIe use case when proposing any sort of > solution about address decoding windows. Any solution that doesn't take > into account the dynamic nature of PCIe windows is simply useless. PCIe is not a problem, it continues to use the dynamic interface! This is all about how do you connect the DT stanzas that are already in the DT to the dynamic window mbus system - and you do that via ranges. The ranges only need to cover the needs of the static elements of the DT - they do not need to extend beyond to fully dynamic elements like PCI-E - those cases are *totally* different and must be handled in code. The reason for this is quite simply to avoid this ugly boiler plate in every mbus driver: > struct resource nor_resource; > > /* Get the size of the NOR from DT */ > of_address_to_resource(np, 0, &nor_resource); > > /* Find an available physical range to map the NOR */ > allocate_resource(&iomem_res, &nor_resource, > resource_size(&nor_resource), > 0, 0, 0, NULL, NULL); > > /* Create the address decoding window */ > mvebu_mbus_add_window(nor, nor_resource.start, > resource_size(&nor_resource)); > Not only does this violate encapsulate principles, it breaks the DT modeling because there is no place in the DT that has the base address!! The PCI system allocates and assigns BAR prior to probing the driver - that is the Linux convention. The mbus driver needs to do the same thing. The window must be setup prior to calling a child's probe, and the DT must be correct at that time. > It really strikes me that in the name of the Device Tree, we would have > to hardcode in stone things that are inherently dynamic. What do you mean? The suggestion is to describe all in-DT elements with a ranges like this: > ranges =<MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 Which directly models the HW MBUS system. The target-id is set in stone by the SOC, the window size is based on what is described in the DT - so there is no need for it to be smaller/larger, and the base address can be run-time altered to the true window base by the mbus driver. There is nothing inherently dynamic that is set in stone. The base address *is not set in stone* - it is just refactored out of the individual stanzas and placed into a single common place that is easy to find pragmatically. This is not about PCI-E - PCI-E continues to use the full dynamic interface, nothing major changes there. > Would it be possible to agree that this driver is a first step that > consolidates the existing address decoding code, which doesn't prevent > further improvements, and get the driver code merged in this current > state (of course, after fixing all the bugs, issues, that will be > discovered during review and testing) ? I continue to agree with this. However, I think Ezequiel's driver should be held up pending a decision if mvebu_mbus_add_window is appropriate for it to call. Regards, Jason
Dear Jason Gunthorpe, On Thu, 7 Mar 2013 10:37:36 -0700, Jason Gunthorpe wrote: > On Thu, Mar 07, 2013 at 11:58:17AM +0100, Thomas Petazzoni wrote: > > > Please always think about this PCIe use case when proposing any sort of > > solution about address decoding windows. Any solution that doesn't take > > into account the dynamic nature of PCIe windows is simply useless. > > PCIe is not a problem, it continues to use the dynamic interface! > > This is all about how do you connect the DT stanzas that are already > in the DT to the dynamic window mbus system - and you do that via > ranges. The ranges only need to cover the needs of the static elements > of the DT - they do not need to extend beyond to fully dynamic > elements like PCI-E - those cases are *totally* different and must be > handled in code. I still don't see why devices like NOR and PCIe shouldn't be handled in the same way. > The reason for this is quite simply to avoid this ugly boiler plate in > every mbus driver: Come on "ugly" ? It's three line of basic set up code, and it avoids hard-coding base addresses in the Device Tree. Not having hard-coded base addresses is very nice for users porting Linux to new boards using Marvell SoCs: they don't have to worry about deciding which base address to use. A given board may have a huge NOR, some other board may have gazillions of PCIe devices with large memory space requirements, some other board might have a FPGA requiring a large physical memory mapping. With your idea, the guy writing the board .dts file will have to think about what is the layout of the physical address space to find where to map his FPGA or its big NOR. With the really dynamic idea I'm proposing, the guy doesn't have to worry at all about the base address. I know you're proposing to make things dynamic by having the mbus driver patch the Device Tree once the driver figures out an available range to map a given device. But that doesn't make any sense: why write in the DT something that will anyway be overwritten later on? > > struct resource nor_resource; > > > > /* Get the size of the NOR from DT */ > > of_address_to_resource(np, 0, &nor_resource); > > > > /* Find an available physical range to map the NOR */ > > allocate_resource(&iomem_res, &nor_resource, > > resource_size(&nor_resource), > > 0, 0, 0, NULL, NULL); > > > > /* Create the address decoding window */ > > mvebu_mbus_add_window(nor, nor_resource.start, > > resource_size(&nor_resource)); > > > > Not only does this violate encapsulate principles, it breaks the DT > modeling because there is no place in the DT that has the base > address!! But why the *heck* do we care about having the base address in the DT. We don't care *at all*. The base address is *NOT* a description of the hardware, because it is completely configurable. Therefore, there is no need to have it in the Device Tree. In my opinion, the Device Tree should describe the informations that the software cannot "guess" by looking at the hardware. > The PCI system allocates and assigns BAR prior to probing the driver - > that is the Linux convention. The mbus driver needs to do the same > thing. The window must be setup prior to calling a child's probe, and > the DT must be correct at that time. > > > It really strikes me that in the name of the Device Tree, we would have > > to hardcode in stone things that are inherently dynamic. > > What do you mean? The suggestion is to describe all in-DT elements > with a ranges like this: > > > ranges =<MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 > > Which directly models the HW MBUS system. The target-id is set in > stone by the SOC, the window size is based on what is described in the > DT - so there is no need for it to be smaller/larger, and the base > address can be run-time altered to the true window base by the mbus > driver. > > There is nothing inherently dynamic that is set in stone. The base > address *is not set in stone* - it is just refactored out of the > individual stanzas and placed into a single common place that is easy > to find pragmatically. > > This is not about PCI-E - PCI-E continues to use the full dynamic > interface, nothing major changes there. I still continue to believe that it's a matter of taste, and where you put the boundary between what should be described in the Device Tree and what should not. > > Would it be possible to agree that this driver is a first step that > > consolidates the existing address decoding code, which doesn't prevent > > further improvements, and get the driver code merged in this current > > state (of course, after fixing all the bugs, issues, that will be > > discovered during review and testing) ? > > I continue to agree with this. Thanks. > However, I think Ezequiel's driver should be held up pending a > decision if mvebu_mbus_add_window is appropriate for it to call. It is appropriate for it to call, because it's the only and correct way to set up an address decoding window with the mvebu-mbus driver, just like it is the only and correct way that will be used by the PCIe driver. Adding description of windows in the Device Tree can be added later on, in a completely DT backward compatible way, and the mvebu-devbus driver can be adjusted accordingly. Thanks, Thomas
On Thu, Mar 07, 2013 at 08:11:33PM +0100, Thomas Petazzoni wrote: >> The reason for this is quite simply to avoid this ugly boiler plate in >> every mbus driver: > Come on "ugly" ? It's three line of basic set up code, and it avoids > hard-coding base addresses in the Device Tree. Not having hard-coded It is ugly because it doesn't belong in a device driver, it is bus specific setup code and leaking it out of the bus abstraction is bad. > With your idea, the guy writing the board .dts file will have to think > about what is the layout of the physical address space to find where > to map his FPGA or its big NOR. Bear in mind that an optimal allocator that doesn't leave any address space holes and considers the power-of-two size restriction is midly complex. Building an optimal allocator is actually impossible if each device driver does its own allocate_resources - you have to call allocate_resources in the right order to get an optimal placement (usually sorted from largest to smallest alignment) Address allocation/assignment is bus specific nonsense and does not belong in device drivers. > I know you're proposing to make things dynamic by having the mbus > driver patch the Device Tree once the driver figures out an available > range to map a given device. But that doesn't make any sense: why write > in the DT something that will anyway be overwritten later on? Simply because the DT is supposed to have the address map for the elements it describes. An OF node in the DT is expected to resolve back to the the correct CPU address, Stuff in the kernel depends on it, eg the sysfs names are constructed using the address map: $ ls -l /sys/bus/platform/devices/ lrwxrwxrwx 1 root root 0 Mar 7 19:43 f1010078.thermal -> ../../../devices/platform/internal.0/f1010078.thermal lrwxrwxrwx 1 root root 0 Mar 7 19:43 f1010100.gpio -> ../../../devices/platform/internal.0/f1010100.gpio lrwxrwxrwx 1 root root 0 Mar 7 19:43 f1010140.gpio -> ../../../devices/platform/internal.0/f1010140.gpio [..] So if you don't have correct address mapping through the DT then addressing used by the core OF code is wrong, and the mechanism it uses to avoid name conflicts is broken. Subtle Badness! The 'DT must describe the hardware' litany is 90% correct but misses that a critical function of DT is to describe the *address map*. So you need to leave a placeholder in the DT for the right value to get stuffed in. Ranges is a logical place to do that. You can't do it in the device driver because it that is too late. It must be done before calling of_populate... - ie by the mbus driver, and the mbus driver cannot be a simple-bus, like Arnd alluded to. As far as the placeholder goes, the driver could could ignore it completely, or try to use it as a default - falling back to dynamic, and maybe interpret 0/-1 as 'do dynamic always'. > Adding description of windows in the Device Tree can be added later on, > in a completely DT backward compatible way, and the mvebu-devbus driver > can be adjusted accordingly. Hmm.. You could make a new DT that is compatible with an older kernel, but once the mvebu_mbus_add_window is moved into the mbus driver it would require an updated DT. Jason
Dear Jason Gunthorpe, On Thu, 7 Mar 2013 12:55:41 -0700, Jason Gunthorpe wrote: > > Adding description of windows in the Device Tree can be added later on, > > in a completely DT backward compatible way, and the mvebu-devbus driver > > can be adjusted accordingly. > > Hmm.. You could make a new DT that is compatible with an older kernel, > but once the mvebu_mbus_add_window is moved into the mbus driver it > would require an updated DT. So what is your proposal that preserves non-DT compatibility? As I already said multiple times, we still have non-DT platforms around, and we can leave them on the side on this topic, so we have to have a solution that keeps non-DT compatibility for the moment. Thanks, Thomas
On Thu, Mar 07, 2013 at 09:28:54PM +0100, Thomas Petazzoni wrote: > Dear Jason Gunthorpe, > > On Thu, 7 Mar 2013 12:55:41 -0700, Jason Gunthorpe wrote: > > > > Adding description of windows in the Device Tree can be added later on, > > > in a completely DT backward compatible way, and the mvebu-devbus driver > > > can be adjusted accordingly. > > > > Hmm.. You could make a new DT that is compatible with an older kernel, > > but once the mvebu_mbus_add_window is moved into the mbus driver it > > would require an updated DT. > > So what is your proposal that preserves non-DT compatibility? As I > already said multiple times, we still have non-DT platforms around, and > we can leave them on the side on this topic, so we have to have a > solution that keeps non-DT compatibility for the moment. Well, I'd keep it basically the same as we have now.. Non-DT platforms call into the mbus driver via code to create windows before creating the platform device that will use the window. Use a fixed address, or dynamic allocation, it doesn't matter. Pass the resulting address in through the struct resource during platform device register. This would be done in the same place as today, in the non-DT board specific file. Critically, the window is setup and address selected prior to the platform device add and driver probe. In both the DT and non-DT case the struct resource associated with the platform device is correct. Something like this randomly choosen example. static struct platform_device kirkwood_nand_flash = { .name = "orion_nand", .id = -1, .dev = { .platform_data = &kirkwood_nand_data, }, .resource = &kirkwood_nand_resource, .num_resources = 1, }; void __init kirkwood_nand_init(struct mtd_partition *parts, int nr_parts, int chip_delay) { kirkwood_nand_data.parts = parts; kirkwood_nand_data.nr_parts = nr_parts; kirkwood_nand_data.chip_delay = chip_delay; + mbus_assign_resource_window(&kirkwood_nand_flash.resource, + MAPDEF(...)); platform_device_register(&kirkwood_nand_flash); } That should work for all the non-DT support cases in the kernel today? Regards, Jason
Hi Jason, On Wed, Mar 06, 2013 at 04:04:12PM -0700, Jason Gunthorpe wrote: > > nor@0 { > compatible = 'linux mtd nor driver'; > regs = <MAPDEF(1, 0xab, MAPDEF_NOMASK) 0x10000>; > } > > We can now tell directly that 'linux mtd nor driver' is behind MBUS > target 0xab, and the OF code already knows the base for this MBUS > target from the ranges property, as modified by the MBUS driver. The > MBUS driver also knows which targets to configure windows for > immediately at boot without having to consult internal tables. Plus we > don't need to modify the NOR driver to call out to MBUS window > functions to fetch an address. > > There are a few other variants on how to do this in DT, but they all > come down to modeling the MBUS target using a ranges property and > having the mbus driver adjust the range mapping at runtime. > At least in the flash (NOR and friends) case, this is not necessarily true. Now that the Device Bus driver is posted I think we have a nice example on how address decoding windows would be allocated through the mbus driver, *without* touching the child driver (i.e. NOR driver). The windows are allocated as part of the "preparing" tasks the Device Bus does as the parent device and it seems like a very clean solution. The windows are specifically related to him, so I think this driver is the most appropriate to perform the window management. In a certain way, this is similar to other memory-controller devices (e.g. OMAP's GPMC) that need to request a chip select before the child device can be accessed. Moreover -as I already stated in the patchset- I believe we can make the Device Bus implement dynamic window allocation, if he can't find a 'ranges' property for a given child. In conclusion: there is absolutely no need to change the NOR driver, and there is no need to explictly add an explicit description of the NOR window in the device tree. Thanks,
On Thu, Mar 07, 2013 at 07:20:05PM -0300, Ezequiel Garcia wrote: > Now that the Device Bus driver is posted I think we have a nice example > on how address decoding windows would be allocated through the mbus > driver, *without* touching the child driver (i.e. NOR driver). No, that isn't the case. Your device bus driver *is* the child driver, and you have touched it by adding mbus window code :) In fact, looking at what you've done - with the DTs you've proposed your driver does *nothing* but make a call to the mbus driver to setup the windows for its children. Without timing parameters from the DT the driver does nothing else! So what is the point of that? Get rid of your driver, have the mbus driver set up its own windows and put cfi-flash directly under mbus. Smaller, simpler, clearer, better. What you've done is *exactly* the sort of thing I was warning against :) BTW, I've made the same basic choice on our Kirkwood systems. The NAND timing pameters are set by the bootloader and the kernel doesn't touch them. The bootloader knows which flash chip is placed on the board so only it knows the correct timing parameters. There is not any reason to communicate them to the kernel. I do see a fine use for your driver for non-boot devices, but the window stuff needs to be removed so it isn't mandatory to use your driver in the NOP case. > In conclusion: there is absolutely no need to change the NOR driver, > and there is no need to explictly add an explicit description of the > NOR window in the device tree. As I've said, a DT that has child nodes without an associated address map is broken. Jason
Hi Jason, On Thu, Mar 07, 2013 at 04:05:16PM -0700, Jason Gunthorpe wrote: > On Thu, Mar 07, 2013 at 07:20:05PM -0300, Ezequiel Garcia wrote: > > Now that the Device Bus driver is posted I think we have a nice example > > on how address decoding windows would be allocated through the mbus > > driver, *without* touching the child driver (i.e. NOR driver). > > No, that isn't the case. Your device bus driver *is* the child driver, > and you have touched it by adding mbus window code :) > > In fact, looking at what you've done - with the DTs you've proposed > your driver does *nothing* but make a call to the mbus driver to setup > the windows for its children. Without timing parameters from the DT the > driver does nothing else! > Yes, setting aside the timings parameters, the Device Bus driver is only charge of allocating the windows; perhaps we can say nothing more, nothing less :) > > In conclusion: there is absolutely no need to change the NOR driver, > > and there is no need to explictly add an explicit description of the > > NOR window in the device tree. > > As I've said, a DT that has child nodes without an associated address > map is broken. > Mmm... could explain me better, why you think it's *broken*? Perhaps I'm slower than most people around here, but I still don't fully understand this brokenness. Thanks,
Dear Jason Gunthorpe, On Thu, 7 Mar 2013 16:05:16 -0700, Jason Gunthorpe wrote: > So what is the point of that? Get rid of your driver, have the mbus > driver set up its own windows and put cfi-flash directly under > mbus. Smaller, simpler, clearer, better. No, we will still need the Device Bus driver to set up the timings. > What you've done is *exactly* the sort of thing I was warning against > :) > > BTW, I've made the same basic choice on our Kirkwood systems. The NAND > timing pameters are set by the bootloader and the kernel doesn't touch > them. The bootloader knows which flash chip is placed on the board so > only it knows the correct timing parameters. There is not any reason > to communicate them to the kernel. That's _your_ choice, and it's entirely inconsistent with most kernel practices. For example, we have the pinctrl subsystem, which precisely aims at empowering the kernel in terms of pin muxing, while previously, it was a mix of custom kernel APIs, or reliance on the bootloader for having setting up the pin muxing. Sorry, but we _do_ want the kernel to be able to set those timings parameters, and therefore, a Device Bus driver will be needed, regardless of whether it creates the address window or not. > > In conclusion: there is absolutely no need to change the NOR driver, > > and there is no need to explictly add an explicit description of the > > NOR window in the device tree. > > As I've said, a DT that has child nodes without an associated address > map is broken. That's a statement without any arguments. How can we have a serious technical discussion if what you bring is just "this is broken", with no arguments? Best regards, Thomas
Dear Jason Gunthorpe, On Wed, 6 Mar 2013 16:04:12 -0700, Jason Gunthorpe wrote: > The discussion was around the SOC specific tables in the driver and if > they should be in DT or C code. These tables very much describe the > hardware, and are copied from similar tables in the Marvell manuals. > > For instance I repeated this idea: > > ranges =<MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom > MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot > MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs > [..] A few other issues with this: * It forces us to repeat the base address of the NOR twice in the Device Tree. Once in the ranges = <...> property of the MBus DT node, and once in the reg = <...> property of the NOR DT node. Duplication of identical information, isn't that something that sounds broken? * If we do dynamic allocation of the base address, it means that the DT is no longer an accurate representation of what happens for real. The user will write 0xDEADBEEF in the DT, and will get its NOR mapped at 0xBEEFDEAD. Isn't that uselessly confusing? Sorry, but your proposal still has many, many disadvantages over the currently proposed solution, and no compelling argument other than your own perception that the DT should describing the address mapping. With MBus, the DT cannot describe the address mapping because it is fundamentally dynamic. Why the heck would we describe NOR windows in the DT and not the PCIe ones? They are both dynamic in the exact same way. Best regards, Thomas
On Thursday 07 March 2013, Thomas Petazzoni wrote: > This doesn't work for PCIe interfaces. The PCIe windows cannot be > described in the DT, because they are inherently dynamic, and depend on > which PCIe devices are plugged into the PCIe slots, and how much I/O > and memory space each of those devices require. > > This means that regardless of what you think, some windows will have > to be created dynamically during the system initialization, and we will > not be able to describe all windows in the Device Tree. > > And to me, it seems completely stupid to have some windows defined > statically in the Device Tree, and some other windows set up > dynamically when the system initializes. You could have no "ranges" property in the boot time DT if you are looking for consistency here. Ideally, the ranges would contain exactly the setup of the MBUS windows that was configured with at boot time, and then get updated by the kernel when the windows are dynamically changed. An skipping the ranges at boot time would imply that none of the devices underneath MBUS are currently mapped to physical addresses. Putting some of the ranges is there is a way to handle two cases though: * Overriding the boot loader (or hardware) defaults, telling the kernel that we really want to set up the windows differently. In a perfect world, this would not be necessary, because the boot loader should be able to make sure the two always match. * Windows that we don't want to be set up at boot time yet because we know that we want dynamic allocation. > > At the stanza of MBUS driver's bus. Each line represents a MBUS target > > (this is a physical HW IP block). The MAPDEF is a SOC specific 'target > > id' that is currently living in tables in the driver. This value is > > directly compatible with the mbus window register and is defined by > > Marvell. The 2nd number is the CPU base address of that window, and > > the last is the size. > > Please explain how you handle PCIe windows. Any window that we don't want to hardcode, like the PCIe windows, can be absent from the "ranges", which in DT syntax means exactly the right thing: There are bus addresses on the child bus that are wired to one device but not currently mapped into the parent bus. This is exactly what we do for instance on a PCI bus ranges property that does not map its I/O space window into the parent bus as a linear map. This is the normal case on x86, but also done on a few other platforms. > > Per OF conventions the base address should be the value the bootloader > > left it at - but that is not important, the value could be 0. A > > dynamic MBUS driver would go through each item in the ranges, find > > an address range and program a window with the MAPDEF from the DT. It > > would then write the base address back into the DT (at runtime!) so > > that the entire DT remains conistent with the current state of the > > hardware. > > Why the heck would the kernel need to rewrite the DT ? > > Just like a driver does an ioremap() to create a virtual -> physical > mapping, the driver can just as well do mvebu_mbus_add_window() to > create a physical -> device mapping. It doesn't have to be hardcoded > into the Device Tree. The important difference is that the DT describes the physical address space as seen from the CPU bus interface, but does not care at all about the virtual addresses that the kernel puts into page tables. Describing the mapping of addresses from one bus address space to another, down to the CPU's view is the core of all DT bindings, and we should do that properly and consistently. If you change the mapping at run-time, updating the properties is the natural thing to do. Arnd
Dear Arnd Bergmann, On Fri, 8 Mar 2013 13:50:15 +0000, Arnd Bergmann wrote: > > And to me, it seems completely stupid to have some windows defined > > statically in the Device Tree, and some other windows set up > > dynamically when the system initializes. > > You could have no "ranges" property in the boot time DT if you are > looking for consistency here. > > Ideally, the ranges would contain exactly the setup of the MBUS windows > that was configured with at boot time, and then get updated by the > kernel when the windows are dynamically changed. An skipping the > ranges at boot time would imply that none of the devices underneath > MBUS are currently mapped to physical addresses. Putting some of > the ranges is there is a way to handle two cases though: > > * Overriding the boot loader (or hardware) defaults, telling the > kernel that we really want to set up the windows differently. In > a perfect world, this would not be necessary, because the boot > loader should be able to make sure the two always match. > * Windows that we don't want to be set up at boot time yet because > we know that we want dynamic allocation. When the kernel starts, it resets all MBus windows that could have potentially been configured by the bootloader. So we start from a clean world. So you're saying that the 'ranges' property should be empty, but the entire proposal from Jason Gunthorpe is centered around this 'ranges' property. Quoting Jason: // Names and numbers made up for illistration // The 2nd column is where to place the MBUS target in the CPU address map ranges < MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs [..] I must confess I'm lost between your opinion and Jason's opinion. > > > At the stanza of MBUS driver's bus. Each line represents a MBUS target > > > (this is a physical HW IP block). The MAPDEF is a SOC specific 'target > > > id' that is currently living in tables in the driver. This value is > > > directly compatible with the mbus window register and is defined by > > > Marvell. The 2nd number is the CPU base address of that window, and > > > the last is the size. > > > > Please explain how you handle PCIe windows. > > Any window that we don't want to hardcode, like the PCIe windows, can We don't want to hardcode any window, because it doesn't make sense. > be absent from the "ranges", which in DT syntax means exactly the right > thing: There are bus addresses on the child bus that are wired to one > device but not currently mapped into the parent bus. This is exactly > what we do for instance on a PCI bus ranges property that does not map > its I/O space window into the parent bus as a linear map. This is the > normal case on x86, but also done on a few other platforms. It is also the case for all other devices: when the kernel starts, a NOR is not mapped into the parent bus until we create the address decoding window for it. Absolutely no difference compared to PCIe. > > > Per OF conventions the base address should be the value the bootloader > > > left it at - but that is not important, the value could be 0. A > > > dynamic MBUS driver would go through each item in the ranges, find > > > an address range and program a window with the MAPDEF from the DT. It > > > would then write the base address back into the DT (at runtime!) so > > > that the entire DT remains conistent with the current state of the > > > hardware. > > > > Why the heck would the kernel need to rewrite the DT ? > > > > Just like a driver does an ioremap() to create a virtual -> physical > > mapping, the driver can just as well do mvebu_mbus_add_window() to > > create a physical -> device mapping. It doesn't have to be hardcoded > > into the Device Tree. > > The important difference is that the DT describes the physical address > space as seen from the CPU bus interface, but does not care at all about > the virtual addresses that the kernel puts into page tables. It doesn't describe the physical address of everything, because certain things are dynamic. In some systems, the only peripheral addresses that are dynamic are the PCI ones, and they are not hardcoded in the Device Tree. In the special case of Marvell systems, *many* peripherals have dynamic addresses. It's just a specificity of Marvell SoCs, but it is in the end very similar to the dynamic nature of PCI devices. Would it be possible to understand that this mechanism of address decoding windows is a bit special in Marvell SoCs, and that therefore the classical reasoning of what the DT encodes needs to be thought a little bit differently? > Describing the mapping of addresses from one bus address space to another, > down to the CPU's view is the core of all DT bindings, and we should do > that properly and consistently. If you change the mapping at run-time, > updating the properties is the natural thing to do. Again, you don't do that for PCI devices because their address is dynamic, so I don't see why we would have to do it for other devices whose address is also dynamic. Really, I've sent this PCIe driver first on December, 7th, and it was working. 99% percent of the problems have been around the Device Tree, and continue to be around this. Wasn't DT supposed to make things easier? I am really surprised by the amount of nitpicking that this driver receives, when I see how incoherent the pinctrl DT bindings for the various SoC are for example... At this point, I have absolutely no idea what direction to take to bring this further. I'm basically in a dead-end. Best regards, Thomas
On Friday 08 March 2013, Thomas Petazzoni wrote: > When the kernel starts, it resets all MBus windows that could have > potentially been configured by the bootloader. So we start from a clean > world. > > So you're saying that the 'ranges' property should be empty, but the > entire proposal from Jason Gunthorpe is centered around this 'ranges' > property. Quoting Jason: > > // Names and numbers made up for illistration > // The 2nd column is where to place the MBUS target in the CPU address map > ranges < MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom > MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot > MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs > [..] > > I must confess I'm lost between your opinion and Jason's opinion. There is not actually all that much disconnect. As a side note, an empty "ranges" property means an identity map, which would be wrong here, while an absent property means that there is currently no mapping. If you want the boot loader to tell the kernel to autoconfigure all the mappings, an absent ranges property would be the right thing. This makes a lot of sense if you want to go fully dynamic, but I thought that you were not attempting to go that far yet. I think we really want the ranges to be populated at run-time though, and have it match exactly the windows that we decide to put in, so that of_address_translate works. Even if you want to ignore all the mappings that the boot loader sets, there are two different values that would make sense to put into the ranges property anyway rather than leaving it out (as I mentioned earlier): a) Put the mapping in there that is configured by hardware or boot loader, to make any device work when you do not have a driver for MBUS. Any OS could then just translate the addresses from DT and get working devices, but an OS with an MBUS driver can choose to reconfigure the mappings and update the ranges property. This is similar to e.g. the baud rate is configured for a serial port: the firmware puts into DT what the UART is configured to at boot time, and we can use that to output data, or override it if we have a reason to want a different baud rate. We don't normally bother updating the DT properties if we do that, because nothing else depends on it. b) Put a useful default mapping in the DT, and ask Linux to set up the MBUS device so we don't need to duplicate the code in Linux and the boot loader. An OS can still ignore the values and override them with something else to get a fully dynamic mapping. This is similar e.g. to what we do for the ethernet MAC addresses: If the kernel wants to pick some value, it can find a reasonable one in the DT, which it will normally use, but it can just as well use something elese. I don't care much which of those options you choose (including leaving the ranges property out of the initial DT), but I strongly believe that the format of the in-memory ranges property should be what Jason Gunthorpe suggested or at least very similar. I certainly don't want to see the mapping hardwired in the driver. If you want the mapping to be completely boot time created, an empty ranges or the default from the boot loader makes the most sense. If you want some mappings to be fixed, I would prefer them to be in the DT rather than in the driver. > > > > At the stanza of MBUS driver's bus. Each line represents a MBUS target > > > > (this is a physical HW IP block). The MAPDEF is a SOC specific 'target > > > > id' that is currently living in tables in the driver. This value is > > > > directly compatible with the mbus window register and is defined by > > > > Marvell. The 2nd number is the CPU base address of that window, and > > > > the last is the size. > > > > > > Please explain how you handle PCIe windows. > > > > Any window that we don't want to hardcode, like the PCIe windows, can > > We don't want to hardcode any window, because it doesn't make sense. I thought you were doing exact that with a table like +static const struct mvebu_mbus_mapping armada_370_map[] = { + MAPDEF("bootrom", 1, 0xe0, MAPDEF_NOMASK), + MAPDEF("devbus-boot", 1, 0x2f, MAPDEF_NOMASK), + MAPDEF("devbus-cs0", 1, 0x3e, MAPDEF_NOMASK), + MAPDEF("devbus-cs1", 1, 0x3d, MAPDEF_NOMASK), + MAPDEF("devbus-cs2", 1, 0x3b, MAPDEF_NOMASK), + MAPDEF("devbus-cs3", 1, 0x37, MAPDEF_NOMASK), + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), + {}, +}; + > > be absent from the "ranges", which in DT syntax means exactly the right > > thing: There are bus addresses on the child bus that are wired to one > > device but not currently mapped into the parent bus. This is exactly > > what we do for instance on a PCI bus ranges property that does not map > > its I/O space window into the parent bus as a linear map. This is the > > normal case on x86, but also done on a few other platforms. > > It is also the case for all other devices: when the kernel starts, a > NOR is not mapped into the parent bus until we create the address > decoding window for it. Absolutely no difference compared to PCIe. Yes, as I said: if you don't want the mapping for the device to be fixed, it should not be in the ranges property at boot time, but only get added at the point where you pick a mapping. This may concern all devices or just a subset (PCIe, flash, ...). > > The important difference is that the DT describes the physical address > > space as seen from the CPU bus interface, but does not care at all about > > the virtual addresses that the kernel puts into page tables. > > It doesn't describe the physical address of everything, because certain > things are dynamic. In some systems, the only peripheral addresses that > are dynamic are the PCI ones, and they are not hardcoded in the Device > Tree. > > In the special case of Marvell systems, many peripherals have dynamic > addresses. It's just a specificity of Marvell SoCs, but it is in the > end very similar to the dynamic nature of PCI devices. > > Would it be possible to understand that this mechanism of address > decoding windows is a bit special in Marvell SoCs, and that therefore > the classical reasoning of what the DT encodes needs to be thought a > little bit differently? I don't think we should make a special case here when the common device tree format can perfectly describe what these chips are doing. > > Describing the mapping of addresses from one bus address space to another, > > down to the CPU's view is the core of all DT bindings, and we should do > > that properly and consistently. If you change the mapping at run-time, > > updating the properties is the natural thing to do. > > Again, you don't do that for PCI devices because their address is > dynamic, so I don't see why we would have to do it for other devices > whose address is also dynamic. For PCI devices we are often taking a shortcut with the flattened device tree format because we have to probe them all from the kernel anyway, in particular for hotplugged devices. In traditional OF based system, the firmware actually assigned addresses to all PCI devices and showed them in the device tree. We get away with leaving the PCI devices out of the DT because PCI is smart enough to assign the addresses dynamically. On the MBUS, this is not possible without knowing exactly what devices are behind it, and you can only describe those devices if you either list them in the source code (which we are trying to move away from with DT) or if you list them all in the DT, but that requires describing them in a format that has the sufficient information. Arnd
On Thu, 2013-03-07 at 16:05 -0700, Jason Gunthorpe wrote: > > BTW, I've made the same basic choice on our Kirkwood systems. The NAND > timing pameters are set by the bootloader and the kernel doesn't touch > them. The bootloader knows which flash chip is placed on the board so > only it knows the correct timing parameters. There is not any reason > to communicate them to the kernel. since you can probe the chip ONFI timings, why are you doing that ? on a general standpoint, why would you ever want to run with whatever the bootloader left in registers if you have full knowledge to reprogram them in the kernel ? from a testability point this is horrible, users get the classic answer to bug reports, "the problem must be coming from your bootloader, try upgrading it" the 'B' in BIOS stands for buggy
On Wed, Mar 06, 2013 at 11:35:13PM -0700, Jason Gunthorpe wrote: > On Thu, Mar 07, 2013 at 03:37:43AM +0000, Arnd Bergmann wrote: > > > Yes. We can even use the index of the "ranges" property to refer directly > > to the number of the window, so ranges replaces the kernel internal > > struct mvebu_mbus_mapping. > > I think we are thinking the same thing, but, s/index/bus address/ and > s/number of the window/target id/ ? > > > I would also suggest not making the mbus compatible with "simple_bus", but > > instead probing /only/ the mbus device at bootup, which can then set > > up the hardware based on the ranges property and then call > > of_platform_populate on its children. > > Yeah, something is needed to make startup ordering work. Things like > the timer and main interrupt controller live behind the special > internal regs window, so in a perfect world finding the timer would > automatically setup the enclosing bus structures ... > > > > I certainly agree with this. If it wasn't for the idea that the DT is > > > a stable ABI and thus must be perfect from the start, I wouldn't have > > > mentioned any of this right now, making fine tuning adjustments as a > > > follow on seems better to me. > > > > > > It is absolutely way too much work to try and fix everything in one > > > go! I would be happy to see your patch go in as-is, but mildly unhappy > > > to see us stuck with this DT layout forever... > > > > > > Especially since I really want to see your PCI stuff go in .... > > > > I basically agree with everything you write in this mail, Jason, and > > I was about to make exactly the same comments myself. > > > > Getting the binding right is certainly a priority here, but I also think > > that we don't need to rush the code to use that binding (although having > > the code work would make me more comfortable thinking that the binding > > actually works). > > Do you think Jason C's notion to take it as is and then be OK with > altering the binding later pending agreement is acceptable for now? I > agree with Thomas that this is too much to ask just to get > the already huge Marvell PCI-E patchset into mainline. Since these bindings are mvebu-specific, and since there are no DT based boards on the market for mvebu, I think this is an acceptable compromise. Thanks to JasonG and Arnd for clearing up my misunderstandings. The only thing I would really like to see at this point is a clear, thought-through migration path from Thomas' patches to JasonG's proposal. Obviously, it would be nice if the proposer took a crack at this. ;-) I'd really like to avoid a churn/spaghetti nightmare down the road if we can avoid it with a simple tweak now. thx, Jason.
On Fri, Mar 08, 2013 at 04:59:50PM +0100, Maxime Bizon wrote: > On Thu, 2013-03-07 at 16:05 -0700, Jason Gunthorpe wrote: > > > > BTW, I've made the same basic choice on our Kirkwood systems. The NAND > > timing pameters are set by the bootloader and the kernel doesn't touch > > them. The bootloader knows which flash chip is placed on the board so > > only it knows the correct timing parameters. There is not any reason > > to communicate them to the kernel. > > since you can probe the chip ONFI timings, why are you doing that ? Hmm, very interesting idea, but there is no implementation of this for the Marvell NAND driver today. Ezequiel - how would you forsee fitting something like this in? The process for NAND would be to set the slowest timings (which have to be computed based on the TClk frequency), do READID, determine the fastest supported timings, and then update the timings again. This seems like something that belongs in the NAND interface driver, not the device bus driver?? > on a general standpoint, why would you ever want to run with whatever > the bootloader left in registers if you have full knowledge to reprogram > them in the kernel ? Well, because the kernel doesn't support those registers. There are quite a few of them on our Kirkwood systems, including NAND timing. Jason
On Fri, Mar 08, 2013 at 09:48:32AM -0700, Jason Gunthorpe wrote: > > since you can probe the chip ONFI timings, why are you doing that ? > > Hmm, very interesting idea, but there is no implementation of this for > the Marvell NAND driver today. > > Ezequiel - how would you forsee fitting something like this in? The > process for NAND would be to set the slowest timings (which have to be > computed based on the TClk frequency), do READID, determine the > fastest supported timings, and then update the timings again. > > This seems like something that belongs in the NAND interface driver, > not the device bus driver?? > Mmm... adding NAND support is in my TODO list. But I'm not quite there yet, so I'm not sure what's the most appropriate way to handle this.
On Fri, Mar 08, 2013 at 09:10:52AM +0100, Thomas Petazzoni wrote: > Dear Jason Gunthorpe, > > On Thu, 7 Mar 2013 16:05:16 -0700, Jason Gunthorpe wrote: > > > So what is the point of that? Get rid of your driver, have the mbus > > driver set up its own windows and put cfi-flash directly under > > mbus. Smaller, simpler, clearer, better. > > No, we will still need the Device Bus driver to set up the timings. I looked through Ezequiel's patch and saw a driver that provided those properties and no user of them at all. Are the patches incomplete? Is there some plan to use these values in the future? > That's _your_ choice, and it's entirely inconsistent with most kernel > practices. For example, we have the pinctrl subsystem, which > precisely Again, Ezequiel's patches do exactly the same thing. pinctrl is a bit different, it depends on the PCB design which is fixed. NAND/NOR timings depend on the flash chip in use, which can vary from manufacturing run to manufacturing run. > Sorry, but we _do_ want the kernel to be able to set those timings > parameters, and therefore, a Device Bus driver will be needed, > regardless of whether it creates the address window or not. See the very good idea from Maxime Bizon about autoprobing the ONFI timings directly from the NAND chip. > > > In conclusion: there is absolutely no need to change the NOR driver, > > > and there is no need to explictly add an explicit description of the > > > NOR window in the device tree. > > > > As I've said, a DT that has child nodes without an associated address > > map is broken. > > That's a statement without any arguments. How can we have a serious > technical discussion if what you bring is just "this is broken", with no > arguments? I've brought many things up and you have ignored them all. Here is a list of a few: - You can't actually do optmial dynamic allocation a device at a time, that will make holes in the address space due to the alignment constraints. - Having an invalid address map in DT breaks the naming of platform devices and breaks the collision avoidance mechanism in the OF core - Breaking the address map means any downstream driver that tries to use standard OF addressing functions on its of node gets broken. - Breaking the address map means the standard OF platform device creation functions don't work because they can't create the resource ranges from OF. This means things like standard Linux resource accounting is broken. - Adding bus specific boiler plate to drivers is not the Linux convention, the bus abstraction is ment to do that. Arnd also mentioned: - Breaking the address map in DT breaks OS's without a mbus driver What more do you want to hear? Jason
On Fri, Mar 08, 2013 at 03:06:55PM +0100, Thomas Petazzoni wrote: > Really, I've sent this PCIe driver first on December, 7th, and it was > working. 99% percent of the problems have been around the Device Tree, > and continue to be around this. Wasn't DT supposed to make things > easier? I am really surprised by the amount of nitpicking that this > driver receives, when I see how incoherent the pinctrl DT bindings for > the various SoC are for example... Well, please consider that the push to make the DT a stable ABI means it should be nearly as hard to change as the userspace ABI. If you've watched the kernel you know how rough it is to do that. :( Further, the PCI DT binding needs to implement the documented OF specifications. It is not nitpicking to say that X does not conform to the OF document. pinctrl doesn't have an OF spec so people are doing SOC unique things. I think a reasonable take away from some of this is that it may be worthwhile to propose a DT binding before writing code for it, especially if it is a new or complex concept. > At this point, I have absolutely no idea what direction to take to > bring this further. I'm basically in a dead-end. I think Jason C. has said he is fine with bringing in the MBUS driver as is. The PCI-E driver's DT binding looks great to me, I have no further comments at all, and I believe everyone else's comments related to DT are delt with as well. The huge discussion about the SW PCI-E bridge seems to be concluded and everyone is agreeing on that point too. Soo, I think you are not at all at a dead end, but nearly finished :) Regards, Jason
On Fri, Mar 08, 2013 at 03:41:29PM +0000, Arnd Bergmann wrote: > If you want the boot loader to tell the kernel to autoconfigure all the > mappings, an absent ranges property would be the right thing. This makes > a lot of sense if you want to go fully dynamic, but I thought that you > were not attempting to go that far yet. > > I think we really want the ranges to be populated at run-time though, > and have it match exactly the windows that we decide to put in, so that > of_address_translate works. Right, so taking these two paragraphs together - if you have an absent ranges you are imagining that the mbus driver would write out a new fully populated ranges to the DTB? That would be necessary to keep of_address_translate working. If so we are talking about exactly the same thing - the in memory DTB would still have working address translation. An absent ranges makes some sense from a spec perspective - however do we have the kernel infrastructure to grow an in memory DTB node? Jason
Jason, On Fri, Mar 08, 2013 at 10:29:50AM -0700, Jason Gunthorpe wrote: > > I looked through Ezequiel's patch and saw a driver that provided those > properties and no user of them at all. Are the patches incomplete? Is > there some plan to use these values in the future? > Leaving aside the review comments that still I haven't addressed yet, the patchset is complete. The timings parameters were not set in device tree files, not because I expected the bootloader to set them, but instead because SoC default values worked fine for the NOR devices I tested. If this is suboptimal, I can fix the device tree files in v2 to set proper timings parameter values. Thanks,
On Fri, Mar 08, 2013 at 09:14:39AM +0100, Thomas Petazzoni wrote: > > The discussion was around the SOC specific tables in the driver and if > > they should be in DT or C code. These tables very much describe the > > hardware, and are copied from similar tables in the Marvell manuals. > > > > For instance I repeated this idea: > > > > ranges =<MAPDEF(1, 0xe0, MAPDEF_NOMASK) 0xFE000000 0x10000 // boot rom > > MAPDEF(1, 0x3f, MAPDEF_NOMASK) 0xFD000000 0x10000 // devbus-boot > > MAPDEF(1, 0xxx, MAPDEF_NOMASK) 0xFF000000 0x10000 // internal-regs > > [..] > > A few other issues with this: > > * It forces us to repeat the base address of the NOR twice in the > Device Tree. Once in the ranges = <...> property of the MBus DT node, > and once in the reg = <...> property of the NOR DT node. Duplication > of identical information, isn't that something that sounds broken? Uhm. That is how DT works. DT is not ment to be human readable, it is a machine parsible format. Look at a DT that comes out of a full featured firmware and there will be lots of duplicate information, especially in ranges. Downstream stanzas will *always* duplicate the information in upstream ranges. The format absolutely requires it. > * If we do dynamic allocation of the base address, it means that the > DT is no longer an accurate representation of what happens for real. > The user will write 0xDEADBEEF in the DT, and will get its NOR > mapped at 0xBEEFDEAD. Isn't that uselessly confusing? The *input* DTB may not match the *runtime* DTB, but the in-kernel runtime DTB can be accurate. > Sorry, but your proposal still has many, many disadvantages over the > currently proposed solution, and no compelling argument other than your > own perception that the DT should describing the address mapping. Hold on here. Your current solution fails on the significant point that the kernel hard codes CPU addresses and sets up windows on those addresses. The DT is then required to exactly match those hard coded addresess. You dismissed this complaint by saying that would go away with dynamic assignment - but, there isn't a complete proposal for this. For instance, take this from Ezequiel: >> If we want to upgrade this behavior and support dynamically >> allocated windows, we can simply fix the device bus driver to find a >> suitable address when there is no 'ranges' property for that chip >> select. Okay. So you drop the ranges property and dynamically assign. Go through these questions: - What will the source DTS look like for this case? - What happens in the kernel when the child cfi-nor device is created? - Will you rely on the usual OF populate code? - If no, you write a custom OF populate code path, is that good for maintainability? - If yes, how do you make of_translate_address return the correct value so the cfi-nor platform device's resources and name are correct? - Are you going to patch the child's 'reg' value in the dtb? - How is patching 'reg' in a child different from patching 'ranges' in the parent? Is this any less confusing? - What value do you put in the source DTS for the child regs? - What value do you put in the runtime DTB for the child's reg? - How do you deal with other 'ranges' translation in enclosing stanzas? - What if the child doesn't have a 'reg' but has a 'ranges', for some reason? Go through the same thinking process with my proposal. Now constrast the two. > With MBus, the DT cannot describe the address mapping because it is > fundamentally dynamic. Why the heck would we describe NOR windows in > the DT and not the PCIe ones? They are both dynamic in the exact same > way. They are not the same. The NOR flash has a DT stanza associated with it. The dynamic PCIe windows do not. The need to describe the NOR window int DT arises *directly* from the need to describe the NOR device in DT. Jason
On Fri, Mar 08, 2013 at 02:59:27PM -0300, Ezequiel Garcia wrote: > > I looked through Ezequiel's patch and saw a driver that provided those > > properties and no user of them at all. Are the patches incomplete? Is > > there some plan to use these values in the future? > Leaving aside the review comments that still I haven't addressed yet, > the patchset is complete. > The timings parameters were not set in device tree files, > not because I expected the bootloader to set them, but instead because > SoC default values worked fine for the NOR devices I tested. ? Isn't that the same thing? Were the timing registers the SOC reset default or were they leftover from the bootloader? > If this is suboptimal, I can fix the device tree files in v2 > to set proper timings parameter values. I have no opinion on this for your boards, it depends entirely on what flash chips you have to be compatible with, and how you feel about your bootloader. Also, when looking at the idea from Maxime it occured to me that your DT binding might be better using ns or ps for the timings, instead of tclk cycles? That way NAND/NOR datasheet values can be included in the DT directly and are correct no matter what the tclk frequency is set to. Cheers, Jason
On Fri, Mar 08, 2013 at 11:31:20AM -0700, Jason Gunthorpe wrote: > On Fri, Mar 08, 2013 at 02:59:27PM -0300, Ezequiel Garcia wrote: > > > > I looked through Ezequiel's patch and saw a driver that provided those > > > properties and no user of them at all. Are the patches incomplete? Is > > > there some plan to use these values in the future? > > > Leaving aside the review comments that still I haven't addressed yet, > > the patchset is complete. > > > The timings parameters were not set in device tree files, > > not because I expected the bootloader to set them, but instead because > > SoC default values worked fine for the NOR devices I tested. > > ? Isn't that the same thing? Were the timing registers the SOC reset > default or were they leftover from the bootloader? > IMHO, SoC reset default value is not the same as leftover from bootloader, right? But this doesn't mean the driver is not useful to set proper parameter values in case default/bootloader are wrong or suboptimal. > > If this is suboptimal, I can fix the device tree files in v2 > > to set proper timings parameter values. > > I have no opinion on this for your boards, it depends entirely on what > flash chips you have to be compatible with, and how you feel about > your bootloader. > > Also, when looking at the idea from Maxime it occured to me that your > DT binding might be better using ns or ps for the timings, instead of > tclk cycles? That way NAND/NOR datasheet values can be included in the > DT directly and are correct no matter what the tclk frequency is set to. > Mmm... that sounds like a better aproach. I'll think about that.
On Friday 08 March 2013, Jason Gunthorpe wrote: > Right, so taking these two paragraphs together - if you have an absent > ranges you are imagining that the mbus driver would write out a new > fully populated ranges to the DTB? That would be necessary to keep > of_address_translate working. > > If so we are talking about exactly the same thing - the in memory DTB > would still have working address translation. > > An absent ranges makes some sense from a spec perspective - however do we > have the kernel infrastructure to grow an in memory DTB node? I don't think it's necessary to amend the flattened DT in the DTB form, although that could be done if necessary. Once we have the live DT data structures, after calling unflatten_device_tree(), we can trivially call of_add_property() to add the ranges. I think the nicest boot sequence here would be to have an early driver bind of the MBUS, which assigns all the windows, adds the ranges property and then calls of_platform_populate to create all the child platform devices from the tree. If we need any of the MBUS children before initcalls, we can also move that stage earlier and set up the mbus windows from the machine_init callback. Arnd
On Fri, Mar 08, 2013 at 11:48:04AM -0500, Jason Cooper wrote: > Thanks to JasonG and Arnd for clearing up my misunderstandings. The > only thing I would really like to see at this point is a clear, > thought-through migration path from Thomas' patches to JasonG's > proposal. Obviously, it would be nice if the proposer took a crack at > this. ;-) The patch to add the top level ranges is a bit major because it restructures the DT to be organized by mbus target. It also would depend on Steven's preprocesor patchset (for our own sanity). To my mind this is best done after Thomas's work goes in, because it is a complex change to the DT and has dependencies. Eg kirkwood.dtsi would change to place, as best as possible, mbus targets directly below the mbus top level. Here is a sample that shows most of the main points: /* MAPDEF is a 2 dw value, top DW encodes the target id, bottom dword is usally 0. The special target id of '0' is no target/no window. */ #define MAPDEF(x,y,z) ((x << 16) | (y << 8) | z) 0 #define MAPDEF_INTERNAL MAPDEF(...) [..] #define PCI_MMIO_APERTURE 0xe0000000 #define PCI_IO_APERTURE 0xe8000000 mbus { ranges = <MAPDEF_INTERNAL 0xf1000000 0x100000 MAPDEF_NAND 0xf4000000 0x10000 MAPDEF_CRYPTO 0xf5000000 0x10000 /* These two are *not* mbus windows but are required to model the PCI aperture abstraction. Windows are allocated within these regions dynamically as neeed. */ 0 PCI_MMIO_APERTURE PCI_MMIO_APERTURE 0x08000000 0 PCI_IO_APERTURE PCI_MMIO_APERTURE 0x00100000> #address-cells = <2>; [..] // Internal regs special target internal_regs@f1000000 { compatible = "simple-bus"; ranges = <0x00000000 MAPDEF_INTERNAL 0x100000>; #address-cells = <1>; serial@12000 { compatible = "ns16550a"; reg = <0x12000 0x100>; reg-shift = <2>; interrupts = <33>; }; [..] } // All the PCI-E targets pcie-controller { compatible = "marvell,armada-370-pcie"; reg = <MAPDEF_INTERNAL + 0x40000 0x2000>, <MAPDEF_INTERNAL + 0x80000 0x2000>; ranges = <0x82000000 0 PCI_MMIO_APERTURE PCI_MMIO_APERTURE 0 0x08000000 /* non-prefetchable memory */ 0x81000000 0 0 PCI_IO_APERTURE 0 0x00100000>; /* downstream I/O */ [..] pcie@0,0 { reg = <0x0800 0 0 0 0>; marvell,mbus-targets = <MAPDEF_PCIE0_MEM MAPDEF_PCIE0_IO>; } } // Target id 1,0x2f nand@0xf4000000 { #address-cells = <1>; #size-cells = <1>; cle = <0>; ale = <1>; bank-width = <1>; compatible = "marvell,orion-nand"; reg = <MAPDEF_NAND 0x400>; }; // Target id 0x3,0x1 crypto@f5000000 { reg = <MAPDEF_INTERNAL + 0x30000 0x10000>, <MAPDEF_CRYPTO 0x800>; } } Which then results in mirroring changes to all the including dts: mbus { internal_regs@f1000000 { timer@20300 { clock-frequency = <166666667>; }; } A word on the PCI aperture: It is possible to directly show the mbus windows for PCI in the DT, and rely on runtime updating to keep everything in sync with the dynamic choices made by the PCI driver, but there is no point in doing all this work since we have no use for this information within the DT. A simple CPU address range 'aperture' to indicate where we want to place the PCI region is all that is needed. If the region is someday selected dynamically then the two ranges would be updated at runtime. I think I can prepare a patch for this if we reach agreement, but I can only test it on Kirkwood today.. Also, I'm guessing there is some opinions on what should be after the @.. ? From there.. Future patches would have the mbus driver read the top level ranges at boot and program those windows. Then we remove the code in the board specific files that does the same. For kirwood, move the PCI specific window setup code into the non-dt pcie driver, and then we are ready to use Thomas's DT driver on kirkwood. When the non-DT code is removed many of the the tables in the mbus driver can then be purged. Future patches could support fully dyanmic window address assignment, but I'm honestly having a hard time seeing a use case for this.. Jason
On Fri, Mar 08, 2013 at 07:42:27PM +0000, Arnd Bergmann wrote: > > An absent ranges makes some sense from a spec perspective - however do we > > have the kernel infrastructure to grow an in memory DTB node? > > I don't think it's necessary to amend the flattened DT in the DTB form, > although that could be done if necessary. Once we have the live DT data > structures, after calling unflatten_device_tree(), we can trivially call > of_add_property() to add the ranges. Oh, I've never looked at those APIs, that makes it pretty nice, indeed. With those it would be easy to keep the in-memory DT in sync. The only other problem with absent ranges is how do you learn the target id and size to do the assignment? > devices from the tree. If we need any of the MBUS children before > initcalls, we can also move that stage earlier and set up the mbus > windows from the machine_init callback. Yah, technically the interrupt controller and timer should be below the mbus block. Cheers, Jason
On Friday 08 March 2013, Jason Gunthorpe wrote: > > On Fri, Mar 08, 2013 at 07:42:27PM +0000, Arnd Bergmann wrote: > > > An absent ranges makes some sense from a spec perspective - however do we > > > have the kernel infrastructure to grow an in memory DTB node? > > > > I don't think it's necessary to amend the flattened DT in the DTB form, > > although that could be done if necessary. Once we have the live DT data > > structures, after calling unflatten_device_tree(), we can trivially call > > of_add_property() to add the ranges. > > Oh, I've never looked at those APIs, that makes it pretty nice, > indeed. With those it would be easy to keep the in-memory DT in sync. > > The only other problem with absent ranges is how do you learn the > target id and size to do the assignment? You just need to look at all "reg" properties of direct children that are not status="disabled". Those properties are already in the 64-bit address space co ntaining the target ID, and they have the size in them, at least for those where the size is known at boot time, probably all but PCIe, which can get the rest. The algorithm should probably be similar to what we do for PCI resource assignment: collect all resources we need, sort them by size, and fill the logical bus address space starting with the largest one so we don't leave holes. Arnd
On Friday 08 March 2013, Jason Gunthorpe wrote: > Future patches could support fully dyanmic window address assignment, > but I'm honestly having a hard time seeing a use case for this.. The main advantage I can see for going fully dynamic is that we can use a larger aperture and more windows for PCIe on systems that don't use all of the other windows from the ${SOC}.dtsi file. If we don't have a NAND chip connected, the ${BOARD}.dts file can easily mark the nand controller node as status="disabled" and we don't have to give it a physical address. It's not a strong reason, but I think it's nice and clean to assign the windows automatically and exactly for the hardware that we are going to use. Arnd
On Friday 08 March 2013, Jason Gunthorpe wrote: > > At this point, I have absolutely no idea what direction to take to > > bring this further. I'm basically in a dead-end. > > I think Jason C. has said he is fine with bringing in the MBUS driver as > is. > > The PCI-E driver's DT binding looks great to me, I have no further > comments at all, and I believe everyone else's comments related to DT > are delt with as well. > > The huge discussion about the SW PCI-E bridge seems to be concluded > and everyone is agreeing on that point too. > > Soo, I think you are not at all at a dead end, but nearly finished :) I agree, I think the has been going really well and we are just fighting over a few details here. I was giving you all a hard time about the PCIe driver, but that was actually less about the binding than about the question whether to emulate P2P bridges or not. I still don't think it's a good solution, but that is because of the hardware shortcomings and all the alternatives are worse. In case of the MBUS driver, I don't think it's a hard problem at all (compared to the complexity of the PCIe driver and binding), we just have to agree to do the right thing here. Arnd
Dear Jason Gunthorpe, On Fri, 8 Mar 2013 10:29:50 -0700, Jason Gunthorpe wrote: > > Sorry, but we _do_ want the kernel to be able to set those timings > > parameters, and therefore, a Device Bus driver will be needed, > > regardless of whether it creates the address window or not. > > See the very good idea from Maxime Bizon about autoprobing the ONFI > timings directly from the NAND chip. ONFI (Open NAND Firmware Interface) is about NAND, while the Device Bus driver submitted by Ezequiel is about NOR. I don't think NORs have a standard way of exposing their memory timings requirements. And the Device Bus driver is not only for NORs, but also for example for FPGAs connected on the memory bus. Ezequiel followed the work done by Jon Hunter on the GPMC driver for the memory bus of OMAP SoCs, which includes the definition of timings in the Device Tree. To me, it seems like defining the timings in the Device Tree do make sense, and "ONFI" is not a good enough answer to get rid of the Device Bus driver submitted by Ezequiel. Best regards, Thomas
On Mon, Mar 18, 2013 at 05:27:18PM +0100, Thomas Petazzoni wrote: > Dear Jason Gunthorpe, > > On Fri, 8 Mar 2013 10:29:50 -0700, Jason Gunthorpe wrote: > > > > Sorry, but we _do_ want the kernel to be able to set those timings > > > parameters, and therefore, a Device Bus driver will be needed, > > > regardless of whether it creates the address window or not. > > > > See the very good idea from Maxime Bizon about autoprobing the ONFI > > timings directly from the NAND chip. > > ONFI (Open NAND Firmware Interface) is about NAND, while the Device Bus > driver submitted by Ezequiel is about NOR. I don't think NORs have a > standard way of exposing their memory timings requirements. And the > Device Bus driver is not only for NORs, but also for example for FPGAs > connected on the memory bus. Right, AFAIK there is no CFI property for device timings, and there is a lot of variation of optimal timings between pin-compatible parts. Some kind of out-of-band indication of the flash chip and speed grade is necessary to get optimal timings, unfortunately - no idea how you'd fit that sensibly into a DT framework. > Ezequiel followed the work done by Jon Hunter on the GPMC driver for > the memory bus of OMAP SoCs, which includes the definition of timings > in the Device Tree. To me, it seems like defining the timings in the > Device Tree do make sense, and "ONFI" is not a good enough answer to > get rid of the Device Bus driver submitted by Ezequiel. As I said, I think that is fine - there is nothing wrong with a driver to set the bus timings. But, if there are no bus timings in the DT then the driver should not be involved at all.. Similarly the driver should require all the timings, re-using timings left in the register with timings from the DT seems very strange.. Regards, Jason
diff --git a/Documentation/devicetree/bindings/arm/mvebu-mbus.txt b/Documentation/devicetree/bindings/arm/mvebu-mbus.txt new file mode 100644 index 0000000..9f4220f --- /dev/null +++ b/Documentation/devicetree/bindings/arm/mvebu-mbus.txt @@ -0,0 +1,39 @@ +MVEBU MBUS driver +----------------- + +This is the Device Tree binding documentation for the mvebu-mbus +driver, that takes care of configuration the CPU-side address decoding +windows that allows to access the SDRAM and the devices. This driver +is used on all Marvell EBU SoCs (Armada 370/XP, Dove, Kirkwood, +Orion5x and MV78xx0). + +Required properties: + +- compatible: one of: + - marvell,armada370-mbus + - marvell,armadaxp-mbus + - marvell,kirkwood-mbus + - marvell,dove-mbus + - marvell,orion5x-88f5281-mbus + - marvell,orion5x-88f5182-mbus + - marvell,orion5x-88f5181-mbus + - marvell,orion5x-88f6183-mbus + - marvell,mv78xx0-mbus + +- reg: should contain two sets of register ranges. + - first range describing the registers used to configure the MBus + address decoding windows. + - second range describing the registers used to configure the SDRAM + address decoding windows. + +Example: + + soc@f1000000 { + compatible = "marvell,dove-mbus", "simple-bus"; + reg = <0xf10020000 0x80>, <0xf1800100 0x8>; + #address-cells = <1>; + #size-cells = <1>; + + [... devices on mbus ...] + + }; diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig index 0f51ed6..b05ecab 100644 --- a/drivers/bus/Kconfig +++ b/drivers/bus/Kconfig @@ -4,6 +4,13 @@ menu "Bus devices" +config MVEBU_MBUS + bool + depends on PLAT_ORION + help + Driver needed for the MBus configuration on Marvell EBU SoCs + (Kirkwood, Dove, Orion5x, MV78XX0 and Armada 370/XP). + config OMAP_OCP2SCP tristate "OMAP OCP2SCP DRIVER" depends on ARCH_OMAP2PLUS diff --git a/drivers/bus/Makefile b/drivers/bus/Makefile index 45d997c..3c7b53c 100644 --- a/drivers/bus/Makefile +++ b/drivers/bus/Makefile @@ -2,6 +2,7 @@ # Makefile for the bus drivers. # +obj-$(CONFIG_MVEBU_MBUS) += mvebu-mbus.o obj-$(CONFIG_OMAP_OCP2SCP) += omap-ocp2scp.o # Interconnect bus driver for OMAP SoCs. diff --git a/drivers/bus/mvebu-mbus.c b/drivers/bus/mvebu-mbus.c new file mode 100644 index 0000000..1c6fbf4 --- /dev/null +++ b/drivers/bus/mvebu-mbus.c @@ -0,0 +1,821 @@ +/* + * Address map functions for Marvell EBU SoCs (Kirkwood, Armada + * 370/XP, Dove, Orion5x and MV78xx0) + * + * This file is licensed under the terms of the GNU General Public + * License version 2. This program is licensed "as is" without any + * warranty of any kind, whether express or implied. + */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/init.h> +#include <linux/mbus.h> +#include <linux/io.h> +#include <linux/ioport.h> +#include <linux/of.h> +#include <linux/of_address.h> +#include <linux/debugfs.h> + +/* + * DDR target is the same on all platforms. + */ +#define TARGET_DDR 0 + +/* + * CPU Address Decode Windows registers + */ +#define WIN_CTRL_OFF 0x0000 +#define WIN_CTRL_ENABLE BIT(0) +#define WIN_CTRL_TGT_MASK 0xf0 +#define WIN_CTRL_TGT_SHIFT 4 +#define WIN_CTRL_ATTR_MASK 0xff00 +#define WIN_CTRL_ATTR_SHIFT 8 +#define WIN_CTRL_SIZE_MASK 0xffff0000 +#define WIN_CTRL_SIZE_SHIFT 16 +#define WIN_BASE_OFF 0x0004 +#define WIN_BASE_LOW 0xffff0000 +#define WIN_BASE_HIGH 0xf +#define WIN_REMAP_LO_OFF 0x0008 +#define WIN_REMAP_LOW 0xffff0000 +#define WIN_REMAP_HI_OFF 0x000c + +#define ATTR_HW_COHERENCY (0x1 << 4) + +#define DDR_BASE_CS_OFF(n) (0x0000 + ((n) << 3)) +#define DDR_BASE_CS_HIGH_MASK 0xf +#define DDR_BASE_CS_LOW_MASK 0xff000000 +#define DDR_SIZE_CS_OFF(n) (0x0004 + ((n) << 3)) +#define DDR_SIZE_ENABLED BIT(0) +#define DDR_SIZE_CS_MASK 0x1c +#define DDR_SIZE_CS_SHIFT 2 +#define DDR_SIZE_MASK 0xff000000 + +#define DOVE_DDR_BASE_CS_OFF(n) ((n) << 4) + +struct mvebu_mbus_mapping { + const char *name; + u8 target; + u8 attr; + u8 attrmask; +}; + +/* + * Masks used for the 'attrmask' field of mvebu_mbus_mapping. They + * allow to get the real attribute value, discarding the special bits + * used to select a PCI MEM region or a PCI WA region. This allows the + * debugfs code to reverse-match the name of a device from its + * target/attr values. + * + * For all devices except PCI, all bits of 'attr' must be + * considered. For most SoCs, only bit 3 should be ignored (it allows + * to select between PCI MEM and PCI I/O). On Orion5x however, there + * is the special bit 5 to select a PCI WA region. + */ +#define MAPDEF_NOMASK 0xff +#define MAPDEF_PCIMASK 0xf7 +#define MAPDEF_ORIONPCIMASK 0xd7 + +/* Macro used to define one mvebu_mbus_mapping entry */ +#define MAPDEF(__n, __t, __a, __m) \ + { .name = __n, .target = __t, .attr = __a, .attrmask = __m } + +struct mvebu_mbus_state; + +struct mvebu_mbus_soc_data { + unsigned int num_wins; + unsigned int num_remappable_wins; + unsigned int (*win_cfg_offset)(const int win); + void (*setup_cpu_target)(struct mvebu_mbus_state *s); + const struct mvebu_mbus_mapping *map; +}; + +struct mvebu_mbus_state { + void __iomem *mbuswins_base; + void __iomem *sdramwins_base; + struct dentry *debugfs_root; + struct dentry *debugfs_sdram; + struct dentry *debugfs_devs; + const struct mvebu_mbus_soc_data *soc; + int hw_io_coherency; +}; + +static struct mvebu_mbus_state mbus_state; + +static struct mbus_dram_target_info mvebu_mbus_dram_info; +const struct mbus_dram_target_info *mv_mbus_dram_info(void) +{ + return &mvebu_mbus_dram_info; +} +EXPORT_SYMBOL_GPL(mv_mbus_dram_info); + +/* + * Functions to manipulate the address decoding windows + */ + +static void mvebu_mbus_read_window(struct mvebu_mbus_state *mbus, + int win, int *enabled, u64 *base, + u32 *size, u8 *target, u8 *attr, + u64 *remap) +{ + void __iomem *addr = mbus->mbuswins_base + + mbus->soc->win_cfg_offset(win); + u32 basereg = readl(addr + WIN_BASE_OFF); + u32 ctrlreg = readl(addr + WIN_CTRL_OFF); + + if (!(ctrlreg & WIN_CTRL_ENABLE)) { + *enabled = 0; + return; + } + + *enabled = 1; + *base = ((u64)basereg & WIN_BASE_HIGH) << 32; + *base |= (basereg & WIN_BASE_LOW); + *size = (ctrlreg | ~WIN_CTRL_SIZE_MASK) + 1; + + if (target) + *target = (ctrlreg & WIN_CTRL_TGT_MASK) >> WIN_CTRL_TGT_SHIFT; + + if (attr) + *attr = (ctrlreg & WIN_CTRL_ATTR_MASK) >> WIN_CTRL_ATTR_SHIFT; + + if (remap) { + if (win < mbus->soc->num_remappable_wins) { + u32 remap_low = readl(addr + WIN_REMAP_LO_OFF); + u32 remap_hi = readl(addr + WIN_REMAP_HI_OFF); + *remap = ((u64)remap_hi << 32) | remap_low; + } else + *remap = 0; + } +} + +static void mvebu_mbus_disable_window(struct mvebu_mbus_state *mbus, + int win) +{ + void __iomem *addr; + + addr = mbus->mbuswins_base + mbus->soc->win_cfg_offset(win); + + writel(0, addr + WIN_BASE_OFF); + writel(0, addr + WIN_CTRL_OFF); + if (win < mbus->soc->num_remappable_wins) { + writel(0, addr + WIN_REMAP_LO_OFF); + writel(0, addr + WIN_REMAP_HI_OFF); + } +} + +/* Checks whether the given window number is available */ +static int mvebu_mbus_window_is_free(struct mvebu_mbus_state *mbus, + const int win) +{ + void __iomem *addr = mbus->mbuswins_base + + mbus->soc->win_cfg_offset(win); + u32 ctrl = readl(addr + WIN_CTRL_OFF); + return !(ctrl & WIN_CTRL_ENABLE); +} + +/* + * Checks whether the given (base, base+size) area doesn't overlap an + * existing region + */ +static int mvebu_mbus_window_conflicts(struct mvebu_mbus_state *mbus, + phys_addr_t base, size_t size, + u8 target, u8 attr) +{ + u64 end = (u64)base + size; + int win; + + for (win = 0; win < mbus->soc->num_wins; win++) { + u64 wbase, wend; + u32 wsize; + u8 wtarget, wattr; + int enabled; + + mvebu_mbus_read_window(mbus, win, + &enabled, &wbase, &wsize, + &wtarget, &wattr, NULL); + + if (!enabled) + continue; + + wend = wbase + wsize; + + /* + * Check if the current window overlaps with the + * proposed physical range + */ + if ((u64)base < wend && end > wbase) + return 0; + + /* + * Check if target/attribute conflicts + */ + if (target == wtarget && attr == wattr) + return 0; + } + + return 1; +} + +static int mvebu_mbus_find_window(struct mvebu_mbus_state *mbus, + phys_addr_t base, size_t size) +{ + int win; + + for (win = 0; win < mbus->soc->num_wins; win++) { + u64 wbase; + u32 wsize; + int enabled; + + mvebu_mbus_read_window(mbus, win, + &enabled, &wbase, &wsize, + NULL, NULL, NULL); + + if (!enabled) + continue; + + if (base == wbase && size == wsize) + return win; + } + + return -ENODEV; +} + +static int mvebu_mbus_setup_window(struct mvebu_mbus_state *mbus, + int win, phys_addr_t base, size_t size, + phys_addr_t remap, u8 target, + u8 attr) +{ + void __iomem *addr = mbus->mbuswins_base + + mbus->soc->win_cfg_offset(win); + u32 ctrl, remap_addr; + + ctrl = ((size - 1) & WIN_CTRL_SIZE_MASK) | + (attr << WIN_CTRL_ATTR_SHIFT) | + (target << WIN_CTRL_TGT_SHIFT) | + WIN_CTRL_ENABLE; + + writel(base & WIN_BASE_LOW, addr + WIN_BASE_OFF); + writel(ctrl, addr + WIN_CTRL_OFF); + if (win < mbus->soc->num_remappable_wins) { + if (remap == MVEBU_MBUS_NO_REMAP) + remap_addr = base; + else + remap_addr = remap; + writel(remap_addr & WIN_REMAP_LOW, addr + WIN_REMAP_LO_OFF); + writel(0, addr + WIN_REMAP_HI_OFF); + } + + return 0; +} + +static int mvebu_mbus_alloc_window(struct mvebu_mbus_state *mbus, + phys_addr_t base, size_t size, + phys_addr_t remap, u8 target, + u8 attr) +{ + int win; + + if (remap == MVEBU_MBUS_NO_REMAP) { + for (win = mbus->soc->num_remappable_wins; + win < mbus->soc->num_wins; win++) + if (mvebu_mbus_window_is_free(mbus, win)) + return mvebu_mbus_setup_window(mbus, win, base, + size, remap, + target, attr); + } + + + for (win = 0; win < mbus->soc->num_wins; win++) + if (mvebu_mbus_window_is_free(mbus, win)) + return mvebu_mbus_setup_window(mbus, win, base, size, + remap, target, attr); + + return -ENOMEM; +} + +/* + * Debugfs debugging + */ + +static int sdram_debug_show(struct seq_file *seq, void *v) +{ + struct mvebu_mbus_state *mbus = &mbus_state; + int i; + + for (i = 0; i < 4; i++) { + u32 basereg = readl(mbus->sdramwins_base + DDR_BASE_CS_OFF(i)); + u32 sizereg = readl(mbus->sdramwins_base + DDR_SIZE_CS_OFF(i)); + u64 base; + u32 size; + + if (!(sizereg & DDR_SIZE_ENABLED)) { + seq_printf(seq, "[%d] disabled\n", i); + continue; + } + + base = ((u64)basereg & DDR_BASE_CS_HIGH_MASK) << 32; + base |= basereg & DDR_BASE_CS_LOW_MASK; + size = sizereg & DDR_SIZE_MASK; + + seq_printf(seq, "[%d] %016llx - 0x%016llx : cs%d\n", + i, (unsigned long long)base, + (unsigned long long)base + size, + (size & DDR_SIZE_CS_MASK) >> DDR_SIZE_CS_SHIFT); + } + + return 0; +} + +static int sdram_debug_open(struct inode *inode, struct file *file) +{ + return single_open(file, sdram_debug_show, inode->i_private); +} + +static const struct file_operations sdram_debug_fops = { + .open = sdram_debug_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int devs_debug_show(struct seq_file *seq, void *v) +{ + struct mvebu_mbus_state *mbus = &mbus_state; + int win; + + for (win = 0; win < mbus->soc->num_wins; win++) { + u64 wbase, wremap; + u32 wsize; + u8 wtarget, wattr; + int enabled, i; + const char *name; + + mvebu_mbus_read_window(mbus, win, + &enabled, &wbase, &wsize, + &wtarget, &wattr, &wremap); + + if (!enabled) { + seq_printf(seq, "[%02d] disabled\n", win); + continue; + } + + + for (i = 0; mbus->soc->map[i].name; i++) + if (mbus->soc->map[i].target == wtarget && + mbus->soc->map[i].attr == + (wattr & mbus->soc->map[i].attrmask)) + break; + + name = mbus->soc->map[i].name ?: "unknown"; + + seq_printf(seq, "[%02d] %016llx - %016llx : %s", + win, (unsigned long long)wbase, + (unsigned long long)(wbase + wsize), name); + + if (win < mbus->soc->num_remappable_wins) { + seq_printf(seq, " (remap %016llx)\n", + (unsigned long long)wremap); + } else + seq_printf(seq, "\n"); + } + + return 0; +} + +static int devs_debug_open(struct inode *inode, struct file *file) +{ + return single_open(file, devs_debug_show, inode->i_private); +} + +static const struct file_operations devs_debug_fops = { + .open = devs_debug_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +/* + * SoC-specific functions and definitions + */ + +static unsigned int orion_mbus_win_offset(int win) +{ + return win << 4; +} + +static unsigned int armada_370_xp_mbus_win_offset(int win) +{ + /* The register layout is a bit annoying and the below code + * tries to cope with it. + * - At offset 0x0, there are the registers for the first 8 + * windows, with 4 registers of 32 bits per window (ctrl, + * base, remap low, remap high) + * - Then at offset 0x80, there is a hole of 0x10 bytes for + * the internal registers base address and internal units + * sync barrier register. + * - Then at offset 0x90, there the registers for 12 + * windows, with only 2 registers of 32 bits per window + * (ctrl, base). + */ + if (win < 8) + return win << 4; + else + return 0x90 + ((win - 8) << 3); +} + +static unsigned int mv78xx0_mbus_win_offset(int win) +{ + if (win < 8) + return win << 4; + else + return 0x900 + ((win - 8) << 4); +} + +static void __init +mvebu_mbus_default_setup_cpu_target(struct mvebu_mbus_state *mbus) +{ + int i; + int cs; + + mvebu_mbus_dram_info.mbus_dram_target_id = TARGET_DDR; + + for (i = 0, cs = 0; i < 4; i++) { + u32 base = readl(mbus->sdramwins_base + DDR_BASE_CS_OFF(i)); + u32 size = readl(mbus->sdramwins_base + DDR_SIZE_CS_OFF(i)); + + /* + * We only take care of entries for which the chip + * select is enabled, and that don't have high base + * address bits set (devices can only access the first + * 32 bits of the memory). + */ + if ((size & DDR_SIZE_ENABLED) && + !(base & DDR_BASE_CS_HIGH_MASK)) { + struct mbus_dram_window *w; + + w = &mvebu_mbus_dram_info.cs[cs++]; + w->cs_index = i; + w->mbus_attr = 0xf & ~(1 << i); + if (mbus->hw_io_coherency) + w->mbus_attr |= ATTR_HW_COHERENCY; + w->base = base & DDR_BASE_CS_LOW_MASK; + w->size = (size | ~DDR_SIZE_MASK) + 1; + } + } + mvebu_mbus_dram_info.num_cs = cs; +} + +static void __init +mvebu_mbus_dove_setup_cpu_target(struct mvebu_mbus_state *mbus) +{ + int i; + int cs; + + mvebu_mbus_dram_info.mbus_dram_target_id = TARGET_DDR; + + for (i = 0, cs = 0; i < 2; i++) { + u32 map = readl(mbus->sdramwins_base + DOVE_DDR_BASE_CS_OFF(i)); + + /* + * Chip select enabled? + */ + if (map & 1) { + struct mbus_dram_window *w; + + w = &mvebu_mbus_dram_info.cs[cs++]; + w->cs_index = i; + w->mbus_attr = 0; /* CS address decoding done inside */ + /* the DDR controller, no need to */ + /* provide attributes */ + w->base = map & 0xff800000; + w->size = 0x100000 << (((map & 0x000f0000) >> 16) - 4); + } + } + + mvebu_mbus_dram_info.num_cs = cs; +} + +static const struct mvebu_mbus_mapping armada_370_map[] = { + MAPDEF("bootrom", 1, 0xe0, MAPDEF_NOMASK), + MAPDEF("devbus-boot", 1, 0x2f, MAPDEF_NOMASK), + MAPDEF("devbus-cs0", 1, 0x3e, MAPDEF_NOMASK), + MAPDEF("devbus-cs1", 1, 0x3d, MAPDEF_NOMASK), + MAPDEF("devbus-cs2", 1, 0x3b, MAPDEF_NOMASK), + MAPDEF("devbus-cs3", 1, 0x37, MAPDEF_NOMASK), + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), + {}, +}; + +static const struct mvebu_mbus_soc_data armada_370_mbus_data = { + .num_wins = 20, + .num_remappable_wins = 8, + .win_cfg_offset = armada_370_xp_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_default_setup_cpu_target, + .map = armada_370_map, +}; + +static const struct mvebu_mbus_mapping armada_xp_map[] = { + MAPDEF("bootrom", 1, 0x1d, MAPDEF_NOMASK), + MAPDEF("devbus-boot", 1, 0x2f, MAPDEF_NOMASK), + MAPDEF("devbus-cs0", 1, 0x3e, MAPDEF_NOMASK), + MAPDEF("devbus-cs1", 1, 0x3d, MAPDEF_NOMASK), + MAPDEF("devbus-cs2", 1, 0x3b, MAPDEF_NOMASK), + MAPDEF("devbus-cs3", 1, 0x37, MAPDEF_NOMASK), + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie0.1", 4, 0xd0, MAPDEF_PCIMASK), + MAPDEF("pcie0.2", 4, 0xb0, MAPDEF_PCIMASK), + MAPDEF("pcie0.3", 4, 0x70, MAPDEF_PCIMASK), + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie1.1", 8, 0xd0, MAPDEF_PCIMASK), + MAPDEF("pcie1.2", 8, 0xb0, MAPDEF_PCIMASK), + MAPDEF("pcie1.3", 8, 0x70, MAPDEF_PCIMASK), + MAPDEF("pcie2.0", 4, 0xf0, MAPDEF_PCIMASK), + MAPDEF("pcie3.0", 8, 0xf0, MAPDEF_PCIMASK), + {}, +}; + +static const struct mvebu_mbus_soc_data armada_xp_mbus_data = { + .num_wins = 20, + .num_remappable_wins = 8, + .win_cfg_offset = armada_370_xp_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_default_setup_cpu_target, + .map = armada_xp_map, +}; + +static const struct mvebu_mbus_mapping kirkwood_map[] = { + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), + MAPDEF("sram", 3, 0x01, MAPDEF_NOMASK), + MAPDEF("nand", 1, 0x2f, MAPDEF_NOMASK), + {}, +}; + +static const struct mvebu_mbus_soc_data kirkwood_mbus_data = { + .num_wins = 8, + .num_remappable_wins = 4, + .win_cfg_offset = orion_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_default_setup_cpu_target, + .map = kirkwood_map, +}; + +static const struct mvebu_mbus_mapping dove_map[] = { + MAPDEF("pcie0.0", 0x4, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie1.0", 0x8, 0xe0, MAPDEF_PCIMASK), + MAPDEF("cesa", 0x3, 0x01, MAPDEF_NOMASK), + MAPDEF("bootrom", 0x1, 0xfd, MAPDEF_NOMASK), + MAPDEF("scratchpad", 0xd, 0x0, MAPDEF_NOMASK), + {}, +}; + +static const struct mvebu_mbus_soc_data dove_mbus_data = { + .num_wins = 8, + .num_remappable_wins = 4, + .win_cfg_offset = orion_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_dove_setup_cpu_target, + .map = dove_map, +}; + +static const struct mvebu_mbus_mapping orion5x_map[] = { + MAPDEF("pcie0.0", 4, 0x51, MAPDEF_ORIONPCIMASK), + MAPDEF("pci0.0", 3, 0x51, MAPDEF_ORIONPCIMASK), + MAPDEF("devbus-boot", 1, 0x0f, MAPDEF_NOMASK), + MAPDEF("devbus-cs0", 1, 0x1e, MAPDEF_NOMASK), + MAPDEF("devbus-cs1", 1, 0x1d, MAPDEF_NOMASK), + MAPDEF("devbus-cs2", 1, 0x1b, MAPDEF_NOMASK), + MAPDEF("sram", 0, 0x00, MAPDEF_NOMASK), + {}, +}; + +/* + * Some variants of Orion5x have 4 remappable windows, some other have + * only two of them. + */ +static const struct mvebu_mbus_soc_data orion5x_4win_mbus_data = { + .num_wins = 8, + .num_remappable_wins = 4, + .win_cfg_offset = orion_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_default_setup_cpu_target, + .map = orion5x_map, +}; + +static const struct mvebu_mbus_soc_data orion5x_2win_mbus_data = { + .num_wins = 8, + .num_remappable_wins = 2, + .win_cfg_offset = orion_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_default_setup_cpu_target, + .map = orion5x_map, +}; + +static const struct mvebu_mbus_mapping mv78xx0_map[] = { + MAPDEF("pcie0.0", 4, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie0.1", 4, 0xd0, MAPDEF_PCIMASK), + MAPDEF("pcie0.2", 4, 0xb0, MAPDEF_PCIMASK), + MAPDEF("pcie0.3", 4, 0x70, MAPDEF_PCIMASK), + MAPDEF("pcie1.0", 8, 0xe0, MAPDEF_PCIMASK), + MAPDEF("pcie1.1", 8, 0xd0, MAPDEF_PCIMASK), + MAPDEF("pcie1.2", 8, 0xb0, MAPDEF_PCIMASK), + MAPDEF("pcie1.3", 8, 0x70, MAPDEF_PCIMASK), + MAPDEF("pcie2.0", 4, 0xf0, MAPDEF_PCIMASK), + MAPDEF("pcie3.0", 8, 0xf0, MAPDEF_PCIMASK), + {}, +}; + +static const struct mvebu_mbus_soc_data mv78xx0_mbus_data = { + .num_wins = 14, + .num_remappable_wins = 8, + .win_cfg_offset = mv78xx0_mbus_win_offset, + .setup_cpu_target = mvebu_mbus_default_setup_cpu_target, + .map = mv78xx0_map, +}; + +static const struct of_device_id of_mvebu_mbus_ids[] = { + { .compatible = "marvell,armada370-mbus", + .data = &armada_370_mbus_data, }, + { .compatible = "marvell,armadaxp-mbus", + .data = &armada_xp_mbus_data, }, + { .compatible = "marvell,kirkwood-mbus", + .data = &kirkwood_mbus_data, }, + { .compatible = "marvell,dove-mbus", + .data = &dove_mbus_data, }, + { .compatible = "marvell,orion5x-88f5281-mbus", + .data = &orion5x_4win_mbus_data, }, + { .compatible = "marvell,orion5x-88f5182-mbus", + .data = &orion5x_2win_mbus_data, }, + { .compatible = "marvell,orion5x-88f5181-mbus", + .data = &orion5x_2win_mbus_data, }, + { .compatible = "marvell,orion5x-88f6183-mbus", + .data = &orion5x_4win_mbus_data, }, + { .compatible = "marvell,mv78xx0-mbus", + .data = &mv78xx0_mbus_data, }, + { }, +}; + +/* + * Public API of the driver + */ +int mvebu_mbus_add_window_remap_flags(const char *devname, phys_addr_t base, + size_t size, phys_addr_t remap, + unsigned int flags) +{ + struct mvebu_mbus_state *s = &mbus_state; + u8 target, attr; + int i; + + if (!s->soc->map) + return -ENODEV; + + for (i = 0; s->soc->map[i].name; i++) + if (!strcmp(s->soc->map[i].name, devname)) + break; + + if (!s->soc->map[i].name) { + pr_err("mvebu-mbus: unknown device '%s'\n", devname); + return -ENODEV; + } + + target = s->soc->map[i].target; + attr = s->soc->map[i].attr; + + if (flags == MVEBU_MBUS_PCI_MEM) + attr |= 0x8; + else if (flags == MVEBU_MBUS_PCI_WA) + attr |= 0x28; + + if (!mvebu_mbus_window_conflicts(s, base, size, target, attr)) { + pr_err("mvebu-mbus: cannot add window '%s', conflicts with another window\n", + devname); + return -EINVAL; + } + + return mvebu_mbus_alloc_window(s, base, size, remap, target, attr); + +} + +int mvebu_mbus_add_window(const char *devname, phys_addr_t base, size_t size) +{ + return mvebu_mbus_add_window_remap_flags(devname, base, size, + MVEBU_MBUS_NO_REMAP, 0); +} + +int mvebu_mbus_del_window(phys_addr_t base, size_t size) +{ + int win; + + win = mvebu_mbus_find_window(&mbus_state, base, size); + if (win < 0) + return win; + + mvebu_mbus_disable_window(&mbus_state, win); + return 0; +} + +static __init int mvebu_mbus_debugfs_init(void) +{ + struct mvebu_mbus_state *s = &mbus_state; + + /* + * If no base has been initialized, doesn't make sense to + * register the debugfs entries. We may be on a multiplatform + * kernel that isn't running a Marvell EBU SoC. + */ + if (!s->mbuswins_base) + return 0; + + s->debugfs_root = debugfs_create_dir("mvebu-mbus", NULL); + if (s->debugfs_root) { + s->debugfs_sdram = debugfs_create_file("sdram", S_IRUGO, + s->debugfs_root, NULL, + &sdram_debug_fops); + s->debugfs_devs = debugfs_create_file("devices", S_IRUGO, + s->debugfs_root, NULL, + &devs_debug_fops); + } + + return 0; +} +fs_initcall(mvebu_mbus_debugfs_init); + +static int __init mvebu_mbus_common_init(struct mvebu_mbus_state *mbus, + phys_addr_t mbuswins_base, + size_t mbuswins_size, + phys_addr_t sdramwins_base, + size_t sdramwins_size) +{ + int win; + + mbus->mbuswins_base = ioremap(mbuswins_base, mbuswins_size); + if (!mbus->mbuswins_base) + return -ENOMEM; + + mbus->sdramwins_base = ioremap(sdramwins_base, sdramwins_size); + if (!mbus->sdramwins_base) { + iounmap(mbus_state.mbuswins_base); + return -ENOMEM; + } + + for (win = 0; win < mbus->soc->num_wins; win++) + mvebu_mbus_disable_window(mbus, win); + + mbus_state.soc->setup_cpu_target(mbus); + + return 0; +} + +int __init mvebu_mbus_init(const char *soc, phys_addr_t mbuswins_phys_base, + size_t mbuswins_size, + phys_addr_t sdramwins_phys_base, + size_t sdramwins_size) +{ + const struct of_device_id *of_id; + + for (of_id = of_mvebu_mbus_ids; of_id->compatible; of_id++) + if (!strcmp(of_id->compatible, soc)) + break; + + if (!of_id->compatible) { + pr_err("mvebu-mbus: could not find a matching SoC family\n"); + return -ENODEV; + } + + mbus_state.soc = of_id->data; + + return mvebu_mbus_common_init(&mbus_state, + mbuswins_phys_base, mbuswins_size, + sdramwins_phys_base, sdramwins_size); +} + +#ifdef CONFIG_OF +int __init mvebu_mbus_dt_init(void) +{ + struct resource mbuswins_res, sdramwins_res; + struct device_node *np; + const struct of_device_id *of_id; + + np = of_find_matching_node(NULL, of_mvebu_mbus_ids); + if (!np) { + pr_err("mvebu-mbus: could not find a matching SoC family\n"); + return -ENODEV; + } + + of_id = of_match_node(of_mvebu_mbus_ids, np); + mbus_state.soc = of_id->data; + + if (of_address_to_resource(np, 0, &mbuswins_res)) { + pr_err("%s: cannot get MBUS register address\n", __func__); + return -EINVAL; + } + + if (of_address_to_resource(np, 1, &sdramwins_res)) { + pr_err("%s: cannot get SDRAM register address\n", __func__); + return -EINVAL; + } + + return mvebu_mbus_common_init(&mbus_state, + mbuswins_res.start, + resource_size(&mbuswins_res), + sdramwins_res.start, + resource_size(&sdramwins_res)); +} +#endif diff --git a/include/linux/mbus.h b/include/linux/mbus.h index efa1a6d..1f80425 100644 --- a/include/linux/mbus.h +++ b/include/linux/mbus.h @@ -32,6 +32,17 @@ struct mbus_dram_target_info } cs[4]; }; +/* Flags for PCI/PCIe address decoding regions */ +#define MVEBU_MBUS_PCI_IO 0x1 +#define MVEBU_MBUS_PCI_MEM 0x2 +#define MVEBU_MBUS_PCI_WA 0x3 + +/* + * Magic value that explicits that we don't need a remapping-capable + * address decoding window. + */ +#define MVEBU_MBUS_NO_REMAP (0xffffffff) + /* * The Marvell mbus is to be found only on SOCs from the Orion family * at the moment. Provide a dummy stub for other architectures. @@ -44,4 +55,16 @@ static inline const struct mbus_dram_target_info *mv_mbus_dram_info(void) return NULL; } #endif -#endif + +int mvebu_mbus_add_window_remap_flags(const char *devname, phys_addr_t base, + size_t size, phys_addr_t remap, + unsigned int flags); +int mvebu_mbus_add_window(const char *devname, phys_addr_t base, + size_t size); +int mvebu_mbus_del_window(phys_addr_t base, size_t size); +int mvebu_mbus_init(const char *soc, phys_addr_t mbus_phys_base, + size_t mbus_size, phys_addr_t sdram_phys_base, + size_t sdram_size); +int mvebu_mbus_dt_init(void); + +#endif /* __LINUX_MBUS_H */
The Marvell EBU SoCs have a configurable physical address space layout: the physical ranges of memory used to address PCI(e) interfaces, NOR flashes, SRAM and various other types of memory are configurable by software, through a mechanism of so-called 'address decoding windows'. This new driver mvebu-mbus consolidates the existing code to address the configuration of these memory ranges, which is spread into mach-mvebu, mach-orion5x, mach-mv78xx0, mach-dove and mach-kirkwood. Following patches convert each Marvell EBU SoC family to use this driver, therefore removing the old code that was configuring the address decoding windows. It is worth mentioning that the MVEBU_MBUS Kconfig option is intentionally added as a blind option. The new driver implements and exports the mv_mbus_dram_info() function, which is used by various Marvell drivers throughout the tree to get access to window configuration parameters that they require. This function is also implemented in arch/arm/plat-orion/addr-map.c, which ultimately gets removed at the end of this patch series. So, in order to preserve bisectability, we want to ensure that *either* this new driver, *or* the legacy code in plat-orion/addr-map.c gets compiled in. By making MVEBU_MBUS a blind option, we are sure that only a platform that does 'select MVEBU_MBUS' will get this new driver compiled in. Therefore, throughout the next patches that convert the Marvell sub-architectures one after the other to this new driver, we add the 'select MVEBU_MBUS' and also ensure to remove plat-orion/addr-map.c from the build for this specific sub-architecture. This ensures that bisectability is preserved. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> --- .../devicetree/bindings/arm/mvebu-mbus.txt | 39 + drivers/bus/Kconfig | 7 + drivers/bus/Makefile | 1 + drivers/bus/mvebu-mbus.c | 821 ++++++++++++++++++++ include/linux/mbus.h | 25 +- 5 files changed, 892 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/arm/mvebu-mbus.txt create mode 100644 drivers/bus/mvebu-mbus.c