Message ID | 1400242998-437-1-git-send-email-thierry.reding@gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Fri, 16 May 2014 14:23:18 +0200, Thierry Reding wrote: > From: Thierry Reding <treding@nvidia.com> > > This commit introduces a generic device tree binding for IOMMU devices. > Only a very minimal subset is described here, but it is enough to cover > the requirements of both the Exynos System MMU and Tegra SMMU as > discussed here: > > https://lkml.org/lkml/2014/4/27/346 > > More advanced functionality such as the dma-ranges property can easily > be added in a backwards-compatible way. In the absence of a dma-ranges > property it should be safe to default to the whole address space. > > Signed-off-by: Thierry Reding <treding@nvidia.com> > --- > Documentation/devicetree/bindings/iommu/iommu.txt | 109 ++++++++++++++++++++++ > 1 file changed, 109 insertions(+) > create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt > > diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt > new file mode 100644 > index 000000000000..2d67b52b656e > --- /dev/null > +++ b/Documentation/devicetree/bindings/iommu/iommu.txt > @@ -0,0 +1,109 @@ > +This document describes the generic device tree binding for IOMMUs and their > +master(s). > + > + > +IOMMU device node: > +================== > + > +An IOMMU can provide the following services: > + > +* Remap address space to allow devices to access physical memory ranges that > + they otherwise wouldn't be capable of accessing. > + > + Example: 32-bit DMA to 64-bit physical addresses > + > +* Implement scatter-gather at page level granularity so that the device does > + not have to. > + > +* Provide system protection against "rogue" DMA by forcing all accesses to go > + through the IOMMU and faulting when encountering accesses to unmapped > + address regions. > + > +* Provide address space isolation between multiple contexts. > + > + Example: Virtualization > + > +Device nodes compatible with this binding represent hardware with some of the > +above capabilities. > + > +IOMMUs can be single-master or multiple-master. Single-master IOMMU devices > +typically have a fixed association to the master device, whereas multiple- > +master IOMMU devices can translate accesses from more than one master. > + > +Required properties: > +-------------------- > +- #iommu-cells: The number of cells in an IOMMU specifier. The meaning of the > + cells is defined by the binding for the IOMMU device. > + > + Typical values include: > + * 0: Single-master IOMMU devices are often not configurable, therefore the > + specifying doesn't need to encode any information and can be empty. > + > + * 1: Multiple-master IOMMU devices need to know for which master they should > + enable translation. Typically the single cell in the specifier corresponds > + to the master device's ID. > + > + > +IOMMU master node: > +================== > + > +Devices that access memory through an IOMMU are called masters. A device can > +have multiple master interfaces (to one or more IOMMU devices). > + > +Required properties: > +-------------------- > +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU > + master interfaces of the device. One entry in the list describes one master > + interface of the device. > + > +Optional properties: > +-------------------- > +- iommu-names: A list of names identifying each entry in the iommus property. > + > + > +Examples: > +========= > + > +Single-master IOMMU: > +-------------------- > + > + iommu { > + #iommu-cells = <0>; > + }; > + > + master { > + iommu = <&/iommu>; > + }; > + Great work, Thierry. One simple comment. This should be also applicable to multi-master IOMMUs that the masters of an IOMMU is not configurable with ID or something. I think the title needs to be changed to cover such IOMMUs which always translate master's transactions and unable to change the configuration of the relationship between the masters and IOMMUs by S/W. Regards, KyongHo > +Multi-master IOMMU: > +------------------- > + > + iommu { > + /* the specifier represents the ID of the master */ > + #iommu-cells = <1>; > + }; > + > + master { > + /* device has master ID 42 in the IOMMU */ > + iommu = <&/iommu 42>; > + }; > + > +Multi-master device: > +-------------------- > + > + /* single-master IOMMU */ > + iommu@1 { > + #iommu-cells = <0>; > + }; > + > + /* multi-master IOMMU */ > + iommu@2 { > + /* the specifier represents the ID of the master */ > + #iommu-cells = <1>; > + }; > + > + /* device with two master interfaces */ > + master { > + iommus = <&/iommu@1>, /* master of the single-master IOMMU */ > + <&/iommu@2 42>; /* ID 42 in multi-master IOMMU */ > + }; > -- > 1.9.2 > -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, May 17, 2014 at 05:04:55PM +0900, Cho KyongHo wrote: > On Fri, 16 May 2014 14:23:18 +0200, Thierry Reding wrote: [...] > > +Examples: > > +========= > > + > > +Single-master IOMMU: > > +-------------------- > > + > > + iommu { > > + #iommu-cells = <0>; > > + }; > > + > > + master { > > + iommu = <&/iommu>; > > + }; > > + > > Great work, Thierry. > > One simple comment. > > This should be also applicable to multi-master IOMMUs that the masters > of an IOMMU is not configurable with ID or something. > I think the title needs to be changed to cover such IOMMUs which always > translate master's transactions and unable to change the configuration > of the relationship between the masters and IOMMUs by S/W. Agreed, how about we add a separate example for that case: Multiple-master IOMMU with fixed associations: ---------------------------------------------- /* multiple-master IOMMU */ iommu { /* * Masters are statically associated with this IOMMU and * address translation is always enabled. */ #iommu-cells = <0>; }; /* static association with IOMMU */ master@1 { reg = <1>; iommus = <&/iommu>; }; /* static association with IOMMU */ master@2 { reg = <2>; iommus = <&/iommu>; }; ? Thierry
On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > From: Thierry Reding <treding@nvidia.com> > > This commit introduces a generic device tree binding for IOMMU devices. > Only a very minimal subset is described here, but it is enough to cover > the requirements of both the Exynos System MMU and Tegra SMMU as > discussed here: > > https://lkml.org/lkml/2014/4/27/346 > > More advanced functionality such as the dma-ranges property can easily > be added in a backwards-compatible way. In the absence of a dma-ranges > property it should be safe to default to the whole address space. > The basic binding looks fine, but I'd like it to be more explicit about dma-ranges. Most importantly, what does "the whole address space" mean? A lot of IOMMUs have only 32-bit bus addresses when targetted by a bus master, it would also be normal for some to be smaller and some might even support 64-bit. For the upstream side, I'd hope we always have access to the full physical memory, but since this is a brand-new binding, it should be straightforward to just ask for upstream dma-ranges properties to be set all the way up to the root to confirm that. For downstream, we don't actually have a good place to put the dma-ranges property. We can't put it into the iommu node, because that would imply translating to the iommu's parent bus, not the iommu's own bus space. We also can't put it into the master, because dma-ranges is supposed to be in the parent bus. Finally, it makes no sense to use the dma-ranges property of the master's parent bus, because that bus isn't actually involved in the translation. My preferred option would be to always put the address range into the iommu descriptor, using the iommu's #address-cells. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > From: Thierry Reding <treding@nvidia.com> > > > > This commit introduces a generic device tree binding for IOMMU devices. > > Only a very minimal subset is described here, but it is enough to cover > > the requirements of both the Exynos System MMU and Tegra SMMU as > > discussed here: > > > > https://lkml.org/lkml/2014/4/27/346 > > > > More advanced functionality such as the dma-ranges property can easily > > be added in a backwards-compatible way. In the absence of a dma-ranges > > property it should be safe to default to the whole address space. > > > > The basic binding looks fine, but I'd like it to be more explicit > about dma-ranges. Most importantly, what does "the whole address space" > mean? The whole point was to leave out any mention of dma-ranges from the binding until we've figured out more of the puzzle. So what I was trying to avoid was another lengthy discussion on the topic of dma-ranges. Oh well... =) > A lot of IOMMUs have only 32-bit bus addresses when targetted > by a bus master, it would also be normal for some to be smaller and > some might even support 64-bit. > > For the upstream side, I'd hope we always have access to the full > physical memory, but since this is a brand-new binding, it should > be straightforward to just ask for upstream dma-ranges properties > to be set all the way up to the root to confirm that. > > For downstream, we don't actually have a good place to put the > dma-ranges property. I'm not sure I understand what you mean by upstream and downstream in this context. > We can't put it into the iommu node, because that would imply translating > to the iommu's parent bus, not the iommu's own bus space. My understanding was that the purpose of dma-ranges was to define a mapping from one bus to another. So the general form of child-address parent-address child-size Would be used to translate a region of size <child-size> from the <child-address> (the I/O address space created by the IOMMU) to the <parent-address> (physical address space). > We also can't put it into the master, because dma-ranges is supposed to be > in the parent bus. I don't understand. From the above I would think that the master's node is precisely where it belongs. > Finally, it makes no sense to use the dma-ranges property of the master's > parent bus, because that bus isn't actually involved in the translation. My understanding here is mostly based on the OpenFirmware working group proposal for the dma-ranges property[0]. I'll give another example to try and clarify how I had imagined this to work: / { #address-cells = <2>; #size-cells = <2>; iommu { /* * This is somewhat unusual (or maybe not) in that we * need 2 cells to represent the size of an address * space that is 32 bits long. */ #address-cells = <1>; #size-cells = <2>; #iommu-cells = <1>; }; master { iommus = <&/iommu 42>; /* * Map I/O addresses 0 - 4 GiB to physical addresses * 2 GiB - 6 GiB. */ dma-ranges = <0x00000000 0 0x80000000 1 0>; }; }; This is somewhat incompatible with [0] in that #address-cells used to parse the child address must be taken from the iommu node rather than the child node. But that seems to me to be the only reasonable thing to do, because after all the IOMMU creates a completely new address space for the master. [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt > My preferred option would be to always put the address range into > the iommu descriptor, using the iommu's #address-cells. That could become impossible to parse. I'm not sure if such hardware actually exists, but if for some reason we have to split the address range into two, then there's no longer any way to determine the size needed for the specifier. On the other hand what you propose makes it easy to represent multiple master interfaces on a device. With a separate dma-ranges property how can you define which ranges apply to which of the master interfaces? Then again if address ranges can't be broken up in the first place, then dma-ranges could be considered to be one entry per IOMMU in the iommus property. Thierry
On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote: > On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > > From: Thierry Reding <treding@nvidia.com> > > > > > > This commit introduces a generic device tree binding for IOMMU devices. > > > Only a very minimal subset is described here, but it is enough to cover > > > the requirements of both the Exynos System MMU and Tegra SMMU as > > > discussed here: > > > > > > https://lkml.org/lkml/2014/4/27/346 > > > > > > More advanced functionality such as the dma-ranges property can easily > > > be added in a backwards-compatible way. In the absence of a dma-ranges > > > property it should be safe to default to the whole address space. > > > > > > > The basic binding looks fine, but I'd like it to be more explicit > > about dma-ranges. Most importantly, what does "the whole address space" > > mean? > > The whole point was to leave out any mention of dma-ranges from the > binding until we've figured out more of the puzzle. > > So what I was trying to avoid was another lengthy discussion on the > topic of dma-ranges. Oh well... =) Apologies for the silence on this topic... I'm trying to do too many things at the sime time as usual :/ > > > A lot of IOMMUs have only 32-bit bus addresses when targetted > > by a bus master, it would also be normal for some to be smaller and > > some might even support 64-bit. > > > > For the upstream side, I'd hope we always have access to the full > > physical memory, but since this is a brand-new binding, it should > > be straightforward to just ask for upstream dma-ranges properties > > to be set all the way up to the root to confirm that. > > > > For downstream, we don't actually have a good place to put the > > dma-ranges property. > > I'm not sure I understand what you mean by upstream and downstream in > this context. > > > We can't put it into the iommu node, because that would imply translating > > to the iommu's parent bus, not the iommu's own bus space. > > My understanding was that the purpose of dma-ranges was to define a > mapping from one bus to another. So the general form of > > child-address parent-address child-size > > Would be used to translate a region of size <child-size> from the > <child-address> (the I/O address space created by the IOMMU) to the > <parent-address> (physical address space). > > > We also can't put it into the master, because dma-ranges is supposed to be > > in the parent bus. > > I don't understand. From the above I would think that the master's node > is precisely where it belongs. > > > Finally, it makes no sense to use the dma-ranges property of the master's > > parent bus, because that bus isn't actually involved in the translation. > > My understanding here is mostly based on the OpenFirmware working group > proposal for the dma-ranges property[0]. I'll give another example to > try and clarify how I had imagined this to work: > > / { > #address-cells = <2>; > #size-cells = <2>; > > iommu { > /* > * This is somewhat unusual (or maybe not) in that we > * need 2 cells to represent the size of an address > * space that is 32 bits long. > */ > #address-cells = <1>; > #size-cells = <2>; > > #iommu-cells = <1>; > }; > > master { > iommus = <&/iommu 42>; Comparing this with the other discussion thread, we have a similar concept here, in that the iommu is made a logical parent, however... Firstly, there's an implicit assumption here that the only kind of thing the master could possibly be connected to is an IOMMU, with no non-trivial interconnect in between. I'm not sure this is going to scale to complex SoCs. If a range of Stream IDs may be issued (especially from something like a PCIe RC where the stream ID may be a many-bit value), describing the IDs individually may be impractical. > /* > * Map I/O addresses 0 - 4 GiB to physical addresses > * 2 GiB - 6 GiB. > */ > dma-ranges = <0x00000000 0 0x80000000 1 0>; ... I concur with Arnd dma-ranges is now in the wrong place with respect to the (overridden) child-parent relationship, and that if this device masters to multiple destinations there's no possibility of describing different ranges mappings for each. > }; > }; > > This is somewhat incompatible with [0] in that #address-cells used to > parse the child address must be taken from the iommu node rather than > the child node. But that seems to me to be the only reasonable thing > to do, because after all the IOMMU creates a completely new address > space for the master. > > [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt > > > My preferred option would be to always put the address range into > > the iommu descriptor, using the iommu's #address-cells. > > That could become impossible to parse. I'm not sure if such hardware > actually exists, but if for some reason we have to split the address > range into two, then there's no longer any way to determine the size > needed for the specifier. > > On the other hand what you propose makes it easy to represent multiple > master interfaces on a device. With a separate dma-ranges property how > can you define which ranges apply to which of the master interfaces? In theory, you could describe split ranges simply by multiplying out: <&/iommu 42 sub-range-1>, <&/iommu 42 sub-range-2>, ... This is only manageable if the number of subranges for each (iommu,streamID) mapping is small (and usually 1). Of course, I hope the number of subranges normally _is_ small... Cheers ---Dave > Then again if address ranges can't be broken up in the first place, then > dma-ranges could be considered to be one entry per IOMMU in the iommus > property. > > Thierry -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 19 May 2014 14:53:37 Thierry Reding wrote: > On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > > From: Thierry Reding <treding@nvidia.com> > > > > > > This commit introduces a generic device tree binding for IOMMU devices. > > > Only a very minimal subset is described here, but it is enough to cover > > > the requirements of both the Exynos System MMU and Tegra SMMU as > > > discussed here: > > > > > > https://lkml.org/lkml/2014/4/27/346 > > > > > > More advanced functionality such as the dma-ranges property can easily > > > be added in a backwards-compatible way. In the absence of a dma-ranges > > > property it should be safe to default to the whole address space. > > > > > > > The basic binding looks fine, but I'd like it to be more explicit > > about dma-ranges. Most importantly, what does "the whole address space" > > mean? > > The whole point was to leave out any mention of dma-ranges from the > binding until we've figured out more of the puzzle. > > So what I was trying to avoid was another lengthy discussion on the > topic of dma-ranges. Oh well... =) I think that can't work, because we need a way to specify the ranges for some iommu drivers. We have to make sure we at least don't prevent it from working. > > A lot of IOMMUs have only 32-bit bus addresses when targetted > > by a bus master, it would also be normal for some to be smaller and > > some might even support 64-bit. > > > > For the upstream side, I'd hope we always have access to the full > > physical memory, but since this is a brand-new binding, it should > > be straightforward to just ask for upstream dma-ranges properties > > to be set all the way up to the root to confirm that. > > > > For downstream, we don't actually have a good place to put the > > dma-ranges property. > > I'm not sure I understand what you mean by upstream and downstream in > this context. Upstream -> close to memory Downstream -> close to DMA master > > We can't put it into the iommu node, because that would imply translating > > to the iommu's parent bus, not the iommu's own bus space. > > My understanding was that the purpose of dma-ranges was to define a > mapping from one bus to another. So the general form of > > child-address parent-address child-size > > Would be used to translate a region of size <child-size> from the > <child-address> (the I/O address space created by the IOMMU) to the > <parent-address> (physical address space). That's how it's defined for normal buses, we haven't defined it for IOMMUs yet. > > We also can't put it into the master, because dma-ranges is supposed to be > > in the parent bus. > > I don't understand. From the above I would think that the master's node > is precisely where it belongs. The way that I read the binding, the way that dma-ranges belongs into the parent of the master, according. On normal buses (e.g. classic PCI), all DMA masters see the same address space. The bus defines how a transaction started on one of its children gets translated to the address space of its parent. This is similar to 'ranges', which defines how an MMIO address initiated on the parent of a bus gets translated to an address understood by its children. My interpretation at least matches the defintion of of_dma_get_range(), but then again Santosh added that according to my comments. > > Finally, it makes no sense to use the dma-ranges property of the master's > > parent bus, because that bus isn't actually involved in the translation. > > My understanding here is mostly based on the OpenFirmware working group > proposal for the dma-ranges property[0]. I'll give another example to > try and clarify how I had imagined this to work: > > / { > #address-cells = <2>; > #size-cells = <2>; > > iommu { > /* > * This is somewhat unusual (or maybe not) in that we > * need 2 cells to represent the size of an address > * space that is 32 bits long. > */ > #address-cells = <1>; > #size-cells = <2>; You should never need #size-cells > #address-cells > #iommu-cells = <1>; > }; > > master { > iommus = <&/iommu 42>; > /* > * Map I/O addresses 0 - 4 GiB to physical addresses > * 2 GiB - 6 GiB. > */ > dma-ranges = <0x00000000 0 0x80000000 1 0>; > }; > }; > > This is somewhat incompatible with [0] in that #address-cells used to > parse the child address must be taken from the iommu node rather than > the child node. But that seems to me to be the only reasonable thing > to do, because after all the IOMMU creates a completely new address > space for the master. > > [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt I don't think you can have a dma-ranges without a #address-cells and #size-cells property in the same node. In your example, I'd also expect a child node below 'master' that then interprets the address space made up by dma-ranges. As a comment on the numbers in your example, I don't expect to ever see a 4GB IOMMU address space that doesn't start at an offset. Instead I'd expect either addresses that encode a device ID, or those that are just a subset of the parent address space, with non-overlapping bus addresses for each master. > > My preferred option would be to always put the address range into > > the iommu descriptor, using the iommu's #address-cells. > > That could become impossible to parse. I'm not sure if such hardware > actually exists, but if for some reason we have to split the address > range into two, then there's no longer any way to determine the size > needed for the specifier. > > On the other hand what you propose makes it easy to represent multiple > master interfaces on a device. With a separate dma-ranges property how > can you define which ranges apply to which of the master interfaces? Well, you could have multiple links to the same IOMMU if you want to do that, and define that there must be at least one dma-ranges entry for each IOMMU entry (although not necessarily the other way round, you could have direct ranges in addition to translated ones. > Then again if address ranges can't be broken up in the first place, then > dma-ranges could be considered to be one entry per IOMMU in the iommus > property. Let me do another example, with the address merged into the iommu references: / { #address-cells = <2>; // 64-bit address #size-cells = <2>; iommu@a { #address-cells = <2>; // 1 cell ID, 1 cell address #size-cells = <1>; // no need for #iommu-cells }; master@b { iommus = <&/iommu@a // iommu 0x23 // ID 0x40000000 // window start 0x10000000>; //window size }; }; A disadvantage of this model would be that for all ARM SMMU users, we'd have to always list a 4GB address space, which is kind of redundant. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote: > On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote: [...] > > My understanding here is mostly based on the OpenFirmware working group > > proposal for the dma-ranges property[0]. I'll give another example to > > try and clarify how I had imagined this to work: > > > > / { > > #address-cells = <2>; > > #size-cells = <2>; > > > > iommu { > > /* > > * This is somewhat unusual (or maybe not) in that we > > * need 2 cells to represent the size of an address > > * space that is 32 bits long. > > */ > > #address-cells = <1>; > > #size-cells = <2>; > > > > #iommu-cells = <1>; > > }; > > > > master { > > iommus = <&/iommu 42>; > > Comparing this with the other discussion thread, we have a similar > concept here, in that the iommu is made a logical parent, however... > > Firstly, there's an implicit assumption here that the only kind of > thing the master could possibly be connected to is an IOMMU, with > no non-trivial interconnect in between. I'm not sure this is going > to scale to complex SoCs. Here we go again. We're now in the very same bad spot that we've been in so many times before. There are at least two SoCs that we know do not require anything fancy, yet we're blocking adding support for those use cases because we think that at some point some IOMMU may require more than that. But at the same time it seems like we don't have enough data about what exactly that is, so we keep speculating. At this rate we're making it impossible to get a reasonable feature set supported upstream. That's very frustrating. > If a range of Stream IDs may be issued (especially from something like > a PCIe RC where the stream ID may be a many-bit value), describing > the IDs individually may be impractical. The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has a need to represent the stream IDs as a range there's nothing keeping it from defining the specifier accordingly. Thierry
On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: > On Monday 19 May 2014 14:53:37 Thierry Reding wrote: > > On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > > > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > > > From: Thierry Reding <treding@nvidia.com> > > > > > > > > This commit introduces a generic device tree binding for IOMMU devices. > > > > Only a very minimal subset is described here, but it is enough to cover > > > > the requirements of both the Exynos System MMU and Tegra SMMU as > > > > discussed here: > > > > > > > > https://lkml.org/lkml/2014/4/27/346 > > > > > > > > More advanced functionality such as the dma-ranges property can easily > > > > be added in a backwards-compatible way. In the absence of a dma-ranges > > > > property it should be safe to default to the whole address space. > > > > > > > > > > The basic binding looks fine, but I'd like it to be more explicit > > > about dma-ranges. Most importantly, what does "the whole address space" > > > mean? > > > > The whole point was to leave out any mention of dma-ranges from the > > binding until we've figured out more of the puzzle. > > > > So what I was trying to avoid was another lengthy discussion on the > > topic of dma-ranges. Oh well... =) > > I think that can't work, because we need a way to specify the > ranges for some iommu drivers. We have to make sure we at least > don't prevent it from working. That was precisely why I wanted to let this out of the binding for now so that we can actually focus on making existing hardware work rather than speculate about what we may or may not need. If you think the current minimal binding will cause future use-cases to break, then we should change it to avoid that. What you're proposing is to make the binding more complex on the assumption that we'll get it right. Wouldn't it be safer to keep things to the bare minimum necessary to represent hardware that we have access to and understand now, rather than speculating about what may be necessary at some point. I'd much prefer to add complexity on an as-needed basis and when we have a better understand of where we're headed. > > > Finally, it makes no sense to use the dma-ranges property of the master's > > > parent bus, because that bus isn't actually involved in the translation. > > > > My understanding here is mostly based on the OpenFirmware working group > > proposal for the dma-ranges property[0]. I'll give another example to > > try and clarify how I had imagined this to work: > > > > / { > > #address-cells = <2>; > > #size-cells = <2>; > > > > iommu { > > /* > > * This is somewhat unusual (or maybe not) in that we > > * need 2 cells to represent the size of an address > > * space that is 32 bits long. > > */ > > #address-cells = <1>; > > #size-cells = <2>; > > You should never need #size-cells > #address-cells That was always my impression as well. But how then do you represent the full 4 GiB address space in a 32-bit system? It starts at 0 and ends at 4 GiB - 1, which makes it 4 GiB large. That's: <0 1 0> With #address-cells = <1> and #size-cells = <1> the best you can do is: <0 0xffffffff> but that's not accurate. > > #iommu-cells = <1>; > > }; > > > > master { > > iommus = <&/iommu 42>; > > /* > > * Map I/O addresses 0 - 4 GiB to physical addresses > > * 2 GiB - 6 GiB. > > */ > > dma-ranges = <0x00000000 0 0x80000000 1 0>; > > }; > > }; > > > > This is somewhat incompatible with [0] in that #address-cells used to > > parse the child address must be taken from the iommu node rather than > > the child node. But that seems to me to be the only reasonable thing > > to do, because after all the IOMMU creates a completely new address > > space for the master. > > > > [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt > > I don't think you can have a dma-ranges without a #address-cells and > #size-cells property in the same node. In your example, I'd also expect > a child node below 'master' that then interprets the address space > made up by dma-ranges. Okay, so what Dave and you have been saying strongly indicates that dma-ranges isn't the right thing to use here. > As a comment on the numbers in your example, I don't expect to ever > see a 4GB IOMMU address space that doesn't start at an offset. Instead > I'd expect either addresses that encode a device ID, or those that > are just a subset of the parent address space, with non-overlapping > bus addresses for each master. As I understand the Tegra SMMU allows each of the clients to be assigned a separate address space (4 GiB on earlier generations and 16 GiB on new generations) and all addresses in each address space can be mapped to arbitrary physical addresses. > > > My preferred option would be to always put the address range into > > > the iommu descriptor, using the iommu's #address-cells. > > > > That could become impossible to parse. I'm not sure if such hardware > > actually exists, but if for some reason we have to split the address > > range into two, then there's no longer any way to determine the size > > needed for the specifier. > > > > On the other hand what you propose makes it easy to represent multiple > > master interfaces on a device. With a separate dma-ranges property how > > can you define which ranges apply to which of the master interfaces? > > Well, you could have multiple links to the same IOMMU if you want to > do that, and define that there must be at least one dma-ranges entry > for each IOMMU entry (although not necessarily the other way round, > you could have direct ranges in addition to translated ones. > > > Then again if address ranges can't be broken up in the first place, then > > dma-ranges could be considered to be one entry per IOMMU in the iommus > > property. > > Let me do another example, with the address merged into the iommu > references: > > / { > #address-cells = <2>; // 64-bit address > #size-cells = <2>; > > iommu@a { > #address-cells = <2>; // 1 cell ID, 1 cell address > #size-cells = <1>; > > // no need for #iommu-cells > }; > > > master@b { > iommus = <&/iommu@a // iommu > 0x23 // ID > 0x40000000 // window start > 0x10000000>; //window size > }; > }; > > A disadvantage of this model would be that for all ARM SMMU users, we'd > have to always list a 4GB address space, which is kind of redundant. Isn't that the equivalent of the "whole address space" default that I mentioned in the commit message? Could this be handled with #address-cells = <1> and #size-cells = <0> in the iommu node? That way the only cell that needs to be specified in iommus would be the ID and the redundant address space could be simply omitted from DT since it would be implied by the compatible string. Thierry
On Monday 19 May 2014 22:59:46 Thierry Reding wrote: > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: > > On Monday 19 May 2014 14:53:37 Thierry Reding wrote: > > > On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > > > > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > > > > From: Thierry Reding <treding@nvidia.com> > > > > > > > > > > This commit introduces a generic device tree binding for IOMMU devices. > > > > > Only a very minimal subset is described here, but it is enough to cover > > > > > the requirements of both the Exynos System MMU and Tegra SMMU as > > > > > discussed here: > > > > > > > > > > https://lkml.org/lkml/2014/4/27/346 > > > > > > > > > > More advanced functionality such as the dma-ranges property can easily > > > > > be added in a backwards-compatible way. In the absence of a dma-ranges > > > > > property it should be safe to default to the whole address space. > > > > > > > > > > > > > The basic binding looks fine, but I'd like it to be more explicit > > > > about dma-ranges. Most importantly, what does "the whole address space" > > > > mean? > > > > > > The whole point was to leave out any mention of dma-ranges from the > > > binding until we've figured out more of the puzzle. > > > > > > So what I was trying to avoid was another lengthy discussion on the > > > topic of dma-ranges. Oh well... =) > > > > I think that can't work, because we need a way to specify the > > ranges for some iommu drivers. We have to make sure we at least > > don't prevent it from working. > > That was precisely why I wanted to let this out of the binding for now > so that we can actually focus on making existing hardware work rather > than speculate about what we may or may not need. > > If you think the current minimal binding will cause future use-cases to > break, then we should change it to avoid that. What you're proposing is > to make the binding more complex on the assumption that we'll get it > right. > > Wouldn't it be safer to keep things to the bare minimum necessary to > represent hardware that we have access to and understand now, rather > than speculating about what may be necessary at some point. I'd much > prefer to add complexity on an as-needed basis and when we have a better > understand of where we're headed. I just want to think it through for the cases we know about. We don't have to implement it all at once, but I think there is a danger in making an important binding too simple or too complicated, both of which are equally bad. After giving the ranges stuff some more thought, I have come to the conclusion that using #iommu-cells should work fine for almost all cases, including windowed iommus, because the window is not actually needed in the device, but only in the iommu, wihch is of course free to interpret the arguments as addresses. > > > > Finally, it makes no sense to use the dma-ranges property of the master's > > > > parent bus, because that bus isn't actually involved in the translation. > > > > > > My understanding here is mostly based on the OpenFirmware working group > > > proposal for the dma-ranges property[0]. I'll give another example to > > > try and clarify how I had imagined this to work: > > > > > > / { > > > #address-cells = <2>; > > > #size-cells = <2>; > > > > > > iommu { > > > /* > > > * This is somewhat unusual (or maybe not) in that we > > > * need 2 cells to represent the size of an address > > > * space that is 32 bits long. > > > */ > > > #address-cells = <1>; > > > #size-cells = <2>; > > > > You should never need #size-cells > #address-cells > > That was always my impression as well. But how then do you represent the > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at > 4 GiB - 1, which makes it 4 GiB large. That's: > > <0 1 0> > > With #address-cells = <1> and #size-cells = <1> the best you can do is: > > <0 0xffffffff> > > but that's not accurate. I think we've done both in the past, either extended #size-cells or taken 0xffffffff as a special token. Note that in your example, the iommu actually needs #address-cells = <2> anyway. > > > #iommu-cells = <1>; > > > }; > > > > > > master { > > > iommus = <&/iommu 42>; > > > /* > > > * Map I/O addresses 0 - 4 GiB to physical addresses > > > * 2 GiB - 6 GiB. > > > */ > > > dma-ranges = <0x00000000 0 0x80000000 1 0>; > > > }; > > > }; > > > > > > This is somewhat incompatible with [0] in that #address-cells used to > > > parse the child address must be taken from the iommu node rather than > > > the child node. But that seems to me to be the only reasonable thing > > > to do, because after all the IOMMU creates a completely new address > > > space for the master. > > > > > > [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt > > > > I don't think you can have a dma-ranges without a #address-cells and > > #size-cells property in the same node. In your example, I'd also expect > > a child node below 'master' that then interprets the address space > > made up by dma-ranges. > > Okay, so what Dave and you have been saying strongly indicates that > dma-ranges isn't the right thing to use here. There are cases where I think we could use dma-ranges, but not in the way you have it above. I'll get to that below. > > As a comment on the numbers in your example, I don't expect to ever > > see a 4GB IOMMU address space that doesn't start at an offset. Instead > > I'd expect either addresses that encode a device ID, or those that > > are just a subset of the parent address space, with non-overlapping > > bus addresses for each master. > > As I understand the Tegra SMMU allows each of the clients to be assigned > a separate address space (4 GiB on earlier generations and 16 GiB on new > generations) and all addresses in each address space can be mapped to > arbitrary physical addresses. Right, so this is not a windowed IOMMU, and you wouldn't need to encode the address at all. > > > > My preferred option would be to always put the address range into > > > > the iommu descriptor, using the iommu's #address-cells. > > > > > > That could become impossible to parse. I'm not sure if such hardware > > > actually exists, but if for some reason we have to split the address > > > range into two, then there's no longer any way to determine the size > > > needed for the specifier. > > > > > > On the other hand what you propose makes it easy to represent multiple > > > master interfaces on a device. With a separate dma-ranges property how > > > can you define which ranges apply to which of the master interfaces? > > > > Well, you could have multiple links to the same IOMMU if you want to > > do that, and define that there must be at least one dma-ranges entry > > for each IOMMU entry (although not necessarily the other way round, > > you could have direct ranges in addition to translated ones. > > > > > Then again if address ranges can't be broken up in the first place, then > > > dma-ranges could be considered to be one entry per IOMMU in the iommus > > > property. > > > > Let me do another example, with the address merged into the iommu > > references: > > > > / { > > #address-cells = <2>; // 64-bit address > > #size-cells = <2>; > > > > iommu@a { > > #address-cells = <2>; // 1 cell ID, 1 cell address > > #size-cells = <1>; > > > > // no need for #iommu-cells > > }; > > > > > > master@b { > > iommus = <&/iommu@a // iommu > > 0x23 // ID > > 0x40000000 // window start > > 0x10000000>; //window size > > }; > > }; > > > > A disadvantage of this model would be that for all ARM SMMU users, we'd > > have to always list a 4GB address space, which is kind of redundant. > > Isn't that the equivalent of the "whole address space" default that I > mentioned in the commit message? Could this be handled with > #address-cells = <1> and #size-cells = <0> in the iommu node? That way > the only cell that needs to be specified in iommus would be the ID and > the redundant address space could be simply omitted from DT since it > would be implied by the compatible string. Yes, that's right. After giving it some more thought, I agree that the #size-cells case makes sense as an indicator for cases where the master doesn't have to care about the address at all and no further translation with dma-ranges is possible. Back to my example with dma-ranges, the simplest case would an IOMMU that has an ID per bus, with multiple masters sharing that ID: / { #address-cells = <1>; #size-cells = <1>; iommu { #address-cells = <2>; // ID, address #size-cells = <2>; }; master@a { iommus = <& {/iommu} 0xa 0x0 0x1 0x0>; // 4GB ID '0xa' } bus1 { #address-cells = <1>; #size-cells = <1>; ranges; iommus = <& {/iommu} 0 0 0x100 0>; // all IDs dma-ranges = <0 0xb 0 1 0>; // child devices use ID '0xb' anothermaster { // no iommus link, implied by dma-ranges above }; }; }; If you set #size-cells=<0>, you can't really do that but instead would require an iommus property in each master, which is not a big concern either. The main advantage I think would be for IOMMUs that use the PCI b/d/f numbers as IDs. These can have #address-cells=<3>, #size-cells=<2> and have an empty dma-ranges property in the PCI host bridge node, and interpret this as using the same encoding as the PCI BARs in the ranges property. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 19 May 2014 22:32:33 Thierry Reding wrote: > On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote: > > On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote: > [...] > > > My understanding here is mostly based on the OpenFirmware working group > > > proposal for the dma-ranges property[0]. I'll give another example to > > > try and clarify how I had imagined this to work: > > > > > > / { > > > #address-cells = <2>; > > > #size-cells = <2>; > > > > > > iommu { > > > /* > > > * This is somewhat unusual (or maybe not) in that we > > > * need 2 cells to represent the size of an address > > > * space that is 32 bits long. > > > */ > > > #address-cells = <1>; > > > #size-cells = <2>; > > > > > > #iommu-cells = <1>; > > > }; > > > > > > master { > > > iommus = <&/iommu 42>; > > > > Comparing this with the other discussion thread, we have a similar > > concept here, in that the iommu is made a logical parent, however... > > > > Firstly, there's an implicit assumption here that the only kind of > > thing the master could possibly be connected to is an IOMMU, with > > no non-trivial interconnect in between. I'm not sure this is going > > to scale to complex SoCs. > > Here we go again. We're now in the very same bad spot that we've been in > so many times before. There are at least two SoCs that we know do not > require anything fancy, yet we're blocking adding support for those use > cases because we think that at some point some IOMMU may require more > than that. But at the same time it seems like we don't have enough data > about what exactly that is, so we keep speculating. At this rate we're > making it impossible to get a reasonable feature set supported upstream. > > That's very frustrating. I agree. While I just commented that I want to think through how this would look for other IOMMUs, the generic case of all masters that Dave wants is going overboard with the complexity and we'd be better off deferring this to whenever someone needs it, which I assume is never. > > If a range of Stream IDs may be issued (especially from something like > > a PCIe RC where the stream ID may be a many-bit value), describing > > the IDs individually may be impractical. > > The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has > a need to represent the stream IDs as a range there's nothing keeping it > from defining the specifier accordingly. If we make the IOMMU address space look like the PCI address space, we can have a simple representation for this in DT. For the code, I assume that we will always have to treat PCI and platform devices differently. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote: > On Monday 19 May 2014 22:59:46 Thierry Reding wrote: > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: > > > On Monday 19 May 2014 14:53:37 Thierry Reding wrote: > > > > On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > > > > > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > > > > > From: Thierry Reding <treding@nvidia.com> [...] > > > > > Finally, it makes no sense to use the dma-ranges property of the master's > > > > > parent bus, because that bus isn't actually involved in the translation. > > > > > > > > My understanding here is mostly based on the OpenFirmware working group > > > > proposal for the dma-ranges property[0]. I'll give another example to > > > > try and clarify how I had imagined this to work: > > > > > > > > / { > > > > #address-cells = <2>; > > > > #size-cells = <2>; > > > > > > > > iommu { > > > > /* > > > > * This is somewhat unusual (or maybe not) in that we > > > > * need 2 cells to represent the size of an address > > > > * space that is 32 bits long. > > > > */ > > > > #address-cells = <1>; > > > > #size-cells = <2>; > > > > > > You should never need #size-cells > #address-cells > > > > That was always my impression as well. But how then do you represent the > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at > > 4 GiB - 1, which makes it 4 GiB large. That's: > > > > <0 1 0> > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is: > > > > <0 0xffffffff> > > > > but that's not accurate. > > I think we've done both in the past, either extended #size-cells or > taken 0xffffffff as a special token. Note that in your example, > the iommu actually needs #address-cells = <2> anyway. But it needs #address-cells = <2> only to encode an ID in addition to the address. If this was a single-master IOMMU then there'd be no need for the ID. This really isn't specific to IOMMUs though. It really applies to all cases where #address-cells and #size-cells are parsed. While it's way too late to change the semantics of standard properties, perhaps for properties that are introduced it would make more sense to encode this as a <start limit> pair, both of length #address-cells, to avoid this type of corner case. On the other hand doing so would make it inconsistent with existing bindings which may not be desirable either. But since it seems like we're headed for something completely different for IOMMUs, perhaps it would be worth considering to describe the IOMMU range as <start limit>. Since it will likely use #iommu-cells rather than #address-cells we have an opportunity to change the semantics. > > > Let me do another example, with the address merged into the iommu > > > references: > > > > > > / { > > > #address-cells = <2>; // 64-bit address > > > #size-cells = <2>; > > > > > > iommu@a { > > > #address-cells = <2>; // 1 cell ID, 1 cell address > > > #size-cells = <1>; > > > > > > // no need for #iommu-cells > > > }; > > > > > > > > > master@b { > > > iommus = <&/iommu@a // iommu > > > 0x23 // ID > > > 0x40000000 // window start > > > 0x10000000>; //window size > > > }; > > > }; > > > > > > A disadvantage of this model would be that for all ARM SMMU users, we'd > > > have to always list a 4GB address space, which is kind of redundant. > > > > Isn't that the equivalent of the "whole address space" default that I > > mentioned in the commit message? Could this be handled with > > #address-cells = <1> and #size-cells = <0> in the iommu node? That way > > the only cell that needs to be specified in iommus would be the ID and > > the redundant address space could be simply omitted from DT since it > > would be implied by the compatible string. > > Yes, that's right. After giving it some more thought, I agree that the > #size-cells case makes sense as an indicator for cases where the master > doesn't have to care about the address at all and no further translation > with dma-ranges is possible. > > Back to my example with dma-ranges, the simplest case would an IOMMU > that has an ID per bus, with multiple masters sharing that ID: > > / { > #address-cells = <1>; > #size-cells = <1>; > > iommu { > #address-cells = <2>; // ID, address > #size-cells = <2>; > }; > > master@a { > iommus = <& {/iommu} 0xa 0x0 0x1 0x0>; // 4GB ID '0xa' > } > > bus1 { > #address-cells = <1>; > #size-cells = <1>; > ranges; > iommus = <& {/iommu} 0 0 0x100 0>; // all IDs > dma-ranges = <0 0xb 0 1 0>; // child devices use ID '0xb' > > anothermaster { > // no iommus link, implied by dma-ranges above > }; > }; > }; > > If you set #size-cells=<0>, you can't really do that but instead would > require an iommus property in each master, which is not a big concern > either. I'm not sure I understand the need for 0x100 (all IDs) entry above. If bus1's iommus property applies to all devices on the bus, why can't the ID 0xb be listed in the iommus property? bus1 { #address-cells = <1>; #size-cells = <1>; ranges; iommus = <&{/iommu} 0xb 0 0x1 0x0>; // 4GB ID '0xb' dma-ranges = <0 0xb 0 0x1 0x0>; anothermaster { ... }; }; In which case I guess dma-ranges would be redundant. > The main advantage I think would be for IOMMUs that use the PCI b/d/f > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2> > and have an empty dma-ranges property in the PCI host bridge node, > and interpret this as using the same encoding as the PCI BARs in > the ranges property. I'm somewhat confused here, since you said earlier: > After giving the ranges stuff some more thought, I have come to the > conclusion that using #iommu-cells should work fine for almost > all cases, including windowed iommus, because the window is not > actually needed in the device, but only in the iommu, wihch is of course > free to interpret the arguments as addresses. But now you seem to be saying that we should still be using the #address-cells and #size-cells properties in the IOMMU node to determine the length of the specifier. Thierry
On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote: > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote: > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote: > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: > > > > On Monday 19 May 2014 14:53:37 Thierry Reding wrote: > > > > > On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote: > > > > > > On Friday 16 May 2014 14:23:18 Thierry Reding wrote: > > > > > > > From: Thierry Reding <treding@nvidia.com> > [...] > > > > > > Finally, it makes no sense to use the dma-ranges property of the master's > > > > > > parent bus, because that bus isn't actually involved in the translation. > > > > > > > > > > My understanding here is mostly based on the OpenFirmware working group > > > > > proposal for the dma-ranges property[0]. I'll give another example to > > > > > try and clarify how I had imagined this to work: > > > > > > > > > > / { > > > > > #address-cells = <2>; > > > > > #size-cells = <2>; > > > > > > > > > > iommu { > > > > > /* > > > > > * This is somewhat unusual (or maybe not) in that we > > > > > * need 2 cells to represent the size of an address > > > > > * space that is 32 bits long. > > > > > */ > > > > > #address-cells = <1>; > > > > > #size-cells = <2>; > > > > > > > > You should never need #size-cells > #address-cells > > > > > > That was always my impression as well. But how then do you represent the > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at > > > 4 GiB - 1, which makes it 4 GiB large. That's: > > > > > > <0 1 0> > > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is: > > > > > > <0 0xffffffff> > > > > > > but that's not accurate. > > > > I think we've done both in the past, either extended #size-cells or > > taken 0xffffffff as a special token. Note that in your example, > > the iommu actually needs #address-cells = <2> anyway. > > But it needs #address-cells = <2> only to encode an ID in addition to > the address. If this was a single-master IOMMU then there'd be no need > for the ID. Right. But for a single-master IOMMU, there is no need to specify any additional data, it could have #address-cells=<0> if we take the optimization you suggested. > This really isn't specific to IOMMUs though. It really applies to all > cases where #address-cells and #size-cells are parsed. While it's way > too late to change the semantics of standard properties, perhaps for > properties that are introduced it would make more sense to encode this > as a <start limit> pair, both of length #address-cells, to avoid this > type of corner case. > > On the other hand doing so would make it inconsistent with existing > bindings which may not be desirable either. > > But since it seems like we're headed for something completely different > for IOMMUs, perhaps it would be worth considering to describe the IOMMU > range as <start limit>. Since it will likely use #iommu-cells rather > than #address-cells we have an opportunity to change the semantics. I'd still prefer #address-cells/#size-cells over #iommu-cells. > > / { > > #address-cells = <1>; > > #size-cells = <1>; > > > > iommu { > > #address-cells = <2>; // ID, address > > #size-cells = <2>; > > }; > > > > master@a { > > iommus = <& {/iommu} 0xa 0x0 0x1 0x0>; // 4GB ID '0xa' > > } > > > > bus1 { > > #address-cells = <1>; > > #size-cells = <1>; > > ranges; > > iommus = <& {/iommu} 0 0 0x100 0>; // all IDs > > dma-ranges = <0 0xb 0 1 0>; // child devices use ID '0xb' > > > > anothermaster { > > // no iommus link, implied by dma-ranges above > > }; > > }; > > }; > > > > If you set #size-cells=<0>, you can't really do that but instead would > > require an iommus property in each master, which is not a big concern > > either. > > I'm not sure I understand the need for 0x100 (all IDs) entry above. If > bus1's iommus property applies to all devices on the bus, why can't the > ID 0xb be listed in the iommus property? > > bus1 { > #address-cells = <1>; > #size-cells = <1>; > ranges; > iommus = <&{/iommu} 0xb 0 0x1 0x0>; // 4GB ID '0xb' > dma-ranges = <0 0xb 0 0x1 0x0>; > > anothermaster { > ... > }; > }; It depends on how the address is interpreted, but we could make this a valid case too. > In which case I guess dma-ranges would be redundant. No, because the iommus property doesn't translate the address range, it just creates a new address space. bus1 and iommu in the example have different #address-cells, so you definitely need a non-empty ranges property. > > The main advantage I think would be for IOMMUs that use the PCI b/d/f > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2> > > and have an empty dma-ranges property in the PCI host bridge node, > > and interpret this as using the same encoding as the PCI BARs in > > the ranges property. > > I'm somewhat confused here, since you said earlier: > > > After giving the ranges stuff some more thought, I have come to the > > conclusion that using #iommu-cells should work fine for almost > > all cases, including windowed iommus, because the window is not > > actually needed in the device, but only in the iommu, wihch is of course > > free to interpret the arguments as addresses. > > But now you seem to be saying that we should still be using the > #address-cells and #size-cells properties in the IOMMU node to determine > the length of the specifier. I probably wasn't clear. I think we can make it work either way, but my feeling is that using #address-cells/#size-cells gives us a nicer syntax for the more complex cases. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote: > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote: > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote: > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote: > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: [...] > > > > > You should never need #size-cells > #address-cells > > > > > > > > That was always my impression as well. But how then do you represent the > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at > > > > 4 GiB - 1, which makes it 4 GiB large. That's: > > > > > > > > <0 1 0> > > > > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is: > > > > > > > > <0 0xffffffff> > > > > > > > > but that's not accurate. > > > > > > I think we've done both in the past, either extended #size-cells or > > > taken 0xffffffff as a special token. Note that in your example, > > > the iommu actually needs #address-cells = <2> anyway. > > > > But it needs #address-cells = <2> only to encode an ID in addition to > > the address. If this was a single-master IOMMU then there'd be no need > > for the ID. > > Right. But for a single-master IOMMU, there is no need to specify > any additional data, it could have #address-cells=<0> if we take the > optimization you suggested. Couldn't a single-master IOMMU be windowed? > > I'm not sure I understand the need for 0x100 (all IDs) entry above. If > > bus1's iommus property applies to all devices on the bus, why can't the > > ID 0xb be listed in the iommus property? > > > > bus1 { > > #address-cells = <1>; > > #size-cells = <1>; > > ranges; > > iommus = <&{/iommu} 0xb 0 0x1 0x0>; // 4GB ID '0xb' > > dma-ranges = <0 0xb 0 0x1 0x0>; > > > > anothermaster { > > ... > > }; > > }; > > It depends on how the address is interpreted, but we could make this > a valid case too. > > > In which case I guess dma-ranges would be redundant. > > No, because the iommus property doesn't translate the address range, it > just creates a new address space. bus1 and iommu in the example have > different #address-cells, so you definitely need a non-empty ranges > property. Ah, now I get it. > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2> > > > and have an empty dma-ranges property in the PCI host bridge node, > > > and interpret this as using the same encoding as the PCI BARs in > > > the ranges property. > > > > I'm somewhat confused here, since you said earlier: > > > > > After giving the ranges stuff some more thought, I have come to the > > > conclusion that using #iommu-cells should work fine for almost > > > all cases, including windowed iommus, because the window is not > > > actually needed in the device, but only in the iommu, wihch is of course > > > free to interpret the arguments as addresses. > > > > But now you seem to be saying that we should still be using the > > #address-cells and #size-cells properties in the IOMMU node to determine > > the length of the specifier. > > I probably wasn't clear. I think we can make it work either way, but > my feeling is that using #address-cells/#size-cells gives us a nicer > syntax for the more complex cases. Okay, so in summary we'd have something like this for simple cases: Required properties: -------------------- - #address-cells: The number of cells in an IOMMU specifier needed to encode an address. - #size-cells: The number of cells in an IOMMU specifier needed to represent the length of an address range. Typical values for the above include: - #address-cells = <0>, size-cells = <0>: Single master IOMMU devices are not configurable and therefore no additional information needs to be encoded in the specifier. This may also apply to multiple master IOMMU devices that do not allow the association of masters to be configured. - #address-cells = <1>, size-cells = <0>: Multiple master IOMMU devices may need to be configured in order to enable translation for a given master. In such cases the single address cell corresponds to the master device's ID. - #address-cells = <2>, size-cells = <2>: Some IOMMU devices allow the DMA window for masters to be configured. The first cell of the address in this may contain the master device's ID for example, while the second cell could contain the start of the DMA window for the given device. The length of the DMA window is specified by two additional cells. Examples: ========= Single-master IOMMU: -------------------- iommu { #address-cells = <0>; #size-cells = <0>; }; master { iommus = <&/iommu>; }; Multiple-master IOMMU with fixed associations: ---------------------------------------------- /* multiple-master IOMMU */ iommu { /* * Masters are statically associated with this IOMMU and * address translation is always enabled. */ #iommu-cells = <0>; }; /* static association with IOMMU */ master@1 { reg = <1>; iommus = <&/iommu>; }; /* static association with IOMMU */ master@2 { reg = <2>; iommus = <&/iommu>; }; Multiple-master IOMMU: ---------------------- iommu { /* the specifier represents the ID of the master */ #address-cells = <1>; #size-cells = <0>; }; master { /* device has master ID 42 in the IOMMU */ iommus = <&/iommu 42>; }; Multiple-master device: ----------------------- /* single-master IOMMU */ iommu@1 { reg = <1>; #address-cells = <0>; #size-cells = <0>; }; /* multiple-master IOMMU */ iommu@2 { reg = <2>; #address-cells = <1>; #size-cells = <0>; }; /* device with two master interfaces */ master { iommus = <&/iommu@1>, /* master of the single-master IOMMU */ <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */ }; Multiple-master IOMMU with configurable DMA window: --------------------------------------------------- / { #address-cells = <1>; #size-cells = <1>; iommu { /* master ID, address of DMA window */ #address-cells = <2>; #size-cells = <2>; }; master { /* master ID 42, 4 GiB DMA window starting at 0 */ iommus = <&/iommu 42 0 0x1 0x0>; }; }; Does that sound about right? Thierry
On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote: > > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote: > > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote: > > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote: > > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: > [...] > > > > > > You should never need #size-cells > #address-cells > > > > > > > > > > That was always my impression as well. But how then do you represent the > > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at > > > > > 4 GiB - 1, which makes it 4 GiB large. That's: > > > > > > > > > > <0 1 0> > > > > > > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is: > > > > > > > > > > <0 0xffffffff> > > > > > > > > > > but that's not accurate. > > > > > > > > I think we've done both in the past, either extended #size-cells or > > > > taken 0xffffffff as a special token. Note that in your example, > > > > the iommu actually needs #address-cells = <2> anyway. > > > > > > But it needs #address-cells = <2> only to encode an ID in addition to > > > the address. If this was a single-master IOMMU then there'd be no need > > > for the ID. > > > > Right. But for a single-master IOMMU, there is no need to specify > > any additional data, it could have #address-cells=<0> if we take the > > optimization you suggested. > > Couldn't a single-master IOMMU be windowed? Ah, yes. That would actually be like an IBM pSeries, which has a windowed IOMMU but uses one window per virtual machine. In that case, the window could be a property of the iommu node though, rather than part of the address in the link. > > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f > > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2> > > > > and have an empty dma-ranges property in the PCI host bridge node, > > > > and interpret this as using the same encoding as the PCI BARs in > > > > the ranges property. > > > > > > I'm somewhat confused here, since you said earlier: > > > > > > > After giving the ranges stuff some more thought, I have come to the > > > > conclusion that using #iommu-cells should work fine for almost > > > > all cases, including windowed iommus, because the window is not > > > > actually needed in the device, but only in the iommu, wihch is of course > > > > free to interpret the arguments as addresses. > > > > > > But now you seem to be saying that we should still be using the > > > #address-cells and #size-cells properties in the IOMMU node to determine > > > the length of the specifier. > > > > I probably wasn't clear. I think we can make it work either way, but > > my feeling is that using #address-cells/#size-cells gives us a nicer > > syntax for the more complex cases. > > Okay, so in summary we'd have something like this for simple cases: > > Required properties: > -------------------- > - #address-cells: The number of cells in an IOMMU specifier needed to encode > an address. > - #size-cells: The number of cells in an IOMMU specifier needed to represent > the length of an address range. > > Typical values for the above include: > - #address-cells = <0>, size-cells = <0>: Single master IOMMU devices are not > configurable and therefore no additional information needs to be encoded in > the specifier. This may also apply to multiple master IOMMU devices that do > not allow the association of masters to be configured. > - #address-cells = <1>, size-cells = <0>: Multiple master IOMMU devices may > need to be configured in order to enable translation for a given master. In > such cases the single address cell corresponds to the master device's ID. > - #address-cells = <2>, size-cells = <2>: Some IOMMU devices allow the DMA > window for masters to be configured. The first cell of the address in this > may contain the master device's ID for example, while the second cell could > contain the start of the DMA window for the given device. The length of the > DMA window is specified by two additional cells. > > Examples: > ========= > > Single-master IOMMU: > -------------------- > > iommu { > #address-cells = <0>; > #size-cells = <0>; > }; > > master { > iommus = <&/iommu>; > }; > > Multiple-master IOMMU with fixed associations: > ---------------------------------------------- > > /* multiple-master IOMMU */ > iommu { > /* > * Masters are statically associated with this IOMMU and > * address translation is always enabled. > */ > #iommu-cells = <0>; > }; copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here. > /* static association with IOMMU */ > master@1 { > reg = <1>; > iommus = <&/iommu>; > }; > > /* static association with IOMMU */ > master@2 { > reg = <2>; > iommus = <&/iommu>; > }; > > Multiple-master IOMMU: > ---------------------- > > iommu { > /* the specifier represents the ID of the master */ > #address-cells = <1>; > #size-cells = <0>; > }; > > master { > /* device has master ID 42 in the IOMMU */ > iommus = <&/iommu 42>; > }; > > Multiple-master device: > ----------------------- > > /* single-master IOMMU */ > iommu@1 { > reg = <1>; > #address-cells = <0>; > #size-cells = <0>; > }; > > /* multiple-master IOMMU */ > iommu@2 { > reg = <2>; > #address-cells = <1>; > #size-cells = <0>; > }; > > /* device with two master interfaces */ > master { > iommus = <&/iommu@1>, /* master of the single-master IOMMU */ > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */ > }; > > Multiple-master IOMMU with configurable DMA window: > --------------------------------------------------- > > / { > #address-cells = <1>; > #size-cells = <1>; > > iommu { > /* master ID, address of DMA window */ > #address-cells = <2>; > #size-cells = <2>; > }; > > master { > /* master ID 42, 4 GiB DMA window starting at 0 */ > iommus = <&/iommu 42 0 0x1 0x0>; > }; > }; > > Does that sound about right? Yes, sounds great. I would probably leave out the Multiple-master device from the examples, since that seems to be a rather obscure case. I would like to add an explanation about dma-ranges to the binding: 8<-------- The parent bus of the iommu must have a valid "dma-ranges" property describing how the physical address space of the IOMMU maps into memory. A device with an "iommus" property will ignore the "dma-ranges" property of the parent node and rely on the IOMMU for translation instead. Using an "iommus" property in bus device nodes with "dma-ranges" specifying how child devices relate to the IOMMU is a possible extension but is not recommended until this binding gets extended. ----------->8 Does that make sense to you? We can change what we say about dma-ranges, I mainly want to be clear with what is or is not allowed at this point. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 12:08:44PM +0200, Arnd Bergmann wrote: > On Monday 19 May 2014 22:32:33 Thierry Reding wrote: > > On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote: > > > On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote: > > [...] > > > > My understanding here is mostly based on the OpenFirmware working group > > > > proposal for the dma-ranges property[0]. I'll give another example to > > > > try and clarify how I had imagined this to work: > > > > > > > > / { > > > > #address-cells = <2>; > > > > #size-cells = <2>; > > > > > > > > iommu { > > > > /* > > > > * This is somewhat unusual (or maybe not) in that we > > > > * need 2 cells to represent the size of an address > > > > * space that is 32 bits long. > > > > */ > > > > #address-cells = <1>; > > > > #size-cells = <2>; > > > > > > > > #iommu-cells = <1>; > > > > }; > > > > > > > > master { > > > > iommus = <&/iommu 42>; > > > > > > Comparing this with the other discussion thread, we have a similar > > > concept here, in that the iommu is made a logical parent, however... > > > > > > Firstly, there's an implicit assumption here that the only kind of > > > thing the master could possibly be connected to is an IOMMU, with > > > no non-trivial interconnect in between. I'm not sure this is going > > > to scale to complex SoCs. > > > > Here we go again. We're now in the very same bad spot that we've been in > > so many times before. There are at least two SoCs that we know do not > > require anything fancy, yet we're blocking adding support for those use > > cases because we think that at some point some IOMMU may require more > > than that. But at the same time it seems like we don't have enough data > > about what exactly that is, so we keep speculating. At this rate we're > > making it impossible to get a reasonable feature set supported upstream. > > > > That's very frustrating. > > I agree. While I just commented that I want to think through how this > would look for other IOMMUs, the generic case of all masters that Dave > wants is going overboard with the complexity and we'd be better off I don't want that, and actually I agree with the conclusion "overboard". It was really about exploring the problem space, including things that we can reasonably foresee. Any bindings today should necessarily be much simpler, and that's certainly the correct approach. I've been neglecting this thread (apologies) -- but although I have to think a bit about Thierry's most recent suggestions, I think there's actually reasonable alignment now. > deferring this to whenever someone needs it, which I assume is never. Well, some of it might be. But I'm not saying we should give in to wild speculation (except to get a feel for what we're ruling out, and the consequences of that). > > > > If a range of Stream IDs may be issued (especially from something like > > > a PCIe RC where the stream ID may be a many-bit value), describing > > > the IDs individually may be impractical. > > > > The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has > > a need to represent the stream IDs as a range there's nothing keeping it > > from defining the specifier accordingly. > > If we make the IOMMU address space look like the PCI address space, we > can have a simple representation for this in DT. For the code, I assume > that we will always have to treat PCI and platform devices differently. Can you elaborate on that? Master ID signals in ARM systems and how they are handled are rather under-specified today. Treating the master ID bits as some extra bits in some kind of extended bus address could work for ARM IOMMUs etc., but I didn't have a complete answer yet. Cheers ---Dave -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: [...] > > Couldn't a single-master IOMMU be windowed? > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > IOMMU but uses one window per virtual machine. In that case, the window could > be a property of the iommu node though, rather than part of the address > in the link. Does that mean that the IOMMU has one statically configured window which is the same for each virtual machine? That would require some other mechanism to assign separate address spaces to each virtual machine, wouldn't it? But I suspect that if the IOMMU allows that it could be allocated dynamically at runtime. > > Multiple-master IOMMU with fixed associations: > > ---------------------------------------------- > > > > /* multiple-master IOMMU */ > > iommu { > > /* > > * Masters are statically associated with this IOMMU and > > * address translation is always enabled. > > */ > > #iommu-cells = <0>; > > }; > > copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here. Yes, I obviously wasn't careful when I copied this over. > > Multiple-master device: > > ----------------------- > > > > /* single-master IOMMU */ > > iommu@1 { > > reg = <1>; > > #address-cells = <0>; > > #size-cells = <0>; > > }; > > > > /* multiple-master IOMMU */ > > iommu@2 { > > reg = <2>; > > #address-cells = <1>; > > #size-cells = <0>; > > }; > > > > /* device with two master interfaces */ > > master { > > iommus = <&/iommu@1>, /* master of the single-master IOMMU */ > > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */ > > }; [...] > > Does that sound about right? > > Yes, sounds great. I would probably leave out the Multiple-master device > from the examples, since that seems to be a rather obscure case. Agreed. We can easily add such examples if/when such device start to appear. > I would like to add an explanation about dma-ranges to the binding: > > 8<-------- > The parent bus of the iommu must have a valid "dma-ranges" property > describing how the physical address space of the IOMMU maps into > memory. With physical address space you mean the addresses after translation, not the I/O virtual addresses, right? But even so, how will this work when there are multiple IOMMU devices? What determines which IOMMU is mapped via which entry? Perhaps having multiple IOMMUs implies that there will have to be some partitioning of the parent address space to make sure two IOMMUs don't translate to the same ranges? > A device with an "iommus" property will ignore the "dma-ranges" property > of the parent node and rely on the IOMMU for translation instead. Do we need to consider the case where an IOMMU listed in iommus isn't enabled (status = "disabled")? In that case presumably the device would either not function or may optionally continue to master onto the parent untranslated. > Using an "iommus" property in bus device nodes with "dma-ranges" > specifying how child devices relate to the IOMMU is a possible extension > but is not recommended until this binding gets extended. Just for my understanding, bus device nodes with iommus and dma-ranges properties could be equivalently written by explicitly moving the iommus properties into the child device nodes, right? In which case they should be the same as the other examples. So that concept is a convenience notation to reduce duplication, but doesn't fundamentally introduce any new concept. Thierry
On Tuesday 20 May 2014 14:07:09 Dave Martin wrote: > On Tue, May 20, 2014 at 12:08:44PM +0200, Arnd Bergmann wrote: > > On Monday 19 May 2014 22:32:33 Thierry Reding wrote: > > > On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote: > > > > On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote: > > > [...] > > > > > My understanding here is mostly based on the OpenFirmware working group > > > > > proposal for the dma-ranges property[0]. I'll give another example to > > > > > try and clarify how I had imagined this to work: > > > > > > > > > > / { > > > > > #address-cells = <2>; > > > > > #size-cells = <2>; > > > > > > > > > > iommu { > > > > > /* > > > > > * This is somewhat unusual (or maybe not) in that we > > > > > * need 2 cells to represent the size of an address > > > > > * space that is 32 bits long. > > > > > */ > > > > > #address-cells = <1>; > > > > > #size-cells = <2>; > > > > > > > > > > #iommu-cells = <1>; > > > > > }; > > > > > > > > > > master { > > > > > iommus = <&/iommu 42>; > > > > > > > > Comparing this with the other discussion thread, we have a similar > > > > concept here, in that the iommu is made a logical parent, however... > > > > > > > > Firstly, there's an implicit assumption here that the only kind of > > > > thing the master could possibly be connected to is an IOMMU, with > > > > no non-trivial interconnect in between. I'm not sure this is going > > > > to scale to complex SoCs. > > > > > > Here we go again. We're now in the very same bad spot that we've been in > > > so many times before. There are at least two SoCs that we know do not > > > require anything fancy, yet we're blocking adding support for those use > > > cases because we think that at some point some IOMMU may require more > > > than that. But at the same time it seems like we don't have enough data > > > about what exactly that is, so we keep speculating. At this rate we're > > > making it impossible to get a reasonable feature set supported upstream. > > > > > > That's very frustrating. > > > > I agree. While I just commented that I want to think through how this > > would look for other IOMMUs, the generic case of all masters that Dave > > wants is going overboard with the complexity and we'd be better off > > I don't want that, and actually I agree with the conclusion "overboard". > > It was really about exploring the problem space, including things that > we can reasonably foresee. Any bindings today should necessarily be > much simpler, and that's certainly the correct approach. > > I've been neglecting this thread (apologies) -- but although I have to > think a bit about Thierry's most recent suggestions, I think there's > actually reasonable alignment now. > > > deferring this to whenever someone needs it, which I assume is never. > > Well, some of it might be. But I'm not saying we should give in to > wild speculation (except to get a feel for what we're ruling out, and > the consequences of that). Ok, fair enough. > > > > If a range of Stream IDs may be issued (especially from something like > > > > a PCIe RC where the stream ID may be a many-bit value), describing > > > > the IDs individually may be impractical. > > > > > > The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has > > > a need to represent the stream IDs as a range there's nothing keeping it > > > from defining the specifier accordingly. > > > > If we make the IOMMU address space look like the PCI address space, we > > can have a simple representation for this in DT. For the code, I assume > > that we will always have to treat PCI and platform devices differently. > > Can you elaborate on that? > > Master ID signals in ARM systems and how they are handled are rather > under-specified today. Treating the master ID bits as some extra bits > in some kind of extended bus address could work for ARM IOMMUs etc., > but I didn't have a complete answer yet. The PCI binding at http://www.openfirmware.org/1275/bindings/pci/pci2_1.pdf defines a 96-bit address space for MMIO registers and other things, so it can uniquely identify not just an address on the bus, but also any standard resource as seen by the device, quoting from there: 2.2.1.1. Numerical Representation (The Numerical Representation of an address is the format that Open Firmware uses for storing an address within a property value and on the stack, as an argument to a package method.) The numerical representation of a PCI address consists of three cells, encoded as follows. For this purpose, the least-significant 32 bits of a cell is used; if the cell size is larger than 32 bits, any additional high-order bits are zero. Bit# 0 refers to the least-significant bit. Bit# 33222222 22221111 11111100 00000000 10987654 32109876 54321098 76543210 phys.hi cell: npt000ss bbbbbbbb dddddfff rrrrrrrr phys.mid cell: hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh phys.lo cell: llllllll llllllll llllllll llllllll where: n is 0 if the address is relocatable, 1 otherwise p is 1 if the addressable region is "prefetchable", 0 otherwise t is 1 if the address is aliased (for non-relocatable I/O), below 1 MB (for Memory), or below 64 KB (for relocatable I/O). ss is the space code, denoting the address space bbbbbbbb is the 8-bit Bus Number ddddd is the 5-bit Device Number fff is the 3-bit Function Number rrrrrrrr is the 8-bit Register Number hh...hh is a 32-bit unsigned number ll...ll is a 32-bit unsigned number We can ignore n, p, t and r here, and use the same format for a DMA address, then define an empty "dma-ranges" property. That would imply that using b/d/f is sufficient to identify each master at the iommu. Any device outside of the PCI host but connected to the same iommu can use the same notation to list the logical b/d/f that gets sent to the IOMMU in bus master transactions. Do you think this is sufficient for the ARM SMMU, or do we need something beyond that? Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote: > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > [...] > > > Couldn't a single-master IOMMU be windowed? > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > > IOMMU but uses one window per virtual machine. In that case, the window could > > be a property of the iommu node though, rather than part of the address > > in the link. > > Does that mean that the IOMMU has one statically configured window which > is the same for each virtual machine? That would require some other > mechanism to assign separate address spaces to each virtual machine, > wouldn't it? But I suspect that if the IOMMU allows that it could be > allocated dynamically at runtime. The way it works on pSeries is that upon VM creation, the guest is assigned one 256MB window for use by assigned DMA capable devices. When the guest creates a mapping, it uses a hypercall to associate a bus address in that range with a guest physical address. The hypervisor checks that the bus address is within the allowed range, and translates the guest physical address into a host physical address, then enters both into the I/O page table or I/O TLB. > > I would like to add an explanation about dma-ranges to the binding: > > > > 8<-------- > > The parent bus of the iommu must have a valid "dma-ranges" property > > describing how the physical address space of the IOMMU maps into > > memory. > > With physical address space you mean the addresses after translation, > not the I/O virtual addresses, right? But even so, how will this work > when there are multiple IOMMU devices? What determines which IOMMU is > mapped via which entry? > > Perhaps having multiple IOMMUs implies that there will have to be some > partitioning of the parent address space to make sure two IOMMUs don't > translate to the same ranges? These dma-ranges properties would almost always be for the entire RAM, and we can treat anything else as a bug. The mapping between what goes into the IOMMU and what comes out of it is not reflected in DT at all, since it only happens at runtime. The dma-ranges property I mean above describes how what comes out of the IOMMU maps into physical memory. > > A device with an "iommus" property will ignore the "dma-ranges" property > > of the parent node and rely on the IOMMU for translation instead. > > Do we need to consider the case where an IOMMU listed in iommus isn't > enabled (status = "disabled")? In that case presumably the device would > either not function or may optionally continue to master onto the parent > untranslated. My reasoning was that the DT should specify whether we use the IOMMU or not. Being able to just switch on or off the IOMMU sounds nice as well, so we could change the text above to do that. Another option would be to do this in the IOMMU code, basically falling back to the IOMMU parent's dma-ranges property and using linear dma_map_ops when that is disabled. > > Using an "iommus" property in bus device nodes with "dma-ranges" > > specifying how child devices relate to the IOMMU is a possible extension > > but is not recommended until this binding gets extended. > > Just for my understanding, bus device nodes with iommus and dma-ranges > properties could be equivalently written by explicitly moving the iommus > properties into the child device nodes, right? In which case they should > be the same as the other examples. So that concept is a convenience > notation to reduce duplication, but doesn't fundamentally introduce any > new concept. The one case where that doesn't work is PCI, because we don't list the PCI devices in DT normally, and the iommus property would only exist at the PCI host controller node. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote: > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote: > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > [...] > > > > Couldn't a single-master IOMMU be windowed? > > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > > > IOMMU but uses one window per virtual machine. In that case, the window could > > > be a property of the iommu node though, rather than part of the address > > > in the link. > > > > Does that mean that the IOMMU has one statically configured window which > > is the same for each virtual machine? That would require some other > > mechanism to assign separate address spaces to each virtual machine, > > wouldn't it? But I suspect that if the IOMMU allows that it could be > > allocated dynamically at runtime. > > The way it works on pSeries is that upon VM creation, the guest is assigned > one 256MB window for use by assigned DMA capable devices. When the guest > creates a mapping, it uses a hypercall to associate a bus address in that > range with a guest physical address. The hypervisor checks that the bus > address is within the allowed range, and translates the guest physical > address into a host physical address, then enters both into the I/O page > table or I/O TLB. So when a VM is booted it is passed a device tree with that DMA window? Given what you describe above this seems to be more of a configuration option to restrict the IOMMU to a subset of the physical memory for purposes of virtualization. So I agree that this wouldn't be a good fit for what we're trying to achieve with iommus or dma-ranges in this binding. > > > I would like to add an explanation about dma-ranges to the binding: > > > > > > 8<-------- > > > The parent bus of the iommu must have a valid "dma-ranges" property > > > describing how the physical address space of the IOMMU maps into > > > memory. > > > > With physical address space you mean the addresses after translation, > > not the I/O virtual addresses, right? But even so, how will this work > > when there are multiple IOMMU devices? What determines which IOMMU is > > mapped via which entry? > > > > Perhaps having multiple IOMMUs implies that there will have to be some > > partitioning of the parent address space to make sure two IOMMUs don't > > translate to the same ranges? > > These dma-ranges properties would almost always be for the entire RAM, > and we can treat anything else as a bug. Would it typically be a 1:1 mapping? In that case could we define an empty dma-ranges property to mean exactly that? That would make it consistent with the ranges property. > The mapping between what goes into the IOMMU and what comes out of it > is not reflected in DT at all, since it only happens at runtime. > The dma-ranges property I mean above describes how what comes out of > the IOMMU maps into physical memory. Understood. That makes sense. > > > A device with an "iommus" property will ignore the "dma-ranges" property > > > of the parent node and rely on the IOMMU for translation instead. > > > > Do we need to consider the case where an IOMMU listed in iommus isn't > > enabled (status = "disabled")? In that case presumably the device would > > either not function or may optionally continue to master onto the parent > > untranslated. > > My reasoning was that the DT should specify whether we use the IOMMU > or not. Being able to just switch on or off the IOMMU sounds nice as > well, so we could change the text above to do that. > > Another option would be to do this in the IOMMU code, basically > falling back to the IOMMU parent's dma-ranges property and using > linear dma_map_ops when that is disabled. Yes, it should be trivial for the IOMMU core code to take care of this special case. Still I think it's worth mentioning it in the binding so that it's clearly specified. > > > Using an "iommus" property in bus device nodes with "dma-ranges" > > > specifying how child devices relate to the IOMMU is a possible extension > > > but is not recommended until this binding gets extended. > > > > Just for my understanding, bus device nodes with iommus and dma-ranges > > properties could be equivalently written by explicitly moving the iommus > > properties into the child device nodes, right? In which case they should > > be the same as the other examples. So that concept is a convenience > > notation to reduce duplication, but doesn't fundamentally introduce any > > new concept. > > The one case where that doesn't work is PCI, because we don't list the > PCI devices in DT normally, and the iommus property would only exist > at the PCI host controller node. But it could work in classic OpenFirmware where the device tree can be populated with the tree of PCI devices enumerated by OF. There are also devices that have a fixed configuration and where technically the PCI devices can be listed in the device tree. This is somewhat important if for example one PCI device is a GPIO controller and needs to be referenced by phandle from some other device. I'll make a note in the binding document about this possible future extension. Thierry
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote: > > > On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote: > > > > On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote: > > > > > On Monday 19 May 2014 22:59:46 Thierry Reding wrote: > > > > > > On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote: > > [...] > > > > > > > You should never need #size-cells > #address-cells > > > > > > > > > > > > That was always my impression as well. But how then do you represent the > > > > > > full 4 GiB address space in a 32-bit system? It starts at 0 and ends at > > > > > > 4 GiB - 1, which makes it 4 GiB large. That's: > > > > > > > > > > > > <0 1 0> > > > > > > > > > > > > With #address-cells = <1> and #size-cells = <1> the best you can do is: > > > > > > > > > > > > <0 0xffffffff> > > > > > > > > > > > > but that's not accurate. > > > > > > > > > > I think we've done both in the past, either extended #size-cells or > > > > > taken 0xffffffff as a special token. Note that in your example, > > > > > the iommu actually needs #address-cells = <2> anyway. > > > > > > > > But it needs #address-cells = <2> only to encode an ID in addition to > > > > the address. If this was a single-master IOMMU then there'd be no need > > > > for the ID. > > > > > > Right. But for a single-master IOMMU, there is no need to specify > > > any additional data, it could have #address-cells=<0> if we take the > > > optimization you suggested. > > > > Couldn't a single-master IOMMU be windowed? > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > IOMMU but uses one window per virtual machine. In that case, the window could > be a property of the iommu node though, rather than part of the address > in the link. > > > > > > The main advantage I think would be for IOMMUs that use the PCI b/d/f > > > > > numbers as IDs. These can have #address-cells=<3>, #size-cells=<2> > > > > > and have an empty dma-ranges property in the PCI host bridge node, > > > > > and interpret this as using the same encoding as the PCI BARs in > > > > > the ranges property. > > > > > > > > I'm somewhat confused here, since you said earlier: > > > > > > > > > After giving the ranges stuff some more thought, I have come to the > > > > > conclusion that using #iommu-cells should work fine for almost > > > > > all cases, including windowed iommus, because the window is not > > > > > actually needed in the device, but only in the iommu, wihch is of course > > > > > free to interpret the arguments as addresses. > > > > > > > > But now you seem to be saying that we should still be using the > > > > #address-cells and #size-cells properties in the IOMMU node to determine > > > > the length of the specifier. > > > > > > I probably wasn't clear. I think we can make it work either way, but > > > my feeling is that using #address-cells/#size-cells gives us a nicer > > > syntax for the more complex cases. > > > > Okay, so in summary we'd have something like this for simple cases: > > > > Required properties: > > -------------------- > > - #address-cells: The number of cells in an IOMMU specifier needed to encode > > an address. > > - #size-cells: The number of cells in an IOMMU specifier needed to represent > > the length of an address range. > > > > Typical values for the above include: > > - #address-cells = <0>, size-cells = <0>: Single master IOMMU devices are not > > configurable and therefore no additional information needs to be encoded in > > the specifier. This may also apply to multiple master IOMMU devices that do > > not allow the association of masters to be configured. > > - #address-cells = <1>, size-cells = <0>: Multiple master IOMMU devices may > > need to be configured in order to enable translation for a given master. In > > such cases the single address cell corresponds to the master device's ID. > > - #address-cells = <2>, size-cells = <2>: Some IOMMU devices allow the DMA > > window for masters to be configured. The first cell of the address in this > > may contain the master device's ID for example, while the second cell could > > contain the start of the DMA window for the given device. The length of the > > DMA window is specified by two additional cells. I was trying to figure out how to describe the different kinds of transformation we could have on the address/ID input to the IOMMU. Treating the whole thing as opaque gets us off the hook there. IDs are probably not propagated, not remapped, or we simply don't care about them; whereas the address transformation is software-controlled, so we don't describe that anyway. Delegating grokking the mapping to the iommu driver makes sense -- it's what it's there for, after all. I'm not sure whether the windowed IOMMU case is special actually. Since the address to program into the master is found by calling the IOMMU driver to create some mappings, does anything except the IOMMU driver need to understand that there is windowing? > > > > Examples: > > ========= > > > > Single-master IOMMU: > > -------------------- > > > > iommu { > > #address-cells = <0>; > > #size-cells = <0>; > > }; > > > > master { > > iommus = <&/iommu>; > > }; > > > > Multiple-master IOMMU with fixed associations: > > ---------------------------------------------- > > > > /* multiple-master IOMMU */ > > iommu { > > /* > > * Masters are statically associated with this IOMMU and > > * address translation is always enabled. > > */ > > #iommu-cells = <0>; > > }; > > copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here. > > > /* static association with IOMMU */ > > master@1 { > > reg = <1>; Just for clarification, "reg" just has its standard meaning here, and is nothing to do with the IOMMU? > > iommus = <&/iommu>; In effect, "iommus" is doing the same thing as my "slaves" property. The way #address-cells and #size-cells determine the address and range size for mastering into the IOMMU is also similar. The main difference is that I didn't build the ID into the address. > > }; > > > > /* static association with IOMMU */ > > master@2 { > > reg = <2>; > > iommus = <&/iommu>; > > }; > > > > Multiple-master IOMMU: > > ---------------------- > > > > iommu { > > /* the specifier represents the ID of the master */ > > #address-cells = <1>; > > #size-cells = <0>; How do we know the size of the input address to the IOMMU? Do we get cases for example where the IOMMU only accepts a 32-bit input address, but some 64-bit capable masters are connected through it? The size of the output address from the IOMMU will be determined by its own mastering destination, which by default in ePAPR is the IOMMU node's parent. I think that's what you intended, and what we expect in this case. For determining dma masks, it is the output address that it important. Santosh's code can probably be taught to handle this, if given an additional traversal rule for following "iommus" properties. However, deploying an IOMMU whose output address size is smaller than the > > }; > > > > master { > > /* device has master ID 42 in the IOMMU */ > > iommus = <&/iommu 42>; > > }; > > > > Multiple-master device: > > ----------------------- > > > > /* single-master IOMMU */ > > iommu@1 { > > reg = <1>; > > #address-cells = <0>; > > #size-cells = <0>; > > }; > > > > /* multiple-master IOMMU */ > > iommu@2 { > > reg = <2>; > > #address-cells = <1>; > > #size-cells = <0>; > > }; > > > > /* device with two master interfaces */ > > master { > > iommus = <&/iommu@1>, /* master of the single-master IOMMU */ > > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */ > > }; > > > > Multiple-master IOMMU with configurable DMA window: > > --------------------------------------------------- > > > > / { > > #address-cells = <1>; > > #size-cells = <1>; > > > > iommu { > > /* master ID, address of DMA window */ > > #address-cells = <2>; > > #size-cells = <2>; > > }; > > > > master { > > /* master ID 42, 4 GiB DMA window starting at 0 */ > > iommus = <&/iommu 42 0 0x1 0x0>; > > }; > > }; > > > > Does that sound about right? > > Yes, sounds great. I would probably leave out the Multiple-master device > from the examples, since that seems to be a rather obscure case. I think multi-master is the common case. > > I would like to add an explanation about dma-ranges to the binding: > > 8<-------- > The parent bus of the iommu must have a valid "dma-ranges" property > describing how the physical address space of the IOMMU maps into > memory. > A device with an "iommus" property will ignore the "dma-ranges" property > of the parent node and rely on the IOMMU for translation instead. > Using an "iommus" property in bus device nodes with "dma-ranges" > specifying how child devices relate to the IOMMU is a possible extension > but is not recommended until this binding gets extended. This sounds just right. The required semantics is that the presence of "iommus" on some bus mastering device overrides the ePAPR default destination so that transactions are delivered to the IOMMU for translation instead of the master's DT parent node. Where transactions flow out of the IOMMU, the iommu takes on the role of the master, so the default destination would be the iommu node's parent. > ----------->8 > > Does that make sense to you? We can change what we say about > dma-ranges, I mainly want to be clear with what is or is not > allowed at this point. I think it would be inconsistent and unnecessary to disallow it in the binding. The meaning you've proposed seems completely consistent with ePAPR, so I suggest to keep it. The IOMMU is just another bus master from the ePAPR point of view -- no need to make special rules for it unless they are useful. The binding does not need to be (and generally shouldn't be) a description of precisely what the kernel does and does not support. However, if we don't need to support non-identity dma-ranges in Linux yet, we have the option to barf if we see such a dma-ranges memorywards of an IOMMU, if it simplifies the Linux implementation. We could always relax that later -- and it'll be obvious how to describe that situation in DT. What I would like to see is a recommandation, based on Thierry's binding here, for describing how cross-mastering in general is described. It's not really a binding, but more of a template for bindings. I'm happy to have a go at writing it, then we can decide whether it's useful or not. There are a few things from the discussion that are *not* solved by this iommu binding, but they seem reasonable. The binding also doesn't block solving those things later if/when needed: 1) Cross-mastering to things that are not IOMMUs We might need to solve this later if we encounter SoCs with problematic topologies, we shouldn't worry about it for the time being. We'll to revisit it for GICv3 but that's a separate topic. 2) Describing address and ID remappings for cross-mastering. We can describe this in a way that is consistent with this IOMMU binding. We will need to describe something for GICv3, but the common case will be that IDs are just passed through without remapping. We don't need to clarify how IDs are propagated until we have something in DT for IDs to propagate to. Cheers ---Dave -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 02:23:47PM +0100, Arnd Bergmann wrote: > Bit# 33222222 22221111 11111100 00000000 > 10987654 32109876 54321098 76543210 > phys.hi cell: npt000ss bbbbbbbb dddddfff rrrrrrrr > phys.mid cell: hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh > phys.lo cell: llllllll llllllll llllllll llllllll > > where: > n is 0 if the address is relocatable, 1 otherwise > p is 1 if the addressable region is "prefetchable", 0 otherwise > t is 1 if the address is aliased (for non-relocatable I/O), > below 1 MB (for Memory), or below 64 KB (for relocatable I/O). > ss is the space code, denoting the address space > bbbbbbbb is the 8-bit Bus Number > ddddd is the 5-bit Device Number > fff is the 3-bit Function Number > rrrrrrrr is the 8-bit Register Number > hh...hh is a 32-bit unsigned number > ll...ll is a 32-bit unsigned number > > We can ignore n, p, t and r here, and use the same format for a DMA > address, then define an empty "dma-ranges" property. That would > imply that using b/d/f is sufficient to identify each master at the > iommu. Any device outside of the PCI host but connected to the same > iommu can use the same notation to list the logical b/d/f that gets > sent to the IOMMU in bus master transactions. > > Do you think this is sufficient for the ARM SMMU, or do we need > something beyond that? I think it can define the common-cases for the existing implementations, yes. I anticipate Stream-IDs becoming > 16-bit in the near future though, so we'd need extra bits if we're describing other devices coming into the SMMU. Note that we already have a binding for the current SMMU driver, so I'm not really in a position to shift over to a new binding until the next version of the SMMU architecture comes along... Will -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 04:26:59PM +0100, Will Deacon wrote: > On Tue, May 20, 2014 at 02:23:47PM +0100, Arnd Bergmann wrote: > > Bit# 33222222 22221111 11111100 00000000 > > 10987654 32109876 54321098 76543210 > > phys.hi cell: npt000ss bbbbbbbb dddddfff rrrrrrrr > > phys.mid cell: hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh > > phys.lo cell: llllllll llllllll llllllll llllllll > > > > where: > > n is 0 if the address is relocatable, 1 otherwise > > p is 1 if the addressable region is "prefetchable", 0 otherwise > > t is 1 if the address is aliased (for non-relocatable I/O), > > below 1 MB (for Memory), or below 64 KB (for relocatable I/O). > > ss is the space code, denoting the address space > > bbbbbbbb is the 8-bit Bus Number > > ddddd is the 5-bit Device Number > > fff is the 3-bit Function Number > > rrrrrrrr is the 8-bit Register Number > > hh...hh is a 32-bit unsigned number > > ll...ll is a 32-bit unsigned number > > > > We can ignore n, p, t and r here, and use the same format for a DMA > > address, then define an empty "dma-ranges" property. That would > > imply that using b/d/f is sufficient to identify each master at the > > iommu. Any device outside of the PCI host but connected to the same > > iommu can use the same notation to list the logical b/d/f that gets > > sent to the IOMMU in bus master transactions. > > > > Do you think this is sufficient for the ARM SMMU, or do we need > > something beyond that? > > I think it can define the common-cases for the existing implementations, > yes. I anticipate Stream-IDs becoming > 16-bit in the near future though, > so we'd need extra bits if we're describing other devices coming into the > SMMU. > > Note that we already have a binding for the current SMMU driver, so I'm not > really in a position to shift over to a new binding until the next version of > the SMMU architecture comes along... How much code relies on the meaning of the nptsbdf bits? Following the general model of concatenating the master ID with the address bits and trasting that as a single input to the IOMMU seems reasonable -- maybe we either don't care about the attribute bits and can replace them with master ID bits, or instead we could extend with more master ID bits at bit 96 and up. npts looks like memory type information. What memory type information we need to describe for coherent masters on ARM is a bit of an open question. Perhaps we can come up with a good ARM-specific interpretation of these bits, but I wonder whether they are enough. If there is a real PCI bus in the system of course, then we are fine once we reach it. The only problem is for the stuff on AMBA buses etc. In theory, we could have to describe memory types, cacheability attributes and coherency domains, but we may be able to squeeze the possibilities down to a smaller set of sane memory types. Cheers ---Dave -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 20 May 2014 16:24:59 Dave Martin wrote: > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > > On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote: > > > Typical values for the above include: > > > - #address-cells = <0>, size-cells = <0>: Single master IOMMU devices are not > > > configurable and therefore no additional information needs to be encoded in > > > the specifier. This may also apply to multiple master IOMMU devices that do > > > not allow the association of masters to be configured. > > > - #address-cells = <1>, size-cells = <0>: Multiple master IOMMU devices may > > > need to be configured in order to enable translation for a given master. In > > > such cases the single address cell corresponds to the master device's ID. > > > - #address-cells = <2>, size-cells = <2>: Some IOMMU devices allow the DMA > > > window for masters to be configured. The first cell of the address in this > > > may contain the master device's ID for example, while the second cell could > > > contain the start of the DMA window for the given device. The length of the > > > DMA window is specified by two additional cells. > > I was trying to figure out how to describe the different kinds of > transformation we could have on the address/ID input to the IOMMU. > Treating the whole thing as opaque gets us off the hook there. > > IDs are probably not propagated, not remapped, or we simply don't care > about them; whereas the address transformation is software-controlled, > so we don't describe that anyway. > > Delegating grokking the mapping to the iommu driver makes sense -- > it's what it's there for, after all. > > > I'm not sure whether the windowed IOMMU case is special actually. > > Since the address to program into the master is found by calling the > IOMMU driver to create some mappings, does anything except the IOMMU > driver need to understand that there is windowing? No. I tried to explain that earlier today, and in my earlier mails I hadn't thought that part through. Only the IOMMU driver needs to care about the window. > > > > > > Examples: > > > ========= > > > > > > Single-master IOMMU: > > > -------------------- > > > > > > iommu { > > > #address-cells = <0>; > > > #size-cells = <0>; > > > }; > > > > > > master { > > > iommus = <&/iommu>; > > > }; > > > > > > Multiple-master IOMMU with fixed associations: > > > ---------------------------------------------- > > > > > > /* multiple-master IOMMU */ > > > iommu { > > > /* > > > * Masters are statically associated with this IOMMU and > > > * address translation is always enabled. > > > */ > > > #iommu-cells = <0>; > > > }; > > > > copied wrong? I guess you mean #address-cells=<0>/#size-cells=<0> here. > > > > > /* static association with IOMMU */ > > > master@1 { > > > reg = <1>; > > Just for clarification, "reg" just has its standard meaning here, and > is nothing to do with the IOMMU? correct > > > iommus = <&/iommu>; > > In effect, "iommus" is doing the same thing as my "slaves" property. > > The way #address-cells and #size-cells determine the address and range > size for mastering into the IOMMU is also similar. The main difference > is that I didn't build the ID into the address. Right. I think the difference is more about what we want to call things: Calling it iommu means we want to specifically describe the case of iommus that needs to get handled by all OSs in a particular way, while the more generic slave connection doesn't correspond to a specific concept in the OS. > > > }; > > > > > > /* static association with IOMMU */ > > > master@2 { > > > reg = <2>; > > > iommus = <&/iommu>; > > > }; > > > > > > Multiple-master IOMMU: > > > ---------------------- > > > > > > iommu { > > > /* the specifier represents the ID of the master */ > > > #address-cells = <1>; > > > #size-cells = <0>; > > How do we know the size of the input address to the IOMMU? Do we > get cases for example where the IOMMU only accepts a 32-bit input > address, but some 64-bit capable masters are connected through it? I was stuck on this question for a while before, but then I realized that it doesn't matter at all: It's the IOMMU driver itself that manages the address space, and it doesn't matter if a slave can address a larger range than the IOMMU can accept. If the IOMMU needs to deal with the opposite case (64-bit input addresses but a 32-bit master), that limitation can be put into the specifier. > The size of the output address from the IOMMU will be determined > by its own mastering destination, which by default in ePAPR is the > IOMMU node's parent. I think that's what you intended, and what we > expect in this case. Rihgt. > For determining dma masks, it is the output address that it > important. Santosh's code can probably be taught to handle this, > if given an additional traversal rule for following "iommus" > properties. However, deploying an IOMMU whose output address size > is smaller than the Something seems to be missing here. I don't think we want to handle the case where the IOMMU output cannot the entire memory address space. If necessary, that would mean using both an IOMMU driver and swiotlb, but I think it's a reasonable assumption that hardware isn't /that/ crazy. > > > Multiple-master device: > > > ----------------------- > > > > > > /* single-master IOMMU */ > > > iommu@1 { > > > reg = <1>; > > > #address-cells = <0>; > > > #size-cells = <0>; > > > }; > > > > > > /* multiple-master IOMMU */ > > > iommu@2 { > > > reg = <2>; > > > #address-cells = <1>; > > > #size-cells = <0>; > > > }; > > > > > > /* device with two master interfaces */ > > > master { > > > iommus = <&/iommu@1>, /* master of the single-master IOMMU */ > > > <&/iommu@2 42>; /* ID 42 in multiple-master IOMMU */ > > > }; > > > > > > Multiple-master IOMMU with configurable DMA window: > > > --------------------------------------------------- > > > > > > / { > > > #address-cells = <1>; > > > #size-cells = <1>; > > > > > > iommu { > > > /* master ID, address of DMA window */ > > > #address-cells = <2>; > > > #size-cells = <2>; > > > }; > > > > > > master { > > > /* master ID 42, 4 GiB DMA window starting at 0 */ > > > iommus = <&/iommu 42 0 0x1 0x0>; > > > }; > > > }; > > > > > > Does that sound about right? > > > > Yes, sounds great. I would probably leave out the Multiple-master device > > from the examples, since that seems to be a rather obscure case. > > I think multi-master is the common case. Which of the two cases above do you mean? I was referring to the first as being obscure, not the second. I still haven't seen an example of the first, while the second one is very common. > > ----------->8 > > > > Does that make sense to you? We can change what we say about > > dma-ranges, I mainly want to be clear with what is or is not > > allowed at this point. > > I think it would be inconsistent and unnecessary to disallow it in the > binding. The meaning you've proposed seems completely consistent with > ePAPR, so I suggest to keep it. The IOMMU is just another bus master > from the ePAPR point of view -- no need to make special rules for it > unless they are useful. > > The binding does not need to be (and generally shouldn't be) a > description of precisely what the kernel does and does not support. > > However, if we don't need to support non-identity dma-ranges in Linux > yet, we have the option to barf if we see such a dma-ranges memorywards > of an IOMMU, if it simplifies the Linux implementation. We could always > relax that later -- and it'll be obvious how to describe that situation > in DT. Ok. > What I would like to see is a recommandation, based on Thierry's binding > here, for describing how cross-mastering in general is described. It's > not really a binding, but more of a template for bindings. > > I'm happy to have a go at writing it, then we can decide whether it's > useful or not. I don't mind if you take this on, but I'm not sure if that should be part of this binding or not. Let's see what you come up with. > There are a few things from the discussion that are *not* solved by this > iommu binding, but they seem reasonable. The binding also doesn't block > solving those things later if/when needed: > > 1) Cross-mastering to things that are not IOMMUs > > We might need to solve this later if we encounter SoCs with > problematic topologies, we shouldn't worry about it for the time > being. > > We'll to revisit it for GICv3 but that's a separate topic. > 2) Describing address and ID remappings for cross-mastering. > > We can describe this in a way that is consistent with this IOMMU > binding. We will need to describe something for GICv3, but the > common case will be that IDs are just passed through without > remapping. > > We don't need to clarify how IDs are propagated until we have > something in DT for IDs to propagate to. Ok, thanks for pointing these out. I had forgotten about the MSI case, but it seems ok to defer that part for now. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote: > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote: > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote: > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > > [...] > > > > > Couldn't a single-master IOMMU be windowed? > > > > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > > > > IOMMU but uses one window per virtual machine. In that case, the window could > > > > be a property of the iommu node though, rather than part of the address > > > > in the link. > > > > > > Does that mean that the IOMMU has one statically configured window which > > > is the same for each virtual machine? That would require some other > > > mechanism to assign separate address spaces to each virtual machine, > > > wouldn't it? But I suspect that if the IOMMU allows that it could be > > > allocated dynamically at runtime. > > > > The way it works on pSeries is that upon VM creation, the guest is assigned > > one 256MB window for use by assigned DMA capable devices. When the guest > > creates a mapping, it uses a hypercall to associate a bus address in that > > range with a guest physical address. The hypervisor checks that the bus > > address is within the allowed range, and translates the guest physical > > address into a host physical address, then enters both into the I/O page > > table or I/O TLB. > > So when a VM is booted it is passed a device tree with that DMA window? Correct. > Given what you describe above this seems to be more of a configuration > option to restrict the IOMMU to a subset of the physical memory for > purposes of virtualization. So I agree that this wouldn't be a good fit > for what we're trying to achieve with iommus or dma-ranges in this > binding. Thinking about it again now, I wonder if there are any other use cases for windowed IOMMUs. If this is the only one, there might be no use in the #address-cells model I suggested instead of your original #iommu-cells. > > > > I would like to add an explanation about dma-ranges to the binding: > > > > > > > > 8<-------- > > > > The parent bus of the iommu must have a valid "dma-ranges" property > > > > describing how the physical address space of the IOMMU maps into > > > > memory. > > > > > > With physical address space you mean the addresses after translation, > > > not the I/O virtual addresses, right? But even so, how will this work > > > when there are multiple IOMMU devices? What determines which IOMMU is > > > mapped via which entry? > > > > > > Perhaps having multiple IOMMUs implies that there will have to be some > > > partitioning of the parent address space to make sure two IOMMUs don't > > > translate to the same ranges? > > > > These dma-ranges properties would almost always be for the entire RAM, > > and we can treat anything else as a bug. > > Would it typically be a 1:1 mapping? In that case could we define an > empty dma-ranges property to mean exactly that? That would make it > consistent with the ranges property. Yes, I believe that is how it's already defined. > > > > A device with an "iommus" property will ignore the "dma-ranges" property > > > > of the parent node and rely on the IOMMU for translation instead. > > > > > > Do we need to consider the case where an IOMMU listed in iommus isn't > > > enabled (status = "disabled")? In that case presumably the device would > > > either not function or may optionally continue to master onto the parent > > > untranslated. > > > > My reasoning was that the DT should specify whether we use the IOMMU > > or not. Being able to just switch on or off the IOMMU sounds nice as > > well, so we could change the text above to do that. > > > > Another option would be to do this in the IOMMU code, basically > > falling back to the IOMMU parent's dma-ranges property and using > > linear dma_map_ops when that is disabled. > > Yes, it should be trivial for the IOMMU core code to take care of this > special case. Still I think it's worth mentioning it in the binding so > that it's clearly specified. Agreed. > > > > Using an "iommus" property in bus device nodes with "dma-ranges" > > > > specifying how child devices relate to the IOMMU is a possible extension > > > > but is not recommended until this binding gets extended. > > > > > > Just for my understanding, bus device nodes with iommus and dma-ranges > > > properties could be equivalently written by explicitly moving the iommus > > > properties into the child device nodes, right? In which case they should > > > be the same as the other examples. So that concept is a convenience > > > notation to reduce duplication, but doesn't fundamentally introduce any > > > new concept. > > > > The one case where that doesn't work is PCI, because we don't list the > > PCI devices in DT normally, and the iommus property would only exist > > at the PCI host controller node. > > But it could work in classic OpenFirmware where the device tree can be > populated with the tree of PCI devices enumerated by OF. There are also > devices that have a fixed configuration and where technically the PCI > devices can be listed in the device tree. This is somewhat important if > for example one PCI device is a GPIO controller and needs to be > referenced by phandle from some other device. Correct. The flaw of classic Open Firmware here was that it cannot handle PCIe hotplug though, so we can never rely on the DT to describe all devices. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 20 May 2014 17:39:12 Dave Martin wrote: > On Tue, May 20, 2014 at 04:26:59PM +0100, Will Deacon wrote: > > On Tue, May 20, 2014 at 02:23:47PM +0100, Arnd Bergmann wrote: > > > Bit# 33222222 22221111 11111100 00000000 > > > 10987654 32109876 54321098 76543210 > > > phys.hi cell: npt000ss bbbbbbbb dddddfff rrrrrrrr > > > phys.mid cell: hhhhhhhh hhhhhhhh hhhhhhhh hhhhhhhh > > > phys.lo cell: llllllll llllllll llllllll llllllll > > > > > > where: > > > n is 0 if the address is relocatable, 1 otherwise > > > p is 1 if the addressable region is "prefetchable", 0 otherwise > > > t is 1 if the address is aliased (for non-relocatable I/O), > > > below 1 MB (for Memory), or below 64 KB (for relocatable I/O). > > > ss is the space code, denoting the address space > > > bbbbbbbb is the 8-bit Bus Number > > > ddddd is the 5-bit Device Number > > > fff is the 3-bit Function Number > > > rrrrrrrr is the 8-bit Register Number > > > hh...hh is a 32-bit unsigned number > > > ll...ll is a 32-bit unsigned number > > > > > > We can ignore n, p, t and r here, and use the same format for a DMA > > > address, then define an empty "dma-ranges" property. That would > > > imply that using b/d/f is sufficient to identify each master at the > > > iommu. Any device outside of the PCI host but connected to the same > > > iommu can use the same notation to list the logical b/d/f that gets > > > sent to the IOMMU in bus master transactions. > > > > > > Do you think this is sufficient for the ARM SMMU, or do we need > > > something beyond that? > > > > I think it can define the common-cases for the existing implementations, > > yes. I anticipate Stream-IDs becoming > 16-bit in the near future though, > > so we'd need extra bits if we're describing other devices coming into the > > SMMU. > > > > Note that we already have a binding for the current SMMU driver, so I'm not > > really in a position to shift over to a new binding until the next version of > > the SMMU architecture comes along... > > How much code relies on the meaning of the nptsbdf bits? Not much at all, I think this was defined mostly for open firmware client interfaces, which we don't use with FDT. n, t and r are probably only used in PCI functions that are listed in DT, not in the host bridge. b/d/f I think is mostly used for the interrupt-maps property, when describing hardwired interrupts. p and s are used when setting up the translation for inbound MMIO and PIO. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote: > On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote: > > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote: > > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote: > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > > > [...] > > > > > > Couldn't a single-master IOMMU be windowed? > > > > > > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > > > > > IOMMU but uses one window per virtual machine. In that case, the window could > > > > > be a property of the iommu node though, rather than part of the address > > > > > in the link. > > > > > > > > Does that mean that the IOMMU has one statically configured window which > > > > is the same for each virtual machine? That would require some other > > > > mechanism to assign separate address spaces to each virtual machine, > > > > wouldn't it? But I suspect that if the IOMMU allows that it could be > > > > allocated dynamically at runtime. > > > > > > The way it works on pSeries is that upon VM creation, the guest is assigned > > > one 256MB window for use by assigned DMA capable devices. When the guest > > > creates a mapping, it uses a hypercall to associate a bus address in that > > > range with a guest physical address. The hypervisor checks that the bus > > > address is within the allowed range, and translates the guest physical > > > address into a host physical address, then enters both into the I/O page > > > table or I/O TLB. > > > > So when a VM is booted it is passed a device tree with that DMA window? > > Correct. > > > Given what you describe above this seems to be more of a configuration > > option to restrict the IOMMU to a subset of the physical memory for > > purposes of virtualization. So I agree that this wouldn't be a good fit > > for what we're trying to achieve with iommus or dma-ranges in this > > binding. > > Thinking about it again now, I wonder if there are any other use cases > for windowed IOMMUs. If this is the only one, there might be no use > in the #address-cells model I suggested instead of your original > #iommu-cells. So in this case virtualization is the reason why we need the DMA window. The reason for that is that the guest has no other way of knowing what other guests might be using, so it's essentially a mechanism for the host to manage the DMA region and allocate subregions for each guest. If virtualization isn't an issue then it seems to me that the need for DMA windows goes away because the operating system will track DMA regions anyway. The only reason I can think of why a windowed IOMMU would be useful is to prevent two or more devices from stepping on each others' toes. But that's a problem that the OS should already be handling during DMA buffer allocation, isn't it? > > > > > I would like to add an explanation about dma-ranges to the binding: > > > > > > > > > > 8<-------- > > > > > The parent bus of the iommu must have a valid "dma-ranges" property > > > > > describing how the physical address space of the IOMMU maps into > > > > > memory. > > > > > > > > With physical address space you mean the addresses after translation, > > > > not the I/O virtual addresses, right? But even so, how will this work > > > > when there are multiple IOMMU devices? What determines which IOMMU is > > > > mapped via which entry? > > > > > > > > Perhaps having multiple IOMMUs implies that there will have to be some > > > > partitioning of the parent address space to make sure two IOMMUs don't > > > > translate to the same ranges? > > > > > > These dma-ranges properties would almost always be for the entire RAM, > > > and we can treat anything else as a bug. > > > > Would it typically be a 1:1 mapping? In that case could we define an > > empty dma-ranges property to mean exactly that? That would make it > > consistent with the ranges property. > > Yes, I believe that is how it's already defined. I've gone through the proposal at [0] again, but couldn't find a mention of an empty "dma-ranges" property. But regardless I think that a 1:1 mapping is the obvious meaning of an empty "dma-ranges" property. [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt One thing I'm not sure about is whether dma-ranges should be documented in this binding at all. Since there's an accepted standard proposal it would seem that it doesn't need to be specifically mentioned. One other option would be to link to the above proposal from the binding and then complement that with what an empty "dma-ranges" property means. Or we could possible document this in a file along with other standard properties. I don't think we currently do that for any properties, but my concern is that there will always be a limited number of people knowing about how such properties are supposed to work. If all of a sudden all these people would disappear, everybody else would be left with references to these properties but nowhere to look for their meaning. > > > > > A device with an "iommus" property will ignore the "dma-ranges" property > > > > > of the parent node and rely on the IOMMU for translation instead. > > > > > > > > Do we need to consider the case where an IOMMU listed in iommus isn't > > > > enabled (status = "disabled")? In that case presumably the device would > > > > either not function or may optionally continue to master onto the parent > > > > untranslated. > > > > > > My reasoning was that the DT should specify whether we use the IOMMU > > > or not. Being able to just switch on or off the IOMMU sounds nice as > > > well, so we could change the text above to do that. > > > > > > Another option would be to do this in the IOMMU code, basically > > > falling back to the IOMMU parent's dma-ranges property and using > > > linear dma_map_ops when that is disabled. > > > > Yes, it should be trivial for the IOMMU core code to take care of this > > special case. Still I think it's worth mentioning it in the binding so > > that it's clearly specified. > > Agreed. Okay, I have a new version of the binding that I think incorporates all the changes discussed so far. It uses #address-cells and #size-cells to define the length of the specifier, but if we decide against that it can easily be changed again. Thierry
On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote: > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote: > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: [...] > > > > Multiple-master IOMMU: > > > > ---------------------- > > > > > > > > iommu { > > > > /* the specifier represents the ID of the master */ > > > > #address-cells = <1>; > > > > #size-cells = <0>; > > > > How do we know the size of the input address to the IOMMU? Do we > > get cases for example where the IOMMU only accepts a 32-bit input > > address, but some 64-bit capable masters are connected through it? > > I was stuck on this question for a while before, but then I realized > that it doesn't matter at all: It's the IOMMU driver itself that > manages the address space, and it doesn't matter if a slave can > address a larger range than the IOMMU can accept. If the IOMMU > needs to deal with the opposite case (64-bit input addresses > but a 32-bit master), that limitation can be put into the specifier. Isn't this what DMA masks are for? Couldn't the IOMMU simply use the master device's DMA mask to do the right thing here? As such I don't think this information needs to be in device tree at all. The DMA mask should typically be set by the driver in the first place because it has knowledge about the capabilities of the device. A different way of saying that is that the DMA mask is implied by the device's compatible string. > > For determining dma masks, it is the output address that it > > important. Santosh's code can probably be taught to handle this, > > if given an additional traversal rule for following "iommus" > > properties. However, deploying an IOMMU whose output address size > > is smaller than the > > Something seems to be missing here. I don't think we want to handle > the case where the IOMMU output cannot the entire memory address > space. If necessary, that would mean using both an IOMMU driver > and swiotlb, but I think it's a reasonable assumption that hardware > isn't /that/ crazy. Similarily, should the IOMMU not be treated like any other device here? Its DMA mask should determine what address range it can access. Thierry
On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote: > > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote: > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > [...] > > > > > Multiple-master IOMMU: > > > > > ---------------------- > > > > > > > > > > iommu { > > > > > /* the specifier represents the ID of the master */ > > > > > #address-cells = <1>; > > > > > #size-cells = <0>; > > > > > > How do we know the size of the input address to the IOMMU? Do we > > > get cases for example where the IOMMU only accepts a 32-bit input > > > address, but some 64-bit capable masters are connected through it? > > > > I was stuck on this question for a while before, but then I realized > > that it doesn't matter at all: It's the IOMMU driver itself that > > manages the address space, and it doesn't matter if a slave can > > address a larger range than the IOMMU can accept. If the IOMMU > > needs to deal with the opposite case (64-bit input addresses > > but a 32-bit master), that limitation can be put into the specifier. > > Isn't this what DMA masks are for? Couldn't the IOMMU simply use the > master device's DMA mask to do the right thing here? Ah, yes. I guess that's the right way to do it. > > > For determining dma masks, it is the output address that it > > > important. Santosh's code can probably be taught to handle this, > > > if given an additional traversal rule for following "iommus" > > > properties. However, deploying an IOMMU whose output address size > > > is smaller than the > > > > Something seems to be missing here. I don't think we want to handle > > the case where the IOMMU output cannot the entire memory address > > space. If necessary, that would mean using both an IOMMU driver > > and swiotlb, but I think it's a reasonable assumption that hardware > > isn't /that/ crazy. > > Similarily, should the IOMMU not be treated like any other device here? > Its DMA mask should determine what address range it can access. Right. But for that we need a dma-ranges property in the parent of the iommu, just so the mask can be set correctly and we don't have to rely on the 32-bit fallback case. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote: > On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote: > > On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote: > > > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote: > > > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote: > > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > > > > [...] > > > > > > > Couldn't a single-master IOMMU be windowed? > > > > > > > > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > > > > > > IOMMU but uses one window per virtual machine. In that case, the window could > > > > > > be a property of the iommu node though, rather than part of the address > > > > > > in the link. > > > > > > > > > > Does that mean that the IOMMU has one statically configured window which > > > > > is the same for each virtual machine? That would require some other > > > > > mechanism to assign separate address spaces to each virtual machine, > > > > > wouldn't it? But I suspect that if the IOMMU allows that it could be > > > > > allocated dynamically at runtime. > > > > > > > > The way it works on pSeries is that upon VM creation, the guest is assigned > > > > one 256MB window for use by assigned DMA capable devices. When the guest > > > > creates a mapping, it uses a hypercall to associate a bus address in that > > > > range with a guest physical address. The hypervisor checks that the bus > > > > address is within the allowed range, and translates the guest physical > > > > address into a host physical address, then enters both into the I/O page > > > > table or I/O TLB. > > > > > > So when a VM is booted it is passed a device tree with that DMA window? > > > > Correct. > > > > > Given what you describe above this seems to be more of a configuration > > > option to restrict the IOMMU to a subset of the physical memory for > > > purposes of virtualization. So I agree that this wouldn't be a good fit > > > for what we're trying to achieve with iommus or dma-ranges in this > > > binding. > > > > Thinking about it again now, I wonder if there are any other use cases > > for windowed IOMMUs. If this is the only one, there might be no use > > in the #address-cells model I suggested instead of your original > > #iommu-cells. > > So in this case virtualization is the reason why we need the DMA window. > The reason for that is that the guest has no other way of knowing what > other guests might be using, so it's essentially a mechanism for the > host to manage the DMA region and allocate subregions for each guest. If > virtualization isn't an issue then it seems to me that the need for DMA > windows goes away because the operating system will track DMA regions > anyway. > > The only reason I can think of why a windowed IOMMU would be useful is > to prevent two or more devices from stepping on each others' toes. But > that's a problem that the OS should already be handling during DMA > buffer allocation, isn't it? Right. As long as we always unmap the buffers from the IOMMU after they have stopped being in use, it's very unlikely that even a broken device driver causes a DMA into some bus address that happens to be mapped for another device. > > > > > > I would like to add an explanation about dma-ranges to the binding: > > > > > > > > > > > > 8<-------- > > > > > > The parent bus of the iommu must have a valid "dma-ranges" property > > > > > > describing how the physical address space of the IOMMU maps into > > > > > > memory. > > > > > > > > > > With physical address space you mean the addresses after translation, > > > > > not the I/O virtual addresses, right? But even so, how will this work > > > > > when there are multiple IOMMU devices? What determines which IOMMU is > > > > > mapped via which entry? > > > > > > > > > > Perhaps having multiple IOMMUs implies that there will have to be some > > > > > partitioning of the parent address space to make sure two IOMMUs don't > > > > > translate to the same ranges? > > > > > > > > These dma-ranges properties would almost always be for the entire RAM, > > > > and we can treat anything else as a bug. > > > > > > Would it typically be a 1:1 mapping? In that case could we define an > > > empty dma-ranges property to mean exactly that? That would make it > > > consistent with the ranges property. > > > > Yes, I believe that is how it's already defined. > > I've gone through the proposal at [0] again, but couldn't find a mention > of an empty "dma-ranges" property. But regardless I think that a 1:1 > mapping is the obvious meaning of an empty "dma-ranges" property. > > [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt > > One thing I'm not sure about is whether dma-ranges should be documented > in this binding at all. Since there's an accepted standard proposal it > would seem that it doesn't need to be specifically mentioned. One other > option would be to link to the above proposal from the binding and then > complement that with what an empty "dma-ranges" property means. > > Or we could possible document this in a file along with other standard > properties. I don't think we currently do that for any properties, but > my concern is that there will always be a limited number of people > knowing about how such properties are supposed to work. If all of a > sudden all these people would disappear, everybody else would be left > with references to these properties but nowhere to look for their > meaning. I think it makes sense to document how the standard dma-ranges interacts with the new iommu binding, because it's not obvious what happens if you have both together, or iommu without a parent dma-ranges. > > > > > > A device with an "iommus" property will ignore the "dma-ranges" property > > > > > > of the parent node and rely on the IOMMU for translation instead. > > > > > > > > > > Do we need to consider the case where an IOMMU listed in iommus isn't > > > > > enabled (status = "disabled")? In that case presumably the device would > > > > > either not function or may optionally continue to master onto the parent > > > > > untranslated. > > > > > > > > My reasoning was that the DT should specify whether we use the IOMMU > > > > or not. Being able to just switch on or off the IOMMU sounds nice as > > > > well, so we could change the text above to do that. > > > > > > > > Another option would be to do this in the IOMMU code, basically > > > > falling back to the IOMMU parent's dma-ranges property and using > > > > linear dma_map_ops when that is disabled. > > > > > > Yes, it should be trivial for the IOMMU core code to take care of this > > > special case. Still I think it's worth mentioning it in the binding so > > > that it's clearly specified. > > > > Agreed. > > Okay, I have a new version of the binding that I think incorporates all > the changes discussed so far. It uses #address-cells and #size-cells to > define the length of the specifier, but if we decide against that it can > easily be changed again. Ok. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote: > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > > On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote: > > > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote: > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > [...] > > > > > > Multiple-master IOMMU: > > > > > > ---------------------- > > > > > > > > > > > > iommu { > > > > > > /* the specifier represents the ID of the master */ > > > > > > #address-cells = <1>; > > > > > > #size-cells = <0>; > > > > > > > > How do we know the size of the input address to the IOMMU? Do we > > > > get cases for example where the IOMMU only accepts a 32-bit input > > > > address, but some 64-bit capable masters are connected through it? > > > > > > I was stuck on this question for a while before, but then I realized > > > that it doesn't matter at all: It's the IOMMU driver itself that > > > manages the address space, and it doesn't matter if a slave can > > > address a larger range than the IOMMU can accept. If the IOMMU > > > needs to deal with the opposite case (64-bit input addresses > > > but a 32-bit master), that limitation can be put into the specifier. > > > > Isn't this what DMA masks are for? Couldn't the IOMMU simply use the > > master device's DMA mask to do the right thing here? > > Ah, yes. I guess that's the right way to do it. > > > > > For determining dma masks, it is the output address that it > > > > important. Santosh's code can probably be taught to handle this, > > > > if given an additional traversal rule for following "iommus" > > > > properties. However, deploying an IOMMU whose output address size > > > > is smaller than the > > > > > > Something seems to be missing here. I don't think we want to handle > > > the case where the IOMMU output cannot the entire memory address > > > space. If necessary, that would mean using both an IOMMU driver > > > and swiotlb, but I think it's a reasonable assumption that hardware > > > isn't /that/ crazy. > > > > Similarily, should the IOMMU not be treated like any other device here? > > Its DMA mask should determine what address range it can access. > > Right. But for that we need a dma-ranges property in the parent of the > iommu, just so the mask can be set correctly and we don't have to > rely on the 32-bit fallback case. Shouldn't the IOMMU driver be the one to set the DMA mask for the device in exactly the same way that other drivers override the 32-bit default? Thierry
On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote: > On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote: > > On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote: > > > On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote: > > > > On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote: > > > > > On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote: > > > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > > > > > [...] > > > > > > > > Couldn't a single-master IOMMU be windowed? > > > > > > > > > > > > > > Ah, yes. That would actually be like an IBM pSeries, which has a windowed > > > > > > > IOMMU but uses one window per virtual machine. In that case, the window could > > > > > > > be a property of the iommu node though, rather than part of the address > > > > > > > in the link. > > > > > > > > > > > > Does that mean that the IOMMU has one statically configured window which > > > > > > is the same for each virtual machine? That would require some other > > > > > > mechanism to assign separate address spaces to each virtual machine, > > > > > > wouldn't it? But I suspect that if the IOMMU allows that it could be > > > > > > allocated dynamically at runtime. > > > > > > > > > > The way it works on pSeries is that upon VM creation, the guest is assigned > > > > > one 256MB window for use by assigned DMA capable devices. When the guest > > > > > creates a mapping, it uses a hypercall to associate a bus address in that > > > > > range with a guest physical address. The hypervisor checks that the bus > > > > > address is within the allowed range, and translates the guest physical > > > > > address into a host physical address, then enters both into the I/O page > > > > > table or I/O TLB. > > > > > > > > So when a VM is booted it is passed a device tree with that DMA window? > > > > > > Correct. > > > > > > > Given what you describe above this seems to be more of a configuration > > > > option to restrict the IOMMU to a subset of the physical memory for > > > > purposes of virtualization. So I agree that this wouldn't be a good fit > > > > for what we're trying to achieve with iommus or dma-ranges in this > > > > binding. > > > > > > Thinking about it again now, I wonder if there are any other use cases > > > for windowed IOMMUs. If this is the only one, there might be no use > > > in the #address-cells model I suggested instead of your original > > > #iommu-cells. > > > > So in this case virtualization is the reason why we need the DMA window. > > The reason for that is that the guest has no other way of knowing what > > other guests might be using, so it's essentially a mechanism for the > > host to manage the DMA region and allocate subregions for each guest. If > > virtualization isn't an issue then it seems to me that the need for DMA > > windows goes away because the operating system will track DMA regions > > anyway. > > > > The only reason I can think of why a windowed IOMMU would be useful is > > to prevent two or more devices from stepping on each others' toes. But > > that's a problem that the OS should already be handling during DMA > > buffer allocation, isn't it? > > Right. As long as we always unmap the buffers from the IOMMU after they > have stopped being in use, it's very unlikely that even a broken device > driver causes a DMA into some bus address that happens to be mapped for > another device. I think that if buffers remain mapped in the IOMMU when they have been deallocated that should be considered a bug. Thierry
On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote: > On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote: > > > Right. As long as we always unmap the buffers from the IOMMU after they > > have stopped being in use, it's very unlikely that even a broken device > > driver causes a DMA into some bus address that happens to be mapped for > > another device. > > I think that if buffers remain mapped in the IOMMU when they have been > deallocated that should be considered a bug. You could see it as a performance optimization in some cases, e.g. when you cannot flush individual IOTLBs or doing so is expensive, and you just keep assigning new addresses until every one has been used, then you do a global IOTLB flush once. Obviously you have to maintain the IO page tables correctly. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote: > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote: > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > > > > > For determining dma masks, it is the output address that it > > > > > important. Santosh's code can probably be taught to handle this, > > > > > if given an additional traversal rule for following "iommus" > > > > > properties. However, deploying an IOMMU whose output address size > > > > > is smaller than the > > > > > > > > Something seems to be missing here. I don't think we want to handle > > > > the case where the IOMMU output cannot the entire memory address > > > > space. If necessary, that would mean using both an IOMMU driver > > > > and swiotlb, but I think it's a reasonable assumption that hardware > > > > isn't /that/ crazy. > > > > > > Similarily, should the IOMMU not be treated like any other device here? > > > Its DMA mask should determine what address range it can access. > > > > Right. But for that we need a dma-ranges property in the parent of the > > iommu, just so the mask can be set correctly and we don't have to > > rely on the 32-bit fallback case. > > Shouldn't the IOMMU driver be the one to set the DMA mask for the device > in exactly the same way that other drivers override the 32-bit default? The IOMMU driver could /ask/ for an appropriate mask based on its internal design, but if you have an IOMMU with a 64-bit output address connected to a 32-bit bus, that should fail. Note that it's not obvious what the IOMMU's DMA mask actually means. It clearly has to be the mask that is used for allocating the IO page tables, but it wouldn't normally be used in the path that allocates pages on behalf of a DMA master attached to the IOMMU, because that allocation is performed by the code that looks at the other device's dma mask. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote: > On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote: > > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote: > > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > > > > > > > For determining dma masks, it is the output address that it > > > > > > important. Santosh's code can probably be taught to handle this, > > > > > > if given an additional traversal rule for following "iommus" > > > > > > properties. However, deploying an IOMMU whose output address size > > > > > > is smaller than the > > > > > > > > > > Something seems to be missing here. I don't think we want to handle > > > > > the case where the IOMMU output cannot the entire memory address > > > > > space. If necessary, that would mean using both an IOMMU driver > > > > > and swiotlb, but I think it's a reasonable assumption that hardware > > > > > isn't /that/ crazy. > > > > > > > > Similarily, should the IOMMU not be treated like any other device here? > > > > Its DMA mask should determine what address range it can access. > > > > > > Right. But for that we need a dma-ranges property in the parent of the > > > iommu, just so the mask can be set correctly and we don't have to > > > rely on the 32-bit fallback case. > > > > Shouldn't the IOMMU driver be the one to set the DMA mask for the device > > in exactly the same way that other drivers override the 32-bit default? > > The IOMMU driver could /ask/ for an appropriate mask based on its internal > design, but if you have an IOMMU with a 64-bit output address connected > to a 32-bit bus, that should fail. Are there real use-cases where that really happens? I guess if we need that the correct thing would be to bitwise AND both the DMA mask of the IOMMU device (as set by the driver) with that derived from the IOMMU's parent bus' dma-ranges property. > Note that it's not obvious what the IOMMU's DMA mask actually means. > It clearly has to be the mask that is used for allocating the IO page > tables, but it wouldn't normally be used in the path that allocates > pages on behalf of a DMA master attached to the IOMMU, because that > allocation is performed by the code that looks at the other device's > dma mask. Interesting. If a DMA buffer is allocated using the master's DMA mask wouldn't that cause breakage if the IOMMU and master's DMA masks don't match. It seems to me like the right thing to do for buffer allocation is to use the IOMMU's DMA mask if a device uses the IOMMU for translation and use the device's DMA mask when determining to what I/O virtual address to map that buffer. Obviously if we always assume that IOMMU hardware is sane and can always access at least the whole memory then this isn't an issue. But what if a device can do DMA to a 64-bit address space, but the IOMMU can only address 32 bits. If the device's DMA mask is used for allocations, then buffers could reside beyond the 4 GiB boundary that the IOMMU can address, so effectively the IOMMU wouldn't be able to write to those buffers. Thierry
On Wednesday 21 May 2014 12:50:38 Thierry Reding wrote: > On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote: > > On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote: > > > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote: > > > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > > > > > > > > > For determining dma masks, it is the output address that it > > > > > > > important. Santosh's code can probably be taught to handle this, > > > > > > > if given an additional traversal rule for following "iommus" > > > > > > > properties. However, deploying an IOMMU whose output address size > > > > > > > is smaller than the > > > > > > > > > > > > Something seems to be missing here. I don't think we want to handle > > > > > > the case where the IOMMU output cannot the entire memory address > > > > > > space. If necessary, that would mean using both an IOMMU driver > > > > > > and swiotlb, but I think it's a reasonable assumption that hardware > > > > > > isn't /that/ crazy. > > > > > > > > > > Similarily, should the IOMMU not be treated like any other device here? > > > > > Its DMA mask should determine what address range it can access. > > > > > > > > Right. But for that we need a dma-ranges property in the parent of the > > > > iommu, just so the mask can be set correctly and we don't have to > > > > rely on the 32-bit fallback case. > > > > > > Shouldn't the IOMMU driver be the one to set the DMA mask for the device > > > in exactly the same way that other drivers override the 32-bit default? > > > > The IOMMU driver could /ask/ for an appropriate mask based on its internal > > design, but if you have an IOMMU with a 64-bit output address connected > > to a 32-bit bus, that should fail. > > Are there real use-cases where that really happens? I guess if we need > that the correct thing would be to bitwise AND both the DMA mask of the > IOMMU device (as set by the driver) with that derived from the IOMMU's > parent bus' dma-ranges property. It would be unusual for an IOMMU to need this, but it's how the DMA mask is supposed to work for normal devices. As mentioned before, I would probably just error out if we ever encounter such an IOMMU. > > Note that it's not obvious what the IOMMU's DMA mask actually means. > > It clearly has to be the mask that is used for allocating the IO page > > tables, but it wouldn't normally be used in the path that allocates > > pages on behalf of a DMA master attached to the IOMMU, because that > > allocation is performed by the code that looks at the other device's > > dma mask. > > Interesting. If a DMA buffer is allocated using the master's DMA mask > wouldn't that cause breakage if the IOMMU and master's DMA masks don't > match. It seems to me like the right thing to do for buffer allocation > is to use the IOMMU's DMA mask if a device uses the IOMMU for > translation and use the device's DMA mask when determining to what I/O > virtual address to map that buffer. Unfortunately not all code agrees regarding how dma mask is actually interpreted. The most important use is within the dma_map_ops, and that is aware of the IOMMU. The dma_map_ops use that to decide what IOVA (bus address) to generate that is usable for the device, normally this would be a 32-bit range. When driver code looks at the dma mask of the device itself to make an allocation decision without taking the IOMMU or swiotlb into account, things can indeed go wrong. Russell has recently done a good cleanup of various issues around dma masks, and I can't find any drivers that get this wrong. However, there is an issue with the two or three subsystems using "PCI_DMA_BUS_IS_PHYS" to decide how they should treat high buffers coming from user space that get passed to hardware. If the SCSI layer or the network layer find the horribly misnamed PCI_DMA_BUS_IS_PHYS (which is hardcoded to "1" on ARM32), they will create copy in low memory for any data that is above the device dma_mask (SCSI) or above max_low_pfn (network). This is not normally a bug, and won't hurt for the swiotlb case, but will give us worse performance for the IOMMU case, and we should probably change this code to calculate the boundary per device by calling a function from dma_map_ops. We also really need to implement swiotlb support on ARM32 to deal with any other device (besides SCSI and network) that does not have an IOMMU but wants to use the streaming DMA API on pages outside of the dma_mask. We already have this case on shmobile. > Obviously if we always assume that IOMMU hardware is sane and can always > access at least the whole memory then this isn't an issue. But what if a > device can do DMA to a 64-bit address space, but the IOMMU can only > address 32 bits. If the device's DMA mask is used for allocations, then > buffers could reside beyond the 4 GiB boundary that the IOMMU can > address, so effectively the IOMMU wouldn't be able to write to those > buffers. The mask of the device is not even an issue here, it's more the general case of passing a buffer outside of the IOMMU's upstream bus DMA mask into a driver connected to the IOMMU. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 21, 2014 at 2:32 AM, Arnd Bergmann <arnd@arndb.de> wrote: > On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote: >> On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote: >> >> > Right. As long as we always unmap the buffers from the IOMMU after they >> > have stopped being in use, it's very unlikely that even a broken device >> > driver causes a DMA into some bus address that happens to be mapped for >> > another device. >> >> I think that if buffers remain mapped in the IOMMU when they have been >> deallocated that should be considered a bug. There is currently no general requirement to tear down mappings immediately. An option to enforce immediate tear down might be useful since I agree Virtual Guests sharing host memory through shared physical devices will want that. ie there should be no opportunity for a shared device to be able to access another guests memory through "stale" (but live) DMA mappings. I don't think that's the case but I'd ask someone like Alex Williamson for a more certain answer. > You could see it as a performance optimization in some cases, e.g. when you > cannot flush individual IOTLBs or doing so is expensive, and you just keep > assigning new addresses until every one has been used, then you do a global > IOTLB flush once. This is in fact exactly what several IOMMUs do - including the Intel IOMMU driver. IIRC, the Intel IOMMU driver collects 256 or so entries to flush and then flushes them all at once and updates the metadata. I don't if it does this for ranges assigned to different virtual guests though or just within a given range. > Obviously you have to maintain the IO page tables correctly. To be clear, "Correctly" in this case just means until the IOTLB is flushed, the given IOMMU Pdir entries are marked "in use" even though the driver has handed "ownership" back to the IOMMU driver. cheers, grant -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 21 May 2014 08:44:42 Grant Grundler wrote: > On Wed, May 21, 2014 at 2:32 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote: > >> On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote: > >> > >> > Right. As long as we always unmap the buffers from the IOMMU after they > >> > have stopped being in use, it's very unlikely that even a broken device > >> > driver causes a DMA into some bus address that happens to be mapped for > >> > another device. > >> > >> I think that if buffers remain mapped in the IOMMU when they have been > >> deallocated that should be considered a bug. > > There is currently no general requirement to tear down mappings immediately. > An option to enforce immediate tear down might be useful since I agree > Virtual Guests sharing host memory through shared physical devices > will want that. ie there should be no opportunity for a shared device > to be able to access another guests memory through "stale" (but live) > DMA mappings. I don't think that's the case but I'd ask someone like > Alex Williamson for a more certain answer. I believe powerpc has a boot-time option to enforce the immediate IOTLB flush, which is very useful for device driver debugging when something goes wrong with stale DMAs. > > Obviously you have to maintain the IO page tables correctly. > > To be clear, "Correctly" in this case just means until the IOTLB is > flushed, the given IOMMU Pdir entries are marked "in use" even though > the driver has handed "ownership" back to the IOMMU driver. I don't know what a Pdir is, but I guess strictly speaking we have to ensure that all IO page table entries that have been unmapped by a driver are marked as invalid at least by the time the IOTLB is flushed, plus we have to flush each entry before it gets reused for a different page. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 21, 2014 at 11:00:38AM +0200, Thierry Reding wrote: > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote: > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > > > On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote: > > > > On Tuesday 20 May 2014 16:24:59 Dave Martin wrote: > > > > > On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote: > > > > > > On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote: > > > [...] > > > > > > > Multiple-master IOMMU: > > > > > > > ---------------------- > > > > > > > > > > > > > > iommu { > > > > > > > /* the specifier represents the ID of the master */ > > > > > > > #address-cells = <1>; > > > > > > > #size-cells = <0>; > > > > > > > > > > How do we know the size of the input address to the IOMMU? Do we > > > > > get cases for example where the IOMMU only accepts a 32-bit input > > > > > address, but some 64-bit capable masters are connected through it? > > > > > > > > I was stuck on this question for a while before, but then I realized > > > > that it doesn't matter at all: It's the IOMMU driver itself that > > > > manages the address space, and it doesn't matter if a slave can > > > > address a larger range than the IOMMU can accept. If the IOMMU > > > > needs to deal with the opposite case (64-bit input addresses > > > > but a 32-bit master), that limitation can be put into the specifier. > > > > > > Isn't this what DMA masks are for? Couldn't the IOMMU simply use the > > > master device's DMA mask to do the right thing here? > > > > Ah, yes. I guess that's the right way to do it. > > > > > > > For determining dma masks, it is the output address that it > > > > > important. Santosh's code can probably be taught to handle this, > > > > > if given an additional traversal rule for following "iommus" > > > > > properties. However, deploying an IOMMU whose output address size > > > > > is smaller than the > > > > > > > > Something seems to be missing here. I don't think we want to handle > > > > the case where the IOMMU output cannot the entire memory address > > > > space. If necessary, that would mean using both an IOMMU driver > > > > and swiotlb, but I think it's a reasonable assumption that hardware > > > > isn't /that/ crazy. > > > > > > Similarily, should the IOMMU not be treated like any other device here? > > > Its DMA mask should determine what address range it can access. > > > > Right. But for that we need a dma-ranges property in the parent of the > > iommu, just so the mask can be set correctly and we don't have to > > rely on the 32-bit fallback case. > > Shouldn't the IOMMU driver be the one to set the DMA mask for the device > in exactly the same way that other drivers override the 32-bit default? > Are we confusing the "next-hop DMA mask" with the "end-to-end DMA mask" here? The device has a next-hop mask which may be non-trivial. The IOMMU also has a next-hop mask and/or remapping, but I think we agree that in sensible systems that will be trivial. There might be other non-trivial remappings between the IOMMU and memory (but again, not in the common case). If we just use the same name for all these, we are liable to get confused. To answer the question "what memory can the kernel allocate for DMA with this device", it is the end-to-end transformation that is important. Cheers ---Dave -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 21 May 2014 18:09:54 Dave Martin wrote: > On Wed, May 21, 2014 at 11:00:38AM +0200, Thierry Reding wrote: > > On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote: > > > On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote: > > > > > > > > Similarily, should the IOMMU not be treated like any other device here? > > > > Its DMA mask should determine what address range it can access. > > > > > > Right. But for that we need a dma-ranges property in the parent of the > > > iommu, just so the mask can be set correctly and we don't have to > > > rely on the 32-bit fallback case. > > > > Shouldn't the IOMMU driver be the one to set the DMA mask for the device > > in exactly the same way that other drivers override the 32-bit default? > > > > Are we confusing the "next-hop DMA mask" with the "end-to-end DMA mask" > here? The device has a next-hop mask which may be non-trivial. The > IOMMU also has a next-hop mask and/or remapping, but I think we agree > that in sensible systems that will be trivial. There might be other > non-trivial remappings between the IOMMU and memory (but again, not > in the common case). > > If we just use the same name for all these, we are liable to get > confused. > > To answer the question "what memory can the kernel allocate for DMA > with this device", it is the end-to-end transformation that is > important. Yes, but that is not the same as the dma mask of the device. The DMA mask gets set by the device according to its capabilities, and may get limited by what the bus to either memory or to the iommu can do, if one is in use. Without an IOMMU, the mask is used for allocations, with an IOMMU, The iommu code makes the decision what to allocate without taking dev->dma_mask into account. As mentioned, this is currently not handled well for SCSI and network, where we use a smaller mask than necessary and can end up copying data. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt new file mode 100644 index 000000000000..2d67b52b656e --- /dev/null +++ b/Documentation/devicetree/bindings/iommu/iommu.txt @@ -0,0 +1,109 @@ +This document describes the generic device tree binding for IOMMUs and their +master(s). + + +IOMMU device node: +================== + +An IOMMU can provide the following services: + +* Remap address space to allow devices to access physical memory ranges that + they otherwise wouldn't be capable of accessing. + + Example: 32-bit DMA to 64-bit physical addresses + +* Implement scatter-gather at page level granularity so that the device does + not have to. + +* Provide system protection against "rogue" DMA by forcing all accesses to go + through the IOMMU and faulting when encountering accesses to unmapped + address regions. + +* Provide address space isolation between multiple contexts. + + Example: Virtualization + +Device nodes compatible with this binding represent hardware with some of the +above capabilities. + +IOMMUs can be single-master or multiple-master. Single-master IOMMU devices +typically have a fixed association to the master device, whereas multiple- +master IOMMU devices can translate accesses from more than one master. + +Required properties: +-------------------- +- #iommu-cells: The number of cells in an IOMMU specifier. The meaning of the + cells is defined by the binding for the IOMMU device. + + Typical values include: + * 0: Single-master IOMMU devices are often not configurable, therefore the + specifying doesn't need to encode any information and can be empty. + + * 1: Multiple-master IOMMU devices need to know for which master they should + enable translation. Typically the single cell in the specifier corresponds + to the master device's ID. + + +IOMMU master node: +================== + +Devices that access memory through an IOMMU are called masters. A device can +have multiple master interfaces (to one or more IOMMU devices). + +Required properties: +-------------------- +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU + master interfaces of the device. One entry in the list describes one master + interface of the device. + +Optional properties: +-------------------- +- iommu-names: A list of names identifying each entry in the iommus property. + + +Examples: +========= + +Single-master IOMMU: +-------------------- + + iommu { + #iommu-cells = <0>; + }; + + master { + iommu = <&/iommu>; + }; + +Multi-master IOMMU: +------------------- + + iommu { + /* the specifier represents the ID of the master */ + #iommu-cells = <1>; + }; + + master { + /* device has master ID 42 in the IOMMU */ + iommu = <&/iommu 42>; + }; + +Multi-master device: +-------------------- + + /* single-master IOMMU */ + iommu@1 { + #iommu-cells = <0>; + }; + + /* multi-master IOMMU */ + iommu@2 { + /* the specifier represents the ID of the master */ + #iommu-cells = <1>; + }; + + /* device with two master interfaces */ + master { + iommus = <&/iommu@1>, /* master of the single-master IOMMU */ + <&/iommu@2 42>; /* ID 42 in multi-master IOMMU */ + };