diff mbox

[PATCHv5,2/9] driver/core: populate devices in order for IOMMUs

Message ID 1384853593-32202-3-git-send-email-hdoyu@nvidia.com (mailing list archive)
State New, archived
Headers show

Commit Message

Hiroshi DOYU Nov. 19, 2013, 9:33 a.m. UTC
IOMMU devices on the bus need to be poplulated first, then iommu
master devices are done later.

With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
whether a device can be an iommu msater or not. If a device can, we'll
defer to populate that device till an iommu device is populated. Once
an iommu device is populated, "dev->bus->iommu_ops" is set in the
bus. Then, those defered iommu master devices are populated and
configured for IOMMU with help of the already populated iommu device
via iommu_ops->add_device(). Multiple IOMMUs can be listed on this
"iommus" binding so that a device can have multiple IOMMUs attached.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
---
v5:
Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".

v4:
This is newly added, and the successor of the following RFC:
  [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
  http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
---
 drivers/base/dd.c        |  5 +++++
 drivers/iommu/of_iommu.c | 22 ++++++++++++++++++++++
 include/linux/of_iommu.h |  7 +++++++
 3 files changed, 34 insertions(+)

Comments

Thierry Reding Nov. 19, 2013, 10:25 a.m. UTC | #1
On Tue, Nov 19, 2013 at 11:33:06AM +0200, Hiroshi Doyu wrote:
> IOMMU devices on the bus need to be poplulated first, then iommu
> master devices are done later.
> 
> With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
> whether a device can be an iommu msater or not. If a device can, we'll
> defer to populate that device till an iommu device is populated. Once
> an iommu device is populated, "dev->bus->iommu_ops" is set in the
> bus. Then, those defered iommu master devices are populated and
> configured for IOMMU with help of the already populated iommu device
> via iommu_ops->add_device(). Multiple IOMMUs can be listed on this
> "iommus" binding so that a device can have multiple IOMMUs attached.
> 
> Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
> ---
> v5:
> Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".
> 
> v4:
> This is newly added, and the successor of the following RFC:
>   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
> ---
>  drivers/base/dd.c        |  5 +++++
>  drivers/iommu/of_iommu.c | 22 ++++++++++++++++++++++
>  include/linux/of_iommu.h |  7 +++++++
>  3 files changed, 34 insertions(+)
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 35fa368..6e892d4 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -25,6 +25,7 @@
>  #include <linux/async.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/pinctrl/devinfo.h>
> +#include <linux/of_iommu.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>  
>  	dev->driver = drv;
>  
> +	ret = of_iommu_attach(dev);
> +	if (ret)
> +		goto probe_failed;
> +
>  	/* If using pinctrl, bind pins now before probing */
>  	ret = pinctrl_bind_pins(dev);
>  	if (ret)
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index ee249bc..4aef2b2 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -20,6 +20,8 @@
>  #include <linux/export.h>
>  #include <linux/limits.h>
>  #include <linux/of.h>
> +#include <linux/device.h>
> +#include <linux/iommu.h>
>  
>  /**
>   * of_get_dma_window - Parse *dma-window property and returns 0 if found.
> @@ -88,3 +90,23 @@ int of_get_dma_window(struct device_node *dn, const char *prefix, int index,
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(of_get_dma_window);
> +
> +int of_iommu_attach(struct device *dev)
> +{
> +	int i;
> +	struct of_phandle_args args;
> +	struct iommu_ops *ops = dev->bus->iommu_ops;
> +
> +	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
> +					       "#iommu-cells", i, &args) {
> +		pr_debug("%s(i=%d) ops=%p %s\n",
> +			 __func__, i, ops, dev_name(dev));
> +
> +		if (!ops)
> +			return -EPROBE_DEFER;
> +	}
> +
> +	if (i && ops->add_device)
> +		return ops->add_device(dev);
> +	return 0;
> +}

I don't think this does what it's supposed to do. As far as I can tell
there's no way the above loop won't run to parse all phandles and their
arguments unless the DT is actually wrong.

From earlier discussions I thought the goal was to actually defer this
until all nodes referred to by the iommus property were actually
registered. The above only checks that the phandles can be resolved to
valid struct device_node:s. That doesn't mean that an actual IOMMU has
been registered for it, only that the devices have been created.

I think within that loop you need to look up the IOMMU corresponding to
the struct device_node in args.np. If no match is found, then return
-EPROBE_DEFER.

If you really only rely on dev->bus->iommu_ops to be present, then there
is no need to go through the loop in the first place, since you have
access to it immediately through the struct device that's passed into
the function.

Furthermore, relying on dev->bus->iommu_ops will prevent multiple IOMMUs
from being used at all, since only one IOMMU can register iommu_ops with
the bus, right? So I think what we really need here is a way to resolve
the IOMMU using a phandle and return the associated struct iommu_ops.

I also have some trouble understanding how the current IOMMU framework
is supposed to work together with multiple IOMMUs for one device. The
.add_device() callback seems to be missing crucial information to help
decide whether the device to be added is actually one that it covers.
Also with an of_iommu_attach() function, doesn't that become more or
less redundant?

Thierry
Hiroshi DOYU Nov. 19, 2013, 12:03 p.m. UTC | #2
Hi Thierry,

Thierry Reding <thierry.reding@gmail.com> wrote @ Tue, 19 Nov 2013 11:25:07 +0100:

> From earlier discussions I thought the goal was to actually defer this
> until all nodes referred to by the iommus property were actually
> registered. The above only checks that the phandles can be resolved to
> valid struct device_node:s. That doesn't mean that an actual IOMMU has
> been registered for it, only that the devices have been created.

Currently "bus->iommu_ops" is set at the end of tegra_smmu_probe(). So
if "bus->iommu_ops" is set, it means that an iommu instance is
populated at that time.

> If you really only rely on dev->bus->iommu_ops to be present, then there
> is no need to go through the loop in the first place, since you have
> access to it immediately through the struct device that's passed into
> the function.

As mentioned in the above, "bus->iommu_ops" is set when an iommu
instance is populated. "iommus=" is used only to defer a device probe
until an iommu instance shows up although multiple IOMMUs are not
supported yet correctly as you pointed out.

> Furthermore, relying on dev->bus->iommu_ops will prevent multiple IOMMUs
> from being used at all, since only one IOMMU can register iommu_ops with
> the bus, right? So I think what we really need here is a way to resolve
> the IOMMU using a phandle and return the associated struct
> iommu_ops.

Multiple IOMMU support doesn't work right now. It needs to be
discussed a bit more.

> I also have some trouble understanding how the current IOMMU framework
> is supposed to work together with multiple IOMMUs for one device. The
> .add_device() callback seems to be missing crucial information to help
> decide whether the device to be added is actually one that it covers.
> Also with an of_iommu_attach() function, doesn't that become more or
> less redundant?

I understand what you meant. In the current tegra SMMU implementation,
iommu_ops is set when an iommu is populated so that we cannot use
iommu_ops before that time. That's why of_iommu_attach() was
introduced here. I couldn't set iommu_ops at drvier init yet because
we support SMMU and GART with a single image and the current IOMMU
framework assume that there's one IOMMU on the bus. If we set
iommu_ops at driver init, the latter one would overwrite it.
Stephen Warren Nov. 19, 2013, 9:22 p.m. UTC | #3
On 11/19/2013 05:03 AM, Hiroshi Doyu wrote:
> Hi Thierry,
> 
> Thierry Reding <thierry.reding@gmail.com> wrote @ Tue, 19 Nov 2013 11:25:07 +0100:
> 
>> From earlier discussions I thought the goal was to actually defer this
>> until all nodes referred to by the iommus property were actually
>> registered. The above only checks that the phandles can be resolved to
>> valid struct device_node:s. That doesn't mean that an actual IOMMU has
>> been registered for it, only that the devices have been created.
> 
> Currently "bus->iommu_ops" is set at the end of tegra_smmu_probe(). So
> if "bus->iommu_ops" is set, it means that an iommu instance is
> populated at that time.

Yes, but that's the register bus, upon which the device is a client, not
the bus upon which the device is a bus master. They aren't necessarily
the same.

There's no getting around the fact that, as Thierry said, you need to
search for a registered IOMMU device for each phandle, and defer probe
if any aren't registered yet.

If we do that, then you shouldn't need to look at the value of
dev->bus->iommu_ops at all; if all IOMMUs in the list were registered,
then iommu_ops must have been set when (one of them) was registered, and
if not, then it possibly wasn't, so defer probe.

That way, this code won't have to change if the core IOMMU code gets
extended to support multiple IOMMUs, devices mastering transactions onto
buses other than their register bus, etc.
Hiroshi DOYU Nov. 20, 2013, 3:17 a.m. UTC | #4
Stephen Warren <swarren@wwwdotorg.org> wrote @ Tue, 19 Nov 2013 22:22:47 +0100:

> On 11/19/2013 05:03 AM, Hiroshi Doyu wrote:
> > Hi Thierry,
> > 
> > Thierry Reding <thierry.reding@gmail.com> wrote @ Tue, 19 Nov 2013 11:25:07 +0100:
> > 
> >> From earlier discussions I thought the goal was to actually defer this
> >> until all nodes referred to by the iommus property were actually
> >> registered. The above only checks that the phandles can be resolved to
> >> valid struct device_node:s. That doesn't mean that an actual IOMMU has
> >> been registered for it, only that the devices have been created.
> > 
> > Currently "bus->iommu_ops" is set at the end of tegra_smmu_probe(). So
> > if "bus->iommu_ops" is set, it means that an iommu instance is
> > populated at that time.
> 
> Yes, but that's the register bus, upon which the device is a client, not
> the bus upon which the device is a bus master. They aren't necessarily
> the same.
> 
> There's no getting around the fact that, as Thierry said, you need to
> search for a registered IOMMU device for each phandle, and defer probe
> if any aren't registered yet.
> 
> If we do that, then you shouldn't need to look at the value of
> dev->bus->iommu_ops at all; if all IOMMUs in the list were registered,
> then iommu_ops must have been set when (one of them) was registered, and
> if not, then it possibly wasn't, so defer probe.
> 
> That way, this code won't have to change if the core IOMMU code gets
> extended to support multiple IOMMUs, devices mastering transactions onto
> buses other than their register bus, etc.

Does the above mean the following?

int of_iommu_attach(struct device *dev)
{
	int i;
	struct of_phandle_args args;

	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
					       "#iommu-cells", i, &args)
		if (!args->np->dev->driver)
			return -EPROBE_DEFER;
	return 0;
}


"args->np->dev->driver" needs the following patch:

  http://lists.linuxfoundation.org/pipermail/iommu/2013-July/006023.html
Thierry Reding Nov. 20, 2013, 1:14 p.m. UTC | #5
On Wed, Nov 20, 2013 at 04:17:08AM +0100, Hiroshi Doyu wrote:
> Stephen Warren <swarren@wwwdotorg.org> wrote @ Tue, 19 Nov 2013 22:22:47 +0100:
> 
> > On 11/19/2013 05:03 AM, Hiroshi Doyu wrote:
> > > Hi Thierry,
> > > 
> > > Thierry Reding <thierry.reding@gmail.com> wrote @ Tue, 19 Nov 2013 11:25:07 +0100:
> > > 
> > >> From earlier discussions I thought the goal was to actually defer this
> > >> until all nodes referred to by the iommus property were actually
> > >> registered. The above only checks that the phandles can be resolved to
> > >> valid struct device_node:s. That doesn't mean that an actual IOMMU has
> > >> been registered for it, only that the devices have been created.
> > > 
> > > Currently "bus->iommu_ops" is set at the end of tegra_smmu_probe(). So
> > > if "bus->iommu_ops" is set, it means that an iommu instance is
> > > populated at that time.
> > 
> > Yes, but that's the register bus, upon which the device is a client, not
> > the bus upon which the device is a bus master. They aren't necessarily
> > the same.
> > 
> > There's no getting around the fact that, as Thierry said, you need to
> > search for a registered IOMMU device for each phandle, and defer probe
> > if any aren't registered yet.
> > 
> > If we do that, then you shouldn't need to look at the value of
> > dev->bus->iommu_ops at all; if all IOMMUs in the list were registered,
> > then iommu_ops must have been set when (one of them) was registered, and
> > if not, then it possibly wasn't, so defer probe.
> > 
> > That way, this code won't have to change if the core IOMMU code gets
> > extended to support multiple IOMMUs, devices mastering transactions onto
> > buses other than their register bus, etc.
> 
> Does the above mean the following?
> 
> int of_iommu_attach(struct device *dev)
> {
> 	int i;
> 	struct of_phandle_args args;
> 
> 	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
> 					       "#iommu-cells", i, &args)
> 		if (!args->np->dev->driver)
> 			return -EPROBE_DEFER;
> 	return 0;
> }

Not quite. The above would only check that a driver was bound to the
device. But if that device isn't an IOMMU then this doesn't help you.

The standard way to solve this issue is to add the IOMMU to a global
list upon registration. Typically subsystems have some way to do that
already, but it seems like IOMMU doesn't. It looks like that's one of
the side-effects of the assumption that there will always only be a
single IOMMU (per bus).

There's also no base object that IOMMU drivers implement, which is the
way it's usually done in other subsystems. The absence of that makes it
more difficult. I suspect the easiest way to do that would be to add a
new type, something like this:

	struct iommu {
		struct list_head list;
		struct device *dev;
	};

Then modify the driver to do something like this (tegra-smmu.c):

	struct smmu_device {
		struct iommu iommu;

		...
	};

	...

	static int tegra_smmu_probe(struct platform_device *pdev)
	{
		struct smmu_device *smmu;

		...

		smmu->iommu.dev = &pdev->dev;

		iommu_add(&smmu->iommu);

		...
	}

	static int tegra_smmu_remove(struct platform_device *pdev)
	{
		struct smmu_device *smmu;

		...

		iommu_del(&smmu->iommu);
	}

The implementation of iommu_add() and iommu_del() could be as simple as:

	static DEFINE_MUTEX(iommus_lock);
	static LIST_HEAD(iommus_list);

	void iommu_add(struct iommu *iommu)
	{
		INIT_LIST_HEAD(&iommu->list);
		mutex_lock(&iommus_lock);
		list_add_tail(&iommu->list, &iommus_list);
		mutex_unlock(&iommus_lock);
	}

	void iommu_del(struct iommu *iommu)
	{
		INIT_LIST_HEAD(&iommu->list);
		mutex_lock(&iommus_lock);
		list_del(&iommu->list);
		mutex_unlock(&iommus_lock);
	}

And then you can implement a lookup function to match an IOMMU to the
phandle, like so:

	struct iommu *of_find_iommu_by_node(struct device_node *np)
	{
		struct iommu *iommu;

		mutex_lock(&iommus_lock);

		list_for_each_entry(iommu, &iommus_list, list) {
			if (iommu->dev->of_node == np) {
				mutex_unlock(&iommus_lock);
				return iommu;
			}
		}

		mutex_unlock(&iommus_lock);
		return NULL;
	}

With that you can use of_find_iommu_by_node() in the loop to check
whether an IOMMU has really been registered.

Of course that all is somewhat intrusive and Joerg might have some
objections.

Thierry
Hiroshi DOYU Nov. 20, 2013, 2:03 p.m. UTC | #6
Thierry Reding <thierry.reding@gmail.com> wrote @ Wed, 20 Nov 2013 14:14:48 +0100:
....
> > Does the above mean the following?
> > 
> > int of_iommu_attach(struct device *dev)
> > {
> > 	int i;
> > 	struct of_phandle_args args;
> > 
> > 	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
> > 					       "#iommu-cells", i, &args)
> > 		if (!args->np->dev->driver)
> > 			return -EPROBE_DEFER;
> > 	return 0;
> > }
> 
> Not quite. The above would only check that a driver was bound to the
> device. But if that device isn't an IOMMU then this doesn't help you.

I thought that, as long as a device is a normal one, it's ok to let it
go to be populated. We only care about that, IOMMU devices comes
first, and clients should come later than IOMMUs, for population. In
the above if all IOMMUs are not populated, client devices are always
deferred. "args->np->dev" always points an IOMMU device in a
loop. Otherwise(no "iommus=") it goes out from the loop immediately.

+#define of_property_for_each_phandle_with_args(np, list, cells, i, args) \
+	 for (i = 0; !of_parse_phandle_with_args(np, list, cells, i, args); i++)

> The standard way to solve this issue is to add the IOMMU to a global
> list upon registration. Typically subsystems have some way to do that
> already,

Your implementation has some possibiity that we could construct any
hierarchy of IOMMUs.

> already, but it seems like IOMMU doesn't. It looks like that's one of
> the side-effects of the assumption that there will always only be a
> single IOMMU (per bus).

There's the following case at least we have already had.

"memory controller"---"smmu_a"---bus--+--"smmu_b"--"device_a"
                                      |
                                      |
                                      +--"device_b"

"smmu_b" isn't related to a bus at all.

> There's also no base object that IOMMU drivers implement, which is the
> way it's usually done in other subsystems. The absence of that makes it
> more difficult. I suspect the easiest way to do that would be to add a
> new type, something like this:
....

> With that you can use of_find_iommu_by_node() in the loop to check
> whether an IOMMU has really been registered.

Do you think if it's acceptable to see if a device is populated or not
via "dev->driver"[1]? Grant Likely proposed to use flag[2] instead of
struct device[2], though.

[1] http://lists.linuxfoundation.org/pipermail/iommu/2013-July/006023.html
[2] http://lists.linuxfoundation.org/pipermail/iommu/2013-October/006763.html
Stephen Warren Nov. 20, 2013, 4:30 p.m. UTC | #7
On 11/20/2013 07:03 AM, Hiroshi Doyu wrote:
> Thierry Reding <thierry.reding@gmail.com> wrote @ Wed, 20 Nov 2013 14:14:48 +0100:

(Yes, what Thierry said)

> ....
>>> Does the above mean the following?
>>>
>>> int of_iommu_attach(struct device *dev)
>>> {
>>> 	int i;
>>> 	struct of_phandle_args args;
>>>
>>> 	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
>>> 					       "#iommu-cells", i, &args)
>>> 		if (!args->np->dev->driver)
>>> 			return -EPROBE_DEFER;
>>> 	return 0;
>>> }
>>
>> Not quite. The above would only check that a driver was bound to the
>> device. But if that device isn't an IOMMU then this doesn't help you.
> 
> I thought that, as long as a device is a normal one, it's ok to let it
> go to be populated.

I don't understand what that means.

> We only care about that, IOMMU devices comes
> first, and clients should come later than IOMMUs, for population. In
> the above if all IOMMUs are not populated, client devices are always
> deferred. "args->np->dev" always points an IOMMU device in a
> loop. Otherwise(no "iommus=") it goes out from the loop immediately.

I'm not sure what that means. Perhaps you're sauying the dev->driver
isn't set until the driver is probe()d for the device, so if
dev->driver!=NULL, then we know the driver probed() successfully for it?
That does go most of the way, but as Thierry pointed out, it doesn't
guarantee that the dev->driver is an IOMMU driver, just that it's *some*
driver. Perhaps this won't actually make any difference in practice, but
AFAIK, all other subsystems do perform the strict check, so I don't see
why the IOMMU subsystem shouldn't.

...
> There's the following case at least we have already had.
> 
> "memory controller"---"smmu_a"---bus--+--"smmu_b"--"device_a"
>                                       |
>                                       |
>                                       +--"device_b"
> 
> "smmu_b" isn't related to a bus at all.

Yes, smmu_b is related to a bus.

Admittedly perhaps not a bus that the CPU can master transactions onto,
but there's still a (perhaps point-point) bus that device_a masters, and
smmu_b acts as a slave.

Note that not all buses in the diagram above are represented as bus
objects in the Linux kernel (or even in DT), since those tend to
concentrate only on representing buses that the CPU masters, rather than
additionally representing buses that only HW devices master.
Hiroshi DOYU Nov. 21, 2013, 9:01 a.m. UTC | #8
On Wed, 20 Nov 2013 17:30:35 +0100
Stephen Warren <swarren@wwwdotorg.org> wrote:
...
> >>> Does the above mean the following?
> >>>
> >>> int of_iommu_attach(struct device *dev)
> >>> {
> >>> 	int i;
> >>> 	struct of_phandle_args args;
> >>>
> >>> 	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
> >>> 					       "#iommu-cells", i, &args)
> >>> 		if (!args->np->dev->driver)
> >>> 			return -EPROBE_DEFER;
> >>> 	return 0;
> >>> }
> >>
> >> Not quite. The above would only check that a driver was bound to the
> >> device. But if that device isn't an IOMMU then this doesn't help you.
> > 
> > I thought that, as long as a device is a normal one, it's ok to let it
> > go to be populated.
> 
> I don't understand what that means.
> 
> > We only care about that, IOMMU devices comes
> > first, and clients should come later than IOMMUs, for population. In
> > the above if all IOMMUs are not populated, client devices are always
> > deferred. "args->np->dev" always points an IOMMU device in a
> > loop. Otherwise(no "iommus=") it goes out from the loop immediately.
> 
> I'm not sure what that means. Perhaps you're sauying the dev->driver
> isn't set until the driver is probe()d for the device, so if
> dev->driver!=NULL, then we know the driver probed() successfully for it?

Yes

> That does go most of the way, but as Thierry pointed out, it doesn't
> guarantee that the dev->driver is an IOMMU driver, just that it's *some*
> driver. Perhaps this won't actually make any difference in practice, but
> AFAIK, all other subsystems do perform the strict check, so I don't see
> why the IOMMU subsystem shouldn't.

Ok, now I got the one Thierry pointed out. Will implement that.
Grant Likely Nov. 21, 2013, 1:15 p.m. UTC | #9
On Tue, 19 Nov 2013 11:33:06 +0200, Hiroshi Doyu <hdoyu@nvidia.com> wrote:
> IOMMU devices on the bus need to be poplulated first, then iommu
> master devices are done later.
> 
> With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
> whether a device can be an iommu msater or not. If a device can, we'll
> defer to populate that device till an iommu device is populated. Once
> an iommu device is populated, "dev->bus->iommu_ops" is set in the
> bus. Then, those defered iommu master devices are populated and
> configured for IOMMU with help of the already populated iommu device
> via iommu_ops->add_device(). Multiple IOMMUs can be listed on this
> "iommus" binding so that a device can have multiple IOMMUs attached.
> 
> Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
> ---
> v5:
> Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".
> 
> v4:
> This is newly added, and the successor of the following RFC:
>   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
> ---
>  drivers/base/dd.c        |  5 +++++
>  drivers/iommu/of_iommu.c | 22 ++++++++++++++++++++++
>  include/linux/of_iommu.h |  7 +++++++
>  3 files changed, 34 insertions(+)
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 35fa368..6e892d4 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -25,6 +25,7 @@
>  #include <linux/async.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/pinctrl/devinfo.h>
> +#include <linux/of_iommu.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>  
>  	dev->driver = drv;
>  
> +	ret = of_iommu_attach(dev);
> +	if (ret)
> +		goto probe_failed;
> +
>  	/* If using pinctrl, bind pins now before probing */
>  	ret = pinctrl_bind_pins(dev);
>  	if (ret)

I'm very concerned about this approach. Hooking into the core probe
path for things like this is not going to scale well. I'm not thrilled
with the pinctrl hook being here either, but that is already merged. :-(
Also, hooking in here is going affect every single device device driver
probe path, and a large number of them are never, ever, going to have
iommu interactions.

There needs to be a less invasive way of doing what you want. I still
feel like the individual device drivers themselves need to be aware that
they might be hooking through an IOMMU. Or, if they are hooking through
a bus like PCIe, then have the PCIe bus defer creating child devices
until the IOMMU is configured and in place.

g.
Stephen Warren Nov. 21, 2013, 7:04 p.m. UTC | #10
On 11/21/2013 06:15 AM, Grant Likely wrote:
> On Tue, 19 Nov 2013 11:33:06 +0200, Hiroshi Doyu <hdoyu@nvidia.com> wrote:
>> IOMMU devices on the bus need to be poplulated first, then iommu
>> master devices are done later.
>>
>> With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
>> whether a device can be an iommu msater or not. If a device can, we'll
>> defer to populate that device till an iommu device is populated. Once
>> an iommu device is populated, "dev->bus->iommu_ops" is set in the
>> bus. Then, those defered iommu master devices are populated and
>> configured for IOMMU with help of the already populated iommu device
>> via iommu_ops->add_device(). Multiple IOMMUs can be listed on this
>> "iommus" binding so that a device can have multiple IOMMUs attached.
>>
>> Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
>> ---
>> v5:
>> Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".
>>
>> v4:
>> This is newly added, and the successor of the following RFC:
>>   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
>>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
>> ---
>>  drivers/base/dd.c        |  5 +++++
>>  drivers/iommu/of_iommu.c | 22 ++++++++++++++++++++++
>>  include/linux/of_iommu.h |  7 +++++++
>>  3 files changed, 34 insertions(+)
>>
>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>> index 35fa368..6e892d4 100644
>> --- a/drivers/base/dd.c
>> +++ b/drivers/base/dd.c
>> @@ -25,6 +25,7 @@
>>  #include <linux/async.h>
>>  #include <linux/pm_runtime.h>
>>  #include <linux/pinctrl/devinfo.h>
>> +#include <linux/of_iommu.h>
>>  
>>  #include "base.h"
>>  #include "power/power.h"
>> @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>>  
>>  	dev->driver = drv;
>>  
>> +	ret = of_iommu_attach(dev);
>> +	if (ret)
>> +		goto probe_failed;
>> +
>>  	/* If using pinctrl, bind pins now before probing */
>>  	ret = pinctrl_bind_pins(dev);
>>  	if (ret)
> 
> I'm very concerned about this approach. Hooking into the core probe
> path for things like this is not going to scale well. I'm not thrilled
> with the pinctrl hook being here either, but that is already merged. :-(
> Also, hooking in here is going affect every single device device driver
> probe path, and a large number of them are never, ever, going to have
> iommu interactions.
> 
> There needs to be a less invasive way of doing what you want. I still
> feel like the individual device drivers themselves need to be aware that
> they might be hooking through an IOMMU. Or, if they are hooking through
> a bus like PCIe, then have the PCIe bus defer creating child devices
> until the IOMMU is configured and in place.

I general though, couldn't any MMIO on-SoC device potentially be
affected by an IOMMU? The point of putting the call to of_iommu_attach()
here rather than inside individual driver's probe() is to avoid
requiring "every" driver having to paste more boiler-plate into probe().

Perhaps what we need is a flag in struct device to indicate that the
driver/device is MMIO, and hence potentially affected by an IOMMU. would
the following work better for you?

+	if (drv->is_mmio) { // let's bikeshed on the field name:-)
+		ret = of_iommu_attach(dev);
+		if (ret)
+			goto probe_failed;
+	}

I've often thought that struct device_driver (or similar) should declare
the set of resources that a device needs (e.g. a list of GPIO names,
regulator names, etc.), so that the driver core can parse all these
standard properties from DT/... before calling the custom probe()
function for all the non-standard stuff. This "is_mmio" flag would fit
into that model well.
Grant Likely Nov. 22, 2013, 7:41 a.m. UTC | #11
On Thu, 21 Nov 2013 12:04:18 -0700, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 11/21/2013 06:15 AM, Grant Likely wrote:
> > On Tue, 19 Nov 2013 11:33:06 +0200, Hiroshi Doyu <hdoyu@nvidia.com> wrote:
> >> IOMMU devices on the bus need to be poplulated first, then iommu
> >> master devices are done later.
> >>
> >> With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
> >> whether a device can be an iommu msater or not. If a device can, we'll
> >> defer to populate that device till an iommu device is populated. Once
> >> an iommu device is populated, "dev->bus->iommu_ops" is set in the
> >> bus. Then, those defered iommu master devices are populated and
> >> configured for IOMMU with help of the already populated iommu device
> >> via iommu_ops->add_device(). Multiple IOMMUs can be listed on this
> >> "iommus" binding so that a device can have multiple IOMMUs attached.
> >>
> >> Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
> >> ---
> >> v5:
> >> Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".
> >>
> >> v4:
> >> This is newly added, and the successor of the following RFC:
> >>   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
> >>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
> >> ---
> >>  drivers/base/dd.c        |  5 +++++
> >>  drivers/iommu/of_iommu.c | 22 ++++++++++++++++++++++
> >>  include/linux/of_iommu.h |  7 +++++++
> >>  3 files changed, 34 insertions(+)
> >>
> >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> >> index 35fa368..6e892d4 100644
> >> --- a/drivers/base/dd.c
> >> +++ b/drivers/base/dd.c
> >> @@ -25,6 +25,7 @@
> >>  #include <linux/async.h>
> >>  #include <linux/pm_runtime.h>
> >>  #include <linux/pinctrl/devinfo.h>
> >> +#include <linux/of_iommu.h>
> >>  
> >>  #include "base.h"
> >>  #include "power/power.h"
> >> @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> >>  
> >>  	dev->driver = drv;
> >>  
> >> +	ret = of_iommu_attach(dev);
> >> +	if (ret)
> >> +		goto probe_failed;
> >> +
> >>  	/* If using pinctrl, bind pins now before probing */
> >>  	ret = pinctrl_bind_pins(dev);
> >>  	if (ret)
> > 
> > I'm very concerned about this approach. Hooking into the core probe
> > path for things like this is not going to scale well. I'm not thrilled
> > with the pinctrl hook being here either, but that is already merged. :-(
> > Also, hooking in here is going affect every single device device driver
> > probe path, and a large number of them are never, ever, going to have
> > iommu interactions.
> > 
> > There needs to be a less invasive way of doing what you want. I still
> > feel like the individual device drivers themselves need to be aware that
> > they might be hooking through an IOMMU. Or, if they are hooking through
> > a bus like PCIe, then have the PCIe bus defer creating child devices
> > until the IOMMU is configured and in place.
> 
> I general though, couldn't any MMIO on-SoC device potentially be
> affected by an IOMMU? The point of putting the call to of_iommu_attach()
> here rather than inside individual driver's probe() is to avoid
> requiring "every" driver having to paste more boiler-plate into probe().

It seems more that IOMMU attachment is closer to being a property of the
bus rather than a property of the device itself. In that context it
would make more sense for the bus device to hold off child device
registration or probing until the IOMMU is available. That keeps the
logic out of both the core code and the individual device drivers.

g.
Stephen Warren Nov. 22, 2013, 5:35 p.m. UTC | #12
On 11/22/2013 12:41 AM, Grant Likely wrote:
> On Thu, 21 Nov 2013 12:04:18 -0700, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 11/21/2013 06:15 AM, Grant Likely wrote:
>>> On Tue, 19 Nov 2013 11:33:06 +0200, Hiroshi Doyu <hdoyu@nvidia.com> wrote:
>>>> IOMMU devices on the bus need to be poplulated first, then iommu
>>>> master devices are done later.
>>>>
>>>> With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
>>>> whether a device can be an iommu msater or not. If a device can, we'll
>>>> defer to populate that device till an iommu device is populated. Once
>>>> an iommu device is populated, "dev->bus->iommu_ops" is set in the
>>>> bus. Then, those defered iommu master devices are populated and
>>>> configured for IOMMU with help of the already populated iommu device
>>>> via iommu_ops->add_device(). Multiple IOMMUs can be listed on this
>>>> "iommus" binding so that a device can have multiple IOMMUs attached.
>>>>
>>>> Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
>>>> ---
>>>> v5:
>>>> Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".
>>>>
>>>> v4:
>>>> This is newly added, and the successor of the following RFC:
>>>>   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
>>>>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
>>>> ---
>>>>  drivers/base/dd.c        |  5 +++++
>>>>  drivers/iommu/of_iommu.c | 22 ++++++++++++++++++++++
>>>>  include/linux/of_iommu.h |  7 +++++++
>>>>  3 files changed, 34 insertions(+)
>>>>
>>>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>>>> index 35fa368..6e892d4 100644
>>>> --- a/drivers/base/dd.c
>>>> +++ b/drivers/base/dd.c
>>>> @@ -25,6 +25,7 @@
>>>>  #include <linux/async.h>
>>>>  #include <linux/pm_runtime.h>
>>>>  #include <linux/pinctrl/devinfo.h>
>>>> +#include <linux/of_iommu.h>
>>>>  
>>>>  #include "base.h"
>>>>  #include "power/power.h"
>>>> @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>>>>  
>>>>  	dev->driver = drv;
>>>>  
>>>> +	ret = of_iommu_attach(dev);
>>>> +	if (ret)
>>>> +		goto probe_failed;
>>>> +
>>>>  	/* If using pinctrl, bind pins now before probing */
>>>>  	ret = pinctrl_bind_pins(dev);
>>>>  	if (ret)
>>>
>>> I'm very concerned about this approach. Hooking into the core probe
>>> path for things like this is not going to scale well. I'm not thrilled
>>> with the pinctrl hook being here either, but that is already merged. :-(
>>> Also, hooking in here is going affect every single device device driver
>>> probe path, and a large number of them are never, ever, going to have
>>> iommu interactions.
>>>
>>> There needs to be a less invasive way of doing what you want. I still
>>> feel like the individual device drivers themselves need to be aware that
>>> they might be hooking through an IOMMU. Or, if they are hooking through
>>> a bus like PCIe, then have the PCIe bus defer creating child devices
>>> until the IOMMU is configured and in place.
>>
>> I general though, couldn't any MMIO on-SoC device potentially be
>> affected by an IOMMU? The point of putting the call to of_iommu_attach()
>> here rather than inside individual driver's probe() is to avoid
>> requiring "every" driver having to paste more boiler-plate into probe().
> 
> It seems more that IOMMU attachment is closer to being a property of the
> bus rather than a property of the device itself. In that context it
> would make more sense for the bus device to hold off child device
> registration or probing until the IOMMU is available. That keeps the
> logic out of both the core code and the individual device drivers.

The bus structure that DT and Linux know about is the register bus.
There's no reason that devices have to emit their master transactions
onto that same bus, or onto only that same bus.
Will Deacon Nov. 25, 2013, 5:39 p.m. UTC | #13
On Fri, Nov 22, 2013 at 05:35:58PM +0000, Stephen Warren wrote:
> On 11/22/2013 12:41 AM, Grant Likely wrote:
> > It seems more that IOMMU attachment is closer to being a property of the
> > bus rather than a property of the device itself. In that context it
> > would make more sense for the bus device to hold off child device
> > registration or probing until the IOMMU is available. That keeps the
> > logic out of both the core code and the individual device drivers.
> 
> The bus structure that DT and Linux know about is the register bus.
> There's no reason that devices have to emit their master transactions
> onto that same bus, or onto only that same bus.

Agreed. Dave (CC'd) and I actually had a lot of discussion around the DT bus
abstractions last week and we ended up with a binding that looked sane enough
to start a meaningful discussion in this area.

Dave -- care to post what we came up with? It certainly has a bunch of
overlap with the IOMMU problems being discussed here.

Will
diff mbox

Patch

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 35fa368..6e892d4 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -25,6 +25,7 @@ 
 #include <linux/async.h>
 #include <linux/pm_runtime.h>
 #include <linux/pinctrl/devinfo.h>
+#include <linux/of_iommu.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -273,6 +274,10 @@  static int really_probe(struct device *dev, struct device_driver *drv)
 
 	dev->driver = drv;
 
+	ret = of_iommu_attach(dev);
+	if (ret)
+		goto probe_failed;
+
 	/* If using pinctrl, bind pins now before probing */
 	ret = pinctrl_bind_pins(dev);
 	if (ret)
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index ee249bc..4aef2b2 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -20,6 +20,8 @@ 
 #include <linux/export.h>
 #include <linux/limits.h>
 #include <linux/of.h>
+#include <linux/device.h>
+#include <linux/iommu.h>
 
 /**
  * of_get_dma_window - Parse *dma-window property and returns 0 if found.
@@ -88,3 +90,23 @@  int of_get_dma_window(struct device_node *dn, const char *prefix, int index,
 	return 0;
 }
 EXPORT_SYMBOL_GPL(of_get_dma_window);
+
+int of_iommu_attach(struct device *dev)
+{
+	int i;
+	struct of_phandle_args args;
+	struct iommu_ops *ops = dev->bus->iommu_ops;
+
+	of_property_for_each_phandle_with_args(dev->of_node, "iommus",
+					       "#iommu-cells", i, &args) {
+		pr_debug("%s(i=%d) ops=%p %s\n",
+			 __func__, i, ops, dev_name(dev));
+
+		if (!ops)
+			return -EPROBE_DEFER;
+	}
+
+	if (i && ops->add_device)
+		return ops->add_device(dev);
+	return 0;
+}
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 51a560f..3457489 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -7,6 +7,8 @@  extern int of_get_dma_window(struct device_node *dn, const char *prefix,
 			     int index, unsigned long *busno, dma_addr_t *addr,
 			     size_t *size);
 
+extern int of_iommu_attach(struct device *dev);
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -16,6 +18,11 @@  static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
 	return -EINVAL;
 }
 
+static inline int of_iommu_attach(struct device *dev)
+{
+	return 0;
+}
+
 #endif	/* CONFIG_OF_IOMMU */
 
 #endif /* __OF_IOMMU_H */