diff mbox series

[RFC,v2,08/10] vfio/type1: Add domain at(de)taching group helpers

Message ID 20180830040922.30426-9-baolu.lu@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series vfio/mdev: IOMMU aware mediated device | expand

Commit Message

Baolu Lu Aug. 30, 2018, 4:09 a.m. UTC
If a domain is attaching to a group which includes the
mediated devices, it should attach to the mdev parent
of each mdev. This adds a helper for attaching domain
to group, no matter a PCI physical device or mediated
devices which are derived from a PCI physical device.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 77 ++++++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 7 deletions(-)

Comments

Jean-Philippe Brucker Sept. 10, 2018, 4:23 p.m. UTC | #1
On 30/08/2018 05:09, Lu Baolu wrote:
> +static int vfio_detach_aux_domain(struct device *dev, void *data)
> +{
> +	struct iommu_domain *domain = data;
> +
> +	vfio_mdev_set_aux_domain(dev, NULL);
> +	iommu_detach_device(domain, dev->parent);

I think that's only going to work for vt-d, which doesn't use a
default_domain. For other drivers, iommu.c ends up calling
domain->ops->attach_dev(default_domain, dev) here, which isn't what we want.

That was my concern with reusing attach/detach_dev callbacks for PASID
management. The attach_dev code of IOMMU drivers already has to deal
with toggling between default and unmanaged domain. Dealing with more
state transitions in the same path is going to be difficult.

Thanks,
Jean
Baolu Lu Sept. 12, 2018, 5:02 a.m. UTC | #2
Hi,

On 09/11/2018 12:23 AM, Jean-Philippe Brucker wrote:
> On 30/08/2018 05:09, Lu Baolu wrote:
>> +static int vfio_detach_aux_domain(struct device *dev, void *data)
>> +{
>> +	struct iommu_domain *domain = data;
>> +
>> +	vfio_mdev_set_aux_domain(dev, NULL);
>> +	iommu_detach_device(domain, dev->parent);
> 
> I think that's only going to work for vt-d, which doesn't use a
> default_domain. For other drivers, iommu.c ends up calling
> domain->ops->attach_dev(default_domain, dev) here, which isn't what we want.

This doesn't break any functionality. It takes effect only if iommu
hardware supports finer granular translation and advertises it through
the API.

> 
> That was my concern with reusing attach/detach_dev callbacks for PASID
> management. The attach_dev code of IOMMU drivers already has to deal
> with toggling between default and unmanaged domain. Dealing with more
> state transitions in the same path is going to be difficult.

Let's consider it from the API point of view.

We have iommu_a(de)ttach_device() APIs to attach or detach a domain
to/from a device. We should avoid applying a limitation of "these are
only for single domain case, for multiple domains, use another API".

Best regards,
Lu Baolu
Jean-Philippe Brucker Sept. 12, 2018, 5:54 p.m. UTC | #3
On 12/09/2018 06:02, Lu Baolu wrote:
> Hi,
> 
> On 09/11/2018 12:23 AM, Jean-Philippe Brucker wrote:
>> On 30/08/2018 05:09, Lu Baolu wrote:
>>> +static int vfio_detach_aux_domain(struct device *dev, void *data)
>>> +{
>>> +	struct iommu_domain *domain = data;
>>> +
>>> +	vfio_mdev_set_aux_domain(dev, NULL);
>>> +	iommu_detach_device(domain, dev->parent);
>>
>> I think that's only going to work for vt-d, which doesn't use a
>> default_domain. For other drivers, iommu.c ends up calling
>> domain->ops->attach_dev(default_domain, dev) here, which isn't what we want.
> 
> This doesn't break any functionality. It takes effect only if iommu
> hardware supports finer granular translation and advertises it through
> the API.>
>> That was my concern with reusing attach/detach_dev callbacks for PASID
>> management. The attach_dev code of IOMMU drivers already has to deal
>> with toggling between default and unmanaged domain. Dealing with more
>> state transitions in the same path is going to be difficult.
> 
> Let's consider it from the API point of view.
> 
> We have iommu_a(de)ttach_device() APIs to attach or detach a domain
> to/from a device. We should avoid applying a limitation of "these are
> only for single domain case, for multiple domains, use another API".

That's an idealized view of the API, the actual semantics aren't as
simple. For IOMMU drivers that implement IOMMU_DOMAIN_DMA in their
domain_alloc operation (Arm SMMU, AMD IOMMU, ...), attach_dev attaches
an unmanaged domain, detach_dev reattaches the default DMA domain, and
the detach_dev IOMMU op is not called. We can't change that now, so it's
difficult to add more functionality to attach_dev and detach_dev.

Thanks,
Jean
Tian, Kevin Sept. 13, 2018, 12:35 a.m. UTC | #4
> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@arm.com]
> Sent: Thursday, September 13, 2018 1:54 AM
> 
> On 12/09/2018 06:02, Lu Baolu wrote:
> > Hi,
> >
> > On 09/11/2018 12:23 AM, Jean-Philippe Brucker wrote:
> >> On 30/08/2018 05:09, Lu Baolu wrote:
> >>> +static int vfio_detach_aux_domain(struct device *dev, void *data)
> >>> +{
> >>> +	struct iommu_domain *domain = data;
> >>> +
> >>> +	vfio_mdev_set_aux_domain(dev, NULL);
> >>> +	iommu_detach_device(domain, dev->parent);
> >>
> >> I think that's only going to work for vt-d, which doesn't use a
> >> default_domain. For other drivers, iommu.c ends up calling
> >> domain->ops->attach_dev(default_domain, dev) here, which isn't what
> we want.
> >
> > This doesn't break any functionality. It takes effect only if iommu
> > hardware supports finer granular translation and advertises it through
> > the API.>
> >> That was my concern with reusing attach/detach_dev callbacks for
> PASID
> >> management. The attach_dev code of IOMMU drivers already has to
> deal
> >> with toggling between default and unmanaged domain. Dealing with
> more
> >> state transitions in the same path is going to be difficult.
> >
> > Let's consider it from the API point of view.
> >
> > We have iommu_a(de)ttach_device() APIs to attach or detach a domain
> > to/from a device. We should avoid applying a limitation of "these are
> > only for single domain case, for multiple domains, use another API".
> 
> That's an idealized view of the API, the actual semantics aren't as
> simple. For IOMMU drivers that implement IOMMU_DOMAIN_DMA in their
> domain_alloc operation (Arm SMMU, AMD IOMMU, ...), attach_dev
> attaches
> an unmanaged domain, detach_dev reattaches the default DMA domain,
> and
> the detach_dev IOMMU op is not called. We can't change that now, so it's
> difficult to add more functionality to attach_dev and detach_dev.
> 

Now we have four possible usages on a(de)ttach_device:

1) Normal DMA API path i.e. IOMMU_DOMAIN_DMA, for DMA operations
in host/parent device driver;

2) VFIO passthru path, when the physical device is assigned to user space
or guest driver

3) mdev passthru path 1, when mdev is assigned to user space or guest
driver. Here mdev is a wrap on random PCI device

4) mdev passthru path 2, when mdev is assigned to user space or guest
driver. Here mdev is in a smaller granularity (e.g. tagged by PASID) as
supported by vt-d scalable mode

1) and 2) are existing usages. What you described (unmanaged vs. default)
falls into this category.

3) is actually same as 2) in IOMMU layer. sort of passing through DMA
capability to guest. Though there is still a parent driver, the parent driver
itself should not do DMA - i.e. unmanaged in host side.

4) is a new code path introduced in IOMMU layer, which is what
vfio_a(de)tach_aux_domain is trying to address. In this case parent device
can still have its own DMA capability, using existing code path 1) (thus
default domain still applies). and the parent device can have multiple 
aux domains (and associated structures), using code path 4).

I didn't see how this new code path interferes with default domain 
concept in existing path. In the future if ARM or other architectures
plan to support similar scalable mode capability, only the new code
path should be tweaked.

Thanks
Kevin
Jean-Philippe Brucker Sept. 14, 2018, 2:45 p.m. UTC | #5
On 13/09/2018 01:35, Tian, Kevin wrote:
>>> Let's consider it from the API point of view.
>>>
>>> We have iommu_a(de)ttach_device() APIs to attach or detach a domain
>>> to/from a device. We should avoid applying a limitation of "these are
>>> only for single domain case, for multiple domains, use another API".
>>
>> That's an idealized view of the API, the actual semantics aren't as
>> simple. For IOMMU drivers that implement IOMMU_DOMAIN_DMA in their
>> domain_alloc operation (Arm SMMU, AMD IOMMU, ...), attach_dev
>> attaches
>> an unmanaged domain, detach_dev reattaches the default DMA domain,
>> and
>> the detach_dev IOMMU op is not called. We can't change that now, so it's
>> difficult to add more functionality to attach_dev and detach_dev.
>>
> 
> Now we have four possible usages on a(de)ttach_device:
> 
> 1) Normal DMA API path i.e. IOMMU_DOMAIN_DMA, for DMA operations
> in host/parent device driver;
> 
> 2) VFIO passthru path, when the physical device is assigned to user space
> or guest driver
> 
> 3) mdev passthru path 1, when mdev is assigned to user space or guest
> driver. Here mdev is a wrap on random PCI device
> 
> 4) mdev passthru path 2, when mdev is assigned to user space or guest
> driver. Here mdev is in a smaller granularity (e.g. tagged by PASID) as
> supported by vt-d scalable mode
> 
> 1) and 2) are existing usages. What you described (unmanaged vs. default)
> falls into this category.
> 
> 3) is actually same as 2) in IOMMU layer. sort of passing through DMA
> capability to guest. Though there is still a parent driver, the parent driver
> itself should not do DMA - i.e. unmanaged in host side.
> 
> 4) is a new code path introduced in IOMMU layer, which is what
> vfio_a(de)tach_aux_domain is trying to address. In this case parent device
> can still have its own DMA capability, using existing code path 1) (thus
> default domain still applies). and the parent device can have multiple 
> aux domains (and associated structures), using code path 4).

We still have the problem that detach() doesn't propagate to the IOMMU
driver. Here's the current flow, without mdev:

1) At boot, the PF's (parent device) driver is probed, and the IOMMU
core sets up a default DMA domain, to be used by dma_alloc by the PF's
driver.
-> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)

2) or 3) Later userspace wants to control the PF, replaces the PF's
driver with vfio-pci. When userspace creates a container, VFIO allocates
a new IOMMU domain, and calls iommu_attach_group.
-> iommu.c calls domain->ops->attach_dev(domain, dev)
This detaches the PF from the default domain, and attaches it to the new
domain.

1) When the container is closed, VFIO calls iommu_detach_group. This
detaches the PF from its current domain, and attaches it back to the
default domain.
-> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)

-----
Now with mdev, we still attach the DMA domain in 1). Then:

4) Userspace opens an mdev and creates a container. VFIO enables aux
domain for the device. VFIO allocates a new IOMMU domain, and calls
iommu_attach_device(domain1, parent_dev).
-> iommu.c calls domain->ops->attach_dev(domain1, dev)
Because the device is in "aux domain" state, the IOMMU driver does not
detach from the default domain, but instead allocates a PASID and
attaches the aux domain. (Side note: for SMMU we couldn't detach from
the default domain, because we need it for MSI mappings.)

4) Userspace opens another mdev.
-> iommu.c calls domain->ops->attach_dev(domain2, dev)

1)? When the container is closed, VFIO calls
iommu_detach_device(domain2, parent_dev)
-> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)
Given that auxiliary domains are attached, the IOMMU driver could deduce
that this actually means "detach an auxiliary domain". But which one?

So the proposed interface doesn't seem to work as is. If we want to use
iommu_attach/detach_device for auxiliary domains, the existing behavior
of iommu.c, and IOMMU drivers that rely on it, have to change. Any
change I can think of right now seems more daunting than introducing a
different path for auxiliary domains, like iommu_attach_aux_domain for
example.

Thanks,
Jean
Tian, Kevin Sept. 15, 2018, 2:36 a.m. UTC | #6
> From: Jean-Philippe Brucker
> Sent: Friday, September 14, 2018 10:46 PM
> 
> On 13/09/2018 01:35, Tian, Kevin wrote:
> >>> Let's consider it from the API point of view.
> >>>
> >>> We have iommu_a(de)ttach_device() APIs to attach or detach a domain
> >>> to/from a device. We should avoid applying a limitation of "these are
> >>> only for single domain case, for multiple domains, use another API".
> >>
> >> That's an idealized view of the API, the actual semantics aren't as
> >> simple. For IOMMU drivers that implement IOMMU_DOMAIN_DMA in
> their
> >> domain_alloc operation (Arm SMMU, AMD IOMMU, ...), attach_dev
> >> attaches
> >> an unmanaged domain, detach_dev reattaches the default DMA domain,
> >> and
> >> the detach_dev IOMMU op is not called. We can't change that now, so
> it's
> >> difficult to add more functionality to attach_dev and detach_dev.
> >>
> >
> > Now we have four possible usages on a(de)ttach_device:
> >
> > 1) Normal DMA API path i.e. IOMMU_DOMAIN_DMA, for DMA
> operations
> > in host/parent device driver;
> >
> > 2) VFIO passthru path, when the physical device is assigned to user space
> > or guest driver
> >
> > 3) mdev passthru path 1, when mdev is assigned to user space or guest
> > driver. Here mdev is a wrap on random PCI device
> >
> > 4) mdev passthru path 2, when mdev is assigned to user space or guest
> > driver. Here mdev is in a smaller granularity (e.g. tagged by PASID) as
> > supported by vt-d scalable mode
> >
> > 1) and 2) are existing usages. What you described (unmanaged vs. default)
> > falls into this category.
> >
> > 3) is actually same as 2) in IOMMU layer. sort of passing through DMA
> > capability to guest. Though there is still a parent driver, the parent driver
> > itself should not do DMA - i.e. unmanaged in host side.
> >
> > 4) is a new code path introduced in IOMMU layer, which is what
> > vfio_a(de)tach_aux_domain is trying to address. In this case parent
> device
> > can still have its own DMA capability, using existing code path 1) (thus
> > default domain still applies). and the parent device can have multiple
> > aux domains (and associated structures), using code path 4).
> 
> We still have the problem that detach() doesn't propagate to the IOMMU
> driver. Here's the current flow, without mdev:
> 
> 1) At boot, the PF's (parent device) driver is probed, and the IOMMU
> core sets up a default DMA domain, to be used by dma_alloc by the PF's
> driver.
> -> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)
> 
> 2) or 3) Later userspace wants to control the PF, replaces the PF's
> driver with vfio-pci. When userspace creates a container, VFIO allocates
> a new IOMMU domain, and calls iommu_attach_group.
> -> iommu.c calls domain->ops->attach_dev(domain, dev)
> This detaches the PF from the default domain, and attaches it to the new
> domain.
> 
> 1) When the container is closed, VFIO calls iommu_detach_group. This
> detaches the PF from its current domain, and attaches it back to the
> default domain.
> -> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)
> 
> -----
> Now with mdev, we still attach the DMA domain in 1). Then:
> 
> 4) Userspace opens an mdev and creates a container. VFIO enables aux
> domain for the device. VFIO allocates a new IOMMU domain, and calls
> iommu_attach_device(domain1, parent_dev).
> -> iommu.c calls domain->ops->attach_dev(domain1, dev)
> Because the device is in "aux domain" state, the IOMMU driver does not
> detach from the default domain, but instead allocates a PASID and
> attaches the aux domain. (Side note: for SMMU we couldn't detach from
> the default domain, because we need it for MSI mappings.)

same for vtd. We don't require parent driver to detach its domain in 1).
Parent driver can have its own DMA capability when aux domain is
enabled in parallel

> 
> 4) Userspace opens another mdev.
> -> iommu.c calls domain->ops->attach_dev(domain2, dev)

another mdev in same VFIO container or different? I assume the
latter since you mentioned a new domain2.

> 
> 1)? When the container is closed, VFIO calls
> iommu_detach_device(domain2, parent_dev)
> -> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)
> Given that auxiliary domains are attached, the IOMMU driver could deduce
> that this actually means "detach an auxiliary domain". But which one?

I didn't get this one. There is no need to stick to 1) behavior for
4), i.e. below is expected:
	domain2->ops->detach_dev(domain2, dev)

why cannot ARM implement a detach_dev for aux_domain too? My
feeling is that default domain twist is only for switch between 1/2/3
in concept. 

> 
> So the proposed interface doesn't seem to work as is. If we want to use
> iommu_attach/detach_device for auxiliary domains, the existing behavior
> of iommu.c, and IOMMU drivers that rely on it, have to change. Any
> change I can think of right now seems more daunting than introducing a
> different path for auxiliary domains, like iommu_attach_aux_domain for
> example.
> 

introducing *aux* specific API will cause different VFIO code path to
handle RID-based and PASID-based mdev, since RID-based still needs
to use normal attach_domain that way. well, this argument is not very strong
in itself, if indeed proposed way doesn't work for ARM. But let's see
whether it is the case with more understanding of your actual concern.

Thanks
Kevin
Jean-Philippe Brucker Sept. 18, 2018, 3:52 p.m. UTC | #7
On 15/09/2018 03:36, Tian, Kevin wrote:
>> 4) Userspace opens another mdev.
>> -> iommu.c calls domain->ops->attach_dev(domain2, dev)
> 
> another mdev in same VFIO container or different? I assume the
> latter since you mentioned a new domain2.

I was thinking a different VFIO container actually. I used domain2 to
try to make the example clearer

>> 1)? When the container is closed, VFIO calls
>> iommu_detach_device(domain2, parent_dev)
>> -> iommu.c calls default_domain->ops->attach_dev(default_domain, dev)
>> Given that auxiliary domains are attached, the IOMMU driver could deduce
>> that this actually means "detach an auxiliary domain". But which one?
> 
> I didn't get this one. There is no need to stick to 1) behavior for
> 4), i.e. below is expected:
>         domain2->ops->detach_dev(domain2, dev)

But in order to get that, the IOMMU core needs to know that domain2 is
auxiliary. Otherwise, detach_dev is never called when a default_domain
is present for the parent dev.

I guess one solution is to add an "auxiliary" attribute to iommu_domain,
so __iommu_detach_group would do something like:

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7113fe398b70..2b3e9b91aee7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1786,10 +1786,11 @@ static void __iommu_detach_group(struct
iommu_domain *domain,
 {
 	int ret;

-	if (!group->default_domain) {
+	if (!group->default_domain || domain->auxiliary) {
 		__iommu_group_for_each_dev(group, domain,
 					   iommu_group_do_detach_device);
-		group->domain = NULL;
+		if (!domain->auxiliary)
+			group->domain = NULL;
 		return;
 	}

Not sure who would set this "auxiliary" attribute... Maybe the IOMMU
driver, when attaching the domain to a device that has auxiliary mode
enabled?

> why cannot ARM implement a detach_dev for aux_domain too? My
> feeling is that default domain twist is only for switch between 1/2/3
> in concept.

If the core actually calls it, we can implement detach_dev :) The
problem is that the core never calls detach_dev when default_domain is
present (affects any IOMMU driver that relies on default_domain,
including AMD), and even in case 4) the default_domain is present for
the parent device

>> So the proposed interface doesn't seem to work as is. If we want to use
>> iommu_attach/detach_device for auxiliary domains, the existing behavior
>> of iommu.c, and IOMMU drivers that rely on it, have to change. Any
>> change I can think of right now seems more daunting than introducing a
>> different path for auxiliary domains, like iommu_attach_aux_domain for
>> example.
>> 
> 
> introducing *aux* specific API will cause different VFIO code path to
> handle RID-based and PASID-based mdev, since RID-based still needs
> to use normal attach_domain that way.

The PASID-based mdev still requires a special case to retrieve the
allocated PASID and program it in the parent device, so VFIO will need
to know the difference between the two

Thanks,
Jean

> well, this argument is not very strong
> in itself, if indeed proposed way doesn't work for ARM. But let's see
> whether it is the case with more understanding of your actual concern.
> 
> Thanks
> Kevin
Baolu Lu Sept. 19, 2018, 2:10 a.m. UTC | #8
Hi,

On 09/19/2018 07:26 AM, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@arm.com]
>> Sent: Tuesday, September 18, 2018 11:52 PM
>>
>> On 15/09/2018 03:36, Tian, Kevin wrote:
>>>> 4) Userspace opens another mdev.
>>>> -> iommu.c calls domain->ops->attach_dev(domain2, dev)
>>>
>>> another mdev in same VFIO container or different? I assume the
>>> latter since you mentioned a new domain2.
>>
>> I was thinking a different VFIO container actually. I used domain2 to
>> try to make the example clearer
>>
>>>> 1)? When the container is closed, VFIO calls
>>>> iommu_detach_device(domain2, parent_dev)
>>>> -> iommu.c calls default_domain->ops->attach_dev(default_domain,
>> dev)
>>>> Given that auxiliary domains are attached, the IOMMU driver could
>> deduce
>>>> that this actually means "detach an auxiliary domain". But which one?
>>>
>>> I didn't get this one. There is no need to stick to 1) behavior for
>>> 4), i.e. below is expected:
>>>          domain2->ops->detach_dev(domain2, dev)
>>
>> But in order to get that, the IOMMU core needs to know that domain2 is
>> auxiliary. Otherwise, detach_dev is never called when a default_domain
>> is present for the parent dev.
>>
>> I guess one solution is to add an "auxiliary" attribute to iommu_domain,
>> so __iommu_detach_group would do something like:
> 
> this doesn't work. same domain can be also attached to another physical
> device as non-aux domain (e.g. passthrough) at the same time (vfio-pci
> device and vfio-mdev device in same container), then default domain
> tweak is required in that case. "aux" takes effect only per-device, not
> per-domain.

If we have below APIs for aux domain (the API names are just for
discussion purpose, subject to change):

iommu_querry_aux_domain_capability(dev)
iommu_enable_aux_domain(dev)
iommu_disable_aux_domain(dev)
iommu_check_aux_domain_status(dev)

then, we could do this like below:

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ab3d7d3b1583..3bfb652c78e8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1469,12 +1469,31 @@ static int iommu_group_do_detach_device(struct 
device *dev, void *data)
         return 0;
  }

+static int iommu_group_check_aux_domain(struct device *dev, void *data)
+{
+       const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+       if (ops && ops->check_auxd)
+               return !ops->check_auxd(dev);
+
+       return -EINVAL;
+}
+
+/*
+ *  Check whether devices in @group have aux domain enabled.
+ */
+static int iommu_group_aux_domain_enabled(struct iommu_group *group)
+{
+       return __iommu_group_for_each_dev(group, NULL,
+                                         iommu_group_check_aux_domain);
+}
+
  static void __iommu_detach_group(struct iommu_domain *domain,
                                  struct iommu_group *group)
  {
         int ret;

-       if (!group->default_domain) {
+       if (!group->default_domain || 
iommu_group_aux_domain_enabled(group)) {
                 __iommu_group_for_each_dev(group, domain,
                                            iommu_group_do_detach_device);
                 group->domain = NULL;

Best regards,
Lu Baolu

> 
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 7113fe398b70..2b3e9b91aee7 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1786,10 +1786,11 @@ static void __iommu_detach_group(struct
>> iommu_domain *domain,
>>   {
>>   	int ret;
>>
>> -	if (!group->default_domain) {
>> +	if (!group->default_domain || domain->auxiliary) {
>>   		__iommu_group_for_each_dev(group, domain,
>>   					   iommu_group_do_detach_device);
>> -		group->domain = NULL;
>> +		if (!domain->auxiliary)
>> +			group->domain = NULL;
>>   		return;
>>   	}
>>
>> Not sure who would set this "auxiliary" attribute... Maybe the IOMMU
>> driver, when attaching the domain to a device that has auxiliary mode
>> enabled?
>>
>>> why cannot ARM implement a detach_dev for aux_domain too? My
>>> feeling is that default domain twist is only for switch between 1/2/3
>>> in concept.
>>
>> If the core actually calls it, we can implement detach_dev :) The
>> problem is that the core never calls detach_dev when default_domain is
>> present (affects any IOMMU driver that relies on default_domain,
>> including AMD), and even in case 4) the default_domain is present for
>> the parent device
> 
> Then can we change that core logic so detach_dev is invoked in all
> cases? yes there will be some changes in vendor drivers, but I expect
> this change trivial (especially considering the gain in IOMMU API
> simplicity side as described below).
> 
>>
>>>> So the proposed interface doesn't seem to work as is. If we want to use
>>>> iommu_attach/detach_device for auxiliary domains, the existing
>> behavior
>>>> of iommu.c, and IOMMU drivers that rely on it, have to change. Any
>>>> change I can think of right now seems more daunting than introducing a
>>>> different path for auxiliary domains, like iommu_attach_aux_domain for
>>>> example.
>>>>
>>>
>>> introducing *aux* specific API will cause different VFIO code path to
>>> handle RID-based and PASID-based mdev, since RID-based still needs
>>> to use normal attach_domain that way.
>>
>> The PASID-based mdev still requires a special case to retrieve the
>> allocated PASID and program it in the parent device, so VFIO will need
>> to know the difference between the two
>>
> 
> that retrieve/program is down by parent driver, instead of VFIO.
> 
> Thanks
> Kevin
>
Jean-Philippe Brucker Sept. 25, 2018, 5:55 p.m. UTC | #9
On 19/09/2018 00:26, Tian, Kevin wrote:
>> If the core actually calls it, we can implement detach_dev :) The
>> problem is that the core never calls detach_dev when default_domain is
>> present (affects any IOMMU driver that relies on default_domain,
>> including AMD), and even in case 4) the default_domain is present for
>> the parent device
> 
> Then can we change that core logic so detach_dev is invoked in all
> cases? yes there will be some changes in vendor drivers, but I expect
> this change trivial (especially considering the gain in IOMMU API
> simplicity side as described below).

Thinking more about this, there might be a way that doesn't require too
much rewriting of IOMMU drivers. When VFIO calls iommu_detach_group to
detach a vfio-pci device from its domain, it could do detach_dev(dev,
old_domain) followed by attach_dev(dev, default_domain) instead of only
attach_dev(dev, default_domain), so:

__iommu_attach_group(grp, unmanaged_domain)
  ops->attach_dev(dev, unmanaged_domain)
    -> IOMMU driver detaches from default_domain only if the dev isn't
       in auxiliary mode

__iommu_detach_group(grp, unmanaged_domain)
  if (ops->detach_dev)
      ops->detach_dev(dev, unmanaged_domain)
  ops->attach_dev(dev, default_domain)
    -> IOMMU driver ignores this if still attached to default_domain

It's not trivial: since half the IOMMU drivers seem to be using default
domain now, their detach_dev callback might have been dead code for a
while (I can't find a path where it still gets called), so re-enabling
it requires some attention. However, since most of them won't care about
aux domains, maybe we could just remove their detach_dev pointer.

The resulting logic of attach/detach_dev in IOMMU drivers that do use
aux domains still seems unnecessarily convoluted. It could as well be
done with separate attach_aux/detach_aux functions. If you want to keep
it that way though, I think the above could work for us and I'll try to
write an RFC.

Thanks,
Jean
Baolu Lu Sept. 26, 2018, 2:11 a.m. UTC | #10
Hi,

On 09/26/2018 01:55 AM, Jean-Philippe Brucker wrote:
> On 19/09/2018 00:26, Tian, Kevin wrote:
>>> If the core actually calls it, we can implement detach_dev :) The
>>> problem is that the core never calls detach_dev when default_domain is
>>> present (affects any IOMMU driver that relies on default_domain,
>>> including AMD), and even in case 4) the default_domain is present for
>>> the parent device
>>
>> Then can we change that core logic so detach_dev is invoked in all
>> cases? yes there will be some changes in vendor drivers, but I expect
>> this change trivial (especially considering the gain in IOMMU API
>> simplicity side as described below).
> 
> Thinking more about this, there might be a way that doesn't require too
> much rewriting of IOMMU drivers. When VFIO calls iommu_detach_group to
> detach a vfio-pci device from its domain, it could do detach_dev(dev,
> old_domain) followed by attach_dev(dev, default_domain) instead of only
> attach_dev(dev, default_domain), so:
> 
> __iommu_attach_group(grp, unmanaged_domain)
>    ops->attach_dev(dev, unmanaged_domain)
>      -> IOMMU driver detaches from default_domain only if the dev isn't
>         in auxiliary mode
> 
> __iommu_detach_group(grp, unmanaged_domain)
>    if (ops->detach_dev)
>        ops->detach_dev(dev, unmanaged_domain)
>    ops->attach_dev(dev, default_domain)
>      -> IOMMU driver ignores this if still attached to default_domain
> 

This looks good to me.

Just a minor change to make it more readable.

__iommu_detach_group(grp, unmanaged_domain)
     if (ops->detach_dev)
         ops->detach_dev(dev, unmanaged_domain)
     if (default_domain)
         ops->attach_dev(dev, default_domain)

Best regards,
Lu Baolu

> It's not trivial: since half the IOMMU drivers seem to be using default
> domain now, their detach_dev callback might have been dead code for a
> while (I can't find a path where it still gets called), so re-enabling
> it requires some attention. However, since most of them won't care about
> aux domains, maybe we could just remove their detach_dev pointer.
> 
> The resulting logic of attach/detach_dev in IOMMU drivers that do use
> aux domains still seems unnecessarily convoluted. It could as well be
> done with separate attach_aux/detach_aux functions. If you want to keep
> it that way though, I think the above could work for us and I'll try to
> write an RFC.
> 
> Thanks,
> Jean
>
diff mbox series

Patch

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d9fd3188615d..89e2e6123223 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -91,6 +91,9 @@  struct vfio_dma {
 struct vfio_group {
 	struct iommu_group	*iommu_group;
 	struct list_head	next;
+	bool			attach_parent;	/* An mdev group with domain
+						 * attached to parent
+						 */
 };
 
 /*
@@ -1327,6 +1330,66 @@  static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
 	return ret;
 }
 
+static int vfio_mdev_set_aux_domain(struct device *dev,
+				    struct iommu_domain *domain)
+{
+	int (*fn)(struct device *dev, void *domain);
+	int ret;
+
+	fn = symbol_get(mdev_set_domain);
+	if (fn) {
+		ret = fn(dev, domain);
+		symbol_put(mdev_set_domain);
+
+		return ret;
+	}
+
+	return -EINVAL;
+}
+
+static int vfio_attach_aux_domain(struct device *dev, void *data)
+{
+	struct iommu_domain *domain = data;
+	int ret;
+
+	ret = vfio_mdev_set_aux_domain(dev, domain);
+	if (ret)
+		return ret;
+
+	return iommu_attach_device(domain, dev->parent);
+}
+
+static int vfio_detach_aux_domain(struct device *dev, void *data)
+{
+	struct iommu_domain *domain = data;
+
+	vfio_mdev_set_aux_domain(dev, NULL);
+	iommu_detach_device(domain, dev->parent);
+
+	return 0;
+}
+
+static int vfio_iommu_attach_group(struct vfio_domain *domain,
+				   struct vfio_group *group)
+{
+	if (group->attach_parent)
+		return iommu_group_for_each_dev(group->iommu_group,
+						domain->domain,
+						vfio_attach_aux_domain);
+	else
+		return iommu_attach_group(domain->domain, group->iommu_group);
+}
+
+static void vfio_iommu_detach_group(struct vfio_domain *domain,
+				    struct vfio_group *group)
+{
+	if (group->attach_parent)
+		iommu_group_for_each_dev(group->iommu_group, domain->domain,
+					 vfio_detach_aux_domain);
+	else
+		iommu_detach_group(domain->domain, group->iommu_group);
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
@@ -1402,7 +1465,7 @@  static int vfio_iommu_type1_attach_group(void *iommu_data,
 			goto out_domain;
 	}
 
-	ret = iommu_attach_group(domain->domain, iommu_group);
+	ret = vfio_iommu_attach_group(domain, group);
 	if (ret)
 		goto out_domain;
 
@@ -1434,8 +1497,8 @@  static int vfio_iommu_type1_attach_group(void *iommu_data,
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		if (d->domain->ops == domain->domain->ops &&
 		    d->prot == domain->prot) {
-			iommu_detach_group(domain->domain, iommu_group);
-			if (!iommu_attach_group(d->domain, iommu_group)) {
+			vfio_iommu_detach_group(domain, group);
+			if (!vfio_iommu_attach_group(d, group)) {
 				list_add(&group->next, &d->group_list);
 				iommu_domain_free(domain->domain);
 				kfree(domain);
@@ -1443,7 +1506,7 @@  static int vfio_iommu_type1_attach_group(void *iommu_data,
 				return 0;
 			}
 
-			ret = iommu_attach_group(domain->domain, iommu_group);
+			ret = vfio_iommu_attach_group(domain, group);
 			if (ret)
 				goto out_domain;
 		}
@@ -1469,7 +1532,7 @@  static int vfio_iommu_type1_attach_group(void *iommu_data,
 	return 0;
 
 out_detach:
-	iommu_detach_group(domain->domain, iommu_group);
+	vfio_iommu_detach_group(domain, group);
 out_domain:
 	iommu_domain_free(domain->domain);
 out_free:
@@ -1560,7 +1623,7 @@  static void vfio_iommu_type1_detach_group(void *iommu_data,
 		if (!group)
 			continue;
 
-		iommu_detach_group(domain->domain, iommu_group);
+		vfio_iommu_detach_group(domain, group);
 		list_del(&group->next);
 		kfree(group);
 		/*
@@ -1625,7 +1688,7 @@  static void vfio_release_domain(struct vfio_domain *domain, bool external)
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
 		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+			vfio_iommu_detach_group(domain, group);
 		list_del(&group->next);
 		kfree(group);
 	}