diff mbox series

[RFC,kernel] vfio: Skip checking for IOMMU_CAP_CACHE_COHERENCY on POWER and more

Message ID 20220701061751.1955857-1-aik@ozlabs.ru (mailing list archive)
State New, archived
Headers show
Series [RFC,kernel] vfio: Skip checking for IOMMU_CAP_CACHE_COHERENCY on POWER and more | expand

Commit Message

Alexey Kardashevskiy July 1, 2022, 6:17 a.m. UTC
VFIO on POWER does not implement iommu_ops and therefore iommu_capable()
always returns false and __iommu_group_alloc_blocking_domain() always
fails.

iommu_group_claim_dma_owner() in setting container fails for the same
reason - it cannot allocate a domain.

This skips the check for platforms supporting VFIO without implementing
iommu_ops which to my best knowledge is POWER only.

This also allows setting container in absence of iommu_ops.

Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

Not quite sure what the proper small fix is and implementing iommu_ops
on POWER is not going to happen any time soon or ever :-/

---
 drivers/vfio/vfio.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

Comments

Tian, Kevin July 1, 2022, 7:10 a.m. UTC | #1
> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> Sent: Friday, July 1, 2022 2:18 PM
> 
> VFIO on POWER does not implement iommu_ops and therefore
> iommu_capable()
> always returns false and __iommu_group_alloc_blocking_domain() always
> fails.
> 
> iommu_group_claim_dma_owner() in setting container fails for the same
> reason - it cannot allocate a domain.
> 
> This skips the check for platforms supporting VFIO without implementing
> iommu_ops which to my best knowledge is POWER only.
> 
> This also allows setting container in absence of iommu_ops.
> 
> Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
> Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache
> coherence")
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> Not quite sure what the proper small fix is and implementing iommu_ops
> on POWER is not going to happen any time soon or ever :-/

I'm not sure how others feel about checking bus->iommu_ops outside
of iommu subsystem. This sounds a bit non-modular to me and it's not
obvious from the caller side why lacking of iommu_ops implies the two
relevant APIs are not usable.

Simply returning success when bus->iommu_ops==NULL in the two
APIs is also problematic in concept.

Probably what we really require is an indicator to the caller that the
related operations are irrelevant hence can be skipped when 
called on a particular iommu driver. This reminds me whether
 -EMEDIUMTYPE can be leveraged here. 

Nicolin introduced it in another series for detecting incompatible
domain attach. This errno is not currently used in iommu subsystem
and introduced as a benign error so the caller can check it to retry
another domain.

Similarly we may return -EMEDIUMTYPE when iommu_ops is NULL in
iommu_capable() and iommu_group_claim_dma_owner() then vfio
can safely move forward when this error is returned.

Thanks
Kevin
Robin Murphy July 1, 2022, 10:34 a.m. UTC | #2
On 2022-07-01 07:17, Alexey Kardashevskiy wrote:
> VFIO on POWER does not implement iommu_ops and therefore iommu_capable()
> always returns false and __iommu_group_alloc_blocking_domain() always
> fails.
> 
> iommu_group_claim_dma_owner() in setting container fails for the same
> reason - it cannot allocate a domain.
> 
> This skips the check for platforms supporting VFIO without implementing
> iommu_ops which to my best knowledge is POWER only.
> 
> This also allows setting container in absence of iommu_ops.
> 
> Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
> Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> Not quite sure what the proper small fix is and implementing iommu_ops
> on POWER is not going to happen any time soon or ever :-/

FWIW I did wonder about this when writing [1]. Is it appropriate to have 
any IOMMU API specifics outside the type1 code, or should these bits be 
abstracted behind vfio_iommu_driver_ops methods?

Robin.

[1] 
https://lore.kernel.org/linux-iommu/4ea5eb64246f1ee188d1a61c3e93b37756932eb7.1656092606.git.robin.murphy@arm.com/

> 
> ---
>   drivers/vfio/vfio.c | 13 +++++++------
>   1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 61e71c1154be..71408ab26cd0 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -605,7 +605,8 @@ int vfio_register_group_dev(struct vfio_device *device)
>   	 * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
>   	 * restore cache coherency.
>   	 */
> -	if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (device->dev->bus->iommu_ops &&
> +	    !iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
>   		return -EINVAL;
>   
>   	return __vfio_register_dev(device,
> @@ -934,7 +935,7 @@ static void __vfio_group_unset_container(struct vfio_group *group)
>   		driver->ops->detach_group(container->iommu_data,
>   					  group->iommu_group);
>   
> -	if (group->type == VFIO_IOMMU)
> +	if (group->type == VFIO_IOMMU && iommu_group_dma_owner_claimed(group->iommu_group))
>   		iommu_group_release_dma_owner(group->iommu_group);
>   
>   	group->container = NULL;
> @@ -1010,9 +1011,8 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
>   	}
>   
>   	if (group->type == VFIO_IOMMU) {
> -		ret = iommu_group_claim_dma_owner(group->iommu_group, f.file);
> -		if (ret)
> -			goto unlock_out;
> +		if (iommu_group_claim_dma_owner(group->iommu_group, f.file))
> +			pr_warn("Failed to claim DMA owner");
>   	}
>   
>   	driver = container->iommu_driver;
> @@ -1021,7 +1021,8 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
>   						group->iommu_group,
>   						group->type);
>   		if (ret) {
> -			if (group->type == VFIO_IOMMU)
> +			if (group->type == VFIO_IOMMU &&
> +			    iommu_group_dma_owner_claimed(group->iommu_group))
>   				iommu_group_release_dma_owner(
>   					group->iommu_group);
>   			goto unlock_out;
Tian, Kevin July 1, 2022, 11:40 p.m. UTC | #3
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Friday, July 1, 2022 6:34 PM
> 
> On 2022-07-01 07:17, Alexey Kardashevskiy wrote:
> > VFIO on POWER does not implement iommu_ops and therefore
> iommu_capable()
> > always returns false and __iommu_group_alloc_blocking_domain() always
> > fails.
> >
> > iommu_group_claim_dma_owner() in setting container fails for the same
> > reason - it cannot allocate a domain.
> >
> > This skips the check for platforms supporting VFIO without implementing
> > iommu_ops which to my best knowledge is POWER only.
> >
> > This also allows setting container in absence of iommu_ops.
> >
> > Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
> > Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache
> coherence")
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > ---
> >
> > Not quite sure what the proper small fix is and implementing iommu_ops
> > on POWER is not going to happen any time soon or ever :-/
> 
> FWIW I did wonder about this when writing [1]. Is it appropriate to have
> any IOMMU API specifics outside the type1 code, or should these bits be
> abstracted behind vfio_iommu_driver_ops methods?
> 

That is a good point. But an abstraction approach may not work as in
set_container() there may be no iommu driver attached to the container
yet. Probably a better way is just to do cache coherency check and
dma ownership claim both in type1 attach_group w/o adding new op.

But a bigger problem to me is how dma ownership is managed now on
POWER. Previously it was guarded by BUG_ON and vfio_group_viable().
Now with that removed while POWER doesn't support dma claim, does
it imply a regression on POWER platform now?

e.g. what should be returned for POWER in VFIO_GROUP_GET_STATUS:

	if (group->container)
		status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
		           VFIO_GROUP_FLAGS_VIABLE;
	else if (!iommu_group_dma_owner_claimed(group->iommu_group))
		status.flags |= VFIO_GROUP_FLAGS_VIABLE;

Thanks
Kevin
Jason Gunthorpe July 3, 2022, 12:22 a.m. UTC | #4
On Fri, Jul 01, 2022 at 11:40:26PM +0000, Tian, Kevin wrote:

> But a bigger problem to me is how dma ownership is managed now on
> POWER. Previously it was guarded by BUG_ON and vfio_group_viable().

Yes. Simply removing or deferring this can't be good.

I think the solution is to not do iommu operations if there are no
iommu_ops, including allocating a blocking domain or trying to change
the domain - but continue to do all the refcounting/etc.

It is just another crufty work around for platforms that don't support
the iommu framework..

Jason
Jason Gunthorpe July 3, 2022, 12:26 a.m. UTC | #5
On Fri, Jul 01, 2022 at 04:17:51PM +1000, Alexey Kardashevskiy wrote:
> VFIO on POWER does not implement iommu_ops and therefore iommu_capable()
> always returns false and __iommu_group_alloc_blocking_domain() always
> fails.
> 
> iommu_group_claim_dma_owner() in setting container fails for the same
> reason - it cannot allocate a domain.
> 
> This skips the check for platforms supporting VFIO without implementing
> iommu_ops which to my best knowledge is POWER only.
> 
> This also allows setting container in absence of iommu_ops.
> 
> Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
> Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> Not quite sure what the proper small fix is and implementing iommu_ops
> on POWER is not going to happen any time soon or ever :-/
> 
> ---
>  drivers/vfio/vfio.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 61e71c1154be..71408ab26cd0 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -605,7 +605,8 @@ int vfio_register_group_dev(struct vfio_device *device)
>  	 * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
>  	 * restore cache coherency.
>  	 */
> -	if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (device->dev->bus->iommu_ops &&
> +	    !iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
>  		return -EINVAL;

This change should be guarded by some
IS_ENABLED(CONFIG_VFIO_IOMMU_SPAPR_TCE)

We want to do the this check here on every other
configuration. Rejecting null iommu_ops is actually a desired side
effect.

>  	return __vfio_register_dev(device,
> @@ -934,7 +935,7 @@ static void __vfio_group_unset_container(struct vfio_group *group)
>  		driver->ops->detach_group(container->iommu_data,
>  					  group->iommu_group);
>  
> -	if (group->type == VFIO_IOMMU)
> +	if (group->type == VFIO_IOMMU && iommu_group_dma_owner_claimed(group->iommu_group))
>  		iommu_group_release_dma_owner(group->iommu_group);
>  
>  	group->container = NULL;
> @@ -1010,9 +1011,8 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
>  	}
>  
>  	if (group->type == VFIO_IOMMU) {
> -		ret = iommu_group_claim_dma_owner(group->iommu_group, f.file);
> -		if (ret)
> -			goto unlock_out;
> +		if (iommu_group_claim_dma_owner(group->iommu_group, f.file))
> +			pr_warn("Failed to claim DMA owner");

We certainly cannot ignore this. As my other email you should make
this succeed inside the iommu subsystem even though the ops are null.

Thanks,
Jason
Jason Gunthorpe July 5, 2022, 12:49 a.m. UTC | #6
On Fri, Jul 01, 2022 at 07:10:45AM +0000, Tian, Kevin wrote:
> > From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > Sent: Friday, July 1, 2022 2:18 PM
> > 
> > VFIO on POWER does not implement iommu_ops and therefore
> > iommu_capable()
> > always returns false and __iommu_group_alloc_blocking_domain() always
> > fails.
> > 
> > iommu_group_claim_dma_owner() in setting container fails for the same
> > reason - it cannot allocate a domain.
> > 
> > This skips the check for platforms supporting VFIO without implementing
> > iommu_ops which to my best knowledge is POWER only.
> > 
> > This also allows setting container in absence of iommu_ops.
> > 
> > Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
> > Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache
> > coherence")
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > ---
> > 
> > Not quite sure what the proper small fix is and implementing iommu_ops
> > on POWER is not going to happen any time soon or ever :-/
> 
> I'm not sure how others feel about checking bus->iommu_ops outside
> of iommu subsystem. This sounds a bit non-modular to me and it's not
> obvious from the caller side why lacking of iommu_ops implies the two
> relevant APIs are not usable.

The more I think about this, the more I think POWER should implement
partial iommu_ops to make this work. It would not support an UNMANAGED
domain, or default domains, but it would support blocking and the
coherency probe.

This makes everything work properly and keeps the mess out of the core
code.

It should not be hard to do if someone can share a bit about the ppc
code and test it..

Jason
Tian, Kevin July 7, 2022, 1:06 a.m. UTC | #7
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Tuesday, July 5, 2022 8:49 AM
> 
> On Fri, Jul 01, 2022 at 07:10:45AM +0000, Tian, Kevin wrote:
> > > From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > Sent: Friday, July 1, 2022 2:18 PM
> > >
> > > VFIO on POWER does not implement iommu_ops and therefore
> > > iommu_capable()
> > > always returns false and __iommu_group_alloc_blocking_domain()
> always
> > > fails.
> > >
> > > iommu_group_claim_dma_owner() in setting container fails for the same
> > > reason - it cannot allocate a domain.
> > >
> > > This skips the check for platforms supporting VFIO without implementing
> > > iommu_ops which to my best knowledge is POWER only.
> > >
> > > This also allows setting container in absence of iommu_ops.
> > >
> > > Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
> > > Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache
> > > coherence")
> > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > ---
> > >
> > > Not quite sure what the proper small fix is and implementing iommu_ops
> > > on POWER is not going to happen any time soon or ever :-/
> >
> > I'm not sure how others feel about checking bus->iommu_ops outside
> > of iommu subsystem. This sounds a bit non-modular to me and it's not
> > obvious from the caller side why lacking of iommu_ops implies the two
> > relevant APIs are not usable.
> 
> The more I think about this, the more I think POWER should implement
> partial iommu_ops to make this work. It would not support an UNMANAGED
> domain, or default domains, but it would support blocking and the
> coherency probe.

Yes, this sounds a better approach.

> 
> This makes everything work properly and keeps the mess out of the core
> code.
> 
> It should not be hard to do if someone can share a bit about the ppc
> code and test it..
> 
> Jason
diff mbox series

Patch

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 61e71c1154be..71408ab26cd0 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -605,7 +605,8 @@  int vfio_register_group_dev(struct vfio_device *device)
 	 * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
 	 * restore cache coherency.
 	 */
-	if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
+	if (device->dev->bus->iommu_ops &&
+	    !iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
 		return -EINVAL;
 
 	return __vfio_register_dev(device,
@@ -934,7 +935,7 @@  static void __vfio_group_unset_container(struct vfio_group *group)
 		driver->ops->detach_group(container->iommu_data,
 					  group->iommu_group);
 
-	if (group->type == VFIO_IOMMU)
+	if (group->type == VFIO_IOMMU && iommu_group_dma_owner_claimed(group->iommu_group))
 		iommu_group_release_dma_owner(group->iommu_group);
 
 	group->container = NULL;
@@ -1010,9 +1011,8 @@  static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	if (group->type == VFIO_IOMMU) {
-		ret = iommu_group_claim_dma_owner(group->iommu_group, f.file);
-		if (ret)
-			goto unlock_out;
+		if (iommu_group_claim_dma_owner(group->iommu_group, f.file))
+			pr_warn("Failed to claim DMA owner");
 	}
 
 	driver = container->iommu_driver;
@@ -1021,7 +1021,8 @@  static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 						group->iommu_group,
 						group->type);
 		if (ret) {
-			if (group->type == VFIO_IOMMU)
+			if (group->type == VFIO_IOMMU &&
+			    iommu_group_dma_owner_claimed(group->iommu_group))
 				iommu_group_release_dma_owner(
 					group->iommu_group);
 			goto unlock_out;