[iommufd,4/9] iommufd: Convert to msi_device_has_secure_msi()

Message ID 4-v1-9e466539c244+47b5-secure_msi_jgg@nvidia.com (mailing list archive)
State New, archived
Series Remove IOMMU_CAP_INTR_REMAP

Commit Message

Jason Gunthorpe Dec. 8, 2022, 8:26 p.m. UTC
iommufd already has a struct device, so trivial conversion.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Tian, Kevin Dec. 9, 2022, 6:01 a.m. UTC | #1
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, December 9, 2022 4:27 AM
>
> @@ -170,7 +170,7 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
>  	 * interrupt outside this iommufd context.
>  	 */
>  	if (!device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP) &&
> -	    !irq_domain_check_msi_remap()) {
> +	    !msi_device_has_secure_msi(idev->dev)) {
>  		if (!allow_unsafe_interrupts)
>  			return -EPERM;
> 

this is where iommufd and vfio diverge.

vfio has a check to ensure all devices in the group have secure_msi.

but iommufd only imposes the check per device.
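
For illustration, a group-wide check in the vfio style could look roughly
like this (iommu_group_for_each_dev() and msi_device_has_secure_msi() are
the real helpers; the two wrapper names are hypothetical, a sketch only):

static int secure_msi_missing(struct device *dev, void *unused)
{
	/* A non-zero return stops the group walk and propagates out */
	return !msi_device_has_secure_msi(dev);
}

/* True only if every device in the group has secure MSI */
static bool group_has_secure_msi(struct iommu_group *group)
{
	return !iommu_group_for_each_dev(group, NULL, secure_msi_missing);
}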
Jason Gunthorpe Dec. 9, 2022, 2:47 p.m. UTC | #2
On Fri, Dec 09, 2022 at 06:01:14AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, December 9, 2022 4:27 AM
> >
> > @@ -170,7 +170,7 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
> >  	 * interrupt outside this iommufd context.
> >  	 */
> >  	if (!device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP) &&
> > -	    !irq_domain_check_msi_remap()) {
> > +	    !msi_device_has_secure_msi(idev->dev)) {
> >  		if (!allow_unsafe_interrupts)
> >  			return -EPERM;
> > 
> 
> this is where iommufd and vfio diverge.
> 
> vfio has a check to ensure all devices in the group have secure_msi.
> 
> but iommufd only imposes the check per device.

Ah, that is an interesting, though pedantic point.

So, let us do this and address the other point about vfio as well:

+++ b/drivers/iommu/iommu.c
@@ -941,6 +941,28 @@ static bool iommu_is_attach_deferred(struct device *dev)
        return false;
 }
 
+static int iommu_group_add_device_list(struct iommu_group *group,
+                                      struct group_device *group_dev)
+{
+       struct group_device *existing;
+
+       lockdep_assert_held(&group->mutex);
+
+       existing = list_first_entry_or_null(&group->devices,
+                                           struct group_device, list);
+
+       /*
+        * It is a driver bug to create groups with different irq_domain
+        * properties.
+        */
+       if (existing && msi_device_has_isolated_msi(existing->dev) !=
+                               msi_device_has_isolated_msi(group_dev->dev))
+               return -EINVAL;
+
+       list_add_tail(&group_dev->list, &group->devices);
+       return 0;
+}
+
 /**
  * iommu_group_add_device - add a device to an iommu group
  * @group: the group into which to add the device (reference should be held)
@@ -992,7 +1014,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
        dev->iommu_group = group;
 
        mutex_lock(&group->mutex);
-       list_add_tail(&device->list, &group->devices);
+       iommu_group_add_device_list(group, device);
        if (group->domain  && !iommu_is_attach_deferred(dev))
                ret = __iommu_attach_device(group->domain, dev);
        mutex_unlock(&group->mutex);
Robin Murphy Dec. 9, 2022, 4:44 p.m. UTC | #3
On 2022-12-09 14:47, Jason Gunthorpe wrote:
> On Fri, Dec 09, 2022 at 06:01:14AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Friday, December 9, 2022 4:27 AM
>>>
>>> @@ -170,7 +170,7 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
>>>   	 * interrupt outside this iommufd context.
>>>   	 */
>>>   	if (!device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP) &&
>>> -	    !irq_domain_check_msi_remap()) {
>>> +	    !msi_device_has_secure_msi(idev->dev)) {
>>>   		if (!allow_unsafe_interrupts)
>>>   			return -EPERM;
>>>
>>
>> this is where iommufd and vfio diverge.
>>
>> vfio has a check to ensure all devices in the group have secure_msi.
>>
>> but iommufd only imposes the check per device.
> 
> Ah, that is an interesting, though pedantic point.
> 
> So, let us do this and address the other point about vfio as well:
> 
> +++ b/drivers/iommu/iommu.c
> @@ -941,6 +941,28 @@ static bool iommu_is_attach_deferred(struct device *dev)
>          return false;
>   }
>   
> +static int iommu_group_add_device_list(struct iommu_group *group,
> +                                      struct group_device *group_dev)
> +{
> +       struct group_device *existing;
> +
> +       lockdep_assert_held(&group->mutex);
> +
> +       existing = list_first_entry_or_null(&group->devices,
> +                                           struct group_device, list);
> +
> +       /*
> +        * It is a driver bug to create groups with different irq_domain
> +        * properties.
> +        */
> +       if (existing && msi_device_has_isolated_msi(existing->dev) !=
> +                               msi_device_has_isolated_msi(group_dev->dev))
> +               return -EINVAL;

Isn't the problem with this that it's super-early, and a device's MSI 
domain may not actually be resolved until someone starts requesting MSIs 
for it? Maybe Thomas' ongoing per-device stuff changes that, but I'm not 
sure :/

Furthermore, even if the system does have a topology with multiple 
heterogeneous MSI controllers reachable by devices behind the same 
IOMMU, and the IRQ layer decides to choose each device's MSI parent at 
random, that's hardly the IOMMU layer's problem, and shouldn't prevent 
the devices being usable by kernel drivers for whom MSI isolation 
doesn't matter.

Thanks,
Robin.

> +
> +       list_add_tail(&group_dev->list, &group->devices);
> +       return 0;
> +}
> +
>   /**
>    * iommu_group_add_device - add a device to an iommu group
>    * @group: the group into which to add the device (reference should be held)
> @@ -992,7 +1014,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
>          dev->iommu_group = group;
>   
>          mutex_lock(&group->mutex);
> -       list_add_tail(&device->list, &group->devices);
> +       iommu_group_add_device_list(group, device);
>          if (group->domain  && !iommu_is_attach_deferred(dev))
>                  ret = __iommu_attach_device(group->domain, dev);
>          mutex_unlock(&group->mutex);
Jason Gunthorpe Dec. 9, 2022, 5:38 p.m. UTC | #4
On Fri, Dec 09, 2022 at 04:44:06PM +0000, Robin Murphy wrote:

> Isn't the problem with this that it's super-early, and a device's MSI domain
> may not actually be resolved until someone starts requesting MSIs for it?
> Maybe Thomas' ongoing per-device stuff changes that, but I'm not
> sure :/

Yes, this looks correct. OK, so I will go with Kevin's suggestion.

Thanks!

> Furthermore, even if the system does have a topology with multiple
> heterogeneous MSI controllers reachable by devices behind the same
> IOMMU,

Sure, but this doesn't exist, and my thinking was to put a big red flag
here in case someone actually wants to try to do it - most likely it
is a bug, not a real thing.

I re-did things to use this new function; iommufd and vfio just
trivially call it:

+/**
+ * iommu_group_has_isolated_msi() - Compute msi_device_has_isolated_msi()
+ *       for a group
+ * @group: Group to query
+ *
+ * IOMMU groups should not have differing values of
+ * msi_device_has_isolated_msi() for devices in a group. However nothing
+ * directly prevents this, so ensure mistakes don't result in isolation failures
+ * by checking that all the devices are the same.
+ */
+bool iommu_group_has_isolated_msi(struct iommu_group *group)
+{
+	struct group_device *group_dev;
+	bool ret = true;
+
+	mutex_lock(&group->mutex);
+	list_for_each_entry(group_dev, &group->devices, list)
+	ret &= msi_device_has_isolated_msi(group_dev->dev) ||
+		       device_iommu_capable(group_dev->dev,
+					    IOMMU_CAP_INTR_REMAP);
+	mutex_unlock(&group->mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_has_isolated_msi);
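
The call site in iommufd then collapses to roughly the following sketch
(assuming idev->group holds the device's iommu_group, as the iommufd code
of this period does); because the helper already folds in the per-device
IOMMU_CAP_INTR_REMAP test, the caller can drop its own capability check:

	if (!iommu_group_has_isolated_msi(idev->group)) {
		if (!allow_unsafe_interrupts)
			return -EPERM;
		/* ... existing unsafe-interrupt warning path ... */
	}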
Thomas Gleixner Dec. 12, 2022, 3:17 p.m. UTC | #5
On Fri, Dec 09 2022 at 13:38, Jason Gunthorpe wrote:
> On Fri, Dec 09, 2022 at 04:44:06PM +0000, Robin Murphy wrote:
>
>> Isn't the problem with this that it's super-early, and a device's MSI domain
>> may not actually be resolved until someone starts requesting MSIs for it?
>> Maybe Thomas' ongoing per-device stuff changes that, but I'm not
>> sure :/
>
> Yes, this looks correct, OK, so I will do Kevin's thought

The device MSI domain has to be valid before a device can be probed, at
least that's the case for PCI/MSI devices. The pointer is established
during PCI discovery.

The per device MSI domains do not change that requirement. The only
difference is that device->msi.domain now points to the MSI parent
domain.

In the "global" PCI/MSI domain case the hierarchy walk will (on x86)
start at the PCI/MSI domain and end up either at the remapping unit,
which has the protection property, or at the vector (root) domain, which
does not.

For the per device domain case the walk will start at the parent domain,
which is (on x86) either the remapping unit or the vector (root) domain.

The same is true for ARM(64) and other hierarchy users, just the naming
conventions and possible scenarios are different.

So for both scenarios (global and per-device) searching for that
protection property down the hierarchy starting from device->msi.domain
is correct.

Obviously unless it's done somewhere early in the PCI discovery,
i.e. before the discovery has associated the domain pointer.

Thanks,

        tglx
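
Schematically, the walk Thomas describes is what the new per-device helper
does; a sketch consistent with his description, not necessarily the exact
patch in this series (IRQ_DOMAIN_FLAG_MSI_REMAP is the flag the existing
irq_domain_check_msi_remap() already looked for):

bool msi_device_has_secure_msi(struct device *dev)
{
	/* Start at the device's MSI domain (global or per-device parent) */
	struct irq_domain *domain = dev_get_msi_domain(dev);

	/* Walk up the hierarchy looking for a remapping-capable level */
	for (; domain; domain = domain->parent)
		if (domain->flags & IRQ_DOMAIN_FLAG_MSI_REMAP)
			return true;
	return false;
}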
Jason Gunthorpe Dec. 12, 2022, 3:47 p.m. UTC | #6
On Mon, Dec 12, 2022 at 04:17:58PM +0100, Thomas Gleixner wrote:

> Obviously unless it's done somewhere early in the PCI discovery,
> i.e. before the discovery has associated the domain pointer.

I thought the problem is more that the iommu drivers change the
assigned irq_domain:

void intel_irq_remap_add_device(struct dmar_pci_notify_info *info)
{
	if (!irq_remapping_enabled || pci_dev_has_special_msi_domain(info->dev))
		return;

	dev_set_msi_domain(&info->dev->dev, map_dev_to_ir(info->dev));
}

Which is ultimately called by 

	bus_register_notifier(&pci_bus_type, &dmar_pci_bus_nb);

And that compares with the iommu setup which is also done from a
bus notifier:

		nb[i].notifier_call = iommu_bus_notifier;
		bus_register_notifier(iommu_buses[i], &nb[i]);

So, I think, there is no reliable ordering between these two things.

At least that is why I was convinced we should not do the idea I
shared :)

Thanks,
Jason
Thomas Gleixner Dec. 12, 2022, 4:25 p.m. UTC | #7
On Mon, Dec 12 2022 at 11:47, Jason Gunthorpe wrote:
> On Mon, Dec 12, 2022 at 04:17:58PM +0100, Thomas Gleixner wrote:
>> Obvioulsy unless it's done somewhere early in the PCI discovery,
>> i.e. before the discovery associated the domain pointer.
>
> I thought the problem is more that the iommu drivers change the
> assigned irq_domain:

Yes they do.

> void intel_irq_remap_add_device(struct dmar_pci_notify_info *info)
> {
> 	if (!irq_remapping_enabled || pci_dev_has_special_msi_domain(info->dev))
> 		return;
>
> 	dev_set_msi_domain(&info->dev->dev, map_dev_to_ir(info->dev));
> }
>
> Which is ultimately called by 
>
> 	bus_register_notifier(&pci_bus_type, &dmar_pci_bus_nb);
>
> And that compares with the iommu setup which is also done from a
> bus notifier:
>
> 		nb[i].notifier_call = iommu_bus_notifier;
> 		bus_register_notifier(iommu_buses[i], &nb[i]);
>
> So, I think, there is no reliable ordering between these two things.

Bah. Notifiers are a complete disaster vs. ordering. This really wants
reliable ordering IMO.

Thanks,

        tglx

Patch

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 29ff97f756bc42..13ad737d50b18c 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -4,7 +4,7 @@ 
 #include <linux/iommufd.h>
 #include <linux/slab.h>
 #include <linux/iommu.h>
-#include <linux/irqdomain.h>
+#include <linux/msi.h>
 
 #include "io_pagetable.h"
 #include "iommufd_private.h"
@@ -170,7 +170,7 @@  static int iommufd_device_setup_msi(struct iommufd_device *idev,
 	 * interrupt outside this iommufd context.
 	 */
 	if (!device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP) &&
-	    !irq_domain_check_msi_remap()) {
+	    !msi_device_has_secure_msi(idev->dev)) {
 		if (!allow_unsafe_interrupts)
 			return -EPERM;