diff mbox series

[v11,09/10] virtio-iommu: Set supported page size mask

Message ID 20201030180510.747225-10-jean-philippe@linaro.org (mailing list archive)
State New, archived
Headers show
Series virtio-iommu: VFIO integration | expand

Commit Message

Jean-Philippe Brucker Oct. 30, 2020, 6:05 p.m. UTC
From: Bharat Bhushan <bbhushan2@marvell.com>

The virtio-iommu device can deal with arbitrary page sizes for virtual
endpoints, but for endpoints assigned with VFIO it must follow the page
granule used by the host IOMMU driver.

Implement the interface to set the vIOMMU page size mask, called by VFIO
for each endpoint. We assume that all host IOMMU drivers use the same
page granule (the host page granule). Override the page_size_mask field
in the virtio config space.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
v11: Simplify the update logic slightly
---
 hw/virtio/virtio-iommu.c | 50 ++++++++++++++++++++++++++++++++++++++++
 hw/virtio/trace-events   |  1 +
 2 files changed, 51 insertions(+)

Comments

Peter Xu Nov. 2, 2020, 10:47 p.m. UTC | #1
On Fri, Oct 30, 2020 at 07:05:09PM +0100, Jean-Philippe Brucker wrote:
> From: Bharat Bhushan <bbhushan2@marvell.com>
> 
> The virtio-iommu device can deal with arbitrary page sizes for virtual
> endpoints, but for endpoints assigned with VFIO it must follow the page
> granule used by the host IOMMU driver.
> 
> Implement the interface to set the vIOMMU page size mask, called by VFIO
> for each endpoint. We assume that all host IOMMU drivers use the same
> page granule (the host page granule). Override the page_size_mask field
> in the virtio config space.

(Nit: Seems slightly mismatched with the code below)

[...]

> +    /*
> +     * After the machine is finalized, we can't change the mask anymore. If by
> +     * chance the hotplugged device supports the same granule, we can still
> +     * accept it. Having a different masks is possible but the guest will use
> +     * sub-optimal block sizes, so warn about it.
> +     */
> +    if (qdev_hotplug) {
> +        int new_granule = ctz64(new_mask);
> +        int cur_granule = ctz64(cur_mask);
> +
> +        if (new_granule != cur_granule) {
> +            error_setg(errp, "virtio-iommu page mask 0x%"PRIx64
> +                       " is incompatible with mask 0x%"PRIx64, cur_mask,
> +                       new_mask);
> +            return -1;
> +        } else if (new_mask != cur_mask) {
> +            warn_report("virtio-iommu page mask 0x%"PRIx64
> +                        " does not match 0x%"PRIx64, cur_mask, new_mask);

IMHO, new_mask!=cur_mask is ok, as long as it's a superset of reported
cur_mask, then it'll still guarantee to work.

Meanwhile, checking against granule seems not safe enough if the guest driver
started to use huge pages in iommu pgtables...

In summary:

    if (qdev_hotplug) {
        if ((new_mask & cur_mask) == cur_mask) {
            /* Superset of old mask; we're good.  Keep the old mask since same */
            return 0;
        } else {
            /* Guest driver can use any psize in cur_mask, not safe to continue */
            error_setg(...);
            return -1;
        }
    }

Maybe we can also work on top too (if this is the only reason to repost,
especially if Michael would like to pick this up sooner), so I just raise this
up.

Thanks,
Jean-Philippe Brucker Nov. 3, 2020, 4:32 p.m. UTC | #2
On Mon, Nov 02, 2020 at 05:47:25PM -0500, Peter Xu wrote:
> On Fri, Oct 30, 2020 at 07:05:09PM +0100, Jean-Philippe Brucker wrote:
> > From: Bharat Bhushan <bbhushan2@marvell.com>
> > 
> > The virtio-iommu device can deal with arbitrary page sizes for virtual
> > endpoints, but for endpoints assigned with VFIO it must follow the page
> > granule used by the host IOMMU driver.
> > 
> > Implement the interface to set the vIOMMU page size mask, called by VFIO
> > for each endpoint. We assume that all host IOMMU drivers use the same
> > page granule (the host page granule). Override the page_size_mask field
> > in the virtio config space.
> 
> (Nit: Seems slightly mismatched with the code below)
> 
> [...]
> 
> > +    /*
> > +     * After the machine is finalized, we can't change the mask anymore. If by
> > +     * chance the hotplugged device supports the same granule, we can still
> > +     * accept it. Having a different masks is possible but the guest will use
> > +     * sub-optimal block sizes, so warn about it.
> > +     */
> > +    if (qdev_hotplug) {
> > +        int new_granule = ctz64(new_mask);
> > +        int cur_granule = ctz64(cur_mask);
> > +
> > +        if (new_granule != cur_granule) {
> > +            error_setg(errp, "virtio-iommu page mask 0x%"PRIx64
> > +                       " is incompatible with mask 0x%"PRIx64, cur_mask,
> > +                       new_mask);
> > +            return -1;
> > +        } else if (new_mask != cur_mask) {
> > +            warn_report("virtio-iommu page mask 0x%"PRIx64
> > +                        " does not match 0x%"PRIx64, cur_mask, new_mask);
> 
> IMHO, new_mask!=cur_mask is ok, as long as it's a superset of reported
> cur_mask, then it'll still guarantee to work.
> 
> Meanwhile, checking against granule seems not safe enough if the guest driver
> started to use huge pages in iommu pgtables...

As the guest doesn't directly touch the page tables I think it is safe,
albeit slow. If the host IOMMU driver cannot support one huge page for a
mapping, then it will use several small pages instead.

> 
> In summary:
> 
>     if (qdev_hotplug) {
>         if ((new_mask & cur_mask) == cur_mask) {
>             /* Superset of old mask; we're good.  Keep the old mask since same */
>             return 0;

That looks correct, but a bit too restrictive. If we start with
cur_mask = 0xfffffffffffff000, and we hotplug a VFIO device with
new_mask = 0x40201000 (4k page, 2M 1G blocks), then this code rejects it
even though it would work.

Thanks,
Jean

>         } else {
>             /* Guest driver can use any psize in cur_mask, not safe to continue */
>             error_setg(...);
>             return -1;
>         }
>     }
> 
> Maybe we can also work on top too (if this is the only reason to repost,
> especially if Michael would like to pick this up sooner), so I just raise this
> up.
> 
> Thanks,
> 
> -- 
> Peter Xu
>
Peter Xu Nov. 3, 2020, 4:44 p.m. UTC | #3
On Tue, Nov 03, 2020 at 05:32:34PM +0100, Jean-Philippe Brucker wrote:
> > In summary:
> > 
> >     if (qdev_hotplug) {
> >         if ((new_mask & cur_mask) == cur_mask) {
> >             /* Superset of old mask; we're good.  Keep the old mask since same */
> >             return 0;
> 
> That looks correct, but a bit too restrictive. If we start with
> cur_mask = 0xfffffffffffff000, and we hotplug a VFIO device with
> new_mask = 0x40201000 (4k page, 2M 1G blocks), then this code rejects it
> even though it would work.

Yeah I think you're right - it should be the smallest granule that matters the
most.  Thanks,
diff mbox series

Patch

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 78e07aa40a5..fc5c75d6933 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -899,6 +899,55 @@  static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
     return 0;
 }
 
+/*
+ * The default mask (TARGET_PAGE_MASK) is the smallest supported guest granule,
+ * for example 0xfffffffffffff000. When an assigned device has page size
+ * restrictions due to the hardware IOMMU configuration, apply this restriction
+ * to the mask.
+ */
+static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
+                                           uint64_t new_mask,
+                                           Error **errp)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint64_t cur_mask = s->config.page_size_mask;
+
+    trace_virtio_iommu_set_page_size_mask(mr->parent_obj.name, cur_mask,
+                                          new_mask);
+
+    if ((cur_mask & new_mask) == 0) {
+        error_setg(errp, "virtio-iommu page mask 0x%"PRIx64
+                   " is incompatible with mask 0x%"PRIx64, cur_mask, new_mask);
+        return -1;
+    }
+
+    /*
+     * After the machine is finalized, we can't change the mask anymore. If by
+     * chance the hotplugged device supports the same granule, we can still
+     * accept it. Having a different masks is possible but the guest will use
+     * sub-optimal block sizes, so warn about it.
+     */
+    if (qdev_hotplug) {
+        int new_granule = ctz64(new_mask);
+        int cur_granule = ctz64(cur_mask);
+
+        if (new_granule != cur_granule) {
+            error_setg(errp, "virtio-iommu page mask 0x%"PRIx64
+                       " is incompatible with mask 0x%"PRIx64, cur_mask,
+                       new_mask);
+            return -1;
+        } else if (new_mask != cur_mask) {
+            warn_report("virtio-iommu page mask 0x%"PRIx64
+                        " does not match 0x%"PRIx64, cur_mask, new_mask);
+        }
+        return 0;
+    }
+
+    s->config.page_size_mask &= new_mask;
+    return 0;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1130,6 +1179,7 @@  static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = virtio_iommu_translate;
     imrc->replay = virtio_iommu_replay;
     imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
+    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
 }
 
 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 982d0002a65..2060a144a2f 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -109,6 +109,7 @@  virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t start,
 virtio_iommu_notify_map(const char *name, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "mr=%s virt_start=0x%"PRIx64" virt_end=0x%"PRIx64" phys_start=0x%"PRIx64" flags=%d"
 virtio_iommu_notify_unmap(const char *name, uint64_t virt_start, uint64_t virt_end) "mr=%s virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
 virtio_iommu_remap(const char *name, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start) "mr=%s virt_start=0x%"PRIx64" virt_end=0x%"PRIx64" phys_start=0x%"PRIx64
+virtio_iommu_set_page_size_mask(const char *name, uint64_t old, uint64_t new) "mr=%s old_mask=0x%"PRIx64" new_mask=0x%"PRIx64
 virtio_iommu_notify_flag_add(const char *name) "add notifier to mr %s"
 virtio_iommu_notify_flag_del(const char *name) "del notifier from mr %s"