Message ID | 1479171568-19175-1-git-send-email-peterx@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Nov 14, 2016 at 07:59:28PM -0500, Peter Xu wrote: > When one IOMMU memory region is splitted into multiple memory sections, > vfio will register multiple same notifiers to a vIOMMU for the same > region. That's not sensible. What we need is to register one IOMMU > notifier for each IOMMU region, not per section. > > Solution is simple - we traverse the container->giommu_list, and skip > the registration if memory region is already registered. > > To make vfio's region_add() short, vfio_listener_region_add_iommu() is > introduced. > > Signed-off-by: Peter Xu <peterx@redhat.com> This patch will solve Aviv's issue that vfio_map_notify() will dump many error messages when device assignments are enabled with Intel vIOMMU. Alex, could you help verify whether this is the one that you suggested? I'd be glad to add your suggestted-by if you won't disagree. Thanks, -- peterx
On 15/11/16 11:59, Peter Xu wrote: > When one IOMMU memory region is splitted into multiple memory sections, Out of curiosity - when does this happen? > vfio will register multiple same notifiers to a vIOMMU for the same > region. That's not sensible. What we need is to register one IOMMU > notifier for each IOMMU region, not per section. > > Solution is simple - we traverse the container->giommu_list, and skip > the registration if memory region is already registered. > > To make vfio's region_add() short, vfio_listener_region_add_iommu() is > introduced. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > hw/vfio/common.c | 56 +++++++++++++++++++++++++++++++++++--------------------- > 1 file changed, 35 insertions(+), 21 deletions(-) > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index 801578b..5279fd1 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -360,6 +360,40 @@ out: > rcu_read_unlock(); > } > > +static void vfio_listener_region_add_iommu(VFIOContainer *container, > + MemoryRegionSection *section, > + hwaddr iova, > + hwaddr end) > +{ > + VFIOGuestIOMMU *giommu; > + > + QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { > + if (giommu->iommu == section->mr) { > + /* We have already registered with this MR, skip */ > + return; > + } > + } > + > + trace_vfio_listener_region_add_iommu(iova, end); > + > + /* > + * FIXME: For VFIO iommu types which have KVM acceleration to > + * avoid bouncing all map/unmaps through qemu this way, this > + * would be the right place to wire that up (tell the KVM > + * device emulation the VFIO iommu handles to use). > + */ > + giommu = g_malloc0(sizeof(*giommu)); > + giommu->iommu = section->mr; > + giommu->iommu_offset = section->offset_within_address_space - > + section->offset_within_region; > + giommu->container = container; > + giommu->n.notify = vfio_iommu_map_notify; > + giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL; > + QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); > + memory_region_register_iommu_notifier(giommu->iommu, &giommu->n); > + memory_region_iommu_replay(giommu->iommu, &giommu->n, false); > +} > + > static void vfio_listener_region_add(MemoryListener *listener, > MemoryRegionSection *section) > { > @@ -439,27 +473,7 @@ static void vfio_listener_region_add(MemoryListener *listener, > memory_region_ref(section->mr); > > if (memory_region_is_iommu(section->mr)) { > - VFIOGuestIOMMU *giommu; > - > - trace_vfio_listener_region_add_iommu(iova, end); > - /* > - * FIXME: For VFIO iommu types which have KVM acceleration to > - * avoid bouncing all map/unmaps through qemu this way, this > - * would be the right place to wire that up (tell the KVM > - * device emulation the VFIO iommu handles to use). > - */ > - giommu = g_malloc0(sizeof(*giommu)); > - giommu->iommu = section->mr; > - giommu->iommu_offset = section->offset_within_address_space - > - section->offset_within_region; > - giommu->container = container; > - giommu->n.notify = vfio_iommu_map_notify; > - giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL; > - QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); > - > - memory_region_register_iommu_notifier(giommu->iommu, &giommu->n); > - memory_region_iommu_replay(giommu->iommu, &giommu->n, false); > - > + vfio_listener_region_add_iommu(container, section, iova, end); > return; > } > >
On Tue, Nov 15, 2016 at 04:11:57PM +1100, Alexey Kardashevskiy wrote: > On 15/11/16 11:59, Peter Xu wrote: > > When one IOMMU memory region is splitted into multiple memory sections, > > > Out of curiosity - when does this happen? My understanding: each PCI device will have one IOMMU region when with vt-d vIOMMU enabled. After interrupt remapping is enabled, another 0xfeexxxxx region is overlapped to filter out the interrupts (see vtd_find_add_as()). So the big Intel IOMMU region is splitted into several. Of course looks like we can remove the interrupt region and only use one IOMMU region for both DMAR and IR, but I don't think that's a good one, since maybe one day we'll have another scenario that will split a single memory region. In that case, I'd prefer we take care of it during registration. Thanks, -- peterx
On Mon, Nov 14, 2016 at 07:59:28PM -0500, Peter Xu wrote: > When one IOMMU memory region is splitted into multiple memory sections, > vfio will register multiple same notifiers to a vIOMMU for the same > region. That's not sensible. What we need is to register one IOMMU > notifier for each IOMMU region, not per section. > > Solution is simple - we traverse the container->giommu_list, and skip > the registration if memory region is already registered. > > To make vfio's region_add() short, vfio_listener_region_add_iommu() is > introduced. > > Signed-off-by: Peter Xu <peterx@redhat.com> This is wrong. It will work on the register side, but when the first section attached to the IOMMU is removed, the IOMMU will be removed from the list (triggering an unmap of the whole vfio space), rather than when the *last* section attached to the MR is removed. You'll get away with it in the simple x86 case, because both sections will be removed at basically the same time, but it's not correct in general. I really think a better approach is to add the section boundary information to the IOMMUNotifier structure within VFIOGuestIOMMU, and check that as well as the MR on remove. That additionally means the IOMMU notifier won't get called for portions of the MR outside the mapped sections, which the notifier handler probably isn't going to expect. > --- > hw/vfio/common.c | 56 +++++++++++++++++++++++++++++++++++--------------------- > 1 file changed, 35 insertions(+), 21 deletions(-) > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index 801578b..5279fd1 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -360,6 +360,40 @@ out: > rcu_read_unlock(); > } > > +static void vfio_listener_region_add_iommu(VFIOContainer *container, > + MemoryRegionSection *section, > + hwaddr iova, > + hwaddr end) > +{ > + VFIOGuestIOMMU *giommu; > + > + QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { > + if (giommu->iommu == section->mr) { > + /* We have already registered with this MR, skip */ > + return; > + } > + } > + > + trace_vfio_listener_region_add_iommu(iova, end); > + > + /* > + * FIXME: For VFIO iommu types which have KVM acceleration to > + * avoid bouncing all map/unmaps through qemu this way, this > + * would be the right place to wire that up (tell the KVM > + * device emulation the VFIO iommu handles to use). > + */ > + giommu = g_malloc0(sizeof(*giommu)); > + giommu->iommu = section->mr; > + giommu->iommu_offset = section->offset_within_address_space - > + section->offset_within_region; > + giommu->container = container; > + giommu->n.notify = vfio_iommu_map_notify; > + giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL; > + QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); > + memory_region_register_iommu_notifier(giommu->iommu, &giommu->n); > + memory_region_iommu_replay(giommu->iommu, &giommu->n, false); > +} > + > static void vfio_listener_region_add(MemoryListener *listener, > MemoryRegionSection *section) > { > @@ -439,27 +473,7 @@ static void vfio_listener_region_add(MemoryListener *listener, > memory_region_ref(section->mr); > > if (memory_region_is_iommu(section->mr)) { > - VFIOGuestIOMMU *giommu; > - > - trace_vfio_listener_region_add_iommu(iova, end); > - /* > - * FIXME: For VFIO iommu types which have KVM acceleration to > - * avoid bouncing all map/unmaps through qemu this way, this > - * would be the right place to wire that up (tell the KVM > - * device emulation the VFIO iommu handles to use). > - */ > - giommu = g_malloc0(sizeof(*giommu)); > - giommu->iommu = section->mr; > - giommu->iommu_offset = section->offset_within_address_space - > - section->offset_within_region; > - giommu->container = container; > - giommu->n.notify = vfio_iommu_map_notify; > - giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL; > - QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); > - > - memory_region_register_iommu_notifier(giommu->iommu, &giommu->n); > - memory_region_iommu_replay(giommu->iommu, &giommu->n, false); > - > + vfio_listener_region_add_iommu(container, section, iova, end); > return; > } >
On Thu, Nov 17, 2016 at 02:24:54PM +1100, David Gibson wrote: > On Mon, Nov 14, 2016 at 07:59:28PM -0500, Peter Xu wrote: > > When one IOMMU memory region is splitted into multiple memory sections, > > vfio will register multiple same notifiers to a vIOMMU for the same > > region. That's not sensible. What we need is to register one IOMMU > > notifier for each IOMMU region, not per section. > > > > Solution is simple - we traverse the container->giommu_list, and skip > > the registration if memory region is already registered. > > > > To make vfio's region_add() short, vfio_listener_region_add_iommu() is > > introduced. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > This is wrong. It will work on the register side, but when the first > section attached to the IOMMU is removed, the IOMMU will be removed > from the list (triggering an unmap of the whole vfio space), rather > than when the *last* section attached to the MR is removed. > > You'll get away with it in the simple x86 case, because both sections > will be removed at basically the same time, but it's not correct in > general. Hmm, I got your point. Instead of keeping multiple VFIOGuestIOMMU as you have mentioned previous (and below), IMHO another choice is to have a reference count for each VFIOGuestIOMMU. > > I really think a better approach is to add the section boundary > information to the IOMMUNotifier structure within VFIOGuestIOMMU, and > check that as well as the MR on remove. That additionally means the > IOMMU notifier won't get called for portions of the MR outside the > mapped sections, which the notifier handler probably isn't going to > expect. Here, the problem is: what we want to listen to here. IMHO, here what we are trying to listen to is memory region, not memory region section. In that case, keeping multiple VFIOGuestIOMMU object is awkward. I would prefer to have a VFIOGuestIOMMU.refcount field as mentioned above, then: - inc() every time the IOMMU memory region is added, - dec() every time the IOMMU memory region is removed, - free() when dec() == 0 Thanks, -- peterx
diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 801578b..5279fd1 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -360,6 +360,40 @@ out: rcu_read_unlock(); } +static void vfio_listener_region_add_iommu(VFIOContainer *container, + MemoryRegionSection *section, + hwaddr iova, + hwaddr end) +{ + VFIOGuestIOMMU *giommu; + + QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { + if (giommu->iommu == section->mr) { + /* We have already registered with this MR, skip */ + return; + } + } + + trace_vfio_listener_region_add_iommu(iova, end); + + /* + * FIXME: For VFIO iommu types which have KVM acceleration to + * avoid bouncing all map/unmaps through qemu this way, this + * would be the right place to wire that up (tell the KVM + * device emulation the VFIO iommu handles to use). + */ + giommu = g_malloc0(sizeof(*giommu)); + giommu->iommu = section->mr; + giommu->iommu_offset = section->offset_within_address_space - + section->offset_within_region; + giommu->container = container; + giommu->n.notify = vfio_iommu_map_notify; + giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL; + QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); + memory_region_register_iommu_notifier(giommu->iommu, &giommu->n); + memory_region_iommu_replay(giommu->iommu, &giommu->n, false); +} + static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { @@ -439,27 +473,7 @@ static void vfio_listener_region_add(MemoryListener *listener, memory_region_ref(section->mr); if (memory_region_is_iommu(section->mr)) { - VFIOGuestIOMMU *giommu; - - trace_vfio_listener_region_add_iommu(iova, end); - /* - * FIXME: For VFIO iommu types which have KVM acceleration to - * avoid bouncing all map/unmaps through qemu this way, this - * would be the right place to wire that up (tell the KVM - * device emulation the VFIO iommu handles to use). - */ - giommu = g_malloc0(sizeof(*giommu)); - giommu->iommu = section->mr; - giommu->iommu_offset = section->offset_within_address_space - - section->offset_within_region; - giommu->container = container; - giommu->n.notify = vfio_iommu_map_notify; - giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL; - QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); - - memory_region_register_iommu_notifier(giommu->iommu, &giommu->n); - memory_region_iommu_replay(giommu->iommu, &giommu->n, false); - + vfio_listener_region_add_iommu(container, section, iova, end); return; }
When one IOMMU memory region is splitted into multiple memory sections, vfio will register multiple same notifiers to a vIOMMU for the same region. That's not sensible. What we need is to register one IOMMU notifier for each IOMMU region, not per section. Solution is simple - we traverse the container->giommu_list, and skip the registration if memory region is already registered. To make vfio's region_add() short, vfio_listener_region_add_iommu() is introduced. Signed-off-by: Peter Xu <peterx@redhat.com> --- hw/vfio/common.c | 56 +++++++++++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 21 deletions(-)