Message ID | 20230413110019.48922-1-zhenzhong.duan@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] memory: Optimize replay of guest mapping | expand |
On Thu, Apr 13, 2023 at 07:00:19PM +0800, Zhenzhong Duan wrote: > On x86, there are two notifiers registered due to vtd-ir memory > region splitting the entire address space. During replay of the > address space for each notifier, the whole address space is > scanned which is unnecessary. We only need to scan the space > belong to notifier monitored space. > > While on x86 IOMMU memory region spans over entire address space, > but on some other platforms(e.g. arm mps3-an547), IOMMU memory > region is only a window in the whole address space. user could > register a notifier with arbitrary scope beyond IOMMU memory > region. Though in current implementation replay is only triggered > by VFIO and dirty page sync with notifiers derived from memory > region section, but this isn't guaranteed in the future. > > So, we replay the intersection part of IOMMU memory region and > IOMMU notifier in memory_region_iommu_replay(). > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com>
On Thu, 13 Apr 2023 at 12:12, Zhenzhong Duan <zhenzhong.duan@intel.com> wrote: > > On x86, there are two notifiers registered due to vtd-ir memory > region splitting the entire address space. During replay of the > address space for each notifier, the whole address space is > scanned which is unnecessary. We only need to scan the space > belong to notifier monitored space. > > While on x86 IOMMU memory region spans over entire address space, > but on some other platforms(e.g. arm mps3-an547), IOMMU memory > region is only a window in the whole address space. user could > register a notifier with arbitrary scope beyond IOMMU memory > region. Though in current implementation replay is only triggered > by VFIO and dirty page sync with notifiers derived from memory > region section, but this isn't guaranteed in the future. > > So, we replay the intersection part of IOMMU memory region and > IOMMU notifier in memory_region_iommu_replay(). > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > v3: Fix assert failure on mps3-an547 > v2: Add an assert per Peter > Tested on x86 with a net card passed to guest(kvm/tcg), ping/ssh pass. > Also did simple bootup test with mps3-an547 > > hw/i386/intel_iommu.c | 2 +- > softmmu/memory.c | 5 +++-- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index a62896759c78..faade7def867 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -3850,7 +3850,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid), > }; > > - vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid); > + vtd_page_walk(s, &ce, n->start, n->end, &info, vtd_as->pasid); > } > } else { > trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), > diff --git a/softmmu/memory.c b/softmmu/memory.c > index b1a6cae6f583..f7af691991de 100644 > --- a/softmmu/memory.c > +++ b/softmmu/memory.c > @@ -1925,7 +1925,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > { > MemoryRegion *mr = MEMORY_REGION(iommu_mr); > IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr); > - hwaddr addr, granularity; > + hwaddr addr, end, granularity; > IOMMUTLBEntry iotlb; > > /* If the IOMMU has its own replay callback, override */ > @@ -1935,8 +1935,9 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > } > > granularity = memory_region_iommu_get_min_page_size(iommu_mr); > + end = MIN(n->end, memory_region_size(mr)); > > - for (addr = 0; addr < memory_region_size(mr); addr += granularity) { > + for (addr = n->start; addr < end; addr += granularity) { > iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx); > if (iotlb.perm != IOMMU_NONE) { > n->notify(n, &iotlb); The documentation for the replay method of IOMMUMemoryRegionClass says: * The default implementation of memory_region_iommu_replay() is to * call the IOMMU translate method for every page in the address space * with flag == IOMMU_NONE and then call the notifier if translate * returns a valid mapping. If this method is implemented then it * overrides the default behaviour, and must provide the full semantics * of memory_region_iommu_replay(), by calling @notifier for every * translation present in the IOMMU. This commit changes the default implementation so it's no longer doing this for every page in the address space. If the change is correct, we should update the doc comment too. Oddly, the doc comment for memory_region_iommu_replay() itself doesn't very clearly state what its semantics are; it could probably be improved. Anyway, this change is OK for the TCG use of iommu notifiers, because that doesn't care about replay. thanks -- PMM
On Tue, Apr 18, 2023 at 11:13:57AM +0100, Peter Maydell wrote: > On Thu, 13 Apr 2023 at 12:12, Zhenzhong Duan <zhenzhong.duan@intel.com> wrote: > > > > On x86, there are two notifiers registered due to vtd-ir memory > > region splitting the entire address space. During replay of the > > address space for each notifier, the whole address space is > > scanned which is unnecessary. We only need to scan the space > > belong to notifier monitored space. > > > > While on x86 IOMMU memory region spans over entire address space, > > but on some other platforms(e.g. arm mps3-an547), IOMMU memory > > region is only a window in the whole address space. user could > > register a notifier with arbitrary scope beyond IOMMU memory > > region. Though in current implementation replay is only triggered > > by VFIO and dirty page sync with notifiers derived from memory > > region section, but this isn't guaranteed in the future. > > > > So, we replay the intersection part of IOMMU memory region and > > IOMMU notifier in memory_region_iommu_replay(). > > > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > > --- > > v3: Fix assert failure on mps3-an547 > > v2: Add an assert per Peter > > Tested on x86 with a net card passed to guest(kvm/tcg), ping/ssh pass. > > Also did simple bootup test with mps3-an547 > > > > hw/i386/intel_iommu.c | 2 +- > > softmmu/memory.c | 5 +++-- > > 2 files changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > index a62896759c78..faade7def867 100644 > > --- a/hw/i386/intel_iommu.c > > +++ b/hw/i386/intel_iommu.c > > @@ -3850,7 +3850,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid), > > }; > > > > - vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid); > > + vtd_page_walk(s, &ce, n->start, n->end, &info, vtd_as->pasid); > > } > > } else { > > trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), > > diff --git a/softmmu/memory.c b/softmmu/memory.c > > index b1a6cae6f583..f7af691991de 100644 > > --- a/softmmu/memory.c > > +++ b/softmmu/memory.c > > @@ -1925,7 +1925,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > { > > MemoryRegion *mr = MEMORY_REGION(iommu_mr); > > IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr); > > - hwaddr addr, granularity; > > + hwaddr addr, end, granularity; > > IOMMUTLBEntry iotlb; > > > > /* If the IOMMU has its own replay callback, override */ > > @@ -1935,8 +1935,9 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > } > > > > granularity = memory_region_iommu_get_min_page_size(iommu_mr); > > + end = MIN(n->end, memory_region_size(mr)); > > > > - for (addr = 0; addr < memory_region_size(mr); addr += granularity) { > > + for (addr = n->start; addr < end; addr += granularity) { > > iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx); > > if (iotlb.perm != IOMMU_NONE) { > > n->notify(n, &iotlb); > > > The documentation for the replay method of IOMMUMemoryRegionClass > says: > * The default implementation of memory_region_iommu_replay() is to > * call the IOMMU translate method for every page in the address space > * with flag == IOMMU_NONE and then call the notifier if translate > * returns a valid mapping. If this method is implemented then it > * overrides the default behaviour, and must provide the full semantics > * of memory_region_iommu_replay(), by calling @notifier for every > * translation present in the IOMMU. > > This commit changes the default implementation so it's no longer > doing this for every page in the address space. If the change is > correct, we should update the doc comment too. > > Oddly, the doc comment for memory_region_iommu_replay() itself > doesn't very clearly state what its semantics are; it could > probably be improved. > > Anyway, this change is OK for the TCG use of iommu notifiers, > because that doesn't care about replay. Since the notifier contains the range information I'd say the change shouldn't affect any caller but only a pure performance difference. Indeed it'll be nicer the documentation can be updated too. Thanks,
>-----Original Message----- >From: Peter Xu <peterx@redhat.com> >Sent: Tuesday, April 18, 2023 9:56 PM >To: Peter Maydell <peter.maydell@linaro.org> >Cc: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu- >devel@nongnu.org; mst@redhat.com; jasowang@redhat.com; >marcel.apfelbaum@gmail.com; pbonzini@redhat.com; >richard.henderson@linaro.org; eduardo@habkost.net; david@redhat.com; >philmd@linaro.org >Subject: Re: [PATCH v3] memory: Optimize replay of guest mapping > >On Tue, Apr 18, 2023 at 11:13:57AM +0100, Peter Maydell wrote: >> On Thu, 13 Apr 2023 at 12:12, Zhenzhong Duan ><zhenzhong.duan@intel.com> wrote: >> > >> > On x86, there are two notifiers registered due to vtd-ir memory >> > region splitting the entire address space. During replay of the >> > address space for each notifier, the whole address space is scanned >> > which is unnecessary. We only need to scan the space belong to >> > notifier monitored space. >> > >> > While on x86 IOMMU memory region spans over entire address space, >> > but on some other platforms(e.g. arm mps3-an547), IOMMU memory >> > region is only a window in the whole address space. user could >> > register a notifier with arbitrary scope beyond IOMMU memory region. >> > Though in current implementation replay is only triggered by VFIO >> > and dirty page sync with notifiers derived from memory region >> > section, but this isn't guaranteed in the future. >> > >> > So, we replay the intersection part of IOMMU memory region and IOMMU >> > notifier in memory_region_iommu_replay(). >> > >> > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >> > --- >> > v3: Fix assert failure on mps3-an547 >> > v2: Add an assert per Peter >> > Tested on x86 with a net card passed to guest(kvm/tcg), ping/ssh pass. >> > Also did simple bootup test with mps3-an547 >> > >> > hw/i386/intel_iommu.c | 2 +- >> > softmmu/memory.c | 5 +++-- >> > 2 files changed, 4 insertions(+), 3 deletions(-) >> > >> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index >> > a62896759c78..faade7def867 100644 >> > --- a/hw/i386/intel_iommu.c >> > +++ b/hw/i386/intel_iommu.c >> > @@ -3850,7 +3850,7 @@ static void >vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) >> > .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid), >> > }; >> > >> > - vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid); >> > + vtd_page_walk(s, &ce, n->start, n->end, &info, >> > + vtd_as->pasid); >> > } >> > } else { >> > trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), >> > diff --git a/softmmu/memory.c b/softmmu/memory.c index >> > b1a6cae6f583..f7af691991de 100644 >> > --- a/softmmu/memory.c >> > +++ b/softmmu/memory.c >> > @@ -1925,7 +1925,7 @@ void >> > memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, >IOMMUNotifier *n) { >> > MemoryRegion *mr = MEMORY_REGION(iommu_mr); >> > IOMMUMemoryRegionClass *imrc = >IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr); >> > - hwaddr addr, granularity; >> > + hwaddr addr, end, granularity; >> > IOMMUTLBEntry iotlb; >> > >> > /* If the IOMMU has its own replay callback, override */ @@ >> > -1935,8 +1935,9 @@ void >memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, >IOMMUNotifier *n) >> > } >> > >> > granularity = memory_region_iommu_get_min_page_size(iommu_mr); >> > + end = MIN(n->end, memory_region_size(mr)); >> > >> > - for (addr = 0; addr < memory_region_size(mr); addr += granularity) { >> > + for (addr = n->start; addr < end; addr += granularity) { >> > iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n- >>iommu_idx); >> > if (iotlb.perm != IOMMU_NONE) { >> > n->notify(n, &iotlb); >> >> >> The documentation for the replay method of IOMMUMemoryRegionClass >> says: >> * The default implementation of memory_region_iommu_replay() is to >> * call the IOMMU translate method for every page in the address space >> * with flag == IOMMU_NONE and then call the notifier if translate >> * returns a valid mapping. If this method is implemented then it >> * overrides the default behaviour, and must provide the full semantics >> * of memory_region_iommu_replay(), by calling @notifier for every >> * translation present in the IOMMU. >> >> This commit changes the default implementation so it's no longer doing >> this for every page in the address space. If the change is correct, we >> should update the doc comment too. >> >> Oddly, the doc comment for memory_region_iommu_replay() itself doesn't >> very clearly state what its semantics are; it could probably be >> improved. >> >> Anyway, this change is OK for the TCG use of iommu notifiers, because >> that doesn't care about replay. > >Since the notifier contains the range information I'd say the change shouldn't >affect any caller but only a pure performance difference. Indeed it'll be nicer >the documentation can be updated too. Thanks, > >-- >Peter Xu Thanks Peter Maydell and Peter Xu's comments, will add doc update. May I ask if it's preferred to add doc update to this patch or a separate patch? Regards Zhenzhong
On Wed, 19 Apr 2023 at 03:38, Duan, Zhenzhong <zhenzhong.duan@intel.com> wrote: > >> The documentation for the replay method of IOMMUMemoryRegionClass > >> says: > >> * The default implementation of memory_region_iommu_replay() is to > >> * call the IOMMU translate method for every page in the address space > >> * with flag == IOMMU_NONE and then call the notifier if translate > >> * returns a valid mapping. If this method is implemented then it > >> * overrides the default behaviour, and must provide the full semantics > >> * of memory_region_iommu_replay(), by calling @notifier for every > >> * translation present in the IOMMU. > >> > >> This commit changes the default implementation so it's no longer doing > >> this for every page in the address space. If the change is correct, we > >> should update the doc comment too. > >> > >> Oddly, the doc comment for memory_region_iommu_replay() itself doesn't > >> very clearly state what its semantics are; it could probably be > >> improved. > >> > >> Anyway, this change is OK for the TCG use of iommu notifiers, because > >> that doesn't care about replay. > > > >Since the notifier contains the range information I'd say the change shouldn't > >affect any caller but only a pure performance difference. Indeed it'll be nicer > >the documentation can be updated too. Thanks, > Thanks Peter Maydell and Peter Xu's comments, will add doc update. > May I ask if it's preferred to add doc update to this patch or a separate patch? I suggest doing it in this same patch. thanks -- PMM
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index a62896759c78..faade7def867 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -3850,7 +3850,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid), }; - vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid); + vtd_page_walk(s, &ce, n->start, n->end, &info, vtd_as->pasid); } } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), diff --git a/softmmu/memory.c b/softmmu/memory.c index b1a6cae6f583..f7af691991de 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -1925,7 +1925,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) { MemoryRegion *mr = MEMORY_REGION(iommu_mr); IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr); - hwaddr addr, granularity; + hwaddr addr, end, granularity; IOMMUTLBEntry iotlb; /* If the IOMMU has its own replay callback, override */ @@ -1935,8 +1935,9 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) } granularity = memory_region_iommu_get_min_page_size(iommu_mr); + end = MIN(n->end, memory_region_size(mr)); - for (addr = 0; addr < memory_region_size(mr); addr += granularity) { + for (addr = n->start; addr < end; addr += granularity) { iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx); if (iotlb.perm != IOMMU_NONE) { n->notify(n, &iotlb);
On x86, there are two notifiers registered due to vtd-ir memory region splitting the entire address space. During replay of the address space for each notifier, the whole address space is scanned which is unnecessary. We only need to scan the space belong to notifier monitored space. While on x86 IOMMU memory region spans over entire address space, but on some other platforms(e.g. arm mps3-an547), IOMMU memory region is only a window in the whole address space. user could register a notifier with arbitrary scope beyond IOMMU memory region. Though in current implementation replay is only triggered by VFIO and dirty page sync with notifiers derived from memory region section, but this isn't guaranteed in the future. So, we replay the intersection part of IOMMU memory region and IOMMU notifier in memory_region_iommu_replay(). Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> --- v3: Fix assert failure on mps3-an547 v2: Add an assert per Peter Tested on x86 with a net card passed to guest(kvm/tcg), ping/ssh pass. Also did simple bootup test with mps3-an547 hw/i386/intel_iommu.c | 2 +- softmmu/memory.c | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-)