Message ID | 4FA89A717CD8094DBA0FE20FA5F98EAA010E6E9940@SHASXM03.verisilicon.com (mailing list archive)
---|---
State | New, archived
Series | [v2] hw/arm/smmuv3: Simplify range invalidation
Hi Liu,

On 8/23/21 9:50 AM, Liu, Renwei wrote:
> Simplify range invalidation which can avoid to iterate over all
> iotlb entries multi-times. For instance invalidations patterns like
> "invalidate 32 4kB pages starting from 0xffacd000" need to iterate over
> all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only needs
> to iterate over all iotlb entries once with new implementation.

This wouldn't work. This reverts commit 6d9cd115b9df ("hw/arm/smmuv3:
Enforce invalidation on a power of two range") which is mandated for
VFIO and virtio to work. IOTLB invalidations must be naturally aligned
and with a power of 2 range, hence this iteration.

Thanks

Eric

>
> Signed-off-by: Renwei Liu <renwei.liu@verisilicon.com>
> ---
> v2:
>  - Remove file mode change.
>
>  hw/arm/smmu-common.c   |  6 +++---
>  hw/arm/smmu-internal.h |  2 +-
>  hw/arm/smmuv3.c        | 22 ++++------------------
>  3 files changed, 8 insertions(+), 22 deletions(-)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 0459850a93..ccb085f83c 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -142,8 +142,8 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, gpointer value,
>      if (info->asid >= 0 && info->asid != SMMU_IOTLB_ASID(iotlb_key)) {
>          return false;
>      }
> -    return ((info->iova & ~entry->addr_mask) == entry->iova) ||
> -           ((entry->iova & ~info->mask) == info->iova);
> +    return (entry->iova >= info->iova) &&
> +           ((entry->iova + entry->addr_mask) < (info->iova + info->range));
>  }
>
>  inline void
> @@ -167,7 +167,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
>
>      SMMUIOTLBPageInvInfo info = {
>          .asid = asid, .iova = iova,
> -        .mask = (num_pages * 1 << granule) - 1};
> +        .range = num_pages * 1 << granule};
>
>      g_hash_table_foreach_remove(s->iotlb,
>                                  smmu_hash_remove_by_asid_iova,
> diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
> index 2d75b31953..f0e3a777af 100644
> --- a/hw/arm/smmu-internal.h
> +++ b/hw/arm/smmu-internal.h
> @@ -101,7 +101,7 @@ uint64_t iova_level_offset(uint64_t iova, int inputsize,
>  typedef struct SMMUIOTLBPageInvInfo {
>      int asid;
>      uint64_t iova;
> -    uint64_t mask;
> +    uint64_t range;
>  } SMMUIOTLBPageInvInfo;
>
>  typedef struct SMMUSIDRange {
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 01b60bee49..0b009107d1 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -857,7 +857,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova,
>
>  static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
>  {
> -    dma_addr_t end, addr = CMD_ADDR(cmd);
> +    dma_addr_t addr = CMD_ADDR(cmd);
>      uint8_t type = CMD_TYPE(cmd);
>      uint16_t vmid = CMD_VMID(cmd);
>      uint8_t scale = CMD_SCALE(cmd);
> @@ -866,7 +866,6 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
>      bool leaf = CMD_LEAF(cmd);
>      uint8_t tg = CMD_TG(cmd);
>      uint64_t num_pages;
> -    uint8_t granule;
>      int asid = -1;
>
>      if (type == SMMU_CMD_TLBI_NH_VA) {
> @@ -880,23 +879,10 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
>          return;
>      }
>
> -    /* RIL in use */
> -
>      num_pages = (num + 1) * BIT_ULL(scale);
> -    granule = tg * 2 + 10;
> -
> -    /* Split invalidations into ^2 range invalidations */
> -    end = addr + (num_pages << granule) - 1;
> -
> -    while (addr != end + 1) {
> -        uint64_t mask = dma_aligned_pow2_mask(addr, end, 64);
> -
> -        num_pages = (mask + 1) >> granule;
> -        trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
> -        smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
> -        smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
> -        addr += mask + 1;
> -    }
> +    trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
> +    smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
> +    smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
>  }
>
>  static gboolean
> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: Tuesday, August 31, 2021 10:46 PM
> To: Liu, Renwei; Peter Maydell
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen, Jianxian
> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
>
> Hi Liu,
>
> On 8/23/21 9:50 AM, Liu, Renwei wrote:
> > Simplify range invalidation which can avoid to iterate over all
> > iotlb entries multi-times. For instance invalidations patterns like
> > "invalidate 32 4kB pages starting from 0xffacd000" need to iterate over
> > all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only needs
> > to iterate over all iotlb entries once with new implementation.
>
> This wouldn't work. This reverts commit 6d9cd115b9df ("hw/arm/smmuv3:
> Enforce invalidation on a power of two range") which is mandated for
> VFIO and virtio to work. IOTLB invalidations must be naturally aligned
> and with a power of 2 range, hence this iteration.
>
> Thanks
>
> Eric

Hi Eric,

Could you try the patch first? I want to know whether it fails in your
application scenario with this implementation.

I agree with you that an IOTLB entry must be naturally aligned and have a
power of 2 range. But we can invalidate multiple IOTLB entries in one
iteration: we check the overlap between the invalidation range and the
IOTLB range, rather than checking the mask. The final result is the same
as with your implementation (splitting into multiple power of 2 ranges).
I wonder why we can't implement it directly, given that the application
can send an invalidation command with a non power of 2 range. We have
tested it in our application scenario and have not found any failure.

In addition, from the code implementation, smmu_iotlb_inv_iova() should
be OK. The other call path, smmuv3_inv_notifiers_iova() ->
smmuv3_notify_iova() -> memory_region_notify_iommu_one(), also checks
range overlap. So it should be OK if the range is not a power of 2.

Could you take a look at it again?

Thanks
Renwei Liu
Hi,

On 9/1/21 8:33 AM, Liu, Renwei wrote:
>> -----Original Message-----
>> From: Eric Auger [mailto:eric.auger@redhat.com]
>> Sent: Tuesday, August 31, 2021 10:46 PM
>> To: Liu, Renwei; Peter Maydell
>> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen, Jianxian
>> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
>>
>> Hi Liu,
>>
>> On 8/23/21 9:50 AM, Liu, Renwei wrote:
>>> Simplify range invalidation which can avoid to iterate over all
>>> iotlb entries multi-times. For instance invalidations patterns like
>>> "invalidate 32 4kB pages starting from 0xffacd000" need to iterate over
>>> all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only needs
>>> to iterate over all iotlb entries once with new implementation.
>> This wouldn't work. This reverts commit 6d9cd115b9df ("hw/arm/smmuv3:
>> Enforce invalidation on a power of two range") which is mandated for
>> VFIO and virtio to work. IOTLB invalidations must be naturally aligned
>> and with a power of 2 range, hence this iteration.
>>
>> Thanks
>>
>> Eric
> Hi Eric,
>
> Could you try the patch firstly? I want to know whether it's failed
> in your application scenario with this implementation.

There are many test cases: virtio-pci, vhost, VFIO, ...

> I agree with you that IOTLB entry must be naturally aligned and
> with a power of 2 range. But we can invalidate multi IOTLB entries
> in one iteration. We check the overlap between invalidation range
> and IOTLB range, not check mask.

This smmu_hash_remove_by_asid_iova() change only affects the internal
SMMUv3 IOTLB hash table lookup. However there are also IOTLB
invalidation notifications sent to components which registered
notifiers, i.e. the smmuv3_notify_iova path.

> The final result is same with
> your implementation (split to multi times with a power of 2 range).
> I wonder why we can't implement it directly when the application can
> send an invalidation command with a non power of 2 range.
> We have tested it in our application scenario and not find any fail.

Assume the driver invalidates a 5 * 4kB pages = 0x5000 range. Without
the loop you removed, smmuv3_notify_iova() computes
event.entry.addr_mask = num_pages * (1 << granule) - 1 = 0x4FFF. This
addr_mask is an invalid mask. The entry is passed to components which
registered invalidation notifiers, such as vhost or VFIO. For instance,
in VFIO there are '&' ops on the addr_mask, and addr_mask is expected to
be the mask of a power of 2 range.

Does it clarify?

Thanks

Eric

> In addition, from the code implementation, smmu_iotlb_inv_iova()
> should be OK. In another call smmuv3_inv_notifiers_iova() ->
> smmuv3_notify_iova() -> memory_region_notify_iommu_one(),
> it also checks range overlap. So it should be OK if the range
> is not a power of 2.
>
> Could you take a look at it again?
>
> Thanks
> Renwei Liu
> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: Wednesday, September 01, 2021 9:14 PM
> To: Liu, Renwei; Peter Maydell
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen, Jianxian
> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
>
> Hi,
>
> On 9/1/21 8:33 AM, Liu, Renwei wrote:
> >> -----Original Message-----
> >> From: Eric Auger [mailto:eric.auger@redhat.com]
> >> Sent: Tuesday, August 31, 2021 10:46 PM
> >> To: Liu, Renwei; Peter Maydell
> >> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen, Jianxian
> >> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
> >>
> >> Hi Liu,
> >>
> >> On 8/23/21 9:50 AM, Liu, Renwei wrote:
> >>> Simplify range invalidation which can avoid to iterate over all
> >>> iotlb entries multi-times. For instance invalidations patterns like
> >>> "invalidate 32 4kB pages starting from 0xffacd000" need to iterate over
> >>> all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only needs
> >>> to iterate over all iotlb entries once with new implementation.
> >> This wouldn't work. This reverts commit 6d9cd115b9df ("hw/arm/smmuv3:
> >> Enforce invalidation on a power of two range") which is mandated for
> >> VFIO and virtio to work. IOTLB invalidations must be naturally aligned
> >> and with a power of 2 range, hence this iteration.
> >>
> >> Thanks
> >>
> >> Eric
> > Hi Eric,
> >
> > Could you try the patch firstly? I want to know whether it's failed
> > in your application scenario with this implementation.
> There are many test cases, virtio-pci, vhost, VFIO, ...
> > I agree with you that IOTLB entry must be naturally aligned and
> > with a power of 2 range. But we can invalidate multi IOTLB entries
> > in one iteration. We check the overlap between invalidation range
> > and IOTLB range, not check mask.
> This smmu_hash_remove_by_asid_iova() change only affects the internal
> SMMUv3 IOTLB hash table lookup. However there are also IOTLB
> invalidation notifications sent to components who registered notifiers,
> ie. smmuv3_notify_iova path.
> > The final result is same with
> > your implementation (split to multi times with a power of 2 range).
> > I wonder why we can't implement it directly when the application can
> > send an invalidation command with a non power of 2 range.
> > We have tested it in our application scenario and not find any fail.
> Assume the driver invalidates 5 * 4kB pages = 0x5000 range. Without the
> loop you removed in smmuv3_notify_iova() event.entry.addr_mask =
> num_pages * (1 << granule) - 1 = 0x4FFF. This addr_mask is an invalid
> mask. This entry is passed to components who registered invalidation
> notifiers such as vhost or vfio. For instance in VFIO you have '&' ops
> on the addr_mask. addr_mask is expected to be a mask of a power of 2
> range.
>
> Does it clarify?
>
> Thanks
>
> Eric

Hi Eric,

Got it, thanks a lot for the clarification. I didn't consider the
further notifiers from vhost or VFIO, because they are not registered in
our application scenario. Let's keep the previous implementation and
ignore this patch.

Thanks
Renwei Liu

> > In addition, from the code implementation, smmu_iotlb_inv_iova()
> > should be OK. In another call smmuv3_inv_notifiers_iova() ->
> > smmuv3_notify_iova() -> memory_region_notify_iommu_one(),
> > it also checks range overlap. So it should be OK if the range
> > is not a power of 2.
> >
> > Could you take a look at it again?
> >
> > Thanks
> > Renwei Liu
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 0459850a93..ccb085f83c 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -142,8 +142,8 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, gpointer value,
     if (info->asid >= 0 && info->asid != SMMU_IOTLB_ASID(iotlb_key)) {
         return false;
     }
-    return ((info->iova & ~entry->addr_mask) == entry->iova) ||
-           ((entry->iova & ~info->mask) == info->iova);
+    return (entry->iova >= info->iova) &&
+           ((entry->iova + entry->addr_mask) < (info->iova + info->range));
 }
 
 inline void
@@ -167,7 +167,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
 
     SMMUIOTLBPageInvInfo info = {
         .asid = asid, .iova = iova,
-        .mask = (num_pages * 1 << granule) - 1};
+        .range = num_pages * 1 << granule};
 
     g_hash_table_foreach_remove(s->iotlb,
                                 smmu_hash_remove_by_asid_iova,
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
index 2d75b31953..f0e3a777af 100644
--- a/hw/arm/smmu-internal.h
+++ b/hw/arm/smmu-internal.h
@@ -101,7 +101,7 @@ uint64_t iova_level_offset(uint64_t iova, int inputsize,
 typedef struct SMMUIOTLBPageInvInfo {
     int asid;
     uint64_t iova;
-    uint64_t mask;
+    uint64_t range;
 } SMMUIOTLBPageInvInfo;
 
 typedef struct SMMUSIDRange {
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 01b60bee49..0b009107d1 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -857,7 +857,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova,
 
 static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 {
-    dma_addr_t end, addr = CMD_ADDR(cmd);
+    dma_addr_t addr = CMD_ADDR(cmd);
     uint8_t type = CMD_TYPE(cmd);
     uint16_t vmid = CMD_VMID(cmd);
     uint8_t scale = CMD_SCALE(cmd);
@@ -866,7 +866,6 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
     bool leaf = CMD_LEAF(cmd);
     uint8_t tg = CMD_TG(cmd);
     uint64_t num_pages;
-    uint8_t granule;
     int asid = -1;
 
     if (type == SMMU_CMD_TLBI_NH_VA) {
@@ -880,23 +879,10 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
         return;
     }
 
-    /* RIL in use */
-
     num_pages = (num + 1) * BIT_ULL(scale);
-    granule = tg * 2 + 10;
-
-    /* Split invalidations into ^2 range invalidations */
-    end = addr + (num_pages << granule) - 1;
-
-    while (addr != end + 1) {
-        uint64_t mask = dma_aligned_pow2_mask(addr, end, 64);
-
-        num_pages = (mask + 1) >> granule;
-        trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
-        smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
-        smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
-        addr += mask + 1;
-    }
+    trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
+    smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
+    smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
 }
 
 static gboolean
Simplify range invalidation to avoid iterating over all iotlb entries
multiple times. For instance, invalidation patterns like "invalidate 32
4kB pages starting from 0xffacd000" need to iterate over all iotlb
entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). With the new
implementation it only needs to iterate over all iotlb entries once.

Signed-off-by: Renwei Liu <renwei.liu@verisilicon.com>
---
v2:
 - Remove file mode change.

 hw/arm/smmu-common.c   |  6 +++---
 hw/arm/smmu-internal.h |  2 +-
 hw/arm/smmuv3.c        | 22 ++++------------------
 3 files changed, 8 insertions(+), 22 deletions(-)