Message ID | 20210107122909.16317-1-yong.wu@mediatek.com (mailing list archive) |
---|---|
Series | MediaTek IOMMU improve tlb flush performance in map/unmap |
On Thu, Jan 07, 2021 at 08:29:02PM +0800, Yong Wu wrote:
> This patchset is to improve tlb flushing performance in iommu_map/unmap
> for MediaTek IOMMU.
>
> For iommu_map, currently MediaTek IOMMU use IO_PGTABLE_QUIRK_TLBI_ON_MAP
> to do tlb_flush for each a memory chunk. this is so unnecessary. we could
> improve it by tlb flushing one time at the end of iommu_map.
>
> For iommu_unmap, currently we have already improve this performance by
> gather. But the current gather should take care its granule size. if the
> granule size is different, it will do tlb flush and gather again. Our HW
> don't care about granule size. thus I gather the range in our file.
>
> After this patchset, we could achieve only tlb flushing once in iommu_map
> and iommu_unmap.
>
> Regardless of sg, for each a segment, I did a simple test:
>
>   size = 20 * SZ_1M;
>   /* the worst case, all are 4k mapping. */
>   ret = iommu_map(domain, 0x5bb02000, 0x123f1000, size, IOMMU_READ);
>   iommu_unmap(domain, 0x5bb02000, size);
>
> This is the comparing time(unit is us):
>               original-time  after-improve
>    map-20M    59943          2347
>    unmap-20M  264            36
>
> This patchset also flush tlb once in the iommu_map_sg case.
>
> patch [1/7][2/7][3/7] are for map while the others are for unmap.
>
> change note:
> v4: a. base on v5.11-rc1.
>     b. Add a little helper _iommu_map.
>     c. Fix a build fail for tegra-gart.c. I didn't notice there is another place
>        call gart_iommu_sync_map.
>     d. Switch gather->end to the read end address("start + end - 1").
>
> v3: https://lore.kernel.org/linux-iommu/20201216103607.23050-1-yong.wu@mediatek.com/#r
>     Refactor the unmap flow suggested by Robin.
>
> v2: https://lore.kernel.org/linux-iommu/20201119061836.15238-1-yong.wu@mediatek.com/
>     Refactor all the code.
>     base on v5.10-rc1.
>
> Yong Wu (7):
>   iommu: Move iotlb_sync_map out from __iommu_map
>   iommu: Add iova and size as parameters in iotlb_sync_map
>   iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
>   iommu: Switch gather->end to the inclusive end
>   iommu/io-pgtable: Allow io_pgtable_tlb ops optional
>   iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
>   iommu/mediatek: Remove the tlb-ops for v7s

For the series:

Acked-by: Will Deacon <will@kernel.org>

Joerg -- how would you like to handle merging this? I suppose either you
could host a separate branch that I could merge if needed, or I could
include this in my pull to you, or something else. Please let me know what
you prefer,

Cheers,

Will
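
As a rough illustration of why the map path speeds up so much, the following is a small user-space model (not kernel code) of the test in the cover letter. It counts TLB flushes for a 20 MiB mapping built from 4 KiB pages, once with a flush per chunk and once with a single sync at the end. The names `tlb_stats` and `model_map` and the local `SZ_4K`/`SZ_1M` definitions are assumptions made for this sketch and do not correspond to functions in the series.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the kernel size constants; defined here for the sketch. */
#define SZ_4K 0x1000UL
#define SZ_1M 0x100000UL

/* Hypothetical counter for how many TLB flushes a mapping request issues. */
struct tlb_stats {
	unsigned long flushes;
};

/*
 * Model of mapping [iova, iova + size - 1] in pgsize-sized chunks.
 * flush_each_chunk != 0 models a flush after every chunk is mapped;
 * flush_each_chunk == 0 models one sync over the whole range at the end.
 */
static void model_map(struct tlb_stats *st, unsigned long iova, size_t size,
		      size_t pgsize, int flush_each_chunk)
{
	size_t off;

	assert(pgsize > 0 && size % pgsize == 0);
	(void)iova; /* the address does not affect the flush count in this model */

	for (off = 0; off < size; off += pgsize) {
		/* A real driver would install the page table entry here. */
		if (flush_each_chunk)
			st->flushes++;
	}

	if (!flush_each_chunk && size > 0)
		st->flushes++; /* single sync covering the whole range */
}
```

Called with `size = 20 * SZ_1M` and `pgsize = SZ_4K`, the per-chunk variant counts 5120 flushes and the sync-once variant counts 1. The model only counts flushes; the timings quoted above come from the author's hardware test and are not reproduced by it.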
On Thu, 7 Jan 2021 20:29:02 +0800, Yong Wu wrote:
> This patchset is to improve tlb flushing performance in iommu_map/unmap
> for MediaTek IOMMU.
>
> For iommu_map, currently MediaTek IOMMU use IO_PGTABLE_QUIRK_TLBI_ON_MAP
> to do tlb_flush for each a memory chunk. this is so unnecessary. we could
> improve it by tlb flushing one time at the end of iommu_map.
>
> [...]

After discussion with Joerg, I'll queue this (and hopefully the next posting
of your IOMMU driver) along with the Arm SMMU patches, and then send that
all together.

Applied to will (for-joerg/mtk), thanks!

[1/7] iommu: Move iotlb_sync_map out from __iommu_map
      https://git.kernel.org/arm64/c/d8c1df02ac7f
[2/7] iommu: Add iova and size as parameters in iotlb_sync_map
      https://git.kernel.org/arm64/c/2ebbd25873ce
[3/7] iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
      https://git.kernel.org/arm64/c/20143451eff0
[4/7] iommu: Switch gather->end to the inclusive end
      https://git.kernel.org/arm64/c/862c3715de8f
[5/7] iommu/io-pgtable: Allow io_pgtable_tlb ops optional
      https://git.kernel.org/arm64/c/77e0992aee4e
[6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
      https://git.kernel.org/arm64/c/f21ae3b10084
[7/7] iommu/mediatek: Remove the tlb-ops for v7s
      https://git.kernel.org/arm64/c/0954d61a59e3

Cheers,
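
For readers unfamiliar with the gather idea behind patches 4/7 and 6/7, here is a minimal user-space sketch of accumulating unmapped ranges with an inclusive end, so that one flush can cover everything at the end of iommu_unmap. The struct `gather_range` and its helpers are hypothetical stand-ins written for this illustration; they are not the kernel's `struct iommu_iotlb_gather` or the MediaTek driver code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a gather structure; not the kernel definition. */
struct gather_range {
	unsigned long start; /* lowest address gathered so far */
	unsigned long end;   /* highest address gathered so far, inclusive */
};

/* Start with an empty range: any real range will narrow start and widen end. */
static void gather_init(struct gather_range *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
}

/* Widen the gathered range so it also covers [iova, iova + size - 1]. */
static void gather_add(struct gather_range *g, unsigned long iova, size_t size)
{
	unsigned long last;

	assert(size > 0);
	last = iova + size - 1; /* inclusive end of this chunk */

	if (iova < g->start)
		g->start = iova;
	if (last > g->end)
		g->end = last;
}

/* Number of bytes a single flush over the gathered range has to cover. */
static size_t gather_len(const struct gather_range *g)
{
	return g->end - g->start + 1;
}
```

Because the sketch ignores granule size entirely, chunks of different sizes simply widen the same range, which mirrors the cover letter's point that the MediaTek hardware does not care about granule size. An inclusive end also means a chunk that finishes at the very top of the address space can be recorded without `iova + size` wrapping to zero.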