Message ID | 2-v1-21cc72fcfb22+a7a-iommu_all_defdom_jgg@nvidia.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | iommu: Make default_domain's mandatory | expand |
On 2023-05-01 19:02, Jason Gunthorpe wrote: > tegra-gart seems to be kind of wonky since from the start its 'detach_dev' > op doesn't actually touch hardware. It is supposed to empty the GART of > all translations loaded into it. No, detach should never tear down translations - what if other devices are still using the domain? > Call this weirdness PLATFORM which keeps the basic original > ops->detach_dev() semantic alive without needing much special core code > support. I'm guessing it really ends up in a BLOCKING configuration, but > without any forced cleanup it is unsafe. The GART translation aperture is in physical address space, so the truth is that all devices have access to it at the same time as having access to the rest of physical address space. Attach/detach here really are only bookkeeping for which domain currently owns the aperture. FWIW I wrote up this patch a while ago, not sure if it needs rebasing again... Thanks, Robin. ----->8----- Subject: [PATCH] iommu/tegra-gart: Add default identity domain support The nature of a GART means that supporting identity domains is as easy as doing nothing, so bring the Tegra driver into the modern world of default domains with a trivial implementation. Identity domains are allowed to exist alongside any explicit domain for the translation aperture, since they both simply represent regions of the physical address space with no isolation from each other. As such we'll continue to do the "wrong" thing with groups to allow that to work, since the notion of isolation that groups represent is counterproductive to the GART's established usage model. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/tegra-gart.c | 39 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c index c4136eec1f97..07aa7ea6a306 100644 --- a/drivers/iommu/tegra-gart.c +++ b/drivers/iommu/tegra-gart.c @@ -111,7 +111,13 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain, spin_lock(&gart->dom_lock); - if (gart->active_domain && gart->active_domain != domain) { + if (domain->type == IOMMU_DOMAIN_IDENTITY) { + if (dev_iommu_priv_get(dev)) { + dev_iommu_priv_set(dev, NULL); + if (--gart->active_devices == 0) + gart->active_domain = NULL; + } + } else if (gart->active_domain && gart->active_domain != domain) { ret = -EINVAL; } else if (dev_iommu_priv_get(dev) != domain) { dev_iommu_priv_set(dev, domain); @@ -124,28 +130,15 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain, return ret; } -static void gart_iommu_set_platform_dma(struct device *dev) -{ - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct gart_device *gart = gart_handle; - - spin_lock(&gart->dom_lock); - - if (dev_iommu_priv_get(dev) == domain) { - dev_iommu_priv_set(dev, NULL); - - if (--gart->active_devices == 0) - gart->active_domain = NULL; - } - - spin_unlock(&gart->dom_lock); -} - static struct iommu_domain *gart_iommu_domain_alloc(struct device *dev, unsigned type) { + static struct iommu_domain identity; struct iommu_domain *domain; + if (type == IOMMU_DOMAIN_IDENTITY) + return &identity; + if (type != IOMMU_DOMAIN_UNMANAGED) return NULL; @@ -162,7 +155,8 @@ static struct iommu_domain *gart_iommu_domain_alloc(struct device *dev, static void gart_iommu_domain_free(struct iommu_domain *domain) { WARN_ON(gart_handle->active_domain == domain); - kfree(domain); + if (domain->type != IOMMU_DOMAIN_IDENTITY) + kfree(domain); } static inline int __gart_iommu_map(struct gart_device *gart, unsigned long iova, @@ -247,6 +241,11 @@ static struct iommu_device *gart_iommu_probe_device(struct device *dev) return &gart_handle->iommu; } +static int gart_iommu_def_domain_type(struct device *dev) +{ + return IOMMU_DOMAIN_IDENTITY; +} + static int gart_iommu_of_xlate(struct device *dev, struct of_phandle_args *args) { @@ -271,9 +270,9 @@ static const struct iommu_ops gart_iommu_ops = { .domain_alloc = gart_iommu_domain_alloc, .probe_device = gart_iommu_probe_device, .device_group = generic_device_group, - .set_platform_dma_ops = gart_iommu_set_platform_dma, .pgsize_bitmap = GART_IOMMU_PGSIZES, .of_xlate = gart_iommu_of_xlate, + .def_domain_type = gart_iommu_def_domain_type, .default_domain_ops = &(const struct iommu_domain_ops) { .attach_dev = gart_iommu_attach_dev, .map = gart_iommu_map,
On Wed, May 03, 2023 at 10:17:29AM +0100, Robin Murphy wrote: > On 2023-05-01 19:02, Jason Gunthorpe wrote: > > tegra-gart seems to be kind of wonky since from the start its 'detach_dev' > > op doesn't actually touch hardware. It is supposed to empty the GART of > > all translations loaded into it. > > No, detach should never tear down translations - what if other devices are > still using the domain? ?? All other drivers do this. The core contract is that this sequence: dom = iommu_domain_alloc() iommu_attach_device(dom, dev) iommu_map(dom,...) iommu_detach_device(dom, dev) Will not continue to have the IOVA mapped to the device. We rely on this for various error paths. If the HW is multi-device then it is supposed to have groups. > > Call this weirdness PLATFORM which keeps the basic original > > ops->detach_dev() semantic alive without needing much special core code > > support. I'm guessing it really ends up in a BLOCKING configuration, but > > without any forced cleanup it is unsafe. > > The GART translation aperture is in physical address space, so the truth is > that all devices have access to it at the same time as having access to the > rest of physical address space. Attach/detach here really are only > bookkeeping for which domain currently owns the aperture. Oh yuk, that is not an UNMANAGED domain either as we now assume empty UNMANAGED domains are blocking in the core... > FWIW I wrote up this patch a while ago, not sure if it needs rebasing > again... That looks like the same as this patch, just calling the detach dev behavior IDENTITY. Can do.. Thanks, Jason
On 2023-05-03 12:01, Jason Gunthorpe wrote: > On Wed, May 03, 2023 at 10:17:29AM +0100, Robin Murphy wrote: >> On 2023-05-01 19:02, Jason Gunthorpe wrote: >>> tegra-gart seems to be kind of wonky since from the start its 'detach_dev' >>> op doesn't actually touch hardware. It is supposed to empty the GART of >>> all translations loaded into it. >> >> No, detach should never tear down translations - what if other devices are >> still using the domain? > > ?? All other drivers do this. The only driver I'm aware of which effectively tore down mappings by freeing its pagetable on detach was sprd-iommu, and that was recently fixed on account of it being clearly wrong. Remember that the GART registers here are literally just its pagetable, nothing more. > The core contract is that this sequence: > > dom = iommu_domain_alloc() > iommu_attach_device(dom, dev) > iommu_map(dom,...) > iommu_detach_device(dom, dev) > > Will not continue to have the IOVA mapped to the device. We rely on > this for various error paths. Yes, I'm not disputing that we expect detach to remove that device's *access* to the IOVA (which is what GART can't do...), but it should absolutely not destroy the IOVA mapping itself. Follow that sequence with iommu_attach_device(dom, dev) again and the caller can expect to be able to continue using the same translation. > If the HW is multi-device then it is supposed to have groups. Groups are in fact the most practical example: set up a VFIO domain, attach two groups to it, map some IOVAs, detach one of the groups, keep using the other. If the detach carried an implicit iommu_unmap() there would be fireworks. >>> Call this weirdness PLATFORM which keeps the basic original >>> ops->detach_dev() semantic alive without needing much special core code >>> support. I'm guessing it really ends up in a BLOCKING configuration, but >>> without any forced cleanup it is unsafe. >> >> The GART translation aperture is in physical address space, so the truth is >> that all devices have access to it at the same time as having access to the >> rest of physical address space. Attach/detach here really are only >> bookkeeping for which domain currently owns the aperture. > > Oh yuk, that is not an UNMANAGED domain either as we now assume empty > UNMANAGED domains are blocking in the core... They are, in the sense that accesses within the aperture won't go anywhere. It might help if domain->geometry.force_aperture was meaningful, because it's never been clear whether it was supposed to reflect a hardware capability (in which case it should be false for GART) or be an instruction to the user of the domain (wherein it's a bit pointless that everyone always sets it). Thanks, Robin.
On Wed, May 03, 2023 at 01:01:34PM +0100, Robin Murphy wrote: > On 2023-05-03 12:01, Jason Gunthorpe wrote: > > On Wed, May 03, 2023 at 10:17:29AM +0100, Robin Murphy wrote: > > > On 2023-05-01 19:02, Jason Gunthorpe wrote: > > > > tegra-gart seems to be kind of wonky since from the start its 'detach_dev' > > > > op doesn't actually touch hardware. It is supposed to empty the GART of > > > > all translations loaded into it. > > > > > > No, detach should never tear down translations - what if other devices are > > > still using the domain? > > > > ?? All other drivers do this. > > The only driver I'm aware of which effectively tore down mappings by freeing > its pagetable on detach was sprd-iommu, and that was recently fixed on > account of it being clearly wrong. By "Teardown" I mean deconfigure the HW. This driver is odd because it doesn't store a page table in the iommu_domain, it keeps it in the GART registers so it can't actually detach/attach fully correctly. :( > Yes, I'm not disputing that we expect detach to remove that device's > *access* to the IOVA (which is what GART can't do...), but it should > absolutely not destroy the IOVA mapping itself. Follow that sequence with > iommu_attach_device(dom, dev) again and the caller can expect to be able to > continue using the same translation. Yes > > If the HW is multi-device then it is supposed to have groups. > > Groups are in fact the most practical example: set up a VFIO domain, attach > two groups to it, map some IOVAs, detach one of the groups, keep using the > other. If the detach carried an implicit iommu_unmap() there would be > fireworks. Yes, I'm not saying an unmap, I used the word teardown to mean remove the HW parts. This gart function doesn't touch the HW at all, that cannot be correct. It should have an xarray in the iommu_domain and on detach it should purge the GART registers and on attach it should load the xarray into the GART registers. We are also technically expecting drivers to support map prior to attach, eg for the direct map reserved region setup. > > Oh yuk, that is not an UNMANAGED domain either as we now assume empty > > UNMANAGED domains are blocking in the core... > > They are, in the sense that accesses within the aperture won't go > anywhere. That is not the definition of BLOCKING we came up with.. It is every IOVA is blocked and the device is safe to hand to VFIO. It can't be just blocking a subset of the IOVA. > It might help if domain->geometry.force_aperture was meaningful, because > it's never been clear whether it was supposed to reflect a hardware > capability (in which case it should be false for GART) or be an instruction > to the user of the domain (wherein it's a bit pointless that everyone always > sets it). force_aperture looks pointless now. Only two drivers don't set it - mtk_v1 and sprd. The only real reader is dma-iommu.c and mtk_v1 doesn't use that. So the only possible user is sprd. The only thing it does is cause dma-iommu.c in ARM64 to use the dma-ranges from OF instead of the domain aperture. sprd has no dma-ranges in arch/arm64/boot/dts/sprd. Further, sprd hard fails any map attempt outside the aperture, so it looks like a bug if the OF somehow chooses a wider aperture as dma-iommu.c will start failing maps. Thus, I propose we just remove the whole thing. All drivers must set an aperture and the aperture is the pure HW capability to map an IOPTE at that address. ie it reflects the design of the page table itself and nothing else. Probably OF dma-ranges should be reflected in the pre-device reserved ranges? This is great, I was starting to look at this part wishing the OF path wasn't different, and this is a clear way forward :) For GART, I'm tempted to give GART a blocking domain and just have its attach always fail - this is enough to block VFIO. Keep the weirdness in one place.. Or ignore it since I doubt anyone is actually using this now. Jason
On Wed, May 03, 2023 at 10:45:11AM -0300, Jason Gunthorpe wrote: > On Wed, May 03, 2023 at 01:01:34PM +0100, Robin Murphy wrote: > > On 2023-05-03 12:01, Jason Gunthorpe wrote: > > > On Wed, May 03, 2023 at 10:17:29AM +0100, Robin Murphy wrote: > > > > On 2023-05-01 19:02, Jason Gunthorpe wrote: > > > > > tegra-gart seems to be kind of wonky since from the start its 'detach_dev' > > > > > op doesn't actually touch hardware. It is supposed to empty the GART of > > > > > all translations loaded into it. > > > > > > > > No, detach should never tear down translations - what if other devices are > > > > still using the domain? > > > > > > ?? All other drivers do this. > > > > The only driver I'm aware of which effectively tore down mappings by freeing > > its pagetable on detach was sprd-iommu, and that was recently fixed on > > account of it being clearly wrong. > > By "Teardown" I mean deconfigure the HW. > > This driver is odd because it doesn't store a page table in the > iommu_domain, it keeps it in the GART registers so it can't actually > detach/attach fully correctly. :( > > > Yes, I'm not disputing that we expect detach to remove that device's > > *access* to the IOVA (which is what GART can't do...), but it should > > absolutely not destroy the IOVA mapping itself. Follow that sequence with > > iommu_attach_device(dom, dev) again and the caller can expect to be able to > > continue using the same translation. > > Yes > > > > If the HW is multi-device then it is supposed to have groups. > > > > Groups are in fact the most practical example: set up a VFIO domain, attach > > two groups to it, map some IOVAs, detach one of the groups, keep using the > > other. If the detach carried an implicit iommu_unmap() there would be > > fireworks. > > Yes, I'm not saying an unmap, I used the word teardown to mean remove > the HW parts. This gart function doesn't touch the HW at all, that > cannot be correct. > > It should have an xarray in the iommu_domain and on detach it should > purge the GART registers and on attach it should load the xarray into > the GART registers. We are also technically expecting drivers to > support map prior to attach, eg for the direct map reserved region > setup. > > > > Oh yuk, that is not an UNMANAGED domain either as we now assume empty > > > UNMANAGED domains are blocking in the core... > > > > They are, in the sense that accesses within the aperture won't go > > anywhere. > > That is not the definition of BLOCKING we came up with.. It is every > IOVA is blocked and the device is safe to hand to VFIO. It can't be just > blocking a subset of the IOVA. > > > It might help if domain->geometry.force_aperture was meaningful, because > > it's never been clear whether it was supposed to reflect a hardware > > capability (in which case it should be false for GART) or be an instruction > > to the user of the domain (wherein it's a bit pointless that everyone always > > sets it). > > force_aperture looks pointless now. Only two drivers don't set it - > mtk_v1 and sprd. > > The only real reader is dma-iommu.c and mtk_v1 doesn't use that. > > So the only possible user is sprd. > > The only thing it does is cause dma-iommu.c in ARM64 to use the > dma-ranges from OF instead of the domain aperture. sprd has no > dma-ranges in arch/arm64/boot/dts/sprd. > > Further, sprd hard fails any map attempt outside the aperture, so it > looks like a bug if the OF somehow chooses a wider aperture as > dma-iommu.c will start failing maps. That all sounds odd. of_dma_configure_id() already sets up the DMA mask based on dma-ranges and the DMA API uses that to restrict what IOVA any buffers can get mapped to for a given device. Drivers can obviously still narrow down the DMA mask further if they have any specific needs. On Tegra, for example, we use this to enforce bus-level DMA masks. The Ethernet controller for instance might support 40 bit addresses, but the memory bus has a quirk where bit 39 is used for extra "swizzling", so we have to restrict DMA masks to 39 bits for all devices, regardless of what the drivers claim. > Thus, I propose we just remove the whole thing. All drivers must set > an aperture and the aperture is the pure HW capability to map an > IOPTE at that address. ie it reflects the design of the page table > itself and nothing else. Yeah, that sounds reasonable. If the aperture represents what the IOMMU supports. Together with each device's DMA mask we should have everything we need. > > Probably OF dma-ranges should be reflected in the pre-device reserved > ranges? > > This is great, I was starting to look at this part wishing the OF path > wasn't different, and this is a clear way forward :) > > For GART, I'm tempted to give GART a blocking domain and just have its > attach always fail - this is enough to block VFIO. Keep the weirdness > in one place.. Or ignore it since I doubt anyone is actually using > this now. For Tegra GART I think there's indeed no use-cases at the moment. Dmitry had at one point tried to make use of it because it can be helpful on some of the older devices that were very memory-constrained. That support never made it upstream because it required significant changes in various places, if I recall correctly. For anything with a decent enough amount of RAM, CMA is usually a better option. This has occasionally come up in the past and I seem to remember that it had once been proposed to simply remove tegra-gart and there had been no objections. Adding Dmitry, if he doesn't have objections to remaving it, neither do I. Thierry
On Wed, May 03, 2023 at 04:43:53PM +0200, Thierry Reding wrote: > > The only thing it does is cause dma-iommu.c in ARM64 to use the > > dma-ranges from OF instead of the domain aperture. sprd has no > > dma-ranges in arch/arm64/boot/dts/sprd. > > > > Further, sprd hard fails any map attempt outside the aperture, so it > > looks like a bug if the OF somehow chooses a wider aperture as > > dma-iommu.c will start failing maps. > > That all sounds odd. of_dma_configure_id() already sets up the DMA mask > based on dma-ranges and the DMA API uses that to restrict what IOVA any > buffers can get mapped to for a given device. Yes, and after it sets up the mask it also passes that range down like this: of_dma_configure_id(): end = dma_start + size - 1; mask = DMA_BIT_MASK(ilog2(end) + 1); dev->coherent_dma_mask &= mask; arch_setup_dma_ops(dev, dma_start, size, iommu, coherent); Which eventually goes to: iommu_setup_dma_ops() iommu_dma_init_domain() Which then does: if (domain->geometry.force_aperture) { And if not set uses the dma_start/size parameter as the actual aperture. (!?) Since sprd does this in the iommu driver: dom->domain.geometry.aperture_start = 0; dom->domain.geometry.aperture_end = SZ_256M - 1; And it is serious about enforcing it during map: unsigned long start = domain->geometry.aperture_start; unsigned long end = domain->geometry.aperture_end; if (iova < start || (iova + size) > (end + 1)) { return -EINVAL; We must see the dma_start/size parameter be a subset of the aperture or eventually dma-iommu.c will see map failures. I can't see how this is can happen, so it looks like omitting force_aperture is a mistake not a deliberate choice. I'll make a patch and see if the SPRD folks have any comment. If there is no reason I can go ahead and purge force_aperture and all the dma_start/size passing through arch_setup_dma_ops(). > > Thus, I propose we just remove the whole thing. All drivers must set > > an aperture and the aperture is the pure HW capability to map an > > IOPTE at that address. ie it reflects the design of the page table > > itself and nothing else. > > Yeah, that sounds reasonable. If the aperture represents what the IOMMU > supports. Together with each device's DMA mask we should have everything > we need. Arguably we should respect the dma-ranges as well, but I think that should come up from the iommu driver via the get_resv_regions() which is the usual place we return FW originated information. > For Tegra GART I think there's indeed no use-cases at the moment. Dmitry > had at one point tried to make use of it because it can be helpful on > some of the older devices that were very memory-constrained. That > support never made it upstream because it required significant changes > in various places, if I recall correctly. For anything with a decent > enough amount of RAM, CMA is usually a better option. So the actual use case of this HW is to linearize buffers? ie it is a general scatter/gather engine? > This has occasionally come up in the past and I seem to remember that it > had once been proposed to simply remove tegra-gart and there had been no > objections. Adding Dmitry, if he doesn't have objections to remaving it, > neither do I. Dmitry, please say yes and I will remove it instead of trying to carry it. The driver is almost 10 years old at this point, I'm skeptical anyone will need it on a 6.2 era kernel.. Thanks, Jason
03.05.2023 20:20, Jason Gunthorpe пишет: > On Wed, May 03, 2023 at 04:43:53PM +0200, Thierry Reding wrote: > >>> The only thing it does is cause dma-iommu.c in ARM64 to use the >>> dma-ranges from OF instead of the domain aperture. sprd has no >>> dma-ranges in arch/arm64/boot/dts/sprd. >>> >>> Further, sprd hard fails any map attempt outside the aperture, so it >>> looks like a bug if the OF somehow chooses a wider aperture as >>> dma-iommu.c will start failing maps. >> >> That all sounds odd. of_dma_configure_id() already sets up the DMA mask >> based on dma-ranges and the DMA API uses that to restrict what IOVA any >> buffers can get mapped to for a given device. > > Yes, and after it sets up the mask it also passes that range down like this: > > of_dma_configure_id(): > end = dma_start + size - 1; > mask = DMA_BIT_MASK(ilog2(end) + 1); > dev->coherent_dma_mask &= mask; > > arch_setup_dma_ops(dev, dma_start, size, iommu, coherent); > > Which eventually goes to: > > iommu_setup_dma_ops() > iommu_dma_init_domain() > > Which then does: > > if (domain->geometry.force_aperture) { > > And if not set uses the dma_start/size parameter as the actual > aperture. (!?) > > Since sprd does this in the iommu driver: > > dom->domain.geometry.aperture_start = 0; > dom->domain.geometry.aperture_end = SZ_256M - 1; > > And it is serious about enforcing it during map: > > unsigned long start = domain->geometry.aperture_start; > unsigned long end = domain->geometry.aperture_end; > > if (iova < start || (iova + size) > (end + 1)) { > return -EINVAL; > > We must see the dma_start/size parameter be a subset of the aperture > or eventually dma-iommu.c will see map failures. > > I can't see how this is can happen, so it looks like omitting > force_aperture is a mistake not a deliberate choice. I'll make a patch > and see if the SPRD folks have any comment. If there is no reason I > can go ahead and purge force_aperture and all the dma_start/size > passing through arch_setup_dma_ops(). > >>> Thus, I propose we just remove the whole thing. All drivers must set >>> an aperture and the aperture is the pure HW capability to map an >>> IOPTE at that address. ie it reflects the design of the page table >>> itself and nothing else. >> >> Yeah, that sounds reasonable. If the aperture represents what the IOMMU >> supports. Together with each device's DMA mask we should have everything >> we need. > > Arguably we should respect the dma-ranges as well, but I think that > should come up from the iommu driver via the get_resv_regions() which > is the usual place we return FW originated information. > >> For Tegra GART I think there's indeed no use-cases at the moment. Dmitry >> had at one point tried to make use of it because it can be helpful on >> some of the older devices that were very memory-constrained. That >> support never made it upstream because it required significant changes >> in various places, if I recall correctly. For anything with a decent >> enough amount of RAM, CMA is usually a better option. > > So the actual use case of this HW is to linearize buffers? ie it is a > general scatter/gather engine? > >> This has occasionally come up in the past and I seem to remember that it >> had once been proposed to simply remove tegra-gart and there had been no >> objections. Adding Dmitry, if he doesn't have objections to remaving it, >> neither do I. > > Dmitry, please say yes and I will remove it instead of trying to carry > it. The driver is almost 10 years old at this point, I'm skeptical > anyone will need it on a 6.2 era kernel.. You probably missed that support for many of 10 years old Tegra2/3 devices was added to kernel during last years. This GART isn't used by upstream DRM driver, but it's used by downstream kernel which uses alternative Tegra DRM driver that works better for older hardware. If it's too much burden to maintain this driver, then feel free to remove it and I'll continue maintaining it in downstream myself. Otherwise I can test your changes if needed.
On Fri, May 12, 2023 at 05:55:23AM +0300, Dmitry Osipenko wrote: > >> This has occasionally come up in the past and I seem to remember that it > >> had once been proposed to simply remove tegra-gart and there had been no > >> objections. Adding Dmitry, if he doesn't have objections to remaving it, > >> neither do I. > > > > Dmitry, please say yes and I will remove it instead of trying to carry > > it. The driver is almost 10 years old at this point, I'm skeptical > > anyone will need it on a 6.2 era kernel.. > > You probably missed that support for many of 10 years old Tegra2/3 > devices was added to kernel during last years. > > This GART isn't used by upstream DRM driver, but it's used by downstream > kernel which uses alternative Tegra DRM driver that works better for > older hardware. It is kernel policy not to carry code to only support out of tree drivers in mainline, so it should be removed, thanks Jason
On 2023-05-12 17:49, Jason Gunthorpe wrote: > On Fri, May 12, 2023 at 05:55:23AM +0300, Dmitry Osipenko wrote: > >>>> This has occasionally come up in the past and I seem to remember that it >>>> had once been proposed to simply remove tegra-gart and there had been no >>>> objections. Adding Dmitry, if he doesn't have objections to remaving it, >>>> neither do I. >>> >>> Dmitry, please say yes and I will remove it instead of trying to carry >>> it. The driver is almost 10 years old at this point, I'm skeptical >>> anyone will need it on a 6.2 era kernel.. >> >> You probably missed that support for many of 10 years old Tegra2/3 >> devices was added to kernel during last years. >> >> This GART isn't used by upstream DRM driver, but it's used by downstream >> kernel which uses alternative Tegra DRM driver that works better for >> older hardware. > > It is kernel policy not to carry code to only support out of tree drivers in > mainline, so it should be removed, thanks Aww, I was literally in the middle of writing a Friday-afternoon patch to fix it... still need to build-test, but it's really not looking too bad at all: drivers/iommu/tegra-gart.c | 53 +++++++++++++++++----------------- 1 file changed, 27 insertions(+), 26 deletions(-) After that I was going to clean up the force_aperture confusion. TBH I've grown to rather like having this driver around as an exception to prove our abstractions and help make sure they make sense - it doesn't take much effort to keep it functional, and it's not like there aren't plenty of in-tree drivers for hardware even more ancient, obscure and less-used than Tegra20. FWIW I have *20*-year-old hardware at home running an up-to-date mainline-based distro for a practical purpose, but I guess that's considered valid if it says Intel on it? :P Thanks, Robin.
On Fri, May 12, 2023 at 07:12:21PM +0100, Robin Murphy wrote: > On 2023-05-12 17:49, Jason Gunthorpe wrote: > > On Fri, May 12, 2023 at 05:55:23AM +0300, Dmitry Osipenko wrote: > > > > > > > This has occasionally come up in the past and I seem to remember that it > > > > > had once been proposed to simply remove tegra-gart and there had been no > > > > > objections. Adding Dmitry, if he doesn't have objections to remaving it, > > > > > neither do I. > > > > > > > > Dmitry, please say yes and I will remove it instead of trying to carry > > > > it. The driver is almost 10 years old at this point, I'm skeptical > > > > anyone will need it on a 6.2 era kernel.. > > > > > > You probably missed that support for many of 10 years old Tegra2/3 > > > devices was added to kernel during last years. > > > > > > This GART isn't used by upstream DRM driver, but it's used by downstream > > > kernel which uses alternative Tegra DRM driver that works better for > > > older hardware. > > > > It is kernel policy not to carry code to only support out of tree drivers in > > mainline, so it should be removed, thanks > > Aww, I was literally in the middle of writing a Friday-afternoon patch to > fix it... still need to build-test, but it's really not looking too bad at > all: But we still need some kind of core support to accommodate a domain with a mixture of identity and translation. Seems a bit pointless for a driver with no in tree user even? > After that I was going to clean up the force_aperture confusion. Oh I already made a patch to delete it :) > TBH I've grown to rather like having this driver around as an > exception to prove our abstractions and help make sure they make > sense Except nobody else seems to know it is here or really understand how weird/broken it is compared to the other drivers. We've managed to ignore the issues, but I wouldn't say it is helping build abstractions. If we remove this driver then the iommu subsystem has no HW with a mixed translation in the iommu_domain and looking forwards I think that will be the kind of HW people want to build. The remaining GART-like hardware is in arch code. > not like there aren't plenty of in-tree drivers for hardware even > more ancient, obscure and less-used than Tegra20. FWIW I have > *20*-year-old hardware at home running an up-to-date mainline-based > distro for a practical purpose, but I guess that's considered valid > if it says Intel on it? :P Well, We keep trying to remove stuff across the kernel.. This week I heard about removing areas of HIGHMEM support, removing ia64 and even some rumbles that it is time to say good bye to ia32 - apparently it is now slightly broken. Who knows what will stick in the end but it is not just ARM. Stuff tends to get removed when it stands in the way.. On the other hand stuff with no in-tree user should just be summarily deleted. We have enough problems keeping actual ancient used code going, the community doesn't need even more work to support some out of tree stuff. Jason
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c index a482ff838b5331..09865889ff2480 100644 --- a/drivers/iommu/tegra-gart.c +++ b/drivers/iommu/tegra-gart.c @@ -124,11 +124,27 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain, return ret; } -static void gart_iommu_set_platform_dma(struct device *dev) +/* + * FIXME: This weird function that doesn't touch the HW, but it is supposed to + * zap any current translation from the HW. + * + * Preserve whatever this was doing in 2011 as basically the same in the + * new API. The IOMMU_DOMAIN_PLATFORM's attach_dev function is called at almost + * the same times as the old detach_dev. + * + * I suspect the reality is that the UNMANAGED domain is never actually detached + * in real systems, or it is only detached once eveything is already unmapped, + * so this could get by this way. + */ +static int gart_iommu_platform_attach(struct iommu_domain *identity_domain, + struct device *dev) { struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct gart_device *gart = gart_handle; + if (domain == identity_domain || !domain) + return 0; + spin_lock(&gart->dom_lock); if (dev_iommu_priv_get(dev) == domain) { @@ -139,8 +155,18 @@ static void gart_iommu_set_platform_dma(struct device *dev) } spin_unlock(&gart->dom_lock); + return 0; } +static struct iommu_domain_ops gart_iommu_platform_ops = { + .attach_dev = gart_iommu_platform_attach, +}; + +static struct iommu_domain gart_iommu_platform_domain = { + .type = IOMMU_DOMAIN_PLATFORM, + .ops = &gart_iommu_platform_ops, +}; + static struct iommu_domain *gart_iommu_domain_alloc(unsigned type) { struct iommu_domain *domain; @@ -267,10 +293,10 @@ static void gart_iommu_sync(struct iommu_domain *domain, } static const struct iommu_ops gart_iommu_ops = { + .default_domain = &gart_iommu_platform_domain, .domain_alloc = gart_iommu_domain_alloc, .probe_device = gart_iommu_probe_device, .device_group = generic_device_group, - .set_platform_dma_ops = gart_iommu_set_platform_dma, .pgsize_bitmap = GART_IOMMU_PGSIZES, .of_xlate = gart_iommu_of_xlate, .default_domain_ops = &(const struct iommu_domain_ops) {
tegra-gart seems to be kind of wonky since from the start its 'detach_dev' op doesn't actually touch hardware. It is supposed to empty the GART of all translations loaded into it. Call this weirdness PLATFORM which keeps the basic original ops->detach_dev() semantic alive without needing much special core code support. I'm guessing it really ends up in a BLOCKING configuration, but without any forced cleanup it is unsafe. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> --- drivers/iommu/tegra-gart.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-)