Message ID | 20211008173146.645127-1-zackr@vmware.com (mailing list archive)
---|---
Series | drm/vmwgfx: Support module unload and hotunplug
On Fri, 2021-10-08 at 13:31 -0400, Zack Rusin wrote: > This is a largely trivial set that makes vmwgfx support module reload > and PCI hot-unplug. It also makes IGT's core_hotunplug pass instead > of kernel oops'ing. > > The one "ugly" change is the "Introduce a new placement for MOB page > tables". It seems vmwgfx has been violating a TTM assumption that > TTM_PL_SYSTEM buffers are never fenced for a while. Apart from a > kernel > oops on module unload it didn't seem to wreak too much havoc, but we > shouldn't be abusing TTM. So to solve it we're introducing a new > placement, which is basically system, but can deal with fenced bo's. > > Cc: Christian König <christian.koenig@amd.com> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Hi, Zack, What part of TTM doesn't allow fenced system memory currently? It was certainly designed to allow that and vmwgfx has been relying on that since the introduction of MOBs IIRC. Also i915 is currently relying on that. /Thomas
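For readers less familiar with TTM placements, a rough sketch of what the cover letter's "basically system, but can deal with fenced bo's" placement could look like is below. The VMW_PL_SYSTEM name and the private-type index are illustrative assumptions, not necessarily what the series itself uses; TTM_PL_PRIV, struct ttm_place and struct ttm_placement are the real 5.15-era TTM interface.

```c
/*
 * Illustrative sketch only: a driver-private memory type that behaves
 * like TTM_PL_SYSTEM but whose resource manager the driver controls,
 * so it can stay alive until all fenced BOs are gone.
 * VMW_PL_SYSTEM and its index are assumptions, not the real vmwgfx code.
 */
#include <drm/ttm/ttm_placement.h>

#define VMW_PL_SYSTEM	(TTM_PL_PRIV + 2)

static const struct ttm_place vmw_sys_place = {
	.fpfn = 0,
	.lpfn = 0,
	.mem_type = VMW_PL_SYSTEM,
	.flags = 0,
};

static const struct ttm_placement vmw_sys_placement = {
	.num_placement = 1,
	.placement = &vmw_sys_place,
	.num_busy_placement = 1,
	.busy_placement = &vmw_sys_place,
};

/*
 * The driver would also register a ttm_resource_manager for
 * VMW_PL_SYSTEM (e.g. via ttm_set_driver_manager()) and validate MOB
 * page tables into vmw_sys_placement instead of the core system
 * placement.
 */
```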
On Fri, 2021-10-08 at 22:28 +0200, Thomas Hellström wrote: > On Fri, 2021-10-08 at 13:31 -0400, Zack Rusin wrote: > > This is a largely trivial set that makes vmwgfx support module reload > > and PCI hot-unplug. It also makes IGT's core_hotunplug pass instead > > of kernel oops'ing. > > > > The one "ugly" change is the "Introduce a new placement for MOB page > > tables". It seems vmwgfx has been violating a TTM assumption that > > TTM_PL_SYSTEM buffers are never fenced for a while. Apart from a > > kernel > > oops on module unload it didn't seem to wreak too much havoc, but we > > shouldn't be abusing TTM. So to solve it we're introducing a new > > placement, which is basically system, but can deal with fenced bo's. > > > > Cc: Christian König <christian.koenig@amd.com> > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> > > Hi, Zack, > > What part of TTM doesn't allow fenced system memory currently? It was > certainly designed to allow that and vmwgfx has been relying on that > since the introduction of MOBs IIRC. Also i915 is currently relying on > that. It's the shutdown. BOs allocated through the ttm system manager might be busy during ttm_bo_put, which results in them being scheduled for a delayed deletion. The ttm system manager is disabled before the final delayed deletion is run in ttm_device_fini. This results in crashes during freeing of the BO resources because they're trying to remove themselves from a no longer existent ttm_resource_manager (e.g. in IGT's core_hotunplug on vmwgfx). During review of the trivial patch that was fixing it in ttm Christian said that system domain buffers must be idle or otherwise a number of assumptions in ttm break: https://lists.freedesktop.org/archives/dri-devel/2021-September/324027.html And later clarified that in fact system domain buffers being fenced is illegal from a design point of view: https://lists.freedesktop.org/archives/dri-devel/2021-September/324697.html z
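To make the ordering problem concrete, here is a minimal sketch of a driver unload path that would avoid it, assuming the 5.15-era TTM API; drv_wait_all_gpu_work() is a hypothetical placeholder for whatever idles the device, and priv_type stands for whichever driver-private placement the fenced BOs live in.

```c
#include <drm/ttm/ttm_device.h>
#include <drm/ttm/ttm_resource.h>

static void drv_wait_all_gpu_work(void);	/* hypothetical: idles the GPU */

static void drv_unload_sketch(struct ttm_device *bdev, u32 priv_type)
{
	struct ttm_resource_manager *man = ttm_manager_type(bdev, priv_type);

	/* No BO may still carry a fence past this point. */
	drv_wait_all_gpu_work();

	/* Drain the driver-private manager while it still exists. */
	ttm_resource_manager_set_used(man, false);
	ttm_resource_manager_evict_all(bdev, man);

	/*
	 * Only now tear the ttm_device down: a BO still sitting on the
	 * delayed-delete list must already be idle, otherwise its resource
	 * will try to detach from a manager that no longer exists, which is
	 * the crash core_hotunplug was hitting.
	 */
	ttm_device_fini(bdev);
}
```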
On Fri, 2021-10-08 at 20:40 +0000, Zack Rusin wrote: > On Fri, 2021-10-08 at 22:28 +0200, Thomas Hellström wrote: > > On Fri, 2021-10-08 at 13:31 -0400, Zack Rusin wrote: > > > This is a largely trivial set that makes vmwgfx support module > > > reload > > > and PCI hot-unplug. It also makes IGT's core_hotunplug pass > > > instead > > > of kernel oops'ing. > > > > > > The one "ugly" change is the "Introduce a new placement for MOB > > > page > > > tables". It seems vmwgfx has been violating a TTM assumption that > > > TTM_PL_SYSTEM buffers are never fenced for a while. Apart from a > > > kernel > > > oops on module unload it didn't seem to wreak too much havoc, but > > > we > > > shouldn't be abusing TTM. So to solve it we're introducing a new > > > placement, which is basically system, but can deal with fenced > > > bo's. > > > > > > Cc: Christian König <christian.koenig@amd.com> > > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> > > > > Hi, Zack, > > > > What part of TTM doesn't allow fenced system memory currently? It > > was > > certainly designed to allow that and vmwgfx has been relying on > > that > > since the introduction of MOBs IIRC. Also i915 is currently relying > > on > > that. > > It's the shutdown. BO's allocated through the ttm system manager > might > be busy during ttm_bo_put which results in them being scheduled for a > delayed deletion. The ttm system manager is disabled before the final > delayed deletion is ran in ttm_device_fini. This results in crashes > during freeing of the BO resources because they're trying to remove > themselves from a no longer existent ttm_resource_manager (e.g. in > IGT's core_hotunplug on vmwgfx) > > During review of the trivial patch that was fixing it in ttm > Christian > said that system domain buffers must be idle or otherwise a number of > assumptions in ttm breaks: > https://lists.freedesktop.org/archives/dri-devel/2021-September/324027.html > And later clarified that in fact system domain buffers being fenced > is > illegal from a design point of view: > https://lists.freedesktop.org/archives/dri-devel/2021-September/324697.html Hmm, this looks very odd, because I remember reminding Christian as late as this spring that both vmwgfx and i915 sets up GPU bindings to system buffers, as part of the review of a patch series pushing a couple of changes to the swapout path that apparently had missed this. This more sounds like there have been changes to TTM happening not taking into account or knowing that TTM was designed for system buffers bound to GPU and that there were drivers actually doing that. And there is still older code around clearly implying system buffers can be fenced, like ttm_bo_swapout(), and that there is dma fences signaling completion on work that has never touched the GPU, not to mention async eviction where a bo may be evicted to system but has tons of outstanding fenced work in the pipe. So if there has been a design change WRT this I believe it should have been brought up on dri-devel to have it properly discussed so affected drivers could agree on the different options. Perhaps Christian can enlighten us here. Christian? /Thomas > > z
On 08.10.21 at 23:13, Thomas Hellström wrote: > On Fri, 2021-10-08 at 20:40 +0000, Zack Rusin wrote: >> On Fri, 2021-10-08 at 22:28 +0200, Thomas Hellström wrote: >>> On Fri, 2021-10-08 at 13:31 -0400, Zack Rusin wrote: >>>> This is a largely trivial set that makes vmwgfx support module >>>> reload >>>> and PCI hot-unplug. It also makes IGT's core_hotunplug pass >>>> instead >>>> of kernel oops'ing. >>>> >>>> The one "ugly" change is the "Introduce a new placement for MOB >>>> page >>>> tables". It seems vmwgfx has been violating a TTM assumption that >>>> TTM_PL_SYSTEM buffers are never fenced for a while. Apart from a >>>> kernel >>>> oops on module unload it didn't seem to wreak too much havoc, but >>>> we >>>> shouldn't be abusing TTM. So to solve it we're introducing a new >>>> placement, which is basically system, but can deal with fenced >>>> bo's. >>>> >>>> Cc: Christian König <christian.koenig@amd.com> >>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> >>> Hi, Zack, >>> >>> What part of TTM doesn't allow fenced system memory currently? It >>> was >>> certainly designed to allow that and vmwgfx has been relying on >>> that >>> since the introduction of MOBs IIRC. Also i915 is currently relying >>> on >>> that. >> It's the shutdown. BO's allocated through the ttm system manager >> might >> be busy during ttm_bo_put which results in them being scheduled for a >> delayed deletion. The ttm system manager is disabled before the final >> delayed deletion is ran in ttm_device_fini. This results in crashes >> during freeing of the BO resources because they're trying to remove >> themselves from a no longer existent ttm_resource_manager (e.g. in >> IGT's core_hotunplug on vmwgfx) >> >> During review of the trivial patch that was fixing it in ttm >> Christian >> said that system domain buffers must be idle or otherwise a number of >> assumptions in ttm breaks: >> https://lists.freedesktop.org/archives/dri-devel/2021-September/324027.html >> And later clarified that in fact system domain buffers being fenced >> is >> illegal from a design point of view: >> https://lists.freedesktop.org/archives/dri-devel/2021-September/324697.html > Hmm, this looks very odd, because I remember reminding Christian as > late as this spring that both vmwgfx and i915 sets up GPU bindings to > system buffers, as part of the review of a patch series pushing a > couple of changes to the swapout path that apparently had missed this. Well that was the trigger to look more deeply into this and as far as I can tell TTM was never capable of doing this correctly. > This more sounds like there have been changes to TTM happening not > taking into account or knowing that TTM was designed for system buffers > bound to GPU and that there were drivers actually doing that.
> > And there is still older code around clearly implying system buffers > can be fenced, like ttm_bo_swapout(), and that there is dma fences > signaling completion on work that has never touched the GPU, not to > mention async eviction where a bo may be evicted to system but has tons > of outstanding fenced work in the pipe. > > So if there has been a design change WRT this I believe it should have > been brought up on dri-devel to have it properly discussed so affected > drivers could agree on the different options. > > Perhaps Christian can enlighten us here. Christian? There are multiple occasions where we assume that BOs in the system domain are not accessible by the GPU, swapout and teardown are just two examples. When Dave reorganized the buffer move code we also had to insert waits for moves to complete for anything which goes into the SYSTEM domain. Apart from that it certainly makes sense to have that restriction. Memory which is accessed by the hardware and not directly evictable must be accounted differently. Regards, Christian. > > /Thomas > > >> z >
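A sketch of the rule Christian describes for the post-rework move path follows. The callback signature and the ttm_bo_move_*_cleanup() helpers are the 5.15-era TTM interface as best remembered; drv_copy_buffer() is a hypothetical hardware-copy helper, so treat this as an outline rather than working driver code.

```c
#include <drm/ttm/ttm_bo_driver.h>
#include <drm/ttm/ttm_placement.h>
#include <linux/dma-fence.h>
#include <linux/err.h>

/* Hypothetical: kicks off a HW copy and returns its completion fence. */
static struct dma_fence *drv_copy_buffer(struct ttm_buffer_object *bo,
					 struct ttm_resource *new_mem);

static int drv_bo_move(struct ttm_buffer_object *bo, bool evict,
		       struct ttm_operation_ctx *ctx,
		       struct ttm_resource *new_mem,
		       struct ttm_place *hop)
{
	struct dma_fence *fence;
	int ret;

	fence = drv_copy_buffer(bo, new_mem);
	if (IS_ERR(fence))
		return PTR_ERR(fence);

	if (new_mem->mem_type == TTM_PL_SYSTEM) {
		/*
		 * The destination is plain system memory: TTM assumes such
		 * BOs are idle, so the move must not be pipelined and we
		 * wait for the copy before handing the BO back.
		 */
		ret = dma_fence_wait(fence, ctx->interruptible);
		dma_fence_put(fence);
		if (ret)
			return ret;
		ttm_bo_move_sync_cleanup(bo, new_mem);
		return 0;
	}

	/* Any other destination may use a pipelined (fenced) move. */
	ret = ttm_bo_move_accel_cleanup(bo, fence, evict, true, new_mem);
	dma_fence_put(fence);
	return ret;
}
```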
On Mon, 2021-10-11 at 10:17 +0200, Christian König wrote: > Am 08.10.21 um 23:13 schrieb Thomas Hellström: > > On Fri, 2021-10-08 at 20:40 +0000, Zack Rusin wrote: > > > On Fri, 2021-10-08 at 22:28 +0200, Thomas Hellström wrote: > > > > On Fri, 2021-10-08 at 13:31 -0400, Zack Rusin wrote: > > > > > This is a largely trivial set that makes vmwgfx support > > > > > module > > > > > reload > > > > > and PCI hot-unplug. It also makes IGT's core_hotunplug pass > > > > > instead > > > > > of kernel oops'ing. > > > > > > > > > > The one "ugly" change is the "Introduce a new placement for > > > > > MOB > > > > > page > > > > > tables". It seems vmwgfx has been violating a TTM assumption > > > > > that > > > > > TTM_PL_SYSTEM buffers are never fenced for a while. Apart > > > > > from a > > > > > kernel > > > > > oops on module unload it didn't seem to wreak too much havoc, > > > > > but > > > > > we > > > > > shouldn't be abusing TTM. So to solve it we're introducing a > > > > > new > > > > > placement, which is basically system, but can deal with > > > > > fenced > > > > > bo's. > > > > > > > > > > Cc: Christian König <christian.koenig@amd.com> > > > > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> > > > > Hi, Zack, > > > > > > > > What part of TTM doesn't allow fenced system memory currently? > > > > It > > > > was > > > > certainly designed to allow that and vmwgfx has been relying on > > > > that > > > > since the introduction of MOBs IIRC. Also i915 is currently > > > > relying > > > > on > > > > that. > > > It's the shutdown. BO's allocated through the ttm system manager > > > might > > > be busy during ttm_bo_put which results in them being scheduled > > > for a > > > delayed deletion. The ttm system manager is disabled before the > > > final > > > delayed deletion is ran in ttm_device_fini. This results in > > > crashes > > > during freeing of the BO resources because they're trying to > > > remove > > > themselves from a no longer existent ttm_resource_manager (e.g. 
> > > in > > > IGT's core_hotunplug on vmwgfx) > > > > > > During review of the trivial patch that was fixing it in ttm > > > Christian > > > said that system domain buffers must be idle or otherwise a > > > number of > > > assumptions in ttm breaks: > > > > > > > > > https://lists.freedesktop.org/archives/dri-devel/2021-September/324027.html > > > And later clarified that in fact system domain buffers being > > > fenced > > > is > > > illegal from a design point of view: > > > > > > > > > https://lists.freedesktop.org/archives/dri-devel/2021-September/324697.html > > Hmm, this looks very odd, because I remember reminding Christian as > > late as this spring that both vmwgfx and i915 sets up GPU bindings > > to > > system buffers, as part of the review of a patch series pushing a > > couple of changes to the swapout path that apparently had missed > > this. > > Well that was the trigger to look more deeply into this and as far as > I > can tell TTM was never capable of doing this correctly. So apart from the teardown, which appears to be an oversight when the system manager was introduced, where do we fail currently with this? > > > This more sounds like there have been changes to TTM happening not > > taking into account or knowing that TTM was designed for system > > buffers > > bound to GPU and that there were drivers actually doing that. > > > > And there is still older code around clearly implying system > > buffers > > can be fenced, like ttm_bo_swapout(), and that there is dma fences > > signaling completion on work that has never touched the GPU, not to > > mention async eviction where a bo may be evicted to system but has > > tons > > of outstanding fenced work in the pipe. > > > > So if there has been a design change WRT this I believe it should > > have > > been brought up on dri-devel to have it properly discussed so > > affected > > drivers could agree on the different options. > > > > Perhaps Christian can enlighten us here. Christian? > > There are multiple occasions where we assume that BOs in the system > domain are not accessible by the GPU, swapout and teardown are just > two > examples. > At swapout we *do* wait for idle after moving to system; it's relying on the swap_notifier to unbind. That's why the swap_notifier is there, so swapout is working perfectly fine. > When Dave reorganized the buffer move code we also had to insert > waits > for moves to complete for anything which goes into the SYSTEM domain. > > Apart from that it certainly makes sense to have that restriction. > Memory which is accessed by the hardware and not directly evictable > must > be accounted differently. Could you elaborate a bit on this? From a swapout point of view, it looks to me like SYSTEM is treated just like TT by TTM?
Or is the accounting you mention something amdgpu-specific and more related to the amd GEM domains than to the TTM memory types? Note that TTM was never designed to deal with GPU binding, but to provide a set of placements or memory-types where the memory can be mapped by the CPU and bound by the GPU. TT was a special case solely because of the mappable apertures. A bind mechanism had to be provided for TTM to be able to map TT buffers, and most drivers used that bound mechanism for convenience. So now if this is going to be changed, I think we need to understand why and think this through really thoroughly: * What is not working and why (the teardown seems to be a trivial fix). * How did we end up here, * What's the cost of fixing that up compared to refactoring the drivers that rely on bindable system memory, * What's the justification of a system type at all if it's not GPU- bindable, meaning it's basically equivalent to swapped-out shmem with the exception that it's mappable? It's probably a non-trivial effort to refactor i915 to not use system for gpu-binding and in that case I think we need some solid justification why we need to do that rather than fix up what's not working with TTM + bindable system: So could you please elaborate (assuming that the teardown is fixable) on the other parts that don't work. Thanks /Thomas > > Regards, > Christian. > > > > > /Thomas > > > > > > > z > > >
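To illustrate the "mapped by the CPU" half Thomas is describing, a small sketch of CPU access through TTM's kmap interface, which works the same whether the BO currently sits in SYSTEM, TT or an aperture. The helper name cpu_touch_first_page() is made up; ttm_bo_kmap()/ttm_bo_kunmap() are the real 5.15-era API.

```c
#include <drm/ttm/ttm_bo_api.h>
#include <linux/io.h>
#include <linux/mm.h>
#include <linux/string.h>

static int cpu_touch_first_page(struct ttm_buffer_object *bo)
{
	struct ttm_bo_kmap_obj map;
	bool is_iomem;
	void *vaddr;
	int ret;

	/* Map a single page of the BO for CPU access. */
	ret = ttm_bo_kmap(bo, 0, 1, &map);
	if (ret)
		return ret;

	vaddr = ttm_kmap_obj_virtual(&map, &is_iomem);
	if (is_iomem)
		memset_io((void __iomem *)vaddr, 0, PAGE_SIZE);
	else
		memset(vaddr, 0, PAGE_SIZE);

	ttm_bo_kunmap(&map);
	return 0;
}
```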
Am 11.10.21 um 14:04 schrieb Thomas Hellström: > [SNIP] >>> Hmm, this looks very odd, because I remember reminding Christian as >>> late as this spring that both vmwgfx and i915 sets up GPU bindings >>> to >>> system buffers, as part of the review of a patch series pushing a >>> couple of changes to the swapout path that apparently had missed >>> this. >> Well that was the trigger to look more deeply into this and as far as >> I >> can tell TTM was never capable of doing this correctly. > So apart from the teardown, which appear to be an oversight when the > system manager was introduced where do whe fail currently with this? During validation for example. Moving BOs into the system domain means that they are potentially swapped out. In other words when drivers are accessing BOs in the system domain they always need to take care of TTM_TT_FLAG_SWAPPED manually. >>> This more sounds like there have been changes to TTM happening not >>> taking into account or knowing that TTM was designed for system >>> buffers >>> bound to GPU and that there were drivers actually doing that. >>> >>> And there is still older code around clearly implying system >>> buffers >>> can be fenced, like ttm_bo_swapout(), and that there is dma fences >>> signaling completion on work that has never touched the GPU, not to >>> mention async eviction where a bo may be evicted to system but has >>> tons >>> of outstanding fenced work in the pipe. >>> >>> So if there has been a design change WRT this I believe it should >>> have >>> been brought up on dri-devel to have it properly discussed so >>> affected >>> drivers could agree on the different options. >>> >>> Perhaps Christian can enlighten us here. Christian? >> There are multiple occasions where we assume that BOs in the system >> domain are not accessible by the GPU, swapout and teardown are just >> two >> examples. >> > At swapout we *do* wait for idle after moving to system, It's relying > on the swap_notifier to unbind. That's why the swap_notifier is there, > so swapout is working perfectly fine. You can of course define that BOs are not swapable or call ttm_bo_validate() with a system domain placement and then make sure they are swapped in manually, but that is extremely hacky and bad design. As far as I know that's what i915 does, but that doesn't mean that this is a good idea. Additional to that I've already noted that I think this swap_notify callback is not a good idea either. We should rather have a separate TTM_PL_SWAPPED domain for this so that drivers are cleanly informed about the state change. >> When Dave reorganized the buffer move code we also had to insert >> waits >> for moves to complete for anything which goes into the SYSTEM domain. >> >> Apart from that it certainly makes sense to have that restriction. >> Memory which is accessed by the hardware and not directly evictable >> must >> be accounted differently. > Could you elaborate a bit on this? From a swapout point of view, it > looks to me like SYSTEM is treated just like TT by TTM? Or is the > accounting you mention something amdgpu-specific and more related to > the amd GEM domains than to the TTM memory types? No, that is something the Android people came up with to improve the shrinker behavior. > Note that TTM was never designed to deal with GPU binding, but to > provide a set of placements or memory-types where the memory can be > mapped by the CPU and bound by the GPU. TT was a special case solely > because of the mappable apertures. 
A bind mechanism had to be provided > for TTM to be able to map TT buffers, and most drivers used that bound > mechanism for convenience. Well that's certainly not correct. Before Dave moved this back into the drivers TTM had bind/unbind callbacks for the translation tables. It's just that vmwgfx was an exception and all other drivers where using that correctly. This vmwgfx exception is now what Zack is trying to fix here. > So now if this is going to be changed, I think we need to understand > why and think this through really thoroughly: > > * What is not working and why (the teardown seems to be a trivial fix). > * How did we end up here, > * What's the cost of fixing that up compared to refactoring the drivers > that rely on bindable system memory, > * What's the justification of a system type at all if it's not GPU- > bindable, meaning it's basically equivalent to swapped-out shmem with > the exception that it's mappable? Well, once more that isn't correct. This is nothing new and as far as I know that behavior existing as long as TTM existed. It's just that vmwgfx was a little bit special and broken in some aspects. > It's probably a non-trivial effort to refactor i915 to not use system > for gpu-binding and in that case I think we need some solid > justification why we need to do that rather than fix up what's not > working with TTM + bindable system: > > So could you please elaborate (assuming that the teardown is fixable) > on the other parts that don't work. Well you can of course design i915 however you like, you just need to be aware that when it breaks you need to keep the pieces. Regards, Christian. > > Thanks > > /Thomas > > >> Regards, >> Christian. >> >>> /Thomas >>> >>> >>>> z >
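A sketch of the "manual" handling Christian refers to: before letting the GPU touch a BO that was validated into the system domain, the driver has to make sure its ttm_tt is populated and not swapped out. The flag spelling differs by kernel version (TTM_PAGE_FLAG_SWAPPED on 5.15-era kernels, TTM_TT_FLAG_SWAPPED later); the helper below is an illustrative assumption, not existing driver code.

```c
#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_tt.h>
#include <linux/dma-resv.h>

static int drv_make_system_bo_resident(struct ttm_buffer_object *bo,
					struct ttm_operation_ctx *ctx)
{
	struct ttm_tt *tt = bo->ttm;

	dma_resv_assert_held(bo->base.resv);

	if (!tt)
		return -EINVAL;

	/* Swapped-out or never-populated TTs have no usable backing pages. */
	if (!ttm_tt_is_populated(tt) || (tt->page_flags & TTM_TT_FLAG_SWAPPED))
		return ttm_tt_populate(bo->bdev, tt, ctx);

	return 0;
}
```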
On Tue, 2021-10-12 at 10:27 +0200, Christian König wrote: > Am 11.10.21 um 14:04 schrieb Thomas Hellström: > > [SNIP] > > > > Hmm, this looks very odd, because I remember reminding > > > > Christian as > > > > late as this spring that both vmwgfx and i915 sets up GPU > > > > bindings > > > > to > > > > system buffers, as part of the review of a patch series pushing > > > > a > > > > couple of changes to the swapout path that apparently had > > > > missed > > > > this. > > > Well that was the trigger to look more deeply into this and as > > > far as > > > I > > > can tell TTM was never capable of doing this correctly. > > So apart from the teardown, which appear to be an oversight when > > the > > system manager was introduced where do whe fail currently with > > this? > > During validation for example. Moving BOs into the system domain > means > that they are potentially swapped out. > > In other words when drivers are accessing BOs in the system domain > they > always need to take care of TTM_TT_FLAG_SWAPPED manually. Yes, that's true. Initially TTMs were populated on a page basis. All users of a particular page was required to validate that it was present. This was later changed to per-tt population by Jerome I think and somewhere along the line that requirement on the user to validate appears to have gotten lost as well. > > > > > This more sounds like there have been changes to TTM happening > > > > not > > > > taking into account or knowing that TTM was designed for system > > > > buffers > > > > bound to GPU and that there were drivers actually doing that. > > > > > > > > And there is still older code around clearly implying system > > > > buffers > > > > can be fenced, like ttm_bo_swapout(), and that there is dma > > > > fences > > > > signaling completion on work that has never touched the GPU, > > > > not to > > > > mention async eviction where a bo may be evicted to system but > > > > has > > > > tons > > > > of outstanding fenced work in the pipe. > > > > > > > > So if there has been a design change WRT this I believe it > > > > should > > > > have > > > > been brought up on dri-devel to have it properly discussed so > > > > affected > > > > drivers could agree on the different options. > > > > > > > > Perhaps Christian can enlighten us here. Christian? > > > There are multiple occasions where we assume that BOs in the > > > system > > > domain are not accessible by the GPU, swapout and teardown are > > > just > > > two > > > examples. > > > > > At swapout we *do* wait for idle after moving to system, It's > > relying > > on the swap_notifier to unbind. That's why the swap_notifier is > > there, > > so swapout is working perfectly fine. > > You can of course define that BOs are not swapable or call > ttm_bo_validate() with a system domain placement and then make sure > they > are swapped in manually, but that is extremely hacky and bad design. > > As far as I know that's what i915 does, but that doesn't mean that > this > is a good idea. It might be that it was not a good idea, but this was the initial design for TTM. > > Additional to that I've already noted that I think this swap_notify > callback is not a good idea either. We should rather have a separate > TTM_PL_SWAPPED domain for this so that drivers are cleanly informed > about the state change. > > > > When Dave reorganized the buffer move code we also had to insert > > > waits > > > for moves to complete for anything which goes into the SYSTEM > > > domain. 
> > > > > > Apart from that it certainly makes sense to have that > > > restriction. > > > Memory which is accessed by the hardware and not directly > > > evictable > > > must > > > be accounted differently. > > Could you elaborate a bit on this? From a swapout point of view, it > > looks to me like SYSTEM is treated just like TT by TTM? Or is the > > accounting you mention something amdgpu-specific and more related > > to > > the amd GEM domains than to the TTM memory types? > > No, that is something the Android people came up with to improve the > shrinker behavior. > > > Note that TTM was never designed to deal with GPU binding, but to > > provide a set of placements or memory-types where the memory can be > > mapped by the CPU and bound by the GPU. TT was a special case > > solely > > because of the mappable apertures. A bind mechanism had to be > > provided > > for TTM to be able to map TT buffers, and most drivers used that > > bound > > mechanism for convenience. > > Well that's certainly not correct. Before Dave moved this back into > the > drivers TTM had bind/unbind callbacks for the translation tables. Yes it had, and as discussed when Dave removed them, these were solely intended to be used from TTM's point of view to keep track of when data was in mappable apertures, so that TTM could point CPU mappings correctly. Now most later drivers found that convenient and used those also to bind other memory types, like vmwgfx does for GMR and MOB memory for example, but that never meant it was a required use-pattern. Don't have time to dig commit messages up but IIRC this was mentioned a number of times during the years. > > It's just that vmwgfx was an exception and all other drivers where > using > that correctly. This vmwgfx exception is now what Zack is trying to > fix > here. While vmwgfx is binding page-tables to system it also used the bind / unbind mechanisms for other memory types for convenience. Other drivers most probably used copy-paste and weren't aware of the feature. > > > So now if this is going to be changed, I think we need to > > understand > > why and think this through really thoroughly: > > > > * What is not working and why (the teardown seems to be a trivial > > fix). > > * How did we end up here, > > * What's the cost of fixing that up compared to refactoring the > > drivers > > that rely on bindable system memory, > > * What's the justification of a system type at all if it's not GPU- > > bindable, meaning it's basically equivalent to swapped-out shmem > > with > > the exception that it's mappable? > > Well, once more that isn't correct. This is nothing new and as far as > I > know that behavior existing as long as TTM existed. I'm not sure what's incorrect? I'm trying to explain what the initial design was, and it may of course have been bad and the one you propose may be a better one, and if required we certainly need to fix i915 to align with a new one. What worries me though, is that if you perceive the design differently and change things in TTM according to that perception in a way that breaks drivers that rely on the initial design, and then force drivers to change claiming they are incorrect without a thorough discussion on dri-devel, that's IMHO not good. So I guess we don't have much choice other than to refactor i915 to align to not gpu-bind system, but could we meanwhile at least fix that takedown ordering while that's being worked on? /Thomas
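For reference, the swap_notifier Thomas mentions is the swap_notify hook in ttm_device_funcs: TTM calls it with the BO reserved, and (per the discussion above) already idled, right before swapping a system BO out, so the driver can drop its GPU binding at that point. A hedged sketch, with drv_unbind_gpu_binding() as a hypothetical stand-in for the driver's real unbind path:

```c
#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_device.h>

static void drv_unbind_gpu_binding(struct ttm_buffer_object *bo)
{
	/*
	 * Hypothetical: tear down whatever GPU binding (GMR/MOB/GTT) the
	 * driver set up for this BO, so the swapped-out pages are no
	 * longer referenced by the hardware.
	 */
}

static void drv_swap_notify(struct ttm_buffer_object *bo)
{
	/* The BO is reserved here, so binding state cannot change under us. */
	drv_unbind_gpu_binding(bo);
}

static struct ttm_device_funcs drv_ttm_funcs = {
	/* .ttm_tt_create, .move, .evict_flags, ... */
	.swap_notify = drv_swap_notify,
};
```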
On Tue, 2021-10-12 at 11:10 +0200, Thomas Hellström wrote: > On Tue, 2021-10-12 at 10:27 +0200, Christian König wrote: > > Am 11.10.21 um 14:04 schrieb Thomas Hellström: > > > > > > > > > > So now if this is going to be changed, I think we need to > > > understand > > > why and think this through really thoroughly: > > > > > > * What is not working and why (the teardown seems to be a trivial > > > fix). > > > * How did we end up here, > > > * What's the cost of fixing that up compared to refactoring the > > > drivers > > > that rely on bindable system memory, > > > * What's the justification of a system type at all if it's not > > > GPU- > > > bindable, meaning it's basically equivalent to swapped-out shmem > > > with > > > the exception that it's mappable? > > > > Well, once more that isn't correct. This is nothing new and as far > > as > > I > > know that behavior existing as long as TTM existed. > > I'm not sure whats incorrect? I'm trying to explain what the initial > design was, and it may of course have been bad and the one you > propose > a better one and if required we certainly need to fix i915 to align > with a new one. > > What worries me though, that if you perceive the design differently > and > change things in TTM according to that perception that breaks drivers > that rely on the initial design and then force drivers to change > claiming they are incorrect without a thorough discussion on dri- > devel, > that's IMHO not good. We should probably do that in a separate thread so that this, fundamentally important, discussion is easier to find and reference in the future. It looks like we're settling on a decision here so I'd appreciate an Acked-by for the patch 4/5 just so it doesn't look like I was making things up to someone looking at git history in the future. It seems that in general TTM was designed to be able to handle an amazing number of special/corner cases at a cost of complexity which meant that over the years very few people understood it and the code handling those cases sometimes broke. It sounds like Christian is now trying to rein it in and make the code a lot more focused. Working on other OS'es for the last few years certainly made me appreciate simple frameworks that move complexity towards drivers that actually need them, e.g. it's of course anecdotal but I found wddm gpu virtual addressing models (iommu/gpummu) a lot easier to grok. On the flip side that does mean that vmwgfx and i915 need to redo some code. For vmwgfx it's probably a net positive anyway as we've been using TTM for, what is really nowadays, an integrated GPU so maybe it's time for us to think about transitioning to gem. z
On Tue, Oct 12, 2021 at 05:34:50PM +0000, Zack Rusin wrote: > On Tue, 2021-10-12 at 11:10 +0200, Thomas Hellström wrote: > > On Tue, 2021-10-12 at 10:27 +0200, Christian König wrote: > > > Am 11.10.21 um 14:04 schrieb Thomas Hellström: > > > > > > > > > > > > > > So now if this is going to be changed, I think we need to > > > > understand > > > > why and think this through really thoroughly: > > > > > > > > * What is not working and why (the teardown seems to be a trivial > > > > fix). > > > > * How did we end up here, > > > > * What's the cost of fixing that up compared to refactoring the > > > > drivers > > > > that rely on bindable system memory, > > > > * What's the justification of a system type at all if it's not > > > > GPU- > > > > bindable, meaning it's basically equivalent to swapped-out shmem > > > > with > > > > the exception that it's mappable? > > > > > > Well, once more that isn't correct. This is nothing new and as far > > > as > > > I > > > know that behavior existing as long as TTM existed. > > > > I'm not sure whats incorrect? I'm trying to explain what the initial > > design was, and it may of course have been bad and the one you > > propose > > a better one and if required we certainly need to fix i915 to align > > with a new one. > > > > What worries me though, that if you perceive the design differently > > and > > change things in TTM according to that perception that breaks drivers > > that rely on the initial design and then force drivers to change > > claiming they are incorrect without a thorough discussion on dri- > > devel, > > that's IMHO not good. > > We should probably do that in a seperate thread so that this, > fundametally important, discussion is easier to find and reference in > the future. It looks like we're settling on a decision here so I'd > appreciate an Acked-by for the patch 4/5 just so it doesn't look like I > was making things up to someone looking at git history in the future. Jumping in sidesways and late, and also without real context on the decision itself: The way to properly formalize this is - type a kerneldoc patch which writes down the rules we agree on, whether that's uapi, or internal helper api like for ttm, or on types or whatever - get acks from everyone who participated + everyone who might care - merge it > It seems that in general TTM was designed to be able to handle an > amazing number of special/corner cases at a cost of complexity which > meant that over the years very few people understood it and the code > handling those cases sometimes broke. It sounds like Christian is now > trying to reign it in and make the code a lot more focused. > > Working on other OS'es for the last few years, certainly made me > appreciate simple frameworks that move complexity towards drivers that > actually need them, e.g. it's of course anecdotal but I found wddm gpu > virtual addressing models (iommu/gpummu) a lot easier to grok. > > On the flip side that does mean that vmwgfx and i915 need to redo some > code. For vmwgfx it's probably a net positive anyway as we've been > using TTM for, what is really nowadays, an integrated GPU so maybe it's > time for us to think about transition to gem. Aside, but we're looking at adopting ttm for integrated gpu too. The execbuf utils and dynamic memory management helpers for pure gem just aren't quite there yet, and improving ttm a bit in this area looks reasonable (like adding a unified memory aware shrinker like we have in i915-gem). 
Also I thought vmwgfx is using ttm to also manage some id spaces, you'd have to hand-roll that. Anyway entirely orthogonal. -Daniel
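Regarding the id spaces Daniel mentions: vmwgfx implements them as driver-private ttm_resource_managers whose "offset" is really just an id. A loose sketch of that pattern follows, using 5.15-era callback signatures as best remembered; the structure layout and names are assumptions, not the actual vmwgfx GMR-id code.

```c
#include <drm/ttm/ttm_resource.h>
#include <drm/ttm/ttm_placement.h>
#include <linux/idr.h>
#include <linux/slab.h>

struct drv_id_manager {
	struct ttm_resource_manager manager;
	struct ida ids;		/* initialised elsewhere with ida_init() */
	u32 max_ids;
};

static int drv_id_man_alloc(struct ttm_resource_manager *man,
			    struct ttm_buffer_object *bo,
			    const struct ttm_place *place,
			    struct ttm_resource **res)
{
	struct drv_id_manager *iman =
		container_of(man, struct drv_id_manager, manager);
	int id;

	*res = kmalloc(sizeof(**res), GFP_KERNEL);
	if (!*res)
		return -ENOMEM;

	id = ida_alloc_max(&iman->ids, iman->max_ids - 1, GFP_KERNEL);
	if (id < 0) {
		kfree(*res);
		return id;
	}

	ttm_resource_init(bo, place, *res);
	(*res)->start = id;	/* the "offset" is really just the id */
	return 0;
}

static void drv_id_man_free(struct ttm_resource_manager *man,
			    struct ttm_resource *res)
{
	struct drv_id_manager *iman =
		container_of(man, struct drv_id_manager, manager);

	ida_free(&iman->ids, res->start);
	kfree(res);
}

static const struct ttm_resource_manager_func drv_id_man_func = {
	.alloc = drv_id_man_alloc,
	.free = drv_id_man_free,
};
```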
On Wed, 2021-10-13 at 14:50 +0200, Daniel Vetter wrote: > On Tue, Oct 12, 2021 at 05:34:50PM +0000, Zack Rusin wrote: > > > On the flip side that does mean that vmwgfx and i915 need to redo > > some > > code. For vmwgfx it's probably a net positive anyway as we've been > > using TTM for, what is really nowadays, an integrated GPU so maybe > > it's > > time for us to think about transition to gem. > > Aside, but we're looking at adopting ttm for integrated gpu too. The > execbuf utils and dynamic memory management helpers for pure gem just > aren't quite there yet, and improving ttm a bit in this area looks > reasonable (like adding a unified memory aware shrinker like we have > in > i915-gem). > That would certainly be a big help. The situation that I want to avoid is having vmwgfx using TTM for what no other driver is using it for. > Also I thought vmwgfx is using ttm to also manage some id spaces, > you'd > have to hand-roll that. Yes, it's work either way, it's likely less code with GEM but we'd lose support for 3D on older hardware where our device did have dedicated VRAM. Nowadays memory management in our device is rather trivial: every GPU object is just kernel virtual memory. To allow our virtual device to write into that memory we send it an identifier to be able to name the object (we use id's but it could be just the kernel virtual address as integer) plus a page table because, of course, vmx can't read guest kernel's page tables so we need to map the kernel virtual address space to physical addresses so that the host can write into them. So mm in vmwgfx shouldn't require performance enhancing drugs to understand and drug usage while writing vmwgfx code should remain purely recreational ;) z
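Purely as an illustration of the scheme Zack describes (none of these names are the real vmwgfx/SVGA interfaces): the guest hands the host an object id plus a flat table of the physical page frame numbers backing the object, since the host cannot walk the guest kernel's page tables itself.

```c
#include <linux/mm.h>
#include <linux/slab.h>

struct guest_obj_desc {
	u32 object_id;		/* how the host will name the object */
	u32 num_pages;
	u64 *pfn_table;		/* one guest PFN per backing page */
};

static int build_obj_desc(struct guest_obj_desc *desc, u32 id,
			  struct page **pages, u32 num_pages)
{
	u32 i;

	desc->pfn_table = kcalloc(num_pages, sizeof(u64), GFP_KERNEL);
	if (!desc->pfn_table)
		return -ENOMEM;

	for (i = 0; i < num_pages; i++)
		desc->pfn_table[i] = page_to_pfn(pages[i]);

	desc->object_id = id;
	desc->num_pages = num_pages;

	/* The descriptor (or a BO containing it) is then sent to the host. */
	return 0;
}
```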