Message ID | 20230620004217.4700-1-dakr@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | DRM GPUVA Manager & Nouveau VM_BIND UAPI | expand |
Hi Donald, forgot to add your email address to the patch series - sorry about that. This series (v5) contains the Documentation changes you requested. - Danilo On 6/20/23 02:42, Danilo Krummrich wrote: > This patch series provides a new UAPI for the Nouveau driver in order to > support Vulkan features, such as sparse bindings and sparse residency. > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to > keep track of GPU virtual address (VA) mappings in a more generic way. > > The DRM GPUVA manager is indented to help drivers implement userspace-manageable > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it > serves the following purposes in this context. > > 1) Provide infrastructure to track GPU VA allocations and mappings, > making use of the maple_tree. > > 2) Generically connect GPU VA mappings to their backing buffers, in > particular DRM GEM objects. > > 3) Provide a common implementation to perform more complex mapping > operations on the GPU VA space. In particular splitting and merging > of GPU VA mappings, e.g. for intersecting mapping requests or partial > unmap requests. > > The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself > providing the following new interfaces. > > 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl > for UMDs to specify the portion of VA space managed by the kernel and > userspace, respectively. > > 2) Allocate and free a VA space region as well as bind and unbind memory > to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. > > 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. > > Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM > scheduler to queue jobs and support asynchronous processing with DRM syncobjs > as synchronization mechanism. > > By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, > DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. > > The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context > for GEM buffers) by Christian König. Since the patch implementing drm_exec was > not yet merged into drm-next it is part of this series, as well as a small fix > for this patch, which was found while testing this series. > > This patch series is also available at [1]. > > There is a Mesa NVK merge request by Dave Airlie [2] implementing the > corresponding userspace parts for this series. > > The Vulkan CTS test suite passes the sparse binding and sparse residency test > cases for the new UAPI together with Dave's Mesa work. > > There are also some test cases in the igt-gpu-tools project [3] for the new UAPI > and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU > VA manager's logic through Nouveau's new UAPI and should be considered just as > helper for implementation. > > However, I absolutely intend to change those test cases to proper kunit test > cases for the DRM GPUVA manager, once and if we agree on it's usefulness and > design. > > [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / > https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 > [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ > [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind > > Changes in V2: > ============== > Nouveau: > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence > signalling critical sections. Updates to the VA space are split up in three > separate stages, where only the 2. stage executes in a fence signalling > critical section: > > 1. update the VA space, allocate new structures and page tables > 2. (un-)map the requested memory bindings > 3. free structures and page tables > > - Separated generic job scheduler code from specific job implementations. > - Separated the EXEC and VM_BIND implementation of the UAPI. > - Reworked the locking parts of the nvkm/vmm RAW interface, such that > (un-)map operations can be executed in fence signalling critical sections. > > GPUVA Manager: > - made drm_gpuva_regions optional for users of the GPUVA manager > - allow NULL GEMs for drm_gpuva entries > - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region > entries > - provide callbacks for users to allocate custom drm_gpuva_op structures to > allow inheritance > - added user bits to drm_gpuva_flags > - added a prefetch operation type in order to support generating prefetch > operations in the same way other operations generated > - hand the responsibility for mutual exclusion for a GEM's > drm_gpuva list to the user; simplified corresponding (un-)link functions > > Maple Tree: > - I added two maple tree patches to the series, one to support custom tree > walk macros and one to hand the locking responsibility to the user of the > GPUVA manager without pre-defined lockdep checks. > > Changes in V3: > ============== > Nouveau: > - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page > table cleanup) within a workqueue rather than the job_free() callback of > the scheduler itself. A job_free() callback can stall the execution (run() > callback) of the next job in the queue. Since the page table cleanup > requires to take the same locks as need to be taken for page table > allocation, doing it directly in the job_free() callback would still > violate the fence signalling critical path. > - Separated Nouveau fence allocation and emit, such that we do not violate > the fence signalling critical path in EXEC jobs. > - Implement "regions" (for handling sparse mappings through PDEs and dual > page tables) within Nouveau. > - Drop the requirement for every mapping to be contained within a region. > - Add necassary synchronization of VM_BIND job operation sequences in order > to work around limitations in page table handling. This will be addressed > in a future re-work of Nouveau's page table handling. > - Fixed a couple of race conditions found through more testing. Thanks to > Dave for consitently trying to break it. :-) > > GPUVA Manager: > - Implement pre-allocation capabilities for tree modifications within fence > signalling critical sections. > - Implement accessors to to apply tree modification while walking the GPUVA > tree in order to actually support processing of drm_gpuva_ops through > callbacks in fence signalling critical sections rather than through > pre-allocated operation lists. > - Remove merging of GPUVAs; the kernel has limited to none knowlege about > the semantics of mapping sequences. Hence, merging is purely speculative. > It seems that gaining a significant (or at least a measurable) performance > increase through merging is way more likely to happen when userspace is > responsible for merging mappings up to the next larger page size if > possible. > - Since merging was removed, regions pretty much loose their right to exist. > They might still be useful for handling dual page tables or similar > mechanisms, but since Nouveau seems to be the only driver having a need > for this for now, regions were removed from the GPUVA manager. > - Fixed a couple of maple_tree related issues; thanks to Liam for helping me > out. > > Changes in V4: > ============== > Nouveau: > - Refactored how specific VM_BIND and EXEC jobs are created and how their > arguments are passed to the generic job implementation. > - Fixed a UAF race condition where bind job ops could have been freed > already while still waiting for a job cleanup to finish. This is due to > in certain cases we need to wait for mappings actually being unmapped > before creating sparse regions in the same area. > - Re-based the code onto drm_exec v4 patch. > > GPUVA Manager: > - Fixed a maple tree related bug when pre-allocating MA states. > (Boris Brezillion) > - Made struct drm_gpuva_fn_ops a const object in all occurrences. > (Boris Brezillion) > > Changes in V5: > ============== > Nouveau: > - Link and unlink GPUVAs outside the fence signalling critical path in > nouveau_uvmm_bind_job_submit() holding the dma-resv lock. Mutual exclusion > of BO evicts causing mapping invalidation and regular mapping operations > is ensured with dma-fences. > > GPUVA Manager: > - Removed the separate GEMs GPUVA list lock. Link and unlink as well as > iterating the GEM's GPUVA list should be protected with the GEM's dma-resv > lock instead. > - Renamed DRM_GPUVA_EVICTED flag to DRM_GPUVA_INVALIDATED. Mappings do not > get eviced, they might get invalidated due to eviction. > - Maple tree uses the 'unsinged long' type for node entries. While this > works for GPU VA spaces larger than 32-bit on 64-bit kernel, the GPU VA > space is limited to 32-bit on 32-bit kernels as well. > As long as we do not have a 64-bit capable maple tree for 32-bit kernels, > the GPU VA manager contains checks to throw warnings when GPU VA entries > exceed the maple tree's storage capabilities. > - Extended the Documentation and added example code as requested by Donald > Robson. > > Christian König (1): > drm: execution context for GEM buffers v4 > > Danilo Krummrich (13): > maple_tree: split up MA_STATE() macro > drm: manager to keep track of GPUs VA mappings > drm: debugfs: provide infrastructure to dump a DRM GPU VA space > drm/nouveau: new VM_BIND uapi interfaces > drm/nouveau: get vmm via nouveau_cli_vmm() > drm/nouveau: bo: initialize GEM GPU VA interface > drm/nouveau: move usercopy helpers to nouveau_drv.h > drm/nouveau: fence: separate fence alloc and emit > drm/nouveau: fence: fail to emit when fence context is killed > drm/nouveau: chan: provide nouveau_channel_kill() > drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm > drm/nouveau: implement new VM_BIND uAPI > drm/nouveau: debugfs: implement DRM GPU VA debugfs > > Documentation/gpu/driver-uapi.rst | 11 + > Documentation/gpu/drm-mm.rst | 54 + > drivers/gpu/drm/Kconfig | 6 + > drivers/gpu/drm/Makefile | 3 + > drivers/gpu/drm/drm_debugfs.c | 41 + > drivers/gpu/drm/drm_exec.c | 278 +++ > drivers/gpu/drm/drm_gem.c | 3 + > drivers/gpu/drm/drm_gpuva_mgr.c | 1971 ++++++++++++++++ > drivers/gpu/drm/nouveau/Kbuild | 3 + > drivers/gpu/drm/nouveau/Kconfig | 2 + > drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +- > drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +- > drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +- > .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +- > drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 + > drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 + > drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +- > drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +- > drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +- > drivers/gpu/drm/nouveau/nouveau_chan.h | 1 + > drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 + > drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +- > drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +- > drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +- > drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++ > drivers/gpu/drm/nouveau/nouveau_exec.h | 54 + > drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +- > drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +- > drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +- > drivers/gpu/drm/nouveau/nouveau_mem.h | 5 + > drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++ > drivers/gpu/drm/nouveau/nouveau_sched.h | 123 + > drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1979 +++++++++++++++++ > drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 + > drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +- > drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +- > .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +- > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +- > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 + > .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +- > .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +- > .../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +- > include/drm/drm_debugfs.h | 25 + > include/drm/drm_drv.h | 6 + > include/drm/drm_exec.h | 119 + > include/drm/drm_gem.h | 52 + > include/drm/drm_gpuva_mgr.h | 682 ++++++ > include/linux/maple_tree.h | 7 +- > include/uapi/drm/nouveau_drm.h | 209 ++ > 51 files changed, 7566 insertions(+), 242 deletions(-) > create mode 100644 drivers/gpu/drm/drm_exec.c > create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h > create mode 100644 include/drm/drm_exec.h > create mode 100644 include/drm/drm_gpuva_mgr.h > > > base-commit: 2222dcb0775d36de28992f56455ab3967b30d380
Since this is feature is nouveau only currently and doesn't disturb the current nouveau code paths, I'd like to try and get this work in tree so other drivers can work from it. If there are any major objections to this, I'm happy to pull it back out again, but I'd like to get some acks/rb in the next couple of days in order to land some of it. Dave. > > forgot to add your email address to the patch series - sorry about that. > > This series (v5) contains the Documentation changes you requested. > > - Danilo > > On 6/20/23 02:42, Danilo Krummrich wrote: > > This patch series provides a new UAPI for the Nouveau driver in order to > > support Vulkan features, such as sparse bindings and sparse residency. > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > The DRM GPUVA manager is indented to help drivers implement userspace-manageable > > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it > > serves the following purposes in this context. > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > making use of the maple_tree. > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > particular DRM GEM objects. > > > > 3) Provide a common implementation to perform more complex mapping > > operations on the GPU VA space. In particular splitting and merging > > of GPU VA mappings, e.g. for intersecting mapping requests or partial > > unmap requests. > > > > The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself > > providing the following new interfaces. > > > > 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl > > for UMDs to specify the portion of VA space managed by the kernel and > > userspace, respectively. > > > > 2) Allocate and free a VA space region as well as bind and unbind memory > > to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. > > > > 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. > > > > Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM > > scheduler to queue jobs and support asynchronous processing with DRM syncobjs > > as synchronization mechanism. > > > > By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, > > DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. > > > > The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context > > for GEM buffers) by Christian König. Since the patch implementing drm_exec was > > not yet merged into drm-next it is part of this series, as well as a small fix > > for this patch, which was found while testing this series. > > > > This patch series is also available at [1]. > > > > There is a Mesa NVK merge request by Dave Airlie [2] implementing the > > corresponding userspace parts for this series. > > > > The Vulkan CTS test suite passes the sparse binding and sparse residency test > > cases for the new UAPI together with Dave's Mesa work. > > > > There are also some test cases in the igt-gpu-tools project [3] for the new UAPI > > and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU > > VA manager's logic through Nouveau's new UAPI and should be considered just as > > helper for implementation. > > > > However, I absolutely intend to change those test cases to proper kunit test > > cases for the DRM GPUVA manager, once and if we agree on it's usefulness and > > design. > > > > [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / > > https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 > > [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ > > [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind > > > > Changes in V2: > > ============== > > Nouveau: > > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence > > signalling critical sections. Updates to the VA space are split up in three > > separate stages, where only the 2. stage executes in a fence signalling > > critical section: > > > > 1. update the VA space, allocate new structures and page tables > > 2. (un-)map the requested memory bindings > > 3. free structures and page tables > > > > - Separated generic job scheduler code from specific job implementations. > > - Separated the EXEC and VM_BIND implementation of the UAPI. > > - Reworked the locking parts of the nvkm/vmm RAW interface, such that > > (un-)map operations can be executed in fence signalling critical sections. > > > > GPUVA Manager: > > - made drm_gpuva_regions optional for users of the GPUVA manager > > - allow NULL GEMs for drm_gpuva entries > > - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region > > entries > > - provide callbacks for users to allocate custom drm_gpuva_op structures to > > allow inheritance > > - added user bits to drm_gpuva_flags > > - added a prefetch operation type in order to support generating prefetch > > operations in the same way other operations generated > > - hand the responsibility for mutual exclusion for a GEM's > > drm_gpuva list to the user; simplified corresponding (un-)link functions > > > > Maple Tree: > > - I added two maple tree patches to the series, one to support custom tree > > walk macros and one to hand the locking responsibility to the user of the > > GPUVA manager without pre-defined lockdep checks. > > > > Changes in V3: > > ============== > > Nouveau: > > - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page > > table cleanup) within a workqueue rather than the job_free() callback of > > the scheduler itself. A job_free() callback can stall the execution (run() > > callback) of the next job in the queue. Since the page table cleanup > > requires to take the same locks as need to be taken for page table > > allocation, doing it directly in the job_free() callback would still > > violate the fence signalling critical path. > > - Separated Nouveau fence allocation and emit, such that we do not violate > > the fence signalling critical path in EXEC jobs. > > - Implement "regions" (for handling sparse mappings through PDEs and dual > > page tables) within Nouveau. > > - Drop the requirement for every mapping to be contained within a region. > > - Add necassary synchronization of VM_BIND job operation sequences in order > > to work around limitations in page table handling. This will be addressed > > in a future re-work of Nouveau's page table handling. > > - Fixed a couple of race conditions found through more testing. Thanks to > > Dave for consitently trying to break it. :-) > > > > GPUVA Manager: > > - Implement pre-allocation capabilities for tree modifications within fence > > signalling critical sections. > > - Implement accessors to to apply tree modification while walking the GPUVA > > tree in order to actually support processing of drm_gpuva_ops through > > callbacks in fence signalling critical sections rather than through > > pre-allocated operation lists. > > - Remove merging of GPUVAs; the kernel has limited to none knowlege about > > the semantics of mapping sequences. Hence, merging is purely speculative. > > It seems that gaining a significant (or at least a measurable) performance > > increase through merging is way more likely to happen when userspace is > > responsible for merging mappings up to the next larger page size if > > possible. > > - Since merging was removed, regions pretty much loose their right to exist. > > They might still be useful for handling dual page tables or similar > > mechanisms, but since Nouveau seems to be the only driver having a need > > for this for now, regions were removed from the GPUVA manager. > > - Fixed a couple of maple_tree related issues; thanks to Liam for helping me > > out. > > > > Changes in V4: > > ============== > > Nouveau: > > - Refactored how specific VM_BIND and EXEC jobs are created and how their > > arguments are passed to the generic job implementation. > > - Fixed a UAF race condition where bind job ops could have been freed > > already while still waiting for a job cleanup to finish. This is due to > > in certain cases we need to wait for mappings actually being unmapped > > before creating sparse regions in the same area. > > - Re-based the code onto drm_exec v4 patch. > > > > GPUVA Manager: > > - Fixed a maple tree related bug when pre-allocating MA states. > > (Boris Brezillion) > > - Made struct drm_gpuva_fn_ops a const object in all occurrences. > > (Boris Brezillion) > > > > Changes in V5: > > ============== > > Nouveau: > > - Link and unlink GPUVAs outside the fence signalling critical path in > > nouveau_uvmm_bind_job_submit() holding the dma-resv lock. Mutual exclusion > > of BO evicts causing mapping invalidation and regular mapping operations > > is ensured with dma-fences. > > > > GPUVA Manager: > > - Removed the separate GEMs GPUVA list lock. Link and unlink as well as > > iterating the GEM's GPUVA list should be protected with the GEM's dma-resv > > lock instead. > > - Renamed DRM_GPUVA_EVICTED flag to DRM_GPUVA_INVALIDATED. Mappings do not > > get eviced, they might get invalidated due to eviction. > > - Maple tree uses the 'unsinged long' type for node entries. While this > > works for GPU VA spaces larger than 32-bit on 64-bit kernel, the GPU VA > > space is limited to 32-bit on 32-bit kernels as well. > > As long as we do not have a 64-bit capable maple tree for 32-bit kernels, > > the GPU VA manager contains checks to throw warnings when GPU VA entries > > exceed the maple tree's storage capabilities. > > - Extended the Documentation and added example code as requested by Donald > > Robson. > > > > Christian König (1): > > drm: execution context for GEM buffers v4 > > > > Danilo Krummrich (13): > > maple_tree: split up MA_STATE() macro > > drm: manager to keep track of GPUs VA mappings > > drm: debugfs: provide infrastructure to dump a DRM GPU VA space > > drm/nouveau: new VM_BIND uapi interfaces > > drm/nouveau: get vmm via nouveau_cli_vmm() > > drm/nouveau: bo: initialize GEM GPU VA interface > > drm/nouveau: move usercopy helpers to nouveau_drv.h > > drm/nouveau: fence: separate fence alloc and emit > > drm/nouveau: fence: fail to emit when fence context is killed > > drm/nouveau: chan: provide nouveau_channel_kill() > > drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm > > drm/nouveau: implement new VM_BIND uAPI > > drm/nouveau: debugfs: implement DRM GPU VA debugfs > > > > Documentation/gpu/driver-uapi.rst | 11 + > > Documentation/gpu/drm-mm.rst | 54 + > > drivers/gpu/drm/Kconfig | 6 + > > drivers/gpu/drm/Makefile | 3 + > > drivers/gpu/drm/drm_debugfs.c | 41 + > > drivers/gpu/drm/drm_exec.c | 278 +++ > > drivers/gpu/drm/drm_gem.c | 3 + > > drivers/gpu/drm/drm_gpuva_mgr.c | 1971 ++++++++++++++++ > > drivers/gpu/drm/nouveau/Kbuild | 3 + > > drivers/gpu/drm/nouveau/Kconfig | 2 + > > drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +- > > drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +- > > drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +- > > .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +- > > drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 + > > drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 + > > drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +- > > drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +- > > drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +- > > drivers/gpu/drm/nouveau/nouveau_chan.h | 1 + > > drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 + > > drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +- > > drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +- > > drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +- > > drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++ > > drivers/gpu/drm/nouveau/nouveau_exec.h | 54 + > > drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +- > > drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +- > > drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +- > > drivers/gpu/drm/nouveau/nouveau_mem.h | 5 + > > drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +- > > drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++ > > drivers/gpu/drm/nouveau/nouveau_sched.h | 123 + > > drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- > > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1979 +++++++++++++++++ > > drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 + > > drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +- > > drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +- > > .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +- > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +- > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 + > > .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +- > > .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +- > > .../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +- > > include/drm/drm_debugfs.h | 25 + > > include/drm/drm_drv.h | 6 + > > include/drm/drm_exec.h | 119 + > > include/drm/drm_gem.h | 52 + > > include/drm/drm_gpuva_mgr.h | 682 ++++++ > > include/linux/maple_tree.h | 7 +- > > include/uapi/drm/nouveau_drm.h | 209 ++ > > 51 files changed, 7566 insertions(+), 242 deletions(-) > > create mode 100644 drivers/gpu/drm/drm_exec.c > > create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h > > create mode 100644 include/drm/drm_exec.h > > create mode 100644 include/drm/drm_gpuva_mgr.h > > > > > > base-commit: 2222dcb0775d36de28992f56455ab3967b30d380 >
On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie <airlied@gmail.com> wrote: > > Since this is feature is nouveau only currently and doesn't disturb > the current nouveau code paths, I'd like to try and get this work in > tree so other drivers can work from it. > > If there are any major objections to this, I'm happy to pull it back > out again, but I'd like to get some acks/rb in the next couple of days > in order to land some of it. > > Dave. > > > > > > forgot to add your email address to the patch series - sorry about that. > > > > This series (v5) contains the Documentation changes you requested. > > > > - Danilo > > > > On 6/20/23 02:42, Danilo Krummrich wrote: > > > This patch series provides a new UAPI for the Nouveau driver in order to > > > support Vulkan features, such as sparse bindings and sparse residency. > > > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to > > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > > > The DRM GPUVA manager is indented to help drivers implement userspace-manageable > > > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it > > > serves the following purposes in this context. > > > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > > making use of the maple_tree. > > > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > > particular DRM GEM objects. Will this manager be able to connect GPU VA mappings to host memory allocations (aka user pointers) ? I only skimmed over the uapi definitions, but from that quick glance I saw you can only pass a (gem) handle to the vm bind uapi. I think it is an important feature because you don't want to have two GPU VA managers running in your driver (if that's even possible). Maybe we should at least try to make sure the uapi is/will be compatible with such an extension. Oded Oded > > > > > > 3) Provide a common implementation to perform more complex mapping > > > operations on the GPU VA space. In particular splitting and merging > > > of GPU VA mappings, e.g. for intersecting mapping requests or partial > > > unmap requests. > > > > > > The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself > > > providing the following new interfaces. > > > > > > 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl > > > for UMDs to specify the portion of VA space managed by the kernel and > > > userspace, respectively. > > > > > > 2) Allocate and free a VA space region as well as bind and unbind memory > > > to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. > > > > > > 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. > > > > > > Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM > > > scheduler to queue jobs and support asynchronous processing with DRM syncobjs > > > as synchronization mechanism. > > > > > > By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, > > > DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. > > > > > > The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context > > > for GEM buffers) by Christian König. Since the patch implementing drm_exec was > > > not yet merged into drm-next it is part of this series, as well as a small fix > > > for this patch, which was found while testing this series. > > > > > > This patch series is also available at [1]. > > > > > > There is a Mesa NVK merge request by Dave Airlie [2] implementing the > > > corresponding userspace parts for this series. > > > > > > The Vulkan CTS test suite passes the sparse binding and sparse residency test > > > cases for the new UAPI together with Dave's Mesa work. > > > > > > There are also some test cases in the igt-gpu-tools project [3] for the new UAPI > > > and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU > > > VA manager's logic through Nouveau's new UAPI and should be considered just as > > > helper for implementation. > > > > > > However, I absolutely intend to change those test cases to proper kunit test > > > cases for the DRM GPUVA manager, once and if we agree on it's usefulness and > > > design. > > > > > > [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / > > > https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 > > > [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ > > > [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind > > > > > > Changes in V2: > > > ============== > > > Nouveau: > > > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence > > > signalling critical sections. Updates to the VA space are split up in three > > > separate stages, where only the 2. stage executes in a fence signalling > > > critical section: > > > > > > 1. update the VA space, allocate new structures and page tables > > > 2. (un-)map the requested memory bindings > > > 3. free structures and page tables > > > > > > - Separated generic job scheduler code from specific job implementations. > > > - Separated the EXEC and VM_BIND implementation of the UAPI. > > > - Reworked the locking parts of the nvkm/vmm RAW interface, such that > > > (un-)map operations can be executed in fence signalling critical sections. > > > > > > GPUVA Manager: > > > - made drm_gpuva_regions optional for users of the GPUVA manager > > > - allow NULL GEMs for drm_gpuva entries > > > - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region > > > entries > > > - provide callbacks for users to allocate custom drm_gpuva_op structures to > > > allow inheritance > > > - added user bits to drm_gpuva_flags > > > - added a prefetch operation type in order to support generating prefetch > > > operations in the same way other operations generated > > > - hand the responsibility for mutual exclusion for a GEM's > > > drm_gpuva list to the user; simplified corresponding (un-)link functions > > > > > > Maple Tree: > > > - I added two maple tree patches to the series, one to support custom tree > > > walk macros and one to hand the locking responsibility to the user of the > > > GPUVA manager without pre-defined lockdep checks. > > > > > > Changes in V3: > > > ============== > > > Nouveau: > > > - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page > > > table cleanup) within a workqueue rather than the job_free() callback of > > > the scheduler itself. A job_free() callback can stall the execution (run() > > > callback) of the next job in the queue. Since the page table cleanup > > > requires to take the same locks as need to be taken for page table > > > allocation, doing it directly in the job_free() callback would still > > > violate the fence signalling critical path. > > > - Separated Nouveau fence allocation and emit, such that we do not violate > > > the fence signalling critical path in EXEC jobs. > > > - Implement "regions" (for handling sparse mappings through PDEs and dual > > > page tables) within Nouveau. > > > - Drop the requirement for every mapping to be contained within a region. > > > - Add necassary synchronization of VM_BIND job operation sequences in order > > > to work around limitations in page table handling. This will be addressed > > > in a future re-work of Nouveau's page table handling. > > > - Fixed a couple of race conditions found through more testing. Thanks to > > > Dave for consitently trying to break it. :-) > > > > > > GPUVA Manager: > > > - Implement pre-allocation capabilities for tree modifications within fence > > > signalling critical sections. > > > - Implement accessors to to apply tree modification while walking the GPUVA > > > tree in order to actually support processing of drm_gpuva_ops through > > > callbacks in fence signalling critical sections rather than through > > > pre-allocated operation lists. > > > - Remove merging of GPUVAs; the kernel has limited to none knowlege about > > > the semantics of mapping sequences. Hence, merging is purely speculative. > > > It seems that gaining a significant (or at least a measurable) performance > > > increase through merging is way more likely to happen when userspace is > > > responsible for merging mappings up to the next larger page size if > > > possible. > > > - Since merging was removed, regions pretty much loose their right to exist. > > > They might still be useful for handling dual page tables or similar > > > mechanisms, but since Nouveau seems to be the only driver having a need > > > for this for now, regions were removed from the GPUVA manager. > > > - Fixed a couple of maple_tree related issues; thanks to Liam for helping me > > > out. > > > > > > Changes in V4: > > > ============== > > > Nouveau: > > > - Refactored how specific VM_BIND and EXEC jobs are created and how their > > > arguments are passed to the generic job implementation. > > > - Fixed a UAF race condition where bind job ops could have been freed > > > already while still waiting for a job cleanup to finish. This is due to > > > in certain cases we need to wait for mappings actually being unmapped > > > before creating sparse regions in the same area. > > > - Re-based the code onto drm_exec v4 patch. > > > > > > GPUVA Manager: > > > - Fixed a maple tree related bug when pre-allocating MA states. > > > (Boris Brezillion) > > > - Made struct drm_gpuva_fn_ops a const object in all occurrences. > > > (Boris Brezillion) > > > > > > Changes in V5: > > > ============== > > > Nouveau: > > > - Link and unlink GPUVAs outside the fence signalling critical path in > > > nouveau_uvmm_bind_job_submit() holding the dma-resv lock. Mutual exclusion > > > of BO evicts causing mapping invalidation and regular mapping operations > > > is ensured with dma-fences. > > > > > > GPUVA Manager: > > > - Removed the separate GEMs GPUVA list lock. Link and unlink as well as > > > iterating the GEM's GPUVA list should be protected with the GEM's dma-resv > > > lock instead. > > > - Renamed DRM_GPUVA_EVICTED flag to DRM_GPUVA_INVALIDATED. Mappings do not > > > get eviced, they might get invalidated due to eviction. > > > - Maple tree uses the 'unsinged long' type for node entries. While this > > > works for GPU VA spaces larger than 32-bit on 64-bit kernel, the GPU VA > > > space is limited to 32-bit on 32-bit kernels as well. > > > As long as we do not have a 64-bit capable maple tree for 32-bit kernels, > > > the GPU VA manager contains checks to throw warnings when GPU VA entries > > > exceed the maple tree's storage capabilities. > > > - Extended the Documentation and added example code as requested by Donald > > > Robson. > > > > > > Christian König (1): > > > drm: execution context for GEM buffers v4 > > > > > > Danilo Krummrich (13): > > > maple_tree: split up MA_STATE() macro > > > drm: manager to keep track of GPUs VA mappings > > > drm: debugfs: provide infrastructure to dump a DRM GPU VA space > > > drm/nouveau: new VM_BIND uapi interfaces > > > drm/nouveau: get vmm via nouveau_cli_vmm() > > > drm/nouveau: bo: initialize GEM GPU VA interface > > > drm/nouveau: move usercopy helpers to nouveau_drv.h > > > drm/nouveau: fence: separate fence alloc and emit > > > drm/nouveau: fence: fail to emit when fence context is killed > > > drm/nouveau: chan: provide nouveau_channel_kill() > > > drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm > > > drm/nouveau: implement new VM_BIND uAPI > > > drm/nouveau: debugfs: implement DRM GPU VA debugfs > > > > > > Documentation/gpu/driver-uapi.rst | 11 + > > > Documentation/gpu/drm-mm.rst | 54 + > > > drivers/gpu/drm/Kconfig | 6 + > > > drivers/gpu/drm/Makefile | 3 + > > > drivers/gpu/drm/drm_debugfs.c | 41 + > > > drivers/gpu/drm/drm_exec.c | 278 +++ > > > drivers/gpu/drm/drm_gem.c | 3 + > > > drivers/gpu/drm/drm_gpuva_mgr.c | 1971 ++++++++++++++++ > > > drivers/gpu/drm/nouveau/Kbuild | 3 + > > > drivers/gpu/drm/nouveau/Kconfig | 2 + > > > drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +- > > > drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +- > > > drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +- > > > .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +- > > > drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 + > > > drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 + > > > drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +- > > > drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +- > > > drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +- > > > drivers/gpu/drm/nouveau/nouveau_chan.h | 1 + > > > drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 + > > > drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +- > > > drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +- > > > drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +- > > > drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++ > > > drivers/gpu/drm/nouveau/nouveau_exec.h | 54 + > > > drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +- > > > drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +- > > > drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +- > > > drivers/gpu/drm/nouveau/nouveau_mem.h | 5 + > > > drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +- > > > drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++ > > > drivers/gpu/drm/nouveau/nouveau_sched.h | 123 + > > > drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- > > > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1979 +++++++++++++++++ > > > drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 + > > > drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +- > > > drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +- > > > .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +- > > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +- > > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 + > > > .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +- > > > .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +- > > > .../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +- > > > include/drm/drm_debugfs.h | 25 + > > > include/drm/drm_drv.h | 6 + > > > include/drm/drm_exec.h | 119 + > > > include/drm/drm_gem.h | 52 + > > > include/drm/drm_gpuva_mgr.h | 682 ++++++ > > > include/linux/maple_tree.h | 7 +- > > > include/uapi/drm/nouveau_drm.h | 209 ++ > > > 51 files changed, 7566 insertions(+), 242 deletions(-) > > > create mode 100644 drivers/gpu/drm/drm_exec.c > > > create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c > > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c > > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h > > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c > > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h > > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c > > > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h > > > create mode 100644 include/drm/drm_exec.h > > > create mode 100644 include/drm/drm_gpuva_mgr.h > > > > > > > > > base-commit: 2222dcb0775d36de28992f56455ab3967b30d380 > >
On Tue, 20 Jun 2023 at 17:06, Oded Gabbay <ogabbay@kernel.org> wrote: > > On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie <airlied@gmail.com> wrote: > > > > Since this is feature is nouveau only currently and doesn't disturb > > the current nouveau code paths, I'd like to try and get this work in > > tree so other drivers can work from it. > > > > If there are any major objections to this, I'm happy to pull it back > > out again, but I'd like to get some acks/rb in the next couple of days > > in order to land some of it. > > > > Dave. > > > > > > > > > > forgot to add your email address to the patch series - sorry about that. > > > > > > This series (v5) contains the Documentation changes you requested. > > > > > > - Danilo > > > > > > On 6/20/23 02:42, Danilo Krummrich wrote: > > > > This patch series provides a new UAPI for the Nouveau driver in order to > > > > support Vulkan features, such as sparse bindings and sparse residency. > > > > > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to > > > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > > > > > The DRM GPUVA manager is indented to help drivers implement userspace-manageable > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it > > > > serves the following purposes in this context. > > > > > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > > > making use of the maple_tree. > > > > > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > > > particular DRM GEM objects. > Will this manager be able to connect GPU VA mappings to host memory > allocations (aka user pointers) ? > > I only skimmed over the uapi definitions, but from that quick glance I > saw you can only pass a (gem) handle to the vm bind uapi. > > I think it is an important feature because you don't want to have two > GPU VA managers running in your driver (if that's even possible). > Maybe we should at least try to make sure the uapi is/will be > compatible with such an extension. > I think that would have to be a new uAPI entry point anyways, since managing user ptrs is extra, but the uAPI is nouveau specific and nouveau has no hostptr support as of now. The gpuva manager is kernel internal, I think adding host ptr tracking is useful, but I don't think it's a blocker right now. One of the reasons I'd like to get this in the tree is to add things like that instead of overloading this initial patchset with feature creep. Dave.
On Tue, Jun 20, 2023 at 10:13 AM Dave Airlie <airlied@gmail.com> wrote: > > On Tue, 20 Jun 2023 at 17:06, Oded Gabbay <ogabbay@kernel.org> wrote: > > > > On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie <airlied@gmail.com> wrote: > > > > > > Since this is feature is nouveau only currently and doesn't disturb > > > the current nouveau code paths, I'd like to try and get this work in > > > tree so other drivers can work from it. > > > > > > If there are any major objections to this, I'm happy to pull it back > > > out again, but I'd like to get some acks/rb in the next couple of days > > > in order to land some of it. > > > > > > Dave. > > > > > > > > > > > > > > forgot to add your email address to the patch series - sorry about that. > > > > > > > > This series (v5) contains the Documentation changes you requested. > > > > > > > > - Danilo > > > > > > > > On 6/20/23 02:42, Danilo Krummrich wrote: > > > > > This patch series provides a new UAPI for the Nouveau driver in order to > > > > > support Vulkan features, such as sparse bindings and sparse residency. > > > > > > > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to > > > > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > > > > > > > The DRM GPUVA manager is indented to help drivers implement userspace-manageable > > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it > > > > > serves the following purposes in this context. > > > > > > > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > > > > making use of the maple_tree. > > > > > > > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > > > > particular DRM GEM objects. > > Will this manager be able to connect GPU VA mappings to host memory > > allocations (aka user pointers) ? > > > > I only skimmed over the uapi definitions, but from that quick glance I > > saw you can only pass a (gem) handle to the vm bind uapi. > > > > I think it is an important feature because you don't want to have two > > GPU VA managers running in your driver (if that's even possible). > > Maybe we should at least try to make sure the uapi is/will be > > compatible with such an extension. > > > > I think that would have to be a new uAPI entry point anyways, since > managing user ptrs is extra, but the uAPI is nouveau specific and > nouveau has no hostptr support as of now. > > The gpuva manager is kernel internal, I think adding host ptr tracking > is useful, but I don't think it's a blocker right now. > > One of the reasons I'd like to get this in the tree is to add things > like that instead of overloading this initial patchset with feature > creep. > > Dave. ok, that makes sense. Thanks, Oded
Hi Danilo, On Tue, 20 Jun 2023 02:42:03 +0200 Danilo Krummrich <dakr@redhat.com> wrote: > This patch series provides a new UAPI for the Nouveau driver in order to > support Vulkan features, such as sparse bindings and sparse residency. > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to > keep track of GPU virtual address (VA) mappings in a more generic way. > > The DRM GPUVA manager is indented to help drivers implement userspace-manageable > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it > serves the following purposes in this context. > > 1) Provide infrastructure to track GPU VA allocations and mappings, > making use of the maple_tree. > > 2) Generically connect GPU VA mappings to their backing buffers, in > particular DRM GEM objects. > > 3) Provide a common implementation to perform more complex mapping > operations on the GPU VA space. In particular splitting and merging > of GPU VA mappings, e.g. for intersecting mapping requests or partial > unmap requests. > > The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself > providing the following new interfaces. > > 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl > for UMDs to specify the portion of VA space managed by the kernel and > userspace, respectively. > > 2) Allocate and free a VA space region as well as bind and unbind memory > to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. > > 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. > > Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM > scheduler to queue jobs and support asynchronous processing with DRM syncobjs > as synchronization mechanism. > > By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, > DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. > > The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context > for GEM buffers) by Christian König. Since the patch implementing drm_exec was > not yet merged into drm-next it is part of this series, as well as a small fix > for this patch, which was found while testing this series. > > This patch series is also available at [1]. > > There is a Mesa NVK merge request by Dave Airlie [2] implementing the > corresponding userspace parts for this series. > > The Vulkan CTS test suite passes the sparse binding and sparse residency test > cases for the new UAPI together with Dave's Mesa work. > > There are also some test cases in the igt-gpu-tools project [3] for the new UAPI > and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU > VA manager's logic through Nouveau's new UAPI and should be considered just as > helper for implementation. > > However, I absolutely intend to change those test cases to proper kunit test > cases for the DRM GPUVA manager, once and if we agree on it's usefulness and > design. > > [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / > https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 > [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ > [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind > > Changes in V2: > ============== > Nouveau: > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence > signalling critical sections. Updates to the VA space are split up in three > separate stages, where only the 2. stage executes in a fence signalling > critical section: > > 1. update the VA space, allocate new structures and page tables > 2. (un-)map the requested memory bindings > 3. free structures and page tables > > - Separated generic job scheduler code from specific job implementations. > - Separated the EXEC and VM_BIND implementation of the UAPI. > - Reworked the locking parts of the nvkm/vmm RAW interface, such that > (un-)map operations can be executed in fence signalling critical sections. > > GPUVA Manager: > - made drm_gpuva_regions optional for users of the GPUVA manager > - allow NULL GEMs for drm_gpuva entries > - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region > entries > - provide callbacks for users to allocate custom drm_gpuva_op structures to > allow inheritance > - added user bits to drm_gpuva_flags > - added a prefetch operation type in order to support generating prefetch > operations in the same way other operations generated > - hand the responsibility for mutual exclusion for a GEM's > drm_gpuva list to the user; simplified corresponding (un-)link functions > > Maple Tree: > - I added two maple tree patches to the series, one to support custom tree > walk macros and one to hand the locking responsibility to the user of the > GPUVA manager without pre-defined lockdep checks. > > Changes in V3: > ============== > Nouveau: > - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page > table cleanup) within a workqueue rather than the job_free() callback of > the scheduler itself. A job_free() callback can stall the execution (run() > callback) of the next job in the queue. Since the page table cleanup > requires to take the same locks as need to be taken for page table > allocation, doing it directly in the job_free() callback would still > violate the fence signalling critical path. > - Separated Nouveau fence allocation and emit, such that we do not violate > the fence signalling critical path in EXEC jobs. > - Implement "regions" (for handling sparse mappings through PDEs and dual > page tables) within Nouveau. > - Drop the requirement for every mapping to be contained within a region. > - Add necassary synchronization of VM_BIND job operation sequences in order > to work around limitations in page table handling. This will be addressed > in a future re-work of Nouveau's page table handling. > - Fixed a couple of race conditions found through more testing. Thanks to > Dave for consitently trying to break it. :-) > > GPUVA Manager: > - Implement pre-allocation capabilities for tree modifications within fence > signalling critical sections. > - Implement accessors to to apply tree modification while walking the GPUVA > tree in order to actually support processing of drm_gpuva_ops through > callbacks in fence signalling critical sections rather than through > pre-allocated operation lists. > - Remove merging of GPUVAs; the kernel has limited to none knowlege about > the semantics of mapping sequences. Hence, merging is purely speculative. > It seems that gaining a significant (or at least a measurable) performance > increase through merging is way more likely to happen when userspace is > responsible for merging mappings up to the next larger page size if > possible. > - Since merging was removed, regions pretty much loose their right to exist. > They might still be useful for handling dual page tables or similar > mechanisms, but since Nouveau seems to be the only driver having a need > for this for now, regions were removed from the GPUVA manager. > - Fixed a couple of maple_tree related issues; thanks to Liam for helping me > out. > > Changes in V4: > ============== > Nouveau: > - Refactored how specific VM_BIND and EXEC jobs are created and how their > arguments are passed to the generic job implementation. > - Fixed a UAF race condition where bind job ops could have been freed > already while still waiting for a job cleanup to finish. This is due to > in certain cases we need to wait for mappings actually being unmapped > before creating sparse regions in the same area. > - Re-based the code onto drm_exec v4 patch. > > GPUVA Manager: > - Fixed a maple tree related bug when pre-allocating MA states. > (Boris Brezillion) > - Made struct drm_gpuva_fn_ops a const object in all occurrences. > (Boris Brezillion) > > Changes in V5: > ============== > Nouveau: > - Link and unlink GPUVAs outside the fence signalling critical path in > nouveau_uvmm_bind_job_submit() holding the dma-resv lock. Mutual exclusion > of BO evicts causing mapping invalidation and regular mapping operations > is ensured with dma-fences. > > GPUVA Manager: > - Removed the separate GEMs GPUVA list lock. Link and unlink as well as > iterating the GEM's GPUVA list should be protected with the GEM's dma-resv > lock instead. > - Renamed DRM_GPUVA_EVICTED flag to DRM_GPUVA_INVALIDATED. Mappings do not > get eviced, they might get invalidated due to eviction. > - Maple tree uses the 'unsinged long' type for node entries. While this > works for GPU VA spaces larger than 32-bit on 64-bit kernel, the GPU VA > space is limited to 32-bit on 32-bit kernels as well. > As long as we do not have a 64-bit capable maple tree for 32-bit kernels, > the GPU VA manager contains checks to throw warnings when GPU VA entries > exceed the maple tree's storage capabilities. > - Extended the Documentation and added example code as requested by Donald > Robson. > > Christian König (1): > drm: execution context for GEM buffers v4 > > Danilo Krummrich (13): > maple_tree: split up MA_STATE() macro > drm: manager to keep track of GPUs VA mappings > drm: debugfs: provide infrastructure to dump a DRM GPU VA space Core drm patches are Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> The only thing I'm worried about is the 'sync mapping requests have to go through the async path and wait for all previous async requests to be processed' problem I mentioned in one of your previous submission, but I'm happy leave that for later. > drm/nouveau: new VM_BIND uapi interfaces > drm/nouveau: get vmm via nouveau_cli_vmm() > drm/nouveau: bo: initialize GEM GPU VA interface > drm/nouveau: move usercopy helpers to nouveau_drv.h > drm/nouveau: fence: separate fence alloc and emit > drm/nouveau: fence: fail to emit when fence context is killed > drm/nouveau: chan: provide nouveau_channel_kill() > drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm > drm/nouveau: implement new VM_BIND uAPI > drm/nouveau: debugfs: implement DRM GPU VA debugfs > > Documentation/gpu/driver-uapi.rst | 11 + > Documentation/gpu/drm-mm.rst | 54 + > drivers/gpu/drm/Kconfig | 6 + > drivers/gpu/drm/Makefile | 3 + > drivers/gpu/drm/drm_debugfs.c | 41 + > drivers/gpu/drm/drm_exec.c | 278 +++ > drivers/gpu/drm/drm_gem.c | 3 + > drivers/gpu/drm/drm_gpuva_mgr.c | 1971 ++++++++++++++++ > drivers/gpu/drm/nouveau/Kbuild | 3 + > drivers/gpu/drm/nouveau/Kconfig | 2 + > drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +- > drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +- > drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +- > .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +- > drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 + > drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 + > drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +- > drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +- > drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +- > drivers/gpu/drm/nouveau/nouveau_chan.h | 1 + > drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 + > drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +- > drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +- > drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +- > drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++ > drivers/gpu/drm/nouveau/nouveau_exec.h | 54 + > drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +- > drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +- > drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +- > drivers/gpu/drm/nouveau/nouveau_mem.h | 5 + > drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++ > drivers/gpu/drm/nouveau/nouveau_sched.h | 123 + > drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1979 +++++++++++++++++ > drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 + > drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +- > drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +- > .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +- > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +- > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 + > .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +- > .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +- > .../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +- > include/drm/drm_debugfs.h | 25 + > include/drm/drm_drv.h | 6 + > include/drm/drm_exec.h | 119 + > include/drm/drm_gem.h | 52 + > include/drm/drm_gpuva_mgr.h | 682 ++++++ > include/linux/maple_tree.h | 7 +- > include/uapi/drm/nouveau_drm.h | 209 ++ > 51 files changed, 7566 insertions(+), 242 deletions(-) > create mode 100644 drivers/gpu/drm/drm_exec.c > create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c > create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h > create mode 100644 include/drm/drm_exec.h > create mode 100644 include/drm/drm_gpuva_mgr.h > > > base-commit: 2222dcb0775d36de28992f56455ab3967b30d380
Hi Boris, On 6/20/23 11:25, Boris Brezillon wrote: > Hi Danilo, > > On Tue, 20 Jun 2023 02:42:03 +0200 > Danilo Krummrich <dakr@redhat.com> wrote: > >> This patch series provides a new UAPI for the Nouveau driver in order to >> support Vulkan features, such as sparse bindings and sparse residency. >> >> Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to >> keep track of GPU virtual address (VA) mappings in a more generic way. >> >> The DRM GPUVA manager is indented to help drivers implement userspace-manageable >> GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it >> serves the following purposes in this context. >> >> 1) Provide infrastructure to track GPU VA allocations and mappings, >> making use of the maple_tree. >> >> 2) Generically connect GPU VA mappings to their backing buffers, in >> particular DRM GEM objects. >> >> 3) Provide a common implementation to perform more complex mapping >> operations on the GPU VA space. In particular splitting and merging >> of GPU VA mappings, e.g. for intersecting mapping requests or partial >> unmap requests. >> >> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself >> providing the following new interfaces. >> >> 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl >> for UMDs to specify the portion of VA space managed by the kernel and >> userspace, respectively. >> >> 2) Allocate and free a VA space region as well as bind and unbind memory >> to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. >> >> 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. >> >> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM >> scheduler to queue jobs and support asynchronous processing with DRM syncobjs >> as synchronization mechanism. >> >> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, >> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. >> >> The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context >> for GEM buffers) by Christian König. Since the patch implementing drm_exec was >> not yet merged into drm-next it is part of this series, as well as a small fix >> for this patch, which was found while testing this series. >> >> This patch series is also available at [1]. >> >> There is a Mesa NVK merge request by Dave Airlie [2] implementing the >> corresponding userspace parts for this series. >> >> The Vulkan CTS test suite passes the sparse binding and sparse residency test >> cases for the new UAPI together with Dave's Mesa work. >> >> There are also some test cases in the igt-gpu-tools project [3] for the new UAPI >> and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU >> VA manager's logic through Nouveau's new UAPI and should be considered just as >> helper for implementation. >> >> However, I absolutely intend to change those test cases to proper kunit test >> cases for the DRM GPUVA manager, once and if we agree on it's usefulness and >> design. >> >> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / >> https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 >> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ >> [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind >> >> Changes in V2: >> ============== >> Nouveau: >> - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence >> signalling critical sections. Updates to the VA space are split up in three >> separate stages, where only the 2. stage executes in a fence signalling >> critical section: >> >> 1. update the VA space, allocate new structures and page tables >> 2. (un-)map the requested memory bindings >> 3. free structures and page tables >> >> - Separated generic job scheduler code from specific job implementations. >> - Separated the EXEC and VM_BIND implementation of the UAPI. >> - Reworked the locking parts of the nvkm/vmm RAW interface, such that >> (un-)map operations can be executed in fence signalling critical sections. >> >> GPUVA Manager: >> - made drm_gpuva_regions optional for users of the GPUVA manager >> - allow NULL GEMs for drm_gpuva entries >> - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region >> entries >> - provide callbacks for users to allocate custom drm_gpuva_op structures to >> allow inheritance >> - added user bits to drm_gpuva_flags >> - added a prefetch operation type in order to support generating prefetch >> operations in the same way other operations generated >> - hand the responsibility for mutual exclusion for a GEM's >> drm_gpuva list to the user; simplified corresponding (un-)link functions >> >> Maple Tree: >> - I added two maple tree patches to the series, one to support custom tree >> walk macros and one to hand the locking responsibility to the user of the >> GPUVA manager without pre-defined lockdep checks. >> >> Changes in V3: >> ============== >> Nouveau: >> - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page >> table cleanup) within a workqueue rather than the job_free() callback of >> the scheduler itself. A job_free() callback can stall the execution (run() >> callback) of the next job in the queue. Since the page table cleanup >> requires to take the same locks as need to be taken for page table >> allocation, doing it directly in the job_free() callback would still >> violate the fence signalling critical path. >> - Separated Nouveau fence allocation and emit, such that we do not violate >> the fence signalling critical path in EXEC jobs. >> - Implement "regions" (for handling sparse mappings through PDEs and dual >> page tables) within Nouveau. >> - Drop the requirement for every mapping to be contained within a region. >> - Add necassary synchronization of VM_BIND job operation sequences in order >> to work around limitations in page table handling. This will be addressed >> in a future re-work of Nouveau's page table handling. >> - Fixed a couple of race conditions found through more testing. Thanks to >> Dave for consitently trying to break it. :-) >> >> GPUVA Manager: >> - Implement pre-allocation capabilities for tree modifications within fence >> signalling critical sections. >> - Implement accessors to to apply tree modification while walking the GPUVA >> tree in order to actually support processing of drm_gpuva_ops through >> callbacks in fence signalling critical sections rather than through >> pre-allocated operation lists. >> - Remove merging of GPUVAs; the kernel has limited to none knowlege about >> the semantics of mapping sequences. Hence, merging is purely speculative. >> It seems that gaining a significant (or at least a measurable) performance >> increase through merging is way more likely to happen when userspace is >> responsible for merging mappings up to the next larger page size if >> possible. >> - Since merging was removed, regions pretty much loose their right to exist. >> They might still be useful for handling dual page tables or similar >> mechanisms, but since Nouveau seems to be the only driver having a need >> for this for now, regions were removed from the GPUVA manager. >> - Fixed a couple of maple_tree related issues; thanks to Liam for helping me >> out. >> >> Changes in V4: >> ============== >> Nouveau: >> - Refactored how specific VM_BIND and EXEC jobs are created and how their >> arguments are passed to the generic job implementation. >> - Fixed a UAF race condition where bind job ops could have been freed >> already while still waiting for a job cleanup to finish. This is due to >> in certain cases we need to wait for mappings actually being unmapped >> before creating sparse regions in the same area. >> - Re-based the code onto drm_exec v4 patch. >> >> GPUVA Manager: >> - Fixed a maple tree related bug when pre-allocating MA states. >> (Boris Brezillion) >> - Made struct drm_gpuva_fn_ops a const object in all occurrences. >> (Boris Brezillion) >> >> Changes in V5: >> ============== >> Nouveau: >> - Link and unlink GPUVAs outside the fence signalling critical path in >> nouveau_uvmm_bind_job_submit() holding the dma-resv lock. Mutual exclusion >> of BO evicts causing mapping invalidation and regular mapping operations >> is ensured with dma-fences. >> >> GPUVA Manager: >> - Removed the separate GEMs GPUVA list lock. Link and unlink as well as >> iterating the GEM's GPUVA list should be protected with the GEM's dma-resv >> lock instead. >> - Renamed DRM_GPUVA_EVICTED flag to DRM_GPUVA_INVALIDATED. Mappings do not >> get eviced, they might get invalidated due to eviction. >> - Maple tree uses the 'unsinged long' type for node entries. While this >> works for GPU VA spaces larger than 32-bit on 64-bit kernel, the GPU VA >> space is limited to 32-bit on 32-bit kernels as well. >> As long as we do not have a 64-bit capable maple tree for 32-bit kernels, >> the GPU VA manager contains checks to throw warnings when GPU VA entries >> exceed the maple tree's storage capabilities. >> - Extended the Documentation and added example code as requested by Donald >> Robson. >> >> Christian König (1): >> drm: execution context for GEM buffers v4 >> >> Danilo Krummrich (13): >> maple_tree: split up MA_STATE() macro >> drm: manager to keep track of GPUs VA mappings >> drm: debugfs: provide infrastructure to dump a DRM GPU VA space > > Core drm patches are > > Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> > > The only thing I'm worried about is the 'sync mapping requests have to > go through the async path and wait for all previous async requests to > be processed' problem I mentioned in one of your previous submission, > but I'm happy leave that for later. Yes, I'm aware of this limitation. Let me quickly try to explain where this limitation comes from and how I intend to address it. In order to be able to allocate the required page tables for a mapping request and in order to free corresponding page tables once the (async) job finished I need to know the corresponding sequence of operations (drm_gpuva_ops) to fulfill the mapping request. This requires me to update the GPUVA space in the ioctl() rather than in the async stage, because otherwise I would need to wait for previous jobs to finish before being able to submit subsequent jobs to the job queue, since I need an up to date view of the GPUVA space in order to calculate the sequence of operations to fulfill a mapping request. As a consequence all jobs need to be processed in the order they were submitted, including synchronous jobs. @Matt: I think you will have the same limitation with synchronous jobs as your implementation in XE should be similar? In order to address it I want to switch to using callbacks rather than 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the asynchronous stage. This would allow me to 'fit' synchronous jobs between jobs waiting in the async job queue. However, to do this I have to re-work how the page table handling in Nouveau is implemented, since this would require me to be able to manage the page tables without knowing the exact sequence of operations to fulfill a mapping request. - Danilo > >> drm/nouveau: new VM_BIND uapi interfaces >> drm/nouveau: get vmm via nouveau_cli_vmm() >> drm/nouveau: bo: initialize GEM GPU VA interface >> drm/nouveau: move usercopy helpers to nouveau_drv.h >> drm/nouveau: fence: separate fence alloc and emit >> drm/nouveau: fence: fail to emit when fence context is killed >> drm/nouveau: chan: provide nouveau_channel_kill() >> drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm >> drm/nouveau: implement new VM_BIND uAPI >> drm/nouveau: debugfs: implement DRM GPU VA debugfs >> >> Documentation/gpu/driver-uapi.rst | 11 + >> Documentation/gpu/drm-mm.rst | 54 + >> drivers/gpu/drm/Kconfig | 6 + >> drivers/gpu/drm/Makefile | 3 + >> drivers/gpu/drm/drm_debugfs.c | 41 + >> drivers/gpu/drm/drm_exec.c | 278 +++ >> drivers/gpu/drm/drm_gem.c | 3 + >> drivers/gpu/drm/drm_gpuva_mgr.c | 1971 ++++++++++++++++ >> drivers/gpu/drm/nouveau/Kbuild | 3 + >> drivers/gpu/drm/nouveau/Kconfig | 2 + >> drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +- >> drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +- >> drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +- >> .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +- >> drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 + >> drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 + >> drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +- >> drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +- >> drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +- >> drivers/gpu/drm/nouveau/nouveau_chan.h | 1 + >> drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 + >> drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +- >> drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +- >> drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +- >> drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++ >> drivers/gpu/drm/nouveau/nouveau_exec.h | 54 + >> drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +- >> drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +- >> drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +- >> drivers/gpu/drm/nouveau/nouveau_mem.h | 5 + >> drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +- >> drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++ >> drivers/gpu/drm/nouveau/nouveau_sched.h | 123 + >> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- >> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1979 +++++++++++++++++ >> drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 + >> drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +- >> drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +- >> .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +- >> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +- >> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 + >> .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +- >> .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +- >> .../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +- >> include/drm/drm_debugfs.h | 25 + >> include/drm/drm_drv.h | 6 + >> include/drm/drm_exec.h | 119 + >> include/drm/drm_gem.h | 52 + >> include/drm/drm_gpuva_mgr.h | 682 ++++++ >> include/linux/maple_tree.h | 7 +- >> include/uapi/drm/nouveau_drm.h | 209 ++ >> 51 files changed, 7566 insertions(+), 242 deletions(-) >> create mode 100644 drivers/gpu/drm/drm_exec.c >> create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c >> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c >> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h >> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c >> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h >> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c >> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h >> create mode 100644 include/drm/drm_exec.h >> create mode 100644 include/drm/drm_gpuva_mgr.h >> >> >> base-commit: 2222dcb0775d36de28992f56455ab3967b30d380 >
Hi Danilo, On Tue, 20 Jun 2023 14:46:07 +0200 Danilo Krummrich <dakr@redhat.com> wrote: > > The only thing I'm worried about is the 'sync mapping requests have to > > go through the async path and wait for all previous async requests to > > be processed' problem I mentioned in one of your previous submission, > > but I'm happy leave that for later. > > Yes, I'm aware of this limitation. > > Let me quickly try to explain where this limitation comes from and how I > intend to address it. > > In order to be able to allocate the required page tables for a mapping > request and in order to free corresponding page tables once the (async) > job finished I need to know the corresponding sequence of operations > (drm_gpuva_ops) to fulfill the mapping request. > > This requires me to update the GPUVA space in the ioctl() rather than in > the async stage, because otherwise I would need to wait for previous > jobs to finish before being able to submit subsequent jobs to the job > queue, since I need an up to date view of the GPUVA space in order to > calculate the sequence of operations to fulfill a mapping request. > > As a consequence all jobs need to be processed in the order they were > submitted, including synchronous jobs. > > @Matt: I think you will have the same limitation with synchronous jobs > as your implementation in XE should be similar? > > In order to address it I want to switch to using callbacks rather than > 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the > asynchronous stage. > This would allow me to 'fit' synchronous jobs > between jobs waiting in the async job queue. However, to do this I have > to re-work how the page table handling in Nouveau is implemented, since > this would require me to be able to manage the page tables without > knowing the exact sequence of operations to fulfill a mapping request. Ok, so I think that's more or less what we're trying to do right now in PowerVR. - First, we make sure we reserve enough MMU page tables for a given map operation to succeed no matter the VM state in the VM_BIND job submission path (our VM_BIND ioctl). That means we're always over-provisioning and returning unused memory back when the operation is done if we end up using less memory. - We pre-allocate for the mapple-tree insertions. - Then we map using drm_gpuva_sm_map() and the callbacks we provided in the drm_sched::run_job() path. We guarantee that no memory is allocated in that path thanks to the pre-allocation/reservation we've done at VM_BIND job submission time. The problem I see with this v5 is that: 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(), which, in our case, is called in the async drm_sched::run_job() path, and we don't hold the lock in that path (it's been released just after the job submission). 2/ I'm worried that Liam's plan to only reserve what's actually needed based on the mapple tree state is going to play against us, because the mapple-tree is only modified at job exec time, and we might have several unmaps happening between the moment we created and queued the jobs, and the moment they actually get executed, meaning the mapple-tree reservation might no longer fit the bill. For issue #1, it shouldn't be to problematic if we use a regular lock to insert to/remove from the GEM gpuva list. For issue #2, I can see a way out if, instead of freeing gpuva nodes, we flag those as unused when we see that something happening later in the queue is going to map a section being unmapped. All of this implies keeping access to already queued VM_BIND jobs (using the spsc queue at the entity level is not practical), and iterating over them every time a new sync or async job is queued to flag what needs to be retained. It would obviously be easier if we could tell the mapple-tree API 'provision as if the tree was empty', so all we have to do is just over-provision for both the page tables and mapple-tree insertion, and free the unused mem when the operation is done. Don't know if you already thought about that and/or have solutions to solve these issues. Regards, Boris
Hi Boris, On 6/22/23 15:01, Boris Brezillon wrote: > Hi Danilo, > > On Tue, 20 Jun 2023 14:46:07 +0200 > Danilo Krummrich <dakr@redhat.com> wrote: > >>> The only thing I'm worried about is the 'sync mapping requests have to >>> go through the async path and wait for all previous async requests to >>> be processed' problem I mentioned in one of your previous submission, >>> but I'm happy leave that for later. >> >> Yes, I'm aware of this limitation. >> >> Let me quickly try to explain where this limitation comes from and how I >> intend to address it. >> >> In order to be able to allocate the required page tables for a mapping >> request and in order to free corresponding page tables once the (async) >> job finished I need to know the corresponding sequence of operations >> (drm_gpuva_ops) to fulfill the mapping request. >> >> This requires me to update the GPUVA space in the ioctl() rather than in >> the async stage, because otherwise I would need to wait for previous >> jobs to finish before being able to submit subsequent jobs to the job >> queue, since I need an up to date view of the GPUVA space in order to >> calculate the sequence of operations to fulfill a mapping request. >> >> As a consequence all jobs need to be processed in the order they were >> submitted, including synchronous jobs. >> >> @Matt: I think you will have the same limitation with synchronous jobs >> as your implementation in XE should be similar? >> >> In order to address it I want to switch to using callbacks rather than >> 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the >> asynchronous stage. >> This would allow me to 'fit' synchronous jobs >> between jobs waiting in the async job queue. However, to do this I have >> to re-work how the page table handling in Nouveau is implemented, since >> this would require me to be able to manage the page tables without >> knowing the exact sequence of operations to fulfill a mapping request. > > Ok, so I think that's more or less what we're trying to do right > now in PowerVR. > > - First, we make sure we reserve enough MMU page tables for a given map > operation to succeed no matter the VM state in the VM_BIND job > submission path (our VM_BIND ioctl). That means we're always > over-provisioning and returning unused memory back when the operation > is done if we end up using less memory. > - We pre-allocate for the mapple-tree insertions. > - Then we map using drm_gpuva_sm_map() and the callbacks we provided in > the drm_sched::run_job() path. We guarantee that no memory is > allocated in that path thanks to the pre-allocation/reservation we've > done at VM_BIND job submission time. > > The problem I see with this v5 is that: > > 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(), > which, in our case, is called in the async drm_sched::run_job() path, > and we don't hold the lock in that path (it's been released just > after the job submission). My solution to this, as by now, is to - in the same way we pre-allocate - to just pre-link and pre-unlink. And then fix things up in the cleanup path. However, depending on the driver, this might require you to set a flag in the driver specific structure (embedding struct drm_gpuva) whether the gpuva is actually mapped (as in has active page table entries). Maybe we could also just add such a flag to struct drm_gpuva. But yeah, doesn't sound too nice to be honest... > 2/ I'm worried that Liam's plan to only reserve what's actually needed > based on the mapple tree state is going to play against us, because > the mapple-tree is only modified at job exec time, and we might have > several unmaps happening between the moment we created and queued the > jobs, and the moment they actually get executed, meaning the > mapple-tree reservation might no longer fit the bill. Yes, I'm aware and I explained to Liam in detail why we need the mas_preallocate_worst_case() way of doing it. See this mail: https://lore.kernel.org/nouveau/68cd25de-e767-725e-2e7b-703217230bb0@redhat.com/T/#ma326e200b1de1e3c9df4e9fcb3bf243061fee8b5 He hasn't answered yet, but I hope we can just get (or actually keep) such a function (hopefully with better naming), since it shouldn't interfere with anything else. > > For issue #1, it shouldn't be to problematic if we use a regular lock to > insert to/remove from the GEM gpuva list. Yes, that's why I had a separate GEM gpuva list lock in the first place. However, this doesn't really work when generating ops rather than using the callback interface. Have a look at drm_gpuva_gem_unmap_ops_create() requested by Matt for XE. This function generates drm_gpuva_ops to unmap all mappings of a given GEM. In order to do that the function must iterate the GEM's gpuva list and allocate operations for each mapping. As a consequence the gpuva list lock wouldn't be allowed to be taken in the fence signalling path (run_job()) any longer. Hence, we can just protect the list with the GEM's dma-resv lock. However, I can understand that it might be inconvenient for the callback interface and admittedly my solution to that isn't that nice as well. Hence the following idea: For drivers to be able to use their own lock for that it would be enough to get rid of the lockdep checks. We could just add a flag to the GPUVA manager to let the driver indicate it wants to do it's own locking for the GPUVA list and skip the lockdep checks for the dma-resv lock in that case. > > For issue #2, I can see a way out if, instead of freeing gpuva nodes, > we flag those as unused when we see that something happening later in > the queue is going to map a section being unmapped. All of this implies > keeping access to already queued VM_BIND jobs (using the spsc queue at > the entity level is not practical), and iterating over them every time > a new sync or async job is queued to flag what needs to be retained. It > would obviously be easier if we could tell the mapple-tree API > 'provision as if the tree was empty', so all we have to do is just > over-provision for both the page tables and mapple-tree insertion, and > free the unused mem when the operation is done. > > Don't know if you already thought about that and/or have solutions to > solve these issues. As already mentioned above, I'd just expect we can keep it the over-provision way, as you say. I think it's a legit use case to not know the state of the maple tree at the time the pre-allocated nodes will be used and keeping that should not interfere with Liams plan to (hopefully separately) optimize for the pre-allocation use case they have within -mm. But let's wait for his take on that. - Danilo > > Regards, > > Boris >
Hi Danilo, On Thu, 22 Jun 2023 15:58:23 +0200 Danilo Krummrich <dakr@redhat.com> wrote: > Hi Boris, > > On 6/22/23 15:01, Boris Brezillon wrote: > > Hi Danilo, > > > > On Tue, 20 Jun 2023 14:46:07 +0200 > > Danilo Krummrich <dakr@redhat.com> wrote: > > > >>> The only thing I'm worried about is the 'sync mapping requests have to > >>> go through the async path and wait for all previous async requests to > >>> be processed' problem I mentioned in one of your previous submission, > >>> but I'm happy leave that for later. > >> > >> Yes, I'm aware of this limitation. > >> > >> Let me quickly try to explain where this limitation comes from and how I > >> intend to address it. > >> > >> In order to be able to allocate the required page tables for a mapping > >> request and in order to free corresponding page tables once the (async) > >> job finished I need to know the corresponding sequence of operations > >> (drm_gpuva_ops) to fulfill the mapping request. > >> > >> This requires me to update the GPUVA space in the ioctl() rather than in > >> the async stage, because otherwise I would need to wait for previous > >> jobs to finish before being able to submit subsequent jobs to the job > >> queue, since I need an up to date view of the GPUVA space in order to > >> calculate the sequence of operations to fulfill a mapping request. > >> > >> As a consequence all jobs need to be processed in the order they were > >> submitted, including synchronous jobs. > >> > >> @Matt: I think you will have the same limitation with synchronous jobs > >> as your implementation in XE should be similar? > >> > >> In order to address it I want to switch to using callbacks rather than > >> 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the > >> asynchronous stage. > >> This would allow me to 'fit' synchronous jobs > >> between jobs waiting in the async job queue. However, to do this I have > >> to re-work how the page table handling in Nouveau is implemented, since > >> this would require me to be able to manage the page tables without > >> knowing the exact sequence of operations to fulfill a mapping request. > > > > Ok, so I think that's more or less what we're trying to do right > > now in PowerVR. > > > > - First, we make sure we reserve enough MMU page tables for a given map > > operation to succeed no matter the VM state in the VM_BIND job > > submission path (our VM_BIND ioctl). That means we're always > > over-provisioning and returning unused memory back when the operation > > is done if we end up using less memory. > > - We pre-allocate for the mapple-tree insertions. > > - Then we map using drm_gpuva_sm_map() and the callbacks we provided in > > the drm_sched::run_job() path. We guarantee that no memory is > > allocated in that path thanks to the pre-allocation/reservation we've > > done at VM_BIND job submission time. > > > > The problem I see with this v5 is that: > > > > 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(), > > which, in our case, is called in the async drm_sched::run_job() path, > > and we don't hold the lock in that path (it's been released just > > after the job submission). > > My solution to this, as by now, is to - in the same way we pre-allocate > - to just pre-link and pre-unlink. And then fix things up in the cleanup > path. > > However, depending on the driver, this might require you to set a flag > in the driver specific structure (embedding struct drm_gpuva) whether > the gpuva is actually mapped (as in has active page table entries). > Maybe we could also just add such a flag to struct drm_gpuva. But yeah, > doesn't sound too nice to be honest... > > > 2/ I'm worried that Liam's plan to only reserve what's actually needed > > based on the mapple tree state is going to play against us, because > > the mapple-tree is only modified at job exec time, and we might have > > several unmaps happening between the moment we created and queued the > > jobs, and the moment they actually get executed, meaning the > > mapple-tree reservation might no longer fit the bill. > > Yes, I'm aware and I explained to Liam in detail why we need the > mas_preallocate_worst_case() way of doing it. > > See this mail: > https://lore.kernel.org/nouveau/68cd25de-e767-725e-2e7b-703217230bb0@redhat.com/T/#ma326e200b1de1e3c9df4e9fcb3bf243061fee8b5 > > He hasn't answered yet, but I hope we can just get (or actually keep) > such a function (hopefully with better naming), since it shouldn't > interfere with anything else. My bad, I started reading your reply and got interrupted. Never got back to it, which I should definitely have done before posting my questions. Anyway, glad to hear we're on the same page regarding the mas_preallocate_worst_case() thing. > > > > > For issue #1, it shouldn't be to problematic if we use a regular lock to > > insert to/remove from the GEM gpuva list. > > Yes, that's why I had a separate GEM gpuva list lock in the first place. > However, this doesn't really work when generating ops rather than using > the callback interface. > > Have a look at drm_gpuva_gem_unmap_ops_create() requested by Matt for > XE. This function generates drm_gpuva_ops to unmap all mappings of a > given GEM. In order to do that the function must iterate the GEM's gpuva > list and allocate operations for each mapping. As a consequence the > gpuva list lock wouldn't be allowed to be taken in the fence signalling > path (run_job()) any longer. Hence, we can just protect the list with > the GEM's dma-resv lock. Yeah, I see why using dma_resv when pre-inserting the mapping is useful, it just didn't really work with late mapping insertion. > > However, I can understand that it might be inconvenient for the callback > interface and admittedly my solution to that isn't that nice as well. > Hence the following idea: > > For drivers to be able to use their own lock for that it would be enough > to get rid of the lockdep checks. We could just add a flag to the GPUVA > manager to let the driver indicate it wants to do it's own locking for > the GPUVA list and skip the lockdep checks for the dma-resv lock in that > case. Sounds good to me. > > > > > For issue #2, I can see a way out if, instead of freeing gpuva nodes, > > we flag those as unused when we see that something happening later in > > the queue is going to map a section being unmapped. All of this implies > > keeping access to already queued VM_BIND jobs (using the spsc queue at > > the entity level is not practical), and iterating over them every time > > a new sync or async job is queued to flag what needs to be retained. It > > would obviously be easier if we could tell the mapple-tree API > > 'provision as if the tree was empty', so all we have to do is just > > over-provision for both the page tables and mapple-tree insertion, and > > free the unused mem when the operation is done. > > > > Don't know if you already thought about that and/or have solutions to > > solve these issues. > > As already mentioned above, I'd just expect we can keep it the > over-provision way, as you say. I think it's a legit use case to not > know the state of the maple tree at the time the pre-allocated nodes > will be used and keeping that should not interfere with Liams plan to > (hopefully separately) optimize for the pre-allocation use case they > have within -mm. > > But let's wait for his take on that. Sure. As I said, I'm fine getting this version merged, we can sort out the changes needed for PowerVR later. Just thought I'd mention those issues early, so you're not surprised when we come back with crazy requests (which apparently are not that crazy ;-)). Regards, Boris
On 6/22/23 17:19, Boris Brezillon wrote: > Hi Danilo, > > On Thu, 22 Jun 2023 15:58:23 +0200 > Danilo Krummrich <dakr@redhat.com> wrote: > >> Hi Boris, >> >> On 6/22/23 15:01, Boris Brezillon wrote: >>> Hi Danilo, >>> >>> On Tue, 20 Jun 2023 14:46:07 +0200 >>> Danilo Krummrich <dakr@redhat.com> wrote: >>> >>>>> The only thing I'm worried about is the 'sync mapping requests have to >>>>> go through the async path and wait for all previous async requests to >>>>> be processed' problem I mentioned in one of your previous submission, >>>>> but I'm happy leave that for later. >>>> >>>> Yes, I'm aware of this limitation. >>>> >>>> Let me quickly try to explain where this limitation comes from and how I >>>> intend to address it. >>>> >>>> In order to be able to allocate the required page tables for a mapping >>>> request and in order to free corresponding page tables once the (async) >>>> job finished I need to know the corresponding sequence of operations >>>> (drm_gpuva_ops) to fulfill the mapping request. >>>> >>>> This requires me to update the GPUVA space in the ioctl() rather than in >>>> the async stage, because otherwise I would need to wait for previous >>>> jobs to finish before being able to submit subsequent jobs to the job >>>> queue, since I need an up to date view of the GPUVA space in order to >>>> calculate the sequence of operations to fulfill a mapping request. >>>> >>>> As a consequence all jobs need to be processed in the order they were >>>> submitted, including synchronous jobs. >>>> >>>> @Matt: I think you will have the same limitation with synchronous jobs >>>> as your implementation in XE should be similar? >>>> >>>> In order to address it I want to switch to using callbacks rather than >>>> 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the >>>> asynchronous stage. >>>> This would allow me to 'fit' synchronous jobs >>>> between jobs waiting in the async job queue. However, to do this I have >>>> to re-work how the page table handling in Nouveau is implemented, since >>>> this would require me to be able to manage the page tables without >>>> knowing the exact sequence of operations to fulfill a mapping request. >>> >>> Ok, so I think that's more or less what we're trying to do right >>> now in PowerVR. >>> >>> - First, we make sure we reserve enough MMU page tables for a given map >>> operation to succeed no matter the VM state in the VM_BIND job >>> submission path (our VM_BIND ioctl). That means we're always >>> over-provisioning and returning unused memory back when the operation >>> is done if we end up using less memory. >>> - We pre-allocate for the mapple-tree insertions. >>> - Then we map using drm_gpuva_sm_map() and the callbacks we provided in >>> the drm_sched::run_job() path. We guarantee that no memory is >>> allocated in that path thanks to the pre-allocation/reservation we've >>> done at VM_BIND job submission time. >>> >>> The problem I see with this v5 is that: >>> >>> 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(), >>> which, in our case, is called in the async drm_sched::run_job() path, >>> and we don't hold the lock in that path (it's been released just >>> after the job submission). >> >> My solution to this, as by now, is to - in the same way we pre-allocate >> - to just pre-link and pre-unlink. And then fix things up in the cleanup >> path. >> >> However, depending on the driver, this might require you to set a flag >> in the driver specific structure (embedding struct drm_gpuva) whether >> the gpuva is actually mapped (as in has active page table entries). >> Maybe we could also just add such a flag to struct drm_gpuva. But yeah, >> doesn't sound too nice to be honest... >> >>> 2/ I'm worried that Liam's plan to only reserve what's actually needed >>> based on the mapple tree state is going to play against us, because >>> the mapple-tree is only modified at job exec time, and we might have >>> several unmaps happening between the moment we created and queued the >>> jobs, and the moment they actually get executed, meaning the >>> mapple-tree reservation might no longer fit the bill. >> >> Yes, I'm aware and I explained to Liam in detail why we need the >> mas_preallocate_worst_case() way of doing it. >> >> See this mail: >> https://lore.kernel.org/nouveau/68cd25de-e767-725e-2e7b-703217230bb0@redhat.com/T/#ma326e200b1de1e3c9df4e9fcb3bf243061fee8b5 >> >> He hasn't answered yet, but I hope we can just get (or actually keep) >> such a function (hopefully with better naming), since it shouldn't >> interfere with anything else. > > My bad, I started reading your reply and got interrupted. Never got > back to it, which I should definitely have done before posting my > questions. Anyway, glad to hear we're on the same page regarding the > mas_preallocate_worst_case() thing. No worries, I should probably also reply to Liams patch introducing the change. I will do that in a minute. > >> >>> >>> For issue #1, it shouldn't be to problematic if we use a regular lock to >>> insert to/remove from the GEM gpuva list. >> >> Yes, that's why I had a separate GEM gpuva list lock in the first place. >> However, this doesn't really work when generating ops rather than using >> the callback interface. >> >> Have a look at drm_gpuva_gem_unmap_ops_create() requested by Matt for >> XE. This function generates drm_gpuva_ops to unmap all mappings of a >> given GEM. In order to do that the function must iterate the GEM's gpuva >> list and allocate operations for each mapping. As a consequence the >> gpuva list lock wouldn't be allowed to be taken in the fence signalling >> path (run_job()) any longer. Hence, we can just protect the list with >> the GEM's dma-resv lock. > > Yeah, I see why using dma_resv when pre-inserting the mapping is > useful, it just didn't really work with late mapping insertion. > >> >> However, I can understand that it might be inconvenient for the callback >> interface and admittedly my solution to that isn't that nice as well. >> Hence the following idea: >> >> For drivers to be able to use their own lock for that it would be enough >> to get rid of the lockdep checks. We could just add a flag to the GPUVA >> manager to let the driver indicate it wants to do it's own locking for >> the GPUVA list and skip the lockdep checks for the dma-resv lock in that >> case. > > Sounds good to me. I think it's way better than the pre-link / pre-unlink mess. I will add this to v6. > >> >>> >>> For issue #2, I can see a way out if, instead of freeing gpuva nodes, >>> we flag those as unused when we see that something happening later in >>> the queue is going to map a section being unmapped. All of this implies >>> keeping access to already queued VM_BIND jobs (using the spsc queue at >>> the entity level is not practical), and iterating over them every time >>> a new sync or async job is queued to flag what needs to be retained. It >>> would obviously be easier if we could tell the mapple-tree API >>> 'provision as if the tree was empty', so all we have to do is just >>> over-provision for both the page tables and mapple-tree insertion, and >>> free the unused mem when the operation is done. >>> >>> Don't know if you already thought about that and/or have solutions to >>> solve these issues. >> >> As already mentioned above, I'd just expect we can keep it the >> over-provision way, as you say. I think it's a legit use case to not >> know the state of the maple tree at the time the pre-allocated nodes >> will be used and keeping that should not interfere with Liams plan to >> (hopefully separately) optimize for the pre-allocation use case they >> have within -mm. >> >> But let's wait for his take on that. > > Sure. As I said, I'm fine getting this version merged, we can sort out > the changes needed for PowerVR later. Just thought I'd mention those > issues early, so you're not surprised when we come back with crazy > requests (which apparently are not that crazy ;-)). They're not crazy at all, in fact they entirely represent what the callback interface was designed for. :-) - Danilo > > Regards, > > Boris >