Message ID | 20250129195212.745731-29-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Introduce GPU SVM and Xe SVM implementation | expand |
On 29/01/2025 19:52, Matthew Brost wrote: > Migration is implemented with range granularity, with VRAM backing being > a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime of the > TTM BO is limited to when the SVM range is in VRAM (i.e., when a VRAM > SVM range is migrated to SRAM, the TTM BO is destroyed). > > The design choice for using TTM BO for VRAM backing store, as opposed to > direct buddy allocation, is as follows: > > - DRM buddy allocations are not at page granularity, offering no > advantage over a BO. > - Unified eviction is required (SVM VRAM and TTM BOs need to be able to > evict each other). > - For exhaustive eviction [1], SVM VRAM allocations will almost certainly > require a dma-resv. > - Likely allocation size is 2M which makes of size of BO (872) > acceptable per allocation (872 / 2M == .0004158). > > With this, using TTM BO for VRAM backing store seems to be an obvious > choice as it allows leveraging of the TTM eviction code. > > Current migration policy is migrate any SVM range greater than or equal > to 64k once. > > [1] https://patchwork.freedesktop.org/series/133643/ > > v2: > - Rebase on latest GPU SVM > - Retry page fault on get pages returning mixed allocation > - Use drm_gpusvm_devmem > v3: > - Use new BO flags > - New range structure (Thomas) > - Hide migration behind Kconfig > - Kernel doc (Thomas) > - Use check_pages_threshold > v4: > - Don't evict partial unmaps in garbage collector (Thomas) > - Use %pe to print errors (Thomas) > - Use %p to print pointers (Thomas) > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > --- > drivers/gpu/drm/xe/xe_svm.c | 99 +++++++++++++++++++++++++++++++++++-- > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > 2 files changed, 100 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c > index ba1db030bf33..fc030855d078 100644 > --- a/drivers/gpu/drm/xe/xe_svm.c > +++ b/drivers/gpu/drm/xe/xe_svm.c > @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem *devmem_allocatio > return 0; > } > > -__maybe_unused > static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { > .devmem_release = xe_svm_devmem_release, > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, > return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id); > } > > +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) > +{ > + return &tile->mem.vram; > +} > + > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile, > + struct xe_svm_range *range, > + const struct drm_gpusvm_ctx *ctx) > +{ > + struct xe_mem_region *mr = tile_to_mr(tile); > + struct drm_buddy_block *block; > + struct list_head *blocks; > + struct xe_bo *bo; > + ktime_t end = 0; > + int err; > + > +retry: > + xe_vm_lock(vm, false); > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range->base.itree.last + 1 - > + range->base.itree.start, ttm_bo_type_device, > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > + XE_BO_FLAG_CPU_ADDR_MIRROR); > + xe_vm_unlock(vm); What was the trick again to ensure eviction is not triggered at this point? I thought there was some trick with eviction_valuable() but I can't find it. > + if (IS_ERR(bo)) { > + err = PTR_ERR(bo); > + if (xe_vm_validate_should_retry(NULL, err, &end)) > + goto retry; > + return bo; > + } > + > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > + vm->xe->drm.dev, vm->svm.gpusvm.mm, > + &gpusvm_devmem_ops, > + &tile->mem.vram.dpagemap, > + range->base.itree.last + 1 - > + range->base.itree.start); > + > + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; > + list_for_each_entry(block, blocks, link) > + block->private = mr; > + > + /* > + * Take ref because as soon as drm_gpusvm_migrate_to_devmem succeeds the > + * creation ref can be dropped upon CPU fault or unmap. > + */ > + xe_bo_get(bo); > + > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, &range->base, > + &bo->devmem_allocation, ctx); > + if (err) { > + xe_bo_put(bo); /* Local ref */ > + xe_bo_put(bo); /* Creation ref */ > + return ERR_PTR(err); > + } > + > + return bo; > +} > + > /** > * xe_svm_handle_pagefault() - SVM handle page fault > * @vm: The VM. > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, > * @fault_addr: The GPU fault address. > * @atomic: The fault atomic access bit. > * > - * Create GPU bindings for a SVM page fault. > + * Create GPU bindings for a SVM page fault. Optionally migrate to device > + * memory. > * > * Return: 0 on success, negative error code on error. > */ > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > struct xe_tile *tile, u64 fault_addr, > bool atomic) > { > - struct drm_gpusvm_ctx ctx = { .read_only = xe_vma_read_only(vma), }; > + struct drm_gpusvm_ctx ctx = { > + .read_only = xe_vma_read_only(vma), > + .devmem_possible = IS_DGFX(vm->xe) && > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), > + .check_pages_threshold = IS_DGFX(vm->xe) && > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0, > + }; > struct xe_svm_range *range; > struct drm_gpusvm_range *r; > struct drm_exec exec; > struct dma_fence *fence; > + struct xe_bo *bo = NULL; > ktime_t end = 0; > int err; > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > retry: > + xe_bo_put(bo); > + bo = NULL; > + > /* Always process UNMAPs first so view SVM ranges is current */ > err = xe_svm_garbage_collector(vm); > if (err) > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > if (xe_svm_range_is_valid(range, tile)) > return 0; > > + /* XXX: Add migration policy, for now migrate range once */ > + if (!range->migrated && range->base.flags.migrate_devmem && > + (range->base.itree.last + 1 - range->base.itree.start) >= SZ_64K) { > + range->migrated = true; > + > + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); > + if (IS_ERR(bo)) { > + drm_info(&vm->xe->drm, > + "VRAM allocation failed, falling back to retrying, asid=%u, errno %pe\n", > + vm->usm.asid, bo); > + bo = NULL; > + goto retry; > + } > + } > + > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx); > - if (err == -EFAULT || err == -EPERM) /* Corner where CPU mappings have changed */ > + /* Corner where CPU mappings have changed */ > + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) { > + if (err == -EOPNOTSUPP) > + drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base); > + drm_info(&vm->xe->drm, > + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno %pe\n", > + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err)); > goto retry; > + } > if (err) > goto err_out; > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > dma_fence_put(fence); > > err_out: > + xe_bo_put(bo); > > return err; > } > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h > index 63daffdfdbf6..4c2576162c39 100644 > --- a/drivers/gpu/drm/xe/xe_svm.h > +++ b/drivers/gpu/drm/xe/xe_svm.h > @@ -35,6 +35,11 @@ struct xe_svm_range { > * range. Protected by GPU SVM notifier lock. > */ > u8 tile_invalidated; > + /** > + * @migrated: Range has been migrated to device memory, protected by > + * GPU fault handler locking. > + */ > + u8 migrated :1; > }; > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
On Thu, Jan 30, 2025 at 02:22:55PM +0000, Matthew Auld wrote: > On 29/01/2025 19:52, Matthew Brost wrote: > > Migration is implemented with range granularity, with VRAM backing being > > a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime of the > > TTM BO is limited to when the SVM range is in VRAM (i.e., when a VRAM > > SVM range is migrated to SRAM, the TTM BO is destroyed). > > > > The design choice for using TTM BO for VRAM backing store, as opposed to > > direct buddy allocation, is as follows: > > > > - DRM buddy allocations are not at page granularity, offering no > > advantage over a BO. > > - Unified eviction is required (SVM VRAM and TTM BOs need to be able to > > evict each other). > > - For exhaustive eviction [1], SVM VRAM allocations will almost certainly > > require a dma-resv. > > - Likely allocation size is 2M which makes of size of BO (872) > > acceptable per allocation (872 / 2M == .0004158). > > > > With this, using TTM BO for VRAM backing store seems to be an obvious > > choice as it allows leveraging of the TTM eviction code. > > > > Current migration policy is migrate any SVM range greater than or equal > > to 64k once. > > > > [1] https://patchwork.freedesktop.org/series/133643/ > > > > v2: > > - Rebase on latest GPU SVM > > - Retry page fault on get pages returning mixed allocation > > - Use drm_gpusvm_devmem > > v3: > > - Use new BO flags > > - New range structure (Thomas) > > - Hide migration behind Kconfig > > - Kernel doc (Thomas) > > - Use check_pages_threshold > > v4: > > - Don't evict partial unmaps in garbage collector (Thomas) > > - Use %pe to print errors (Thomas) > > - Use %p to print pointers (Thomas) > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > --- > > drivers/gpu/drm/xe/xe_svm.c | 99 +++++++++++++++++++++++++++++++++++-- > > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > > 2 files changed, 100 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c > > index ba1db030bf33..fc030855d078 100644 > > --- a/drivers/gpu/drm/xe/xe_svm.c > > +++ b/drivers/gpu/drm/xe/xe_svm.c > > @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem *devmem_allocatio > > return 0; > > } > > -__maybe_unused > > static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { > > .devmem_release = xe_svm_devmem_release, > > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, > > return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id); > > } > > +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) > > +{ > > + return &tile->mem.vram; > > +} > > + > > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile, > > + struct xe_svm_range *range, > > + const struct drm_gpusvm_ctx *ctx) > > +{ > > + struct xe_mem_region *mr = tile_to_mr(tile); > > + struct drm_buddy_block *block; > > + struct list_head *blocks; > > + struct xe_bo *bo; > > + ktime_t end = 0; > > + int err; > > + > > +retry: > > + xe_vm_lock(vm, false); > > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range->base.itree.last + 1 - > > + range->base.itree.start, ttm_bo_type_device, > > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > > + XE_BO_FLAG_CPU_ADDR_MIRROR); > > + xe_vm_unlock(vm); > > What was the trick again to ensure eviction is not triggered at this point? > I thought there was some trick with eviction_valuable() but I can't find it. > I dropped that given the hacky nature of how it was implemented. Yes, it is possible that we allocate VRAM and it is immediately evicted before the bind occurs but in practice should never really happen given this BO should be the last entry on the LRU list. Even if this happens, I believe this is harmless given the bind will abort and trigger a retry. Matt > > + if (IS_ERR(bo)) { > > + err = PTR_ERR(bo); > > + if (xe_vm_validate_should_retry(NULL, err, &end)) > > + goto retry; > > + return bo; > > + } > > + > > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > > + vm->xe->drm.dev, vm->svm.gpusvm.mm, > > + &gpusvm_devmem_ops, > > + &tile->mem.vram.dpagemap, > > + range->base.itree.last + 1 - > > + range->base.itree.start); > > + > > + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; > > + list_for_each_entry(block, blocks, link) > > + block->private = mr; > > + > > + /* > > + * Take ref because as soon as drm_gpusvm_migrate_to_devmem succeeds the > > + * creation ref can be dropped upon CPU fault or unmap. > > + */ > > + xe_bo_get(bo); > > + > > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, &range->base, > > + &bo->devmem_allocation, ctx); > > + if (err) { > > + xe_bo_put(bo); /* Local ref */ > > + xe_bo_put(bo); /* Creation ref */ > > + return ERR_PTR(err); > > + } > > + > > + return bo; > > +} > > + > > /** > > * xe_svm_handle_pagefault() - SVM handle page fault > > * @vm: The VM. > > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, > > * @fault_addr: The GPU fault address. > > * @atomic: The fault atomic access bit. > > * > > - * Create GPU bindings for a SVM page fault. > > + * Create GPU bindings for a SVM page fault. Optionally migrate to device > > + * memory. > > * > > * Return: 0 on success, negative error code on error. > > */ > > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > struct xe_tile *tile, u64 fault_addr, > > bool atomic) > > { > > - struct drm_gpusvm_ctx ctx = { .read_only = xe_vma_read_only(vma), }; > > + struct drm_gpusvm_ctx ctx = { > > + .read_only = xe_vma_read_only(vma), > > + .devmem_possible = IS_DGFX(vm->xe) && > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), > > + .check_pages_threshold = IS_DGFX(vm->xe) && > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0, > > + }; > > struct xe_svm_range *range; > > struct drm_gpusvm_range *r; > > struct drm_exec exec; > > struct dma_fence *fence; > > + struct xe_bo *bo = NULL; > > ktime_t end = 0; > > int err; > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > retry: > > + xe_bo_put(bo); > > + bo = NULL; > > + > > /* Always process UNMAPs first so view SVM ranges is current */ > > err = xe_svm_garbage_collector(vm); > > if (err) > > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > if (xe_svm_range_is_valid(range, tile)) > > return 0; > > + /* XXX: Add migration policy, for now migrate range once */ > > + if (!range->migrated && range->base.flags.migrate_devmem && > > + (range->base.itree.last + 1 - range->base.itree.start) >= SZ_64K) { > > + range->migrated = true; > > + > > + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); > > + if (IS_ERR(bo)) { > > + drm_info(&vm->xe->drm, > > + "VRAM allocation failed, falling back to retrying, asid=%u, errno %pe\n", > > + vm->usm.asid, bo); > > + bo = NULL; > > + goto retry; > > + } > > + } > > + > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx); > > - if (err == -EFAULT || err == -EPERM) /* Corner where CPU mappings have changed */ > > + /* Corner where CPU mappings have changed */ > > + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) { > > + if (err == -EOPNOTSUPP) > > + drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base); > > + drm_info(&vm->xe->drm, > > + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno %pe\n", > > + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err)); > > goto retry; > > + } > > if (err) > > goto err_out; > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > dma_fence_put(fence); > > err_out: > > + xe_bo_put(bo); > > return err; > > } > > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h > > index 63daffdfdbf6..4c2576162c39 100644 > > --- a/drivers/gpu/drm/xe/xe_svm.h > > +++ b/drivers/gpu/drm/xe/xe_svm.h > > @@ -35,6 +35,11 @@ struct xe_svm_range { > > * range. Protected by GPU SVM notifier lock. > > */ > > u8 tile_invalidated; > > + /** > > + * @migrated: Range has been migrated to device memory, protected by > > + * GPU fault handler locking. > > + */ > > + u8 migrated :1; > > }; > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr); >
On Thu, 2025-01-30 at 08:32 -0800, Matthew Brost wrote: > On Thu, Jan 30, 2025 at 02:22:55PM +0000, Matthew Auld wrote: > > On 29/01/2025 19:52, Matthew Brost wrote: > > > Migration is implemented with range granularity, with VRAM > > > backing being > > > a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime > > > of the > > > TTM BO is limited to when the SVM range is in VRAM (i.e., when a > > > VRAM > > > SVM range is migrated to SRAM, the TTM BO is destroyed). > > > > > > The design choice for using TTM BO for VRAM backing store, as > > > opposed to > > > direct buddy allocation, is as follows: > > > > > > - DRM buddy allocations are not at page granularity, offering no > > > advantage over a BO. > > > - Unified eviction is required (SVM VRAM and TTM BOs need to be > > > able to > > > evict each other). > > > - For exhaustive eviction [1], SVM VRAM allocations will almost > > > certainly > > > require a dma-resv. > > > - Likely allocation size is 2M which makes of size of BO (872) > > > acceptable per allocation (872 / 2M == .0004158). > > > > > > With this, using TTM BO for VRAM backing store seems to be an > > > obvious > > > choice as it allows leveraging of the TTM eviction code. > > > > > > Current migration policy is migrate any SVM range greater than or > > > equal > > > to 64k once. > > > > > > [1] https://patchwork.freedesktop.org/series/133643/ > > > > > > v2: > > > - Rebase on latest GPU SVM > > > - Retry page fault on get pages returning mixed allocation > > > - Use drm_gpusvm_devmem > > > v3: > > > - Use new BO flags > > > - New range structure (Thomas) > > > - Hide migration behind Kconfig > > > - Kernel doc (Thomas) > > > - Use check_pages_threshold > > > v4: > > > - Don't evict partial unmaps in garbage collector (Thomas) > > > - Use %pe to print errors (Thomas) > > > - Use %p to print pointers (Thomas) > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > --- > > > drivers/gpu/drm/xe/xe_svm.c | 99 > > > +++++++++++++++++++++++++++++++++++-- > > > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > > > 2 files changed, 100 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c > > > b/drivers/gpu/drm/xe/xe_svm.c > > > index ba1db030bf33..fc030855d078 100644 > > > --- a/drivers/gpu/drm/xe/xe_svm.c > > > +++ b/drivers/gpu/drm/xe/xe_svm.c > > > @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct > > > drm_gpusvm_devmem *devmem_allocatio > > > return 0; > > > } > > > -__maybe_unused > > > static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { > > > .devmem_release = xe_svm_devmem_release, > > > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > > > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct > > > xe_svm_range *range, > > > return (range->tile_present & ~range->tile_invalidated) > > > & BIT(tile->id); > > > } > > > +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) > > > +{ > > > + return &tile->mem.vram; > > > +} > > > + > > > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct > > > xe_tile *tile, > > > + struct xe_svm_range > > > *range, > > > + const struct > > > drm_gpusvm_ctx *ctx) > > > +{ > > > + struct xe_mem_region *mr = tile_to_mr(tile); > > > + struct drm_buddy_block *block; > > > + struct list_head *blocks; > > > + struct xe_bo *bo; > > > + ktime_t end = 0; > > > + int err; > > > + > > > +retry: > > > + xe_vm_lock(vm, false); > > > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range- > > > >base.itree.last + 1 - > > > + range->base.itree.start, > > > ttm_bo_type_device, > > > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > > > + XE_BO_FLAG_CPU_ADDR_MIRROR); > > > + xe_vm_unlock(vm); > > > > What was the trick again to ensure eviction is not triggered at > > this point? > > I thought there was some trick with eviction_valuable() but I can't > > find it. > > > > I dropped that given the hacky nature of how it was implemented. Yes, > it > is possible that we allocate VRAM and it is immediately evicted > before > the bind occurs but in practice should never really happen given this > BO > should be the last entry on the LRU list. Even if this happens, I > believe this is harmless given the bind will abort and trigger a > retry. It might be worth mentioning that in the multidevice series, we create an external bo and hold on to the lock until the struct mm_struct is populated with the underlying pages. After that, but before populating the gpu_vm, eviction can happen. But since now the cpu page-table is populated, that would trigger a notifier callback and a seqno update and as Matt says, the bind will be aborted and retried. /Thomas > > Matt > > > > + if (IS_ERR(bo)) { > > > + err = PTR_ERR(bo); > > > + if (xe_vm_validate_should_retry(NULL, err, > > > &end)) > > > + goto retry; > > > + return bo; > > > + } > > > + > > > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > > > + vm->xe->drm.dev, vm- > > > >svm.gpusvm.mm, > > > + &gpusvm_devmem_ops, > > > + &tile->mem.vram.dpagemap, > > > + range->base.itree.last + 1 - > > > + range->base.itree.start); > > > + > > > + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)- > > > >blocks; > > > + list_for_each_entry(block, blocks, link) > > > + block->private = mr; > > > + > > > + /* > > > + * Take ref because as soon as > > > drm_gpusvm_migrate_to_devmem succeeds the > > > + * creation ref can be dropped upon CPU fault or unmap. > > > + */ > > > + xe_bo_get(bo); > > > + > > > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, > > > &range->base, > > > + &bo- > > > >devmem_allocation, ctx); > > > + if (err) { > > > + xe_bo_put(bo); /* Local ref */ > > > + xe_bo_put(bo); /* Creation ref */ > > > + return ERR_PTR(err); > > > + } > > > + > > > + return bo; > > > +} > > > + > > > /** > > > * xe_svm_handle_pagefault() - SVM handle page fault > > > * @vm: The VM. > > > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct > > > xe_svm_range *range, > > > * @fault_addr: The GPU fault address. > > > * @atomic: The fault atomic access bit. > > > * > > > - * Create GPU bindings for a SVM page fault. > > > + * Create GPU bindings for a SVM page fault. Optionally migrate > > > to device > > > + * memory. > > > * > > > * Return: 0 on success, negative error code on error. > > > */ > > > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm > > > *vm, struct xe_vma *vma, > > > struct xe_tile *tile, u64 > > > fault_addr, > > > bool atomic) > > > { > > > - struct drm_gpusvm_ctx ctx = { .read_only = > > > xe_vma_read_only(vma), }; > > > + struct drm_gpusvm_ctx ctx = { > > > + .read_only = xe_vma_read_only(vma), > > > + .devmem_possible = IS_DGFX(vm->xe) && > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), > > > + .check_pages_threshold = IS_DGFX(vm->xe) && > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) > > > ? SZ_64K : 0, > > > + }; > > > struct xe_svm_range *range; > > > struct drm_gpusvm_range *r; > > > struct drm_exec exec; > > > struct dma_fence *fence; > > > + struct xe_bo *bo = NULL; > > > ktime_t end = 0; > > > int err; > > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, > > > struct xe_vma *vma, > > > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > > retry: > > > + xe_bo_put(bo); > > > + bo = NULL; > > > + > > > /* Always process UNMAPs first so view SVM ranges is > > > current */ > > > err = xe_svm_garbage_collector(vm); > > > if (err) > > > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm > > > *vm, struct xe_vma *vma, > > > if (xe_svm_range_is_valid(range, tile)) > > > return 0; > > > + /* XXX: Add migration policy, for now migrate range once > > > */ > > > + if (!range->migrated && range->base.flags.migrate_devmem > > > && > > > + (range->base.itree.last + 1 - range- > > > >base.itree.start) >= SZ_64K) { > > > + range->migrated = true; > > > + > > > + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); > > > + if (IS_ERR(bo)) { > > > + drm_info(&vm->xe->drm, > > > + "VRAM allocation failed, > > > falling back to retrying, asid=%u, errno %pe\n", > > > + vm->usm.asid, bo); > > > + bo = NULL; > > > + goto retry; > > > + } > > > + } > > > + > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, > > > &ctx); > > > - if (err == -EFAULT || err == -EPERM) /* Corner where > > > CPU mappings have changed */ > > > + /* Corner where CPU mappings have changed */ > > > + if (err == -EOPNOTSUPP || err == -EFAULT || err == - > > > EPERM) { > > > + if (err == -EOPNOTSUPP) > > > + drm_gpusvm_range_evict(&vm->svm.gpusvm, > > > &range->base); > > > + drm_info(&vm->xe->drm, > > > + "Get pages failed, falling back to > > > retrying, asid=%u, gpusvm=%p, errno %pe\n", > > > + vm->usm.asid, &vm->svm.gpusvm, > > > ERR_PTR(err)); > > > goto retry; > > > + } > > > if (err) > > > goto err_out; > > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, > > > struct xe_vma *vma, > > > dma_fence_put(fence); > > > err_out: > > > + xe_bo_put(bo); > > > return err; > > > } > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h > > > b/drivers/gpu/drm/xe/xe_svm.h > > > index 63daffdfdbf6..4c2576162c39 100644 > > > --- a/drivers/gpu/drm/xe/xe_svm.h > > > +++ b/drivers/gpu/drm/xe/xe_svm.h > > > @@ -35,6 +35,11 @@ struct xe_svm_range { > > > * range. Protected by GPU SVM notifier lock. > > > */ > > > u8 tile_invalidated; > > > + /** > > > + * @migrated: Range has been migrated to device memory, > > > protected by > > > + * GPU fault handler locking. > > > + */ > > > + u8 migrated :1; > > > }; > > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region > > > *mr); > >
On 30/01/2025 16:32, Matthew Brost wrote: > On Thu, Jan 30, 2025 at 02:22:55PM +0000, Matthew Auld wrote: >> On 29/01/2025 19:52, Matthew Brost wrote: >>> Migration is implemented with range granularity, with VRAM backing being >>> a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime of the >>> TTM BO is limited to when the SVM range is in VRAM (i.e., when a VRAM >>> SVM range is migrated to SRAM, the TTM BO is destroyed). >>> >>> The design choice for using TTM BO for VRAM backing store, as opposed to >>> direct buddy allocation, is as follows: >>> >>> - DRM buddy allocations are not at page granularity, offering no >>> advantage over a BO. >>> - Unified eviction is required (SVM VRAM and TTM BOs need to be able to >>> evict each other). >>> - For exhaustive eviction [1], SVM VRAM allocations will almost certainly >>> require a dma-resv. >>> - Likely allocation size is 2M which makes of size of BO (872) >>> acceptable per allocation (872 / 2M == .0004158). >>> >>> With this, using TTM BO for VRAM backing store seems to be an obvious >>> choice as it allows leveraging of the TTM eviction code. >>> >>> Current migration policy is migrate any SVM range greater than or equal >>> to 64k once. >>> >>> [1] https://patchwork.freedesktop.org/series/133643/ >>> >>> v2: >>> - Rebase on latest GPU SVM >>> - Retry page fault on get pages returning mixed allocation >>> - Use drm_gpusvm_devmem >>> v3: >>> - Use new BO flags >>> - New range structure (Thomas) >>> - Hide migration behind Kconfig >>> - Kernel doc (Thomas) >>> - Use check_pages_threshold >>> v4: >>> - Don't evict partial unmaps in garbage collector (Thomas) >>> - Use %pe to print errors (Thomas) >>> - Use %p to print pointers (Thomas) >>> >>> Signed-off-by: Matthew Brost <matthew.brost@intel.com> >>> --- >>> drivers/gpu/drm/xe/xe_svm.c | 99 +++++++++++++++++++++++++++++++++++-- >>> drivers/gpu/drm/xe/xe_svm.h | 5 ++ >>> 2 files changed, 100 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c >>> index ba1db030bf33..fc030855d078 100644 >>> --- a/drivers/gpu/drm/xe/xe_svm.c >>> +++ b/drivers/gpu/drm/xe/xe_svm.c >>> @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem *devmem_allocatio >>> return 0; >>> } >>> -__maybe_unused >>> static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { >>> .devmem_release = xe_svm_devmem_release, >>> .populate_devmem_pfn = xe_svm_populate_devmem_pfn, >>> @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, >>> return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id); >>> } >>> +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) >>> +{ >>> + return &tile->mem.vram; >>> +} >>> + >>> +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile, >>> + struct xe_svm_range *range, >>> + const struct drm_gpusvm_ctx *ctx) >>> +{ >>> + struct xe_mem_region *mr = tile_to_mr(tile); >>> + struct drm_buddy_block *block; >>> + struct list_head *blocks; >>> + struct xe_bo *bo; >>> + ktime_t end = 0; >>> + int err; >>> + >>> +retry: >>> + xe_vm_lock(vm, false); >>> + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range->base.itree.last + 1 - >>> + range->base.itree.start, ttm_bo_type_device, >>> + XE_BO_FLAG_VRAM_IF_DGFX(tile) | >>> + XE_BO_FLAG_CPU_ADDR_MIRROR); >>> + xe_vm_unlock(vm); >> >> What was the trick again to ensure eviction is not triggered at this point? >> I thought there was some trick with eviction_valuable() but I can't find it. >> > > I dropped that given the hacky nature of how it was implemented. Yes, it > is possible that we allocate VRAM and it is immediately evicted before > the bind occurs but in practice should never really happen given this BO > should be the last entry on the LRU list. Even if this happens, I > believe this is harmless given the bind will abort and trigger a retry. Looking at xe_svm_bo_evict() it wants to use stuff like bo->devmem_allocation, but that is not set up yet? For example dereferencing the devmem_allocation->mm from there will potentially hit a NPD? > > Matt > >>> + if (IS_ERR(bo)) { >>> + err = PTR_ERR(bo); >>> + if (xe_vm_validate_should_retry(NULL, err, &end)) >>> + goto retry; >>> + return bo; >>> + } >>> + >>> + drm_gpusvm_devmem_init(&bo->devmem_allocation, >>> + vm->xe->drm.dev, vm->svm.gpusvm.mm, >>> + &gpusvm_devmem_ops, >>> + &tile->mem.vram.dpagemap, >>> + range->base.itree.last + 1 - >>> + range->base.itree.start); >>> + >>> + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; >>> + list_for_each_entry(block, blocks, link) >>> + block->private = mr; >>> + >>> + /* >>> + * Take ref because as soon as drm_gpusvm_migrate_to_devmem succeeds the >>> + * creation ref can be dropped upon CPU fault or unmap. >>> + */ >>> + xe_bo_get(bo); >>> + >>> + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, &range->base, >>> + &bo->devmem_allocation, ctx); >>> + if (err) { >>> + xe_bo_put(bo); /* Local ref */ >>> + xe_bo_put(bo); /* Creation ref */ >>> + return ERR_PTR(err); >>> + } >>> + >>> + return bo; >>> +} >>> + >>> /** >>> * xe_svm_handle_pagefault() - SVM handle page fault >>> * @vm: The VM. >>> @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, >>> * @fault_addr: The GPU fault address. >>> * @atomic: The fault atomic access bit. >>> * >>> - * Create GPU bindings for a SVM page fault. >>> + * Create GPU bindings for a SVM page fault. Optionally migrate to device >>> + * memory. >>> * >>> * Return: 0 on success, negative error code on error. >>> */ >>> @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, >>> struct xe_tile *tile, u64 fault_addr, >>> bool atomic) >>> { >>> - struct drm_gpusvm_ctx ctx = { .read_only = xe_vma_read_only(vma), }; >>> + struct drm_gpusvm_ctx ctx = { >>> + .read_only = xe_vma_read_only(vma), >>> + .devmem_possible = IS_DGFX(vm->xe) && >>> + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), >>> + .check_pages_threshold = IS_DGFX(vm->xe) && >>> + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0, >>> + }; >>> struct xe_svm_range *range; >>> struct drm_gpusvm_range *r; >>> struct drm_exec exec; >>> struct dma_fence *fence; >>> + struct xe_bo *bo = NULL; >>> ktime_t end = 0; >>> int err; >>> @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, >>> xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); >>> retry: >>> + xe_bo_put(bo); >>> + bo = NULL; >>> + >>> /* Always process UNMAPs first so view SVM ranges is current */ >>> err = xe_svm_garbage_collector(vm); >>> if (err) >>> @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, >>> if (xe_svm_range_is_valid(range, tile)) >>> return 0; >>> + /* XXX: Add migration policy, for now migrate range once */ >>> + if (!range->migrated && range->base.flags.migrate_devmem && >>> + (range->base.itree.last + 1 - range->base.itree.start) >= SZ_64K) { >>> + range->migrated = true; >>> + >>> + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); >>> + if (IS_ERR(bo)) { >>> + drm_info(&vm->xe->drm, >>> + "VRAM allocation failed, falling back to retrying, asid=%u, errno %pe\n", >>> + vm->usm.asid, bo); >>> + bo = NULL; >>> + goto retry; >>> + } >>> + } >>> + >>> err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx); >>> - if (err == -EFAULT || err == -EPERM) /* Corner where CPU mappings have changed */ >>> + /* Corner where CPU mappings have changed */ >>> + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) { >>> + if (err == -EOPNOTSUPP) >>> + drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base); >>> + drm_info(&vm->xe->drm, >>> + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno %pe\n", >>> + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err)); >>> goto retry; >>> + } >>> if (err) >>> goto err_out; >>> @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, >>> dma_fence_put(fence); >>> err_out: >>> + xe_bo_put(bo); >>> return err; >>> } >>> diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h >>> index 63daffdfdbf6..4c2576162c39 100644 >>> --- a/drivers/gpu/drm/xe/xe_svm.h >>> +++ b/drivers/gpu/drm/xe/xe_svm.h >>> @@ -35,6 +35,11 @@ struct xe_svm_range { >>> * range. Protected by GPU SVM notifier lock. >>> */ >>> u8 tile_invalidated; >>> + /** >>> + * @migrated: Range has been migrated to device memory, protected by >>> + * GPU fault handler locking. >>> + */ >>> + u8 migrated :1; >>> }; >>> int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr); >>
On Thu, Jan 30, 2025 at 04:56:39PM +0000, Matthew Auld wrote: > On 30/01/2025 16:32, Matthew Brost wrote: > > On Thu, Jan 30, 2025 at 02:22:55PM +0000, Matthew Auld wrote: > > > On 29/01/2025 19:52, Matthew Brost wrote: > > > > Migration is implemented with range granularity, with VRAM backing being > > > > a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime of the > > > > TTM BO is limited to when the SVM range is in VRAM (i.e., when a VRAM > > > > SVM range is migrated to SRAM, the TTM BO is destroyed). > > > > > > > > The design choice for using TTM BO for VRAM backing store, as opposed to > > > > direct buddy allocation, is as follows: > > > > > > > > - DRM buddy allocations are not at page granularity, offering no > > > > advantage over a BO. > > > > - Unified eviction is required (SVM VRAM and TTM BOs need to be able to > > > > evict each other). > > > > - For exhaustive eviction [1], SVM VRAM allocations will almost certainly > > > > require a dma-resv. > > > > - Likely allocation size is 2M which makes of size of BO (872) > > > > acceptable per allocation (872 / 2M == .0004158). > > > > > > > > With this, using TTM BO for VRAM backing store seems to be an obvious > > > > choice as it allows leveraging of the TTM eviction code. > > > > > > > > Current migration policy is migrate any SVM range greater than or equal > > > > to 64k once. > > > > > > > > [1] https://patchwork.freedesktop.org/series/133643/ > > > > > > > > v2: > > > > - Rebase on latest GPU SVM > > > > - Retry page fault on get pages returning mixed allocation > > > > - Use drm_gpusvm_devmem > > > > v3: > > > > - Use new BO flags > > > > - New range structure (Thomas) > > > > - Hide migration behind Kconfig > > > > - Kernel doc (Thomas) > > > > - Use check_pages_threshold > > > > v4: > > > > - Don't evict partial unmaps in garbage collector (Thomas) > > > > - Use %pe to print errors (Thomas) > > > > - Use %p to print pointers (Thomas) > > > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > > --- > > > > drivers/gpu/drm/xe/xe_svm.c | 99 +++++++++++++++++++++++++++++++++++-- > > > > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > > > > 2 files changed, 100 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c > > > > index ba1db030bf33..fc030855d078 100644 > > > > --- a/drivers/gpu/drm/xe/xe_svm.c > > > > +++ b/drivers/gpu/drm/xe/xe_svm.c > > > > @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem *devmem_allocatio > > > > return 0; > > > > } > > > > -__maybe_unused > > > > static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { > > > > .devmem_release = xe_svm_devmem_release, > > > > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > > > > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, > > > > return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id); > > > > } > > > > +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) > > > > +{ > > > > + return &tile->mem.vram; > > > > +} > > > > + > > > > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile, > > > > + struct xe_svm_range *range, > > > > + const struct drm_gpusvm_ctx *ctx) > > > > +{ > > > > + struct xe_mem_region *mr = tile_to_mr(tile); > > > > + struct drm_buddy_block *block; > > > > + struct list_head *blocks; > > > > + struct xe_bo *bo; > > > > + ktime_t end = 0; > > > > + int err; > > > > + > > > > +retry: > > > > + xe_vm_lock(vm, false); > > > > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range->base.itree.last + 1 - > > > > + range->base.itree.start, ttm_bo_type_device, > > > > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > > > > + XE_BO_FLAG_CPU_ADDR_MIRROR); > > > > + xe_vm_unlock(vm); > > > > > > What was the trick again to ensure eviction is not triggered at this point? > > > I thought there was some trick with eviction_valuable() but I can't find it. > > > > > > > I dropped that given the hacky nature of how it was implemented. Yes, it > > is possible that we allocate VRAM and it is immediately evicted before > > the bind occurs but in practice should never really happen given this BO > > should be the last entry on the LRU list. Even if this happens, I > > believe this is harmless given the bind will abort and trigger a retry. > > Looking at xe_svm_bo_evict() it wants to use stuff like > bo->devmem_allocation, but that is not set up yet? For example > dereferencing the devmem_allocation->mm from there will potentially hit a > NPD? Good catch. I think drm_gpusvm_devmem_init at least needs to be moved under BO's dma resv lock. The multi-GPU work Thomas is doing will even expand this scope further to include drm_gpusvm_migrate_to_devmem under the BO dma-resv too - this was ommitted in this series given we'd have to rework the mmap read lock a bit too which I'd prefer to wait on until his series. Matt > > > > > Matt > > > > > > + if (IS_ERR(bo)) { > > > > + err = PTR_ERR(bo); > > > > + if (xe_vm_validate_should_retry(NULL, err, &end)) > > > > + goto retry; > > > > + return bo; > > > > + } > > > > + > > > > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > > > > + vm->xe->drm.dev, vm->svm.gpusvm.mm, > > > > + &gpusvm_devmem_ops, > > > > + &tile->mem.vram.dpagemap, > > > > + range->base.itree.last + 1 - > > > > + range->base.itree.start); > > > > + > > > > + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; > > > > + list_for_each_entry(block, blocks, link) > > > > + block->private = mr; > > > > + > > > > + /* > > > > + * Take ref because as soon as drm_gpusvm_migrate_to_devmem succeeds the > > > > + * creation ref can be dropped upon CPU fault or unmap. > > > > + */ > > > > + xe_bo_get(bo); > > > > + > > > > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, &range->base, > > > > + &bo->devmem_allocation, ctx); > > > > + if (err) { > > > > + xe_bo_put(bo); /* Local ref */ > > > > + xe_bo_put(bo); /* Creation ref */ > > > > + return ERR_PTR(err); > > > > + } > > > > + > > > > + return bo; > > > > +} > > > > + > > > > /** > > > > * xe_svm_handle_pagefault() - SVM handle page fault > > > > * @vm: The VM. > > > > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, > > > > * @fault_addr: The GPU fault address. > > > > * @atomic: The fault atomic access bit. > > > > * > > > > - * Create GPU bindings for a SVM page fault. > > > > + * Create GPU bindings for a SVM page fault. Optionally migrate to device > > > > + * memory. > > > > * > > > > * Return: 0 on success, negative error code on error. > > > > */ > > > > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > > > struct xe_tile *tile, u64 fault_addr, > > > > bool atomic) > > > > { > > > > - struct drm_gpusvm_ctx ctx = { .read_only = xe_vma_read_only(vma), }; > > > > + struct drm_gpusvm_ctx ctx = { > > > > + .read_only = xe_vma_read_only(vma), > > > > + .devmem_possible = IS_DGFX(vm->xe) && > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), > > > > + .check_pages_threshold = IS_DGFX(vm->xe) && > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0, > > > > + }; > > > > struct xe_svm_range *range; > > > > struct drm_gpusvm_range *r; > > > > struct drm_exec exec; > > > > struct dma_fence *fence; > > > > + struct xe_bo *bo = NULL; > > > > ktime_t end = 0; > > > > int err; > > > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > > > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > > > retry: > > > > + xe_bo_put(bo); > > > > + bo = NULL; > > > > + > > > > /* Always process UNMAPs first so view SVM ranges is current */ > > > > err = xe_svm_garbage_collector(vm); > > > > if (err) > > > > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > > > if (xe_svm_range_is_valid(range, tile)) > > > > return 0; > > > > + /* XXX: Add migration policy, for now migrate range once */ > > > > + if (!range->migrated && range->base.flags.migrate_devmem && > > > > + (range->base.itree.last + 1 - range->base.itree.start) >= SZ_64K) { > > > > + range->migrated = true; > > > > + > > > > + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); > > > > + if (IS_ERR(bo)) { > > > > + drm_info(&vm->xe->drm, > > > > + "VRAM allocation failed, falling back to retrying, asid=%u, errno %pe\n", > > > > + vm->usm.asid, bo); > > > > + bo = NULL; > > > > + goto retry; > > > > + } > > > > + } > > > > + > > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx); > > > > - if (err == -EFAULT || err == -EPERM) /* Corner where CPU mappings have changed */ > > > > + /* Corner where CPU mappings have changed */ > > > > + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) { > > > > + if (err == -EOPNOTSUPP) > > > > + drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base); > > > > + drm_info(&vm->xe->drm, > > > > + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno %pe\n", > > > > + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err)); > > > > goto retry; > > > > + } > > > > if (err) > > > > goto err_out; > > > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, > > > > dma_fence_put(fence); > > > > err_out: > > > > + xe_bo_put(bo); > > > > return err; > > > > } > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h > > > > index 63daffdfdbf6..4c2576162c39 100644 > > > > --- a/drivers/gpu/drm/xe/xe_svm.h > > > > +++ b/drivers/gpu/drm/xe/xe_svm.h > > > > @@ -35,6 +35,11 @@ struct xe_svm_range { > > > > * range. Protected by GPU SVM notifier lock. > > > > */ > > > > u8 tile_invalidated; > > > > + /** > > > > + * @migrated: Range has been migrated to device memory, protected by > > > > + * GPU fault handler locking. > > > > + */ > > > > + u8 migrated :1; > > > > }; > > > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr); > > > >
On Thu, 2025-01-30 at 09:31 -0800, Matthew Brost wrote: > On Thu, Jan 30, 2025 at 04:56:39PM +0000, Matthew Auld wrote: > > On 30/01/2025 16:32, Matthew Brost wrote: > > > On Thu, Jan 30, 2025 at 02:22:55PM +0000, Matthew Auld wrote: > > > > On 29/01/2025 19:52, Matthew Brost wrote: > > > > > Migration is implemented with range granularity, with VRAM > > > > > backing being > > > > > a VM private TTM BO (i.e., shares dma-resv with VM). The > > > > > lifetime of the > > > > > TTM BO is limited to when the SVM range is in VRAM (i.e., > > > > > when a VRAM > > > > > SVM range is migrated to SRAM, the TTM BO is destroyed). > > > > > > > > > > The design choice for using TTM BO for VRAM backing store, as > > > > > opposed to > > > > > direct buddy allocation, is as follows: > > > > > > > > > > - DRM buddy allocations are not at page granularity, offering > > > > > no > > > > > advantage over a BO. > > > > > - Unified eviction is required (SVM VRAM and TTM BOs need to > > > > > be able to > > > > > evict each other). > > > > > - For exhaustive eviction [1], SVM VRAM allocations will > > > > > almost certainly > > > > > require a dma-resv. > > > > > - Likely allocation size is 2M which makes of size of BO > > > > > (872) > > > > > acceptable per allocation (872 / 2M == .0004158). > > > > > > > > > > With this, using TTM BO for VRAM backing store seems to be an > > > > > obvious > > > > > choice as it allows leveraging of the TTM eviction code. > > > > > > > > > > Current migration policy is migrate any SVM range greater > > > > > than or equal > > > > > to 64k once. > > > > > > > > > > [1] https://patchwork.freedesktop.org/series/133643/ > > > > > > > > > > v2: > > > > > - Rebase on latest GPU SVM > > > > > - Retry page fault on get pages returning mixed allocation > > > > > - Use drm_gpusvm_devmem > > > > > v3: > > > > > - Use new BO flags > > > > > - New range structure (Thomas) > > > > > - Hide migration behind Kconfig > > > > > - Kernel doc (Thomas) > > > > > - Use check_pages_threshold > > > > > v4: > > > > > - Don't evict partial unmaps in garbage collector (Thomas) > > > > > - Use %pe to print errors (Thomas) > > > > > - Use %p to print pointers (Thomas) > > > > > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > > > --- > > > > > drivers/gpu/drm/xe/xe_svm.c | 99 > > > > > +++++++++++++++++++++++++++++++++++-- > > > > > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > > > > > 2 files changed, 100 insertions(+), 4 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c > > > > > b/drivers/gpu/drm/xe/xe_svm.c > > > > > index ba1db030bf33..fc030855d078 100644 > > > > > --- a/drivers/gpu/drm/xe/xe_svm.c > > > > > +++ b/drivers/gpu/drm/xe/xe_svm.c > > > > > @@ -502,7 +502,6 @@ static int > > > > > xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem > > > > > *devmem_allocatio > > > > > return 0; > > > > > } > > > > > -__maybe_unused > > > > > static const struct drm_gpusvm_devmem_ops > > > > > gpusvm_devmem_ops = { > > > > > .devmem_release = xe_svm_devmem_release, > > > > > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > > > > > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct > > > > > xe_svm_range *range, > > > > > return (range->tile_present & ~range- > > > > > >tile_invalidated) & BIT(tile->id); > > > > > } > > > > > +static struct xe_mem_region *tile_to_mr(struct xe_tile > > > > > *tile) > > > > > +{ > > > > > + return &tile->mem.vram; > > > > > +} > > > > > + > > > > > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, > > > > > struct xe_tile *tile, > > > > > + struct xe_svm_range > > > > > *range, > > > > > + const struct > > > > > drm_gpusvm_ctx *ctx) > > > > > +{ > > > > > + struct xe_mem_region *mr = tile_to_mr(tile); > > > > > + struct drm_buddy_block *block; > > > > > + struct list_head *blocks; > > > > > + struct xe_bo *bo; > > > > > + ktime_t end = 0; > > > > > + int err; > > > > > + > > > > > +retry: > > > > > + xe_vm_lock(vm, false); > > > > > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range- > > > > > >base.itree.last + 1 - > > > > > + range->base.itree.start, > > > > > ttm_bo_type_device, > > > > > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > > > > > + XE_BO_FLAG_CPU_ADDR_MIRROR); > > > > > + xe_vm_unlock(vm); > > > > > > > > What was the trick again to ensure eviction is not triggered at > > > > this point? > > > > I thought there was some trick with eviction_valuable() but I > > > > can't find it. > > > > > > > > > > I dropped that given the hacky nature of how it was implemented. > > > Yes, it > > > is possible that we allocate VRAM and it is immediately evicted > > > before > > > the bind occurs but in practice should never really happen given > > > this BO > > > should be the last entry on the LRU list. Even if this happens, I > > > believe this is harmless given the bind will abort and trigger a > > > retry. > > > > Looking at xe_svm_bo_evict() it wants to use stuff like > > bo->devmem_allocation, but that is not set up yet? For example > > dereferencing the devmem_allocation->mm from there will potentially > > hit a > > NPD? > > Good catch. I think drm_gpusvm_devmem_init at least needs to be moved > under BO's dma resv lock. > > The multi-GPU work Thomas is doing will even expand this scope > further > to include drm_gpusvm_migrate_to_devmem under the BO dma-resv too - > this > was ommitted in this series given we'd have to rework the mmap read > lock > a bit too which I'd prefer to wait on until his series. TBH, I think all pages need to be present in the CPU page-table before we can release the dma-resv lock. That will ensure the eviction causes an invalidation later than the migration invalidation, and everybody's happy. An alternative until the multi-device series lands could be to pin the bo until the end of the function. That would avoid the locking trickiness. /Thomas > > Matt > > > > > > > > > Matt > > > > > > > > + if (IS_ERR(bo)) { > > > > > + err = PTR_ERR(bo); > > > > > + if (xe_vm_validate_should_retry(NULL, err, > > > > > &end)) > > > > > + goto retry; > > > > > + return bo; > > > > > + } > > > > > + > > > > > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > > > > > + vm->xe->drm.dev, vm- > > > > > >svm.gpusvm.mm, > > > > > + &gpusvm_devmem_ops, > > > > > + &tile->mem.vram.dpagemap, > > > > > + range->base.itree.last + 1 - > > > > > + range->base.itree.start); > > > > > + > > > > > + blocks = &to_xe_ttm_vram_mgr_resource(bo- > > > > > >ttm.resource)->blocks; > > > > > + list_for_each_entry(block, blocks, link) > > > > > + block->private = mr; > > > > > + > > > > > + /* > > > > > + * Take ref because as soon as > > > > > drm_gpusvm_migrate_to_devmem succeeds the > > > > > + * creation ref can be dropped upon CPU fault or > > > > > unmap. > > > > > + */ > > > > > + xe_bo_get(bo); > > > > > + > > > > > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, > > > > > &range->base, > > > > > + &bo- > > > > > >devmem_allocation, ctx); > > > > > + if (err) { > > > > > + xe_bo_put(bo); /* Local ref */ > > > > > + xe_bo_put(bo); /* Creation ref */ > > > > > + return ERR_PTR(err); > > > > > + } > > > > > + > > > > > + return bo; > > > > > +} > > > > > + > > > > > /** > > > > > * xe_svm_handle_pagefault() - SVM handle page fault > > > > > * @vm: The VM. > > > > > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct > > > > > xe_svm_range *range, > > > > > * @fault_addr: The GPU fault address. > > > > > * @atomic: The fault atomic access bit. > > > > > * > > > > > - * Create GPU bindings for a SVM page fault. > > > > > + * Create GPU bindings for a SVM page fault. Optionally > > > > > migrate to device > > > > > + * memory. > > > > > * > > > > > * Return: 0 on success, negative error code on error. > > > > > */ > > > > > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct > > > > > xe_vm *vm, struct xe_vma *vma, > > > > > struct xe_tile *tile, u64 > > > > > fault_addr, > > > > > bool atomic) > > > > > { > > > > > - struct drm_gpusvm_ctx ctx = { .read_only = > > > > > xe_vma_read_only(vma), }; > > > > > + struct drm_gpusvm_ctx ctx = { > > > > > + .read_only = xe_vma_read_only(vma), > > > > > + .devmem_possible = IS_DGFX(vm->xe) && > > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR > > > > > OR), > > > > > + .check_pages_threshold = IS_DGFX(vm->xe) && > > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR > > > > > OR) ? SZ_64K : 0, > > > > > + }; > > > > > struct xe_svm_range *range; > > > > > struct drm_gpusvm_range *r; > > > > > struct drm_exec exec; > > > > > struct dma_fence *fence; > > > > > + struct xe_bo *bo = NULL; > > > > > ktime_t end = 0; > > > > > int err; > > > > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm > > > > > *vm, struct xe_vma *vma, > > > > > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > > > > retry: > > > > > + xe_bo_put(bo); > > > > > + bo = NULL; > > > > > + > > > > > /* Always process UNMAPs first so view SVM ranges is > > > > > current */ > > > > > err = xe_svm_garbage_collector(vm); > > > > > if (err) > > > > > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm > > > > > *vm, struct xe_vma *vma, > > > > > if (xe_svm_range_is_valid(range, tile)) > > > > > return 0; > > > > > + /* XXX: Add migration policy, for now migrate range > > > > > once */ > > > > > + if (!range->migrated && range- > > > > > >base.flags.migrate_devmem && > > > > > + (range->base.itree.last + 1 - range- > > > > > >base.itree.start) >= SZ_64K) { > > > > > + range->migrated = true; > > > > > + > > > > > + bo = xe_svm_alloc_vram(vm, tile, range, > > > > > &ctx); > > > > > + if (IS_ERR(bo)) { > > > > > + drm_info(&vm->xe->drm, > > > > > + "VRAM allocation failed, > > > > > falling back to retrying, asid=%u, errno %pe\n", > > > > > + vm->usm.asid, bo); > > > > > + bo = NULL; > > > > > + goto retry; > > > > > + } > > > > > + } > > > > > + > > > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, > > > > > &ctx); > > > > > - if (err == -EFAULT || err == -EPERM) /* Corner > > > > > where CPU mappings have changed */ > > > > > + /* Corner where CPU mappings have changed */ > > > > > + if (err == -EOPNOTSUPP || err == -EFAULT || err == - > > > > > EPERM) { > > > > > + if (err == -EOPNOTSUPP) > > > > > + drm_gpusvm_range_evict(&vm- > > > > > >svm.gpusvm, &range->base); > > > > > + drm_info(&vm->xe->drm, > > > > > + "Get pages failed, falling back to > > > > > retrying, asid=%u, gpusvm=%p, errno %pe\n", > > > > > + vm->usm.asid, &vm->svm.gpusvm, > > > > > ERR_PTR(err)); > > > > > goto retry; > > > > > + } > > > > > if (err) > > > > > goto err_out; > > > > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm > > > > > *vm, struct xe_vma *vma, > > > > > dma_fence_put(fence); > > > > > err_out: > > > > > + xe_bo_put(bo); > > > > > return err; > > > > > } > > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h > > > > > b/drivers/gpu/drm/xe/xe_svm.h > > > > > index 63daffdfdbf6..4c2576162c39 100644 > > > > > --- a/drivers/gpu/drm/xe/xe_svm.h > > > > > +++ b/drivers/gpu/drm/xe/xe_svm.h > > > > > @@ -35,6 +35,11 @@ struct xe_svm_range { > > > > > * range. Protected by GPU SVM notifier lock. > > > > > */ > > > > > u8 tile_invalidated; > > > > > + /** > > > > > + * @migrated: Range has been migrated to device > > > > > memory, protected by > > > > > + * GPU fault handler locking. > > > > > + */ > > > > > + u8 migrated :1; > > > > > }; > > > > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region > > > > > *mr); > > > > > >
On Thu, Jan 30, 2025 at 07:51:49PM +0100, Thomas Hellström wrote: > On Thu, 2025-01-30 at 09:31 -0800, Matthew Brost wrote: > > On Thu, Jan 30, 2025 at 04:56:39PM +0000, Matthew Auld wrote: > > > On 30/01/2025 16:32, Matthew Brost wrote: > > > > On Thu, Jan 30, 2025 at 02:22:55PM +0000, Matthew Auld wrote: > > > > > On 29/01/2025 19:52, Matthew Brost wrote: > > > > > > Migration is implemented with range granularity, with VRAM > > > > > > backing being > > > > > > a VM private TTM BO (i.e., shares dma-resv with VM). The > > > > > > lifetime of the > > > > > > TTM BO is limited to when the SVM range is in VRAM (i.e., > > > > > > when a VRAM > > > > > > SVM range is migrated to SRAM, the TTM BO is destroyed). > > > > > > > > > > > > The design choice for using TTM BO for VRAM backing store, as > > > > > > opposed to > > > > > > direct buddy allocation, is as follows: > > > > > > > > > > > > - DRM buddy allocations are not at page granularity, offering > > > > > > no > > > > > > advantage over a BO. > > > > > > - Unified eviction is required (SVM VRAM and TTM BOs need to > > > > > > be able to > > > > > > evict each other). > > > > > > - For exhaustive eviction [1], SVM VRAM allocations will > > > > > > almost certainly > > > > > > require a dma-resv. > > > > > > - Likely allocation size is 2M which makes of size of BO > > > > > > (872) > > > > > > acceptable per allocation (872 / 2M == .0004158). > > > > > > > > > > > > With this, using TTM BO for VRAM backing store seems to be an > > > > > > obvious > > > > > > choice as it allows leveraging of the TTM eviction code. > > > > > > > > > > > > Current migration policy is migrate any SVM range greater > > > > > > than or equal > > > > > > to 64k once. > > > > > > > > > > > > [1] https://patchwork.freedesktop.org/series/133643/ > > > > > > > > > > > > v2: > > > > > > - Rebase on latest GPU SVM > > > > > > - Retry page fault on get pages returning mixed allocation > > > > > > - Use drm_gpusvm_devmem > > > > > > v3: > > > > > > - Use new BO flags > > > > > > - New range structure (Thomas) > > > > > > - Hide migration behind Kconfig > > > > > > - Kernel doc (Thomas) > > > > > > - Use check_pages_threshold > > > > > > v4: > > > > > > - Don't evict partial unmaps in garbage collector (Thomas) > > > > > > - Use %pe to print errors (Thomas) > > > > > > - Use %p to print pointers (Thomas) > > > > > > > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > > > > --- > > > > > > drivers/gpu/drm/xe/xe_svm.c | 99 > > > > > > +++++++++++++++++++++++++++++++++++-- > > > > > > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > > > > > > 2 files changed, 100 insertions(+), 4 deletions(-) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c > > > > > > b/drivers/gpu/drm/xe/xe_svm.c > > > > > > index ba1db030bf33..fc030855d078 100644 > > > > > > --- a/drivers/gpu/drm/xe/xe_svm.c > > > > > > +++ b/drivers/gpu/drm/xe/xe_svm.c > > > > > > @@ -502,7 +502,6 @@ static int > > > > > > xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem > > > > > > *devmem_allocatio > > > > > > return 0; > > > > > > } > > > > > > -__maybe_unused > > > > > > static const struct drm_gpusvm_devmem_ops > > > > > > gpusvm_devmem_ops = { > > > > > > .devmem_release = xe_svm_devmem_release, > > > > > > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > > > > > > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct > > > > > > xe_svm_range *range, > > > > > > return (range->tile_present & ~range- > > > > > > >tile_invalidated) & BIT(tile->id); > > > > > > } > > > > > > +static struct xe_mem_region *tile_to_mr(struct xe_tile > > > > > > *tile) > > > > > > +{ > > > > > > + return &tile->mem.vram; > > > > > > +} > > > > > > + > > > > > > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, > > > > > > struct xe_tile *tile, > > > > > > + struct xe_svm_range > > > > > > *range, > > > > > > + const struct > > > > > > drm_gpusvm_ctx *ctx) > > > > > > +{ > > > > > > + struct xe_mem_region *mr = tile_to_mr(tile); > > > > > > + struct drm_buddy_block *block; > > > > > > + struct list_head *blocks; > > > > > > + struct xe_bo *bo; > > > > > > + ktime_t end = 0; > > > > > > + int err; > > > > > > + > > > > > > +retry: > > > > > > + xe_vm_lock(vm, false); > > > > > > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range- > > > > > > >base.itree.last + 1 - > > > > > > + range->base.itree.start, > > > > > > ttm_bo_type_device, > > > > > > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > > > > > > + XE_BO_FLAG_CPU_ADDR_MIRROR); > > > > > > + xe_vm_unlock(vm); > > > > > > > > > > What was the trick again to ensure eviction is not triggered at > > > > > this point? > > > > > I thought there was some trick with eviction_valuable() but I > > > > > can't find it. > > > > > > > > > > > > > I dropped that given the hacky nature of how it was implemented. > > > > Yes, it > > > > is possible that we allocate VRAM and it is immediately evicted > > > > before > > > > the bind occurs but in practice should never really happen given > > > > this BO > > > > should be the last entry on the LRU list. Even if this happens, I > > > > believe this is harmless given the bind will abort and trigger a > > > > retry. > > > > > > Looking at xe_svm_bo_evict() it wants to use stuff like > > > bo->devmem_allocation, but that is not set up yet? For example > > > dereferencing the devmem_allocation->mm from there will potentially > > > hit a > > > NPD? > > > > Good catch. I think drm_gpusvm_devmem_init at least needs to be moved > > under BO's dma resv lock. > > > > The multi-GPU work Thomas is doing will even expand this scope > > further > > to include drm_gpusvm_migrate_to_devmem under the BO dma-resv too - > > this > > was ommitted in this series given we'd have to rework the mmap read > > lock > > a bit too which I'd prefer to wait on until his series. > > TBH, I think all pages need to be present in the CPU page-table before > we can release the dma-resv lock. That will ensure the eviction causes > an invalidation later than the migration invalidation, and everybody's > happy. > Yea, perhaps. Certainly this is safer but I think I reasoned it actually ok given 2 opposing migrate functions lock the individual pages before the migration actually happens. We'd likely end up hitting the retry in the eviction code though due to partial eviction. I do agree in general with taking dma-resv lock here for safety / simplicity. > An alternative until the multi-device series lands could be to pin the > bo until the end of the function. That would avoid the locking > trickiness. > I'd rather just rework the locking the structure with the mmap taken in the Xe layer + a comment indicating moving this lock to a DRM layer is preferred. Matt > /Thomas > > > > > Matt > > > > > > > > > > > > > Matt > > > > > > > > > > + if (IS_ERR(bo)) { > > > > > > + err = PTR_ERR(bo); > > > > > > + if (xe_vm_validate_should_retry(NULL, err, > > > > > > &end)) > > > > > > + goto retry; > > > > > > + return bo; > > > > > > + } > > > > > > + > > > > > > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > > > > > > + vm->xe->drm.dev, vm- > > > > > > >svm.gpusvm.mm, > > > > > > + &gpusvm_devmem_ops, > > > > > > + &tile->mem.vram.dpagemap, > > > > > > + range->base.itree.last + 1 - > > > > > > + range->base.itree.start); > > > > > > + > > > > > > + blocks = &to_xe_ttm_vram_mgr_resource(bo- > > > > > > >ttm.resource)->blocks; > > > > > > + list_for_each_entry(block, blocks, link) > > > > > > + block->private = mr; > > > > > > + > > > > > > + /* > > > > > > + * Take ref because as soon as > > > > > > drm_gpusvm_migrate_to_devmem succeeds the > > > > > > + * creation ref can be dropped upon CPU fault or > > > > > > unmap. > > > > > > + */ > > > > > > + xe_bo_get(bo); > > > > > > + > > > > > > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, > > > > > > &range->base, > > > > > > + &bo- > > > > > > >devmem_allocation, ctx); > > > > > > + if (err) { > > > > > > + xe_bo_put(bo); /* Local ref */ > > > > > > + xe_bo_put(bo); /* Creation ref */ > > > > > > + return ERR_PTR(err); > > > > > > + } > > > > > > + > > > > > > + return bo; > > > > > > +} > > > > > > + > > > > > > /** > > > > > > * xe_svm_handle_pagefault() - SVM handle page fault > > > > > > * @vm: The VM. > > > > > > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct > > > > > > xe_svm_range *range, > > > > > > * @fault_addr: The GPU fault address. > > > > > > * @atomic: The fault atomic access bit. > > > > > > * > > > > > > - * Create GPU bindings for a SVM page fault. > > > > > > + * Create GPU bindings for a SVM page fault. Optionally > > > > > > migrate to device > > > > > > + * memory. > > > > > > * > > > > > > * Return: 0 on success, negative error code on error. > > > > > > */ > > > > > > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct > > > > > > xe_vm *vm, struct xe_vma *vma, > > > > > > struct xe_tile *tile, u64 > > > > > > fault_addr, > > > > > > bool atomic) > > > > > > { > > > > > > - struct drm_gpusvm_ctx ctx = { .read_only = > > > > > > xe_vma_read_only(vma), }; > > > > > > + struct drm_gpusvm_ctx ctx = { > > > > > > + .read_only = xe_vma_read_only(vma), > > > > > > + .devmem_possible = IS_DGFX(vm->xe) && > > > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR > > > > > > OR), > > > > > > + .check_pages_threshold = IS_DGFX(vm->xe) && > > > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR > > > > > > OR) ? SZ_64K : 0, > > > > > > + }; > > > > > > struct xe_svm_range *range; > > > > > > struct drm_gpusvm_range *r; > > > > > > struct drm_exec exec; > > > > > > struct dma_fence *fence; > > > > > > + struct xe_bo *bo = NULL; > > > > > > ktime_t end = 0; > > > > > > int err; > > > > > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm > > > > > > *vm, struct xe_vma *vma, > > > > > > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > > > > > retry: > > > > > > + xe_bo_put(bo); > > > > > > + bo = NULL; > > > > > > + > > > > > > /* Always process UNMAPs first so view SVM ranges is > > > > > > current */ > > > > > > err = xe_svm_garbage_collector(vm); > > > > > > if (err) > > > > > > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm > > > > > > *vm, struct xe_vma *vma, > > > > > > if (xe_svm_range_is_valid(range, tile)) > > > > > > return 0; > > > > > > + /* XXX: Add migration policy, for now migrate range > > > > > > once */ > > > > > > + if (!range->migrated && range- > > > > > > >base.flags.migrate_devmem && > > > > > > + (range->base.itree.last + 1 - range- > > > > > > >base.itree.start) >= SZ_64K) { > > > > > > + range->migrated = true; > > > > > > + > > > > > > + bo = xe_svm_alloc_vram(vm, tile, range, > > > > > > &ctx); > > > > > > + if (IS_ERR(bo)) { > > > > > > + drm_info(&vm->xe->drm, > > > > > > + "VRAM allocation failed, > > > > > > falling back to retrying, asid=%u, errno %pe\n", > > > > > > + vm->usm.asid, bo); > > > > > > + bo = NULL; > > > > > > + goto retry; > > > > > > + } > > > > > > + } > > > > > > + > > > > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, > > > > > > &ctx); > > > > > > - if (err == -EFAULT || err == -EPERM) /* Corner > > > > > > where CPU mappings have changed */ > > > > > > + /* Corner where CPU mappings have changed */ > > > > > > + if (err == -EOPNOTSUPP || err == -EFAULT || err == - > > > > > > EPERM) { > > > > > > + if (err == -EOPNOTSUPP) > > > > > > + drm_gpusvm_range_evict(&vm- > > > > > > >svm.gpusvm, &range->base); > > > > > > + drm_info(&vm->xe->drm, > > > > > > + "Get pages failed, falling back to > > > > > > retrying, asid=%u, gpusvm=%p, errno %pe\n", > > > > > > + vm->usm.asid, &vm->svm.gpusvm, > > > > > > ERR_PTR(err)); > > > > > > goto retry; > > > > > > + } > > > > > > if (err) > > > > > > goto err_out; > > > > > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm > > > > > > *vm, struct xe_vma *vma, > > > > > > dma_fence_put(fence); > > > > > > err_out: > > > > > > + xe_bo_put(bo); > > > > > > return err; > > > > > > } > > > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h > > > > > > b/drivers/gpu/drm/xe/xe_svm.h > > > > > > index 63daffdfdbf6..4c2576162c39 100644 > > > > > > --- a/drivers/gpu/drm/xe/xe_svm.h > > > > > > +++ b/drivers/gpu/drm/xe/xe_svm.h > > > > > > @@ -35,6 +35,11 @@ struct xe_svm_range { > > > > > > * range. Protected by GPU SVM notifier lock. > > > > > > */ > > > > > > u8 tile_invalidated; > > > > > > + /** > > > > > > + * @migrated: Range has been migrated to device > > > > > > memory, protected by > > > > > > + * GPU fault handler locking. > > > > > > + */ > > > > > > + u8 migrated :1; > > > > > > }; > > > > > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region > > > > > > *mr); > > > > > > > > >
On Wed, 2025-01-29 at 11:52 -0800, Matthew Brost wrote: > Migration is implemented with range granularity, with VRAM backing > being > a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime of > the > TTM BO is limited to when the SVM range is in VRAM (i.e., when a VRAM > SVM range is migrated to SRAM, the TTM BO is destroyed). > > The design choice for using TTM BO for VRAM backing store, as opposed > to > direct buddy allocation, is as follows: > > - DRM buddy allocations are not at page granularity, offering no > advantage over a BO. > - Unified eviction is required (SVM VRAM and TTM BOs need to be able > to > evict each other). > - For exhaustive eviction [1], SVM VRAM allocations will almost > certainly > require a dma-resv. > - Likely allocation size is 2M which makes of size of BO (872) > acceptable per allocation (872 / 2M == .0004158). > > With this, using TTM BO for VRAM backing store seems to be an obvious > choice as it allows leveraging of the TTM eviction code. > > Current migration policy is migrate any SVM range greater than or > equal > to 64k once. > > [1] https://patchwork.freedesktop.org/series/133643/ > > v2: > - Rebase on latest GPU SVM > - Retry page fault on get pages returning mixed allocation > - Use drm_gpusvm_devmem > v3: > - Use new BO flags > - New range structure (Thomas) > - Hide migration behind Kconfig > - Kernel doc (Thomas) > - Use check_pages_threshold > v4: > - Don't evict partial unmaps in garbage collector (Thomas) > - Use %pe to print errors (Thomas) > - Use %p to print pointers (Thomas) > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@lists.freedesktop.org> > --- > drivers/gpu/drm/xe/xe_svm.c | 99 > +++++++++++++++++++++++++++++++++++-- > drivers/gpu/drm/xe/xe_svm.h | 5 ++ > 2 files changed, 100 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_svm.c > b/drivers/gpu/drm/xe/xe_svm.c > index ba1db030bf33..fc030855d078 100644 > --- a/drivers/gpu/drm/xe/xe_svm.c > +++ b/drivers/gpu/drm/xe/xe_svm.c > @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct > drm_gpusvm_devmem *devmem_allocatio > return 0; > } > > -__maybe_unused > static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { > .devmem_release = xe_svm_devmem_release, > .populate_devmem_pfn = xe_svm_populate_devmem_pfn, > @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct > xe_svm_range *range, > return (range->tile_present & ~range->tile_invalidated) & > BIT(tile->id); > } > > +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) > +{ > + return &tile->mem.vram; > +} > + > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct > xe_tile *tile, > + struct xe_svm_range *range, > + const struct drm_gpusvm_ctx > *ctx) > +{ > + struct xe_mem_region *mr = tile_to_mr(tile); > + struct drm_buddy_block *block; > + struct list_head *blocks; > + struct xe_bo *bo; > + ktime_t end = 0; > + int err; > + > +retry: > + xe_vm_lock(vm, false); > + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range- > >base.itree.last + 1 - > + range->base.itree.start, > ttm_bo_type_device, > + XE_BO_FLAG_VRAM_IF_DGFX(tile) | > + XE_BO_FLAG_CPU_ADDR_MIRROR); > + xe_vm_unlock(vm); > + if (IS_ERR(bo)) { > + err = PTR_ERR(bo); > + if (xe_vm_validate_should_retry(NULL, err, &end)) > + goto retry; > + return bo; > + } > + > + drm_gpusvm_devmem_init(&bo->devmem_allocation, > + vm->xe->drm.dev, vm->svm.gpusvm.mm, > + &gpusvm_devmem_ops, > + &tile->mem.vram.dpagemap, > + range->base.itree.last + 1 - > + range->base.itree.start); > + > + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)- > >blocks; > + list_for_each_entry(block, blocks, link) > + block->private = mr; > + > + /* > + * Take ref because as soon as drm_gpusvm_migrate_to_devmem > succeeds the > + * creation ref can be dropped upon CPU fault or unmap. > + */ > + xe_bo_get(bo); > + > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, &range- > >base, > + &bo->devmem_allocation, > ctx); > + if (err) { > + xe_bo_put(bo); /* Local ref */ > + xe_bo_put(bo); /* Creation ref */ > + return ERR_PTR(err); > + } > + > + return bo; > +} > + > /** > * xe_svm_handle_pagefault() - SVM handle page fault > * @vm: The VM. > @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct > xe_svm_range *range, > * @fault_addr: The GPU fault address. > * @atomic: The fault atomic access bit. > * > - * Create GPU bindings for a SVM page fault. > + * Create GPU bindings for a SVM page fault. Optionally migrate to > device > + * memory. > * > * Return: 0 on success, negative error code on error. > */ > @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, > struct xe_vma *vma, > struct xe_tile *tile, u64 fault_addr, > bool atomic) > { > - struct drm_gpusvm_ctx ctx = { .read_only = > xe_vma_read_only(vma), }; > + struct drm_gpusvm_ctx ctx = { > + .read_only = xe_vma_read_only(vma), > + .devmem_possible = IS_DGFX(vm->xe) && > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), > + .check_pages_threshold = IS_DGFX(vm->xe) && > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? > SZ_64K : 0, > + }; > struct xe_svm_range *range; > struct drm_gpusvm_range *r; > struct drm_exec exec; > struct dma_fence *fence; > + struct xe_bo *bo = NULL; > ktime_t end = 0; > int err; > > @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, > struct xe_vma *vma, > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); > > retry: > + xe_bo_put(bo); > + bo = NULL; > + > /* Always process UNMAPs first so view SVM ranges is current > */ > err = xe_svm_garbage_collector(vm); > if (err) > @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, > struct xe_vma *vma, > if (xe_svm_range_is_valid(range, tile)) > return 0; > > + /* XXX: Add migration policy, for now migrate range once */ > + if (!range->migrated && range->base.flags.migrate_devmem && > + (range->base.itree.last + 1 - range->base.itree.start) > >= SZ_64K) { > + range->migrated = true; > + > + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); > + if (IS_ERR(bo)) { > + drm_info(&vm->xe->drm, > + "VRAM allocation failed, falling > back to retrying, asid=%u, errno %pe\n", > + vm->usm.asid, bo); > + bo = NULL; > + goto retry; > + } > + } > + > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx); > - if (err == -EFAULT || err == -EPERM) /* Corner where CPU > mappings have changed */ > + /* Corner where CPU mappings have changed */ > + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) { > + if (err == -EOPNOTSUPP) > + drm_gpusvm_range_evict(&vm->svm.gpusvm, > &range->base); > + drm_info(&vm->xe->drm, > + "Get pages failed, falling back to > retrying, asid=%u, gpusvm=%p, errno %pe\n", > + vm->usm.asid, &vm->svm.gpusvm, > ERR_PTR(err)); > goto retry; > + } > if (err) > goto err_out; > > @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, > struct xe_vma *vma, > dma_fence_put(fence); > > err_out: > + xe_bo_put(bo); > > return err; > } > diff --git a/drivers/gpu/drm/xe/xe_svm.h > b/drivers/gpu/drm/xe/xe_svm.h > index 63daffdfdbf6..4c2576162c39 100644 > --- a/drivers/gpu/drm/xe/xe_svm.h > +++ b/drivers/gpu/drm/xe/xe_svm.h > @@ -35,6 +35,11 @@ struct xe_svm_range { > * range. Protected by GPU SVM notifier lock. > */ > u8 tile_invalidated; > + /** > + * @migrated: Range has been migrated to device memory, > protected by > + * GPU fault handler locking. > + */ > + u8 migrated :1; > }; > > int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index ba1db030bf33..fc030855d078 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -502,7 +502,6 @@ static int xe_svm_populate_devmem_pfn(struct drm_gpusvm_devmem *devmem_allocatio return 0; } -__maybe_unused static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = { .devmem_release = xe_svm_devmem_release, .populate_devmem_pfn = xe_svm_populate_devmem_pfn, @@ -582,6 +581,64 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id); } +static struct xe_mem_region *tile_to_mr(struct xe_tile *tile) +{ + return &tile->mem.vram; +} + +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile, + struct xe_svm_range *range, + const struct drm_gpusvm_ctx *ctx) +{ + struct xe_mem_region *mr = tile_to_mr(tile); + struct drm_buddy_block *block; + struct list_head *blocks; + struct xe_bo *bo; + ktime_t end = 0; + int err; + +retry: + xe_vm_lock(vm, false); + bo = xe_bo_create(tile_to_xe(tile), tile, vm, range->base.itree.last + 1 - + range->base.itree.start, ttm_bo_type_device, + XE_BO_FLAG_VRAM_IF_DGFX(tile) | + XE_BO_FLAG_CPU_ADDR_MIRROR); + xe_vm_unlock(vm); + if (IS_ERR(bo)) { + err = PTR_ERR(bo); + if (xe_vm_validate_should_retry(NULL, err, &end)) + goto retry; + return bo; + } + + drm_gpusvm_devmem_init(&bo->devmem_allocation, + vm->xe->drm.dev, vm->svm.gpusvm.mm, + &gpusvm_devmem_ops, + &tile->mem.vram.dpagemap, + range->base.itree.last + 1 - + range->base.itree.start); + + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; + list_for_each_entry(block, blocks, link) + block->private = mr; + + /* + * Take ref because as soon as drm_gpusvm_migrate_to_devmem succeeds the + * creation ref can be dropped upon CPU fault or unmap. + */ + xe_bo_get(bo); + + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm, &range->base, + &bo->devmem_allocation, ctx); + if (err) { + xe_bo_put(bo); /* Local ref */ + xe_bo_put(bo); /* Creation ref */ + return ERR_PTR(err); + } + + return bo; +} + /** * xe_svm_handle_pagefault() - SVM handle page fault * @vm: The VM. @@ -590,7 +647,8 @@ static bool xe_svm_range_is_valid(struct xe_svm_range *range, * @fault_addr: The GPU fault address. * @atomic: The fault atomic access bit. * - * Create GPU bindings for a SVM page fault. + * Create GPU bindings for a SVM page fault. Optionally migrate to device + * memory. * * Return: 0 on success, negative error code on error. */ @@ -598,11 +656,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, struct xe_tile *tile, u64 fault_addr, bool atomic) { - struct drm_gpusvm_ctx ctx = { .read_only = xe_vma_read_only(vma), }; + struct drm_gpusvm_ctx ctx = { + .read_only = xe_vma_read_only(vma), + .devmem_possible = IS_DGFX(vm->xe) && + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR), + .check_pages_threshold = IS_DGFX(vm->xe) && + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0, + }; struct xe_svm_range *range; struct drm_gpusvm_range *r; struct drm_exec exec; struct dma_fence *fence; + struct xe_bo *bo = NULL; ktime_t end = 0; int err; @@ -610,6 +675,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma)); retry: + xe_bo_put(bo); + bo = NULL; + /* Always process UNMAPs first so view SVM ranges is current */ err = xe_svm_garbage_collector(vm); if (err) @@ -625,9 +693,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, if (xe_svm_range_is_valid(range, tile)) return 0; + /* XXX: Add migration policy, for now migrate range once */ + if (!range->migrated && range->base.flags.migrate_devmem && + (range->base.itree.last + 1 - range->base.itree.start) >= SZ_64K) { + range->migrated = true; + + bo = xe_svm_alloc_vram(vm, tile, range, &ctx); + if (IS_ERR(bo)) { + drm_info(&vm->xe->drm, + "VRAM allocation failed, falling back to retrying, asid=%u, errno %pe\n", + vm->usm.asid, bo); + bo = NULL; + goto retry; + } + } + err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx); - if (err == -EFAULT || err == -EPERM) /* Corner where CPU mappings have changed */ + /* Corner where CPU mappings have changed */ + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) { + if (err == -EOPNOTSUPP) + drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base); + drm_info(&vm->xe->drm, + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno %pe\n", + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err)); goto retry; + } if (err) goto err_out; @@ -658,6 +748,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma, dma_fence_put(fence); err_out: + xe_bo_put(bo); return err; } diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 63daffdfdbf6..4c2576162c39 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -35,6 +35,11 @@ struct xe_svm_range { * range. Protected by GPU SVM notifier lock. */ u8 tile_invalidated; + /** + * @migrated: Range has been migrated to device memory, protected by + * GPU fault handler locking. + */ + u8 migrated :1; }; int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
Migration is implemented with range granularity, with VRAM backing being a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime of the TTM BO is limited to when the SVM range is in VRAM (i.e., when a VRAM SVM range is migrated to SRAM, the TTM BO is destroyed). The design choice for using TTM BO for VRAM backing store, as opposed to direct buddy allocation, is as follows: - DRM buddy allocations are not at page granularity, offering no advantage over a BO. - Unified eviction is required (SVM VRAM and TTM BOs need to be able to evict each other). - For exhaustive eviction [1], SVM VRAM allocations will almost certainly require a dma-resv. - Likely allocation size is 2M which makes of size of BO (872) acceptable per allocation (872 / 2M == .0004158). With this, using TTM BO for VRAM backing store seems to be an obvious choice as it allows leveraging of the TTM eviction code. Current migration policy is migrate any SVM range greater than or equal to 64k once. [1] https://patchwork.freedesktop.org/series/133643/ v2: - Rebase on latest GPU SVM - Retry page fault on get pages returning mixed allocation - Use drm_gpusvm_devmem v3: - Use new BO flags - New range structure (Thomas) - Hide migration behind Kconfig - Kernel doc (Thomas) - Use check_pages_threshold v4: - Don't evict partial unmaps in garbage collector (Thomas) - Use %pe to print errors (Thomas) - Use %p to print pointers (Thomas) Signed-off-by: Matthew Brost <matthew.brost@intel.com> --- drivers/gpu/drm/xe/xe_svm.c | 99 +++++++++++++++++++++++++++++++++++-- drivers/gpu/drm/xe/xe_svm.h | 5 ++ 2 files changed, 100 insertions(+), 4 deletions(-)