Message ID | 1441807630-20072-3-git-send-email-michal.winiarski@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Sep 09, 2015 at 04:07:09PM +0200, Micha? Winiarski wrote: > From: Chris Wilson <chris@chris-wilson.co.uk> > > Userspace can pass in an offset that it presumes the object is located > at. The kernel will then do its utmost to fit the object into that > location. The assumption is that userspace is handling its own object > locations (for example along with full-ppgtt) and that the kernel will > rarely have to make space for the user's requests. > > v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris > Wilson. Fixed incorrect error paths causing crash found by Michal Winiarski. > (Not published externally) > > v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed eviction > to allow eviction of soft-pinned objects when another soft-pinned object used > by a subsequent execbuffer overlaps reported by Michal Winiarski. > (Not published externally) > > v4: Moved soft-pinned objects to the front of ordered_vmas so that they are > pinned first after an address conflict happens to avoid repeated conflicts in > rare cases (Suggested by Chris Wilson). Expanded comment on > drm_i915_gem_exec_object2.offset to cover this new API. > > v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability > (Kristian). Added check for multiple pinnings on eviction (Akash). Made sure > buffers are not considered misplaced without the user specifying > EXEC_OBJECT_SUPPORTS_48BBADDRESS. User must assume responsibility for any > addressing workarounds. Updated object2.offset field comment again to clarify > NO_RELOC case (Chris). checkpatch cleanup. > > v6: Rebase on top of current nightly. Dropped any references to > EXEC_OBJECT_SUPPORTS_48BBADDRESS since those don't exist in upstream > yet, and are not a dependency. > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Akash Goel <akash.goel@intel.com> > Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > Cc: Michal Winiarski <michal.winiarski@intel.com> > Cc: Zou Nanhai <nanhai.zou@intel.com> > Cc: Kristian Høgsberg <hoegsberg@gmail.com> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> > Signed-off-by: Micha? Winiarski <michal.winiarski@intel.com> Again, the precursors are not included in this series or upstream, so NAK. -Chris
> -----Original Message----- > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf > Of Chris Wilson > Sent: Wednesday, September 9, 2015 22:25 > To: Winiarski, Michal > Cc: intel-gfx@lists.freedesktop.org; Kristian Høgsberg; dri- > devel@lists.freedesktop.org; Goel, Akash; Belgaumkar, Vinay; mesa- > dev@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH v6 2/2] drm/i915: Add soft-pinning API for > execbuffer > > On Wed, Sep 09, 2015 at 04:07:09PM +0200, Micha? Winiarski wrote: > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > Userspace can pass in an offset that it presumes the object is located > > at. The kernel will then do its utmost to fit the object into that > > location. The assumption is that userspace is handling its own object > > locations (for example along with full-ppgtt) and that the kernel will > > rarely have to make space for the user's requests. > > > > v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested > > by Chris Wilson. Fixed incorrect error paths causing crash found by Michal > Winiarski. > > (Not published externally) > > > > v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed > > eviction to allow eviction of soft-pinned objects when another > > soft-pinned object used by a subsequent execbuffer overlaps reported by > Michal Winiarski. > > (Not published externally) > > > > v4: Moved soft-pinned objects to the front of ordered_vmas so that > > they are pinned first after an address conflict happens to avoid > > repeated conflicts in rare cases (Suggested by Chris Wilson). > > Expanded comment on drm_i915_gem_exec_object2.offset to cover this > new API. > > > > v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this > > capability (Kristian). Added check for multiple pinnings on eviction > > (Akash). Made sure buffers are not considered misplaced without the > > user specifying EXEC_OBJECT_SUPPORTS_48BBADDRESS. User must > assume > > responsibility for any addressing workarounds. Updated object2.offset > > field comment again to clarify NO_RELOC case (Chris). checkpatch cleanup. > > > > v6: Rebase on top of current nightly. Dropped any references to > > EXEC_OBJECT_SUPPORTS_48BBADDRESS since those don't exist in > upstream > > yet, and are not a dependency. > > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Akash Goel <akash.goel@intel.com> > > Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > > Cc: Michal Winiarski <michal.winiarski@intel.com> > > Cc: Zou Nanhai <nanhai.zou@intel.com> > > Cc: Kristian Høgsberg <hoegsberg@gmail.com> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> > > Signed-off-by: Micha? Winiarski <michal.winiarski@intel.com> > > Again, the precursors are not included in this series or upstream, so NAK. > -Chris I have post a beignet's OpenCL2.0 SVM patch based on this patch set. It works well on i386 system. Can you review this patchset again? thanks. > > -- > Chris Wilson, Intel Open Source Technology Centre > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> -----Original Message----- > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of > Yang, Rong R > Sent: Wednesday, November 4, 2015 10:32 AM > To: Chris Wilson; Winiarski, Michal > Cc: intel-gfx@lists.freedesktop.org; Kristian H?gsberg; dri- > devel@lists.freedesktop.org; Goel, Akash; mesa-dev@lists.freedesktop.org; > Belgaumkar, Vinay > Subject: Re: [Intel-gfx] [PATCH v6 2/2] drm/i915: Add soft-pinning API for > execbuffer > > > > > -----Original Message----- > > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf > > Of Chris Wilson > > Sent: Wednesday, September 9, 2015 22:25 > > To: Winiarski, Michal > > Cc: intel-gfx@lists.freedesktop.org; Kristian Høgsberg; dri- > > devel@lists.freedesktop.org; Goel, Akash; Belgaumkar, Vinay; mesa- > > dev@lists.freedesktop.org > > Subject: Re: [Intel-gfx] [PATCH v6 2/2] drm/i915: Add soft-pinning API for > > execbuffer > > > > On Wed, Sep 09, 2015 at 04:07:09PM +0200, Micha? Winiarski wrote: > > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > > > Userspace can pass in an offset that it presumes the object is located > > > at. The kernel will then do its utmost to fit the object into that > > > location. The assumption is that userspace is handling its own object > > > locations (for example along with full-ppgtt) and that the kernel will > > > rarely have to make space for the user's requests. > > > > > > v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested > > > by Chris Wilson. Fixed incorrect error paths causing crash found by Michal > > Winiarski. > > > (Not published externally) > > > > > > v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed > > > eviction to allow eviction of soft-pinned objects when another > > > soft-pinned object used by a subsequent execbuffer overlaps reported by > > Michal Winiarski. > > > (Not published externally) > > > > > > v4: Moved soft-pinned objects to the front of ordered_vmas so that > > > they are pinned first after an address conflict happens to avoid > > > repeated conflicts in rare cases (Suggested by Chris Wilson). > > > Expanded comment on drm_i915_gem_exec_object2.offset to cover this > > new API. > > > > > > v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this > > > capability (Kristian). Added check for multiple pinnings on eviction > > > (Akash). Made sure buffers are not considered misplaced without the > > > user specifying EXEC_OBJECT_SUPPORTS_48BBADDRESS. User must > > assume > > > responsibility for any addressing workarounds. Updated object2.offset > > > field comment again to clarify NO_RELOC case (Chris). checkpatch cleanup. > > > > > > v6: Rebase on top of current nightly. Dropped any references to > > > EXEC_OBJECT_SUPPORTS_48BBADDRESS since those don't exist in > > upstream > > > yet, and are not a dependency. > > > > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > Cc: Akash Goel <akash.goel@intel.com> > > > Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > > > Cc: Michal Winiarski <michal.winiarski@intel.com> > > > Cc: Zou Nanhai <nanhai.zou@intel.com> > > > Cc: Kristian Høgsberg <hoegsberg@gmail.com> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> > > > Signed-off-by: Micha? Winiarski <michal.winiarski@intel.com> > > > > Again, the precursors are not included in this series or upstream, so NAK. > > -Chris > I have post a beignet's OpenCL2.0 SVM patch based on this patch set. It works > well on i386 system. > Can you review this patchset again? thanks. Chris posted a new version which we want to use instead. Akash has posted a comment on it already. http://patchwork.freedesktop.org/patch/61122/ Thomas.
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 066a0ef..bd619af 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -170,6 +170,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_RESOURCE_STREAMER: value = HAS_RESOURCE_STREAMER(dev); break; + case I915_PARAM_HAS_EXEC_SOFTPIN: + value = 1; + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2ef83c5..8eb01e0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2815,6 +2815,7 @@ void i915_gem_vma_destroy(struct i915_vma *vma); #define PIN_OFFSET_BIAS (1<<3) #define PIN_USER (1<<4) #define PIN_UPDATE (1<<5) +#define PIN_OFFSET_FIXED (1<<6) #define PIN_OFFSET_MASK (~4095) int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, @@ -3157,6 +3158,8 @@ int __must_check i915_gem_evict_something(struct drm_device *dev, unsigned long start, unsigned long end, unsigned flags); +int __must_check +i915_gem_evict_for_vma(struct i915_vma *target); int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle); int i915_gem_evict_everything(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 41263cd..bb2ff81 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3437,22 +3437,42 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, if (IS_ERR(vma)) goto err_unpin; + if (flags & PIN_OFFSET_FIXED) { + uint64_t offset = flags & PIN_OFFSET_MASK; + + if (offset & (alignment - 1)) { + ret = -EINVAL; + goto err_free_vma; + } + vma->node.start = offset; + vma->node.size = size; + vma->node.color = obj->cache_level; + ret = drm_mm_reserve_node(&vm->mm, &vma->node); + if (ret) { + ret = i915_gem_evict_for_vma(vma); + if (ret == 0) + ret = drm_mm_reserve_node(&vm->mm, &vma->node); + } + if (ret) + goto err_free_vma; + } else { search_free: - ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, - size, alignment, - obj->cache_level, - start, end, - DRM_MM_SEARCH_DEFAULT, - DRM_MM_CREATE_DEFAULT); - if (ret) { - ret = i915_gem_evict_something(dev, vm, size, alignment, - obj->cache_level, - start, end, - flags); - if (ret == 0) - goto search_free; + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, + size, alignment, + obj->cache_level, + start, end, + DRM_MM_SEARCH_DEFAULT, + DRM_MM_CREATE_DEFAULT); + if (ret) { + ret = i915_gem_evict_something(dev, vm, size, alignment, + obj->cache_level, + start, end, + flags); + if (ret == 0) + goto search_free; - goto err_free_vma; + goto err_free_vma; + } } if (WARN_ON(!i915_gem_valid_gtt_space(vma, obj->cache_level))) { ret = -EINVAL; @@ -3986,6 +4006,10 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags) vma->node.start < (flags & PIN_OFFSET_MASK)) return true; + if (flags & PIN_OFFSET_FIXED && + vma->node.start != (flags & PIN_OFFSET_MASK)) + return true; + return false; } diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index d09e35e..bc5b91c 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -199,6 +199,45 @@ found: return ret; } +int +i915_gem_evict_for_vma(struct i915_vma *target) +{ + struct drm_mm_node *node, *next; + + list_for_each_entry_safe(node, next, + &target->vm->mm.head_node.node_list, + node_list) { + struct i915_vma *vma; + int ret; + + if (node->start + node->size <= target->node.start) + continue; + if (node->start >= target->node.start + target->node.size) + break; + + vma = container_of(node, typeof(*vma), node); + + if (vma->pin_count) { + if (!vma->exec_entry || (vma->pin_count > 1)) + /* Object is pinned for some other use */ + return -EBUSY; + + /* We need to evict a buffer in the same batch */ + if (vma->exec_entry->flags & EXEC_OBJECT_PINNED) + /* Overlapping fixed objects in the same batch */ + return -EINVAL; + + return -ENOSPC; + } + + ret = i915_vma_unbind(vma); + if (ret) + return ret; + } + + return 0; +} + /** * i915_gem_evict_vm - Evict all idle vmas from a vm * @vm: Address space to cleanse diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a953d49..49ef155 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -594,6 +594,8 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, flags |= PIN_GLOBAL | PIN_MAPPABLE; if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS) flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS; + if (entry->flags & EXEC_OBJECT_PINNED) + flags |= entry->offset | PIN_OFFSET_FIXED; } ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags); @@ -663,6 +665,10 @@ eb_vma_misplaced(struct i915_vma *vma) vma->node.start & (entry->alignment - 1)) return true; + if (entry->flags & EXEC_OBJECT_PINNED && + vma->node.start != entry->offset) + return true; + if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS && vma->node.start < BATCH_OFFSET_BIAS) return true; @@ -684,6 +690,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, struct i915_vma *vma; struct i915_address_space *vm; struct list_head ordered_vmas; + struct list_head pinned_vmas; bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4; int retry; @@ -692,6 +699,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm; INIT_LIST_HEAD(&ordered_vmas); + INIT_LIST_HEAD(&pinned_vmas); while (!list_empty(vmas)) { struct drm_i915_gem_exec_object2 *entry; bool need_fence, need_mappable; @@ -710,7 +718,9 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(vma); - if (need_mappable) { + if (entry->flags & EXEC_OBJECT_PINNED) + list_move_tail(&vma->exec_list, &pinned_vmas); + else if (need_mappable) { entry->flags |= __EXEC_OBJECT_NEEDS_MAP; list_move(&vma->exec_list, &ordered_vmas); } else @@ -720,6 +730,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, obj->base.pending_write_domain = 0; } list_splice(&ordered_vmas, vmas); + list_splice(&pinned_vmas, vmas); /* Attempt to pin all of the buffers into the GTT. * This is done in 3 phases: @@ -1398,7 +1409,8 @@ eb_get_batch(struct eb_vmas *eb) * Note that actual hangs have only been observed on gen7, but for * paranoia do it everywhere. */ - vma->exec_entry->flags |= __EXEC_OBJECT_NEEDS_BIAS; + if ((vma->exec_entry->flags & EXEC_OBJECT_PINNED) == 0) + vma->exec_entry->flags |= __EXEC_OBJECT_NEEDS_BIAS; return vma->obj; } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index fd5aa47..0a0e781 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -356,6 +356,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_EU_TOTAL 34 #define I915_PARAM_HAS_GPU_RESET 35 #define I915_PARAM_HAS_RESOURCE_STREAMER 36 +#define I915_PARAM_HAS_EXEC_SOFTPIN 37 typedef struct drm_i915_getparam { __s32 param; @@ -684,13 +685,20 @@ struct drm_i915_gem_exec_object2 { /** * Returned value of the updated offset of the object, for future * presumed_offset writes. + * When the EXEC_OBJECT_PINNED flag is specified this is populated by + * the user with the GTT offset at which this object will be pinned. + * When the I915_EXEC_NO_RELOC flag is specified this must contain the + * presumed_offset of the object. + * During execbuffer2 the kernel populates it with the value of the + * current GTT offset of the object, for future presumed_offset writes. */ __u64 offset; #define EXEC_OBJECT_NEEDS_FENCE (1<<0) #define EXEC_OBJECT_NEEDS_GTT (1<<1) #define EXEC_OBJECT_WRITE (1<<2) -#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1) +#define EXEC_OBJECT_PINNED (1<<3) +#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_PINNED<<1) __u64 flags; __u64 rsvd1;