Message ID: 1379875560-6433-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State: New, archived
On Sun, Sep 22, 2013 at 11:46:00AM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
>
> Building on the last patch which created the new function pointers in
> the VM for bind/unbind, here we actually put those new function pointers
> to use.
>
> Split out as a separate patch to aid in review. I'm fine with squashing
> into the previous patch if people request it.
>
> v2: Updated to address the smart ggtt which can do aliasing as needed
> Make sure we bind to global gtt when mappable and fenceable. I thought
> we could get away without this initially, but we cannot.
>
> v3: Make the global GTT binding explicitly use the ggtt VM for
> bind_vma(). While at it, use the new ggtt_vma helper (Chris)
>
> v4: Make the code support the secure dispatch flag, which requires
> special handling during execbuf. This was fixed (incorrectly) later in
> the series, but having it here earlier in the series should be perfectly
> acceptable. (Chris)
> Move do_switch over to the new, special ggtt_vma interface.
>
> v5: Don't use a local variable (or assertion) when setting the batch
> object to the global GTT during secure dispatch (Chris)
>
> v6: Calculate the exec offset for the secure case (Bug fix missed on
> v4). (Chris)
> Remove redundant check for has_global_gtt_mapping, since it is done in
> bind_vma.
>
> v7: Remove now unused dev_priv in do_switch
> Don't pass the vm to ggtt_offset (error from v6 which I should have
> caught before sending).
>
> v8: Assert, and rework the SNB workaround (to make it a bit clearer)
> code to make sure the VM can't be anything but the GGTT.
>
> v9: Fixing more bugs which can't exist yet at the behest of Chris. Make
> sure that the batch object is properly bound, and added to the global
> VM's active list - for when we use non-global VMs. (Chris)

Not quite, the patch introduced an outright bug in addition to potential
issue of vm != ggtt.

> CC: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Minor comments inline,
(for the series) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

> @@ -1118,8 +1115,32 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> * batch" bit. Hence we need to pin secure batches into the global gtt.
> * hsw should have this fixed, but let's be paranoid and do it
> * unconditionally for now. */
> - if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> - i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> + if (flags & I915_DISPATCH_SECURE) {
> + struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);

Please leave whitespace after variable declarations.

> + /* Assuming all privileged batches are in the global GTT means
> + * we need to make sure we have a global gtt offset, as well as
> + * the PTEs mapped. As mentioned above, we can forego this on
> + * HSW, but don't.
> + */

And a line of whitespace here since this is a block comment and not
closely coupled to the next line of code.

> + ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
> + if (ret)
> + goto err;
> +
> + ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> + batch_obj->cache_level,
> + GLOBAL_BIND);
> +
> + /* XXX: Since the active list is per VM, we need to make sure
> + * this VMA ends up on the GGTT's active list to avoid premature
> + * eviction.
> + */

No XXX required, unless you have a magical plan; the reasoning is sound.

> + i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
> +
> + i915_gem_object_unpin(batch_obj);

I think this interface violates Rusty's rules (API should be easy to
use but hard to misuse).

vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
if (IS_ERR(vm)) {
        ret = PTR_ERR(vm);
        goto err;
}

ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND); // this would be a flag to a future pin()
i915_vma_move_to_active(vma, ring);

exec_start += vma->node.start;
i915_gem_object_unpin(batch_obj, vma);

What I am stressing here is that the vma->node is only valid whilst the
object is pinned, and that access should be through the vma rather than
the object. However, that interface is a little too idealistic.
-Chris
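
For readers following the exchange, Chris's sketch written out as a complete
block, under the assumed interface of a pin that returns the VMA it pinned
and an unpin that takes that VMA back (neither signature exists in the driver
at this point in the series), would look roughly like this:

        if (flags & I915_DISPATCH_SECURE) {
                struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
                struct i915_vma *vma;

                /* Hypothetical: pin returns the VMA (today it returns int),
                 * so vma->node is known to be valid until the matching
                 * unpin below. */
                vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
                if (IS_ERR(vma)) {
                        ret = PTR_ERR(vma);
                        goto err;
                }

                /* GLOBAL_BIND would eventually become a flag to pin() itself. */
                ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
                i915_vma_move_to_active(vma, ring);

                exec_start += vma->node.start;

                /* Hypothetical two-argument unpin, handing the VMA back. */
                i915_gem_object_unpin(batch_obj, vma);
        }

The point being illustrated is that every address lookup goes through the
pinned VMA rather than through the object, so a stale vma->node can never be
read.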

On Mon, Sep 23, 2013 at 09:39:31AM +0100, Chris Wilson wrote:
> On Sun, Sep 22, 2013 at 11:46:00AM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> >
> > Building on the last patch which created the new function pointers in
> > the VM for bind/unbind, here we actually put those new function pointers
> > to use.
> >
> > Split out as a separate patch to aid in review. I'm fine with squashing
> > into the previous patch if people request it.
> >
> > v2: Updated to address the smart ggtt which can do aliasing as needed
> > Make sure we bind to global gtt when mappable and fenceable. I thought
> > we could get away without this initially, but we cannot.
> >
> > v3: Make the global GTT binding explicitly use the ggtt VM for
> > bind_vma(). While at it, use the new ggtt_vma helper (Chris)
> >
> > v4: Make the code support the secure dispatch flag, which requires
> > special handling during execbuf. This was fixed (incorrectly) later in
> > the series, but having it here earlier in the series should be perfectly
> > acceptable. (Chris)
> > Move do_switch over to the new, special ggtt_vma interface.
> >
> > v5: Don't use a local variable (or assertion) when setting the batch
> > object to the global GTT during secure dispatch (Chris)
> >
> > v6: Calculate the exec offset for the secure case (Bug fix missed on
> > v4). (Chris)
> > Remove redundant check for has_global_gtt_mapping, since it is done in
> > bind_vma.
> >
> > v7: Remove now unused dev_priv in do_switch
> > Don't pass the vm to ggtt_offset (error from v6 which I should have
> > caught before sending).
> >
> > v8: Assert, and rework the SNB workaround (to make it a bit clearer)
> > code to make sure the VM can't be anything but the GGTT.
> >
> > v9: Fixing more bugs which can't exist yet at the behest of Chris. Make
> > sure that the batch object is properly bound, and added to the global
> > VM's active list - for when we use non-global VMs. (Chris)
>
> Not quite, the patch introduced an outright bug in addition to potential
> issue of vm != ggtt.

Which bug are we talking about again? I'll update the commit message, but
I never saw a bug other than vm != ggtt (and ones I introduced while trying
to make you happy since v3).

>
> > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> Minor comments inline,
> (for the series) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> > @@ -1118,8 +1115,32 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > * batch" bit. Hence we need to pin secure batches into the global gtt.
> > * hsw should have this fixed, but let's be paranoid and do it
> > * unconditionally for now. */
> > - if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > - i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > + if (flags & I915_DISPATCH_SECURE) {
> > + struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
>
> Please leave whitespace after variable declarations.
>
> > + /* Assuming all privileged batches are in the global GTT means
> > + * we need to make sure we have a global gtt offset, as well as
> > + * the PTEs mapped. As mentioned above, we can forego this on
> > + * HSW, but don't.
> > + */
> And a line of whitespace here since this is a block comment and not
> closely coupled to the next line of code.
>
> > + ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
> > + if (ret)
> > + goto err;
> > +
> > + ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> > + batch_obj->cache_level,
> > + GLOBAL_BIND);
> > +
> > + /* XXX: Since the active list is per VM, we need to make sure
> > + * this VMA ends up on the GGTT's active list to avoid premature
> > + * eviction.
> > + */
>
> No XXX required, unless you have a magical plan; the reasoning is sound.

I used the XXX because it is a bit tricky to understand for the casual
observer. Maybe "NB" was more appropriate. Anyway, removed.

>
> > + i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
> > +
> > + i915_gem_object_unpin(batch_obj);
>
> I think this interface violates Rusty's rules (API should be easy to
> use but hard to misuse).
>
> vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
> if (IS_ERR(vm)) {
> ret = PTR_ERR(vm);
> goto err;
> }
>

You're missing a step here, I assume you mean:
i915_gem_obj_ggtt_pin(...)
vma = i915_gem_obj_to_ggtt(...)
if (IS_ERR)...

Or had you something else in mind?

> ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND); // this would be a flag to a future pin()
> i915_vma_move_to_active(vma, ring);
>
> exec_start += vma->node.start;
> i915_gem_object_unpin(batch_obj, vma);
>
> What I am stressing here is that the vma->node is only valid whilst the
> object is pinned, and that access should be through the vma rather than
> the object. However, that interface is a little too idealistic.
> -Chris
>

I am fine with this change - though I don't find it personally more clear
in stressing your point. Earlier requests were heavily in favor of using
"ggtt" where it couldn't be a generic VM. I think therefore the excessive
use of i915_gem_obj_to_ggtt makes the code look less generic, which it is.
Ie. I don't see this as abusing the interface.

I got all the other nitpicks. Once you ack the "missing step" I shall
resend.

On Mon, Sep 23, 2013 at 03:00:06PM -0700, Ben Widawsky wrote:
> > I think this interface violates Rusty's rules (API should be easy to
> > use but hard to misuse).
> >
> > vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
> > if (IS_ERR(vm)) {
> > ret = PTR_ERR(vm);
> > goto err;
> > }
> >
>
> You're missing a step here, I assume you mean:
> i915_gem_obj_ggtt_pin(...)
> vma = i915_gem_obj_to_ggtt(...)
> if (IS_ERR)...
>
> Or had you something else in mind?

I was thinking of making the pin return the vma instead. Then you know
that the vma is valid until its unpin. I think it helps here, but to cater
for all use cases we would need something analogous to get_pages,
pin_pages and unpin_pages. (get_vma, pin_vma, unpin_vma).
-Chris
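
By analogy with get_pages/pin_pages/unpin_pages, the triad Chris mentions
might be declared along the following lines. None of these signatures exist
in the driver; they are written out only to make the proposal concrete:

        /* Look up (or create) the VMA for obj in vm; may return an ERR_PTR. */
        struct i915_vma *i915_gem_obj_get_vma(struct drm_i915_gem_object *obj,
                                              struct i915_address_space *vm);

        /* Bind if necessary and pin: vma->node stays valid until the unpin. */
        struct i915_vma *i915_gem_obj_pin_vma(struct drm_i915_gem_object *obj,
                                              struct i915_address_space *vm,
                                              u32 alignment, u32 flags);

        /* Drop the pin; the node may move or disappear after this returns. */
        void i915_gem_obj_unpin_vma(struct i915_vma *vma);

Having the pin variant return the VMA (or an ERR_PTR) is what would make
vma->node safe to dereference between pin and unpin.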

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9995cdb..e8ae8fd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2102,17 +2102,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f6c8b0e..378d4ef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2693,12 +2693,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 trace_i915_vma_unbind(vma);
- if (obj->has_global_gtt_mapping)
- i915_gem_gtt_unbind_object(obj);
- if (obj->has_aliasing_ppgtt_mapping) {
- i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
- obj->has_aliasing_ppgtt_mapping = 0;
- }
+ vma->vm->unbind_vma(vma);
+
 i915_gem_gtt_finish_object(obj);
 i915_gem_object_unpin_pages(obj);
@@ -3155,7 +3151,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
 * Finds free space in the GTT aperture and binds the object there.
 */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 struct i915_address_space *vm,
 unsigned alignment,
@@ -3424,7 +3420,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 enum i915_cache_level cache_level)
 {
 struct drm_device *dev = obj->base.dev;
- drm_i915_private_t *dev_priv = dev->dev_private;
 struct i915_vma *vma;
 int ret;
@@ -3463,11 +3458,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 return ret;
 }
- if (obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(obj, cache_level);
- if (obj->has_aliasing_ppgtt_mapping)
- i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
- obj, cache_level);
+ list_for_each_entry(vma, &obj->vma_list, vma_link)
+ vma->vm->bind_vma(vma, cache_level, 0);
 }
 list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3787,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 bool map_and_fenceable,
 bool nonblocking)
 {
+ const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 struct i915_vma *vma;
 int ret;
@@ -3823,20 +3816,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 }
 if (!i915_gem_obj_bound(obj, vm)) {
- struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 map_and_fenceable,
 nonblocking);
 if (ret)
 return ret;
- if (!dev_priv->mm.aliasing_ppgtt)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
- }
+ vma = i915_gem_obj_to_vma(obj, vm);
+ vm->bind_vma(vma, obj->cache_level, flags);
+ } else
+ vma = i915_gem_obj_to_vma(obj, vm);
+ /* Objects are created map and fenceable. If we bind an object
+ * the first time, and we had aliasing PPGTT (and didn't request
+ * GLOBAL), we'll need to do this on the second bind.*/
 if (!obj->has_global_gtt_mapping && map_and_fenceable)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 obj->pin_count++;
 obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1a877a5..d4eb88a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -422,8 +422,10 @@ static int do_switch(struct i915_hw_context *to)
 return ret;
 }
- if (!to->obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+ if (!to->obj->has_global_gtt_mapping) {
+ struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+ vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+ }
 if (!to->is_initialized || is_default_context(to))
 hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..51dd656 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,14 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 if (unlikely(IS_GEN6(dev) &&
 reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 !target_i915_obj->has_global_gtt_mapping)) {
- i915_gem_gtt_bind_object(target_i915_obj,
- target_i915_obj->cache_level);
+ /* SNB shall not support full PPGTT. This path can only be taken
+ * when the VM is the GGTT (aliasing PPGTT is not a real VM, and
+ * therefore doesn't count).
+ */
+ BUG_ON(vm != obj_to_ggtt(target_i915_obj));
+ vm->bind_vma(i915_gem_obj_to_ggtt(target_i915_obj),
+ target_i915_obj->cache_level,
+ GLOBAL_BIND);
 }
 /* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +470,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 struct intel_ring_buffer *ring,
 bool *need_reloc)
 {
- struct drm_i915_private *dev_priv = ring->dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
 struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 bool need_fence, need_mappable;
- struct drm_i915_gem_object *obj = vma->obj;
+ u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+ !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 int ret;
 need_fence =
@@ -497,14 +504,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 }
 }
- /* Ensure ppgtt mapping exists if needed */
- if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
- i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
- obj, obj->cache_level);
-
- obj->has_aliasing_ppgtt_mapping = 1;
- }
-
 if (entry->offset != vma->node.start) {
 entry->offset = vma->node.start;
 *need_reloc = true;
@@ -515,9 +514,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 }
- if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
- !obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vma->vm->bind_vma(vma, obj->cache_level, flags);
 return 0;
 }
@@ -936,7 +933,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 struct intel_ring_buffer *ring;
 struct i915_ctx_hang_stats *hs;
 u32 ctx_id = i915_execbuffer2_get_context_id(*args);
- u32 exec_start, exec_len;
+ u32 exec_len, exec_start = args->batch_start_offset;
 u32 mask, flags;
 int ret, mode, i;
 bool need_relocs;
@@ -1118,8 +1115,32 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 * batch" bit. Hence we need to pin secure batches into the global gtt.
 * hsw should have this fixed, but let's be paranoid and do it
 * unconditionally for now. */
- if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+ if (flags & I915_DISPATCH_SECURE) {
+ struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+ /* Assuming all privileged batches are in the global GTT means
+ * we need to make sure we have a global gtt offset, as well as
+ * the PTEs mapped. As mentioned above, we can forego this on
+ * HSW, but don't.
+ */
+ ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
+ if (ret)
+ goto err;
+
+ ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
+ batch_obj->cache_level,
+ GLOBAL_BIND);
+
+ /* XXX: Since the active list is per VM, we need to make sure
+ * this VMA ends up on the GGTT's active list to avoid premature
+ * eviction.
+ */
+ i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
+
+ i915_gem_object_unpin(batch_obj);
+
+ exec_start += i915_gem_obj_ggtt_offset(batch_obj);
+ } else
+ exec_start += i915_gem_obj_offset(batch_obj, vm);
 ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 if (ret)
@@ -1161,8 +1182,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 goto err;
 }
- exec_start = i915_gem_obj_offset(batch_obj, vm) +
- args->batch_start_offset;
 exec_len = args->batch_len;
 if (cliprects) {
 for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 65b61d4..e053f14 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 dev_priv->mm.aliasing_ppgtt = NULL;
 }
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level)
-{
- ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
- i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
- cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj)
-{
- ppgtt->base.clear_range(&ppgtt->base,
- i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
- obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 dev_priv->gtt.base.total / PAGE_SIZE);
 list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+ &dev_priv->gtt.base);
 i915_gem_clflush_object(obj, obj->pin_display);
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vma->vm->bind_vma(vma, obj->cache_level, 0);
 }
 i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 }
 }
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level)
-{
- struct drm_device *dev = obj->base.dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
- dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
- entry,
- cache_level);
-
- obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
- struct drm_device *dev = obj->base.dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
- dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
- entry,
- obj->base.size >> PAGE_SHIFT);
-
- obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 struct drm_device *dev = vma->vm->dev;
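
Taken together, the hunks above replace the per-object helpers
(i915_gem_gtt_bind_object, i915_ppgtt_bind_object and their unbind
counterparts) with bind_vma/unbind_vma hooks on the address space. A caller
that previously picked a helper based on which GTT the object lived in now
follows roughly this shape (a simplified sketch of the pattern in the diff,
not a verbatim excerpt):

        struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);

        /* flags is 0, or GLOBAL_BIND when the global GTT PTEs are also needed */
        vm->bind_vma(vma, obj->cache_level, flags);
        ...
        vma->vm->unbind_vma(vma);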