From patchwork Tue Sep 24 16:58:00 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 2934691 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 4CF689F288 for ; Tue, 24 Sep 2013 17:01:01 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 9D49D20270 for ; Tue, 24 Sep 2013 17:00:59 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id D044C203A8 for ; Tue, 24 Sep 2013 17:00:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BB594E77BF for ; Tue, 24 Sep 2013 10:00:57 -0700 (PDT) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail.bwidawsk.net (bwidawsk.net [166.78.191.112]) by gabe.freedesktop.org (Postfix) with ESMTP id 01ED1E77A0 for ; Tue, 24 Sep 2013 09:58:23 -0700 (PDT) Received: by mail.bwidawsk.net (Postfix, from userid 5001) id 9048B59FB4; Tue, 24 Sep 2013 09:58:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Received: from lundgren.intel.com (unknown [131.252.27.73]) by mail.bwidawsk.net (Postfix) with ESMTPSA id 1447459FB3; Tue, 24 Sep 2013 09:58:09 -0700 (PDT) From: Ben Widawsky To: Intel GFX Date: Tue, 24 Sep 2013 09:58:00 -0700 Message-Id: <1380041881-3397-5-git-send-email-benjamin.widawsky@intel.com> X-Mailer: git-send-email 1.8.4 In-Reply-To: <1380041881-3397-1-git-send-email-benjamin.widawsky@intel.com> References: <1380041881-3397-1-git-send-email-benjamin.widawsky@intel.com> Cc: Ben Widawsky Subject: [Intel-gfx] [PATCH 5/6] drm/i915: Use the new vm [un]bind functions X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ben Widawsky Building on the last patch which created the new function pointers in the VM for bind/unbind, here we actually put those new function pointers to use. Split out as a separate patch to aid in review. I'm fine with squashing into the previous patch if people request it. v2: Updated to address the smart ggtt which can do aliasing as needed Make sure we bind to global gtt when mappable and fenceable. I thought we could get away without this initialy, but we cannot. v3: Make the global GTT binding explicitly use the ggtt VM for bind_vma(). While at it, use the new ggtt_vma helper (Chris) v4: Make the code support the secure dispatch flag, which requires special handling during execbuf. This was fixed (incorrectly) later in the series, but having it here earlier in the series should be perfectly acceptable. (Chris) Move do_switch over to the new, special ggtt_vma interface. v5: Don't use a local variable (or assertion) when setting the batch object to the global GTT during secure dispatch (Chris) v6: Caclulate the exec offset for the secure case (Bug fix missed on v4). (Chris) Remove redundant check for has_global_gtt_mapping, since it is done in bind_vma. v7: Remove now unused dev_priv in do_switch Don't pass the vm to ggtt_offset (error from v6 which I should have caught before sending). v8: Assert, and rework the SNB workaround (to make it a bit clearer) code to make sure the VM can't be anything but the GGTT. v9: Fixing more bugs which can't exist yet on the behest of Chris. Make sure that the batch object is properly bound, and added to the global VM's active list - for when we use non-global VMs. (Chris) Note that there is an ongoing confusion about what bugs existed, but the "known" bugs are fixed here. v10: Nitpicks on whitespacing etc. introduced in v9 (Chris) Reviewed-by: Chris Wilson Signed-off-by: Ben Widawsky --- drivers/gpu/drm/i915/i915_drv.h | 9 ----- drivers/gpu/drm/i915/i915_gem.c | 33 +++++++--------- drivers/gpu/drm/i915/i915_gem_context.c | 6 ++- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 61 ++++++++++++++++++++---------- drivers/gpu/drm/i915/i915_gem_gtt.c | 48 ++--------------------- 5 files changed, 62 insertions(+), 95 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9995cdb..e8ae8fd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2102,17 +2102,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, /* i915_gem_gtt.c */ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev); -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, - struct drm_i915_gem_object *obj, - enum i915_cache_level cache_level); -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, - struct drm_i915_gem_object *obj); - void i915_gem_restore_gtt_mappings(struct drm_device *dev); int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, - enum i915_cache_level cache_level); -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj); void i915_gem_init_global_gtt(struct drm_device *dev); void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f6c8b0e..378d4ef 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2693,12 +2693,8 @@ int i915_vma_unbind(struct i915_vma *vma) trace_i915_vma_unbind(vma); - if (obj->has_global_gtt_mapping) - i915_gem_gtt_unbind_object(obj); - if (obj->has_aliasing_ppgtt_mapping) { - i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj); - obj->has_aliasing_ppgtt_mapping = 0; - } + vma->vm->unbind_vma(vma); + i915_gem_gtt_finish_object(obj); i915_gem_object_unpin_pages(obj); @@ -3155,7 +3151,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev) /** * Finds free space in the GTT aperture and binds the object there. */ -static int +int i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, struct i915_address_space *vm, unsigned alignment, @@ -3424,7 +3420,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level) { struct drm_device *dev = obj->base.dev; - drm_i915_private_t *dev_priv = dev->dev_private; struct i915_vma *vma; int ret; @@ -3463,11 +3458,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, return ret; } - if (obj->has_global_gtt_mapping) - i915_gem_gtt_bind_object(obj, cache_level); - if (obj->has_aliasing_ppgtt_mapping) - i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, - obj, cache_level); + list_for_each_entry(vma, &obj->vma_list, vma_link) + vma->vm->bind_vma(vma, cache_level, 0); } list_for_each_entry(vma, &obj->vma_list, vma_link) @@ -3795,6 +3787,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, bool map_and_fenceable, bool nonblocking) { + const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0; struct i915_vma *vma; int ret; @@ -3823,20 +3816,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, } if (!i915_gem_obj_bound(obj, vm)) { - struct drm_i915_private *dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_bind_to_vm(obj, vm, alignment, map_and_fenceable, nonblocking); if (ret) return ret; - if (!dev_priv->mm.aliasing_ppgtt) - i915_gem_gtt_bind_object(obj, obj->cache_level); - } + vma = i915_gem_obj_to_vma(obj, vm); + vm->bind_vma(vma, obj->cache_level, flags); + } else + vma = i915_gem_obj_to_vma(obj, vm); + /* Objects are created map and fenceable. If we bind an object + * the first time, and we had aliasing PPGTT (and didn't request + * GLOBAL), we'll need to do this on the second bind.*/ if (!obj->has_global_gtt_mapping && map_and_fenceable) - i915_gem_gtt_bind_object(obj, obj->cache_level); + vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND); obj->pin_count++; obj->pin_mappable |= map_and_fenceable; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 1a877a5..d4eb88a 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -422,8 +422,10 @@ static int do_switch(struct i915_hw_context *to) return ret; } - if (!to->obj->has_global_gtt_mapping) - i915_gem_gtt_bind_object(to->obj, to->obj->cache_level); + if (!to->obj->has_global_gtt_mapping) { + struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj); + vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND); + } if (!to->is_initialized || is_default_context(to)) hw_flags |= MI_RESTORE_INHIBIT; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 0ce0d47..88c924f 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -286,8 +286,14 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, if (unlikely(IS_GEN6(dev) && reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION && !target_i915_obj->has_global_gtt_mapping)) { - i915_gem_gtt_bind_object(target_i915_obj, - target_i915_obj->cache_level); + /* SNB shall not support full PPGTT. This path can only be taken + * when the VM is the GGTT (aliasing PPGTT is not a real VM, and + * therefore doesn't count). + */ + BUG_ON(vm != obj_to_ggtt(target_i915_obj)); + vm->bind_vma(i915_gem_obj_to_ggtt(target_i915_obj), + target_i915_obj->cache_level, + GLOBAL_BIND); } /* Validate that the target is in a valid r/w GPU domain */ @@ -464,11 +470,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, struct intel_ring_buffer *ring, bool *need_reloc) { - struct drm_i915_private *dev_priv = ring->dev->dev_private; + struct drm_i915_gem_object *obj = vma->obj; struct drm_i915_gem_exec_object2 *entry = vma->exec_entry; bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4; bool need_fence, need_mappable; - struct drm_i915_gem_object *obj = vma->obj; + u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) && + !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0; int ret; need_fence = @@ -497,14 +504,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, } } - /* Ensure ppgtt mapping exists if needed */ - if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) { - i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, - obj, obj->cache_level); - - obj->has_aliasing_ppgtt_mapping = 1; - } - if (entry->offset != vma->node.start) { entry->offset = vma->node.start; *need_reloc = true; @@ -515,9 +514,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER; } - if (entry->flags & EXEC_OBJECT_NEEDS_GTT && - !obj->has_global_gtt_mapping) - i915_gem_gtt_bind_object(obj, obj->cache_level); + vma->vm->bind_vma(vma, obj->cache_level, flags); return 0; } @@ -936,7 +933,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct intel_ring_buffer *ring; struct i915_ctx_hang_stats *hs; u32 ctx_id = i915_execbuffer2_get_context_id(*args); - u32 exec_start, exec_len; + u32 exec_len, exec_start = args->batch_start_offset; u32 mask, flags; int ret, mode, i; bool need_relocs; @@ -1118,8 +1115,34 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, * batch" bit. Hence we need to pin secure batches into the global gtt. * hsw should have this fixed, but let's be paranoid and do it * unconditionally for now. */ - if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping) - i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level); + if (flags & I915_DISPATCH_SECURE) { + struct i915_address_space *ggtt = obj_to_ggtt(batch_obj); + + /* Assuming all privileged batches are in the global GTT means + * we need to make sure we have a global gtt offset, as well as + * the PTEs mapped. As mentioned above, we can forego this on + * HSW, but don't. + */ + + ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false); + if (ret) + goto err; + + ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj), + batch_obj->cache_level, + GLOBAL_BIND); + + /* Since the active list is per VM, we need to make sure this + * VMA ends up on the GGTT's active list to avoid premature + * eviction. + */ + i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring); + + i915_gem_object_unpin(batch_obj); + + exec_start += i915_gem_obj_ggtt_offset(batch_obj); + } else + exec_start += i915_gem_obj_offset(batch_obj, vm); ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas); if (ret) @@ -1161,8 +1184,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } - exec_start = i915_gem_obj_offset(batch_obj, vm) + - args->batch_start_offset; exec_len = args->batch_len; if (cliprects) { for (i = 0; i < args->num_cliprects; i++) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 65b61d4..e053f14 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev) dev_priv->mm.aliasing_ppgtt = NULL; } -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, - struct drm_i915_gem_object *obj, - enum i915_cache_level cache_level) -{ - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - cache_level); -} - static void __always_unused gen6_ppgtt_bind_vma(struct i915_vma *vma, enum i915_cache_level cache_level, @@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma, gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level); } -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, - struct drm_i915_gem_object *obj) -{ - ppgtt->base.clear_range(&ppgtt->base, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - obj->base.size >> PAGE_SHIFT); -} - static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma) { const unsigned long entry = vma->node.start >> PAGE_SHIFT; @@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) dev_priv->gtt.base.total / PAGE_SIZE); list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { + struct i915_vma *vma = i915_gem_obj_to_vma(obj, + &dev_priv->gtt.base); i915_gem_clflush_object(obj, obj->pin_display); - i915_gem_gtt_bind_object(obj, obj->cache_level); + vma->vm->bind_vma(vma, obj->cache_level, 0); } i915_gem_chipset_flush(dev); @@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma, } } -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, - enum i915_cache_level cache_level) -{ - struct drm_device *dev = obj->base.dev; - struct drm_i915_private *dev_priv = dev->dev_private; - const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; - - dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages, - entry, - cache_level); - - obj->has_global_gtt_mapping = 1; -} - -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj) -{ - struct drm_device *dev = obj->base.dev; - struct drm_i915_private *dev_priv = dev->dev_private; - const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; - - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, - entry, - obj->base.size >> PAGE_SHIFT); - - obj->has_global_gtt_mapping = 0; -} - static void gen6_ggtt_unbind_vma(struct i915_vma *vma) { struct drm_device *dev = vma->vm->dev;