From patchwork Wed Aug 5 12:22:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11701853 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3A9271510 for ; Wed, 5 Aug 2020 12:23:19 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2475F22D2B for ; Wed, 5 Aug 2020 12:23:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2475F22D2B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 527936E5BE; Wed, 5 Aug 2020 12:23:00 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1419C6E57E for ; Wed, 5 Aug 2020 12:22:49 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 22039479-1500050 for multiple; Wed, 05 Aug 2020 13:22:34 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 5 Aug 2020 13:22:14 +0100 Message-Id: <20200805122231.23313-21-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200805122231.23313-1-chris@chris-wilson.co.uk> References: <20200805122231.23313-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 20/37] drm/i915/gem: Bind the fence async for execbuf X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" It is illegal to wait on an another vma while holding the vm->mutex, as that easily leads to ABBA deadlocks (we wait on a second vma that waits on us to release the vm->mutex). So while the vm->mutex exists, we compute the required register transfer inside the i915_ggtt.mutex, setting up a fence for tracking the register writes, but move the waiting outside of the lock into the async binding pipeline. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 21 +-- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 139 +++++++++++++++++- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h | 5 + drivers/gpu/drm/i915/i915_vma.h | 2 - 4 files changed, 152 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 301e67dcdbde..4cdaf5d81ef1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1054,15 +1054,6 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_bind_vma *bind) return err; pin: - if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { - err = __i915_vma_pin_fence(vma); /* XXX no waiting */ - if (unlikely(err)) - return err; - - if (vma->fence) - bind->ev->flags |= __EXEC_OBJECT_HAS_FENCE; - } - bind_flags &= ~atomic_read(&vma->flags); if (bind_flags) { err = set_bind_fence(vma, work); @@ -1093,6 +1084,15 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_bind_vma *bind) bind->ev->flags |= __EXEC_OBJECT_HAS_PIN; GEM_BUG_ON(eb_vma_misplaced(entry, vma, bind->ev->flags)); + if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { + err = __i915_vma_pin_fence_async(vma, &work->base); + if (unlikely(err)) + return err; + + if (vma->fence) + bind->ev->flags |= __EXEC_OBJECT_HAS_FENCE; + } + return 0; } @@ -1158,6 +1158,9 @@ static void __eb_bind_vma(struct eb_vm_work *work) struct eb_bind_vma *bind = &work->bind[n]; struct i915_vma *vma = bind->ev->vma; + if (bind->ev->flags & __EXEC_OBJECT_HAS_FENCE) + __i915_vma_apply_fence_async(vma); + if (!bind->bind_flags) goto put; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index 7fb36b12fe7a..ce06b949dc7c 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -21,10 +21,13 @@ * IN THE SOFTWARE. */ +#include "i915_active.h" #include "i915_drv.h" #include "i915_scatterlist.h" +#include "i915_sw_fence_work.h" #include "i915_pvinfo.h" #include "i915_vgpu.h" +#include "i915_vma.h" /** * DOC: fence register handling @@ -340,19 +343,37 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt) return ERR_PTR(-EDEADLK); } -int __i915_vma_pin_fence(struct i915_vma *vma) +static int fence_wait_bind(struct i915_fence_reg *reg) +{ + struct dma_fence *fence; + int err = 0; + + fence = i915_active_fence_get(®->active.excl); + if (fence) { + err = dma_fence_wait(fence, true); + dma_fence_put(fence); + } + + return err; +} + +static int __i915_vma_pin_fence(struct i915_vma *vma) { struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm); - struct i915_fence_reg *fence; + struct i915_fence_reg *fence = vma->fence; struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL; int err; lockdep_assert_held(&vma->vm->mutex); /* Just update our place in the LRU if our fence is getting reused. */ - if (vma->fence) { - fence = vma->fence; + if (fence) { GEM_BUG_ON(fence->vma != vma); + + err = fence_wait_bind(fence); + if (err) + return err; + atomic_inc(&fence->pin_count); if (!fence->dirty) { list_move_tail(&fence->link, &ggtt->fence_list); @@ -384,6 +405,116 @@ int __i915_vma_pin_fence(struct i915_vma *vma) return err; } +static int set_bind_fence(struct i915_fence_reg *fence, + struct dma_fence_work *work) +{ + struct dma_fence *prev; + int err; + + if (rcu_access_pointer(fence->active.excl.fence) == &work->dma) + return 0; + + err = i915_sw_fence_await_active(&work->chain, + &fence->active, + I915_ACTIVE_AWAIT_ACTIVE); + if (err) + return err; + + if (i915_active_acquire(&fence->active)) + return -ENOENT; + + prev = i915_active_set_exclusive(&fence->active, &work->dma); + if (unlikely(prev)) { + err = i915_sw_fence_await_dma_fence(&work->chain, prev, 0, + GFP_NOWAIT | __GFP_NOWARN); + dma_fence_put(prev); + } + + i915_active_release(&fence->active); + return err < 0 ? err : 0; +} + +int __i915_vma_pin_fence_async(struct i915_vma *vma, + struct dma_fence_work *work) +{ + struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm); + struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL; + struct i915_fence_reg *fence = vma->fence; + int err; + + lockdep_assert_held(&vma->vm->mutex); + + /* Just update our place in the LRU if our fence is getting reused. */ + if (fence) { + GEM_BUG_ON(fence->vma != vma); + GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); + } else if (set) { + if (!i915_vma_is_map_and_fenceable(vma)) + return -EINVAL; + + fence = fence_find(ggtt); + if (IS_ERR(fence)) + return -ENOSPC; + + GEM_BUG_ON(atomic_read(&fence->pin_count)); + fence->dirty = true; + } else { + return 0; + } + + atomic_inc(&fence->pin_count); + list_move_tail(&fence->link, &ggtt->fence_list); + if (!fence->dirty) + return 0; + + if (INTEL_GEN(fence_to_i915(fence)) < 4 && + rcu_access_pointer(vma->active.excl.fence) != &work->dma) { + /* implicit 'unfenced' GPU blits */ + err = i915_sw_fence_await_active(&work->chain, + &vma->active, + I915_ACTIVE_AWAIT_ACTIVE); + if (err) + goto err_unpin; + } + + err = set_bind_fence(fence, work); + if (err) + goto err_unpin; + + if (set) { + fence->start = vma->node.start; + fence->size = vma->fence_size; + fence->stride = i915_gem_object_get_stride(vma->obj); + fence->tiling = i915_gem_object_get_tiling(vma->obj); + + vma->fence = fence; + } else { + fence->tiling = 0; + vma->fence = NULL; + } + + set = xchg(&fence->vma, set); + if (set && set != vma) { + GEM_BUG_ON(set->fence != fence); + WRITE_ONCE(set->fence, NULL); + i915_vma_revoke_mmap(set); + } + + return 0; + +err_unpin: + atomic_dec(&fence->pin_count); + return err; +} + +void __i915_vma_apply_fence_async(struct i915_vma *vma) +{ + struct i915_fence_reg *fence = vma->fence; + + if (fence->dirty) + fence_write(fence); +} + /** * i915_vma_pin_fence - set up fencing for a vma * @vma: vma to map through a fence reg diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 9eef679e1311..d306ac14d47e 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -30,6 +30,7 @@ #include "i915_active.h" +struct dma_fence_work; struct drm_i915_gem_object; struct i915_ggtt; struct i915_vma; @@ -70,6 +71,10 @@ void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, struct sg_table *pages); +int __i915_vma_pin_fence_async(struct i915_vma *vma, + struct dma_fence_work *work); +void __i915_vma_apply_fence_async(struct i915_vma *vma); + void intel_ggtt_init_fences(struct i915_ggtt *ggtt); void intel_ggtt_fini_fences(struct i915_ggtt *ggtt); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 9a26e6cbe8cd..4c0882041513 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -334,8 +334,6 @@ static inline struct page *i915_vma_first_page(struct i915_vma *vma) int __must_check i915_vma_pin_fence(struct i915_vma *vma); void i915_vma_revoke_fence(struct i915_vma *vma); -int __i915_vma_pin_fence(struct i915_vma *vma); - static inline void __i915_vma_unpin_fence(struct i915_vma *vma) { GEM_BUG_ON(atomic_read(&vma->fence->pin_count) <= 0);