From patchwork Fri Jun 16 14:05:25 2017
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 16 Jun 2017 15:05:25 +0100
Message-Id: <20170616140525.6394-10-chris@chris-wilson.co.uk>
In-Reply-To: <20170616140525.6394-1-chris@chris-wilson.co.uk>
References: <20170616140525.6394-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [CI 10/10] drm/i915: Stash a pointer to the obj's resv in the vma

During execbuf, a mandatory step is that we add this request (this
fence) to each object's reservation_object. Inside execbuf, we track the
vma, so adding the fence to the reservation_object means first chasing
the obj pointer, incurring another cache miss. We can reduce the number
of cache misses by stashing a pointer to the reservation_object in the
vma itself.
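As an illustration of the access pattern this changes (a minimal user-space
sketch, not the driver code: the struct names and add_fence helpers below are
simplified stand-ins), resolving the reservation object through the object
takes two dependent loads, vma->obj and then obj->resv, while a pointer
stashed in the vma at creation takes only one:

    /*
     * Stand-alone sketch of the pointer-stashing pattern; the types below
     * are simplified stand-ins for reservation_object, the GEM object and
     * the vma, not the i915 definitions.
     */
    #include <stdio.h>

    struct resv_obj {                     /* stand-in for reservation_object */
            const char *excl_fence;
    };

    struct gem_obj {                      /* stand-in for the GEM object */
            struct resv_obj *resv;
    };

    struct vma {                          /* stand-in for the vma */
            struct gem_obj *obj;
            struct resv_obj *resv;        /* alias of obj->resv, set at creation */
    };

    /* Before: two dependent loads, vma->obj and then obj->resv. */
    static void add_fence_via_obj(struct vma *v, const char *fence)
    {
            v->obj->resv->excl_fence = fence;
    }

    /* After: one load, using the pointer stashed when the vma was created. */
    static void add_fence_via_vma(struct vma *v, const char *fence)
    {
            v->resv->excl_fence = fence;
    }

    int main(void)
    {
            struct resv_obj resv = { .excl_fence = NULL };
            struct gem_obj obj = { .resv = &resv };
            struct vma v = { .obj = &obj, .resv = obj.resv }; /* stash at creation */

            add_fence_via_obj(&v, "fence-A");
            add_fence_via_vma(&v, "fence-B");
            printf("exclusive fence: %s\n", resv.excl_fence);
            return 0;
    }
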
Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 25 ++++++++++++-------------
 drivers/gpu/drm/i915/i915_vma.c            |  1 +
 drivers/gpu/drm/i915/i915_vma.h            |  3 ++-
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 2f7a2d2510fc..eb46dfa374a7 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1192,17 +1192,17 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	if (err)
 		goto err_request;
 
-	GEM_BUG_ON(!reservation_object_test_signaled_rcu(obj->resv, true));
+	GEM_BUG_ON(!reservation_object_test_signaled_rcu(batch->resv, true));
 	i915_vma_move_to_active(batch, rq, 0);
-	reservation_object_lock(obj->resv, NULL);
-	reservation_object_add_excl_fence(obj->resv, &rq->fence);
-	reservation_object_unlock(obj->resv);
+	reservation_object_lock(batch->resv, NULL);
+	reservation_object_add_excl_fence(batch->resv, &rq->fence);
+	reservation_object_unlock(batch->resv);
 	i915_vma_unpin(batch);
 
 	i915_vma_move_to_active(vma, rq, true);
-	reservation_object_lock(vma->obj->resv, NULL);
-	reservation_object_add_excl_fence(vma->obj->resv, &rq->fence);
-	reservation_object_unlock(vma->obj->resv);
+	reservation_object_lock(vma->resv, NULL);
+	reservation_object_add_excl_fence(vma->resv, &rq->fence);
+	reservation_object_unlock(vma->resv);
 
 	rq->batch = batch;
 
@@ -1252,7 +1252,6 @@ relocate_entry(struct i915_vma *vma,
 	       struct i915_execbuffer *eb,
 	       const struct i915_vma *target)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
 	u64 offset = reloc->offset;
 	u64 target_offset = relocation_target(reloc, target);
 	bool wide = eb->reloc_cache.use_64bit_reloc;
@@ -1260,7 +1259,7 @@ relocate_entry(struct i915_vma *vma,
 
 	if (!eb->reloc_cache.vaddr &&
 	    (DBG_FORCE_RELOC == FORCE_GPU_RELOC ||
-	     !reservation_object_test_signaled_rcu(obj->resv, true))) {
+	     !reservation_object_test_signaled_rcu(vma->resv, true))) {
 		const unsigned int gen = eb->reloc_cache.gen;
 		unsigned int len;
 		u32 *batch;
@@ -1320,7 +1319,7 @@ relocate_entry(struct i915_vma *vma,
 	}
 
 repeat:
-	vaddr = reloc_vaddr(obj, &eb->reloc_cache, offset >> PAGE_SHIFT);
+	vaddr = reloc_vaddr(vma->obj, &eb->reloc_cache, offset >> PAGE_SHIFT);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -1793,11 +1792,11 @@ static int eb_relocate(struct i915_execbuffer *eb)
 	return eb_relocate_slow(eb);
 }
 
-static void eb_export_fence(struct drm_i915_gem_object *obj,
+static void eb_export_fence(struct i915_vma *vma,
 			    struct drm_i915_gem_request *req,
 			    unsigned int flags)
 {
-	struct reservation_object *resv = obj->resv;
+	struct reservation_object *resv = vma->resv;
 
 	/*
 	 * Ignore errors from failing to allocate the new fence, we can't
@@ -1856,7 +1855,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		const struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
 		struct i915_vma *vma = exec_to_vma(entry);
 
-		eb_export_fence(vma->obj, eb->request, entry->flags);
+		eb_export_fence(vma, eb->request, entry->flags);
 		if (unlikely(entry->flags & __EXEC_OBJECT_HAS_REF))
 			i915_vma_put(vma);
 	}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f5c57dff288e..532c709febbd 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -90,6 +90,7 @@ vma_create(struct drm_i915_gem_object *obj,
 	init_request_active(&vma->last_fence, NULL);
 	vma->vm = vm;
 	vma->obj = obj;
+	vma->resv = obj->resv;
 	vma->size = obj->base.size;
 	vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 04d7a5da70fd..4a673fc1a432 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -50,6 +50,7 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 	struct drm_i915_fence_reg *fence;
+	struct reservation_object *resv; /** Alias of obj->resv */
 	struct sg_table *pages;
 	void __iomem *iomap;
 	u64 size;
@@ -111,8 +112,8 @@ struct i915_vma {
 	/**
 	 * Used for performing relocations during execbuffer insertion.
 	 */
-	struct hlist_node exec_node;
 	struct drm_i915_gem_exec_object2 *exec_entry;
+	struct hlist_node exec_node;
 	u32 exec_handle;
 
 	struct i915_gem_context *ctx;