From patchwork Mon May 4 04:48:42 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524971 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:42 +0100 Message-Id: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 01/22] drm/i915: Allow some leniency in PCU reads Cc: Chris Wilson Extend the timeout for pcode reads to 20ms, as they should not be performed along critical paths, and succeeding after a short delay is better than failing entirely.
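For context, the pcode mailbox is polled in two stages: a brief atomic busy-wait for the common case, then a sleeping poll up to the new, more lenient bound. Roughly like the sketch below (illustrative only, not the driver's actual wait macros; is_pcode_busy() is a hypothetical predicate standing in for the GEN6_PCODE_MAILBOX ready check, and the 500us/20ms split mirrors the values passed to __sandybridge_pcode_rw()):

    static int wait_for_pcode(struct drm_i915_private *i915)
    {
            ktime_t fast_end = ktime_add_us(ktime_get(), 500);
            unsigned long slow_end;

            /* Fast path: spin briefly, pcode usually answers at once. */
            do {
                    if (!is_pcode_busy(i915))
                            return 0;
                    cpu_relax();
            } while (ktime_before(ktime_get(), fast_end));

            /* Lenient path: sleep-poll up to 20ms rather than fail. */
            slow_end = jiffies + msecs_to_jiffies(20);
            do {
                    if (!is_pcode_busy(i915))
                            return 0;
                    usleep_range(100, 500);
            } while (time_before(jiffies, slow_end));

            return -ETIMEDOUT;
    }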
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1800 Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/intel_sideband.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_sideband.c b/drivers/gpu/drm/i915/intel_sideband.c index 14daf6af6854..d5129c1dd452 100644 --- a/drivers/gpu/drm/i915/intel_sideband.c +++ b/drivers/gpu/drm/i915/intel_sideband.c @@ -429,7 +429,7 @@ int sandybridge_pcode_read(struct drm_i915_private *i915, u32 mbox, mutex_lock(&i915->sb_lock); err = __sandybridge_pcode_rw(i915, mbox, val, val1, - 500, 0, + 500, 20, true); mutex_unlock(&i915->sb_lock);

From patchwork Mon May 4 04:48:43 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11525005 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:43 +0100 Message-Id: <20200504044903.7626-2-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 02/22] drm/i915/gem: Specify address type for chained reloc batches Cc: Chris Wilson A chained batch is required to be in the same address domain as its parent, and on earlier gens that address type must also be specified in the command itself, as it is not inferred from the chaining until gen6.
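Spelled out, the chain emission now always reserves both address dwords and only varies the opcode (a condensed restatement of the hunk below, not new logic; on pre-gen8 parts the upper dword is 0, which decodes as MI_NOOP, so emitting it unconditionally is harmless):

    *cmd++ = MI_ARB_CHECK;
    if (gen >= 8)
            *cmd++ = MI_BATCH_BUFFER_START_GEN8;    /* 64-bit address follows */
    else if (gen >= 6)
            *cmd++ = MI_BATCH_BUFFER_START;         /* address space inferred */
    else
            *cmd++ = MI_BATCH_BUFFER_START | MI_BATCH_GTT; /* must name the GGTT */
    *cmd++ = lower_32_bits(batch->node.start);
    *cmd++ = upper_32_bits(batch->node.start);      /* 0, i.e. MI_NOOP, pre-gen8 */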
Fixes: 964a9b0f611e ("drm/i915/gem: Use chained reloc batches") Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index cce7df231cb9..ab0d4df13c0b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1004,14 +1004,14 @@ static int reloc_gpu_chain(struct reloc_cache *cache) GEM_BUG_ON(cache->rq_size + RELOC_TAIL > PAGE_SIZE / sizeof(u32)); cmd = cache->rq_cmd + cache->rq_size; *cmd++ = MI_ARB_CHECK; - if (cache->gen >= 8) { + if (cache->gen >= 8) *cmd++ = MI_BATCH_BUFFER_START_GEN8; - *cmd++ = lower_32_bits(batch->node.start); - *cmd++ = upper_32_bits(batch->node.start); - } else { + else if (cache->gen >= 6) *cmd++ = MI_BATCH_BUFFER_START; - *cmd++ = lower_32_bits(batch->node.start); - } + else + *cmd++ = MI_BATCH_BUFFER_START | MI_BATCH_GTT; + *cmd++ = lower_32_bits(batch->node.start); + *cmd++ = upper_32_bits(batch->node.start); i915_gem_object_flush_map(cache->rq_vma->obj); i915_gem_object_unpin_map(cache->rq_vma->obj); cache->rq_vma = NULL;

From patchwork Mon May 4 04:48:44 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524973 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:44 +0100 Message-Id: <20200504044903.7626-3-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 03/22] drm/i915/gem: Implement legacy MI_STORE_DATA_IMM
Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 204 ++++++++++------- .../i915/gem/selftests/i915_gem_execbuffer.c | 206 ++++++++++++++++++ .../drm/i915/selftests/i915_live_selftests.h | 1 + 3 files changed, 337 insertions(+), 74 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index ab0d4df13c0b..44d7da0e200e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -955,7 +955,7 @@ static void reloc_cache_init(struct reloc_cache *cache, cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment; cache->node.flags = 0; cache->rq = NULL; - cache->rq_size = 0; + cache->target = NULL; } static inline void *unmask_page(unsigned long p) @@ -1325,7 +1325,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, ce = intel_context_create(engine); if (IS_ERR(ce)) { - err = PTR_ERR(rq); + err = PTR_ERR(ce); goto err_unpin; } @@ -1376,6 +1376,11 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, return err; } +static bool can_store_dword(const struct intel_engine_cs *engine) +{ + return engine->class != VIDEO_DECODE_CLASS || !IS_GEN(engine->i915, 6); +} + static u32 *reloc_gpu(struct i915_execbuffer *eb, struct i915_vma *vma, unsigned int len) @@ -1387,9 +1392,9 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb, if (unlikely(!cache->rq)) { struct intel_engine_cs *engine = eb->engine; - if (!intel_engine_can_store_dword(engine)) { + if (!can_store_dword(engine)) { engine = engine->gt->engine_class[COPY_ENGINE_CLASS][0]; - if (!engine || !intel_engine_can_store_dword(engine)) + if (!engine) return ERR_PTR(-ENODEV); } @@ -1435,91 +1440,138 @@ static inline bool use_reloc_gpu(struct i915_vma *vma) return !dma_resv_test_signaled_rcu(vma->resv, true); } -static u64 -relocate_entry(struct i915_vma *vma, - const struct drm_i915_gem_relocation_entry *reloc, - struct i915_execbuffer *eb, - const struct i915_vma *target) +static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset) { - u64 offset = reloc->offset; - u64 target_offset = relocation_target(reloc, target); - bool wide = eb->reloc_cache.use_64bit_reloc; - void *vaddr; + struct page *page; + unsigned long addr; - if (!eb->reloc_cache.vaddr && use_reloc_gpu(vma)) { - const unsigned int gen = eb->reloc_cache.gen; - unsigned int len; - u32 *batch; - u64 addr; + GEM_BUG_ON(vma->pages != vma->obj->mm.pages); - if (wide) - len = offset & 7 ? 
8 : 5; - else if (gen >= 4) - len = 4; - else - len = 3; + page = i915_gem_object_get_page(vma->obj, offset >> PAGE_SHIFT); + addr = PFN_PHYS(page_to_pfn(page)); + GEM_BUG_ON(overflows_type(addr, u32)); /* expected dma32 */ - batch = reloc_gpu(eb, vma, len); - if (IS_ERR(batch)) - goto repeat; + return addr + offset_in_page(offset); +} + +static bool __reloc_entry_gpu(struct i915_vma *vma, + struct i915_execbuffer *eb, + u64 offset, + u64 target_addr) +{ + const unsigned int gen = eb->reloc_cache.gen; + unsigned int len; + u32 *batch; + u64 addr; + + if (gen >= 8) + len = offset & 7 ? 8 : 5; + else if (gen >= 4) + len = 4; + else + len = 3; + + batch = reloc_gpu(eb, vma, len); + if (IS_ERR(batch)) + return false; + + addr = gen8_canonical_addr(vma->node.start + offset); + if (gen >= 8) { + if (offset & 7) { + *batch++ = MI_STORE_DWORD_IMM_GEN4; + *batch++ = lower_32_bits(addr); + *batch++ = upper_32_bits(addr); + *batch++ = lower_32_bits(target_addr); + + addr = gen8_canonical_addr(addr + 4); - addr = gen8_canonical_addr(vma->node.start + offset); - if (wide) { - if (offset & 7) { - *batch++ = MI_STORE_DWORD_IMM_GEN4; - *batch++ = lower_32_bits(addr); - *batch++ = upper_32_bits(addr); - *batch++ = lower_32_bits(target_offset); - - addr = gen8_canonical_addr(addr + 4); - - *batch++ = MI_STORE_DWORD_IMM_GEN4; - *batch++ = lower_32_bits(addr); - *batch++ = upper_32_bits(addr); - *batch++ = upper_32_bits(target_offset); - } else { - *batch++ = (MI_STORE_DWORD_IMM_GEN4 | (1 << 21)) + 1; - *batch++ = lower_32_bits(addr); - *batch++ = upper_32_bits(addr); - *batch++ = lower_32_bits(target_offset); - *batch++ = upper_32_bits(target_offset); - } - } else if (gen >= 6) { *batch++ = MI_STORE_DWORD_IMM_GEN4; - *batch++ = 0; - *batch++ = addr; - *batch++ = target_offset; - } else if (gen >= 4) { - *batch++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; - *batch++ = 0; - *batch++ = addr; - *batch++ = target_offset; + *batch++ = lower_32_bits(addr); + *batch++ = upper_32_bits(addr); + *batch++ = upper_32_bits(target_addr); } else { - *batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL; - *batch++ = addr; - *batch++ = target_offset; + *batch++ = (MI_STORE_DWORD_IMM_GEN4 | (1 << 21)) + 1; + *batch++ = lower_32_bits(addr); + *batch++ = upper_32_bits(addr); + *batch++ = lower_32_bits(target_addr); + *batch++ = upper_32_bits(target_addr); } - - goto out; + } else if (gen >= 6) { + *batch++ = MI_STORE_DWORD_IMM_GEN4; + *batch++ = 0; + *batch++ = addr; + *batch++ = target_addr; + } else if (IS_I965G(eb->i915)) { + *batch++ = MI_STORE_DWORD_IMM_GEN4; + *batch++ = 0; + *batch++ = vma_phys_addr(vma, offset); + *batch++ = target_addr; + } else if (gen >= 4) { + *batch++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; + *batch++ = 0; + *batch++ = addr; + *batch++ = target_addr; + } else if (gen >= 3 && + !(IS_I915G(eb->i915) || IS_I915GM(eb->i915))) { + *batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL; + *batch++ = addr; + *batch++ = target_addr; + } else { + *batch++ = MI_STORE_DWORD_IMM; + *batch++ = vma_phys_addr(vma, offset); + *batch++ = target_addr; } + return true; +} + +static bool reloc_entry_gpu(struct i915_vma *vma, + struct i915_execbuffer *eb, + u64 offset, + u64 target_addr) +{ + if (eb->reloc_cache.vaddr) + return false; + + if (!use_reloc_gpu(vma)) + return false; + + return __reloc_entry_gpu(vma, eb, offset, target_addr); +} + +static u64 +relocate_entry(struct i915_vma *vma, + const struct drm_i915_gem_relocation_entry *reloc, + struct i915_execbuffer *eb, + const struct i915_vma *target) +{ + u64 target_addr = 
relocation_target(reloc, target); + u64 offset = reloc->offset; + + if (!reloc_entry_gpu(vma, eb, offset, target_addr)) { + bool wide = eb->reloc_cache.use_64bit_reloc; + void *vaddr; + repeat: - vaddr = reloc_vaddr(vma->obj, &eb->reloc_cache, offset >> PAGE_SHIFT); - if (IS_ERR(vaddr)) - return PTR_ERR(vaddr); + vaddr = reloc_vaddr(vma->obj, + &eb->reloc_cache, + offset >> PAGE_SHIFT); + if (IS_ERR(vaddr)) + return PTR_ERR(vaddr); - clflush_write32(vaddr + offset_in_page(offset), - lower_32_bits(target_offset), - eb->reloc_cache.vaddr); + GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32))); + clflush_write32(vaddr + offset_in_page(offset), + lower_32_bits(target_addr), + eb->reloc_cache.vaddr); - if (wide) { - offset += sizeof(u32); - target_offset >>= 32; - wide = false; - goto repeat; + if (wide) { + offset += sizeof(u32); + target_addr >>= 32; + wide = false; + goto repeat; + } } -out: return target->node.start | UPDATE; } @@ -3022,3 +3074,7 @@ end:; kvfree(exec2_list); return err; } + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftests/i915_gem_execbuffer.c" +#endif diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c new file mode 100644 index 000000000000..985f9fbd0ba0 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c @@ -0,0 +1,206 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2020 Intel Corporation + */ + +#include "i915_selftest.h" + +#include "gt/intel_engine_pm.h" +#include "selftests/igt_flush_test.h" + +static void hexdump(const void *buf, size_t len) +{ + const size_t rowsize = 8 * sizeof(u32); + const void *prev = NULL; + bool skip = false; + size_t pos; + + for (pos = 0; pos < len; pos += rowsize) { + char line[128]; + + if (prev && !memcmp(prev, buf + pos, rowsize)) { + if (!skip) { + pr_info("*\n"); + skip = true; + } + continue; + } + + WARN_ON_ONCE(hex_dump_to_buffer(buf + pos, len - pos, + rowsize, sizeof(u32), + line, sizeof(line), + false) >= sizeof(line)); + pr_info("[%04zx] %s\n", pos, line); + + prev = buf + pos; + skip = false; + } +} + +static u64 read_reloc(const u32 *map, int x, const u64 mask) +{ + u64 reloc; + + memcpy(&reloc, &map[x], sizeof(reloc)); + return reloc & mask; +} + +static int __igt_gpu_reloc(struct i915_execbuffer *eb, + struct drm_i915_gem_object *obj) +{ + enum { + X = 0, + Y = 3, + Z = 8 + }; + const u64 mask = GENMASK_ULL(eb->reloc_cache.gen >= 8 ? 
63 : 31, 0); + const u32 *map = page_mask_bits(obj->mm.mapping); + struct i915_request *rq; + struct i915_vma *vma; + int inc; + int err; + + vma = i915_vma_instance(obj, eb->context->vm, NULL); + if (IS_ERR(vma)) + return PTR_ERR(vma); + + err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_HIGH); + if (err) + return err; + + /* 8-Byte aligned */ + if (!__reloc_entry_gpu(vma, eb, X * sizeof(u32), X)) { + err = -EIO; + goto unpin_vma; + } + + /* !8-Byte aligned */ + if (!__reloc_entry_gpu(vma, eb, Y * sizeof(u32), Y)) { + err = -EIO; + goto unpin_vma; + } + + /* Skip to the end of the cmd page */ + inc = PAGE_SIZE / sizeof(u32) - RELOC_TAIL - 1; + inc -= eb->reloc_cache.rq_size; + memset(eb->reloc_cache.rq_cmd + eb->reloc_cache.rq_size, + 0, inc * sizeof(u32)); + eb->reloc_cache.rq_size += inc; + + /* Force batch chaining */ + if (!__reloc_entry_gpu(vma, eb, Z * sizeof(u32), Z)) { + err = -EIO; + goto unpin_vma; + } + + GEM_BUG_ON(!eb->reloc_cache.rq); + rq = i915_request_get(eb->reloc_cache.rq); + err = reloc_gpu_flush(&eb->reloc_cache); + if (err) + goto put_rq; + GEM_BUG_ON(eb->reloc_cache.rq); + + err = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE, HZ / 2); + if (err) { + intel_gt_set_wedged(eb->engine->gt); + goto put_rq; + } + + if (!i915_request_completed(rq)) { + pr_err("%s: did not wait for relocations!\n", eb->engine->name); + err = -EINVAL; + goto put_rq; + } + + if (read_reloc(map, X, mask) != X) { + pr_err("%s[X]: map[%d] %llx != %x\n", + eb->engine->name, X, read_reloc(map, X, mask), X); + err = -EINVAL; + } + + if (read_reloc(map, Y, mask) != Y) { + pr_err("%s[Y]: map[%d] %llx != %x\n", + eb->engine->name, Y, read_reloc(map, Y, mask), Y); + err = -EINVAL; + } + + if (read_reloc(map, Z, mask) != Z) { + pr_err("%s[Z]: map[%d] %llx != %x\n", + eb->engine->name, Z, read_reloc(map, Z, mask), Z); + err = -EINVAL; + } + + if (err) + hexdump(map, 4096); + +put_rq: + i915_request_put(rq); +unpin_vma: + i915_vma_unpin(vma); + return err; +} + +static int igt_gpu_reloc(void *arg) +{ + struct i915_execbuffer eb; + struct drm_i915_gem_object *scratch; + int err = 0; + u32 *map; + + eb.i915 = arg; + + scratch = i915_gem_object_create_internal(eb.i915, 4096); + if (IS_ERR(scratch)) + return PTR_ERR(scratch); + + map = i915_gem_object_pin_map(scratch, I915_MAP_WC); + if (IS_ERR(map)) { + err = PTR_ERR(map); + goto err_scratch; + } + + for_each_uabi_engine(eb.engine, eb.i915) { + reloc_cache_init(&eb.reloc_cache, eb.i915); + memset(map, POISON_INUSE, 4096); + + intel_engine_pm_get(eb.engine); + eb.context = intel_context_create(eb.engine); + if (IS_ERR(eb.context)) { + err = PTR_ERR(eb.context); + goto err_pm; + } + + err = intel_context_pin(eb.context); + if (err) + goto err_put; + + err = __igt_gpu_reloc(&eb, scratch); + + intel_context_unpin(eb.context); +err_put: + intel_context_put(eb.context); +err_pm: + intel_engine_pm_put(eb.engine); + if (err) + break; + } + + if (igt_flush_test(eb.i915)) + err = -EIO; + +err_scratch: + i915_gem_object_put(scratch); + return err; +} + +int i915_gem_execbuffer_live_selftests(struct drm_i915_private *i915) +{ + static const struct i915_subtest tests[] = { + SUBTEST(igt_gpu_reloc), + }; + + if (intel_gt_is_wedged(&i915->gt)) + return 0; + + return i915_live_subtests(tests, i915); +} diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h index 0a953bfc0585..5dd5d81646c4 100644 --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h +++ 
b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h @@ -37,6 +37,7 @@ selftest(gem, i915_gem_live_selftests) selftest(evict, i915_gem_evict_live_selftests) selftest(hugepages, i915_gem_huge_page_live_selftests) selftest(gem_contexts, i915_gem_context_live_selftests) +selftest(gem_execbuf, i915_gem_execbuffer_live_selftests) selftest(blt, i915_gem_object_blt_live_selftests) selftest(client, i915_gem_client_blt_live_selftests) selftest(reset, intel_reset_live_selftests)

From patchwork Mon May 4 04:48:45 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524981 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:45 +0100 Message-Id: <20200504044903.7626-4-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 04/22] drm/i915/gt: Small tidy of gen8+ breadcrumb emission Cc: Chris Wilson Use a local to shrink a line under 80 columns, and refactor the common ggtt-write into an emit_xcs_breadcrumb() wrapper.
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 34 +++++++++++++---------------- 1 file changed, 15 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index d4ef344657b0..c00366387b54 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -4641,8 +4641,7 @@ static u32 *emit_preempt_busywait(struct i915_request *request, u32 *cs) } static __always_inline u32* -gen8_emit_fini_breadcrumb_footer(struct i915_request *request, - u32 *cs) +gen8_emit_fini_breadcrumb_tail(struct i915_request *request, u32 *cs) { *cs++ = MI_USER_INTERRUPT; @@ -4656,14 +4655,16 @@ gen8_emit_fini_breadcrumb_footer(struct i915_request *request, return gen8_emit_wa_tail(request, cs); } -static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs) +static u32 *emit_xcs_breadcrumb(struct i915_request *request, u32 *cs) { - cs = gen8_emit_ggtt_write(cs, - request->fence.seqno, - i915_request_active_timeline(request)->hwsp_offset, - 0); + u32 addr = i915_request_active_timeline(request)->hwsp_offset; - return gen8_emit_fini_breadcrumb_footer(request, cs); + return gen8_emit_ggtt_write(cs, request->fence.seqno, addr, 0); +} + +static u32 *gen8_emit_fini_breadcrumb(struct i915_request *rq, u32 *cs) +{ + return gen8_emit_fini_breadcrumb_tail(rq, emit_xcs_breadcrumb(rq, cs)); } static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs) @@ -4681,7 +4682,7 @@ static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs) PIPE_CONTROL_FLUSH_ENABLE | PIPE_CONTROL_CS_STALL); - return gen8_emit_fini_breadcrumb_footer(request, cs); + return gen8_emit_fini_breadcrumb_tail(request, cs); } static u32 * @@ -4697,7 +4698,7 @@ gen11_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs) PIPE_CONTROL_DC_FLUSH_ENABLE | PIPE_CONTROL_FLUSH_ENABLE); - return gen8_emit_fini_breadcrumb_footer(request, cs); + return gen8_emit_fini_breadcrumb_tail(request, cs); } /* @@ -4735,7 +4736,7 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request *request, u32 *cs) } static __always_inline u32* -gen12_emit_fini_breadcrumb_footer(struct i915_request *request, u32 *cs) +gen12_emit_fini_breadcrumb_tail(struct i915_request *request, u32 *cs) { *cs++ = MI_USER_INTERRUPT; @@ -4749,14 +4750,9 @@ gen12_emit_fini_breadcrumb_footer(struct i915_request *request, u32 *cs) return gen8_emit_wa_tail(request, cs); } -static u32 *gen12_emit_fini_breadcrumb(struct i915_request *request, u32 *cs) +static u32 *gen12_emit_fini_breadcrumb(struct i915_request *rq, u32 *cs) { - cs = gen8_emit_ggtt_write(cs, - request->fence.seqno, - i915_request_active_timeline(request)->hwsp_offset, - 0); - - return gen12_emit_fini_breadcrumb_footer(request, cs); + return gen12_emit_fini_breadcrumb_tail(rq, emit_xcs_breadcrumb(rq, cs)); } static u32 * @@ -4775,7 +4771,7 @@ gen12_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs) PIPE_CONTROL_FLUSH_ENABLE | PIPE_CONTROL_HDC_PIPELINE_FLUSH); - return gen12_emit_fini_breadcrumb_footer(request, cs); + return gen12_emit_fini_breadcrumb_tail(request, cs); } static void execlists_park(struct intel_engine_cs *engine)

From patchwork Mon May 4 04:48:46 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11525003 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:46 +0100 Message-Id: <20200504044903.7626-5-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 05/22] drm/i915: Mark concurrent submissions with a weak-dependency Cc: Chris Wilson , stable@vger.kernel.org We recorded the dependencies for WAIT_FOR_SUBMIT in order that we could correctly perform priority inheritance from the parallel branches to the common trunk. However, for the purpose of timeslicing and reset handling, the dependency is weak -- the pair of requests are allowed to run in parallel rather than in strict succession, so, for example, we do not need to suspend one if the other hangs. The real significance though is that this allows us to rearrange groups of WAIT_FOR_SUBMIT linked requests along the single engine, and so can resolve user level inter-batch scheduling dependencies from user semaphores.
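Concretely, each scheduler walk over a request's waiters (deferral on timeslice, hold/unhold on reset) now skips the weak edges, so a parallel branch keeps running instead of being dragged along with its partner. A condensed sketch of the pattern the hunks below repeat (the waiters_list iteration is paraphrased from the surrounding code, not shown in the hunks themselves):

    list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
            struct i915_request *w =
                    container_of(p->waiter, typeof(*w), sched);

            if (p->flags & I915_DEPENDENCY_WEAK)
                    continue; /* parallel submit; not ordered after rq */

            if (w->engine != rq->engine)
                    continue; /* leave semaphores spinning elsewhere */

            /* ...defer/hold w together with rq... */
    }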
Fixes: c81471f5e95c ("drm/i915: Copy across scheduler behaviour flags across submit fences") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: # v5.6+ --- drivers/gpu/drm/i915/gt/intel_lrc.c | 9 +++++++++ drivers/gpu/drm/i915/i915_request.c | 8 ++++++-- drivers/gpu/drm/i915/i915_scheduler.c | 4 +++- drivers/gpu/drm/i915/i915_scheduler.h | 3 ++- drivers/gpu/drm/i915/i915_scheduler_types.h | 1 + 5 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index c00366387b54..508661cf61d9 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1883,6 +1883,9 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); + if (p->flags & I915_DEPENDENCY_WEAK) + continue; + /* Leave semaphores spinning on the other engines */ if (w->engine != rq->engine) continue; @@ -2729,6 +2732,9 @@ static void __execlists_hold(struct i915_request *rq) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); + if (p->flags & I915_DEPENDENCY_WEAK) + continue; + /* Leave semaphores spinning on the other engines */ if (w->engine != rq->engine) continue; @@ -2853,6 +2859,9 @@ static void __execlists_unhold(struct i915_request *rq) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); + if (p->flags & I915_DEPENDENCY_WEAK) + continue; + /* Propagate any change in error status */ if (rq->fence.error) i915_request_set_error_once(w, rq->fence.error); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 22635bbabf06..95edc5523a01 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1038,7 +1038,9 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from) return 0; if (to->engine->schedule) { - ret = i915_sched_node_add_dependency(&to->sched, &from->sched); + ret = i915_sched_node_add_dependency(&to->sched, + &from->sched, + 0); if (ret < 0) return ret; } @@ -1200,7 +1202,9 @@ __i915_request_await_execution(struct i915_request *to, /* Couple the dependency tree for PI on this exposed to->fence */ if (to->engine->schedule) { - err = i915_sched_node_add_dependency(&to->sched, &from->sched); + err = i915_sched_node_add_dependency(&to->sched, + &from->sched, + I915_DEPENDENCY_WEAK); if (err < 0) return err; } diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 37cfcf5b321b..5f4c1e49e974 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -462,7 +462,8 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node, } int i915_sched_node_add_dependency(struct i915_sched_node *node, - struct i915_sched_node *signal) + struct i915_sched_node *signal, + unsigned long flags) { struct i915_dependency *dep; @@ -473,6 +474,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node, local_bh_disable(); if (!__i915_sched_node_add_dependency(node, signal, dep, + flags | I915_DEPENDENCY_EXTERNAL | I915_DEPENDENCY_ALLOC)) i915_dependency_free(dep); diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index d1dc4efef77b..6f0bf00fc569 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -34,7 +34,8 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node, unsigned long flags); int 
i915_sched_node_add_dependency(struct i915_sched_node *node, - struct i915_sched_node *signal); + struct i915_sched_node *signal, + unsigned long flags); void i915_sched_node_fini(struct i915_sched_node *node); diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h index d18e70550054..7186875088a0 100644 --- a/drivers/gpu/drm/i915/i915_scheduler_types.h +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h @@ -78,6 +78,7 @@ struct i915_dependency { unsigned long flags; #define I915_DEPENDENCY_ALLOC BIT(0) #define I915_DEPENDENCY_EXTERNAL BIT(1) +#define I915_DEPENDENCY_WEAK BIT(2) }; #endif /* _I915_SCHEDULER_TYPES_H_ */

From patchwork Mon May 4 04:48:47 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524991 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:47 +0100 Message-Id: <20200504044903.7626-6-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 06/22] drm/i915/selftests: Repeat the rps clock frequency measurement Cc: Chris Wilson Repeat the measurement of the clock frequency a few times and use the median to try to reduce the systematic measurement error.
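A note on the sort-and-median step, since a subtle C pitfall lurks here: a qsort-style comparator receives pointers to the elements, so it must compare *a and *b, and with five samples the middle three are then blended as a weighted median that discards the two extremes. A minimal standalone sketch of the selection (same technique as the diff below):

    /* Compare the pointed-to values, never the pointers themselves. */
    static int cmp_u32(const void *A, const void *B)
    {
            const u32 *a = A, *b = B;

            return *a < *b ? -1 : *a > *b;
    }

    static u32 blend_of_5(u32 s[5])
    {
            sort(s, 5, sizeof(*s), cmp_u32, NULL);
            return (s[1] + 2 * s[2] + s[3]) / 4; /* drop the two extremes */
    }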
Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_rps.c | 54 +++++++++++++++++++------- 1 file changed, 40 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c index b89a7d7611f6..bfa1a15564f7 100644 --- a/drivers/gpu/drm/i915/gt/selftest_rps.c +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c @@ -56,6 +56,18 @@ static int cmp_u64(const void *A, const void *B) return 0; } +static int cmp_u32(const void *A, const void *B) +{ + const u32 *a = A, *b = B; + + if (*a < *b) + return -1; + else if (*a > *b) + return 1; + else + return 0; +} + static struct i915_vma * create_spin_counter(struct intel_engine_cs *engine, struct i915_address_space *vm, @@ -236,8 +248,8 @@ int live_rps_clock_interval(void *arg) for_each_engine(engine, gt, id) { unsigned long saved_heartbeat; struct i915_request *rq; - ktime_t dt; u32 cycles; + u64 dt; if (!intel_engine_can_store_dword(engine)) continue; @@ -286,15 +298,29 @@ int live_rps_clock_interval(void *arg) engine->name); err = -ENODEV; } else { - preempt_disable(); - dt = ktime_get(); - cycles = -intel_uncore_read_fw(gt->uncore, - GEN6_RP_CUR_UP_EI); - udelay(1000); - dt = ktime_sub(ktime_get(), dt); - cycles += intel_uncore_read_fw(gt->uncore, - GEN6_RP_CUR_UP_EI); - preempt_enable(); + ktime_t dt_[5]; + u32 cycles_[5]; + int i; + + for (i = 0; i < 5; i++) { + preempt_disable(); + + dt_[i] = ktime_get(); + cycles_[i] = -intel_uncore_read_fw(gt->uncore, GEN6_RP_CUR_UP_EI); + + udelay(1000); + + dt_[i] = ktime_sub(ktime_get(), dt_[i]); + cycles_[i] += intel_uncore_read_fw(gt->uncore, GEN6_RP_CUR_UP_EI); + + preempt_enable(); + } + + /* Use the median of both cycle/dt; close enough */ + sort(cycles_, 5, sizeof(*cycles_), cmp_u32, NULL); + cycles = (cycles_[1] + 2 * cycles_[2] + cycles_[3]) / 4; + sort(dt_, 5, sizeof(*dt_), cmp_u64, NULL); + dt = div_u64(dt_[1] + 2 * dt_[2] + dt_[3], 4); } intel_uncore_write_fw(gt->uncore, GEN6_RP_CONTROL, 0); @@ -306,14 +332,14 @@ int live_rps_clock_interval(void *arg) if (err == 0) { u64 time = intel_gt_pm_interval_to_ns(gt, cycles); u32 expected = - intel_gt_ns_to_pm_interval(gt, ktime_to_ns(dt)); + intel_gt_ns_to_pm_interval(gt, dt); pr_info("%s: rps counted %d C0 cycles [%lldns] in %lldns [%d cycles], using GT clock frequency of %uKHz\n", - engine->name, cycles, time, ktime_to_ns(dt), expected, + engine->name, cycles, time, dt, expected, gt->clock_frequency / 1000); - if (10 * time < 8 * ktime_to_ns(dt) || - 8 * time > 10 * ktime_to_ns(dt)) { + if (10 * time < 8 * dt || + 8 * time > 10 * dt) { pr_err("%s: rps clock time does not match walltime!\n", engine->name); err = -EINVAL;

From patchwork Mon May 4 04:48:48 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524977
From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:48 +0100 Message-Id: <20200504044903.7626-7-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 07/22] drm/i915/gt: Stop holding onto the pinned_default_state Cc: Chris Wilson As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is, we only need our bare protocontext. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Mika Kuoppala Cc: Andi Shyti Reviewed-by: Andi Shyti --- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 14 +----- drivers/gpu/drm/i915/gt/intel_engine_types.h | 1 - drivers/gpu/drm/i915/gt/intel_lrc.c | 14 ++---- drivers/gpu/drm/i915/gt/selftest_context.c | 11 ++-- drivers/gpu/drm/i915/gt/selftest_lrc.c | 53 +++++++++++++++----- 5 files changed, 47 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 811debefebc0..d0a1078ef632 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -21,18 +21,11 @@ static int __engine_unpark(struct intel_wakeref *wf) struct intel_engine_cs *engine = container_of(wf, typeof(*engine), wakeref); struct intel_context *ce; - void *map; ENGINE_TRACE(engine, "\n"); intel_gt_pm_get(engine->gt); - /* Pin the default state for fast resets from atomic context.
*/ - map = NULL; - if (engine->default_state) - map = shmem_pin_map(engine->default_state); - engine->pinned_default_state = map; - /* Discard stale context state from across idling */ ce = engine->kernel_context; if (ce) { @@ -42,6 +35,7 @@ static int __engine_unpark(struct intel_wakeref *wf) if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) && ce->state) { struct drm_i915_gem_object *obj = ce->state->obj; int type = i915_coherent_map_type(engine->i915); + void *map; map = i915_gem_object_pin_map(obj, type); if (!IS_ERR(map)) { @@ -260,12 +254,6 @@ static int __engine_park(struct intel_wakeref *wf) if (engine->park) engine->park(engine); - if (engine->pinned_default_state) { - shmem_unpin_map(engine->default_state, - engine->pinned_default_state); - engine->pinned_default_state = NULL; - } - engine->execlists.no_priolist = false; /* While gt calls i915_vma_parked(), we have to break the lock cycle */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 6c676774dcd9..c84525363bb7 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -339,7 +339,6 @@ struct intel_engine_cs { unsigned long wakeref_serial; struct intel_wakeref wakeref; struct file *default_state; - void *pinned_default_state; struct { struct intel_ring *ring; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 508661cf61d9..66dd5e09a123 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1271,14 +1271,11 @@ execlists_check_context(const struct intel_context *ce, static void restore_default_state(struct intel_context *ce, struct intel_engine_cs *engine) { - u32 *regs = ce->lrc_reg_state; + u32 *regs; - if (engine->pinned_default_state) - memcpy(regs, /* skip restoring the vanilla PPHWSP */ - engine->pinned_default_state + LRC_STATE_OFFSET, - engine->context_size - PAGE_SIZE); + regs = memset(ce->lrc_reg_state, 0, engine->context_size - PAGE_SIZE); + execlists_init_reg_state(regs, ce, engine, ce->ring, true); - execlists_init_reg_state(regs, ce, engine, ce->ring, false); ce->runtime.last = intel_context_get_runtime(ce); } @@ -4175,8 +4172,6 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) * image back to the expected values to skip over the guilty request. */ __i915_request_reset(rq, stalled); - if (!stalled) - goto out_replay; /* * We want a simple context + ring to execute the breadcrumb update. @@ -4186,9 +4181,6 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) * future request will be after userspace has had the opportunity * to recreate its own state. */ - GEM_BUG_ON(!intel_context_is_pinned(ce)); - restore_default_state(ce, engine); - out_replay: ENGINE_TRACE(engine, "replay {head:%04x, tail:%04x}\n", head, ce->ring->tail); diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c index b8ed3cbe1277..a56dff3b157a 100644 --- a/drivers/gpu/drm/i915/gt/selftest_context.c +++ b/drivers/gpu/drm/i915/gt/selftest_context.c @@ -154,10 +154,7 @@ static int live_context_size(void *arg) */ for_each_engine(engine, gt, id) { - struct { - struct file *state; - void *pinned; - } saved; + struct file *saved; if (!engine->context_size) continue; @@ -171,8 +168,7 @@ static int live_context_size(void *arg) * active state is sufficient, we are only checking that we * don't use more than we planned. 
*/ - saved.state = fetch_and_zero(&engine->default_state); - saved.pinned = fetch_and_zero(&engine->pinned_default_state); + saved = fetch_and_zero(&engine->default_state); /* Overlaps with the execlists redzone */ engine->context_size += I915_GTT_PAGE_SIZE; @@ -181,8 +177,7 @@ static int live_context_size(void *arg) engine->context_size -= I915_GTT_PAGE_SIZE; - engine->pinned_default_state = saved.pinned; - engine->default_state = saved.state; + engine->default_state = saved; intel_engine_pm_put(engine); diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 7529df92f6a2..d946c0521dbc 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -5206,6 +5206,7 @@ store_context(struct intel_context *ce, struct i915_vma *scratch) { struct i915_vma *batch; u32 dw, x, *cs, *hw; + u32 *defaults; batch = create_user_vma(ce->vm, SZ_64K); if (IS_ERR(batch)) @@ -5217,9 +5218,16 @@ store_context(struct intel_context *ce, struct i915_vma *scratch) return ERR_CAST(cs); } + defaults = shmem_pin_map(ce->engine->default_state); + if (!defaults) { + i915_gem_object_unpin_map(batch->obj); + i915_vma_put(batch); + return ERR_PTR(-ENOMEM); + } + x = 0; dw = 0; - hw = ce->engine->pinned_default_state; + hw = defaults; hw += LRC_STATE_OFFSET / sizeof(*hw); do { u32 len = hw[dw] & 0x7f; @@ -5250,6 +5258,8 @@ store_context(struct intel_context *ce, struct i915_vma *scratch) *cs++ = MI_BATCH_BUFFER_END; + shmem_unpin_map(ce->engine->default_state, defaults); + i915_gem_object_flush_map(batch->obj); i915_gem_object_unpin_map(batch->obj); @@ -5360,6 +5370,7 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison) { struct i915_vma *batch; u32 dw, *cs, *hw; + u32 *defaults; batch = create_user_vma(ce->vm, SZ_64K); if (IS_ERR(batch)) @@ -5371,8 +5382,15 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison) return ERR_CAST(cs); } + defaults = shmem_pin_map(ce->engine->default_state); + if (!defaults) { + i915_gem_object_unpin_map(batch->obj); + i915_vma_put(batch); + return ERR_PTR(-ENOMEM); + } + dw = 0; - hw = ce->engine->pinned_default_state; + hw = defaults; hw += LRC_STATE_OFFSET / sizeof(*hw); do { u32 len = hw[dw] & 0x7f; @@ -5400,6 +5418,8 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison) *cs++ = MI_BATCH_BUFFER_END; + shmem_unpin_map(ce->engine->default_state, defaults); + i915_gem_object_flush_map(batch->obj); i915_gem_object_unpin_map(batch->obj); @@ -5467,6 +5487,7 @@ static int compare_isolation(struct intel_engine_cs *engine, { u32 x, dw, *hw, *lrc; u32 *A[2], *B[2]; + u32 *defaults; int err = 0; A[0] = i915_gem_object_pin_map(ref[0]->obj, I915_MAP_WC); @@ -5499,9 +5520,15 @@ static int compare_isolation(struct intel_engine_cs *engine, } lrc += LRC_STATE_OFFSET / sizeof(*hw); + defaults = shmem_pin_map(ce->engine->default_state); + if (!defaults) { + err = -ENOMEM; + goto err_lrc; + } + x = 0; dw = 0; - hw = engine->pinned_default_state; + hw = defaults; hw += LRC_STATE_OFFSET / sizeof(*hw); do { u32 len = hw[dw] & 0x7f; @@ -5541,6 +5568,8 @@ static int compare_isolation(struct intel_engine_cs *engine, } while (dw < PAGE_SIZE / sizeof(u32) && (hw[dw] & ~BIT(0)) != MI_BATCH_BUFFER_END); + shmem_unpin_map(ce->engine->default_state, defaults); +err_lrc: i915_gem_object_unpin_map(ce->state->obj); err_B1: i915_gem_object_unpin_map(result[1]->obj); @@ -5690,18 +5719,16 @@ static int live_lrc_isolation(void *arg) continue; intel_engine_pm_get(engine); - if 
(engine->pinned_default_state) { - for (i = 0; i < ARRAY_SIZE(poison); i++) { - int result; + for (i = 0; i < ARRAY_SIZE(poison); i++) { + int result; - result = __lrc_isolation(engine, poison[i]); - if (result && !err) - err = result; + result = __lrc_isolation(engine, poison[i]); + if (result && !err) + err = result; - result = __lrc_isolation(engine, ~poison[i]); - if (result && !err) - err = result; - } + result = __lrc_isolation(engine, ~poison[i]); + if (result && !err) + err = result; } intel_engine_pm_put(engine); if (igt_flush_test(gt->i915)) {

From patchwork Mon May 4 04:48:49 2020 X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11525001 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:49 +0100 Message-Id: <20200504044903.7626-8-chris@chris-wilson.co.uk> In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 08/22] dma-buf: Proxy fence, an unsignaled fence placeholder Cc: Chris Wilson Often we need to create a fence for a future event that has not yet been associated with a fence. We can store a proxy fence, a placeholder, in the timeline and replace it later when the real fence is known. Any listeners that attach to the proxy fence will automatically be signaled when the real fence completes, and any future listeners will instead be attached directly to the real fence, avoiding any indirection overhead.
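For a sense of the intended flow: a driver can publish a fence slot before the real fence exists, then swap the real fence in later, at which point everyone who attached to the placeholder is signalled through it. A minimal usage sketch against the two entry points added below (the rcu-protected 'slot' and the 'real' fence are hypothetical caller state; locking and error paths elided):

    struct dma_fence __rcu *slot;
    struct dma_fence *proxy, *old;

    proxy = dma_fence_create_proxy();   /* unsignaled placeholder */
    if (!proxy)
            return -ENOMEM;
    rcu_assign_pointer(slot, proxy);

    /* ... later, once the real fence is finally known ... */
    old = dma_fence_replace_proxy(&slot, real);
    dma_fence_put(old);                 /* release the displaced proxy */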
Signed-off-by: Chris Wilson Cc: Lionel Landwerlin --- drivers/dma-buf/Makefile | 13 +- drivers/dma-buf/dma-fence-private.h | 20 + drivers/dma-buf/dma-fence-proxy.c | 255 ++++++++++ drivers/dma-buf/dma-fence.c | 4 +- drivers/dma-buf/selftests.h | 1 + drivers/dma-buf/st-dma-fence-proxy.c | 699 +++++++++++++++++++++++++++ include/linux/dma-fence-proxy.h | 34 ++ 7 files changed, 1022 insertions(+), 4 deletions(-) create mode 100644 drivers/dma-buf/dma-fence-private.h create mode 100644 drivers/dma-buf/dma-fence-proxy.c create mode 100644 drivers/dma-buf/st-dma-fence-proxy.c create mode 100644 include/linux/dma-fence-proxy.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 995e05f609ff..afaf6dadd9a3 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,6 +1,12 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ - dma-resv.o seqno-fence.o +obj-y := \ + dma-buf.o \ + dma-fence.o \ + dma-fence-array.o \ + dma-fence-chain.o \ + dma-fence-proxy.o \ + dma-resv.o \ + seqno-fence.o obj-$(CONFIG_DMABUF_HEAPS) += dma-heap.o obj-$(CONFIG_DMABUF_HEAPS) += heaps/ obj-$(CONFIG_SYNC_FILE) += sync_file.o @@ -10,6 +16,7 @@ obj-$(CONFIG_UDMABUF) += udmabuf.o dmabuf_selftests-y := \ selftest.o \ st-dma-fence.o \ - st-dma-fence-chain.o + st-dma-fence-chain.o \ + st-dma-fence-proxy.o obj-$(CONFIG_DMABUF_SELFTESTS) += dmabuf_selftests.o diff --git a/drivers/dma-buf/dma-fence-private.h b/drivers/dma-buf/dma-fence-private.h new file mode 100644 index 000000000000..6924d28af0fa --- /dev/null +++ b/drivers/dma-buf/dma-fence-private.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Fence mechanism for dma-buf and to allow for asynchronous dma access + * + * Copyright (C) 2012 Canonical Ltd + * Copyright (C) 2012 Texas Instruments + * + * Authors: + * Rob Clark + * Maarten Lankhorst + */ + +#ifndef DMA_FENCE_PRIVATE_H +#define DMA_FENCE_PRIVATE_H + +struct dma_fence; + +bool __dma_fence_enable_signaling(struct dma_fence *fence); + +#endif /* DMA_FENCE_PRIVATE_H */ diff --git a/drivers/dma-buf/dma-fence-proxy.c b/drivers/dma-buf/dma-fence-proxy.c new file mode 100644 index 000000000000..d6cde1f30ef4 --- /dev/null +++ b/drivers/dma-buf/dma-fence-proxy.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * dma-fence-proxy: placeholder unsignaled fence + * + * Copyright (C) 2017-2019 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include "dma-fence-private.h" + +struct dma_fence_proxy { + struct dma_fence base; + + struct dma_fence *real; + struct dma_fence_cb cb; + struct irq_work work; + + wait_queue_head_t wq; +}; + +#ifdef CONFIG_DEBUG_LOCK_ALLOC +#define same_lockclass(A, B) (A)->dep_map.key == (B)->dep_map.key +#else +#define same_lockclass(A, B) 0 +#endif + +static const char *proxy_get_driver_name(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + struct dma_fence *real = READ_ONCE(p->real); + + return real ? real->ops->get_driver_name(real) : "proxy"; +} + +static const char *proxy_get_timeline_name(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + struct dma_fence *real = READ_ONCE(p->real); + + return real ? 
real->ops->get_timeline_name(real) : "unset"; +} + +static void proxy_irq_work(struct irq_work *work) +{ + struct dma_fence_proxy *p = container_of(work, typeof(*p), work); + + dma_fence_signal(&p->base); + dma_fence_put(&p->base); +} + +static void proxy_callback(struct dma_fence *real, struct dma_fence_cb *cb) +{ + struct dma_fence_proxy *p = container_of(cb, typeof(*p), cb); + + if (real->error) + dma_fence_set_error(&p->base, real->error); + + /* Lower the height of the proxy chain -> single stack frame */ + irq_work_queue(&p->work); +} + +static bool proxy_enable_signaling(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + struct dma_fence *real = READ_ONCE(p->real); + bool ret = true; + + if (real) { + spin_lock_nested(real->lock, + same_lockclass(&p->wq.lock, real->lock)); + ret = __dma_fence_enable_signaling(real); + spin_unlock(real->lock); + } + + return ret; +} + +static void proxy_release(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + + dma_fence_put(p->real); + dma_fence_free(&p->base); +} + +const struct dma_fence_ops dma_fence_proxy_ops = { + .get_driver_name = proxy_get_driver_name, + .get_timeline_name = proxy_get_timeline_name, + .enable_signaling = proxy_enable_signaling, + .wait = dma_fence_default_wait, + .release = proxy_release, +}; +EXPORT_SYMBOL_GPL(dma_fence_proxy_ops); + +/** + * dma_fence_create_proxy - Create an unset dma-fence + * + * dma_fence_create_proxy() creates a new dma_fence stub that is initially + * unsignaled and may later be replaced with a real fence. Any listeners + * to the proxy fence will be signaled when the target fence signals its + * completion. + */ +struct dma_fence *dma_fence_create_proxy(void) +{ + struct dma_fence_proxy *p; + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (!p) + return NULL; + + init_waitqueue_head(&p->wq); + dma_fence_init(&p->base, &dma_fence_proxy_ops, &p->wq.lock, + dma_fence_context_alloc(1), 0); + init_irq_work(&p->work, proxy_irq_work); + + return &p->base; +} +EXPORT_SYMBOL(dma_fence_create_proxy); + +static void wrap_signal_locked(struct dma_fence *fence, struct dma_fence *real) +{ + if (real->error) + dma_fence_set_error(fence, real->error); + dma_fence_signal_locked(fence); +} + +static void __wake_up_listeners(struct dma_fence_proxy *p) +{ + struct wait_queue_entry *wait, *next; + + list_for_each_entry_safe(wait, next, &p->wq.head, entry) { + INIT_LIST_HEAD(&wait->entry); + wait->func(wait, TASK_NORMAL, 0, p->real); + } +} + +static void proxy_assign(struct dma_fence *fence, struct dma_fence *real) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + unsigned long flags; + + if (WARN_ON(fence == real)) + return; + + if (WARN_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))) + return; + + if (WARN_ON(p->real)) + return; + + spin_lock_irqsave(&p->wq.lock, flags); + + if (unlikely(!real)) { + dma_fence_signal_locked(&p->base); + goto unlock; + } + + p->real = dma_fence_get(real); + + spin_lock_nested(real->lock, same_lockclass(&p->wq.lock, real->lock)); + if (dma_fence_is_signaled(real)) { + wrap_signal_locked(&p->base, real); + } else if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, + &p->base.flags) && + !__dma_fence_enable_signaling(real)) { + wrap_signal_locked(&p->base, real); + } else { + dma_fence_get(&p->base); + p->cb.func = proxy_callback; + list_add_tail(&p->cb.node, &real->cb_list); + } + spin_unlock(real->lock); + +unlock: + __wake_up_listeners(p); + 
spin_unlock_irqrestore(&p->wq.lock, flags); +} + +/** + * dma_fence_replace_proxy - Replace the proxy fence with the real target + * @slot: pointer to location of fence to update + * @fence: the new fence to store in @slot + * + * Once the real dma_fence is known, we can replace the proxy fence holder + * with a pointer to the real dma fence. Future listeners will attach to + * the real fence, avoiding any indirection overhead. Previous listeners + * will remain attached to the proxy fence, and be signaled in turn when + * the target fence completes. + */ +struct dma_fence * +dma_fence_replace_proxy(struct dma_fence __rcu **slot, struct dma_fence *fence) +{ + struct dma_fence *old; + + if (fence) + dma_fence_get(fence); + + old = rcu_replace_pointer(*slot, fence, true); + if (old && dma_fence_is_proxy(old)) + proxy_assign(old, fence); + + return old; +} +EXPORT_SYMBOL(dma_fence_replace_proxy); + +void dma_fence_add_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait) +{ + if (dma_fence_is_proxy(fence)) { + struct dma_fence_proxy *p = + container_of(fence, typeof(*p), base); + unsigned long flags; + + spin_lock_irqsave(&p->wq.lock, flags); + if (!p->real) { + list_add_tail(&wait->entry, &p->wq.head); + wait = NULL; + } + fence = p->real; + spin_unlock_irqrestore(&p->wq.lock, flags); + } + + if (wait) { + INIT_LIST_HEAD(&wait->entry); + wait->func(wait, TASK_NORMAL, 0, fence); + } +} +EXPORT_SYMBOL(dma_fence_add_proxy_listener); + +bool dma_fence_remove_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait) +{ + bool ret = false; + + if (dma_fence_is_proxy(fence)) { + struct dma_fence_proxy *p = + container_of(fence, typeof(*p), base); + unsigned long flags; + + spin_lock_irqsave(&p->wq.lock, flags); + if (!list_empty(&wait->entry)) { + list_del_init(&wait->entry); + ret = true; + } + spin_unlock_irqrestore(&p->wq.lock, flags); + } + + return ret; +} +EXPORT_SYMBOL(dma_fence_remove_proxy_listener); diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 052a41e2451c..fa7bedc6703d 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -19,6 +19,8 @@ #define CREATE_TRACE_POINTS #include +#include "dma-fence-private.h" + EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled); @@ -273,7 +275,7 @@ void dma_fence_free(struct dma_fence *fence) } EXPORT_SYMBOL(dma_fence_free); -static bool __dma_fence_enable_signaling(struct dma_fence *fence) +bool __dma_fence_enable_signaling(struct dma_fence *fence) { bool was_set; diff --git a/drivers/dma-buf/selftests.h b/drivers/dma-buf/selftests.h index 55918ef9adab..616eca70e2d8 100644 --- a/drivers/dma-buf/selftests.h +++ b/drivers/dma-buf/selftests.h @@ -12,3 +12,4 @@ selftest(sanitycheck, __sanitycheck__) /* keep first (igt selfcheck) */ selftest(dma_fence, dma_fence) selftest(dma_fence_chain, dma_fence_chain) +selftest(dma_fence_proxy, dma_fence_proxy) diff --git a/drivers/dma-buf/st-dma-fence-proxy.c b/drivers/dma-buf/st-dma-fence-proxy.c new file mode 100644 index 000000000000..c95811199c16 --- /dev/null +++ b/drivers/dma-buf/st-dma-fence-proxy.c @@ -0,0 +1,699 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "selftest.h" + +static struct kmem_cache *slab_fences; + +static struct mock_fence { + struct dma_fence base; + spinlock_t lock; +} *to_mock_fence(struct dma_fence *f) 
{ + return container_of(f, struct mock_fence, base); +} + +static const char *mock_name(struct dma_fence *f) +{ + return "mock"; +} + +static void mock_fence_release(struct dma_fence *f) +{ + kmem_cache_free(slab_fences, to_mock_fence(f)); +} + +static const struct dma_fence_ops mock_ops = { + .get_driver_name = mock_name, + .get_timeline_name = mock_name, + .release = mock_fence_release, +}; + +static struct dma_fence *mock_fence(void) +{ + struct mock_fence *f; + + f = kmem_cache_alloc(slab_fences, GFP_KERNEL); + if (!f) + return NULL; + + spin_lock_init(&f->lock); + dma_fence_init(&f->base, &mock_ops, &f->lock, 0, 0); + + return &f->base; +} + +static int sanitycheck(void *arg) +{ + struct dma_fence *f; + + f = dma_fence_create_proxy(); + if (!f) + return -ENOMEM; + + dma_fence_signal(f); + dma_fence_put(f); + + return 0; +} + +struct fences { + struct dma_fence *real; + struct dma_fence *proxy; + struct dma_fence __rcu *slot; +}; + +static int create_fences(struct fences *f, bool attach) +{ + f->proxy = dma_fence_create_proxy(); + if (!f->proxy) + return -ENOMEM; + + RCU_INIT_POINTER(f->slot, f->proxy); + + f->real = mock_fence(); + if (!f->real) { + dma_fence_put(f->proxy); + return -ENOMEM; + } + + if (attach) + dma_fence_replace_proxy(&f->slot, f->real); + + return 0; +} + +static void free_fences(struct fences *f) +{ + dma_fence_put(dma_fence_replace_proxy(&f->slot, NULL)); + dma_fence_put(f->real); + dma_fence_put(f->proxy); +} + +static int wrap_signaling(void *arg) +{ + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_is_signaled(f.proxy)) { + pr_err("Fence unexpectedly signaled on creation\n"); + goto err_free; + } + + if (dma_fence_signal(f.real)) { + pr_err("Fence reported being already signaled\n"); + goto err_free; + } + + if (!dma_fence_is_signaled(f.proxy)) { + pr_err("Fence not reporting signaled\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_signaling_recurse(void *arg) +{ + struct fences f; + struct dma_fence *chain; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + chain = dma_fence_create_proxy(); + if (!chain) { + err = -ENOMEM; + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, chain); + dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real)); + dma_fence_put(chain); + + /* f.real <- chain <- f.proxy */ + + if (dma_fence_is_signaled(f.proxy)) { + pr_err("Fence unexpectedly signaled on creation\n"); + goto err_free; + } + + if (dma_fence_signal(f.real)) { + pr_err("Fence reported being already signaled\n"); + goto err_free; + } + + if (!dma_fence_is_signaled(f.proxy)) { + pr_err("Fence not reporting signaled\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +struct simple_cb { + struct dma_fence_cb cb; + bool seen; +}; + +static void simple_callback(struct dma_fence *f, struct dma_fence_cb *cb) +{ + smp_store_mb(container_of(cb, struct simple_cb, cb)->seen, true); +} + +static int wrap_add_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_add_callback_recurse(void 
*arg) +{ + struct simple_cb cb = {}; + struct dma_fence *chain; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + chain = dma_fence_create_proxy(); + if (!chain) { + err = -ENOMEM; + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, chain); + dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real)); + dma_fence_put(chain); + + /* f.real <- chain <- f.proxy */ + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_late_add_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + dma_fence_signal(f.real); + + if (!dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Added callback, but fence was already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (cb.seen) { + pr_err("Callback called after failed attachment!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_early_add_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_early_add_callback_late(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_signal(f.real); + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_early_add_callback_early(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_rm_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + if (!dma_fence_remove_callback(f.proxy, &cb.cb)) { + pr_err("Failed to remove callback!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (cb.seen) { + pr_err("Callback still signaled after removal!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_late_rm_callback(void *arg) 
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	if (dma_fence_remove_callback(f.proxy, &cb.cb)) {
+		pr_err("Callback removal succeeded after being executed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_status(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_get_status(f.proxy)) {
+		pr_err("Fence unexpectedly has signaled status on creation\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!dma_fence_get_status(f.proxy)) {
+		pr_err("Fence not reporting signaled status\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_error(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	dma_fence_set_error(f.real, -EIO);
+
+	if (dma_fence_get_status(f.proxy)) {
+		pr_err("Fence unexpectedly has error status before signal\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (dma_fence_get_status(f.proxy) != -EIO) {
+		pr_err("Fence not reporting error status, got %d\n",
+		       dma_fence_get_status(f.proxy));
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_wait(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_wait_timeout(f.proxy, false, 0) != 0) {
+		pr_err("Wait reported complete before being signaled\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+
+	if (dma_fence_wait_timeout(f.proxy, false, 0) == 0) {
+		pr_err("Wait reported incomplete after being signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+struct wait_timer {
+	struct timer_list timer;
+	struct fences f;
+};
+
+static void wait_timer(struct timer_list *timer)
+{
+	struct wait_timer *wt = from_timer(wt, timer, timer);
+
+	dma_fence_signal(wt->f.real);
+}
+
+static int wrap_wait_timeout(void *arg)
+{
+	struct wait_timer wt;
+	int err = -EINVAL;
+
+	if (create_fences(&wt.f, true))
+		return -ENOMEM;
+
+	timer_setup_on_stack(&wt.timer, wait_timer, 0);
+
+	if (dma_fence_wait_timeout(wt.f.proxy, false, 1) != 0) {
+		pr_err("Wait reported complete before being signaled\n");
+		goto err_free;
+	}
+
+	mod_timer(&wt.timer, jiffies + 1);
+
+	if (dma_fence_wait_timeout(wt.f.proxy, false, 2) != 0) {
+		if (timer_pending(&wt.timer)) {
+			pr_notice("Timer did not fire within the jiffy!\n");
+			err = 0; /* not our fault!
*/ + } else { + pr_err("Wait reported incomplete after timeout\n"); + } + goto err_free; + } + + err = 0; +err_free: + del_timer_sync(&wt.timer); + destroy_timer_on_stack(&wt.timer); + dma_fence_signal(wt.f.real); + free_fences(&wt.f); + return err; +} + +struct proxy_wait { + struct wait_queue_entry base; + struct dma_fence *fence; + bool seen; +}; + +static int proxy_wait_cb(struct wait_queue_entry *entry, + unsigned int mode, int flags, void *key) +{ + struct proxy_wait *p = container_of(entry, typeof(*p), base); + + p->fence = key; + p->seen = true; + + return 0; +} + +static int wrap_listen_early(void *arg) +{ + struct proxy_wait wait = { .base.func = proxy_wait_cb }; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_add_proxy_listener(f.proxy, &wait.base); + + if (!wait.seen) { + pr_err("Proxy listener was not called after replace!\n"); + err = -EINVAL; + goto err_free; + } + + if (wait.fence != f.real) { + pr_err("Proxy listener was not passed the real fence!\n"); + err = -EINVAL; + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +static int wrap_listen_late(void *arg) +{ + struct proxy_wait wait = { .base.func = proxy_wait_cb }; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_add_proxy_listener(f.proxy, &wait.base); + dma_fence_replace_proxy(&f.slot, f.real); + + if (!wait.seen) { + pr_err("Proxy listener was not called on replace!\n"); + err = -EINVAL; + goto err_free; + } + + if (wait.fence != f.real) { + pr_err("Proxy listener was not passed the real fence!\n"); + err = -EINVAL; + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +static int wrap_listen_cancel(void *arg) +{ + struct proxy_wait wait = { .base.func = proxy_wait_cb }; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_add_proxy_listener(f.proxy, &wait.base); + if (!dma_fence_remove_proxy_listener(f.proxy, &wait.base)) { + pr_err("Cancelling listener, already detached?\n"); + err = -EINVAL; + goto err_free; + } + dma_fence_replace_proxy(&f.slot, f.real); + + if (wait.seen) { + pr_err("Proxy listener was called after being removed!\n"); + err = -EINVAL; + goto err_free; + } + + if (dma_fence_remove_proxy_listener(f.proxy, &wait.base)) { + pr_err("Double listener cancellation!\n"); + err = -EINVAL; + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +int dma_fence_proxy(void) +{ + static const struct subtest tests[] = { + SUBTEST(sanitycheck), + SUBTEST(wrap_signaling), + SUBTEST(wrap_signaling_recurse), + SUBTEST(wrap_add_callback), + SUBTEST(wrap_add_callback_recurse), + SUBTEST(wrap_late_add_callback), + SUBTEST(wrap_early_add_callback), + SUBTEST(wrap_early_add_callback_late), + SUBTEST(wrap_early_add_callback_early), + SUBTEST(wrap_rm_callback), + SUBTEST(wrap_late_rm_callback), + SUBTEST(wrap_status), + SUBTEST(wrap_error), + SUBTEST(wrap_wait), + SUBTEST(wrap_wait_timeout), + SUBTEST(wrap_listen_early), + SUBTEST(wrap_listen_late), + SUBTEST(wrap_listen_cancel), + }; + int ret; + + slab_fences = KMEM_CACHE(mock_fence, + SLAB_TYPESAFE_BY_RCU | + SLAB_HWCACHE_ALIGN); + if (!slab_fences) + return -ENOMEM; + + ret = subtests(tests, NULL); + + kmem_cache_destroy(slab_fences); + + return ret; +} diff --git 
a/include/linux/dma-fence-proxy.h b/include/linux/dma-fence-proxy.h new file mode 100644 index 000000000000..063cde6b42c4 --- /dev/null +++ b/include/linux/dma-fence-proxy.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * dma-fence-proxy: allows waiting upon unset and future fences + * + * Copyright (C) 2017 Intel Corporation + */ + +#ifndef __LINUX_DMA_FENCE_PROXY_H +#define __LINUX_DMA_FENCE_PROXY_H + +#include +#include + +struct wait_queue_entry; + +extern const struct dma_fence_ops dma_fence_proxy_ops; + +struct dma_fence *dma_fence_create_proxy(void); + +static inline bool dma_fence_is_proxy(struct dma_fence *fence) +{ + return fence->ops == &dma_fence_proxy_ops; +} + +struct dma_fence * +dma_fence_replace_proxy(struct dma_fence __rcu **slot, + struct dma_fence *fence); + +void dma_fence_add_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait); +bool dma_fence_remove_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait); + +#endif /* __LINUX_DMA_FENCE_PROXY_H */ From patchwork Mon May 4 04:48:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11525009 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E40814C0 for ; Mon, 4 May 2020 04:49:59 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 36BD7206C0 for ; Mon, 4 May 2020 04:49:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 36BD7206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7A3926E30E; Mon, 4 May 2020 04:49:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6955C6E161 for ; Mon, 4 May 2020 04:49:38 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102365-1500050 for multiple; Mon, 04 May 2020 05:49:08 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:50 +0100 Message-Id: <20200504044903.7626-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/22] drm/syncobj: Allow use of dma-fence-proxy X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allow the callers to supply a dma-fence-proxy for asynchronous waiting on future fences. 
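To make the new semantics concrete, a short sketch (names such as
wake_up_waiter() are hypothetical, error handling elided): a waiter that
picked up a placeholder from the syncobj is still signaled once the real
fence is attached.

	/* Waiter side: the slot may still hold a proxy. */
	struct dma_fence *f = drm_syncobj_fence_get(syncobj);
	struct dma_fence_cb cb;

	dma_fence_add_callback(f, &cb, wake_up_waiter);

	/*
	 * Signaler side, inside drm_syncobj_replace_fence(): the old
	 * occupant is returned with its reference. If it was a proxy,
	 * it is chained to the new fence, so wake_up_waiter() still
	 * fires when that fence signals.
	 */
	old_fence = dma_fence_replace_proxy(&syncobj->fence, fence);

Note that dma_fence_replace_proxy() also takes its own reference on the
new fence, which is why the explicit dma_fence_get()/rcu_assign_pointer()
pair disappears from drm_syncobj_replace_fence() in the diff below.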
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/drm_syncobj.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 42d46414f767..e141db0e1eb6 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -184,6 +184,7 @@
  */
 
 #include
+#include
 #include
 #include
 #include
@@ -324,14 +325,9 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 	struct dma_fence *old_fence;
 	struct syncobj_wait_entry *cur, *tmp;
 
-	if (fence)
-		dma_fence_get(fence);
-
 	spin_lock(&syncobj->lock);
 
-	old_fence = rcu_dereference_protected(syncobj->fence,
-					      lockdep_is_held(&syncobj->lock));
-	rcu_assign_pointer(syncobj->fence, fence);
+	old_fence = dma_fence_replace_proxy(&syncobj->fence, fence);
 
 	if (fence != old_fence) {
 		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node)

From patchwork Mon May  4 04:48:51 2020
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson
Date: Mon, 4 May 2020 05:48:51 +0100
Message-Id: <20200504044903.7626-10-chris@chris-wilson.co.uk>
In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk>
References: <20200504044903.7626-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 10/22] drm/i915/gem: Teach execbuf how to wait on future syncobj

If a syncobj has not yet been assigned, treat it as a future fence and
install and wait upon a dma-fence-proxy. The proxy will be replaced by
the real fence later, and that fence will be responsible for signaling
our waiter.
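The execbuffer side of this is a fetch-or-install idiom on the syncobj
slot. As a reading aid for the raw diff that follows, the new path in
condensed form (error paths elided):

	fence = drm_syncobj_fence_get(syncobj);
	if (!fence) {
		/* Slot unset: publish a proxy for the future fence. */
		fence = dma_fence_create_proxy();

		spin_lock(&syncobj->lock);
		old = rcu_dereference_protected(syncobj->fence, true);
		if (unlikely(old)) {
			/* Raced with a signaler: take the real fence. */
			dma_fence_put(fence);
			fence = dma_fence_get(old);
		} else {
			rcu_assign_pointer(syncobj->fence,
					   dma_fence_get(fence));
		}
		spin_unlock(&syncobj->lock);
	}
	err = i915_request_await_dma_fence(eb->request, fence);

The await side also guards against a proxy that is never replaced:
__i915_request_await_proxy() arms a timer, and if it expires before the
real fence appears the request's submit fence is failed with -ETIMEDOUT
instead of blocking forever.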
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854 Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 21 +++- drivers/gpu/drm/i915/i915_request.c | 113 ++++++++++++++++++ 2 files changed, 132 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 44d7da0e200e..8b854f87a249 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -5,6 +5,7 @@ */ #include +#include #include #include #include @@ -2524,8 +2525,24 @@ await_fence_array(struct i915_execbuffer *eb, continue; fence = drm_syncobj_fence_get(syncobj); - if (!fence) - return -EINVAL; + if (!fence) { + struct dma_fence *old; + + fence = dma_fence_create_proxy(); + if (!fence) + return -ENOMEM; + + spin_lock(&syncobj->lock); + old = rcu_dereference_protected(syncobj->fence, true); + if (unlikely(old)) { + dma_fence_put(fence); + fence = dma_fence_get(old); + } else { + rcu_assign_pointer(syncobj->fence, + dma_fence_get(fence)); + } + spin_unlock(&syncobj->lock); + } err = i915_request_await_dma_fence(eb->request, fence); dma_fence_put(fence); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 95edc5523a01..8583fe5bb3b6 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -23,6 +23,7 @@ */ #include +#include #include #include #include @@ -1065,6 +1066,116 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from) return 0; } +struct await_proxy { + struct wait_queue_entry base; + struct i915_request *request; + struct dma_fence *fence; + struct timer_list timer; + struct work_struct work; + int (*attach)(struct await_proxy *ap); + void *data; +}; + +static void await_proxy_work(struct work_struct *work) +{ + struct await_proxy *ap = container_of(work, typeof(*ap), work); + struct i915_request *rq = ap->request; + int err; + + del_timer_sync(&ap->timer); + + if (ap->fence) { + mutex_lock(&rq->context->timeline->mutex); + err = ap->attach(ap); + mutex_unlock(&rq->context->timeline->mutex); + if (err < 0) + i915_sw_fence_set_error_once(&rq->submit, err); + } + + i915_sw_fence_complete(&rq->submit); + + dma_fence_put(ap->fence); + kfree(ap); +} + +static int +await_proxy_wake(struct wait_queue_entry *entry, + unsigned int mode, + int flags, + void *fence) +{ + struct await_proxy *ap = container_of(entry, typeof(*ap), base); + + ap->fence = dma_fence_get(fence); + schedule_work(&ap->work); + + return 0; +} + +static void +await_proxy_timer(struct timer_list *t) +{ + struct await_proxy *ap = container_of(t, typeof(*ap), timer); + + if (dma_fence_remove_proxy_listener(ap->base.private, &ap->base)) { + struct i915_request *rq = ap->request; + + pr_notice("Asynchronous wait on unset proxy fence by %s:%s:%llx timed out\n", + rq->fence.ops->get_driver_name(&rq->fence), + rq->fence.ops->get_timeline_name(&rq->fence), + rq->fence.seqno); + i915_sw_fence_set_error_once(&rq->submit, -ETIMEDOUT); + + schedule_work(&ap->work); + } +} + +static int +__i915_request_await_proxy(struct i915_request *rq, + struct dma_fence *fence, + unsigned long timeout, + int (*attach)(struct await_proxy *ap), + void *data) +{ + struct await_proxy *ap; + + ap = kzalloc(sizeof(*ap), I915_FENCE_GFP); + if (!ap) + return -ENOMEM; + + i915_sw_fence_await(&rq->submit); + + ap->base.private = fence; + ap->base.func = await_proxy_wake; + ap->request = rq; + INIT_WORK(&ap->work, 
await_proxy_work); + ap->attach = attach; + ap->data = data; + + timer_setup(&ap->timer, await_proxy_timer, 0); + if (timeout) + mod_timer(&ap->timer, round_jiffies_up(jiffies + timeout)); + + dma_fence_add_proxy_listener(fence, &ap->base); + return 0; +} + +static int await_proxy(struct await_proxy *ap) +{ + return i915_request_await_dma_fence(ap->request, ap->fence); +} + +static int +i915_request_await_proxy(struct i915_request *rq, struct dma_fence *fence) +{ + /* + * Wait until we know the real fence so that can optimise the + * inter-fence synchronisation. + */ + return __i915_request_await_proxy(rq, fence, I915_FENCE_TIMEOUT, + await_proxy, NULL); +} + int i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) { @@ -1111,6 +1222,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) if (dma_fence_is_i915(fence)) ret = i915_request_await_request(rq, to_request(fence)); + else if (dma_fence_is_proxy(fence)) + ret = i915_request_await_proxy(rq, fence); else ret = i915_sw_fence_await_dma_fence(&rq->submit, fence, fence->context ? I915_FENCE_TIMEOUT : 0, From patchwork Mon May 4 04:48:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524979 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B62CF15AB for ; Mon, 4 May 2020 04:49:49 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9ECF3206C0 for ; Mon, 4 May 2020 04:49:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9ECF3206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1B57E6E161; Mon, 4 May 2020 04:49:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3B20B6E2DF for ; Mon, 4 May 2020 04:49:37 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102367-1500050 for multiple; Mon, 04 May 2020 05:49:09 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:52 +0100 Message-Id: <20200504044903.7626-11-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/22] drm/i915/gem: Allow combining submit-fences with syncobj X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" We allow exported sync_file 
fences to be used as submit fences, but they are not the only source of user fences. We also accept an array of syncobj, and as with sync_file these are dma_fences underneath and so feature the same set of controls. The submit-fence allows for a request to be scheduled at the same time as the signaler, rather than as normal after. Userspace can combine submit-fence with its own semaphores for intra-batch scheduling. Not exposing submit-fences to syncobj was at the time just a matter of pragmatic expediency. Fixes: a88b6e4cbafd ("drm/i915: Allow specification of parallel execbuf") Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854 Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Lionel Landwerlin Reviewed-by: Tvrtko Ursulin --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 14 ++++++---- drivers/gpu/drm/i915/i915_request.c | 26 ++++++++++++++++++- include/uapi/drm/i915_drm.h | 7 ++--- 3 files changed, 38 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 8b854f87a249..67ba33b3de60 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2432,7 +2432,7 @@ static void __free_fence_array(struct drm_syncobj **fences, unsigned int n) { while (n--) - drm_syncobj_put(ptr_mask_bits(fences[n], 2)); + drm_syncobj_put(ptr_mask_bits(fences[n], 3)); kvfree(fences); } @@ -2489,7 +2489,7 @@ get_fence_array(struct drm_i915_gem_execbuffer2 *args, BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) & ~__I915_EXEC_FENCE_UNKNOWN_FLAGS); - fences[n] = ptr_pack_bits(syncobj, fence.flags, 2); + fences[n] = ptr_pack_bits(syncobj, fence.flags, 3); } return fences; @@ -2520,7 +2520,7 @@ await_fence_array(struct i915_execbuffer *eb, struct dma_fence *fence; unsigned int flags; - syncobj = ptr_unpack_bits(fences[n], &flags, 2); + syncobj = ptr_unpack_bits(fences[n], &flags, 3); if (!(flags & I915_EXEC_FENCE_WAIT)) continue; @@ -2544,7 +2544,11 @@ await_fence_array(struct i915_execbuffer *eb, spin_unlock(&syncobj->lock); } - err = i915_request_await_dma_fence(eb->request, fence); + if (flags & I915_EXEC_FENCE_WAIT_SUBMIT) + err = i915_request_await_execution(eb->request, fence, + eb->engine->bond_execute); + else + err = i915_request_await_dma_fence(eb->request, fence); dma_fence_put(fence); if (err < 0) return err; @@ -2565,7 +2569,7 @@ signal_fence_array(struct i915_execbuffer *eb, struct drm_syncobj *syncobj; unsigned int flags; - syncobj = ptr_unpack_bits(fences[n], &flags, 2); + syncobj = ptr_unpack_bits(fences[n], &flags, 3); if (!(flags & I915_EXEC_FENCE_SIGNAL)) continue; diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 8583fe5bb3b6..92fa5267bcce 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1326,6 +1326,26 @@ __i915_request_await_execution(struct i915_request *to, &from->fence); } +static int execution_proxy(struct await_proxy *ap) +{ + return i915_request_await_execution(ap->request, ap->fence, ap->data); +} + +static int +i915_request_await_proxy_execution(struct i915_request *rq, + struct dma_fence *fence, + void (*hook)(struct i915_request *rq, + struct dma_fence *signal)) +{ + /* + * We have to wait until the real request is known in order to + * be able to hook into its execution, as opposed to waiting for + * its completion. 
+ */ + return __i915_request_await_proxy(rq, fence, I915_FENCE_TIMEOUT, + execution_proxy, hook); +} + int i915_request_await_execution(struct i915_request *rq, struct dma_fence *fence, @@ -1362,10 +1382,14 @@ i915_request_await_execution(struct i915_request *rq, ret = __i915_request_await_execution(rq, to_request(fence), hook); + else if (dma_fence_is_proxy(fence)) + ret = i915_request_await_proxy_execution(rq, + fence, + hook); else ret = i915_sw_fence_await_dma_fence(&rq->submit, fence, I915_FENCE_TIMEOUT, - GFP_KERNEL); + I915_FENCE_GFP); if (ret < 0) return ret; } while (--nchild); diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 14b67cd6b54b..704dd0e3bc1d 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1040,9 +1040,10 @@ struct drm_i915_gem_exec_fence { */ __u32 handle; -#define I915_EXEC_FENCE_WAIT (1<<0) -#define I915_EXEC_FENCE_SIGNAL (1<<1) -#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SIGNAL << 1)) +#define I915_EXEC_FENCE_WAIT (1u << 0) +#define I915_EXEC_FENCE_SIGNAL (1u << 1) +#define I915_EXEC_FENCE_WAIT_SUBMIT (1u << 2) +#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_WAIT_SUBMIT << 1)) __u32 flags; }; From patchwork Mon May 4 04:48:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11525007 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BEB1E15AB for ; Mon, 4 May 2020 04:49:58 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A72F7206C0 for ; Mon, 4 May 2020 04:49:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A72F7206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7A2136E303; Mon, 4 May 2020 04:49:44 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 729206E2EC for ; Mon, 4 May 2020 04:49:38 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102368-1500050 for multiple; Mon, 04 May 2020 05:49:09 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:53 +0100 Message-Id: <20200504044903.7626-12-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/22] drm/i915/gt: Declare when we enabled timeslicing X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kenneth Graunke , Chris Wilson 
Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Let userspace know if they can trust timeslicing by including it as part of the I915_PARAM_HAS_SCHEDULER::I915_SCHEDULER_CAP_TIMESLICING v2: Only declare timeslicing if we can safely preempt userspace. Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802 Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854 Signed-off-by: Chris Wilson Cc: Kenneth Graunke Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_engine_user.c | 1 + include/uapi/drm/i915_drm.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c index 848decee9066..8415511f1465 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c @@ -98,6 +98,7 @@ static void set_scheduler_caps(struct drm_i915_private *i915) MAP(HAS_PREEMPTION, PREEMPTION), MAP(HAS_SEMAPHORES, SEMAPHORES), MAP(SUPPORTS_STATS, ENGINE_BUSY_STATS), + MAP(HAS_TIMESLICES, TIMESLICING), #undef MAP }; struct intel_engine_cs *engine; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 704dd0e3bc1d..1ee227b5131a 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -523,6 +523,7 @@ typedef struct drm_i915_irq_wait { #define I915_SCHEDULER_CAP_PREEMPTION (1ul << 2) #define I915_SCHEDULER_CAP_SEMAPHORES (1ul << 3) #define I915_SCHEDULER_CAP_ENGINE_BUSY_STATS (1ul << 4) +#define I915_SCHEDULER_CAP_TIMESLICING (1ul << 5) #define I915_PARAM_HUC_STATUS 42 From patchwork Mon May 4 04:48:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11525011 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C4F2C17EF for ; Mon, 4 May 2020 04:49:59 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AD9C220757 for ; Mon, 4 May 2020 04:49:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AD9C220757 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DBD1D6E30F; Mon, 4 May 2020 04:49:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 846966E2EF for ; Mon, 4 May 2020 04:49:38 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102369-1500050 for multiple; Mon, 04 May 2020 05:49:09 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:54 +0100 Message-Id: <20200504044903.7626-13-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: 
<20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/22] drm/i915: Replace the hardcoded I915_FENCE_TIMEOUT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Expose the hardcoded timeout for unsignaled foreign fences as a Kconfig option, primarily to allow brave systems to disable the timeout and solely rely on correct signaling. Signed-off-by: Chris Wilson Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/Kconfig.profile | 12 ++++++++++++ drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/display/intel_display.c | 5 +++-- drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_client_blt.c | 3 +-- drivers/gpu/drm/i915/gem/i915_gem_fence.c | 4 ++-- drivers/gpu/drm/i915/i915_config.c | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 10 +++++++++- drivers/gpu/drm/i915/i915_request.c | 11 +++++++---- 9 files changed, 51 insertions(+), 12 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_config.c diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 0bfd276c19fe..3925be65d314 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -1,3 +1,15 @@ +config DRM_I915_FENCE_TIMEOUT + int "Timeout for unsignaled foreign fences" + default 10000 # milliseconds + help + When listening to a foreign fence, we install a supplementary timer + to ensure that we are always signaled and our userspace is able to + make forward progress. This value specifies the timeout used for an + unsignaled foreign fence. + + May be 0 to disable the timeout, and rely on the foreign fence being + eventually signaled. 
+ config DRM_I915_USERFAULT_AUTOSUSPEND int "Runtime autosuspend delay for userspace GGTT mmaps (ms)" default 250 # milliseconds diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 5359c736c789..b0da6ea6e3f1 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -35,6 +35,7 @@ subdir-ccflags-y += -I$(srctree)/$(src) # core driver code i915-y += i915_drv.o \ + i915_config.o \ i915_irq.o \ i915_getparam.o \ i915_params.o \ diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 2a17cf38d3dc..7b3578cfcacc 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -15815,7 +15815,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane, if (new_plane_state->uapi.fence) { /* explicit fencing */ ret = i915_sw_fence_await_dma_fence(&state->commit_ready, new_plane_state->uapi.fence, - I915_FENCE_TIMEOUT, + i915_fence_timeout(dev_priv), GFP_KERNEL); if (ret < 0) return ret; @@ -15842,7 +15842,8 @@ intel_prepare_plane_fb(struct drm_plane *_plane, ret = i915_sw_fence_await_reservation(&state->commit_ready, obj->base.resv, NULL, - false, I915_FENCE_TIMEOUT, + false, + i915_fence_timeout(dev_priv), GFP_KERNEL); if (ret < 0) goto unpin_fb; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index 34be4c0ee7c5..bc0223716906 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -108,7 +108,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, if (clflush) { i915_sw_fence_await_reservation(&clflush->base.chain, obj->base.resv, NULL, true, - I915_FENCE_TIMEOUT, + i915_fence_timeout(to_i915(obj->base.dev)), I915_FENCE_GFP); dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma); dma_fence_work_commit(&clflush->base); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c index 3a146aa2593b..d3a86a4d5c04 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c @@ -288,8 +288,7 @@ int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj, i915_gem_object_lock(obj); err = i915_sw_fence_await_reservation(&work->wait, - obj->base.resv, NULL, - true, I915_FENCE_TIMEOUT, + obj->base.resv, NULL, true, 0, I915_FENCE_GFP); if (err < 0) { dma_fence_set_error(&work->dma, err); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_fence.c b/drivers/gpu/drm/i915/gem/i915_gem_fence.c index 2f6100ec2608..8ab842c80f99 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_fence.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_fence.c @@ -72,8 +72,8 @@ i915_gem_object_lock_fence(struct drm_i915_gem_object *obj) 0, 0); if (i915_sw_fence_await_reservation(&stub->chain, - obj->base.resv, NULL, - true, I915_FENCE_TIMEOUT, + obj->base.resv, NULL, true, + i915_fence_timeout(to_i915(obj->base.dev)), I915_FENCE_GFP) < 0) goto err; diff --git a/drivers/gpu/drm/i915/i915_config.c b/drivers/gpu/drm/i915/i915_config.c new file mode 100644 index 000000000000..b79b5f6d2cfa --- /dev/null +++ b/drivers/gpu/drm/i915/i915_config.c @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2020 Intel Corporation + */ + +#include "i915_drv.h" + +unsigned long +i915_fence_context_timeout(const struct drm_i915_private *i915, u64 context) +{ + if (context && IS_ACTIVE(CONFIG_DRM_I915_FENCE_TIMEOUT)) + return 
msecs_to_jiffies_timeout(CONFIG_DRM_I915_FENCE_TIMEOUT); + + return 0; +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6af69555733e..2e3b5c4d0759 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -614,8 +614,16 @@ struct i915_gem_mm { #define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */ +unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915, + u64 context); + +static inline unsigned long +i915_fence_timeout(const struct drm_i915_private *i915) +{ + return i915_fence_context_timeout(i915, U64_MAX); +} + #define I915_RESET_TIMEOUT (10 * HZ) /* 10s */ -#define I915_FENCE_TIMEOUT (10 * HZ) /* 10s */ #define I915_ENGINE_DEAD_TIMEOUT (4 * HZ) /* Seqno, head and subunits dead */ #define I915_SEQNO_DEAD_TIMEOUT (12 * HZ) /* Seqno dead with active head */ diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 92fa5267bcce..8a1750c69ef0 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1172,7 +1172,8 @@ i915_request_await_proxy(struct i915_request *rq, struct dma_fence *fence) * Wait until we know the real fence so that can optimise the * inter-fence synchronisation. */ - return __i915_request_await_proxy(rq, fence, I915_FENCE_TIMEOUT, + return __i915_request_await_proxy(rq, fence, + i915_fence_timeout(rq->i915), await_proxy, NULL); } @@ -1226,7 +1227,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) ret = i915_request_await_proxy(rq, fence); else ret = i915_sw_fence_await_dma_fence(&rq->submit, fence, - fence->context ? I915_FENCE_TIMEOUT : 0, + i915_fence_context_timeout(rq->i915, + fence->context), I915_FENCE_GFP); if (ret < 0) return ret; @@ -1342,7 +1344,8 @@ i915_request_await_proxy_execution(struct i915_request *rq, * be able to hook into its execution, as opposed to waiting for * its completion. 
*/ - return __i915_request_await_proxy(rq, fence, I915_FENCE_TIMEOUT, + return __i915_request_await_proxy(rq, fence, + i915_fence_timeout(rq->i915), execution_proxy, hook); } @@ -1388,7 +1391,7 @@ i915_request_await_execution(struct i915_request *rq, hook); else ret = i915_sw_fence_await_dma_fence(&rq->submit, fence, - I915_FENCE_TIMEOUT, + i915_fence_timeout(rq->i915), I915_FENCE_GFP); if (ret < 0) return ret; From patchwork Mon May 4 04:48:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524999 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A218615AB for ; Mon, 4 May 2020 04:49:56 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8B090206C0 for ; Mon, 4 May 2020 04:49:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8B090206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7945A6E301; Mon, 4 May 2020 04:49:42 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 721506E2F3 for ; Mon, 4 May 2020 04:49:39 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102370-1500050 for multiple; Mon, 04 May 2020 05:49:09 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:55 +0100 Message-Id: <20200504044903.7626-14-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/22] drm/i915: Drop I915_RESET_TIMEOUT and friends X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" These were used to set various timeouts for the reset procedure (deciding when the engine was dead, and even if the reset itself was not making forward progress). No longer used. 
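With the reset-era constants gone, the remaining knob in this area is
CONFIG_DRM_I915_FENCE_TIMEOUT from the previous patch; configuring it to
0 disables the watchdog entirely and relies on foreign fences signaling
correctly. A condensed sketch of how an awaiter now picks its timeout,
assembled from the diffs above (error handling elided):

	/* 0 means no timer is armed and the wait is unbounded. */
	unsigned long timeout =
		i915_fence_context_timeout(rq->i915, fence->context);

	err = i915_sw_fence_await_dma_fence(&rq->submit, fence, timeout,
					    I915_FENCE_GFP);

Context-less fences (fence->context == 0) always take the unbounded
path, preserving the old fence->context ? I915_FENCE_TIMEOUT : 0
behaviour.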
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_drv.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2e3b5c4d0759..ad287e5d6ded 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -623,13 +623,6 @@ i915_fence_timeout(const struct drm_i915_private *i915)
 	return i915_fence_context_timeout(i915, U64_MAX);
 }
 
-#define I915_RESET_TIMEOUT (10 * HZ) /* 10s */
-
-#define I915_ENGINE_DEAD_TIMEOUT  (4 * HZ)  /* Seqno, head and subunits dead */
-#define I915_SEQNO_DEAD_TIMEOUT   (12 * HZ) /* Seqno dead with active head */
-
-#define I915_ENGINE_WEDGED_TIMEOUT  (60 * HZ)  /* Reset but no recovery? */
-
 /* Amount of SAGV/QGV points, BSpec precisely defines this */
 #define I915_NUM_QGV_POINTS 8

From patchwork Mon May  4 04:48:56 2020
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson
Date: Mon, 4 May 2020 05:48:56 +0100
Message-Id: <20200504044903.7626-15-chris@chris-wilson.co.uk>
In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk>
References: <20200504044903.7626-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 15/22] drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT

This timeout is only used in one place, to provide a tiny bit of grace
for slow igt to cleanup after themselves. If we are a bit stricter and
opt to kill outstanding requests rather than wait, we can speed up igt
by not waiting for 200ms after a hang.
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 11 ++++++----- drivers/gpu/drm/i915/i915_drv.h | 2 -- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 8e98df6a3045..649acf1fc33d 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1463,12 +1463,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val) { int ret; - if (val & DROP_RESET_ACTIVE && - wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT)) - intel_gt_set_wedged(gt); + if (val & (DROP_RETIRE | DROP_RESET_ACTIVE)) + intel_gt_wait_for_idle(gt, 1); - if (val & DROP_RETIRE) - intel_gt_retire_requests(gt); + if (val & DROP_RESET_ACTIVE && intel_gt_pm_get_if_awake(gt)) { + intel_gt_set_wedged(gt); + intel_gt_pm_put(gt); + } if (val & (DROP_IDLE | DROP_ACTIVE)) { ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ad287e5d6ded..97687ea53c3d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -612,8 +612,6 @@ struct i915_gem_mm { u32 shrink_count; }; -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */ - unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915, u64 context); From patchwork Mon May 4 04:48:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524985 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F3B815AB for ; Mon, 4 May 2020 04:49:52 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 57F63206C0 for ; Mon, 4 May 2020 04:49:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 57F63206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DB4E56E2EC; Mon, 4 May 2020 04:49:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D07CE6E2DF for ; Mon, 4 May 2020 04:49:35 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102372-1500050 for multiple; Mon, 04 May 2020 05:49:10 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:57 +0100 Message-Id: <20200504044903.7626-16-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 16/22] drm/i915: Always defer fenced work to the worker X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community 
testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Currently, if an error is raised we always call the cleanup locally [and skip the main work callback]. However, some future users may need to take a mutex to clean up and so we cannot immediately execute the cleanup as we may still be in interrupt context. With the execute-immediate flag, in most cases this should still result in immediate cleanup of an error. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_sw_fence_work.c | 25 +++++++++++------------ 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c index a3a81bb8f2c3..29f63ebc24e8 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c @@ -16,11 +16,14 @@ static void fence_complete(struct dma_fence_work *f) static void fence_work(struct work_struct *work) { struct dma_fence_work *f = container_of(work, typeof(*f), work); - int err; - err = f->ops->work(f); - if (err) - dma_fence_set_error(&f->dma, err); + if (!f->dma.error) { + int err; + + err = f->ops->work(f); + if (err) + dma_fence_set_error(&f->dma, err); + } fence_complete(f); dma_fence_put(&f->dma); @@ -36,15 +39,11 @@ fence_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) if (fence->error) dma_fence_set_error(&f->dma, fence->error); - if (!f->dma.error) { - dma_fence_get(&f->dma); - if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags)) - fence_work(&f->work); - else - queue_work(system_unbound_wq, &f->work); - } else { - fence_complete(f); - } + dma_fence_get(&f->dma); + if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags)) + fence_work(&f->work); + else + queue_work(system_unbound_wq, &f->work); break; case FENCE_FREE: From patchwork Mon May 4 04:48:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524995 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8E33315AB for ; Mon, 4 May 2020 04:49:55 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7713C206C0 for ; Mon, 4 May 2020 04:49:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7713C206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E58016E30D; Mon, 4 May 2020 04:49:42 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 675B06E10B for ; Mon, 4 May 2020 04:49:36 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102373-1500050 for multiple; Mon, 04 May 2020
05:49:10 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:58 +0100 Message-Id: <20200504044903.7626-17-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 17/22] drm/i915/gem: Assign context id for async work X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allocate a few dma fence context ids that we can use to associate async work [for the CPU] launched on behalf of this context. For extra fun, we allow a configurable concurrency width. A current example would be that we spawn an unbound worker for every userptr get_pages. In the future, we wish to charge this work to the context that initiated the async work and to impose concurrency limits based on the context. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 ++++ drivers/gpu/drm/i915/gem/i915_gem_context.h | 6 ++++++ drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++++++ 3 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 900ea8b7fc8f..fd7d064a0e46 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -715,6 +715,10 @@ __create_context(struct drm_i915_private *i915) ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL); mutex_init(&ctx->mutex); + ctx->async.width = rounddown_pow_of_two(num_online_cpus()); + ctx->async.context = dma_fence_context_alloc(ctx->async.width); + ctx->async.width--; + spin_lock_init(&ctx->stale.lock); INIT_LIST_HEAD(&ctx->stale.engines); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h index 3702b2fb27ab..e104ff0ae740 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx) +{ + return (ctx->async.context + + (atomic_fetch_inc(&ctx->async.cur) & ctx->async.width)); +} + static inline struct i915_gem_context * i915_gem_context_get(struct i915_gem_context *ctx) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index 28760bd03265..5f5cfa3a3e9b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -85,6 +85,12 @@ struct i915_gem_context { struct intel_timeline *timeline; + struct { + u64 context; + atomic_t cur; + unsigned int width; + } async; + /** * @vm: unique address space (GTT) * From patchwork Mon May 4 04:48:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524983 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9B84914C0 for ; Mon, 4
May 2020 04:49:51 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 842EE206C0 for ; Mon, 4 May 2020 04:49:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 842EE206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1334F6E2F0; Mon, 4 May 2020 04:49:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id B2AA66E2E0 for ; Mon, 4 May 2020 04:49:36 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102374-1500050 for multiple; Mon, 04 May 2020 05:49:10 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:48:59 +0100 Message-Id: <20200504044903.7626-18-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 18/22] drm/i915: Export a preallocate variant of i915_active_acquire() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. 
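For illustration only (not part of the patch), a minimal sketch of the intended calling pattern, using the two helpers added in the diff below; the wrapper function, its parameters, and the mutex are invented for the example. The allocation happens before the lock is taken, and only the no-malloc lookup runs underneath it:

#include "i915_active.h"

/*
 * Sketch: track activity under a mutex without allocating beneath it.
 * i915_active_acquire_for_context() may allocate the fence-tree slot
 * for idx, so it is called before taking the lock; __i915_active_ref()
 * is then a lookup-only operation.
 */
static int example_track_under_mutex(struct i915_active *ref, u64 idx,
                                     struct mutex *lock,
                                     struct dma_fence *fence)
{
        struct dma_fence *prev;
        int err;

        err = i915_active_acquire_for_context(ref, idx); /* may allocate */
        if (err)
                return err;

        mutex_lock(lock);
        prev = __i915_active_ref(ref, idx, fence); /* no allocation here */
        mutex_unlock(lock);

        if (IS_ERR(prev))
                err = PTR_ERR(prev);
        else if (prev)
                dma_fence_put(prev); /* drop the superseded fence */

        i915_active_release(ref); /* pairs with the acquire above */
        return err;
}

This is the same split eb_vm_work() uses later in the series: i915_gem_context_async_id() supplies idx outside the lock, and the lookup-only __i915_active_ref() runs under vm->mutex.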
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_active.c | 107 ++++++++++++++++++++++++++--- drivers/gpu/drm/i915/i915_active.h | 5 ++ 2 files changed, 103 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index d960d0be5bd2..71ad0d452680 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -217,11 +217,10 @@ excl_retire(struct dma_fence *fence, struct dma_fence_cb *cb) } static struct i915_active_fence * -active_instance(struct i915_active *ref, struct intel_timeline *tl) +active_instance(struct i915_active *ref, u64 idx) { struct active_node *node, *prealloc; struct rb_node **p, *parent; - u64 idx = tl->fence_context; /* * We track the most recently used timeline to skip an rbtree search @@ -367,7 +366,7 @@ int i915_active_ref(struct i915_active *ref, if (err) return err; - active = active_instance(ref, tl); + active = active_instance(ref, tl->fence_context); if (!active) { err = -ENOMEM; goto out; @@ -384,32 +383,104 @@ int i915_active_ref(struct i915_active *ref, atomic_dec(&ref->count); } if (!__i915_active_fence_set(active, fence)) - atomic_inc(&ref->count); + __i915_active_acquire(ref); out: i915_active_release(ref); return err; } -struct dma_fence * -i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +static struct dma_fence * +__i915_active_set_fence(struct i915_active *ref, + struct i915_active_fence *active, + struct dma_fence *fence) { struct dma_fence *prev; /* We expect the caller to manage the exclusive timeline ordering */ GEM_BUG_ON(i915_active_is_idle(ref)); + if (is_barrier(active)) { /* proto-node used by our idle barrier */ + /* + * This request is on the kernel_context timeline, and so + * we can use it to substitute for the pending idle-barrier + * request that we want to emit on the kernel_context.
+ */ + __active_del_barrier(ref, node_from_active(active)); + RCU_INIT_POINTER(active->fence, NULL); + atomic_dec(&ref->count); + } + rcu_read_lock(); - prev = __i915_active_fence_set(&ref->excl, f); + prev = __i915_active_fence_set(active, fence); if (prev) prev = dma_fence_get_rcu(prev); else - atomic_inc(&ref->count); + __i915_active_acquire(ref); rcu_read_unlock(); return prev; } +static struct i915_active_fence * +__active_lookup(struct i915_active *ref, u64 idx) +{ + struct active_node *node; + struct rb_node *p; + + /* Like active_instance() but with no malloc */ + + node = READ_ONCE(ref->cache); + if (node && node->timeline == idx) + return &node->base; + + spin_lock_irq(&ref->tree_lock); + GEM_BUG_ON(i915_active_is_idle(ref)); + + p = ref->tree.rb_node; + while (p) { + node = rb_entry(p, struct active_node, node); + if (node->timeline == idx) { + ref->cache = node; + spin_unlock_irq(&ref->tree_lock); + return &node->base; + } + + if (node->timeline < idx) + p = p->rb_right; + else + p = p->rb_left; + } + + spin_unlock_irq(&ref->tree_lock); + + return NULL; +} + +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) +{ + struct dma_fence *prev = ERR_PTR(-ENOENT); + struct i915_active_fence *active; + + if (!i915_active_acquire_if_busy(ref)) + return ERR_PTR(-EINVAL); + + active = __active_lookup(ref, idx); + if (active) + prev = __i915_active_set_fence(ref, active, fence); + + i915_active_release(ref); + return prev; +} + +struct dma_fence * +i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +{ + /* We expect the caller to manage the exclusive timeline ordering */ + return __i915_active_set_fence(ref, &ref->excl, f); +} + bool i915_active_acquire_if_busy(struct i915_active *ref) { debug_active_assert(ref); @@ -443,6 +514,24 @@ int i915_active_acquire(struct i915_active *ref) return err; } +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx) +{ + struct i915_active_fence *active; + int err; + + err = i915_active_acquire(ref); + if (err) + return err; + + active = active_instance(ref, idx); + if (!active) { + i915_active_release(ref); + return -ENOMEM; + } + + return 0; /* return with active ref */ +} + void i915_active_release(struct i915_active *ref) { debug_active_assert(ref); @@ -804,7 +893,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, */ RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN)); node->base.cb.node.prev = (void *)engine; - atomic_inc(&ref->count); + __i915_active_acquire(ref); } GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN)); diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index cf4058150966..042502abefe5 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -163,6 +163,9 @@ void __i915_active_init(struct i915_active *ref, __i915_active_init(ref, active, retire, &__mkey, &__wkey); \ } while (0) +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); + int i915_active_ref(struct i915_active *ref, struct intel_timeline *tl, struct dma_fence *fence); @@ -198,7 +201,9 @@ int i915_request_await_active(struct i915_request *rq, #define I915_ACTIVE_AWAIT_BARRIER BIT(2) int i915_active_acquire(struct i915_active *ref); +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx); bool i915_active_acquire_if_busy(struct i915_active *ref); + void i915_active_release(struct i915_active *ref); static inline void 
__i915_active_acquire(struct i915_active *ref) From patchwork Mon May 4 04:49:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524993 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0028114C0 for ; Mon, 4 May 2020 04:49:55 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DCECF206C0 for ; Mon, 4 May 2020 04:49:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DCECF206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 66F7F6E2F8; Mon, 4 May 2020 04:49:42 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id A2A926E2DF for ; Mon, 4 May 2020 04:49:36 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102375-1500050 for multiple; Mon, 04 May 2020 05:49:10 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:49:00 +0100 Message-Id: <20200504044903.7626-19-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 19/22] drm/i915/gem: Separate the ww_mutex walker into its own list X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" In preparation for making eb_vma bigger and heavier to run in parallel, we need to stop applying an in-place swap() to reorder around ww_mutex deadlocks. Keep the array intact and reorder the locks using a dedicated list.
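For reference, the list-based backoff pattern in isolation (a generic, self-contained sketch rather than the driver code: on -EDEADLK every held lock is dropped, the contended lock is reacquired with the slow path and rotated to the head of the list, and the walk restarts):

#include <linux/errno.h>
#include <linux/list.h>
#include <linux/ww_mutex.h>

struct locked_item {
        struct ww_mutex lock;
        struct list_head link;
};

static int lock_all(struct list_head *items, struct ww_acquire_ctx *ctx)
{
        struct locked_item *it, *pos, *hold = NULL;
        int err;

retry:
        list_for_each_entry(it, items, link) {
                if (it == hold) { /* taken via the slow path below */
                        hold = NULL;
                        continue;
                }

                err = ww_mutex_lock_interruptible(&it->lock, ctx);
                if (err == 0)
                        continue;

                /* Back off: drop everything acquired so far, in reverse */
                pos = it;
                list_for_each_entry_continue_reverse(pos, items, link)
                        ww_mutex_unlock(&pos->lock);

                if (err != -EDEADLK)
                        return err;

                /* Sleep on the contended lock, then retry with it first */
                err = ww_mutex_lock_slow_interruptible(&it->lock, ctx);
                if (err)
                        return err;

                hold = it;
                list_move(&it->link, items); /* rotate to the head */
                goto retry;
        }

        ww_acquire_done(ctx);
        return 0;
}

The caller brackets this with ww_acquire_init()/ww_acquire_fini(), as eb_move_to_gpu() does below. Unlike the swap() approach, the eb->vma[] array (and hence the execobj indices that userspace relocations refer to) stays in place; only the private lock list is permuted.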
Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 48 +++++++++++-------- drivers/gpu/drm/i915/i915_utils.h | 6 +++ 2 files changed, 34 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 67ba33b3de60..0d773767c2ac 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -36,6 +36,7 @@ struct eb_vma { struct drm_i915_gem_exec_object2 *exec; struct list_head bind_link; struct list_head reloc_link; + struct list_head lock_link; struct hlist_node node; u32 handle; @@ -254,6 +255,8 @@ struct i915_execbuffer { /** list of vma that have execobj.relocation_count */ struct list_head relocs; + struct list_head lock; + /** * Track the most recently used object for relocations, as we * frequently have to perform multiple relocations within the same @@ -399,6 +402,10 @@ static int eb_create(struct i915_execbuffer *eb) eb->lut_size = -eb->buffer_count; } + INIT_LIST_HEAD(&eb->relocs); + INIT_LIST_HEAD(&eb->unbound); + INIT_LIST_HEAD(&eb->lock); + return 0; } @@ -604,6 +611,8 @@ eb_add_vma(struct i915_execbuffer *eb, eb_unreserve_vma(ev); list_add_tail(&ev->bind_link, &eb->unbound); } + + list_add_tail(&ev->lock_link, &eb->lock); } static inline int use_cpu_reloc(const struct reloc_cache *cache, @@ -880,9 +889,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) unsigned int i; int err = 0; - INIT_LIST_HEAD(&eb->relocs); - INIT_LIST_HEAD(&eb->unbound); - for (i = 0; i < eb->buffer_count; i++) { struct i915_vma *vma; @@ -1790,38 +1796,39 @@ static int eb_relocate(struct i915_execbuffer *eb) static int eb_move_to_gpu(struct i915_execbuffer *eb) { - const unsigned int count = eb->buffer_count; struct ww_acquire_ctx acquire; - unsigned int i; + struct eb_vma *ev; int err = 0; ww_acquire_init(&acquire, &reservation_ww_class); - for (i = 0; i < count; i++) { - struct eb_vma *ev = &eb->vma[i]; + list_for_each_entry(ev, &eb->lock, lock_link) { struct i915_vma *vma = ev->vma; err = ww_mutex_lock_interruptible(&vma->resv->lock, &acquire); if (err == -EDEADLK) { - GEM_BUG_ON(i == 0); - do { - int j = i - 1; + struct eb_vma *unlock = ev, *en; - ww_mutex_unlock(&eb->vma[j].vma->resv->lock); - - swap(eb->vma[i], eb->vma[j]); - } while (--i); + list_for_each_entry_safe_continue_reverse(unlock, en, &eb->lock, lock_link) { + ww_mutex_unlock(&unlock->vma->resv->lock); + list_move_tail(&unlock->lock_link, &eb->lock); + } + GEM_BUG_ON(!list_is_first(&ev->lock_link, &eb->lock)); err = ww_mutex_lock_slow_interruptible(&vma->resv->lock, &acquire); } - if (err) - break; + if (err) { + list_for_each_entry_continue_reverse(ev, &eb->lock, lock_link) + ww_mutex_unlock(&ev->vma->resv->lock); + + ww_acquire_fini(&acquire); + goto err_skip; + } } ww_acquire_done(&acquire); - while (i--) { - struct eb_vma *ev = &eb->vma[i]; + list_for_each_entry(ev, &eb->lock, lock_link) { struct i915_vma *vma = ev->vma; unsigned int flags = ev->flags; struct drm_i915_gem_object *obj = vma->obj; @@ -2122,9 +2129,10 @@ static int eb_parse(struct i915_execbuffer *eb) if (err) goto err_trampoline; - eb->vma[eb->buffer_count].vma = i915_vma_get(shadow); - eb->vma[eb->buffer_count].flags = __EXEC_OBJECT_HAS_PIN; eb->batch = &eb->vma[eb->buffer_count++]; + eb->batch->vma = i915_vma_get(shadow); + eb->batch->flags = __EXEC_OBJECT_HAS_PIN; + list_add_tail(&eb->batch->lock_link, &eb->lock); eb->vma[eb->buffer_count].vma = NULL; eb->trampoline = trampoline; diff --git 
a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index 03a73d2bd50d..28813806bc19 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -266,6 +266,12 @@ static inline int list_is_last_rcu(const struct list_head *list, return READ_ONCE(list->next) == head; } +#define list_for_each_entry_safe_continue_reverse(pos, n, head, member) \ + for (pos = list_prev_entry(pos, member), \ + n = list_prev_entry(pos, member); \ + &pos->member != (head); \ + pos = n, n = list_prev_entry(n, member)) + /* * Wait until the work is finally complete, even if it tries to postpone * by requeueing itself. Note, that if the worker never cancels itself, From patchwork Mon May 4 04:49:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11524975 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 863B014C0 for ; Mon, 4 May 2020 04:49:47 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6E53E206C0 for ; Mon, 4 May 2020 04:49:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6E53E206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BCA246E2DF; Mon, 4 May 2020 04:49:38 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9989A6E161 for ; Mon, 4 May 2020 04:49:36 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21102376-1500050 for multiple; Mon, 04 May 2020 05:49:11 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 4 May 2020 05:49:01 +0100 Message-Id: <20200504044903.7626-20-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk> References: <20200504044903.7626-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 20/22] drm/i915/gem: Asynchronous GTT unbinding X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson , Matthew Auld Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" It is reasonably common for userspace (even modern drivers like iris) to reuse an active address for a new buffer. This would cause the application to stall under its mutex (originally struct_mutex) until the old batches were idle and it could synchronously remove the stale PTE. However, we can queue up a job that waits on the signals for the old nodes to complete and, upon those signals, removes the old nodes, replacing them with the new ones for the batch.
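The deferral machinery is the driver's dma_fence_work helper (reworked in patch 16 above); a rough sketch of queueing such a job behind the fence of a stale node, using only calls that appear in the hunks below (the ops and the queue_rebind() wrapper are illustrative, not the patch's code):

#include "i915_sw_fence_work.h"

/* Runs only once every awaited fence has signaled; if an error was
 * recorded beforehand, the callback is skipped and the error
 * propagates via work->dma instead.
 */
static int example_rebind(struct dma_fence_work *work)
{
        /* safe point to clear the stale PTEs and insert the new node */
        return 0;
}

static const struct dma_fence_work_ops example_rebind_ops = {
        .name = "example_rebind",
        .work = example_rebind,
};

static int queue_rebind(struct dma_fence_work *work, struct dma_fence *old)
{
        int err = 0;

        dma_fence_work_init(work, &example_rebind_ops);

        /* Chain the worker behind the last activity on the stale node */
        if (old)
                err = i915_sw_fence_await_dma_fence(&work->chain, old, 0,
                                                    GFP_KERNEL);
        if (err < 0)
                work->dma.error = err; /* signal the error, skip ->work */

        dma_fence_work_commit(work); /* may execute inline if ready */
        return err < 0 ? err : 0;
}

Anyone else who needs the rebinding can then wait on work->dma, which is how the execbuf path below synchronises before concluding that -ENOSPC is final.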
The rebinding is still CPU driven, but in theory we can do the GTT patching from the GPU. The job itself has a completion signal allowing the execbuf to wait upon the rebinding, and also other observers to coordinate with the common VM activity. Letting userspace queue up more work lets it do more without blocking other clients. In turn, we take care not to let it queue up too much concurrent work, creating a small number of queues for each context to limit the number of concurrent tasks. The implementation relies on only scheduling one unbind operation per vma, as we use the unbound vma->node location to track the stale PTE. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1402 Signed-off-by: Chris Wilson Cc: Matthew Auld Cc: Andi Shyti --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 798 ++++++++++++++++-- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 1 + drivers/gpu/drm/i915/gt/intel_ggtt.c | 3 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 4 + drivers/gpu/drm/i915/gt/intel_gtt.h | 2 + drivers/gpu/drm/i915/gt/intel_ppgtt.c | 3 +- drivers/gpu/drm/i915/i915_gem.c | 7 + drivers/gpu/drm/i915/i915_gem_gtt.c | 5 + drivers/gpu/drm/i915/i915_vma.c | 130 +-- drivers/gpu/drm/i915/i915_vma.h | 5 + 10 files changed, 836 insertions(+), 122 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 0d773767c2ac..924d1ed50732 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -19,6 +19,7 @@ #include "gt/intel_gt.h" #include "gt/intel_gt_buffer_pool.h" #include "gt/intel_gt_pm.h" +#include "gt/intel_gt_requests.h" #include "gt/intel_ring.h" #include "i915_drv.h" @@ -32,6 +33,9 @@ struct eb_vma { struct i915_vma *vma; unsigned int flags; + struct drm_mm_node hole; + unsigned int bind_flags; + /** This vma's place in the execbuf reservation list */ struct drm_i915_gem_exec_object2 *exec; struct list_head bind_link; @@ -58,7 +62,8 @@ enum { #define __EXEC_OBJECT_HAS_FENCE BIT(30) #define __EXEC_OBJECT_NEEDS_MAP BIT(29) #define __EXEC_OBJECT_NEEDS_BIAS BIT(28) -#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 28) /* all of the above */ +#define __EXEC_OBJECT_HAS_PAGES BIT(27) +#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 27) /* all of the above */ #define __EXEC_HAS_RELOC BIT(31) #define __EXEC_INTERNAL_FLAGS (~0u << 31) @@ -72,11 +77,12 @@ enum { I915_EXEC_RESOURCE_STREAMER) /* Catch emission of unexpected errors for CI!
*/ +#define __EINVAL__ 22 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) #undef EINVAL #define EINVAL ({ \ DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \ - 22; \ + __EINVAL__; \ }) #endif @@ -317,6 +323,12 @@ static struct eb_vma_array *eb_vma_array_create(unsigned int count) return arr; } +static struct eb_vma_array *eb_vma_array_get(struct eb_vma_array *arr) +{ + kref_get(&arr->kref); + return arr; +} + static inline void eb_unreserve_vma(struct eb_vma *ev) { struct i915_vma *vma = ev->vma; @@ -327,8 +339,12 @@ static inline void eb_unreserve_vma(struct eb_vma *ev) if (ev->flags & __EXEC_OBJECT_HAS_PIN) __i915_vma_unpin(vma); + if (ev->flags & __EXEC_OBJECT_HAS_PAGES) + i915_vma_put_pages(vma); + ev->flags &= ~(__EXEC_OBJECT_HAS_PIN | - __EXEC_OBJECT_HAS_FENCE); + __EXEC_OBJECT_HAS_FENCE | + __EXEC_OBJECT_HAS_PAGES); } static void eb_vma_array_destroy(struct kref *kref) @@ -414,7 +430,7 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry, const struct i915_vma *vma, unsigned int flags) { - if (vma->node.size < entry->pad_to_size) + if (vma->node.size < max(vma->size, entry->pad_to_size)) return true; if (entry->alignment && !IS_ALIGNED(vma->node.start, entry->alignment)) @@ -487,12 +503,18 @@ eb_pin_vma(struct i915_execbuffer *eb, if (entry->flags & EXEC_OBJECT_PINNED) return false; + /* Concurrent async binds in progress, get in the queue */ + if (!i915_active_is_idle(&vma->vm->active)) + return false; + /* Failing that pick any _free_ space if suitable */ if (unlikely(i915_vma_pin(vma, entry->pad_to_size, entry->alignment, eb_pin_flags(entry, ev->flags) | - PIN_USER | PIN_NOEVICT))) + PIN_USER | + PIN_NOEVICT | + PIN_NOSEARCH))) return false; } @@ -632,85 +654,700 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache, obj->cache_level != I915_CACHE_NONE); } -static int eb_reserve_vma(const struct i915_execbuffer *eb, - struct eb_vma *ev, - u64 pin_flags) +struct eb_vm_work { + struct dma_fence_work base; + struct list_head unbound; + struct eb_vma_array *array; + struct i915_address_space *vm; + struct list_head evict_list; + u64 *p_flags; + u64 active; +}; + +static inline u64 node_end(const struct drm_mm_node *node) +{ + return node->start + node->size; +} + +static int set_bind_fence(struct i915_vma *vma, struct eb_vm_work *work) +{ + struct dma_fence *prev; + int err = 0; + + lockdep_assert_held(&vma->vm->mutex); + prev = i915_active_set_exclusive(&vma->active, &work->base.dma); + if (unlikely(prev)) { + err = i915_sw_fence_await_dma_fence(&work->base.chain, prev, 0, + GFP_NOWAIT | __GFP_NOWARN); + dma_fence_put(prev); + } + + return err < 0 ? 
err : 0; +} + +static int await_evict(struct eb_vm_work *work, struct i915_vma *vma) +{ + int err; + + if (rcu_access_pointer(vma->active.excl.fence) == &work->base.dma) + return 0; + + /* Wait for all other previous activity */ + err = i915_sw_fence_await_active(&work->base.chain, + &vma->active, + I915_ACTIVE_AWAIT_ACTIVE); + /* Then insert along the exclusive vm->mutex timeline */ + if (err == 0) + err = set_bind_fence(vma, work); + + return err; +} + +static int +evict_for_node(struct eb_vm_work *work, + struct eb_vma *const target, + unsigned int flags) +{ + struct i915_address_space *vm = target->vma->vm; + const unsigned long color = target->vma->node.color; + const u64 start = target->vma->node.start; + const u64 end = start + target->vma->node.size; + u64 hole_start = start, hole_end = end; + struct i915_vma *vma, *next; + struct drm_mm_node *node; + LIST_HEAD(evict_list); + LIST_HEAD(steal_list); + int err = 0; + + lockdep_assert_held(&vm->mutex); + GEM_BUG_ON(drm_mm_node_allocated(&target->vma->node)); + GEM_BUG_ON(!IS_ALIGNED(start, I915_GTT_PAGE_SIZE)); + GEM_BUG_ON(!IS_ALIGNED(end, I915_GTT_PAGE_SIZE)); + + if (i915_vm_has_cache_coloring(vm)) { + /* Expand search to cover neighbouring guard pages (or lack!) */ + if (hole_start) + hole_start -= I915_GTT_PAGE_SIZE; + + /* Always look at the page afterwards to avoid the end-of-GTT */ + hole_end += I915_GTT_PAGE_SIZE; + } + GEM_BUG_ON(hole_start >= hole_end); + + drm_mm_for_each_node_in_range(node, &vm->mm, hole_start, hole_end) { + GEM_BUG_ON(node == &target->vma->node); + + /* If we find any non-objects (!vma), we cannot evict them */ + if (node->color == I915_COLOR_UNEVICTABLE) { + err = -ENOSPC; + goto err; + } + + /* + * If we are using coloring to insert guard pages between + * different cache domains within the address space, we have + * to check whether the objects on either side of our range + * abut and conflict. If they are in conflict, then we evict + * those as well to make room for our guard pages. + */ + if (i915_vm_has_cache_coloring(vm)) { + if (node_end(node) == start && node->color == color) + continue; + + if (node->start == end && node->color == color) + continue; + } + + GEM_BUG_ON(!drm_mm_node_allocated(node)); + vma = container_of(node, typeof(*vma), node); + + if (i915_vma_is_pinned(vma)) { + err = -ENOSPC; + goto err; + } + + /* If this VMA is already being freed, or idle, steal it! */ + if (!i915_active_acquire_if_busy(&vma->active)) { + list_move(&vma->vm_link, &steal_list); + continue; + } + + if (flags & PIN_NONBLOCK) + err = -EAGAIN; + else + err = await_evict(work, vma); + i915_active_release(&vma->active); + if (err) + goto err; + + GEM_BUG_ON(!i915_vma_is_active(vma)); + list_move(&vma->vm_link, &evict_list); + } + + list_for_each_entry_safe(vma, next, &steal_list, vm_link) { + atomic_and(~I915_VMA_BIND_MASK, &vma->flags); + __i915_vma_evict(vma); + drm_mm_remove_node(&vma->node); + /* No ref held; vma may now be concurrently freed */ + } + + /* No overlapping nodes to evict, claim the slot for ourselves! */ + if (list_empty(&evict_list)) + return drm_mm_reserve_node(&vm->mm, &target->vma->node); + + /* + * Mark this range as reserved. + * + * We have not yet removed the PTEs for the old evicted nodes, so + * must prevent this range from being reused for anything else. The + * PTE will be cleared when the range is idle (during the rebind + * phase in the worker).
+ */ + target->hole.color = I915_COLOR_UNEVICTABLE; + target->hole.start = start; + target->hole.size = end; + + list_for_each_entry(vma, &evict_list, vm_link) { + target->hole.start = + min(target->hole.start, vma->node.start); + target->hole.size = + max(target->hole.size, node_end(&vma->node)); + + GEM_BUG_ON(vma->node.mm != &vm->mm); + drm_mm_remove_node(&vma->node); + atomic_and(~I915_VMA_BIND_MASK, &vma->flags); + GEM_BUG_ON(i915_vma_is_pinned(vma)); + } + list_splice(&evict_list, &work->evict_list); + + target->hole.size -= target->hole.start; + + return drm_mm_reserve_node(&vm->mm, &target->hole); + +err: + list_splice(&evict_list, &vm->bound_list); + list_splice(&steal_list, &vm->bound_list); + return err; +} + +static int +evict_in_range(struct eb_vm_work *work, + struct eb_vma * const target, + u64 start, u64 end, u64 align) +{ + struct i915_address_space *vm = target->vma->vm; + struct i915_vma *vma, *next; + struct drm_mm_scan scan; + LIST_HEAD(evict_list); + bool found = false; + + lockdep_assert_held(&vm->mutex); + + drm_mm_scan_init_with_range(&scan, &vm->mm, + target->vma->node.size, + align, + target->vma->node.color, + start, end, + DRM_MM_INSERT_BEST); + + list_for_each_entry_safe(vma, next, &vm->bound_list, vm_link) { + if (i915_vma_is_pinned(vma)) + continue; + + list_move(&vma->vm_link, &evict_list); + if (drm_mm_scan_add_block(&scan, &vma->node)) { + target->vma->node.start = + round_up(scan.hit_start, align); + found = true; + break; + } + } + + list_for_each_entry(vma, &evict_list, vm_link) + drm_mm_scan_remove_block(&scan, &vma->node); + list_splice(&evict_list, &vm->bound_list); + if (!found) + return -ENOSPC; + + return evict_for_node(work, target, 0); +} + +static u64 random_offset(u64 start, u64 end, u64 len, u64 align) +{ + u64 range, addr; + + GEM_BUG_ON(range_overflows(start, len, end)); + GEM_BUG_ON(round_up(start, align) > round_down(end - len, align)); + + range = round_down(end - len, align) - round_up(start, align); + if (range) { + if (sizeof(unsigned long) == sizeof(u64)) { + addr = get_random_long(); + } else { + addr = get_random_int(); + if (range > U32_MAX) { + addr <<= 32; + addr |= get_random_int(); + } + } + div64_u64_rem(addr, range, &addr); + start += addr; + } + + return round_up(start, align); +} + +static u64 align0(u64 align) +{ + return align <= I915_GTT_MIN_ALIGNMENT ? 0 : align; +} + +static struct drm_mm_node *__best_hole(struct drm_mm *mm, u64 size) +{ + struct rb_node *rb = mm->holes_size.rb_root.rb_node; + struct drm_mm_node *best = NULL; + + do { + struct drm_mm_node *node = + rb_entry(rb, struct drm_mm_node, rb_hole_size); + + if (size <= node->hole_size) { + best = node; + rb = rb->rb_right; + } else { + rb = rb->rb_left; + } + } while (rb); + + return best; +} + +static int best_hole(struct drm_mm *mm, struct drm_mm_node *node, + u64 start, u64 end, u64 align) +{ + struct drm_mm_node *hole; + u64 size = node->size; + + do { + hole = __best_hole(mm, size); + if (!hole) + return -ENOSPC; + + node->start = round_up(max(start, drm_mm_hole_node_start(hole)), + align); + if (min(drm_mm_hole_node_end(hole), end) >= + node->start + node->size) + return drm_mm_reserve_node(mm, node); + + /* + * Too expensive to search for every single hole every time, + * so just look for the next bigger hole, introducing enough + * space for alignments. Finding the smallest hole with ideal + * alignment scales very poorly, so we choose to waste space + * if an alignment is forced. 
On the other hand, simply + * randomly selecting an offset in 48b space will cause us + * to use the majority of that space and exhaust all memory + * in storing the page directories. Compromise is required. + */ + size = hole->hole_size + align; + } while (1); +} + +static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev) { struct drm_i915_gem_exec_object2 *entry = ev->exec; + const unsigned int exec_flags = ev->flags; struct i915_vma *vma = ev->vma; + struct i915_address_space *vm = vma->vm; + u64 start = 0, end = vm->total; + u64 align = entry->alignment ?: I915_GTT_MIN_ALIGNMENT; + unsigned int bind_flags; int err; - if (drm_mm_node_allocated(&vma->node) && - eb_vma_misplaced(entry, vma, ev->flags)) { - err = i915_vma_unbind(vma); - if (err) - return err; + lockdep_assert_held(&vm->mutex); + + bind_flags = PIN_USER; + if (exec_flags & EXEC_OBJECT_NEEDS_GTT) + bind_flags |= PIN_GLOBAL; + + if (drm_mm_node_allocated(&vma->node)) + goto pin; + + GEM_BUG_ON(i915_vma_is_pinned(vma)); + GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_BIND_MASK)); + GEM_BUG_ON(i915_active_fence_isset(&vma->active.excl)); + GEM_BUG_ON(!vma->size); + + /* Reuse old address (if it doesn't conflict with new requirements) */ + if (eb_vma_misplaced(entry, vma, exec_flags)) { + vma->node.start = entry->offset & PIN_OFFSET_MASK; + vma->node.size = max(entry->pad_to_size, vma->size); + vma->node.color = 0; + if (i915_vm_has_cache_coloring(vm)) + vma->node.color = vma->obj->cache_level; } - err = i915_vma_pin(vma, - entry->pad_to_size, entry->alignment, - eb_pin_flags(entry, ev->flags) | pin_flags); - if (err) - return err; + /* + * Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset, + * limit address to the first 4GBs for unflagged objects. + */ + if (!(exec_flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)) + end = min_t(u64, end, (1ULL << 32) - I915_GTT_PAGE_SIZE); - if (entry->offset != vma->node.start) { - entry->offset = vma->node.start | UPDATE; - eb->args->flags |= __EXEC_HAS_RELOC; + align = max(align, vma->display_alignment); + if (exec_flags & __EXEC_OBJECT_NEEDS_MAP) { + vma->node.size = max_t(u64, vma->node.size, vma->fence_size); + end = min_t(u64, end, i915_vm_to_ggtt(vm)->mappable_end); + align = max_t(u64, align, vma->fence_alignment); } - if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) { - err = i915_vma_pin_fence(vma); - if (unlikely(err)) { - i915_vma_unpin(vma); + if (exec_flags & __EXEC_OBJECT_NEEDS_BIAS) + start = BATCH_OFFSET_BIAS; + + GEM_BUG_ON(!vma->node.size); + if (vma->node.size > end - start) + return -E2BIG; + + /* Try the user's preferred location first (mandatory if soft-pinned) */ + err = -__EINVAL__; + if (vma->node.start >= start && + IS_ALIGNED(vma->node.start, align) && + !range_overflows(vma->node.start, vma->node.size, end)) { + unsigned int pin_flags; + + if (drm_mm_reserve_node(&vm->mm, &vma->node) == 0) + goto pin; + + pin_flags = 0; + if (!(exec_flags & EXEC_OBJECT_PINNED)) + pin_flags = PIN_NONBLOCK; + + err = evict_for_node(work, ev, pin_flags); + if (err == 0) + goto pin; + } + if (exec_flags & EXEC_OBJECT_PINNED) + return err; + + /* Try the first available free space */ + if (!best_hole(&vm->mm, &vma->node, start, end, align)) + goto pin; + + /* Pick a random slot and see if it's available [O(N) worst case] */ + vma->node.start = random_offset(start, end, vma->node.size, align); + if (evict_for_node(work, ev, 0) == 0) + goto pin; + + pr_err("nothing!: size:%llx, align:%llx\n", vma->node.size, align); + + /* Otherwise search all free space [degrades to O(N^2)] 
*/ + if (drm_mm_insert_node_in_range(&vm->mm, &vma->node, + vma->node.size, + align0(align), + vma->node.color, + start, end, + DRM_MM_INSERT_BEST) == 0) + goto pin; + + /* Pretty busy! Loop over "LRU" and evict oldest in our search range */ + err = evict_in_range(work, ev, start, end, align); + if (unlikely(err)) + return err; + +pin: + if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { + err = __i915_vma_pin_fence(vma); /* XXX no waiting */ + if (unlikely(err)) return err; - } if (vma->fence) ev->flags |= __EXEC_OBJECT_HAS_FENCE; } + bind_flags &= ~atomic_read(&vma->flags); + if (bind_flags) { + err = set_bind_fence(vma, work); + if (unlikely(err)) + return err; + + atomic_add(I915_VMA_PAGES_ACTIVE, &vma->pages_count); + atomic_or(bind_flags, &vma->flags); + + if (i915_vma_is_ggtt(vma)) + __i915_vma_set_map_and_fenceable(vma); + + GEM_BUG_ON(!i915_vma_is_active(vma)); + list_move_tail(&vma->vm_link, &vm->bound_list); + ev->bind_flags = bind_flags; + } + __i915_vma_pin(vma); /* and release */ + + GEM_BUG_ON(!bind_flags && !drm_mm_node_allocated(&vma->node)); + GEM_BUG_ON(!(drm_mm_node_allocated(&vma->node) ^ + drm_mm_node_allocated(&ev->hole))); + + if (entry->offset != vma->node.start) { + entry->offset = vma->node.start | UPDATE; + *work->p_flags |= __EXEC_HAS_RELOC; + } + ev->flags |= __EXEC_OBJECT_HAS_PIN; GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags)); return 0; } -static int eb_reserve(struct i915_execbuffer *eb) +static int __eb_bind_vma(struct eb_vm_work *work, int err) { - const unsigned int count = eb->buffer_count; - unsigned int pin_flags = PIN_USER | PIN_NONBLOCK; - struct list_head last; + struct i915_address_space *vm = work->vm; struct eb_vma *ev; - unsigned int i, pass; - int err = 0; + + GEM_BUG_ON(!intel_gt_pm_is_awake(vm->gt)); /* - * Attempt to pin all of the buffers into the GTT. - * This is done in 3 phases: - * - * 1a. Unbind all objects that do not match the GTT constraints for - * the execbuffer (fenceable, mappable, alignment etc). - * 1b. Increment pin count for already bound objects. - * 2. Bind new objects. - * 3. Decrement pin count. - * - * This avoid unnecessary unbinding of later objects in order to make - * room for the earlier objects *unless* we need to defragment. + * We have to wait until the stale nodes are completely idle before + * we can remove their PTE and unbind their pages. Hence, after + * claiming their slot in the drm_mm, we defer their removal to + * after the fences are signaled. */ + if (!list_empty(&work->evict_list)) { + struct i915_vma *vma, *vn; + + mutex_lock(&vm->mutex); + list_for_each_entry_safe(vma, vn, &work->evict_list, vm_link) { + GEM_BUG_ON(vma->vm != vm); + __i915_vma_evict(vma); + GEM_BUG_ON(!i915_vma_is_active(vma)); + } + mutex_unlock(&vm->mutex); + } + + /* + * Now we know the nodes we require in drm_mm are idle, we can + * replace the PTE in those ranges with our own. 
+ */ + list_for_each_entry(ev, &work->unbound, bind_link) { + struct i915_vma *vma = ev->vma; + + if (!ev->bind_flags) + goto put; - if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex)) - return -EINTR; + GEM_BUG_ON(vma->vm != vm); + GEM_BUG_ON(!i915_vma_is_active(vma)); + + if (err == 0) + err = vma->ops->bind_vma(vma, + vma->obj->cache_level, + ev->bind_flags | + I915_VMA_ALLOC); + if (err) + atomic_and(~ev->bind_flags, &vma->flags); + + if (drm_mm_node_allocated(&ev->hole)) { + mutex_lock(&vm->mutex); + GEM_BUG_ON(ev->hole.mm != &vm->mm); + GEM_BUG_ON(ev->hole.color != I915_COLOR_UNEVICTABLE); + GEM_BUG_ON(drm_mm_node_allocated(&vma->node)); + drm_mm_remove_node(&ev->hole); + if (!err) { + drm_mm_reserve_node(&vm->mm, &vma->node); + GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); + } else { + list_del_init(&vma->vm_link); + } + mutex_unlock(&vm->mutex); + } + ev->bind_flags = 0; + +put: + GEM_BUG_ON(drm_mm_node_allocated(&ev->hole)); + } + INIT_LIST_HEAD(&work->unbound); + + return err; +} + +static int eb_bind_vma(struct dma_fence_work *base) +{ + struct eb_vm_work *work = container_of(base, typeof(*work), base); + + return __eb_bind_vma(work, 0); +} + +static void eb_vma_work_release(struct dma_fence_work *base) +{ + struct eb_vm_work *work = container_of(base, typeof(*work), base); + + if (work->active) { + if (!list_empty(&work->unbound)) { + GEM_BUG_ON(!work->base.dma.error); + __eb_bind_vma(work, work->base.dma.error); + } + i915_active_release(&work->vm->active); + } + + eb_vma_array_put(work->array); +} + +static const struct dma_fence_work_ops eb_bind_ops = { + .name = "eb_bind", + .work = eb_bind_vma, + .release = eb_vma_work_release, +}; + +static struct eb_vm_work *eb_vm_work(struct i915_execbuffer *eb) +{ + struct eb_vm_work *work; + + work = kzalloc(sizeof(*work), GFP_KERNEL); + if (!work) + return NULL; + + dma_fence_work_init(&work->base, &eb_bind_ops); + list_replace_init(&eb->unbound, &work->unbound); + work->array = eb_vma_array_get(eb->array); + work->p_flags = &eb->args->flags; + work->vm = eb->context->vm; + + /* Preallocate our slot in vm->active, outside of vm->mutex */ + work->active = i915_gem_context_async_id(eb->gem_context); + if (i915_active_acquire_for_context(&work->vm->active, work->active)) { + work->active = 0; + work->base.dma.error = -ENOMEM; + dma_fence_work_commit(&work->base); + return NULL; + } + + INIT_LIST_HEAD(&work->evict_list); + + GEM_BUG_ON(list_empty(&work->unbound)); + GEM_BUG_ON(!list_empty(&eb->unbound)); + + return work; +} + +static int eb_vm_throttle(struct eb_vm_work *work) +{ + struct dma_fence *p; + int err; + + /* Keep async work queued per context */ + p = __i915_active_ref(&work->vm->active, work->active, &work->base.dma); + if (IS_ERR_OR_NULL(p)) + return PTR_ERR_OR_ZERO(p); + + err = i915_sw_fence_await_dma_fence(&work->base.chain, p, 0, + GFP_NOWAIT | __GFP_NOWARN); + dma_fence_put(p); + + return err < 0 ? 
err : 0; +} + +static int eb_prepare_vma(struct eb_vma *ev) +{ + struct i915_vma *vma = ev->vma; + int err; + + ev->hole.flags = 0; + ev->bind_flags = 0; + + if (!(ev->flags & __EXEC_OBJECT_HAS_PAGES)) { + err = i915_vma_get_pages(vma); + if (err) + return err; + + ev->flags |= __EXEC_OBJECT_HAS_PAGES; + } + + return 0; +} + +static int eb_reserve(struct i915_execbuffer *eb) +{ + const unsigned int count = eb->buffer_count; + struct i915_address_space *vm = eb->context->vm; + struct list_head last; + unsigned int i, pass; + struct eb_vma *ev; + int err = 0; pass = 0; do { + struct eb_vm_work *work; + list_for_each_entry(ev, &eb->unbound, bind_link) { - err = eb_reserve_vma(eb, ev, pin_flags); + err = eb_prepare_vma(ev); + switch (err) { + case 0: + break; + case -EAGAIN: + goto retry; + default: + return err; + } + } + + work = eb_vm_work(eb); + if (!work) + return -ENOMEM; + + /* No allocations allowed beyond this point */ + if (mutex_lock_interruptible(&vm->mutex)) { + work->base.dma.error = -EINTR; + dma_fence_work_commit(&work->base); + return -EINTR; + } + + err = eb_vm_throttle(work); + if (err) { + mutex_unlock(&vm->mutex); + work->base.dma.error = err; + dma_fence_work_commit(&work->base); + return err; + } + + list_for_each_entry(ev, &work->unbound, bind_link) { + struct i915_vma *vma = ev->vma; + + /* + * Check if this node is being evicted or must be. + * + * As we use the single node inside the vma to track + * both the eviction and where to insert the new node, + * we cannot handle migrating the vma inside the worker. + */ + if (drm_mm_node_allocated(&vma->node)) { + if (eb_vma_misplaced(ev->exec, vma, ev->flags)) { + err = -ENOSPC; + break; + } + } else { + if (i915_vma_is_active(vma)) { + err = -ENOSPC; + break; + } + } + + err = i915_active_acquire(&vma->active); + if (!err) { + err = eb_reserve_vma(work, ev); + i915_active_release(&vma->active); + } if (err) break; } - if (!(err == -ENOSPC || err == -EAGAIN)) - break; + mutex_unlock(&vm->mutex); + + dma_fence_get(&work->base.dma); + dma_fence_work_commit_imm(&work->base); + if (err == -ENOSPC && dma_fence_wait(&work->base.dma, true)) + err = -EINTR; + dma_fence_put(&work->base.dma); + if (err != -ENOSPC) + return err; + +retry: /* Resort *all* the objects into priority order */ INIT_LIST_HEAD(&eb->unbound); INIT_LIST_HEAD(&last); @@ -739,37 +1376,52 @@ static int eb_reserve(struct i915_execbuffer *eb) } list_splice_tail(&last, &eb->unbound); + if (signal_pending(current)) + return -EINTR; + if (err == -EAGAIN) { - mutex_unlock(&eb->i915->drm.struct_mutex); flush_workqueue(eb->i915->mm.userptr_wq); - mutex_lock(&eb->i915->drm.struct_mutex); continue; } - switch (pass++) { - case 0: - break; + /* Now safe to wait with no reservations held */ + list_for_each_entry(ev, &eb->unbound, bind_link) { + struct i915_vma *vma = ev->vma; - case 1: - /* Too fragmented, unbind everything and retry */ - mutex_lock(&eb->context->vm->mutex); - err = i915_gem_evict_vm(eb->context->vm); - mutex_unlock(&eb->context->vm->mutex); + GEM_BUG_ON(ev->flags & __EXEC_OBJECT_HAS_PIN); + + if (drm_mm_node_allocated(&vma->node) && + eb_vma_misplaced(ev->exec, vma, ev->flags)) { + err = i915_vma_unbind(vma); + if (err) + return err; + } + + /* Wait for previous to avoid reusing vma->node */ + err = i915_vma_wait_for_unbind(vma); if (err) - goto unlock; - break; + return err; + } + switch (pass++) { default: - err = -ENOSPC; - goto unlock; - } + return -ENOSPC; - pin_flags = PIN_USER; - } while (1); + case 2: + if (intel_gt_wait_for_idle(vm->gt, + 
MAX_SCHEDULE_TIMEOUT)) + return -EINTR; -unlock: - mutex_unlock(&eb->i915->drm.struct_mutex); - return err; + fallthrough; + case 1: + if (i915_active_wait(&vm->active)) + return -EINTR; + + fallthrough; + case 0: + break; + } + } while (1); } static unsigned int eb_batch_index(const struct i915_execbuffer *eb) @@ -1235,6 +1887,8 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj, { void *vaddr; + GEM_BUG_ON(page >= obj->base.size >> PAGE_SHIFT); + if (cache->page == page) { vaddr = unmask_page(cache->vaddr); } else { @@ -1244,6 +1898,7 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj, if (!vaddr) vaddr = reloc_kmap(obj, cache, page); } + GEM_BUG_ON(!vaddr); return vaddr; } @@ -1596,6 +2251,8 @@ eb_relocate_entry(struct i915_execbuffer *eb, if (unlikely(!target)) return -ENOENT; + GEM_BUG_ON(!i915_vma_is_pinned(target->vma)); + /* Validate that the target is in a valid r/w GPU domain */ if (unlikely(reloc->write_domain & (reloc->write_domain - 1))) { drm_dbg(&i915->drm, "reloc with multiple write domains: " @@ -1872,7 +2529,6 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) err = i915_vma_move_to_active(vma, eb->request, flags); i915_vma_unlock(vma); - eb_unreserve_vma(ev); } ww_acquire_fini(&acquire); @@ -2783,7 +3439,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, * snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure * batch" bit. Hence we need to pin secure batches into the global gtt. * hsw should have this fixed, but bdw mucks it up again. */ - batch = eb.batch->vma; + batch = i915_vma_get(eb.batch->vma); if (eb.batch_flags & I915_DISPATCH_SECURE) { struct i915_vma *vma; @@ -2803,6 +3459,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, goto err_parse; } + GEM_BUG_ON(vma->obj != batch->obj); batch = vma; } @@ -2882,6 +3539,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_parse: if (batch->private) intel_gt_buffer_pool_put(batch->private); + i915_vma_put(batch); err_vma: if (eb.trampoline) i915_vma_unpin(eb.trampoline); diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index f4fec7eb4064..2c5ac598ade2 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -370,6 +370,7 @@ static struct i915_vma *pd_vma_create(struct gen6_ppgtt *ppgtt, int size) atomic_set(&vma->flags, I915_VMA_GGTT); vma->ggtt_view.type = I915_GGTT_VIEW_ROTATED; /* prevent fencing */ + INIT_LIST_HEAD(&vma->vm_link); INIT_LIST_HEAD(&vma->obj_link); INIT_LIST_HEAD(&vma->closed_link); diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 66165b10256e..dfc979a24450 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -568,7 +568,8 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, if (flags & I915_VMA_LOCAL_BIND) { struct i915_ppgtt *alias = i915_vm_to_ggtt(vma->vm)->alias; - if (flags & I915_VMA_ALLOC) { + if (flags & I915_VMA_ALLOC && + !test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) { ret = alias->vm.allocate_va_range(&alias->vm, vma->node.start, vma->size); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 2a72cce63fd9..82d4f943c346 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -194,6 +194,8 @@ void __i915_vm_close(struct i915_address_space *vm) void i915_address_space_fini(struct i915_address_space *vm) { + i915_active_fini(&vm->active); + spin_lock(&vm->free_pages.lock); if 
(pagevec_count(&vm->free_pages.pvec)) vm_free_pages_release(vm, true); @@ -246,6 +248,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) drm_mm_init(&vm->mm, 0, vm->total); vm->mm.head_node.color = I915_COLOR_UNEVICTABLE; + i915_active_init(&vm->active, NULL, NULL); + stash_init(&vm->free_pages); INIT_LIST_HEAD(&vm->bound_list); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index d93ebdf3fa0e..773fc76dfa1b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -263,6 +263,8 @@ struct i915_address_space { */ struct list_head bound_list; + struct i915_active active; + struct pagestash free_pages; /* Global GTT */ diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c index f86f7e68ce5e..ecdd58f4b993 100644 --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c @@ -162,7 +162,8 @@ static int ppgtt_bind_vma(struct i915_vma *vma, u32 pte_flags; int err; - if (flags & I915_VMA_ALLOC) { + if (flags & I915_VMA_ALLOC && + !test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) { err = vma->vm->allocate_va_range(vma->vm, vma->node.start, vma->size); if (err) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 0cbcb9f54e7d..6effa85532c6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -984,6 +984,9 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, return vma; if (i915_vma_misplaced(vma, size, alignment, flags)) { + if (flags & PIN_NOEVICT) + return ERR_PTR(-ENOSPC); + if (flags & PIN_NONBLOCK) { if (i915_vma_is_pinned(vma) || i915_vma_is_active(vma)) return ERR_PTR(-ENOSPC); @@ -998,6 +1001,10 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, return ERR_PTR(ret); } + if (flags & PIN_NONBLOCK && + i915_active_fence_isset(&vma->active.excl)) + return ERR_PTR(-EAGAIN); + ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL); if (ret) return ERR_PTR(ret); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index cb43381b0d37..7e1225874b03 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -219,6 +219,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, mode = DRM_MM_INSERT_HIGHEST; if (flags & PIN_MAPPABLE) mode = DRM_MM_INSERT_LOW; + if (flags & PIN_NOSEARCH) + mode |= DRM_MM_INSERT_ONCE; /* We only allocate in PAGE_SIZE/GTT_PAGE_SIZE (4096) chunks, * so we know that we always have a minimum alignment of 4096. 
@@ -236,6 +238,9 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, if (err != -ENOSPC) return err; + if (flags & PIN_NOSEARCH) + return -ENOSPC; + if (mode & DRM_MM_INSERT_ONCE) { err = drm_mm_insert_node_in_range(&vm->mm, node, size, alignment, color, diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index fc14ebf9a0b7..9da9ef99a8bb 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -132,6 +132,7 @@ vma_create(struct drm_i915_gem_object *obj, fs_reclaim_release(GFP_KERNEL); } + INIT_LIST_HEAD(&vma->vm_link); INIT_LIST_HEAD(&vma->closed_link); if (view && view->type != I915_GGTT_VIEW_NORMAL) { @@ -342,25 +343,37 @@ struct i915_vma_work *i915_vma_work(void) return vw; } -int i915_vma_wait_for_bind(struct i915_vma *vma) +static int +__i915_vma_wait_excl(struct i915_vma *vma, bool bound, unsigned int flags) { + struct dma_fence *fence; int err = 0; - if (rcu_access_pointer(vma->active.excl.fence)) { - struct dma_fence *fence; + fence = i915_active_fence_get(&vma->active.excl); + if (!fence) + return 0; - rcu_read_lock(); - fence = dma_fence_get_rcu_safe(&vma->active.excl.fence); - rcu_read_unlock(); - if (fence) { - err = dma_fence_wait(fence, MAX_SCHEDULE_TIMEOUT); - dma_fence_put(fence); - } + if (drm_mm_node_allocated(&vma->node) == bound) { + if (flags & PIN_NOEVICT) + err = -EBUSY; + else + err = dma_fence_wait(fence, true); } + dma_fence_put(fence); return err; } +int i915_vma_wait_for_bind(struct i915_vma *vma) +{ + return __i915_vma_wait_excl(vma, true, 0); +} + +int i915_vma_wait_for_unbind(struct i915_vma *vma) +{ + return __i915_vma_wait_excl(vma, false, 0); +} + /** * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space. * @vma: VMA to map @@ -630,8 +643,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) u64 start, end; int ret; - GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND)); + GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_BIND_MASK)); GEM_BUG_ON(drm_mm_node_allocated(&vma->node)); + GEM_BUG_ON(i915_active_fence_isset(&vma->active.excl)); size = max(size, vma->size); alignment = max(alignment, vma->display_alignment); @@ -727,7 +741,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, color)); - list_add_tail(&vma->vm_link, &vma->vm->bound_list); + list_move_tail(&vma->vm_link, &vma->vm->bound_list); return 0; } @@ -735,15 +749,12 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) static void i915_vma_detach(struct i915_vma *vma) { - GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); - GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND)); - /* * And finally now the object is completely decoupled from this * vma, we can drop its hold on the backing storage and allow * it to be reaped by the shrinker. 
*/ - list_del(&vma->vm_link); + list_del_init(&vma->vm_link); } static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) @@ -789,7 +800,7 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) return pinned; } -static int vma_get_pages(struct i915_vma *vma) +int i915_vma_get_pages(struct i915_vma *vma) { int err = 0; @@ -836,7 +847,7 @@ static void __vma_put_pages(struct i915_vma *vma, unsigned int count) mutex_unlock(&vma->pages_mutex); } -static void vma_put_pages(struct i915_vma *vma) +void i915_vma_put_pages(struct i915_vma *vma) { if (atomic_add_unless(&vma->pages_count, -1, 1)) return; @@ -853,9 +864,13 @@ static void vma_unbind_pages(struct i915_vma *vma) /* The upper portion of pages_count is the number of bindings */ count = atomic_read(&vma->pages_count); count >>= I915_VMA_PAGES_BIAS; - GEM_BUG_ON(!count); + if (count) + __vma_put_pages(vma, count | count << I915_VMA_PAGES_BIAS); +} - __vma_put_pages(vma, count | count << I915_VMA_PAGES_BIAS); +static int __wait_for_unbind(struct i915_vma *vma, unsigned int flags) +{ + return __i915_vma_wait_excl(vma, false, flags); } int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) @@ -875,10 +890,14 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) if (try_qad_pin(vma, flags & I915_VMA_BIND_MASK)) return 0; - err = vma_get_pages(vma); + err = i915_vma_get_pages(vma); if (err) return err; + err = __wait_for_unbind(vma, flags); + if (err) + goto err_pages; + if (flags & vma->vm->bind_async_flags) { work = i915_vma_work(); if (!work) { @@ -940,6 +959,10 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) goto err_unlock; if (!(bound & I915_VMA_BIND_MASK)) { + err = __wait_for_unbind(vma, flags); + if (err) + goto err_active; + err = i915_vma_insert(vma, size, alignment, flags); if (err) goto err_active; @@ -959,6 +982,7 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) GEM_BUG_ON(bound + I915_VMA_PAGES_ACTIVE < bound); atomic_add(I915_VMA_PAGES_ACTIVE, &vma->pages_count); list_move_tail(&vma->vm_link, &vma->vm->bound_list); + GEM_BUG_ON(!i915_vma_is_active(vma)); __i915_vma_pin(vma); GEM_BUG_ON(!i915_vma_is_pinned(vma)); @@ -980,7 +1004,7 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) if (wakeref) intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref); err_pages: - vma_put_pages(vma); + i915_vma_put_pages(vma); return err; } @@ -1084,6 +1108,7 @@ void i915_vma_release(struct kref *ref) GEM_BUG_ON(drm_mm_node_allocated(&vma->node)); } GEM_BUG_ON(i915_vma_is_active(vma)); + GEM_BUG_ON(!list_empty(&vma->vm_link)); if (vma->obj) { struct drm_i915_gem_object *obj = vma->obj; @@ -1142,7 +1167,7 @@ static void __i915_vma_iounmap(struct i915_vma *vma) { GEM_BUG_ON(i915_vma_is_pinned(vma)); - if (vma->iomap == NULL) + if (!vma->iomap) return; io_mapping_unmap(vma->iomap); @@ -1232,31 +1257,9 @@ int i915_vma_move_to_active(struct i915_vma *vma, return 0; } -int __i915_vma_unbind(struct i915_vma *vma) +void __i915_vma_evict(struct i915_vma *vma) { - int ret; - - lockdep_assert_held(&vma->vm->mutex); - - if (i915_vma_is_pinned(vma)) { - vma_print_allocator(vma, "is pinned"); - return -EAGAIN; - } - - /* - * After confirming that no one else is pinning this vma, wait for - * any laggards who may have crept in during the wait (through - * a residual pin skipping the vm->mutex) to complete. 
- */ - ret = i915_vma_sync(vma); - if (ret) - return ret; - - if (!drm_mm_node_allocated(&vma->node)) - return 0; - GEM_BUG_ON(i915_vma_is_pinned(vma)); - GEM_BUG_ON(i915_vma_is_active(vma)); if (i915_vma_is_map_and_fenceable(vma)) { /* Force a pagefault for domain tracking on next user access */ @@ -1295,6 +1298,33 @@ int __i915_vma_unbind(struct i915_vma *vma) i915_vma_detach(vma); vma_unbind_pages(vma); +} + +int __i915_vma_unbind(struct i915_vma *vma) +{ + int ret; + + lockdep_assert_held(&vma->vm->mutex); + + if (!drm_mm_node_allocated(&vma->node)) + return 0; + + if (i915_vma_is_pinned(vma)) { + vma_print_allocator(vma, "is pinned"); + return -EAGAIN; + } + + /* + * After confirming that no one else is pinning this vma, wait for + * any laggards who may have crept in during the wait (through + * a residual pin skipping the vm->mutex) to complete. + */ + ret = i915_vma_sync(vma); + if (ret) + return ret; + + GEM_BUG_ON(i915_vma_is_active(vma)); + __i915_vma_evict(vma); drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */ return 0; @@ -1306,13 +1336,13 @@ int i915_vma_unbind(struct i915_vma *vma) intel_wakeref_t wakeref = 0; int err; - if (!drm_mm_node_allocated(&vma->node)) - return 0; - /* Optimistic wait before taking the mutex */ err = i915_vma_sync(vma); if (err) - goto out_rpm; + return err; + + if (!drm_mm_node_allocated(&vma->node)) + return 0; if (i915_vma_is_pinned(vma)) { vma_print_allocator(vma, "is pinned"); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 8ad1daabcd58..478e8679f331 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -203,6 +203,7 @@ bool i915_vma_misplaced(const struct i915_vma *vma, u64 size, u64 alignment, u64 flags); void __i915_vma_set_map_and_fenceable(struct i915_vma *vma); void i915_vma_revoke_mmap(struct i915_vma *vma); +void __i915_vma_evict(struct i915_vma *vma); int __i915_vma_unbind(struct i915_vma *vma); int __must_check i915_vma_unbind(struct i915_vma *vma); void i915_vma_unlink_ctx(struct i915_vma *vma); @@ -239,6 +240,9 @@ int __must_check i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags); int i915_ggtt_pin(struct i915_vma *vma, u32 align, unsigned int flags); +int i915_vma_get_pages(struct i915_vma *vma); +void i915_vma_put_pages(struct i915_vma *vma); + static inline int i915_vma_pin_count(const struct i915_vma *vma) { return atomic_read(&vma->flags) & I915_VMA_PIN_MASK; @@ -376,6 +380,7 @@ void i915_vma_make_shrinkable(struct i915_vma *vma); void i915_vma_make_purgeable(struct i915_vma *vma); int i915_vma_wait_for_bind(struct i915_vma *vma); +int i915_vma_wait_for_unbind(struct i915_vma *vma); static inline int i915_vma_sync(struct i915_vma *vma) { From patchwork Mon May 4 04:49:02 2020
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Mon, 4 May 2020 05:49:02 +0100
Message-Id: <20200504044903.7626-21-chris@chris-wilson.co.uk>
In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 21/22] drm/i915/gem: Bind the fence async for execbuf

It is illegal to wait on another vma while holding the vm->mutex, as that easily leads to ABBA deadlocks (we wait on a second vma that, in turn, waits on us to release the vm->mutex). So, for as long as the vm->mutex exists, move the waiting outside of the lock and into the async binding pipeline.
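To illustrate, a minimal sketch (not part of the patch; the helper name is hypothetical, modelled on the set_bind_fence() added below):

/*
 * ABBA hazard: thread A holds vm->mutex and waits on vma B's fence;
 * thread B must take vm->mutex before it can signal that fence.
 * Instead of dma_fence_wait() under the lock, record the dependency
 * on the async bind worker, which waits after the lock is dropped.
 */
static int defer_wait_on_fence(struct dma_fence_work *work,
			       struct dma_fence *prev)
{
	/* The worker will not be run until 'prev' has signaled. */
	return i915_sw_fence_await_dma_fence(&work->chain, prev, 0,
					     GFP_NOWAIT | __GFP_NOWARN);
}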
Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 41 ++++-- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 137 +++++++++++++++++- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h | 5 + 3 files changed, 166 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 924d1ed50732..60f49efec241 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -519,13 +519,23 @@ eb_pin_vma(struct i915_execbuffer *eb, } if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) { - if (unlikely(i915_vma_pin_fence(vma))) { - i915_vma_unpin(vma); - return false; - } + struct i915_fence_reg *reg = vma->fence; - if (vma->fence) + /* Avoid waiting to change the fence; defer to async worker */ + if (reg) { + if (READ_ONCE(reg->dirty)) { + __i915_vma_unpin(vma); + return false; + } + + atomic_inc(®->pin_count); ev->flags |= __EXEC_OBJECT_HAS_FENCE; + } else { + if (i915_gem_object_is_tiled(vma->obj)) { + __i915_vma_unpin(vma); + return false; + } + } } ev->flags |= __EXEC_OBJECT_HAS_PIN; @@ -1055,15 +1065,6 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev) return err; pin: - if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { - err = __i915_vma_pin_fence(vma); /* XXX no waiting */ - if (unlikely(err)) - return err; - - if (vma->fence) - ev->flags |= __EXEC_OBJECT_HAS_FENCE; - } - bind_flags &= ~atomic_read(&vma->flags); if (bind_flags) { err = set_bind_fence(vma, work); @@ -1094,6 +1095,15 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev) ev->flags |= __EXEC_OBJECT_HAS_PIN; GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags)); + if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { + err = __i915_vma_pin_fence_async(vma, &work->base); + if (unlikely(err)) + return err; + + if (vma->fence) + ev->flags |= __EXEC_OBJECT_HAS_FENCE; + } + return 0; } @@ -1129,6 +1139,9 @@ static int __eb_bind_vma(struct eb_vm_work *work, int err) list_for_each_entry(ev, &work->unbound, bind_link) { struct i915_vma *vma = ev->vma; + if (ev->flags & __EXEC_OBJECT_HAS_FENCE) + __i915_vma_apply_fence_async(vma); + if (!ev->bind_flags) goto put; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index 7fb36b12fe7a..734b6aa61809 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -21,10 +21,13 @@ * IN THE SOFTWARE. */ +#include "i915_active.h" #include "i915_drv.h" #include "i915_scatterlist.h" +#include "i915_sw_fence_work.h" #include "i915_pvinfo.h" #include "i915_vgpu.h" +#include "i915_vma.h" /** * DOC: fence register handling @@ -340,19 +343,37 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt) return ERR_PTR(-EDEADLK); } +static int fence_wait_bind(struct i915_fence_reg *reg) +{ + struct dma_fence *fence; + int err = 0; + + fence = i915_active_fence_get(®->active.excl); + if (fence) { + err = dma_fence_wait(fence, true); + dma_fence_put(fence); + } + + return err; +} + int __i915_vma_pin_fence(struct i915_vma *vma) { struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm); - struct i915_fence_reg *fence; + struct i915_fence_reg *fence = vma->fence; struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL; int err; lockdep_assert_held(&vma->vm->mutex); /* Just update our place in the LRU if our fence is getting reused. 
*/ - if (vma->fence) { - fence = vma->fence; + if (fence) { GEM_BUG_ON(fence->vma != vma); + + err = fence_wait_bind(fence); + if (err) + return err; + atomic_inc(&fence->pin_count); if (!fence->dirty) { list_move_tail(&fence->link, &ggtt->fence_list); @@ -384,6 +405,116 @@ int __i915_vma_pin_fence(struct i915_vma *vma) return err; } +static int set_bind_fence(struct i915_fence_reg *fence, + struct dma_fence_work *work) +{ + struct dma_fence *prev; + int err; + + if (rcu_access_pointer(fence->active.excl.fence) == &work->dma) + return 0; + + err = i915_sw_fence_await_active(&work->chain, + &fence->active, + I915_ACTIVE_AWAIT_ACTIVE); + if (err) + return err; + + if (i915_active_acquire(&fence->active)) + return -ENOENT; + + prev = i915_active_set_exclusive(&fence->active, &work->dma); + if (unlikely(prev)) { + err = i915_sw_fence_await_dma_fence(&work->chain, prev, 0, + GFP_NOWAIT | __GFP_NOWARN); + dma_fence_put(prev); + } + + i915_active_release(&fence->active); + return err < 0 ? err : 0; +} + +int __i915_vma_pin_fence_async(struct i915_vma *vma, + struct dma_fence_work *work) +{ + struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm); + struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL; + struct i915_fence_reg *fence = vma->fence; + int err; + + lockdep_assert_held(&vma->vm->mutex); + + /* Just update our place in the LRU if our fence is getting reused. */ + if (fence) { + GEM_BUG_ON(fence->vma != vma); + GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); + } else if (set) { + if (!i915_vma_is_map_and_fenceable(vma)) + return -EINVAL; + + fence = fence_find(ggtt); + if (IS_ERR(fence)) + return -ENOSPC; + + GEM_BUG_ON(atomic_read(&fence->pin_count)); + fence->dirty = true; + } else { + return 0; + } + + atomic_inc(&fence->pin_count); + list_move_tail(&fence->link, &ggtt->fence_list); + if (!fence->dirty) + return 0; + + if (INTEL_GEN(fence_to_i915(fence)) < 4 && + rcu_access_pointer(vma->active.excl.fence) != &work->dma) { + /* implicit 'unfenced' GPU blits */ + err = i915_sw_fence_await_active(&work->chain, + &vma->active, + I915_ACTIVE_AWAIT_ACTIVE); + if (err) + goto err_unpin; + } + + err = set_bind_fence(fence, work); + if (err) + goto err_unpin; + + if (set) { + fence->start = vma->node.start; + fence->size = vma->fence_size; + fence->stride = i915_gem_object_get_stride(vma->obj); + fence->tiling = i915_gem_object_get_tiling(vma->obj); + + vma->fence = fence; + } else { + fence->tiling = 0; + vma->fence = NULL; + } + + set = xchg(&fence->vma, set); + if (set && set != vma) { + GEM_BUG_ON(set->fence != fence); + WRITE_ONCE(set->fence, NULL); + i915_vma_revoke_mmap(set); + } + + return 0; + +err_unpin: + atomic_dec(&fence->pin_count); + return err; +} + +void __i915_vma_apply_fence_async(struct i915_vma *vma) +{ + struct i915_fence_reg *fence = vma->fence; + + if (fence->dirty) + fence_write(fence); +} + /** * i915_vma_pin_fence - set up fencing for a vma * @vma: vma to map through a fence reg diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 9eef679e1311..d306ac14d47e 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -30,6 +30,7 @@ #include "i915_active.h" +struct dma_fence_work; struct drm_i915_gem_object; struct i915_ggtt; struct i915_vma; @@ -70,6 +71,10 @@ void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, struct sg_table *pages); +int 
__i915_vma_pin_fence_async(struct i915_vma *vma, + struct dma_fence_work *work); +void __i915_vma_apply_fence_async(struct i915_vma *vma); + void intel_ggtt_init_fences(struct i915_ggtt *ggtt); void intel_ggtt_fini_fences(struct i915_ggtt *ggtt); From patchwork Mon May 4 04:49:03 2020
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Mon, 4 May 2020 05:49:03 +0100
Message-Id: <20200504044903.7626-22-chris@chris-wilson.co.uk>
In-Reply-To: <20200504044903.7626-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 22/22] drm/i915/gem: Lazily acquire the device wakeref for freeing objects

We only need the device wakeref when freeing an object if we have to unbind it from the global GTT or otherwise update device state. If the objects are clean, we never need the wakeref, so avoid taking it until it is required.
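The intended pattern, sketched (illustration only; the predicate name is hypothetical, everything else follows the runtime-pm calls removed below):

static void free_object_sketch(struct drm_i915_private *i915,
			       struct drm_i915_gem_object *obj)
{
	/* XXX hypothetical predicate: does this object still touch HW? */
	if (object_requires_device(obj)) {
		intel_wakeref_t wakeref;

		/* Wake the device only for dirty/bound objects... */
		wakeref = intel_runtime_pm_get(&i915->runtime_pm);
		/* ...unbind from the global GTT, clear fences, etc. */
		intel_runtime_pm_put(&i915->runtime_pm, wakeref);
	}
	/* Clean objects are freed without ever waking the device. */
}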
Signed-off-by: Chris Wilson Cc: Janusz Krzysztofik --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 9d1d0131f7c2..99356c00c19e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -162,9 +162,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, struct llist_node *freed) { struct drm_i915_gem_object *obj, *on; - intel_wakeref_t wakeref; - wakeref = intel_runtime_pm_get(&i915->runtime_pm); llist_for_each_entry_safe(obj, on, freed, freed) { struct i915_mmap_offset *mmo, *mn; @@ -224,7 +222,6 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, call_rcu(&obj->rcu, __i915_gem_free_object_rcu); cond_resched(); } - intel_runtime_pm_put(&i915->runtime_pm, wakeref); } void i915_gem_flush_free_objects(struct drm_i915_private *i915)
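For context, the free loop edited above reduces to roughly the following (a condensed sketch of the quoted code, not a verbatim copy):

static void free_objects_sketch(struct drm_i915_private *i915,
				struct llist_node *freed)
{
	struct drm_i915_gem_object *obj, *on;

	/* 'freed' is a lock-free list of objects whose last ref was dropped */
	llist_for_each_entry_safe(obj, on, freed, freed) {
		/* release vmas, mmap offsets and backing storage here */
		call_rcu(&obj->rcu, __i915_gem_free_object_rcu);
		cond_resched(); /* the list may be arbitrarily long */
	}
}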