From patchwork Mon May 11 07:57:03 2020
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 11539887
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Mon, 11 May 2020 08:57:03 +0100
Message-Id: <20200511075722.13483-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 01/20] drm/i915/gt: Mark up the racy read of execlists->context_tag
Cc: Chris Wilson

Since we are using bitops on context_tag to allow us to reserve and release inflight tags concurrently, the scan for the next bit is intentionally racy.
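For reference, the pattern being annotated, as a minimal userspace sketch (not the driver code; GCC/Clang builtins stand in for the kernel's READ_ONCE(), ffs() and bitops, and, as in the driver, reservation is assumed to be serialized by the submission tasklet while release may run concurrently from interrupt context):

        #include <stdio.h>

        static unsigned long context_tag = ~0UL;        /* all tags free */

        static unsigned int reserve_tag(void)
        {
                /* Intentionally racy scan: a concurrent release may set extra
                 * bits, but cannot invalidate the free bit we are about to
                 * claim, since only the (serialized) reserver clears bits.
                 * Assumes at least one tag is free. */
                unsigned long snapshot =
                        __atomic_load_n(&context_tag, __ATOMIC_RELAXED);
                unsigned int tag = __builtin_ffsl(snapshot); /* 1-based index */

                /* The atomic clear, not the scan, performs the reservation. */
                __atomic_fetch_and(&context_tag, ~(1UL << (tag - 1)),
                                   __ATOMIC_RELAXED);
                return tag;
        }

        static void release_tag(unsigned int tag)
        {
                /* May run concurrently with the scan above. */
                __atomic_fetch_or(&context_tag, 1UL << (tag - 1),
                                  __ATOMIC_RELAXED);
        }

        int main(void)
        {
                unsigned int tag = reserve_tag();

                printf("reserved tag %u\n", tag);
                release_tag(tag);
                return 0;
        }

Marking the scan as a relaxed/annotated read is enough to tell KCSAN the race is intentional; correctness rests on the atomic clear/set of the chosen bit.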
[ 516.446854] BUG: KCSAN: data-race in execlists_schedule_in.isra.0 [i915] / execlists_schedule_out [i915] [ 516.446874] [ 516.446886] write (marked) to 0xffff8881f7644048 of 8 bytes by interrupt on cpu 2: [ 516.447076] execlists_schedule_out+0x538/0x6a0 [i915] [ 516.447263] process_csb+0x10b/0x3d0 [i915] [ 516.447449] execlists_submission_tasklet+0x30/0x170 [i915] [ 516.447468] tasklet_action_common.isra.0+0x42/0x90 [ 516.447484] __do_softirq+0xc8/0x206 [ 516.447498] irq_exit+0xcd/0xe0 [ 516.447516] do_IRQ+0x44/0xc0 [ 516.447535] ret_from_intr+0x0/0x1c [ 516.447550] cpuidle_enter_state+0x199/0x400 [ 516.447572] cpuidle_enter+0x50/0x90 [ 516.447587] do_idle+0x197/0x1e0 [ 516.447600] cpu_startup_entry+0x14/0x20 [ 516.447619] start_secondary+0xf9/0x130 [ 516.447643] secondary_startup_64+0xa4/0xb0 [ 516.447655] [ 516.447671] read to 0xffff8881f7644048 of 8 bytes by task 460 on cpu 1: [ 516.447863] execlists_schedule_in.isra.0+0x3cf/0x5a0 [i915] [ 516.448064] execlists_dequeue+0xf8f/0x1690 [i915] [ 516.448252] __execlists_submission_tasklet+0x48/0x60 [i915] [ 516.448440] execlists_submit_request+0x2e2/0x310 [i915] [ 516.448634] submit_notify+0x8f/0xc8 [i915] [ 516.448820] __i915_sw_fence_complete+0x61/0x420 [i915] [ 516.449005] i915_sw_fence_complete+0x58/0x80 [i915] [ 516.449208] i915_sw_fence_commit+0x16/0x20 [i915] [ 516.449399] __i915_request_queue+0x60/0x70 [i915] [ 516.449590] i915_gem_do_execbuffer+0x33f1/0x4a00 [i915] [ 516.449782] i915_gem_execbuffer2_ioctl+0x2a2/0x550 [i915] [ 516.449800] drm_ioctl_kernel+0xe9/0x130 [ 516.449814] drm_ioctl+0x27d/0x45e [ 516.449827] ksys_ioctl+0x89/0xb0 [ 516.449842] __x64_sys_ioctl+0x42/0x60 [ 516.449864] do_syscall_64+0x6e/0x2c0 [ 516.449878] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_lrc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 8e254f639751..ed45fc40f884 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1367,7 +1367,7 @@ __execlists_schedule_in(struct i915_request *rq) ce->lrc.ccid = ce->tag; } else { /* We don't need a strict matching tag, just different values */ - unsigned int tag = ffs(engine->context_tag); + unsigned int tag = ffs(READ_ONCE(engine->context_tag)); GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG); clear_bit(tag - 1, &engine->context_tag); From patchwork Mon May 11 07:57:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539895 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9B76912 for ; Mon, 11 May 2020 07:58:09 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C238420735 for ; Mon, 11 May 2020 07:58:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C238420735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 
7CEC96E279; Mon, 11 May 2020 07:58:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7B3256E279 for ; Mon, 11 May 2020 07:58:00 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160786-1500050 for multiple; Mon, 11 May 2020 08:57:24 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:04 +0100 Message-Id: <20200511075722.13483-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/20] drm/i915/gt: Couple up old virtual breadcrumb on new sibling X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" The second try at staging the transfer of the breadcrumb. In part one, we realised we could not simply move to the second engine as we were only holding the breadcrumb lock on the first. So in commit 6c81e21a4742 ("drm/i915/gt: Stage the transfer of the virtual breadcrumb"), we removed it from the first engine and marked up this request to reattach the signaling on the new engine. However, this failed to take into account that we only attach the breadcrumb if the new request is added at the start of the queue, which if we are transferring, it is because we know there to be a request to be signaled (and hence we would not be attached). In this second try, we remove from the first list under its lock, take ownership of the link, and then take the second lock to complete the transfer. Fixes: 6c81e21a4742 ("drm/i915/gt: Stage the transfer of the virtual breadcrumb") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index ed45fc40f884..c5591248dafb 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1825,13 +1825,12 @@ static void virtual_xfer_breadcrumbs(struct virtual_engine *ve, struct i915_request *rq) { struct intel_engine_cs *old = ve->siblings[0]; + bool xfer = false; /* All unattached (rq->engine == old) must already be completed */ spin_lock(&old->breadcrumbs.irq_lock); if (!list_empty(&ve->context.signal_link)) { - list_del_init(&ve->context.signal_link); - /* * We cannot acquire the new engine->breadcrumbs.irq_lock * (as we are holding a breadcrumbs.irq_lock already), @@ -1839,12 +1838,21 @@ static void virtual_xfer_breadcrumbs(struct virtual_engine *ve, * The queued irq_work will occur when we finally drop * the engine->active.lock after dequeue. 
*/ - set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags); + __list_del_entry(&ve->context.signal_link); + xfer = true; + } + spin_unlock(&old->breadcrumbs.irq_lock); + + if (xfer) { + struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; + + spin_lock(&b->irq_lock); + list_add_tail(&ve->context.signal_link, &b->signalers); + spin_unlock(&b->irq_lock); /* Also transfer the pending irq_work for the old breadcrumb. */ intel_engine_signal_breadcrumbs(rq->engine); } - spin_unlock(&old->breadcrumbs.irq_lock); } #define for_each_waiter(p__, rq__) \ From patchwork Mon May 11 07:57:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539883 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B3B8E139A for ; Mon, 11 May 2020 07:58:00 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6813A20735 for ; Mon, 11 May 2020 07:58:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6813A20735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C9DBF6E252; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id B36316E248 for ; Mon, 11 May 2020 07:57:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160787-1500050 for multiple; Mon, 11 May 2020 08:57:25 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:05 +0100 Message-Id: <20200511075722.13483-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/20] dma-buf: Use atomic_fetch_add() for the context id X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Now that atomic64_fetch_add() exists we can use it to return the base context id, rather than the atomic64_add_return(N) - N concoction. 
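The equivalence is easy to see with C11 atomics (userspace illustration only; the alloc_old()/alloc_new() names are made up here and are not the kernel atomic64_*() API):

        #include <assert.h>
        #include <stdatomic.h>
        #include <stdint.h>
        #include <stdio.h>

        /* dma_fence_context_counter starts at 1 (0 is reserved for the stub) */
        static _Atomic uint64_t ctr_old = 1, ctr_new = 1;

        static uint64_t alloc_old(unsigned int num)
        {
                /* atomic64_add_return(num) - num: add, take the new value,
                 * then subtract what we just added to recover the base id */
                uint64_t newval = atomic_fetch_add(&ctr_old, num) + num;
                return newval - num;
        }

        static uint64_t alloc_new(unsigned int num)
        {
                /* atomic64_fetch_add(num): the pre-add value is the base id */
                return atomic_fetch_add(&ctr_new, num);
        }

        int main(void)
        {
                assert(alloc_old(8) == alloc_new(8));   /* both return base 1 */
                assert(alloc_old(3) == alloc_new(3));   /* both return base 9 */
                printf("fetch_add returns the base context id directly\n");
                return 0;
        }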
Suggested-by: Mika Kuoppala Signed-off-by: Chris Wilson Cc: Mika Kuoppala --- drivers/dma-buf/dma-fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 052a41e2451c..90edf2b281b0 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -106,7 +106,7 @@ EXPORT_SYMBOL(dma_fence_get_stub); u64 dma_fence_context_alloc(unsigned num) { WARN_ON(!num); - return atomic64_add_return(num, &dma_fence_context_counter) - num; + return atomic64_fetch_add(num, &dma_fence_context_counter); } EXPORT_SYMBOL(dma_fence_context_alloc); From patchwork Mon May 11 07:57:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539909 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5D2EE912 for ; Mon, 11 May 2020 07:58:15 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 456F82080C for ; Mon, 11 May 2020 07:58:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 456F82080C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3E6196E28E; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id E59E46E252 for ; Mon, 11 May 2020 07:57:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160788-1500050 for multiple; Mon, 11 May 2020 08:57:25 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:06 +0100 Message-Id: <20200511075722.13483-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/20] drm/i915: Mark the addition of the initial-breadcrumb in the request X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" The initial-breadcrumb is used to mark the end of the awaiting and the beginning of the user payload. We verify that we do not start the user payload before all signaler are completed, checking our semaphore setup by looking for the initial breadcrumb being written too early. We also want to ensure that we do not add semaphore waits after we have already closed the semaphore section, an issue for later deferred waits. 
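A toy userspace model of the invariant the new flag encodes (the names below are illustrative only, not the driver's): semaphore waits may only be emitted before the initial breadcrumb closes the semaphore section.

        #include <assert.h>
        #include <stdbool.h>
        #include <stdio.h>

        enum { FLAG_INITIAL_BREADCRUMB = 1u << 0 };

        struct request {
                unsigned int flags;
        };

        static bool has_initial_breadcrumb(const struct request *rq)
        {
                return rq->flags & FLAG_INITIAL_BREADCRUMB;
        }

        static void emit_init_breadcrumb(struct request *rq)
        {
                assert(!has_initial_breadcrumb(rq));    /* emitted at most once */
                /* ... write the breadcrumb into the ring here ... */
                rq->flags |= FLAG_INITIAL_BREADCRUMB;
        }

        static int emit_semaphore_wait(struct request *rq)
        {
                if (has_initial_breadcrumb(rq))
                        return -1;      /* too late: fall back to a sw wait */
                /* ... emit the semaphore wait before the user payload ... */
                return 0;
        }

        int main(void)
        {
                struct request rq = { 0 };

                assert(emit_semaphore_wait(&rq) == 0);  /* ok before breadcrumb */
                emit_init_breadcrumb(&rq);
                assert(emit_semaphore_wait(&rq) < 0);   /* refused afterwards */
                printf("semaphore section closed by the initial breadcrumb\n");
                return 0;
        }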
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 5 ++++- drivers/gpu/drm/i915/i915_request.c | 7 ++++++- drivers/gpu/drm/i915/i915_request.h | 27 ++++++++++++++++++++------- 3 files changed, 30 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index c5591248dafb..f93f13d20b5a 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1894,7 +1894,7 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl) continue; /* No waiter should start before its signaler */ - GEM_BUG_ON(w->context->timeline->has_initial_breadcrumb && + GEM_BUG_ON(i915_request_has_initial_breadcrumb(w) && i915_request_started(w) && !i915_request_completed(rq)); @@ -3501,6 +3501,7 @@ static int gen8_emit_init_breadcrumb(struct i915_request *rq) { u32 *cs; + GEM_BUG_ON(i915_request_has_initial_breadcrumb(rq)); if (!i915_request_timeline(rq)->has_initial_breadcrumb) return 0; @@ -3527,6 +3528,8 @@ static int gen8_emit_init_breadcrumb(struct i915_request *rq) /* Record the updated position of the request's payload */ rq->infix = intel_ring_offset(rq, cs); + __set_bit(I915_FENCE_FLAG_INITIAL_BREADCRUMB, &rq->fence.flags); + return 0; } diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 2d5b98549ddc..00b7c4eb3f32 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -951,6 +951,7 @@ __emit_semaphore_wait(struct i915_request *to, u32 *cs; GEM_BUG_ON(INTEL_GEN(to->i915) < 8); + GEM_BUG_ON(i915_request_has_initial_breadcrumb(to)); /* We need to pin the signaler's HWSP until we are finished reading. */ err = intel_timeline_read_hwsp(from, to, &hwsp_offset); @@ -1000,6 +1001,9 @@ emit_semaphore_wait(struct i915_request *to, if (!intel_context_use_semaphores(to->context)) goto await_fence; + if (i915_request_has_initial_breadcrumb(to)) + goto await_fence; + if (!rcu_access_pointer(from->hwsp_cacheline)) goto await_fence; @@ -1256,7 +1260,8 @@ __i915_request_await_execution(struct i915_request *to, * immediate execution, and so we must wait until it reaches the * active slot. */ - if (intel_engine_has_semaphores(to->engine)) { + if (intel_engine_has_semaphores(to->engine) && + !i915_request_has_initial_breadcrumb(to)) { err = __emit_semaphore_wait(to, from, from->fence.seqno - 1); if (err < 0) return err; diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index d8ce908e1346..98ae2dc82371 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -84,19 +84,26 @@ enum { I915_FENCE_FLAG_PQUEUE, /* - * I915_FENCE_FLAG_SIGNAL - this request is currently on signal_list + * I915_FENCE_FLAG_HOLD - this request is currently on hold * - * Internal bookkeeping used by the breadcrumb code to track when - * a request is on the various signal_list. + * This request has been suspended, pending an ongoing investigation. */ - I915_FENCE_FLAG_SIGNAL, + I915_FENCE_FLAG_HOLD, /* - * I915_FENCE_FLAG_HOLD - this request is currently on hold + * I915_FENCE_FLAG_INITIAL_BREADCRUMB - this request has the initial + * breadcrumb that marks the end of semaphore waits and start of the + * user payload. + */ + I915_FENCE_FLAG_INITIAL_BREADCRUMB, + + /* + * I915_FENCE_FLAG_SIGNAL - this request is currently on signal_list * - * This request has been suspended, pending an ongoing investigation. 
+ * Internal bookkeeping used by the breadcrumb code to track when + * a request is on the various signal_list. */ - I915_FENCE_FLAG_HOLD, + I915_FENCE_FLAG_SIGNAL, /* * I915_FENCE_FLAG_NOPREEMPT - this request should not be preempted @@ -390,6 +397,12 @@ static inline bool i915_request_in_priority_queue(const struct i915_request *rq) return test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); } +static inline bool +i915_request_has_initial_breadcrumb(const struct i915_request *rq) +{ + return test_bit(I915_FENCE_FLAG_INITIAL_BREADCRUMB, &rq->fence.flags); +} + /** * Returns true if seq1 is later than seq2. */ From patchwork Mon May 11 07:57:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539919 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5FB5B139A for ; Mon, 11 May 2020 07:58:18 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 47C8120735 for ; Mon, 11 May 2020 07:58:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 47C8120735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 720BB6E351; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0B96E6E260 for ; Mon, 11 May 2020 07:57:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160789-1500050 for multiple; Mon, 11 May 2020 08:57:25 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:07 +0100 Message-Id: <20200511075722.13483-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/20] drm/i915: Tidy awaiting on dma-fences X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Just tidy up the return handling for completed dma-fences. While it may return errors for invalid fence, we already know that we have a good fence and the only error will be an already signaled fence. 
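The contract this relies on, as a condensed kernel-style sketch (not a hunk from this series): for a fence we already know to be valid, dma_fence_add_callback() either arms the callback and returns 0, or returns -ENOENT because the fence has already signaled, so any non-zero return can be treated as "complete now".

        #include <linux/dma-fence.h>

        static int await_fence(struct dma_fence *dma, struct dma_fence_cb *cb,
                               dma_fence_func_t wake)
        {
                if (dma_fence_add_callback(dma, cb, wake)) {
                        /* -ENOENT: already signaled, run the wakeup now */
                        wake(dma, cb);
                        return 0;       /* nothing left pending */
                }

                return 1;               /* callback armed, signal is pending */
        }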
Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_sw_fence.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 7daf81f55c90..295b9829e2da 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -546,13 +546,11 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, cb->fence = fence; i915_sw_fence_await(fence); - ret = dma_fence_add_callback(dma, &cb->base, __dma_i915_sw_fence_wake); - if (ret == 0) { - ret = 1; - } else { + ret = 1; + if (dma_fence_add_callback(dma, &cb->base, __dma_i915_sw_fence_wake)) { + /* fence already signaled */ __dma_i915_sw_fence_wake(dma, &cb->base); - if (ret == -ENOENT) /* fence already signaled */ - ret = 0; + ret = 0; } return ret; From patchwork Mon May 11 07:57:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539907 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 905CA175D for ; Mon, 11 May 2020 07:58:14 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 794FD20735 for ; Mon, 11 May 2020 07:58:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 794FD20735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 90ED76E3B7; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5CF2B6E248 for ; Mon, 11 May 2020 07:58:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160790-1500050 for multiple; Mon, 11 May 2020 08:57:26 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:08 +0100 Message-Id: <20200511075722.13483-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/20] dma-buf: Proxy fence, an unsignaled fence placeholder X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Often we need to create a fence for a future event that has not yet been associated with a fence. We can store a proxy fence, a placeholder, in the timeline and replace it later when the real fence is known. 
Any listeners that attach to the proxy fence will automatically be signaled when the real fence completes, and any future listeners will instead be attach directly to the real fence avoiding any indirection overhead. Signed-off-by: Chris Wilson Cc: Lionel Landwerlin --- drivers/dma-buf/Makefile | 13 +- drivers/dma-buf/dma-fence-private.h | 20 + drivers/dma-buf/dma-fence-proxy.c | 248 ++++++++++ drivers/dma-buf/dma-fence.c | 4 +- drivers/dma-buf/selftests.h | 1 + drivers/dma-buf/st-dma-fence-proxy.c | 699 +++++++++++++++++++++++++++ include/linux/dma-fence-proxy.h | 34 ++ 7 files changed, 1015 insertions(+), 4 deletions(-) create mode 100644 drivers/dma-buf/dma-fence-private.h create mode 100644 drivers/dma-buf/dma-fence-proxy.c create mode 100644 drivers/dma-buf/st-dma-fence-proxy.c create mode 100644 include/linux/dma-fence-proxy.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 995e05f609ff..afaf6dadd9a3 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,6 +1,12 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ - dma-resv.o seqno-fence.o +obj-y := \ + dma-buf.o \ + dma-fence.o \ + dma-fence-array.o \ + dma-fence-chain.o \ + dma-fence-proxy.o \ + dma-resv.o \ + seqno-fence.o obj-$(CONFIG_DMABUF_HEAPS) += dma-heap.o obj-$(CONFIG_DMABUF_HEAPS) += heaps/ obj-$(CONFIG_SYNC_FILE) += sync_file.o @@ -10,6 +16,7 @@ obj-$(CONFIG_UDMABUF) += udmabuf.o dmabuf_selftests-y := \ selftest.o \ st-dma-fence.o \ - st-dma-fence-chain.o + st-dma-fence-chain.o \ + st-dma-fence-proxy.o obj-$(CONFIG_DMABUF_SELFTESTS) += dmabuf_selftests.o diff --git a/drivers/dma-buf/dma-fence-private.h b/drivers/dma-buf/dma-fence-private.h new file mode 100644 index 000000000000..6924d28af0fa --- /dev/null +++ b/drivers/dma-buf/dma-fence-private.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Fence mechanism for dma-buf and to allow for asynchronous dma access + * + * Copyright (C) 2012 Canonical Ltd + * Copyright (C) 2012 Texas Instruments + * + * Authors: + * Rob Clark + * Maarten Lankhorst + */ + +#ifndef DMA_FENCE_PRIVATE_H +#define DMA_FENCE_PRIAVTE_H + +struct dma_fence; + +bool __dma_fence_enable_signaling(struct dma_fence *fence); + +#endif /* DMA_FENCE_PRIAVTE_H */ diff --git a/drivers/dma-buf/dma-fence-proxy.c b/drivers/dma-buf/dma-fence-proxy.c new file mode 100644 index 000000000000..f0cd89b966e0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-proxy.c @@ -0,0 +1,248 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * dma-fence-proxy: placeholder unsignaled fence + * + * Copyright (C) 2017-2019 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include "dma-fence-private.h" + +struct dma_fence_proxy { + struct dma_fence base; + + struct dma_fence *real; + struct dma_fence_cb cb; + struct irq_work work; + + wait_queue_head_t wq; +}; + +#ifdef CONFIG_DEBUG_LOCK_ALLOC +#define same_lockclass(A, B) (A)->dep_map.key == (B)->dep_map.key +#else +#define same_lockclass(A, B) 0 +#endif + +static const char *proxy_get_driver_name(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + struct dma_fence *real = READ_ONCE(p->real); + + return real ? real->ops->get_driver_name(real) : "proxy"; +} + +static const char *proxy_get_timeline_name(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + struct dma_fence *real = READ_ONCE(p->real); + + return real ? 
real->ops->get_timeline_name(real) : "unset"; +} + +static void proxy_irq_work(struct irq_work *work) +{ + struct dma_fence_proxy *p = container_of(work, typeof(*p), work); + + dma_fence_signal(&p->base); + dma_fence_put(&p->base); +} + +static void proxy_callback(struct dma_fence *real, struct dma_fence_cb *cb) +{ + struct dma_fence_proxy *p = container_of(cb, typeof(*p), cb); + + if (real->error) + dma_fence_set_error(&p->base, real->error); + + /* Lower the height of the proxy chain -> single stack frame */ + irq_work_queue(&p->work); +} + +static bool proxy_enable_signaling(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + struct dma_fence *real = READ_ONCE(p->real); + bool ret = true; + + if (real) { + spin_lock_nested(real->lock, + same_lockclass(&p->wq.lock, real->lock)); + ret = __dma_fence_enable_signaling(real); + spin_unlock(real->lock); + } + + return ret; +} + +static void proxy_release(struct dma_fence *fence) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + + dma_fence_put(p->real); + dma_fence_free(&p->base); +} + +const struct dma_fence_ops dma_fence_proxy_ops = { + .get_driver_name = proxy_get_driver_name, + .get_timeline_name = proxy_get_timeline_name, + .enable_signaling = proxy_enable_signaling, + .wait = dma_fence_default_wait, + .release = proxy_release, +}; +EXPORT_SYMBOL_GPL(dma_fence_proxy_ops); + +/** + * dma_fence_create_proxy - Create an unset dma-fence + * + * dma_fence_create_proxy() creates a new dma_fence stub that is initially + * unsignaled and may later be replaced with a real fence. Any listeners + * to the proxy fence will be signaled when the target fence signals its + * completion. + */ +struct dma_fence *dma_fence_create_proxy(void) +{ + struct dma_fence_proxy *p; + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (!p) + return NULL; + + init_waitqueue_head(&p->wq); + dma_fence_init(&p->base, &dma_fence_proxy_ops, &p->wq.lock, + dma_fence_context_alloc(1), 0); + init_irq_work(&p->work, proxy_irq_work); + + return &p->base; +} +EXPORT_SYMBOL(dma_fence_create_proxy); + +static void __wake_up_listeners(struct dma_fence_proxy *p) +{ + struct wait_queue_entry *wait, *next; + + list_for_each_entry_safe(wait, next, &p->wq.head, entry) { + INIT_LIST_HEAD(&wait->entry); + wait->func(wait, TASK_NORMAL, 0, p->real); + } +} + +static void proxy_assign(struct dma_fence *fence, struct dma_fence *real) +{ + struct dma_fence_proxy *p = container_of(fence, typeof(*p), base); + unsigned long flags; + + if (WARN_ON(fence == real)) + return; + + if (WARN_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))) + return; + + if (WARN_ON(p->real)) + return; + + spin_lock_irqsave(&p->wq.lock, flags); + + if (unlikely(!real)) { + dma_fence_signal_locked(&p->base); + goto unlock; + } + + p->real = dma_fence_get(real); + + dma_fence_get(&p->base); + spin_lock_nested(real->lock, same_lockclass(&p->wq.lock, real->lock)); + if (dma_fence_is_signaled_locked(real)) { + proxy_callback(real, &p->cb); + } else if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, + &p->base.flags) && + !__dma_fence_enable_signaling(real)) { + proxy_callback(real, &p->cb); + } else { + p->cb.func = proxy_callback; + list_add_tail(&p->cb.node, &real->cb_list); + } + spin_unlock(real->lock); + +unlock: + __wake_up_listeners(p); + spin_unlock_irqrestore(&p->wq.lock, flags); +} + +/** + * dma_fence_replace_proxy - Replace the proxy fence with the real target + * @slot: pointer to location of fence to update + * @fence: the new fence to store 
in @slot + * + * Once the real dma_fence is known, we can replace the proxy fence holder + * with a pointer to the real dma fence. Future listeners will attach to + * the real fence, avoiding any indirection overhead. Previous listeners + * will remain attached to the proxy fence, and be signaled in turn when + * the target fence completes. + */ +struct dma_fence * +dma_fence_replace_proxy(struct dma_fence __rcu **slot, struct dma_fence *fence) +{ + struct dma_fence *old; + + if (fence) + dma_fence_get(fence); + + old = rcu_replace_pointer(*slot, fence, true); + if (old && dma_fence_is_proxy(old)) + proxy_assign(old, fence); + + return old; +} +EXPORT_SYMBOL(dma_fence_replace_proxy); + +void dma_fence_add_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait) +{ + if (dma_fence_is_proxy(fence)) { + struct dma_fence_proxy *p = + container_of(fence, typeof(*p), base); + unsigned long flags; + + spin_lock_irqsave(&p->wq.lock, flags); + if (!p->real) { + list_add_tail(&wait->entry, &p->wq.head); + wait = NULL; + } + fence = p->real; + spin_unlock_irqrestore(&p->wq.lock, flags); + } + + if (wait) { + INIT_LIST_HEAD(&wait->entry); + wait->func(wait, TASK_NORMAL, 0, fence); + } +} +EXPORT_SYMBOL(dma_fence_add_proxy_listener); + +bool dma_fence_remove_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait) +{ + bool ret = false; + + if (dma_fence_is_proxy(fence)) { + struct dma_fence_proxy *p = + container_of(fence, typeof(*p), base); + unsigned long flags; + + spin_lock_irqsave(&p->wq.lock, flags); + if (!list_empty(&wait->entry)) { + list_del_init(&wait->entry); + ret = true; + } + spin_unlock_irqrestore(&p->wq.lock, flags); + } + + return ret; +} +EXPORT_SYMBOL(dma_fence_remove_proxy_listener); diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 90edf2b281b0..5a9ff241e39e 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -19,6 +19,8 @@ #define CREATE_TRACE_POINTS #include +#include "dma-fence-private.h" + EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled); @@ -273,7 +275,7 @@ void dma_fence_free(struct dma_fence *fence) } EXPORT_SYMBOL(dma_fence_free); -static bool __dma_fence_enable_signaling(struct dma_fence *fence) +bool __dma_fence_enable_signaling(struct dma_fence *fence) { bool was_set; diff --git a/drivers/dma-buf/selftests.h b/drivers/dma-buf/selftests.h index 55918ef9adab..616eca70e2d8 100644 --- a/drivers/dma-buf/selftests.h +++ b/drivers/dma-buf/selftests.h @@ -12,3 +12,4 @@ selftest(sanitycheck, __sanitycheck__) /* keep first (igt selfcheck) */ selftest(dma_fence, dma_fence) selftest(dma_fence_chain, dma_fence_chain) +selftest(dma_fence_proxy, dma_fence_proxy) diff --git a/drivers/dma-buf/st-dma-fence-proxy.c b/drivers/dma-buf/st-dma-fence-proxy.c new file mode 100644 index 000000000000..c95811199c16 --- /dev/null +++ b/drivers/dma-buf/st-dma-fence-proxy.c @@ -0,0 +1,699 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "selftest.h" + +static struct kmem_cache *slab_fences; + +static struct mock_fence { + struct dma_fence base; + spinlock_t lock; +} *to_mock_fence(struct dma_fence *f) { + return container_of(f, struct mock_fence, base); +} + +static const char *mock_name(struct dma_fence *f) +{ + return "mock"; +} + +static void mock_fence_release(struct dma_fence *f) +{ + 
kmem_cache_free(slab_fences, to_mock_fence(f)); +} + +static const struct dma_fence_ops mock_ops = { + .get_driver_name = mock_name, + .get_timeline_name = mock_name, + .release = mock_fence_release, +}; + +static struct dma_fence *mock_fence(void) +{ + struct mock_fence *f; + + f = kmem_cache_alloc(slab_fences, GFP_KERNEL); + if (!f) + return NULL; + + spin_lock_init(&f->lock); + dma_fence_init(&f->base, &mock_ops, &f->lock, 0, 0); + + return &f->base; +} + +static int sanitycheck(void *arg) +{ + struct dma_fence *f; + + f = dma_fence_create_proxy(); + if (!f) + return -ENOMEM; + + dma_fence_signal(f); + dma_fence_put(f); + + return 0; +} + +struct fences { + struct dma_fence *real; + struct dma_fence *proxy; + struct dma_fence __rcu *slot; +}; + +static int create_fences(struct fences *f, bool attach) +{ + f->proxy = dma_fence_create_proxy(); + if (!f->proxy) + return -ENOMEM; + + RCU_INIT_POINTER(f->slot, f->proxy); + + f->real = mock_fence(); + if (!f->real) { + dma_fence_put(f->proxy); + return -ENOMEM; + } + + if (attach) + dma_fence_replace_proxy(&f->slot, f->real); + + return 0; +} + +static void free_fences(struct fences *f) +{ + dma_fence_put(dma_fence_replace_proxy(&f->slot, NULL)); + dma_fence_put(f->real); + dma_fence_put(f->proxy); +} + +static int wrap_signaling(void *arg) +{ + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_is_signaled(f.proxy)) { + pr_err("Fence unexpectedly signaled on creation\n"); + goto err_free; + } + + if (dma_fence_signal(f.real)) { + pr_err("Fence reported being already signaled\n"); + goto err_free; + } + + if (!dma_fence_is_signaled(f.proxy)) { + pr_err("Fence not reporting signaled\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_signaling_recurse(void *arg) +{ + struct fences f; + struct dma_fence *chain; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + chain = dma_fence_create_proxy(); + if (!chain) { + err = -ENOMEM; + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, chain); + dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real)); + dma_fence_put(chain); + + /* f.real <- chain <- f.proxy */ + + if (dma_fence_is_signaled(f.proxy)) { + pr_err("Fence unexpectedly signaled on creation\n"); + goto err_free; + } + + if (dma_fence_signal(f.real)) { + pr_err("Fence reported being already signaled\n"); + goto err_free; + } + + if (!dma_fence_is_signaled(f.proxy)) { + pr_err("Fence not reporting signaled\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +struct simple_cb { + struct dma_fence_cb cb; + bool seen; +}; + +static void simple_callback(struct dma_fence *f, struct dma_fence_cb *cb) +{ + smp_store_mb(container_of(cb, struct simple_cb, cb)->seen, true); +} + +static int wrap_add_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_add_callback_recurse(void *arg) +{ + struct simple_cb cb = {}; + struct dma_fence *chain; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + chain = dma_fence_create_proxy(); 
+ if (!chain) { + err = -ENOMEM; + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, chain); + dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real)); + dma_fence_put(chain); + + /* f.real <- chain <- f.proxy */ + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_late_add_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + dma_fence_signal(f.real); + + if (!dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Added callback, but fence was already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (cb.seen) { + pr_err("Callback called after failed attachment!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_early_add_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_early_add_callback_late(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_signal(f.real); + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_early_add_callback_early(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_rm_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + if (!dma_fence_remove_callback(f.proxy, &cb.cb)) { + pr_err("Failed to remove callback!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (cb.seen) { + pr_err("Callback still signaled after removal!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_late_rm_callback(void *arg) +{ + struct simple_cb cb = {}; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) { + 
pr_err("Failed to add callback, fence already signaled!\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (!cb.seen) { + pr_err("Callback failed!\n"); + goto err_free; + } + + if (dma_fence_remove_callback(f.proxy, &cb.cb)) { + pr_err("Callback removal succeed after being executed!\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_status(void *arg) +{ + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_get_status(f.proxy)) { + pr_err("Fence unexpectedly has signaled status on creation\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (!dma_fence_get_status(f.proxy)) { + pr_err("Fence not reporting signaled status\n"); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_error(void *arg) +{ + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + dma_fence_set_error(f.real, -EIO); + + if (dma_fence_get_status(f.proxy)) { + pr_err("Fence unexpectedly has error status before signal\n"); + goto err_free; + } + + dma_fence_signal(f.real); + if (dma_fence_get_status(f.proxy) != -EIO) { + pr_err("Fence not reporting error status, got %d\n", + dma_fence_get_status(f.proxy)); + goto err_free; + } + + err = 0; +err_free: + free_fences(&f); + return err; +} + +static int wrap_wait(void *arg) +{ + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, true)) + return -ENOMEM; + + if (dma_fence_wait_timeout(f.proxy, false, 0) != 0) { + pr_err("Wait reported complete before being signaled\n"); + goto err_free; + } + + dma_fence_signal(f.real); + + if (dma_fence_wait_timeout(f.proxy, false, 0) == 0) { + pr_err("Wait reported incomplete after being signaled\n"); + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +struct wait_timer { + struct timer_list timer; + struct fences f; +}; + +static void wait_timer(struct timer_list *timer) +{ + struct wait_timer *wt = from_timer(wt, timer, timer); + + dma_fence_signal(wt->f.real); +} + +static int wrap_wait_timeout(void *arg) +{ + struct wait_timer wt; + int err = -EINVAL; + + if (create_fences(&wt.f, true)) + return -ENOMEM; + + timer_setup_on_stack(&wt.timer, wait_timer, 0); + + if (dma_fence_wait_timeout(wt.f.proxy, false, 1) != 0) { + pr_err("Wait reported complete before being signaled\n"); + goto err_free; + } + + mod_timer(&wt.timer, jiffies + 1); + + if (dma_fence_wait_timeout(wt.f.proxy, false, 2) != 0) { + if (timer_pending(&wt.timer)) { + pr_notice("Timer did not fire within the jiffie!\n"); + err = 0; /* not our fault! 
*/ + } else { + pr_err("Wait reported incomplete after timeout\n"); + } + goto err_free; + } + + err = 0; +err_free: + del_timer_sync(&wt.timer); + destroy_timer_on_stack(&wt.timer); + dma_fence_signal(wt.f.real); + free_fences(&wt.f); + return err; +} + +struct proxy_wait { + struct wait_queue_entry base; + struct dma_fence *fence; + bool seen; +}; + +static int proxy_wait_cb(struct wait_queue_entry *entry, + unsigned int mode, int flags, void *key) +{ + struct proxy_wait *p = container_of(entry, typeof(*p), base); + + p->fence = key; + p->seen = true; + + return 0; +} + +static int wrap_listen_early(void *arg) +{ + struct proxy_wait wait = { .base.func = proxy_wait_cb }; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_replace_proxy(&f.slot, f.real); + dma_fence_add_proxy_listener(f.proxy, &wait.base); + + if (!wait.seen) { + pr_err("Proxy listener was not called after replace!\n"); + err = -EINVAL; + goto err_free; + } + + if (wait.fence != f.real) { + pr_err("Proxy listener was not passed the real fence!\n"); + err = -EINVAL; + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +static int wrap_listen_late(void *arg) +{ + struct proxy_wait wait = { .base.func = proxy_wait_cb }; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_add_proxy_listener(f.proxy, &wait.base); + dma_fence_replace_proxy(&f.slot, f.real); + + if (!wait.seen) { + pr_err("Proxy listener was not called on replace!\n"); + err = -EINVAL; + goto err_free; + } + + if (wait.fence != f.real) { + pr_err("Proxy listener was not passed the real fence!\n"); + err = -EINVAL; + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +static int wrap_listen_cancel(void *arg) +{ + struct proxy_wait wait = { .base.func = proxy_wait_cb }; + struct fences f; + int err = -EINVAL; + + if (create_fences(&f, false)) + return -ENOMEM; + + dma_fence_add_proxy_listener(f.proxy, &wait.base); + if (!dma_fence_remove_proxy_listener(f.proxy, &wait.base)) { + pr_err("Cancelling listener, already detached?\n"); + err = -EINVAL; + goto err_free; + } + dma_fence_replace_proxy(&f.slot, f.real); + + if (wait.seen) { + pr_err("Proxy listener was called after being removed!\n"); + err = -EINVAL; + goto err_free; + } + + if (dma_fence_remove_proxy_listener(f.proxy, &wait.base)) { + pr_err("Double listener cancellation!\n"); + err = -EINVAL; + goto err_free; + } + + err = 0; +err_free: + dma_fence_signal(f.real); + free_fences(&f); + return err; +} + +int dma_fence_proxy(void) +{ + static const struct subtest tests[] = { + SUBTEST(sanitycheck), + SUBTEST(wrap_signaling), + SUBTEST(wrap_signaling_recurse), + SUBTEST(wrap_add_callback), + SUBTEST(wrap_add_callback_recurse), + SUBTEST(wrap_late_add_callback), + SUBTEST(wrap_early_add_callback), + SUBTEST(wrap_early_add_callback_late), + SUBTEST(wrap_early_add_callback_early), + SUBTEST(wrap_rm_callback), + SUBTEST(wrap_late_rm_callback), + SUBTEST(wrap_status), + SUBTEST(wrap_error), + SUBTEST(wrap_wait), + SUBTEST(wrap_wait_timeout), + SUBTEST(wrap_listen_early), + SUBTEST(wrap_listen_late), + SUBTEST(wrap_listen_cancel), + }; + int ret; + + slab_fences = KMEM_CACHE(mock_fence, + SLAB_TYPESAFE_BY_RCU | + SLAB_HWCACHE_ALIGN); + if (!slab_fences) + return -ENOMEM; + + ret = subtests(tests, NULL); + + kmem_cache_destroy(slab_fences); + + return ret; +} diff --git 
a/include/linux/dma-fence-proxy.h b/include/linux/dma-fence-proxy.h new file mode 100644 index 000000000000..063cde6b42c4 --- /dev/null +++ b/include/linux/dma-fence-proxy.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * dma-fence-proxy: allows waiting upon unset and future fences + * + * Copyright (C) 2017 Intel Corporation + */ + +#ifndef __LINUX_DMA_FENCE_PROXY_H +#define __LINUX_DMA_FENCE_PROXY_H + +#include +#include + +struct wait_queue_entry; + +extern const struct dma_fence_ops dma_fence_proxy_ops; + +struct dma_fence *dma_fence_create_proxy(void); + +static inline bool dma_fence_is_proxy(struct dma_fence *fence) +{ + return fence->ops == &dma_fence_proxy_ops; +} + +struct dma_fence * +dma_fence_replace_proxy(struct dma_fence __rcu **slot, + struct dma_fence *fence); + +void dma_fence_add_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait); +bool dma_fence_remove_proxy_listener(struct dma_fence *fence, + struct wait_queue_entry *wait); + +#endif /* __LINUX_DMA_FENCE_PROXY_H */ From patchwork Mon May 11 07:57:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539913 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AA477912 for ; Mon, 11 May 2020 07:58:16 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 92E5620735 for ; Mon, 11 May 2020 07:58:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 92E5620735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CFB456E3E5; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 09ECF6E26C for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160791-1500050 for multiple; Mon, 11 May 2020 08:57:26 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:09 +0100 Message-Id: <20200511075722.13483-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/20] drm/syncobj: Allow use of dma-fence-proxy X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allow the callers to supply a dma-fence-proxy for asynchronous waiting on future fences. 
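To make the intended flow concrete, a minimal kernel-style usage sketch of the API introduced in the previous patch (illustrative only; the static @slot below is a stand-in for wherever a driver keeps its fence pointer):

        #include <linux/dma-fence.h>
        #include <linux/dma-fence-proxy.h>

        static struct dma_fence __rcu *slot;

        static int publish_future_fence(void)
        {
                struct dma_fence *proxy = dma_fence_create_proxy();

                if (!proxy)
                        return -ENOMEM;

                /* consumers may wait on @slot before the real fence exists */
                dma_fence_put(dma_fence_replace_proxy(&slot, proxy));
                dma_fence_put(proxy);   /* drop the creation reference */
                return 0;
        }

        static void resolve_future_fence(struct dma_fence *real)
        {
                /* existing waiters stay attached to the proxy and are signaled
                 * through it; later waiters see @real directly */
                dma_fence_put(dma_fence_replace_proxy(&slot, real));
        }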
Signed-off-by: Chris Wilson --- drivers/gpu/drm/drm_syncobj.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 42d46414f767..e141db0e1eb6 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -184,6 +184,7 @@ */ #include +#include #include #include #include @@ -324,14 +325,9 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *old_fence; struct syncobj_wait_entry *cur, *tmp; - if (fence) - dma_fence_get(fence); - spin_lock(&syncobj->lock); - old_fence = rcu_dereference_protected(syncobj->fence, - lockdep_is_held(&syncobj->lock)); - rcu_assign_pointer(syncobj->fence, fence); + old_fence = dma_fence_replace_proxy(&syncobj->fence, fence); if (fence != old_fence) { list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) From patchwork Mon May 11 07:57:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539917 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C8C96175D for ; Mon, 11 May 2020 07:58:17 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B126120735 for ; Mon, 11 May 2020 07:58:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B126120735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 66E1F6E29E; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7B3FD6E27C for ; Mon, 11 May 2020 07:58:00 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160792-1500050 for multiple; Mon, 11 May 2020 08:57:26 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:10 +0100 Message-Id: <20200511075722.13483-8-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/20] drm/i915/gem: Teach execbuf how to wait on future syncobj X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If a syncobj has not yet been assigned, treat it as a future fence and install and wait upon a dma-fence-proxy. The proxy will be replace by the real fence later, and that fence will be responsible for signaling our waiter. 
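The heart of the change, condensed into a standalone helper for clarity (the hunk below folds this directly into await_fence_array(); this sketch is illustrative, not part of the patch): if the syncobj has no fence yet, publish a proxy under the syncobj lock so that a later drm_syncobj_replace_fence() resolves it.

        #include <drm/drm_syncobj.h>
        #include <linux/dma-fence.h>
        #include <linux/dma-fence-proxy.h>

        static struct dma_fence *get_or_install_proxy(struct drm_syncobj *syncobj)
        {
                struct dma_fence *fence, *old;

                fence = drm_syncobj_fence_get(syncobj);
                if (fence)
                        return fence;           /* already assigned */

                fence = dma_fence_create_proxy();
                if (!fence)
                        return NULL;

                spin_lock(&syncobj->lock);
                old = rcu_dereference_protected(syncobj->fence, true);
                if (unlikely(old)) {
                        /* lost the race: a real fence arrived meanwhile */
                        dma_fence_put(fence);
                        fence = dma_fence_get(old);
                } else {
                        rcu_assign_pointer(syncobj->fence, dma_fence_get(fence));
                }
                spin_unlock(&syncobj->lock);

                return fence;
        }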
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854 Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 21 ++- drivers/gpu/drm/i915/i915_request.c | 153 ++++++++++++++++++ drivers/gpu/drm/i915/i915_scheduler.c | 41 +++++ drivers/gpu/drm/i915/i915_scheduler.h | 3 + 4 files changed, 216 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d54a4933cc05..199131db200f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -5,6 +5,7 @@ */ #include +#include #include #include #include @@ -2524,8 +2525,24 @@ await_fence_array(struct i915_execbuffer *eb, continue; fence = drm_syncobj_fence_get(syncobj); - if (!fence) - return -EINVAL; + if (!fence) { + struct dma_fence *old; + + fence = dma_fence_create_proxy(); + if (!fence) + return -ENOMEM; + + spin_lock(&syncobj->lock); + old = rcu_dereference_protected(syncobj->fence, true); + if (unlikely(old)) { + dma_fence_put(fence); + fence = dma_fence_get(old); + } else { + rcu_assign_pointer(syncobj->fence, + dma_fence_get(fence)); + } + spin_unlock(&syncobj->lock); + } err = i915_request_await_dma_fence(eb->request, fence); dma_fence_put(fence); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 00b7c4eb3f32..945494b06bce 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -24,6 +24,7 @@ #include #include +#include #include #include #include @@ -379,6 +380,7 @@ static bool fatal_error(int error) case 0: /* not an error! */ case -EAGAIN: /* innocent victim of a GT reset (__i915_request_reset) */ case -ETIMEDOUT: /* waiting for Godot (timer_i915_sw_fence_wake) */ + case -EDEADLK: /* cyclic fence lockup (await_proxy) */ return false; default: return true; @@ -1133,6 +1135,155 @@ i915_request_await_external(struct i915_request *rq, struct dma_fence *fence) return err; } +struct await_proxy { + struct wait_queue_entry base; + struct i915_request *request; + struct dma_fence *fence; + struct timer_list timer; + struct work_struct work; + int (*attach)(struct await_proxy *ap); + void *data; +}; + +static void await_proxy_work(struct work_struct *work) +{ + struct await_proxy *ap = container_of(work, typeof(*ap), work); + struct i915_request *rq = ap->request; + + del_timer_sync(&ap->timer); + + if (ap->fence) { + int err = 0; + + /* + * If the fence is external, we impose a 10s timeout. + * However, if the fence is internal, we skip a timeout in + * the belief that all fences are in-order (DAG, no cycles) + * and we can enforce forward progress by reset the GPU if + * necessary. A future fence, provided userspace, can trivially + * generate a cycle in the dependency graph, and so cause + * that entire cycle to become deadlocked and for no forward + * progress to either be made, and the driver being kept + * eternally awake. 
+ */ + if (dma_fence_is_i915(ap->fence) && + !i915_sched_node_verify_dag(&rq->sched, + &to_request(ap->fence)->sched)) + err = -EDEADLK; + + if (!err) { + mutex_lock(&rq->context->timeline->mutex); + err = ap->attach(ap); + mutex_unlock(&rq->context->timeline->mutex); + } + + /* Don't flag an error for co-dependent scheduling */ + if (err == -EDEADLK) { + struct i915_sched_node *waiter = + &to_request(ap->fence)->sched; + struct i915_dependency *p; + + list_for_each_entry_lockless(p, + &rq->sched.waiters_list, + wait_link) { + if (p->waiter == waiter && + p->flags & I915_DEPENDENCY_WEAK) { + err = 0; + break; + } + } + } + + if (err < 0) + i915_sw_fence_set_error_once(&rq->submit, err); + } + + i915_sw_fence_complete(&rq->submit); + + dma_fence_put(ap->fence); + kfree(ap); +} + +static int +await_proxy_wake(struct wait_queue_entry *entry, + unsigned int mode, + int flags, + void *fence) +{ + struct await_proxy *ap = container_of(entry, typeof(*ap), base); + + ap->fence = dma_fence_get(fence); + schedule_work(&ap->work); + + return 0; +} + +static void +await_proxy_timer(struct timer_list *t) +{ + struct await_proxy *ap = container_of(t, typeof(*ap), timer); + + if (dma_fence_remove_proxy_listener(ap->base.private, &ap->base)) { + struct i915_request *rq = ap->request; + + pr_notice("Asynchronous wait on unset proxy fence by %s:%s:%llx timed out\n", + rq->fence.ops->get_driver_name(&rq->fence), + rq->fence.ops->get_timeline_name(&rq->fence), + rq->fence.seqno); + i915_sw_fence_set_error_once(&rq->submit, -ETIMEDOUT); + + schedule_work(&ap->work); + } +} + +static int +__i915_request_await_proxy(struct i915_request *rq, + struct dma_fence *fence, + unsigned long timeout, + int (*attach)(struct await_proxy *ap), + void *data) +{ + struct await_proxy *ap; + + ap = kzalloc(sizeof(*ap), I915_FENCE_GFP); + if (!ap) + return -ENOMEM; + + i915_sw_fence_await(&rq->submit); + mark_external(rq); + + ap->base.private = fence; + ap->base.func = await_proxy_wake; + ap->request = rq; + INIT_WORK(&ap->work, await_proxy_work); + ap->attach = attach; + ap->data = data; + + timer_setup(&ap->timer, await_proxy_timer, 0); + if (timeout) + mod_timer(&ap->timer, round_jiffies_up(jiffies + timeout)); + + dma_fence_add_proxy_listener(fence, &ap->base); + return 0; +} + +static int await_proxy(struct await_proxy *ap) +{ + return i915_request_await_dma_fence(ap->request, ap->fence); +} + +static int +i915_request_await_proxy(struct i915_request *rq, struct dma_fence *fence) +{ + /* + * Wait until we know the real fence so that can optimise the + * inter-fence synchronisation. 
+ */ + return __i915_request_await_proxy(rq, fence, + i915_fence_timeout(rq->i915), + await_proxy, NULL); +} + int i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) { @@ -1179,6 +1330,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) if (dma_fence_is_i915(fence)) ret = i915_request_await_request(rq, to_request(fence)); + else if (dma_fence_is_proxy(fence)) + ret = i915_request_await_proxy(rq, fence); else ret = i915_request_await_external(rq, fence); if (ret < 0) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index bec2a9c25425..f8e797a7eee9 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -472,6 +472,47 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node, return 0; } +bool i915_sched_node_verify_dag(struct i915_sched_node *waiter, + struct i915_sched_node *signaler) +{ + struct i915_dependency *dep, *p; + struct i915_dependency stack; + bool result = false; + LIST_HEAD(dfs); + + if (list_empty(&waiter->waiters_list)) + return true; + + spin_lock_irq(&schedule_lock); + + stack.signaler = signaler; + list_add(&stack.dfs_link, &dfs); + + list_for_each_entry(dep, &dfs, dfs_link) { + struct i915_sched_node *node = dep->signaler; + + if (node_signaled(node)) + continue; + + list_for_each_entry(p, &node->signalers_list, signal_link) { + if (p->signaler == waiter) + goto out; + + if (list_empty(&p->dfs_link)) + list_add_tail(&p->dfs_link, &dfs); + } + } + + result = true; +out: + list_for_each_entry_safe(dep, p, &dfs, dfs_link) + INIT_LIST_HEAD(&dep->dfs_link); + + spin_unlock_irq(&schedule_lock); + + return result; +} + void i915_sched_node_fini(struct i915_sched_node *node) { struct i915_dependency *dep, *tmp; diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index 6f0bf00fc569..13432add8929 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -28,6 +28,9 @@ void i915_sched_node_init(struct i915_sched_node *node); void i915_sched_node_reinit(struct i915_sched_node *node); +bool i915_sched_node_verify_dag(struct i915_sched_node *waiter, + struct i915_sched_node *signal); + bool __i915_sched_node_add_dependency(struct i915_sched_node *node, struct i915_sched_node *signal, struct i915_dependency *dep, From patchwork Mon May 11 07:57:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539921 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 064DC912 for ; Mon, 11 May 2020 07:58:19 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E30D620735 for ; Mon, 11 May 2020 07:58:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E30D620735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DFB2E6E3EC; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org 
Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id CB19A89CD9 for ; Mon, 11 May 2020 07:58:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160793-1500050 for multiple; Mon, 11 May 2020 08:57:26 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:11 +0100 Message-Id: <20200511075722.13483-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/20] drm/i915/gem: Allow combining submit-fences with syncobj X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" We allow exported sync_file fences to be used as submit fences, but they are not the only source of user fences. We also accept an array of syncobj, and as with sync_file these are dma_fences underneath and so feature the same set of controls. The submit-fence allows for a request to be scheduled at the same time as the signaler, rather than as normal after. Userspace can combine submit-fence with its own semaphores for intra-batch scheduling. Not exposing submit-fences to syncobj was at the time just a matter of pragmatic expediency. Fixes: a88b6e4cbafd ("drm/i915: Allow specification of parallel execbuf") Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854 Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Lionel Landwerlin Reviewed-by: Tvrtko Ursulin --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 14 +++++++---- drivers/gpu/drm/i915/i915_request.c | 25 +++++++++++++++++++ include/uapi/drm/i915_drm.h | 7 +++--- 3 files changed, 38 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 199131db200f..6368f0070157 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2432,7 +2432,7 @@ static void __free_fence_array(struct drm_syncobj **fences, unsigned int n) { while (n--) - drm_syncobj_put(ptr_mask_bits(fences[n], 2)); + drm_syncobj_put(ptr_mask_bits(fences[n], 3)); kvfree(fences); } @@ -2489,7 +2489,7 @@ get_fence_array(struct drm_i915_gem_execbuffer2 *args, BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) & ~__I915_EXEC_FENCE_UNKNOWN_FLAGS); - fences[n] = ptr_pack_bits(syncobj, fence.flags, 2); + fences[n] = ptr_pack_bits(syncobj, fence.flags, 3); } return fences; @@ -2520,7 +2520,7 @@ await_fence_array(struct i915_execbuffer *eb, struct dma_fence *fence; unsigned int flags; - syncobj = ptr_unpack_bits(fences[n], &flags, 2); + syncobj = ptr_unpack_bits(fences[n], &flags, 3); if (!(flags & I915_EXEC_FENCE_WAIT)) continue; @@ -2544,7 +2544,11 @@ await_fence_array(struct i915_execbuffer *eb, spin_unlock(&syncobj->lock); } - err = i915_request_await_dma_fence(eb->request, fence); + if (flags & I915_EXEC_FENCE_WAIT_SUBMIT) + err = i915_request_await_execution(eb->request, fence, + eb->engine->bond_execute); + else + err 
= i915_request_await_dma_fence(eb->request, fence); dma_fence_put(fence); if (err < 0) return err; @@ -2565,7 +2569,7 @@ signal_fence_array(struct i915_execbuffer *eb, struct drm_syncobj *syncobj; unsigned int flags; - syncobj = ptr_unpack_bits(fences[n], &flags, 2); + syncobj = ptr_unpack_bits(fences[n], &flags, 3); if (!(flags & I915_EXEC_FENCE_SIGNAL)) continue; diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 945494b06bce..9ad1e6761492 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1433,6 +1433,27 @@ __i915_request_await_execution(struct i915_request *to, &from->fence); } +static int execution_proxy(struct await_proxy *ap) +{ + return i915_request_await_execution(ap->request, ap->fence, ap->data); +} + +static int +i915_request_await_proxy_execution(struct i915_request *rq, + struct dma_fence *fence, + void (*hook)(struct i915_request *rq, + struct dma_fence *signal)) +{ + /* + * We have to wait until the real request is known in order to + * be able to hook into its execution, as opposed to waiting for + * its completion. + */ + return __i915_request_await_proxy(rq, fence, + i915_fence_timeout(rq->i915), + execution_proxy, hook); +} + int i915_request_await_execution(struct i915_request *rq, struct dma_fence *fence, @@ -1472,6 +1493,10 @@ i915_request_await_execution(struct i915_request *rq, ret = __i915_request_await_execution(rq, to_request(fence), hook); + else if (dma_fence_is_proxy(fence)) + ret = i915_request_await_proxy_execution(rq, + fence, + hook); else ret = i915_request_await_external(rq, fence); if (ret < 0) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 14b67cd6b54b..704dd0e3bc1d 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1040,9 +1040,10 @@ struct drm_i915_gem_exec_fence { */ __u32 handle; -#define I915_EXEC_FENCE_WAIT (1<<0) -#define I915_EXEC_FENCE_SIGNAL (1<<1) -#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SIGNAL << 1)) +#define I915_EXEC_FENCE_WAIT (1u << 0) +#define I915_EXEC_FENCE_SIGNAL (1u << 1) +#define I915_EXEC_FENCE_WAIT_SUBMIT (1u << 2) +#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_WAIT_SUBMIT << 1)) __u32 flags; }; From patchwork Mon May 11 07:57:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539893 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC5B7139A for ; Mon, 11 May 2020 07:58:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B47E020735 for ; Mon, 11 May 2020 07:58:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B47E020735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 034586E265; Mon, 11 May 2020 07:58:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com 
[109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id E62FD6E25B for ; Mon, 11 May 2020 07:58:00 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160794-1500050 for multiple; Mon, 11 May 2020 08:57:27 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:12 +0100 Message-Id: <20200511075722.13483-10-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 10/20] drm/i915/gt: Declare when we enabled timeslicing X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kenneth Graunke , Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Let userspace know if they can trust timeslicing by including it as part of the I915_PARAM_HAS_SCHEDULER::I915_SCHEDULER_CAP_TIMESLICING v2: Only declare timeslicing if we can safely preempt userspace. Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802 Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854 Signed-off-by: Chris Wilson Cc: Kenneth Graunke Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_engine_user.c | 1 + include/uapi/drm/i915_drm.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c index 848decee9066..8415511f1465 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c @@ -98,6 +98,7 @@ static void set_scheduler_caps(struct drm_i915_private *i915) MAP(HAS_PREEMPTION, PREEMPTION), MAP(HAS_SEMAPHORES, SEMAPHORES), MAP(SUPPORTS_STATS, ENGINE_BUSY_STATS), + MAP(HAS_TIMESLICES, TIMESLICING), #undef MAP }; struct intel_engine_cs *engine; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 704dd0e3bc1d..1ee227b5131a 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -523,6 +523,7 @@ typedef struct drm_i915_irq_wait { #define I915_SCHEDULER_CAP_PREEMPTION (1ul << 2) #define I915_SCHEDULER_CAP_SEMAPHORES (1ul << 3) #define I915_SCHEDULER_CAP_ENGINE_BUSY_STATS (1ul << 4) +#define I915_SCHEDULER_CAP_TIMESLICING (1ul << 5) #define I915_PARAM_HUC_STATUS 42 From patchwork Mon May 11 07:57:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539899 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9747F139A for ; Mon, 11 May 2020 07:58:11 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7FBDE20735 for ; Mon, 11 May 2020 07:58:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7FBDE20735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8C1356E27C; Mon, 11 May 2020 07:58:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 78D226E248 for ; Mon, 11 May 2020 07:58:00 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160795-1500050 for multiple; Mon, 11 May 2020 08:57:27 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:13 +0100 Message-Id: <20200511075722.13483-11-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/20] drm/i915/gem: Remove redundant exec_fence X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since there can only be one of in_fence/exec_fence, just use the single in_fence local. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 24 ++++++++----------- 1 file changed, 10 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6368f0070157..2067557e277b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2643,7 +2643,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, struct drm_i915_private *i915 = to_i915(dev); struct i915_execbuffer eb; struct dma_fence *in_fence = NULL; - struct dma_fence *exec_fence = NULL; struct sync_file *out_fence = NULL; struct i915_vma *batch; int out_fence_fd = -1; @@ -2698,8 +2697,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, goto err_in_fence; } - exec_fence = sync_file_get_fence(lower_32_bits(args->rsvd2)); - if (!exec_fence) { + in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2)); + if (!in_fence) { err = -EINVAL; goto err_in_fence; } @@ -2709,7 +2708,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, out_fence_fd = get_unused_fd_flags(O_CLOEXEC); if (out_fence_fd < 0) { err = out_fence_fd; - goto err_exec_fence; + goto err_in_fence; } } @@ -2800,14 +2799,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, } if (in_fence) { - err = i915_request_await_dma_fence(eb.request, in_fence); - if (err < 0) - goto err_request; - } - - if (exec_fence) { - err = i915_request_await_execution(eb.request, exec_fence, - eb.engine->bond_execute); + if (args->flags & I915_EXEC_FENCE_SUBMIT) + err = i915_request_await_execution(eb.request, + in_fence, + eb.engine->bond_execute); + else + err = i915_request_await_dma_fence(eb.request, + in_fence); if (err < 0) goto err_request; } @@ -2876,8 +2874,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_out_fence: if (out_fence_fd != -1) put_unused_fd(out_fence_fd); -err_exec_fence: - 
dma_fence_put(exec_fence); err_in_fence: dma_fence_put(in_fence); return err; From patchwork Mon May 11 07:57:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539891 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6574139A for ; Mon, 11 May 2020 07:58:07 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AEA8720735 for ; Mon, 11 May 2020 07:58:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AEA8720735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E054A6E260; Mon, 11 May 2020 07:58:03 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id E91B66E265 for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160796-1500050 for multiple; Mon, 11 May 2020 08:57:27 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:14 +0100 Message-Id: <20200511075722.13483-12-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/20] drm/i915: Drop no-semaphore boosting X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Now that we have fast timeslicing on semaphores, we no longer need to prioritise non-semaphore work as we will yield any work blocked on a semaphore to the next in the queue. Previously with no timeslicing, blocking on the semaphore caused extremely bad scheduling with multiple clients utilising multiple rings. Now, there is no impact and we can remove the complication.
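To make the i915_priolist_types.h change below concrete, a rough before/after of the priority encoding (the helper names are hypothetical and for illustration only, not part of the patch):

#include <linux/types.h>

/*
 * Previously the user priority was shifted up by one and bit 0
 * (I915_PRIORITY_NOSEMAPHORE) boosted requests whose dependency chain
 * used no semaphores; with I915_USER_PRIORITY_SHIFT reduced to 0 the
 * user value now maps directly onto the scheduler priority.
 */
static inline int example_old_effective_prio(int user_prio, bool uses_semaphores)
{
	return (user_prio << 1) | (uses_semaphores ? 0 : 1);
}

static inline int example_new_effective_prio(int user_prio)
{
	return user_prio;
}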
Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ------- drivers/gpu/drm/i915/gt/intel_lrc.c | 9 ----- drivers/gpu/drm/i915/gt/selftest_context.c | 1 + drivers/gpu/drm/i915/i915_priolist_types.h | 4 +- drivers/gpu/drm/i915/i915_request.c | 40 ++----------------- drivers/gpu/drm/i915/i915_request.h | 1 - drivers/gpu/drm/i915/i915_scheduler.c | 12 +++--- drivers/gpu/drm/i915/i915_scheduler_types.h | 3 +- 8 files changed, 12 insertions(+), 73 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 2067557e277b..0a4606faf966 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2603,21 +2603,6 @@ static void eb_request_add(struct i915_execbuffer *eb) /* Check that the context wasn't destroyed before submission */ if (likely(!intel_context_is_closed(eb->context))) { attr = eb->gem_context->sched; - - /* - * Boost actual workloads past semaphores! - * - * With semaphores we spin on one engine waiting for another, - * simply to reduce the latency of starting our work when - * the signaler completes. However, if there is any other - * work that we could be doing on this engine instead, that - * is better utilisation and will reduce the overall duration - * of the current work. To avoid PI boosting a semaphore - * far in the distance past over useful work, we keep a history - * of any semaphore use along our dependency chain. - */ - if (!(rq->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN)) - attr.priority |= I915_PRIORITY_NOSEMAPHORE; } else { /* Serialise with context_close via the add_to_timeline */ i915_request_set_error_once(rq, -ENOENT); diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index f93f13d20b5a..382969c1c7ca 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -429,15 +429,6 @@ static int effective_prio(const struct i915_request *rq) if (i915_request_has_nopreempt(rq)) prio = I915_PRIORITY_UNPREEMPTABLE; - /* - * On unwinding the active request, we give it a priority bump - * if it has completed waiting on any semaphore. If we know that - * the request has already started, we can prevent an unwanted - * preempt-to-idle cycle by taking that into account now. - */ - if (__i915_request_has_started(rq)) - prio |= I915_PRIORITY_NOSEMAPHORE; - return prio; } diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c index a56dff3b157a..52af1cee9a94 100644 --- a/drivers/gpu/drm/i915/gt/selftest_context.c +++ b/drivers/gpu/drm/i915/gt/selftest_context.c @@ -24,6 +24,7 @@ static int request_sync(struct i915_request *rq) /* Opencode i915_request_add() so we can keep the timeline locked. 
*/ __i915_request_commit(rq); + rq->sched.attr.priority = I915_PRIORITY_BARRIER; __i915_request_queue(rq, NULL); timeout = i915_request_wait(rq, 0, HZ / 10); diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h index e18723d8df86..5003a71113cb 100644 --- a/drivers/gpu/drm/i915/i915_priolist_types.h +++ b/drivers/gpu/drm/i915/i915_priolist_types.h @@ -24,14 +24,12 @@ enum { I915_PRIORITY_DISPLAY, }; -#define I915_USER_PRIORITY_SHIFT 1 +#define I915_USER_PRIORITY_SHIFT 0 #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT) #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT) #define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1) -#define I915_PRIORITY_NOSEMAPHORE ((u8)BIT(0)) - /* Smallest priority value that cannot be bumped. */ #define I915_PRIORITY_INVALID (INT_MIN | (u8)I915_PRIORITY_MASK) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 9ad1e6761492..9738dab5a9f6 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -369,8 +369,6 @@ __await_execution(struct i915_request *rq, } spin_unlock_irq(&signal->lock); - /* Copy across semaphore status as we need the same behaviour */ - rq->sched.flags |= signal->sched.flags; return 0; } @@ -539,10 +537,8 @@ void __i915_request_unsubmit(struct i915_request *request) spin_unlock(&request->lock); /* We've already spun, don't charge on resubmitting. */ - if (request->sched.semaphores && i915_request_started(request)) { - request->sched.attr.priority |= I915_PRIORITY_NOSEMAPHORE; + if (request->sched.semaphores && i915_request_started(request)) request->sched.semaphores = 0; - } /* * We don't need to wake_up any waiters on request->execute, they @@ -600,15 +596,6 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } -static void irq_semaphore_cb(struct irq_work *wrk) -{ - struct i915_request *rq = - container_of(wrk, typeof(*rq), semaphore_work); - - i915_schedule_bump_priority(rq, I915_PRIORITY_NOSEMAPHORE); - i915_request_put(rq); -} - static int __i915_sw_fence_call semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) { @@ -616,11 +603,6 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) switch (state) { case FENCE_COMPLETE: - if (!(READ_ONCE(rq->sched.attr.priority) & I915_PRIORITY_NOSEMAPHORE)) { - i915_request_get(rq); - init_irq_work(&rq->semaphore_work, irq_semaphore_cb); - irq_work_queue(&rq->semaphore_work); - } break; case FENCE_FREE: @@ -999,6 +981,7 @@ emit_semaphore_wait(struct i915_request *to, gfp_t gfp) { const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask; + struct i915_sw_fence *wait = &to->submit; if (!intel_context_use_semaphores(to->context)) goto await_fence; @@ -1033,11 +1016,10 @@ emit_semaphore_wait(struct i915_request *to, goto await_fence; to->sched.semaphores |= mask; - to->sched.flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN; - return 0; + wait = &to->semaphore; await_fence: - return i915_sw_fence_await_dma_fence(&to->submit, + return i915_sw_fence_await_dma_fence(wait, &from->fence, 0, I915_FENCE_GFP); } @@ -1072,17 +1054,6 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from) if (ret < 0) return ret; - if (to->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN) { - ret = i915_sw_fence_await_dma_fence(&to->semaphore, - &from->fence, 0, - I915_FENCE_GFP); - if (ret < 0) - return ret; - } - - if (from->sched.flags & I915_SCHED_HAS_EXTERNAL_CHAIN) - 
to->sched.flags |= I915_SCHED_HAS_EXTERNAL_CHAIN; - return 0; } @@ -1706,9 +1677,6 @@ void i915_request_add(struct i915_request *rq) attr = ctx->sched; rcu_read_unlock(); - if (!(rq->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN)) - attr.priority |= I915_PRIORITY_NOSEMAPHORE; - __i915_request_queue(rq, &attr); mutex_unlock(&tl->mutex); diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 98ae2dc82371..8ec7ee4dbadc 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -216,7 +216,6 @@ struct i915_request { }; struct list_head execute_cb; struct i915_sw_fence semaphore; - struct irq_work semaphore_work; /* * A list of everyone we wait upon, and everyone who waits upon us. diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index f8e797a7eee9..56defe78ae54 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -51,11 +51,11 @@ static void assert_priolists(struct intel_engine_execlists * const execlists) GEM_BUG_ON(rb_first_cached(&execlists->queue) != rb_first(&execlists->queue.rb_root)); - last_prio = (INT_MAX >> I915_USER_PRIORITY_SHIFT) + 1; + last_prio = INT_MAX; for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) { const struct i915_priolist *p = to_priolist(rb); - GEM_BUG_ON(p->priority >= last_prio); + GEM_BUG_ON(p->priority > last_prio); last_prio = p->priority; GEM_BUG_ON(!p->used); @@ -434,15 +434,13 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node, dep->waiter = node; dep->flags = flags; - /* Keep track of whether anyone on this chain has a semaphore */ - if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN && - !node_started(signal)) - node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN; - /* All set, now publish. Beware the lockless walkers. 
*/ list_add_rcu(&dep->signal_link, &node->signalers_list); list_add_rcu(&dep->wait_link, &signal->waiters_list); + /* Propagate the chains */ + node->flags |= signal->flags; + ret = true; } diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h index 6ab2c5289bed..f72e6c397b08 100644 --- a/drivers/gpu/drm/i915/i915_scheduler_types.h +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h @@ -65,8 +65,7 @@ struct i915_sched_node { struct list_head link; struct i915_sched_attr attr; unsigned int flags; -#define I915_SCHED_HAS_SEMAPHORE_CHAIN BIT(0) -#define I915_SCHED_HAS_EXTERNAL_CHAIN BIT(1) +#define I915_SCHED_HAS_EXTERNAL_CHAIN BIT(0) intel_engine_mask_t semaphores; }; From patchwork Mon May 11 07:57:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539889 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 04A96912 for ; Mon, 11 May 2020 07:58:05 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E11FB20735 for ; Mon, 11 May 2020 07:58:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E11FB20735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2D8BA89CD9; Mon, 11 May 2020 07:58:03 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id F1E8D6E267 for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160797-1500050 for multiple; Mon, 11 May 2020 08:57:27 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:15 +0100 Message-Id: <20200511075722.13483-13-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/20] drm/i915: Move saturated workload detection back to the context X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" When we introduced the saturated workload detection to tell us to back off from semaphore usage [semaphores have a noticeable impact on contended bus cycles with the CPU for some heavy workloads], we first introduced it as a per-context tracker. 
This allows individual contexts to try and optimise their own usage, but we found that with the local tracking and the no-semaphore boosting, the first context to disable semaphores got a massive priority boost and so would starve the rest and all new contexts (as they started with semaphores enabled and lower priority). Hence we moved the saturated workload detection to the engine, and as a consequence had to disable semaphores on virtual engines. Now that we do not have semaphore priority boosting, we can move the tracking back to the context and virtual engines can now utilise the faster inter-engine synchronisation. References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global") Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context.c | 1 + drivers/gpu/drm/i915/gt/intel_context_types.h | 2 ++ drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 -- drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 -- drivers/gpu/drm/i915/gt/intel_lrc.c | 15 --------------- drivers/gpu/drm/i915/i915_request.c | 4 ++-- 6 files changed, 5 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index e4aece20bc80..762a251d553b 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -268,6 +268,7 @@ static int __intel_context_active(struct i915_active *active) if (err) goto err_timeline; + ce->saturated = 0; return 0; err_timeline: diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 4954b0df4864..aed26d93c2ca 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -78,6 +78,8 @@ struct intel_context { } lrc; u32 tag; /* cookie passed to HW to track this context on submission */ + intel_engine_mask_t saturated; /* submitting semaphores too late? */ + /* Time on GPU as tracked by the hw. */ struct { struct ewma_runtime avg; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index d0a1078ef632..6d7fdba5adef 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -229,8 +229,6 @@ static int __engine_park(struct intel_wakeref *wf) struct intel_engine_cs *engine = container_of(wf, typeof(*engine), wakeref); - engine->saturated = 0; - /* * If one and only one request is completed between pm events, * we know that we are inside the kernel context and it is diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index c113b7805e65..2b1232a233bc 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -332,8 +332,6 @@ struct intel_engine_cs { struct intel_context *kernel_context; /* pinned */ - intel_engine_mask_t saturated; /* submitting semaphores too late? */ - struct { struct delayed_work work; struct i915_request *systole; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 382969c1c7ca..730f639a4477 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -5591,21 +5591,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings, ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL; ve->base.uabi_instance = I915_ENGINE_CLASS_INVALID_VIRTUAL; - /* - * The decision on whether to submit a request using semaphores - * depends on the saturated state of the engine.
We only compute - * this during HW submission of the request, and we need for this - * state to be globally applied to all requests being submitted - * to this engine. Virtual engines encompass more than one physical - * engine and so we cannot accurately tell in advance if one of those - * engines is already saturated and so cannot afford to use a semaphore - * and be pessimized in priority for doing so -- if we are the only - * context using semaphores after all other clients have stopped, we - * will be starved on the saturated system. Such a global switch for - * semaphores is less than ideal, but alas is the current compromise. - */ - ve->base.saturated = ALL_ENGINES; - snprintf(ve->base.name, sizeof(ve->base.name), "virtual"); intel_engine_init_active(&ve->base, ENGINE_VIRTUAL); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 9738dab5a9f6..dae0b2c44951 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -469,7 +469,7 @@ bool __i915_request_submit(struct i915_request *request) */ if (request->sched.semaphores && i915_sw_fence_signaled(&request->semaphore)) - engine->saturated |= request->sched.semaphores; + request->context->saturated |= request->sched.semaphores; engine->emit_fini_breadcrumb(request, request->ring->vaddr + request->postfix); @@ -921,7 +921,7 @@ already_busywaiting(struct i915_request *rq) * * See the are-we-too-late? check in __i915_request_submit(). */ - return rq->sched.semaphores | READ_ONCE(rq->engine->saturated); + return rq->sched.semaphores | READ_ONCE(rq->context->saturated); } static int From patchwork Mon May 11 07:57:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539901 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 68DE0139A for ; Mon, 11 May 2020 07:58:12 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 515E620735 for ; Mon, 11 May 2020 07:58:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 515E620735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6DCAF6E26C; Mon, 11 May 2020 07:58:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id CFAC86E260 for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160798-1500050 for multiple; Mon, 11 May 2020 08:57:27 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:16 +0100 Message-Id: <20200511075722.13483-14-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> 
References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/20] drm/i915: Remove the saturation backoff for HW semaphores X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Could our scheduling now be good enough that we avoid unnecessary semaphores and do not waste bus cycles checking old results? Judging by local runs of the examples from last year, possibly! References: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems") Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context.c | 1 - drivers/gpu/drm/i915/gt/intel_context_types.h | 2 - drivers/gpu/drm/i915/i915_request.c | 54 ++----------------- drivers/gpu/drm/i915/i915_request.h | 1 - 4 files changed, 3 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 762a251d553b..e4aece20bc80 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -268,7 +268,6 @@ static int __intel_context_active(struct i915_active *active) if (err) goto err_timeline; - ce->saturated = 0; return 0; err_timeline: diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index aed26d93c2ca..4954b0df4864 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -78,8 +78,6 @@ struct intel_context { } lrc; u32 tag; /* cookie passed to HW to track this context on submission */ - intel_engine_mask_t saturated; /* submitting semaphores too late? */ - /* Time on GPU as tracked by the hw. */ struct { struct ewma_runtime avg; diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index dae0b2c44951..b87766b02efb 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -120,7 +120,6 @@ static void i915_fence_release(struct dma_fence *fence) * caught trying to reuse dead objects. */ i915_sw_fence_fini(&rq->submit); - i915_sw_fence_fini(&rq->semaphore); /* Keep one request on each engine for reserved use under mempressure */ if (!cmpxchg(&rq->engine->request_pool, NULL, rq)) @@ -451,26 +450,6 @@ bool __i915_request_submit(struct i915_request *request) if (unlikely(fatal_error(request->fence.error))) __i915_request_skip(request); - /* - * Are we using semaphores when the gpu is already saturated? - * - * Using semaphores incurs a cost in having the GPU poll a - * memory location, busywaiting for it to change. The continual - * memory reads can have a noticeable impact on the rest of the - * system with the extra bus traffic, stalling the cpu as it too - * tries to access memory across the bus (perf stat -e bus-cycles). - * - * If we installed a semaphore on this request and we only submit - * the request after the signaler completed, that indicates the - * system is overloaded and using semaphores at this time only - * increases the amount of work we are doing. If so, we disable - * further use of semaphores until we are idle again, whence we - * optimistically try again. 
- */ - if (request->sched.semaphores && - i915_sw_fence_signaled(&request->semaphore)) - request->context->saturated |= request->sched.semaphores; - engine->emit_fini_breadcrumb(request, request->ring->vaddr + request->postfix); @@ -536,10 +515,6 @@ void __i915_request_unsubmit(struct i915_request *request) spin_unlock(&request->lock); - /* We've already spun, don't charge on resubmitting. */ - if (request->sched.semaphores && i915_request_started(request)) - request->sched.semaphores = 0; - /* * We don't need to wake_up any waiters on request->execute, they * will get woken by any other event or us re-adding this request @@ -596,23 +571,6 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } -static int __i915_sw_fence_call -semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) -{ - struct i915_request *rq = container_of(fence, typeof(*rq), semaphore); - - switch (state) { - case FENCE_COMPLETE: - break; - - case FENCE_FREE: - i915_request_put(rq); - break; - } - - return NOTIFY_DONE; -} - static void retire_requests(struct intel_timeline *tl) { struct i915_request *rq, *rn; @@ -668,7 +626,6 @@ static void __i915_request_ctor(void *arg) spin_lock_init(&rq->lock); i915_sched_node_init(&rq->sched); i915_sw_fence_init(&rq->submit, submit_notify); - i915_sw_fence_init(&rq->semaphore, semaphore_notify); dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0); @@ -757,7 +714,6 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp) /* We bump the ref for the fence chain */ i915_sw_fence_reinit(&i915_request_get(rq)->submit); - i915_sw_fence_reinit(&i915_request_get(rq)->semaphore); i915_sched_node_reinit(&rq->sched); @@ -918,10 +874,8 @@ already_busywaiting(struct i915_request *rq) * if we have detected the engine is saturated (i.e. would not be * submitted early and cause bus traffic reading an already passed * semaphore). - * - * See the are-we-too-late? check in __i915_request_submit(). */ - return rq->sched.semaphores | READ_ONCE(rq->context->saturated); + return rq->sched.semaphores; } static int @@ -981,7 +935,6 @@ emit_semaphore_wait(struct i915_request *to, gfp_t gfp) { const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask; - struct i915_sw_fence *wait = &to->submit; if (!intel_context_use_semaphores(to->context)) goto await_fence; @@ -1016,10 +969,10 @@ emit_semaphore_wait(struct i915_request *to, goto await_fence; to->sched.semaphores |= mask; - wait = &to->semaphore; + return 0; await_fence: - return i915_sw_fence_await_dma_fence(wait, + return i915_sw_fence_await_dma_fence(&to->submit, &from->fence, 0, I915_FENCE_GFP); } @@ -1654,7 +1607,6 @@ void __i915_request_queue(struct i915_request *rq, */ if (attr && rq->engine->schedule) rq->engine->schedule(rq, attr); - i915_sw_fence_commit(&rq->semaphore); i915_sw_fence_commit(&rq->submit); } diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 8ec7ee4dbadc..246c80dd37f1 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -215,7 +215,6 @@ struct i915_request { } duration; }; struct list_head execute_cb; - struct i915_sw_fence semaphore; /* * A list of everyone we wait upon, and everyone who waits upon us. 
From patchwork Mon May 11 07:57:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539915 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 50949139A for ; Mon, 11 May 2020 07:58:17 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3923B20735 for ; Mon, 11 May 2020 07:58:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3923B20735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 257C76E3F2; Mon, 11 May 2020 07:58:06 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id CE5246E25B for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160799-1500050 for multiple; Mon, 11 May 2020 08:57:28 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:17 +0100 Message-Id: <20200511075722.13483-15-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 15/20] drm/i915/gt: Use built-in active intel_context reference X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since a few rearrangements ago, we have an explicit reference to the containing intel_context from inside the active reference and can drop our own reference-handling dance around releasing the i915_active. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index e4aece20bc80..e9b754b317bb 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -155,15 +155,7 @@ void intel_context_unpin(struct intel_context *ce) CE_TRACE(ce, "unpin\n"); ce->ops->unpin(ce); - /* - * Once released, we may asynchronously drop the active reference. - * As that may be the only reference keeping the context alive, - * take an extra now so that it is not freed before we finish - * dereferencing it.
- */ - intel_context_get(ce); intel_context_active_release(ce); - intel_context_put(ce); } static int __context_pin_state(struct i915_vma *vma) From patchwork Mon May 11 07:57:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539911 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11A8B175D for ; Mon, 11 May 2020 07:58:16 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EE2F420735 for ; Mon, 11 May 2020 07:58:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EE2F420735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0C53D6E3EE; Mon, 11 May 2020 07:58:06 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 96C496E248 for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160800-1500050 for multiple; Mon, 11 May 2020 08:57:28 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:18 +0100 Message-Id: <20200511075722.13483-16-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 16/20] drm/i915: Drop I915_RESET_TIMEOUT and friends X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" These were used to set various timeouts for the reset procedure (deciding when the engine was dead, and even if the reset itself was not making forward progress). No longer used. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2e3b5c4d0759..ad287e5d6ded 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -623,13 +623,6 @@ i915_fence_timeout(const struct drm_i915_private *i915) return i915_fence_context_timeout(i915, U64_MAX); } -#define I915_RESET_TIMEOUT (10 * HZ) /* 10s */ - -#define I915_ENGINE_DEAD_TIMEOUT (4 * HZ) /* Seqno, head and subunits dead */ -#define I915_SEQNO_DEAD_TIMEOUT (12 * HZ) /* Seqno dead with active head */ - -#define I915_ENGINE_WEDGED_TIMEOUT (60 * HZ) /* Reset but no recovery? 
*/ - /* Amount of SAGV/QGV points, BSpec precisely defines this */ #define I915_NUM_QGV_POINTS 8 From patchwork Mon May 11 07:57:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539897 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF06D912 for ; Mon, 11 May 2020 07:58:10 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 97BC120735 for ; Mon, 11 May 2020 07:58:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 97BC120735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 523EA6E267; Mon, 11 May 2020 07:58:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 94DB46E220 for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160801-1500050 for multiple; Mon, 11 May 2020 08:57:28 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:19 +0100 Message-Id: <20200511075722.13483-17-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 17/20] drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This timeout is only used in one place, to provide a tiny bit of grace for slow igt tests to clean up after themselves. If we are a bit stricter and opt to kill outstanding requests rather than wait, we can speed up igt by not waiting for 200ms after a hang.
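To make the new behaviour concrete, here is a condensed sketch of the debugfs path after this change, assembled purely from the hunk in the diff below; the wrapper name gt_drop_caches_sketch is a placeholder, and error handling plus the DROP_IDLE/DROP_ACTIVE wait are elided.

static void gt_drop_caches_sketch(struct intel_gt *gt, u64 val)
{
	/* Nudge retirement with a short, bounded wait instead of idling for up to 200ms */
	if (val & (DROP_RETIRE | DROP_RESET_ACTIVE))
		intel_gt_wait_for_idle(gt, 1);

	/* If a reset was requested, wedge the GT so outstanding requests are cancelled at once */
	if (val & DROP_RESET_ACTIVE && intel_gt_pm_get_if_awake(gt)) {
		intel_gt_set_wedged(gt);
		intel_gt_pm_put(gt);
	}
}

The net effect is that a hung test no longer buys itself a 200ms grace period: wedging cancels the outstanding requests immediately, which is what lets igt runs complete faster.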
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 11 ++++++----- drivers/gpu/drm/i915/i915_drv.h | 2 -- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 8e98df6a3045..649acf1fc33d 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1463,12 +1463,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val) { int ret; - if (val & DROP_RESET_ACTIVE && - wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT)) - intel_gt_set_wedged(gt); + if (val & (DROP_RETIRE | DROP_RESET_ACTIVE)) + intel_gt_wait_for_idle(gt, 1); - if (val & DROP_RETIRE) - intel_gt_retire_requests(gt); + if (val & DROP_RESET_ACTIVE && intel_gt_pm_get_if_awake(gt)) { + intel_gt_set_wedged(gt); + intel_gt_pm_put(gt); + } if (val & (DROP_IDLE | DROP_ACTIVE)) { ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ad287e5d6ded..97687ea53c3d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -612,8 +612,6 @@ struct i915_gem_mm { u32 shrink_count; }; -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */ - unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915, u64 context); From patchwork Mon May 11 07:57:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539885 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 37651912 for ; Mon, 11 May 2020 07:58:02 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1F90B20735 for ; Mon, 11 May 2020 07:58:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1F90B20735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DAD246E262; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id B2C2289CD9 for ; Mon, 11 May 2020 07:57:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160802-1500050 for multiple; Mon, 11 May 2020 08:57:28 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:20 +0100 Message-Id: <20200511075722.13483-18-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 18/20] drm/i915/selftests: Always call the provided engine->emit_init_breadcrumb X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: 
Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" While this does not appear to fix any issues, the backend itself knows when it wants to emit a breadcrumb, so let it make the final call. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/selftests/i915_perf.c | 3 +-- drivers/gpu/drm/i915/selftests/igt_spinner.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c index 5608fab98d5d..ca0c9dbab713 100644 --- a/drivers/gpu/drm/i915/selftests/i915_perf.c +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c @@ -221,8 +221,7 @@ static int live_noa_delay(void *arg) goto out; } - if (rq->engine->emit_init_breadcrumb && - i915_request_timeline(rq)->has_initial_breadcrumb) { + if (rq->engine->emit_init_breadcrumb) { err = rq->engine->emit_init_breadcrumb(rq); if (err) { i915_request_add(rq); diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c index 9ad4ab088466..e35ba5f9e73f 100644 --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c @@ -169,8 +169,7 @@ igt_spinner_create_request(struct igt_spinner *spin, intel_gt_chipset_flush(engine->gt); - if (engine->emit_init_breadcrumb && - i915_request_timeline(rq)->has_initial_breadcrumb) { + if (engine->emit_init_breadcrumb) { err = engine->emit_init_breadcrumb(rq); if (err) goto cancel_rq; From patchwork Mon May 11 07:57:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539903 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1891D912 for ; Mon, 11 May 2020 07:58:13 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0108820735 for ; Mon, 11 May 2020 07:58:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0108820735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 578146E29A; Mon, 11 May 2020 07:58:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 93CB589CD9 for ; Mon, 11 May 2020 07:57:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160803-1500050 for multiple; Mon, 11 May 2020 08:57:29 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:21 +0100 Message-Id: <20200511075722.13483-19-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: 
<20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 19/20] drm/i915: Emit await(batch) before MI_BB_START X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Be consistent and ensure that we always emit the asynchronous waits prior to issuing instructions that use the address. This ensures that if we do emit GPU commands to do the await, they are before our use! Signed-off-by: Chris Wilson --- .../drm/i915/gem/selftests/i915_gem_context.c | 49 ++++++++++++------- .../drm/i915/gem/selftests/igt_gem_utils.c | 26 ++++------ drivers/gpu/drm/i915/gt/intel_renderstate.c | 16 +++--- drivers/gpu/drm/i915/selftests/i915_request.c | 28 +++++------ 4 files changed, 65 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 87d264fe54b2..b81978890641 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -972,12 +972,6 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, goto err_batch; } - err = rq->engine->emit_bb_start(rq, - batch->node.start, batch->node.size, - 0); - if (err) - goto err_request; - i915_vma_lock(batch); err = i915_request_await_object(rq, batch->obj, false); if (err == 0) @@ -994,6 +988,18 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, if (err) goto skip_request; + if (rq->engine->emit_init_breadcrumb) { + err = rq->engine->emit_init_breadcrumb(rq); + if (err) + goto skip_request; + } + + err = rq->engine->emit_bb_start(rq, + batch->node.start, batch->node.size, + 0); + if (err) + goto skip_request; + i915_vma_unpin_and_release(&batch, 0); i915_vma_unpin(vma); @@ -1005,7 +1011,6 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, skip_request: i915_request_set_error_once(rq, err); -err_request: i915_request_add(rq); err_batch: i915_vma_unpin_and_release(&batch, 0); @@ -1541,10 +1546,6 @@ static int write_to_scratch(struct i915_gem_context *ctx, goto err_unpin; } - err = engine->emit_bb_start(rq, vma->node.start, vma->node.size, 0); - if (err) - goto err_request; - i915_vma_lock(vma); err = i915_request_await_object(rq, vma->obj, false); if (err == 0) @@ -1553,6 +1554,16 @@ static int write_to_scratch(struct i915_gem_context *ctx, if (err) goto skip_request; + if (rq->engine->emit_init_breadcrumb) { + err = rq->engine->emit_init_breadcrumb(rq); + if (err) + goto skip_request; + } + + err = engine->emit_bb_start(rq, vma->node.start, vma->node.size, 0); + if (err) + goto skip_request; + i915_vma_unpin(vma); i915_request_add(rq); @@ -1560,7 +1571,6 @@ static int write_to_scratch(struct i915_gem_context *ctx, goto out_vm; skip_request: i915_request_set_error_once(rq, err); -err_request: i915_request_add(rq); err_unpin: i915_vma_unpin(vma); @@ -1674,10 +1684,6 @@ static int read_from_scratch(struct i915_gem_context *ctx, goto err_unpin; } - err = engine->emit_bb_start(rq, vma->node.start, vma->node.size, flags); - if (err) - goto err_request; - i915_vma_lock(vma); err = i915_request_await_object(rq, vma->obj, true); if (err == 0) @@ -1686,6 +1692,16 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto skip_request; + if (rq->engine->emit_init_breadcrumb) { + err = 
rq->engine->emit_init_breadcrumb(rq); + if (err) + goto skip_request; + } + + err = engine->emit_bb_start(rq, vma->node.start, vma->node.size, flags); + if (err) + goto skip_request; + i915_vma_unpin(vma); i915_request_add(rq); @@ -1708,7 +1724,6 @@ static int read_from_scratch(struct i915_gem_context *ctx, goto out_vm; skip_request: i915_request_set_error_once(rq, err); -err_request: i915_request_add(rq); err_unpin: i915_vma_unpin(vma); diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c index 772d8cba7da9..e21b5023ca7d 100644 --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c @@ -83,6 +83,8 @@ igt_emit_store_dw(struct i915_vma *vma, offset += PAGE_SIZE; } *cmd = MI_BATCH_BUFFER_END; + + i915_gem_object_flush_map(obj); i915_gem_object_unpin_map(obj); intel_gt_chipset_flush(vma->vm->gt); @@ -126,16 +128,6 @@ int igt_gpu_fill_dw(struct intel_context *ce, goto err_batch; } - flags = 0; - if (INTEL_GEN(ce->vm->i915) <= 5) - flags |= I915_DISPATCH_SECURE; - - err = rq->engine->emit_bb_start(rq, - batch->node.start, batch->node.size, - flags); - if (err) - goto err_request; - i915_vma_lock(batch); err = i915_request_await_object(rq, batch->obj, false); if (err == 0) @@ -152,15 +144,17 @@ int igt_gpu_fill_dw(struct intel_context *ce, if (err) goto skip_request; - i915_request_add(rq); - - i915_vma_unpin_and_release(&batch, 0); + flags = 0; + if (INTEL_GEN(ce->vm->i915) <= 5) + flags |= I915_DISPATCH_SECURE; - return 0; + err = rq->engine->emit_bb_start(rq, + batch->node.start, batch->node.size, + flags); skip_request: - i915_request_set_error_once(rq, err); -err_request: + if (err) + i915_request_set_error_once(rq, err); i915_request_add(rq); err_batch: i915_vma_unpin_and_release(&batch, 0); diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.c b/drivers/gpu/drm/i915/gt/intel_renderstate.c index 708cb7808865..f59e7875cc5e 100644 --- a/drivers/gpu/drm/i915/gt/intel_renderstate.c +++ b/drivers/gpu/drm/i915/gt/intel_renderstate.c @@ -219,6 +219,14 @@ int intel_renderstate_emit(struct intel_renderstate *so, if (!so->vma) return 0; + i915_vma_lock(so->vma); + err = i915_request_await_object(rq, so->vma->obj, false); + if (err == 0) + err = i915_vma_move_to_active(so->vma, rq, 0); + i915_vma_unlock(so->vma); + if (err) + return err; + err = engine->emit_bb_start(rq, so->batch_offset, so->batch_size, I915_DISPATCH_SECURE); @@ -233,13 +241,7 @@ int intel_renderstate_emit(struct intel_renderstate *so, return err; } - i915_vma_lock(so->vma); - err = i915_request_await_object(rq, so->vma->obj, false); - if (err == 0) - err = i915_vma_move_to_active(so->vma, rq, 0); - i915_vma_unlock(so->vma); - - return err; + return 0; } void intel_renderstate_fini(struct intel_renderstate *so) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index 15b1ca9f7a01..ffdfcb3805b5 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -865,13 +865,6 @@ static int live_all_engines(void *arg) goto out_request; } - err = engine->emit_bb_start(request[idx], - batch->node.start, - batch->node.size, - 0); - GEM_BUG_ON(err); - request[idx]->batch = batch; - i915_vma_lock(batch); err = i915_request_await_object(request[idx], batch->obj, 0); if (err == 0) @@ -879,6 +872,13 @@ static int live_all_engines(void *arg) i915_vma_unlock(batch); GEM_BUG_ON(err); + err = 
engine->emit_bb_start(request[idx], + batch->node.start, + batch->node.size, + 0); + GEM_BUG_ON(err); + request[idx]->batch = batch; + i915_request_get(request[idx]); i915_request_add(request[idx]); idx++; @@ -993,13 +993,6 @@ static int live_sequential_engines(void *arg) } } - err = engine->emit_bb_start(request[idx], - batch->node.start, - batch->node.size, - 0); - GEM_BUG_ON(err); - request[idx]->batch = batch; - i915_vma_lock(batch); err = i915_request_await_object(request[idx], batch->obj, false); @@ -1008,6 +1001,13 @@ i915_vma_unlock(batch); GEM_BUG_ON(err); + err = engine->emit_bb_start(request[idx], + batch->node.start, + batch->node.size, + 0); + GEM_BUG_ON(err); + request[idx]->batch = batch; + i915_request_get(request[idx]); i915_request_add(request[idx]); From patchwork Mon May 11 07:57:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11539905 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1433139A for ; Mon, 11 May 2020 07:58:13 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C692120735 for ; Mon, 11 May 2020 07:58:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C692120735 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EB64F6E28B; Mon, 11 May 2020 07:58:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id EAF716E25B for ; Mon, 11 May 2020 07:57:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21160804-1500050 for multiple; Mon, 11 May 2020 08:57:29 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 11 May 2020 08:57:22 +0100 Message-Id: <20200511075722.13483-20-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200511075722.13483-1-chris@chris-wilson.co.uk> References: <20200511075722.13483-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 20/20] drm/i915/selftests: Always flush before unpinning after writing X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Be consistent, and even when we know we used a WC mapping, flush the mapped object after writing into it. The flush understands the mapping type and will only flush the write-combining buffer (WCB) for I915_MAP_WC.
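For reference, the ordering the selftests converge on is: write through the CPU map, flush the map, unpin it, and only then flush the chipset. Below is a minimal sketch of that pattern, mirroring the hunks in the diff that follows; the helper name emit_batch_end_sketch and its obj/gt parameters are illustrative only, not part of the patch.

static int emit_batch_end_sketch(struct drm_i915_gem_object *obj, struct intel_gt *gt)
{
	u32 *cmd;

	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
	if (IS_ERR(cmd))
		return PTR_ERR(cmd);

	*cmd = MI_BATCH_BUFFER_END;

	/* Flush CPU writes while still mapped; for an I915_MAP_WC map this drains the WCB */
	i915_gem_object_flush_map(obj);
	i915_gem_object_unpin_map(obj);

	/* Then flush the chipset so the GPU observes the terminating command */
	intel_gt_chipset_flush(gt);

	return 0;
}

Keeping the map flush before the unpin, and the chipset flush after, is the invariant each hunk below restores.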
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_object_blt.c | 8 ++++++-- drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c | 2 ++ drivers/gpu/drm/i915/gt/selftest_ring_submission.c | 2 ++ drivers/gpu/drm/i915/gt/selftest_rps.c | 2 ++ drivers/gpu/drm/i915/selftests/i915_request.c | 9 +++++++-- 5 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c index 2fc7737ef5f4..f457d7130491 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c @@ -78,10 +78,12 @@ struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce, } while (rem); *cmd = MI_BATCH_BUFFER_END; - intel_gt_chipset_flush(ce->vm->gt); + i915_gem_object_flush_map(pool->obj); i915_gem_object_unpin_map(pool->obj); + intel_gt_chipset_flush(ce->vm->gt); + batch = i915_vma_instance(pool->obj, ce->vm, NULL); if (IS_ERR(batch)) { err = PTR_ERR(batch); @@ -289,10 +291,12 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce, } while (rem); *cmd = MI_BATCH_BUFFER_END; - intel_gt_chipset_flush(ce->vm->gt); + i915_gem_object_flush_map(pool->obj); i915_gem_object_unpin_map(pool->obj); + intel_gt_chipset_flush(ce->vm->gt); + batch = i915_vma_instance(pool->obj, ce->vm, NULL); if (IS_ERR(batch)) { err = PTR_ERR(batch); diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index 3f6079e1dfb6..87d7d8aa080f 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -158,6 +158,8 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) return PTR_ERR(map); map[offset / sizeof(*map)] = v; + + __i915_gem_object_flush_map(ctx->obj, offset, sizeof(*map)); i915_gem_object_unpin_map(ctx->obj); return 0; diff --git a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c index 9995faadd7e8..3350e7c995bc 100644 --- a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c +++ b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c @@ -54,6 +54,8 @@ static struct i915_vma *create_wally(struct intel_engine_cs *engine) *cs++ = STACK_MAGIC; *cs++ = MI_BATCH_BUFFER_END; + + i915_gem_object_flush_map(obj); i915_gem_object_unpin_map(obj); vma->private = intel_context_create(engine); /* dummy residuals */ diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c index bfa1a15564f7..6275d69aa9cc 100644 --- a/drivers/gpu/drm/i915/gt/selftest_rps.c +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c @@ -727,6 +727,7 @@ int live_rps_frequency_cs(void *arg) err_vma: *cancel = MI_BATCH_BUFFER_END; + i915_gem_object_flush_map(vma->obj); i915_gem_object_unpin_map(vma->obj); i915_vma_unpin(vma); i915_vma_put(vma); @@ -868,6 +869,7 @@ int live_rps_frequency_srm(void *arg) err_vma: *cancel = MI_BATCH_BUFFER_END; + i915_gem_object_flush_map(vma->obj); i915_gem_object_unpin_map(vma->obj); i915_vma_unpin(vma); i915_vma_put(vma); diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index ffdfcb3805b5..6014e8dfcbb1 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -816,10 +816,12 @@ static int recursive_batch_resolve(struct i915_vma *batch) return PTR_ERR(cmd); *cmd = MI_BATCH_BUFFER_END; - intel_gt_chipset_flush(batch->vm->gt); + 
__i915_gem_object_flush_map(batch->obj, 0, sizeof(*cmd)); i915_gem_object_unpin_map(batch->obj); + intel_gt_chipset_flush(batch->vm->gt); + return 0; } @@ -1060,9 +1062,12 @@ static int live_sequential_engines(void *arg) I915_MAP_WC); if (!IS_ERR(cmd)) { *cmd = MI_BATCH_BUFFER_END; - intel_gt_chipset_flush(engine->gt); + __i915_gem_object_flush_map(request[idx]->batch->obj, + 0, sizeof(*cmd)); i915_gem_object_unpin_map(request[idx]->batch->obj); + + intel_gt_chipset_flush(engine->gt); } i915_vma_put(request[idx]->batch);