From patchwork Fri May 25 09:32:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10426745 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 831976032C for ; Fri, 25 May 2018 09:33:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6DF8728671 for ; Fri, 25 May 2018 09:33:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 62D032880B; Fri, 25 May 2018 09:33:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CE38028AAA for ; Fri, 25 May 2018 09:33:20 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 069F86E8F7; Fri, 25 May 2018 09:33:20 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6BB506E8F7 for ; Fri, 25 May 2018 09:33:18 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 11838006-1500050 for multiple; Fri, 25 May 2018 10:32:48 +0100 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Fri, 25 May 2018 10:32:47 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 25 May 2018 10:32:06 +0100 Message-Id: <20180525093206.1919-19-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180525093206.1919-1-chris@chris-wilson.co.uk> References: <20180525093206.1919-1-chris@chris-wilson.co.uk> X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH 18/18] drm/i915: Hold request reference for submission until retirement X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Currently the async submission backends (guc and execlists) hold a extra reference to the requests being processed as they are not serialised with request retirement. If we instead, prevent the request being dropped from the engine timeline until after submission has finished processing the request, we can use a single reference held for the entire submission process (currently, it is held only for the submission fence). By doing so we remove a few atomics from inside the irqoff path, on the order of 200ns as measured by gem_syslatency, bringing the impact of direct submission into line with the previous tasklet implementation. The tradeoff is that as we may postpone the retirement, we have to check for any residual requests after detecting that the engines are idle. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_request.c | 20 ++++++++--------- drivers/gpu/drm/i915/intel_engine_cs.c | 24 ++++++++++++++++++++- drivers/gpu/drm/i915/intel_guc_submission.c | 4 +--- drivers/gpu/drm/i915/intel_lrc.c | 10 +-------- drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +- 5 files changed, 36 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index f91942e4d852..2cfa6f3a2f16 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -347,17 +347,15 @@ static void free_capture_list(struct i915_request *request) static void __retire_engine_upto(struct intel_engine_cs *engine, struct i915_request *rq) { + struct list_head * const requests = &engine->timeline.requests; struct i915_request *tmp; if (list_empty(&rq->link)) return; - do { - tmp = list_first_entry(&engine->timeline.requests, - typeof(*tmp), link); - - intel_engine_retire_request(engine, tmp); - } while (tmp != rq); + do + tmp = list_first_entry(requests, typeof(*tmp), link); + while (intel_engine_retire_request(engine, tmp) && tmp != rq); } static void i915_request_retire(struct i915_request *request) @@ -376,6 +374,8 @@ static void i915_request_retire(struct i915_request *request) trace_i915_request_retire(request); + __retire_engine_upto(request->engine, request); + advance_ring(request); free_capture_list(request); @@ -414,8 +414,6 @@ static void i915_request_retire(struct i915_request *request) atomic_dec_if_positive(&request->gem_context->ban_score); intel_context_unpin(request->hw_context); - __retire_engine_upto(request->engine, request); - unreserve_gt(request->i915); i915_sched_node_fini(request->i915, &request->sched); @@ -722,8 +720,10 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx) rq->timeline->fence_context, timeline_get_seqno(rq->timeline)); - /* We bump the ref for the fence chain */ - i915_sw_fence_init(&i915_request_get(rq)->submit, submit_notify); + /* We bump the ref for the fence chain and for the submit backend. */ + refcount_set(&rq->fence.refcount.refcount, 3); + + i915_sw_fence_init(&rq->submit, submit_notify); init_waitqueue_head(&rq->execute); i915_sched_node_init(&rq->sched); diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index cce7234b9071..b45d6aa7d29d 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -1071,8 +1071,10 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915) * * This request has been completed and is part of the chain being retired by * the caller, so drop any reference to it from the engine. + * + * Returns: true if the reference was dropped, false if it was still busy. */ -void intel_engine_retire_request(struct intel_engine_cs *engine, +bool intel_engine_retire_request(struct intel_engine_cs *engine, struct i915_request *rq) { GEM_TRACE("%s(%s) fence %llx:%d, global=%d, current %d\n", @@ -1085,6 +1087,10 @@ void intel_engine_retire_request(struct intel_engine_cs *engine, GEM_BUG_ON(rq->engine != engine); GEM_BUG_ON(!i915_request_completed(rq)); + /* Don't drop the final ref until after the backend has finished */ + if (port_request(engine->execlists.port) == rq) + return false; + local_irq_disable(); spin_lock(&engine->timeline.lock); @@ -1116,6 +1122,19 @@ void intel_engine_retire_request(struct intel_engine_cs *engine, if (engine->last_retired_context) intel_context_unpin(engine->last_retired_context); engine->last_retired_context = rq->hw_context; + + i915_request_put(rq); + return true; +} + +static void engine_retire_requests(struct intel_engine_cs *engine) +{ + struct i915_request *rq, *next; + + list_for_each_entry_safe(rq, next, &engine->timeline.requests, link) { + if (WARN_ON(!intel_engine_retire_request(engine, rq))) + break; + } } /** @@ -1148,6 +1167,7 @@ void intel_engines_park(struct drm_i915_private *i915) "%s is not idle before parking\n", engine->name); intel_engine_dump(engine, &p, NULL); + engine->cancel_requests(engine); } /* Must be reset upon idling, or we may miss the busy wakeup. */ @@ -1156,6 +1176,8 @@ void intel_engines_park(struct drm_i915_private *i915) if (engine->park) engine->park(engine); + engine_retire_requests(engine); + if (engine->pinned_default_state) { i915_gem_object_unpin_map(engine->default_state); engine->pinned_default_state = NULL; diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c index 133367a17863..6f6223644140 100644 --- a/drivers/gpu/drm/i915/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/intel_guc_submission.c @@ -669,8 +669,7 @@ static void guc_submit(struct intel_engine_cs *engine) static void port_assign(struct execlist_port *port, struct i915_request *rq) { GEM_BUG_ON(port_isset(port)); - - port_set(port, i915_request_get(rq)); + port_set(port, rq); } static inline int rq_prio(const struct i915_request *rq) @@ -793,7 +792,6 @@ static void guc_submission_tasklet(unsigned long data) rq = port_request(port); while (rq && i915_request_completed(rq)) { trace_i915_request_out(rq); - i915_request_put(rq); port = execlists_port_complete(execlists, port); if (port_isset(port)) { diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index c88ea807945a..a0b139debe1f 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -513,11 +513,7 @@ static bool can_merge_ctx(const struct intel_context *prev, static void port_assign(struct execlist_port *port, struct i915_request *rq) { GEM_BUG_ON(rq == port_request(port)); - - if (port_isset(port)) - i915_request_put(port_request(port)); - - port_set(port, port_pack(i915_request_get(rq), port_count(port))); + port_set(port, port_pack(rq, port_count(port))); } static void inject_preempt_context(struct intel_engine_cs *engine) @@ -793,8 +789,6 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists) INTEL_CONTEXT_SCHEDULE_OUT : INTEL_CONTEXT_SCHEDULE_PREEMPTED); - i915_request_put(rq); - memset(port, 0, sizeof(*port)); port++; } @@ -1061,8 +1055,6 @@ static void process_csb(struct intel_engine_cs *engine) execlists_context_schedule_out(rq, INTEL_CONTEXT_SCHEDULE_OUT); - i915_request_put(rq); - GEM_TRACE("%s completed ctx=%d\n", engine->name, port->context_id); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 86d99366e7ed..1d4847c11d71 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -888,7 +888,7 @@ int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine); int intel_init_blt_ring_buffer(struct intel_engine_cs *engine); int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine); -void intel_engine_retire_request(struct intel_engine_cs *engine, +bool intel_engine_retire_request(struct intel_engine_cs *engine, struct i915_request *rq); int intel_engine_stop_cs(struct intel_engine_cs *engine);