From patchwork Fri Sep 9 11:00:54 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 9322953 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 8FF9160869 for ; Fri, 9 Sep 2016 11:01:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 83F6A29DE1 for ; Fri, 9 Sep 2016 11:01:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7909F29DE5; Fri, 9 Sep 2016 11:01:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C02A729DE2 for ; Fri, 9 Sep 2016 11:01:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 608B76E790; Fri, 9 Sep 2016 11:01:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) by gabe.freedesktop.org (Postfix) with ESMTPS id 333796E791 for ; Fri, 9 Sep 2016 11:01:22 +0000 (UTC) Received: by mail-wm0-x241.google.com with SMTP id a6so2215748wmc.2 for ; Fri, 09 Sep 2016 04:01:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:subject:date:message-id:in-reply-to:references; bh=RBJSS0h7g7tcZYdUTgCse0QmjG/fDw9rGdZ5UpaCWms=; b=QO0tcxh67Z1wxHTa+kvvTMVjr7pgAz/TP6w2N/4zNuc8CXyWx6YzIPtvy2vZlc69Jd UfW6oye+/rWyp44rPap1NYEMxZjjlt3bSQVKowpYfB65lm0lD2bw3zNp7dKaSxyoNIuz LA598zwU45IZP+dnlulM+I1ts5maryo39Vb0pkqObo1Np4PDVjRWj3791UJXmN/DlTy7 XqtV2nljQThWgJJBJtBzLGuYfSnXF/JMsFwCIEfwtmTuwfax42IHI0IV/3eA1rNddxog goFEO1MaJcfvxRJznqX+hI7yqKnbRplee/qgBkoc4LUGVqueUtP89ONfYe2XSQFQ38oN SR1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:subject:date:message-id :in-reply-to:references; bh=RBJSS0h7g7tcZYdUTgCse0QmjG/fDw9rGdZ5UpaCWms=; b=KLJgVELQS9r2/zXQ7Uqi861LXcYsRiM8764mI8fvFke6DUK4QMWzF5vqpPA6MwXSNd knjXPjPfGUKjEzRea94njVbevH9ZmMzBVm5DX72UwzVoMm3W4VMKz2WYKQCROYQOq8po KKwdFilJ/RA+0V9u/e49pnnV/IVh8qgNvFkxlYRuhCONiVJ0dNY4426BTneRv2pVh3Uc ed5OIDhhK7I7pUOaKkAvpPyQRfdqqj1DW40T4s7a5ftslKCPKkJBAlqaxDh13gQc4tpn m4aTeckIFl/X5MWytLfYS3OWDQRmI80j5Qs5xF9hvKQZTiZh9OAfmUZGrLYLAKNnbOxL VkQQ== X-Gm-Message-State: AE9vXwNh0TmabA8rBmH8Oo1/M36eZ4rSVeB/lh4a1yFEdsR7jsnazBRHMlhcA/dr3TxQjw== X-Received: by 10.28.156.16 with SMTP id f16mr2258704wme.56.1473418880646; Fri, 09 Sep 2016 04:01:20 -0700 (PDT) Received: from haswell.alporthouse.com ([78.156.65.138]) by smtp.gmail.com with ESMTPSA id kq2sm2786494wjc.41.2016.09.09.04.01.19 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 09 Sep 2016 04:01:19 -0700 (PDT) From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 9 Sep 2016 12:00:54 +0100 Message-Id: <20160909110101.31967-14-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.9.3 In-Reply-To: <20160909110101.31967-1-chris@chris-wilson.co.uk> References: <20160909110101.31967-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [CI 14/21] drm/i915: Drive request submission through fence callbacks X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Drive final request submission from a callback from the fence. This way the request is queued until all dependencies are resolved, at which point it is handed to the backend for queueing to hardware. At this point, no dependencies are set on the request, so the callback is immediate. A side-effect of imposing a heavier-irqsafe spinlock for execlist submission is that we lose the softirq enabling after scheduling the execlists tasklet. To compensate, we manually kickstart the softirq by disabling and enabling the bh around the fence signaling. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen Reviewed-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c | 3 +++ drivers/gpu/drm/i915/i915_gem_request.c | 27 ++++++++++++++++++++++++++- drivers/gpu/drm/i915/i915_gem_request.h | 3 +++ drivers/gpu/drm/i915/i915_guc_submission.c | 3 ++- drivers/gpu/drm/i915/intel_breadcrumbs.c | 3 +++ drivers/gpu/drm/i915/intel_lrc.c | 5 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 8 ++++++++ 7 files changed, 48 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 674e0eaf39ea..89a5f8d948e7 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2549,6 +2549,9 @@ i915_gem_find_active_request(struct intel_engine_cs *engine) if (i915_gem_request_completed(request)) continue; + if (!i915_sw_fence_done(&request->submit)) + break; + return request; } diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 64c370681a81..074fc06ff488 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -318,6 +318,26 @@ static int i915_gem_get_seqno(struct drm_i915_private *dev_priv, u32 *seqno) return 0; } +static int __i915_sw_fence_call +submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) +{ + struct drm_i915_gem_request *request = + container_of(fence, typeof(*request), submit); + + /* Will be called from irq-context when using foreign DMA fences */ + + switch (state) { + case FENCE_COMPLETE: + request->engine->submit_request(request); + break; + + case FENCE_FREE: + break; + } + + return NOTIFY_DONE; +} + /** * i915_gem_request_alloc - allocate a request structure * @@ -396,6 +416,8 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, engine->fence_context, seqno); + i915_sw_fence_init(&req->submit, submit_notify); + INIT_LIST_HEAD(&req->active_list); req->i915 = dev_priv; req->engine = engine; @@ -530,7 +552,10 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) reserved_tail, ret); i915_gem_mark_busy(engine); - engine->submit_request(request); + + local_bh_disable(); + i915_sw_fence_commit(&request->submit); + local_bh_enable(); /* Kick the execlists tasklet if just scheduled */ } static void reset_wait_queue(wait_queue_head_t *q, wait_queue_t *wait) diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h index def35721e9ed..e141b1cca16a 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.h +++ b/drivers/gpu/drm/i915/i915_gem_request.h @@ -28,6 +28,7 @@ #include #include "i915_gem.h" +#include "i915_sw_fence.h" struct intel_wait { struct rb_node node; @@ -82,6 +83,8 @@ struct drm_i915_gem_request { struct intel_ring *ring; struct intel_signal_node signaling; + struct i915_sw_fence submit; + /** GEM sequence number associated with the previous request, * when the HWS breadcrumb is equal to this the GPU is processing * this request. diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index d5a4e9edccc5..0eb6b71935cf 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1016,7 +1016,8 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv) /* Replay the current set of previously submitted requests */ list_for_each_entry(request, &engine->request_list, link) - i915_guc_submit(request); + if (i915_sw_fence_done(&request->submit)) + i915_guc_submit(request); } return 0; diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c index 2491e4c1eaf0..9bad14d22c95 100644 --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c @@ -462,7 +462,10 @@ static int intel_breadcrumbs_signaler(void *arg) */ intel_engine_remove_wait(engine, &request->signaling.wait); + + local_bh_disable(); fence_signal(&request->fence); + local_bh_enable(); /* kick start the tasklets */ /* Find the next oldest signal. Note that as we have * not been holding the lock, another client may diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 61549a623e2c..16d7cdd11082 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -590,14 +590,15 @@ static void intel_lrc_irq_handler(unsigned long data) static void execlists_submit_request(struct drm_i915_gem_request *request) { struct intel_engine_cs *engine = request->engine; + unsigned long flags; - spin_lock_bh(&engine->execlist_lock); + spin_lock_irqsave(&engine->execlist_lock, flags); list_add_tail(&request->execlist_link, &engine->execlist_queue); if (execlists_elsp_idle(engine)) tasklet_hi_schedule(&engine->irq_tasklet); - spin_unlock_bh(&engine->execlist_lock); + spin_unlock_irqrestore(&engine->execlist_lock, flags); } int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 32f527447310..7f64d611159b 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -226,7 +226,15 @@ struct intel_engine_cs { #define I915_DISPATCH_PINNED BIT(1) #define I915_DISPATCH_RS BIT(2) int (*emit_request)(struct drm_i915_gem_request *req); + + /* Pass the request to the hardware queue (e.g. directly into + * the legacy ringbuffer or to the end of an execlist). + * + * This is called from an atomic context with irqs disabled; must + * be irq safe. + */ void (*submit_request)(struct drm_i915_gem_request *req); + /* Some chipsets are not quite as coherent as advertised and need * an expensive kick to force a true read of the up-to-date seqno. * However, the up-to-date seqno is not always required and the last