From patchwork Fri Jun 29 07:53:33 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 10495701
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 29 Jun 2018 08:53:33 +0100
Message-Id: <20180629075348.27358-22-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180629075348.27358-1-chris@chris-wilson.co.uk>
References: <20180629075348.27358-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 22/37] drm/i915: Allow contexts to share a single timeline across all engines
List-Id: Intel graphics driver community testing & development

Previously, our view has always been to run the engines independently
within a context. (Multiple engines happened before we had contexts and
timelines, so they always operated independently and that behaviour
persisted into contexts.) However, at the user level the context often
represents a single timeline (e.g.
GL contexts) and userspace must ensure that the individual engines are
serialised to present that ordering to the client (or forgets about this
detail entirely and hopes no one notices - a fair ploy if the client can
only directly control one engine itself ;)

In the next patch, we will want to construct a set of engines that
operate as one, that have a single timeline interwoven between them, to
present a single virtual engine to the user. (They submit to the virtual
engine, and we then decide which engine to execute on.)

To that end, we want to be able to create contexts which have a single
timeline (fence context) shared between all engines, rather than
multiple independent timelines.

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem_context.c | 26 ++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h |  3 ++
 drivers/gpu/drm/i915/i915_request.c     | 10 +++++--
 drivers/gpu/drm/i915/i915_request.h     |  5 +++-
 drivers/gpu/drm/i915/i915_sw_fence.c    | 39 +++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_sw_fence.h    | 13 +++++++--
 drivers/gpu/drm/i915/intel_lrc.c        |  5 +++-
 include/uapi/drm/i915_drm.h             |  3 +-
 8 files changed, 91 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f377856a8edf..fa3212349256 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,9 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 		ce->ops->destroy(ce);
 	}
 
+	if (ctx->timeline)
+		i915_timeline_put(ctx->timeline);
+
 	kfree(ctx->name);
 	put_pid(ctx->pid);
 
@@ -359,6 +362,7 @@ static void __destroy_hw_context(struct i915_gem_context *ctx,
 }
 
 #define CREATE_VM BIT(0)
+#define CREATE_TIMELINE BIT(1)
 
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *dev_priv,
@@ -369,6 +373,9 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
+	if (flags & CREATE_TIMELINE && !HAS_EXECLISTS(dev_priv))
+		return ERR_PTR(-EINVAL);
+
 	/* Reap the most stale context */
 	contexts_free_first(dev_priv);
 
@@ -391,6 +398,18 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 		ctx->desc_template = default_desc_template(dev_priv, ppgtt);
 	}
 
+	if (flags & CREATE_TIMELINE) {
+		struct i915_timeline *timeline;
+
+		timeline = i915_timeline_create(dev_priv, ctx->name);
+		if (IS_ERR(timeline)) {
+			__destroy_hw_context(ctx, file_priv);
+			return ERR_CAST(timeline);
+		}
+
+		ctx->timeline = timeline;
+	}
+
 	trace_i915_context_create(ctx);
 
 	return ctx;
@@ -733,7 +752,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (args->pad != 0)
 		return -EINVAL;
 
-	if (args->flags & ~I915_GEM_CONTEXT_SHARE_GTT)
+	if (args->flags &
+	    ~(I915_GEM_CONTEXT_SHARE_GTT |
+	      I915_GEM_CONTEXT_SINGLE_TIMELINE))
 		return -EINVAL;
 
 	if (client_is_banned(file_priv)) {
@@ -757,6 +778,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		flags &= ~CREATE_VM;
 	}
 
+	if (args->flags & I915_GEM_CONTEXT_SINGLE_TIMELINE)
+		flags |= CREATE_TIMELINE;
+
 	err = i915_mutex_lock_interruptible(dev);
 	if (err)
 		goto out;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index b116e4942c10..5e883d688afe 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -41,6 +41,7 @@ struct drm_i915_private;
 struct drm_i915_file_private;
 struct i915_hw_ppgtt;
 struct i915_request;
+struct i915_timeline;
 struct i915_vma;
 struct intel_ring;
 
@@ -66,6 +67,8 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	struct i915_timeline *timeline;
+
 	/**
 	 * @ppgtt: unique address space (GTT)
 	 *
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 51f28786b0e4..6305278b8040 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1037,8 +1037,14 @@ void i915_request_add(struct i915_request *request)
 	prev = i915_gem_active_raw(&timeline->last_request,
 				   &request->i915->drm.struct_mutex);
 	if (prev && !i915_request_completed(prev)) {
-		i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
-					     &request->submitq);
+		if (prev->engine == engine)
+			i915_sw_fence_await_sw_fence(&request->submit,
+						     &prev->submit,
+						     &request->submitq);
+		else
+			__i915_sw_fence_await_dma_fence(&request->submit,
+							&prev->fence,
+							&request->dmaq);
 		if (engine->schedule)
 			__i915_sched_node_add_dependency(&request->sched,
 							 &prev->sched,
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index e1c9365dfefb..74b8d416b9e9 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -108,7 +108,10 @@ struct i915_request {
 	 * It is used by the driver to then queue the request for execution.
 	 */
 	struct i915_sw_fence submit;
-	wait_queue_entry_t submitq;
+	union {
+		wait_queue_entry_t submitq;
+		struct i915_sw_dma_fence_cb dmaq;
+	};
 	wait_queue_head_t execute;
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 1de5173e53a2..376f933bfb4c 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -362,11 +362,6 @@ int i915_sw_fence_await_sw_fence_gfp(struct i915_sw_fence *fence,
 	return __i915_sw_fence_await_sw_fence(fence, signaler, NULL, gfp);
 }
 
-struct i915_sw_dma_fence_cb {
-	struct dma_fence_cb base;
-	struct i915_sw_fence *fence;
-};
-
 struct i915_sw_dma_fence_cb_timer {
 	struct i915_sw_dma_fence_cb base;
 	struct dma_fence *dma;
@@ -482,6 +477,40 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 	return ret;
 }
 
+static void __dma_i915_sw_fence_wake(struct dma_fence *dma,
+				     struct dma_fence_cb *data)
+{
+	struct i915_sw_dma_fence_cb *cb = container_of(data, typeof(*cb), base);
+
+	i915_sw_fence_complete(cb->fence);
+}
+
+int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
+				    struct dma_fence *dma,
+				    struct i915_sw_dma_fence_cb *cb)
+{
+	int ret;
+
+	debug_fence_assert(fence);
+
+	if (dma_fence_is_signaled(dma))
+		return 0;
+
+	cb->fence = fence;
+	i915_sw_fence_await(fence);
+
+	ret = dma_fence_add_callback(dma, &cb->base, __dma_i915_sw_fence_wake);
+	if (ret == 0) {
+		ret = 1;
+	} else {
+		i915_sw_fence_complete(fence);
+		if (ret == -ENOENT) /* fence already signaled */
+			ret = 0;
+	}
+
+	return ret;
+}
+
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    struct reservation_object *resv,
 				    const struct dma_fence_ops *exclude,
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index fe2ef4dadfc6..914a734d49bc 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,14 +10,13 @@
 #ifndef _I915_SW_FENCE_H_
 #define _I915_SW_FENCE_H_
 
+#include <linux/dma-fence.h>
 #include <linux/gfp.h>
 #include <linux/kref.h>
 #include <linux/notifier.h> /* for NOTIFY_DONE */
 #include <linux/wait.h>
 
 struct completion;
-struct dma_fence;
-struct dma_fence_ops;
 struct reservation_object;
 
 struct i915_sw_fence {
@@ -69,10 +68,20 @@ int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence,
 int i915_sw_fence_await_sw_fence_gfp(struct i915_sw_fence *fence,
 				     struct i915_sw_fence *after,
 				     gfp_t gfp);
+
+struct i915_sw_dma_fence_cb {
+	struct dma_fence_cb base;
+	struct i915_sw_fence *fence;
+};
+
+int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
+				    struct dma_fence *dma,
+				    struct i915_sw_dma_fence_cb *cb);
 int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 				  struct dma_fence *dma,
 				  unsigned long timeout,
 				  gfp_t gfp);
+
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    struct reservation_object *resv,
 				    const struct dma_fence_ops *exclude,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0121e1fd3160..8a4a1c0b986e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2836,7 +2836,10 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 		goto error_deref_obj;
 	}
 
-	timeline = i915_timeline_create(ctx->i915, ctx->name);
+	if (ctx->timeline)
+		timeline = i915_timeline_get(ctx->timeline);
+	else
+		timeline = i915_timeline_create(ctx->i915, ctx->name);
 	if (IS_ERR(timeline)) {
 		ret = PTR_ERR(timeline);
 		goto error_deref_obj;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index cb8a1354633d..f50065413bfa 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1411,7 +1411,8 @@ struct drm_i915_gem_context_create_v2 {
 	/* output: id of new context*/
 	__u32 ctx_id;
 	__u32 flags;
-#define I915_GEM_CONTEXT_SHARE_GTT 0x1
+#define I915_GEM_CONTEXT_SHARE_GTT		0x1
+#define I915_GEM_CONTEXT_SINGLE_TIMELINE	0x2
 	__u32 share_ctx;
 	__u32 pad;
 };
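
For reference, a minimal userspace sketch of how the new flag would be
consumed, assuming the v2 context-create ioctl added earlier in this
series; DRM_IOCTL_I915_GEM_CONTEXT_CREATE_v2 and the helper name below
are illustrative placeholders, not definitions from this patch. Note
that the kernel rejects the flag with -EINVAL when execlists are not
available, so userspace should be prepared to fall back to per-engine
timelines.

/* Hypothetical usage sketch; the ioctl request macro is assumed. */
#include <errno.h>
#include <string.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static int create_single_timeline_ctx(int i915_fd, __u32 *ctx_id)
{
	struct drm_i915_gem_context_create_v2 arg;

	memset(&arg, 0, sizeof(arg));
	/* Ask for one fence context shared by all engines of the context. */
	arg.flags = I915_GEM_CONTEXT_SINGLE_TIMELINE;

	if (drmIoctl(i915_fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_v2, &arg))
		return -errno; /* e.g. -EINVAL without execlists support */

	*ctx_id = arg.ctx_id;
	return 0;
}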