From patchwork Sun Apr 24 18:10:09 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 8920641 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id DCCD8BF29F for ; Sun, 24 Apr 2016 18:11:36 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 941A42026F for ; Sun, 24 Apr 2016 18:11:35 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 6FF9720222 for ; Sun, 24 Apr 2016 18:11:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EC7706E40A; Sun, 24 Apr 2016 18:11:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) by gabe.freedesktop.org (Postfix) with ESMTPS id DB7666E215 for ; Sun, 24 Apr 2016 18:10:46 +0000 (UTC) Received: by mail-wm0-x241.google.com with SMTP id n3so19145078wmn.1 for ; Sun, 24 Apr 2016 11:10:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=y+X3gUU/ZGM5Xc1jInp0LiW1q5G/dO42Yzy+5ZVDo6Q=; b=fkBwiCWOIYKq4hqlu8rFFOhUZrpCLzIczJpfzcdjL7PeLYxsx16vZe88qi6/gtsiLd i6lmaqO5frhH5qQpoAlJgIdRBul7XnBIzh3kV0xUa/oxrDdmBOPOhwbDc6wx5Ysq4xkq 8nrkwOh3s9wID3tMQ6yAoiJ2HaGHEpR0cKZgalQeQ+c4wb9ZmGVhtK7HRDQTnZ8ZNVNr h0HUhPk/c/OI6o3e1QI3mFGSFVWhFZ4D4dw72l9SW4WGLiP1t2y36il9hHHIA+7K48kk xA9O7MnphmbXn1PMnDJp6wVu+NTDNoQtPH3IoGG0BSZsMwTFAbP/mwdouO1ypMU61ZvN kVGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=y+X3gUU/ZGM5Xc1jInp0LiW1q5G/dO42Yzy+5ZVDo6Q=; b=jPC248+vz9kCLU4dg7w3m8Nw2eAXxkL1Y6keyq4av0cWMJmhLlMPXV34coGDg6r8ox CbPUyhnqq9wrdki1EBrSAJhBK5QNnSq+hpqFgpDZdUK5XfioRh+CqnfJM4ugPLdbZ4Yu kM1NC+BVJlPrOL+GYABUojNAS5DGps22SngWZN0RB/uLdGJqnMSSAJ59EKkCi5ambrOV muyfcgypKHdzWOu8X3xCiZ3uXAGVcT0pDtYi25esLAR3zJOC96MCMchBD8jHdYLIug/I vTr81GHRADHkQWYFyNeqIaybBavunv/gvz6G2RnLDW6tEQ1cXXUKEnhyr3lNM1r2/443 OG5Q== X-Gm-Message-State: AOPr4FWdgyRpa4GBDA/F9XG/kH4EDYt/IPs9XlFisEH618tRwOuv4nHwQTPPAbAXvAM5Tg== X-Received: by 10.28.26.78 with SMTP id a75mr3261877wma.59.1461521444755; Sun, 24 Apr 2016 11:10:44 -0700 (PDT) Received: from haswell.alporthouse.com ([78.156.65.138]) by smtp.gmail.com with ESMTPSA id lh1sm19590888wjb.20.2016.04.24.11.10.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 24 Apr 2016 11:10:43 -0700 (PDT) From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sun, 24 Apr 2016 19:10:09 +0100 Message-Id: <1461521419-18086-11-git-send-email-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1461521419-18086-1-git-send-email-chris@chris-wilson.co.uk> References: <1461521419-18086-1-git-send-email-chris@chris-wilson.co.uk> Cc: Mika Kuoppala Subject: [Intel-gfx] [PATCH v3 11/21] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,RP_MATCHES_RCVD,T_DKIM_INVALID,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The code to switch_mm() is already handled by i915_switch_context(), the only difference required to setup the aliasing ppgtt is that we need to emit te switch_mm() on the first context, i.e. when transitioning from engine->last_context == NULL. This allows us to defer the initialisation of the GPU from early device initialisation to first use, which should marginally speed up both. The caveat is that we then defer the context initialisation until first use - i.e. we cannot assume that the GPU engines are initialised. For example, this means that power contexts for rc6 (Ironlake) need to explicitly loaded, as they are. Signed-off-by: Chris Wilson Cc: Mika Kuoppala Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/i915_gem.c | 30 ------------- drivers/gpu/drm/i915/i915_gem_context.c | 77 +++++++++++++-------------------- drivers/gpu/drm/i915/i915_gem_gtt.c | 14 ------ drivers/gpu/drm/i915/i915_gem_gtt.h | 1 - 5 files changed, 31 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7dd79c9a8ce3..7adc7c83a116 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3304,7 +3304,6 @@ void i915_gem_context_lost(struct drm_i915_private *dev_priv); void i915_gem_context_fini(struct drm_device *dev); void i915_gem_context_reset(struct drm_device *dev); int i915_gem_context_open(struct drm_device *dev, struct drm_file *file); -int i915_gem_context_enable(struct drm_i915_gem_request *req); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); int i915_switch_context(struct drm_i915_gem_request *req); struct intel_context * diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index aafcb4942acf..3b294dcf0add 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4907,36 +4907,6 @@ i915_gem_init_hw(struct drm_device *dev) * on re-initialisation */ ret = i915_gem_set_seqno(dev, dev_priv->next_seqno+0x100); - if (ret) - goto out; - - /* Now it is safe to go back round and do everything else: */ - for_each_engine(engine, dev_priv) { - struct drm_i915_gem_request *req; - - req = i915_gem_request_alloc(engine, NULL); - if (IS_ERR(req)) { - ret = PTR_ERR(req); - break; - } - - ret = i915_ppgtt_init_ring(req); - if (ret) - goto err_request; - - ret = i915_gem_context_enable(req); - if (ret) - goto err_request; - -err_request: - i915_add_request_no_flush(req); - if (ret) { - DRM_ERROR("Failed to enable %s, error=%d\n", - engine->name, ret); - i915_gem_cleanup_engines(dev); - break; - } - } out: intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index e5b3d74f8222..4312e2382453 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -431,27 +431,6 @@ void i915_gem_context_fini(struct drm_device *dev) dev_priv->kernel_context = NULL; } -int i915_gem_context_enable(struct drm_i915_gem_request *req) -{ - struct intel_engine_cs *engine = req->engine; - int ret; - - if (i915.enable_execlists) { - if (engine->init_context == NULL) - return 0; - - ret = engine->init_context(req); - } else - ret = i915_switch_context(req); - - if (ret) { - DRM_ERROR("ring init context: %d\n", ret); - return ret; - } - - return 0; -} - static int context_idr_cleanup(int id, void *p, void *data) { struct intel_context *ctx = p; @@ -630,7 +609,8 @@ static int remap_l3(struct drm_i915_gem_request *req, int slice) return 0; } -static inline bool skip_rcs_switch(struct intel_engine_cs *engine, +static inline bool skip_rcs_switch(struct i915_hw_ppgtt *ppgtt, + struct intel_engine_cs *engine, struct intel_context *to) { if (to->remap_slice) @@ -639,21 +619,27 @@ static inline bool skip_rcs_switch(struct intel_engine_cs *engine, if (!to->legacy_hw_ctx.initialized) return false; - if (to->ppgtt && - !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) + if (ppgtt && (intel_engine_flag(engine) & ppgtt->pd_dirty_rings)) return false; return to == engine->last_context; } static bool -needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to) +needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt, + struct intel_engine_cs *engine, + struct intel_context *to) { - if (!to->ppgtt) + if (!ppgtt) return false; + /* Always load the ppgtt on first use */ + if (!engine->last_context) + return true; + + /* Same context without new entries, skip */ if (engine->last_context == to && - !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) + !(intel_engine_flag(engine) & ppgtt->pd_dirty_rings)) return false; if (engine->id != RCS) @@ -666,9 +652,11 @@ needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to) } static bool -needs_pd_load_post(struct intel_context *to, u32 hw_flags) +needs_pd_load_post(struct i915_hw_ppgtt *ppgtt, + struct intel_context *to, + u32 hw_flags) { - if (!to->ppgtt) + if (!ppgtt) return false; if (!IS_GEN8(to->i915)) @@ -684,11 +672,12 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) { struct intel_context *to = req->ctx; struct intel_engine_cs *engine = req->engine; + struct i915_hw_ppgtt *ppgtt = to->ppgtt ?: req->i915->mm.aliasing_ppgtt; struct intel_context *from; u32 hw_flags; int ret, i; - if (skip_rcs_switch(engine, to)) + if (skip_rcs_switch(ppgtt, engine, to)) return 0; /* Trying to pin first makes error handling easier. */ @@ -719,13 +708,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) if (ret) goto unpin_out; - if (needs_pd_load_pre(engine, to)) { + if (needs_pd_load_pre(ppgtt, engine, to)) { /* Older GENs and non render rings still want the load first, * "PP_DCLV followed by PP_DIR_BASE register through Load * Register Immediate commands in Ring Buffer before submitting * a context."*/ trace_switch_mm(engine, to); - ret = to->ppgtt->switch_mm(to->ppgtt, req); + ret = ppgtt->switch_mm(ppgtt, req); if (ret) goto unpin_out; } @@ -736,16 +725,11 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) * space. This means we must enforce that a page table load * occur when this occurs. */ hw_flags = MI_RESTORE_INHIBIT; - else if (to->ppgtt && - intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings) + else if (ppgtt && intel_engine_flag(engine) & ppgtt->pd_dirty_rings) hw_flags = MI_FORCE_RESTORE; else hw_flags = 0; - /* We should never emit switch_mm more than once */ - WARN_ON(needs_pd_load_pre(engine, to) && - needs_pd_load_post(to, hw_flags)); - if (to != from || (hw_flags & MI_FORCE_RESTORE)) { ret = mi_set_context(req, hw_flags); if (ret) @@ -780,9 +764,9 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) /* GEN8 does *not* require an explicit reload if the PDPs have been * setup, and we do not wish to move them. */ - if (needs_pd_load_post(to, hw_flags)) { + if (needs_pd_load_post(ppgtt, to, hw_flags)) { trace_switch_mm(engine, to); - ret = to->ppgtt->switch_mm(to->ppgtt, req); + ret = ppgtt->switch_mm(ppgtt, req); /* The hardware context switch is emitted, but we haven't * actually changed the state - so it's probably safe to bail * here. Still, let the user know something dangerous has @@ -792,8 +776,8 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) return ret; } - if (to->ppgtt) - to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine); + if (ppgtt) + ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine); for (i = 0; i < MAX_L3_SLICES; i++) { if (!(to->remap_slice & (1<id != RCS || req->ctx->legacy_hw_ctx.rcs_state == NULL) { struct intel_context *to = req->ctx; + struct i915_hw_ppgtt *ppgtt = + to->ppgtt ?: req->i915->mm.aliasing_ppgtt; - if (needs_pd_load_pre(engine, to)) { + if (needs_pd_load_pre(ppgtt, engine, to)) { int ret; trace_switch_mm(engine, to); - ret = to->ppgtt->switch_mm(to->ppgtt, req); + ret = ppgtt->switch_mm(ppgtt, req); if (ret) return ret; - /* Doing a PD load always reloads the page dirs */ - to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine); + ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine); } if (to != engine->last_context) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 942b98f910f8..750569af4b0e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2186,20 +2186,6 @@ int i915_ppgtt_init_hw(struct drm_device *dev) return 0; } -int i915_ppgtt_init_ring(struct drm_i915_gem_request *req) -{ - struct drm_i915_private *dev_priv = req->i915; - struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; - - if (i915.enable_execlists) - return 0; - - if (!ppgtt) - return 0; - - return ppgtt->switch_mm(ppgtt, req); -} - struct i915_hw_ppgtt * i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index b0ae6632c01a..114850368bca 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -522,7 +522,6 @@ void i915_ggtt_cleanup_hw(struct drm_device *dev); int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt); int i915_ppgtt_init_hw(struct drm_device *dev); -int i915_ppgtt_init_ring(struct drm_i915_gem_request *req); void i915_ppgtt_release(struct kref *kref); struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv);