From patchwork Fri Sep 14 16:09:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10600999 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F3C7614DA for ; Fri, 14 Sep 2018 16:09:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DDC042BB06 for ; Fri, 14 Sep 2018 16:09:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D1E8E2BB30; Fri, 14 Sep 2018 16:09:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 778052BB06 for ; Fri, 14 Sep 2018 16:09:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 488606E939; Fri, 14 Sep 2018 16:09:44 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0ECF86E935 for ; Fri, 14 Sep 2018 16:09:41 +0000 (UTC) Received: by mail-wm1-x342.google.com with SMTP id b19-v6so2495196wme.3 for ; Fri, 14 Sep 2018 09:09:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=+z1bcIflRWV4RDfRAZrnfCHKyRhRHB584Fi/sUksZDk=; b=Jqr4x0gwpp41MeiH1fsXvHGNVWDnyVtliq49T3/mrczkQmV0Phiuflxt7VzCTBU/Zs fZnj1Wqs9hi5/BMNqpSyWBh/ga+GlmwOjo/T0cfgKVRf1GsX/6FiFhyeQgeL4+IiOZ8u n77f6cwXnZhDT/dkMfSIj2oZiqcu54AL1MW9Eb51n94htx/Y2l8njqOKhNM5jDuylK83 o5ZjE56s1zNXc8lzp1kNl/dQ4z/0h9QrUAz8XAXWgEDaqpVR3wUlQXsYYVXp+1jFc8To JjAFm29/dD7LQUa11kLG82KHJN+KHcDXY4XB1Zh+qgH5TlVHpaM8opX0ynjz1bwrBC4l OkYQ== X-Gm-Message-State: APzg51AD+qSJHLc/qUpHY+T7k+p+BW69q9UMWDFCLW3CxQIa/cogF2MC I+5p511dvsTsicLIVlen2i1hWVxAIk8= X-Google-Smtp-Source: ANB0VdamcMkO+nJdA0xxbnv+r93hSBSOjtGabPpyrna/5d4Uv4PDE+g6QgoVfgegJInOaazhJudplg== X-Received: by 2002:a1c:be06:: with SMTP id o6-v6mr3045811wmf.65.1536941379437; Fri, 14 Sep 2018 09:09:39 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id z141-v6sm2651536wmc.3.2018.09.14.09.09.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 14 Sep 2018 09:09:38 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 14 Sep 2018 17:09:27 +0100 Message-Id: <20180914160932.16457-2-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> References: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 1/6] drm/i915/execlists: Move RPCS setup to context pin X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Configuring RPCS in context image just before pin is sufficient and will come extra handy in one of the following patches. v2: * Split image setup a bit differently. (Chris Wilson) v3: * Update context image after reset as well - otherwise the application of pinned default state clears the RPCS. Signed-off-by: Tvrtko Ursulin Suggested-by: Chris Wilson Cc: Chris Wilson Reviewed-by: Chris Wilson # v2 --- drivers/gpu/drm/i915/intel_lrc.c | 47 ++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index a51be16ddaac..4fcff1be91c9 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,6 +1305,26 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } +static u32 make_rpcs(struct drm_i915_private *dev_priv); + +static void +__execlists_update_reg_state(struct intel_engine_cs *engine, + struct intel_context *ce) +{ + u32 *regs = ce->lrc_reg_state; + struct intel_ring *ring = ce->ring; + + regs[CTX_RING_BUFFER_START + 1] = i915_ggtt_offset(ring->vma); + regs[CTX_RING_HEAD + 1] = ring->head; + regs[CTX_RING_TAIL + 1] = ring->tail; + + /* RPCS */ + if (engine->class == RENDER_CLASS) { + ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = + make_rpcs(engine->i915); + } +} + static struct intel_context * __execlists_context_pin(struct intel_engine_cs *engine, struct i915_gem_context *ctx, @@ -1343,10 +1363,8 @@ __execlists_context_pin(struct intel_engine_cs *engine, GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head)); ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE; - ce->lrc_reg_state[CTX_RING_BUFFER_START+1] = - i915_ggtt_offset(ce->ring->vma); - ce->lrc_reg_state[CTX_RING_HEAD + 1] = ce->ring->head; - ce->lrc_reg_state[CTX_RING_TAIL + 1] = ce->ring->tail; + + __execlists_update_reg_state(engine, ce); ce->state->obj->pin_global++; i915_gem_context_get(ctx); @@ -1955,14 +1973,14 @@ static void execlists_reset(struct intel_engine_cs *engine, engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE, engine->context_size - PAGE_SIZE); } - execlists_init_reg_state(regs, - request->gem_context, engine, request->ring); /* Move the RING_HEAD onto the breadcrumb, past the hanging batch */ - regs[CTX_RING_BUFFER_START + 1] = i915_ggtt_offset(request->ring->vma); - request->ring->head = intel_ring_wrap(request->ring, request->postfix); - regs[CTX_RING_HEAD + 1] = request->ring->head; + + execlists_init_reg_state(regs, request->gem_context, engine, + request->ring); + + __execlists_update_reg_state(engine, request->hw_context); intel_ring_update_space(request->ring); @@ -2710,8 +2728,7 @@ static void execlists_init_reg_state(u32 *regs, if (rcs) { regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1); - CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, - make_rpcs(dev_priv)); + CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, 0); i915_oa_init_reg_state(engine, ctx, regs); } @@ -2872,12 +2889,8 @@ void intel_lr_context_resume(struct drm_i915_private *i915) intel_ring_reset(ce->ring, 0); - if (ce->pin_count) { /* otherwise done in context_pin */ - u32 *regs = ce->lrc_reg_state; - - regs[CTX_RING_HEAD + 1] = ce->ring->head; - regs[CTX_RING_TAIL + 1] = ce->ring->tail; - } + if (ce->pin_count) /* otherwise done in context_pin */ + __execlists_update_reg_state(engine, ce); } } } From patchwork Fri Sep 14 16:09:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10600997 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7155714DA for ; Fri, 14 Sep 2018 16:09:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A9792BB06 for ; Fri, 14 Sep 2018 16:09:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4D3682BB30; Fri, 14 Sep 2018 16:09:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B43D02BB06 for ; Fri, 14 Sep 2018 16:09:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5FC386E93A; Fri, 14 Sep 2018 16:09:44 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by gabe.freedesktop.org (Postfix) with ESMTPS id EC0FD6E935 for ; Fri, 14 Sep 2018 16:09:41 +0000 (UTC) Received: by mail-wm1-x344.google.com with SMTP id b19-v6so2495254wme.3 for ; Fri, 14 Sep 2018 09:09:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=WJZ7j0O4TXbUJ1MZ4czuSorsLahZUgUBnI2+K4FOMs8=; b=Zs276guOxGEO1tVlsxdegUXzm/GTiZtq/57BzH3l+W+25P0pJZwBe2VoLSCEjGiUuM ZpdBl4LiyR9DZMQK3kuPhB6ZUkz0ddgbfv2Ptv9GQhv4sVQgbYIJZYoDO6nPQD4VrDzI ALYBG/6duXW223yohPb/5GK5L0FDGCAOg/iQc/PR7G9WFCnL8qsu8JCiggDlcfKikRdw UaVpYLqwgvRkFK7FkP6ERVtpu4pN03cYui2Dhqh5wtEvENBqQyGFTr5vPygHm3c7L5Gi 7sozgPHWCIWDc9epd2EajqOvvlvzb5VJoJ9Ep2xW+CcVMhx8Ng8FaqZ/o0+zsuxCElh7 5s7Q== X-Gm-Message-State: APzg51Cz/RANIYlyA1NFHnYQyazhmB/x70QfAjqAeExHCsGfq6B4E5Jd vLdxuCYcWQkqeNh5iA8H8nWAQfvXoKw= X-Google-Smtp-Source: ANB0VdYP/Ij2AKhh7Kybi/wKcEc5+XkUA9xpVUiikfjU4uwMlj4FBXLW8/1bHnI2nj/LfLT9RJvvnw== X-Received: by 2002:a1c:cb4d:: with SMTP id b74-v6mr3038720wmg.123.1536941380295; Fri, 14 Sep 2018 09:09:40 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id z141-v6sm2651536wmc.3.2018.09.14.09.09.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 14 Sep 2018 09:09:39 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 14 Sep 2018 17:09:28 +0100 Message-Id: <20180914160932.16457-3-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> References: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 2/6] drm/i915: Record the sseu configuration per-context & engine X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Lionel Landwerlin We want to expose the ability to reconfigure the slices, subslice and eu per context and per engine. To facilitate that, store the current configuration on the context for each engine, which is initially set to the device default upon creation. v2: record sseu configuration per context & engine (Chris) v3: introduce the i915_gem_context_sseu to store powergating programming, sseu_dev_info has grown quite a bit (Lionel) v4: rename i915_gem_sseu into intel_sseu (Chris) use to_intel_context() (Chris) v5: More to_intel_context() (Tvrtko) Switch intel_sseu from union to struct (Tvrtko) Move context default sseu in existing loop (Chris) v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko) Tvrtko Ursulin: v7: * Pass intel_sseu by pointer instead of value to make_rpcs. * Rebase for make_rpcs changes. v8: * Rebase for RPCS edit on pin. v9: * Rebase for context image setup changes. v10: * Rename dev_priv to i915. (Chris Wilson) Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 14 ++++++++++++ drivers/gpu/drm/i915/i915_gem_context.c | 2 ++ drivers/gpu/drm/i915/i915_gem_context.h | 4 ++++ drivers/gpu/drm/i915/i915_request.h | 10 +++++++++ drivers/gpu/drm/i915/intel_lrc.c | 30 ++++++++++++------------- 5 files changed, 45 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7d4daa7412f1..6b7ae63e47c3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3479,6 +3479,20 @@ mkwrite_device_info(struct drm_i915_private *dev_priv) return (struct intel_device_info *)&dev_priv->info; } +static inline struct intel_sseu +intel_device_default_sseu(struct drm_i915_private *i915) +{ + const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; + struct intel_sseu value = { + .slice_mask = sseu->slice_mask, + .subslice_mask = sseu->subslice_mask[0], + .min_eus_per_subslice = sseu->max_eus_per_subslice, + .max_eus_per_subslice = sseu->max_eus_per_subslice, + }; + + return value; +} + /* modesetting */ extern void intel_modeset_init_hw(struct drm_device *dev); extern int intel_modeset_init(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index f772593b99ab..0b8cc748648b 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -343,6 +343,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, struct intel_context *ce = &ctx->__engine[n]; ce->gem_context = ctx; + /* Use the whole device by default */ + ce->sseu = intel_device_default_sseu(dev_priv); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index 08165f6a0a84..7510de738b35 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -31,6 +31,7 @@ #include "i915_gem.h" #include "i915_scheduler.h" +#include "intel_device_info.h" struct pid; @@ -170,6 +171,9 @@ struct i915_gem_context { int pin_count; const struct intel_context_ops *ops; + + /** sseu: Control eu/slice partitioning */ + struct intel_sseu sseu; } __engine[I915_NUM_ENGINES]; /** ring_size: size for allocating the per-engine ring buffer */ diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 7fa94b024968..3a4be20ea74a 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -39,6 +39,16 @@ struct drm_i915_gem_object; struct i915_request; struct i915_timeline; +/* + * Powergating configuration for a particular (context,engine). + */ +struct intel_sseu { + u8 slice_mask; + u8 subslice_mask; + u8 min_eus_per_subslice; + u8 max_eus_per_subslice; +}; + struct intel_wait { struct rb_node node; struct task_struct *tsk; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 4fcff1be91c9..1c9cf41dab3f 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,7 +1305,8 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 make_rpcs(struct drm_i915_private *dev_priv); +static u32 +make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu); static void __execlists_update_reg_state(struct intel_engine_cs *engine, @@ -1321,7 +1322,7 @@ __execlists_update_reg_state(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) { ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = - make_rpcs(engine->i915); + make_rpcs(engine->i915, &ce->sseu); } } @@ -2508,18 +2509,19 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) } static u32 -make_rpcs(struct drm_i915_private *dev_priv) +make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) { - bool subslice_pg = INTEL_INFO(dev_priv)->sseu.has_subslice_pg; - u8 slices = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask); - u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]); + const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; + bool subslice_pg = sseu->has_subslice_pg; + u8 slices = hweight8(ctx_sseu->slice_mask); + u8 subslices = hweight8(ctx_sseu->subslice_mask); u32 rpcs = 0; /* * No explicit RPCS request is needed to ensure full * slice/subslice/EU enablement prior to Gen9. */ - if (INTEL_GEN(dev_priv) < 9) + if (INTEL_GEN(i915) < 9) return 0; /* @@ -2547,7 +2549,7 @@ make_rpcs(struct drm_i915_private *dev_priv) * subslices are enabled, or a count between one and four on the first * slice. */ - if (IS_GEN11(dev_priv) && slices == 1 && subslices >= 4) { + if (IS_GEN11(i915) && slices == 1 && subslices >= 4) { GEM_BUG_ON(subslices & 1); subslice_pg = false; @@ -2560,10 +2562,10 @@ make_rpcs(struct drm_i915_private *dev_priv) * must make an explicit request through RPCS for full * enablement. */ - if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) { + if (sseu->has_slice_pg) { u32 mask, val = slices; - if (INTEL_GEN(dev_priv) >= 11) { + if (INTEL_GEN(i915) >= 11) { mask = GEN11_RPCS_S_CNT_MASK; val <<= GEN11_RPCS_S_CNT_SHIFT; } else { @@ -2588,18 +2590,16 @@ make_rpcs(struct drm_i915_private *dev_priv) rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val; } - if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) { + if (sseu->has_eu_pg) { u32 val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK; From patchwork Fri Sep 14 16:09:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10600995 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C27D914DA for ; Fri, 14 Sep 2018 16:09:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ABE1F2BB06 for ; Fri, 14 Sep 2018 16:09:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9F04D2BB30; Fri, 14 Sep 2018 16:09:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1E7F82BB06 for ; Fri, 14 Sep 2018 16:09:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3588A6E935; Fri, 14 Sep 2018 16:09:44 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1A4B16E935 for ; Fri, 14 Sep 2018 16:09:43 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id c14-v6so2490187wmb.4 for ; Fri, 14 Sep 2018 09:09:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=dMTGgi88POJWE9CxY2D8v8mZSBvdlLFQIFsr5BD154M=; b=KMKlQ8mE+MzurqUbDwaXkgHZCZWA6nn0OfmX6OZBOG+RMonij7Bl73kiaUAoy0mUIN BQ1TWHKwcTKdjBPMgOASj8sUKXUZ3TjTS6sQqoN2XOObCDZnR5X6DfS1wVoSH83Jq2mK 47Oz4FzCw6WHqnWu+s4EgVKNBiBaokD3z/UkI42IeU7s+PtPJe0IQ5PB6PfwswupAT6E ASeSNNIJbIWB9pwic4Sat9pnF7ue8HnhSurhB1L5O/42Cd0mFfUeAcDAUeGbTfP7PZIK HX5LiB8pEWf3GetqHMS3InRCMhD6quqOOTU0ELwk8GnD3kNl0USwFtM1cVFeMA2HZAAG Wn8w== X-Gm-Message-State: APzg51CCKTErNBHn8uboZX8HrCQramFj4mfrU4YNzPtS3X386nH22Utr yjdjBkl8iHI1gioVLqg8wOPARuDbVj8= X-Google-Smtp-Source: ANB0VdZPfJLeNQT3gzA8WsyAasAtL/obSRTAt9dtcmsJnKyPY2Vbr4T0ImHCPfSWNWrESJ0wmuYt1Q== X-Received: by 2002:a7b:c04c:: with SMTP id u12-v6mr3062908wmc.24.1536941381433; Fri, 14 Sep 2018 09:09:41 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id z141-v6sm2651536wmc.3.2018.09.14.09.09.40 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 14 Sep 2018 09:09:40 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 14 Sep 2018 17:09:29 +0100 Message-Id: <20180914160932.16457-4-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> References: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 3/6] drm/i915/perf: lock powergating configuration to default when active X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Lionel Landwerlin If some of the contexts submitting workloads to the GPU have been configured to shutdown slices/subslices, we might loose the NOA configurations written in the NOA muxes. One possible solution to this problem is to reprogram the NOA muxes when we switch to a new context. We initially tried this in the workaround batchbuffer but some concerns where raised about the cost of reprogramming at every context switch. This solution is also not without consequences from the userspace point of view. Reprogramming of the muxes can only happen once the powergating configuration has changed (which happens after context switch). This means for a window of time during the recording, counters recorded by the OA unit might be invalid. This requires userspace dealing with OA reports to discard the invalid values. Minimizing the reprogramming could be implemented by tracking of the last programmed configuration somewhere in GGTT and use MI_PREDICATE to discard some of the programming commands, but the command streamer would still have to parse all the MI_LRI instructions in the workaround batchbuffer. Another solution, which this change implements, is to simply disregard the user requested configuration for the period of time when i915/perf is active. There is no known issue with this apart from a performance penality for some media workloads that benefit from running on a partially powergated GPU. We already prevent RC6 from affecting the programming so it doesn't sound completely unreasonable to hold on powergating for the same reason. v2: Leave RPCS programming in intel_lrc.c (Lionel) v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel) More to_intel_context() (Tvrtko) s/dev_priv/i915/ (Tvrtko) Tvrtko Ursulin: v4: * Rebase for make_rpcs changes. v5: * Apply OA restriction from make_rpcs directly. v6: * Rebase for context image setup changes. v7: * Move stream assignment before metric enable. Signed-off-by: Lionel Landwerlin Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_perf.c | 13 +++++++++---- drivers/gpu/drm/i915/intel_lrc.c | 29 +++++++++++++++++++---------- drivers/gpu/drm/i915/intel_lrc.h | 2 ++ 3 files changed, 30 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 664b96bb65a3..46ddd1913fee 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1677,6 +1677,11 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, CTX_REG(reg_state, state_offset, flex_regs[i], value); } + + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, + gen8_make_rpcs(dev_priv, + &to_intel_context(ctx, + dev_priv->engine[RCS])->sseu)); } /* @@ -2092,6 +2097,9 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (ret) goto err_lock; + stream->ops = &i915_oa_stream_ops; + dev_priv->perf.oa.exclusive_stream = stream; + ret = dev_priv->perf.oa.ops.enable_metric_set(dev_priv, stream->oa_config); if (ret) { @@ -2099,15 +2107,12 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, goto err_enable; } - stream->ops = &i915_oa_stream_ops; - - dev_priv->perf.oa.exclusive_stream = stream; - mutex_unlock(&dev_priv->drm.struct_mutex); return 0; err_enable: + dev_priv->perf.oa.exclusive_stream = NULL; dev_priv->perf.oa.ops.disable_metric_set(dev_priv); mutex_unlock(&dev_priv->drm.struct_mutex); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 1c9cf41dab3f..3b379c7295a3 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,9 +1305,6 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 -make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu); - static void __execlists_update_reg_state(struct intel_engine_cs *engine, struct intel_context *ce) @@ -1322,7 +1319,7 @@ __execlists_update_reg_state(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) { ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = - make_rpcs(engine->i915, &ce->sseu); + gen8_make_rpcs(engine->i915, &ce->sseu); } } @@ -2508,13 +2505,12 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) return logical_ring_init(engine); } -static u32 -make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) +u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu) { const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; bool subslice_pg = sseu->has_subslice_pg; - u8 slices = hweight8(ctx_sseu->slice_mask); - u8 subslices = hweight8(ctx_sseu->subslice_mask); + struct intel_sseu ctx_sseu; + u8 slices, subslices; u32 rpcs = 0; /* @@ -2524,6 +2520,19 @@ make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) if (INTEL_GEN(i915) < 9) return 0; + /* + * If i915/perf is active, we want a stable powergating configuration + * on the system. The most natural configuration to take in that case + * is the default (i.e maximum the hardware can do). + */ + if (unlikely(i915->perf.oa.exclusive_stream)) + ctx_sseu = intel_device_default_sseu(i915); + else + ctx_sseu = *req_sseu; + + slices = hweight8(ctx_sseu.slice_mask); + subslices = hweight8(ctx_sseu.subslice_mask); + /* * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits * wide and Icelake has up to eight subslices, specfial programming is @@ -2593,13 +2602,13 @@ make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) if (sseu->has_eu_pg) { u32 val; - val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index f5a5502ecf70..a4e28cc55fda 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -104,4 +104,6 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv); void intel_execlists_set_default_submission(struct intel_engine_cs *engine); +u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu); + #endif /* _INTEL_LRC_H_ */ From patchwork Fri Sep 14 16:09:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10601005 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A7B7714DA for ; Fri, 14 Sep 2018 16:09:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9187B2BB06 for ; Fri, 14 Sep 2018 16:09:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 860B82BB30; Fri, 14 Sep 2018 16:09:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 221D42BB06 for ; Fri, 14 Sep 2018 16:09:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7D5036E93F; Fri, 14 Sep 2018 16:09:55 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by gabe.freedesktop.org (Postfix) with ESMTPS id 623566E93C for ; Fri, 14 Sep 2018 16:09:44 +0000 (UTC) Received: by mail-wm1-x342.google.com with SMTP id 207-v6so2496777wme.5 for ; Fri, 14 Sep 2018 09:09:44 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=XLzPja5VgZN4zB9nkeMR91Vc7FGd0XItd/PIb3XYXkA=; b=A4vvEgC3af8iDpOxRX6DXwQX8r4UlJGMYl23mpCj2WkkseWrF3epgaezzsGGmJO+a/ Xc22MOY3x5VeKW+i+wMXVw6Wj2jarQW3wd9STMntO/TuiF8mnEsYcKpGDsAqFrwnanC/ TAIJUbCVz+TbUbKPeUHVu4d1AdIKmuzbwKAmEFpJo7Y7qwsPpGeSaNEPYMHZnZ/VMMFR zBZBzk3aHwe4Ia3EvkbbV2HEZIXKNCIKJbLSsw12ePPeM1XTDR5fEgGPpvcDg7DKjob2 IFphr8mV7hYAXifbR8+EiDqBg90KOMZdUDGNhHJ0IooQjJE6Tz/Zj1RpJ3iew8Cy411e DUnA== X-Gm-Message-State: APzg51ANh0UHHM87fAmHXVBwuaN1/Gi9bCUiIfLSVwRU5k6ssohHl1wn fOt7MQbnTtRL57KssMvHbrv2xZL0pHo= X-Google-Smtp-Source: ANB0VdajtaIAb8kNUoeR0ZcZUVzYkodci7zbrBoatR6EuRnyPfeGoKPaCPwxsMUuJWfcZVdDqueH3A== X-Received: by 2002:a1c:b213:: with SMTP id b19-v6mr2854159wmf.141.1536941382695; Fri, 14 Sep 2018 09:09:42 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id z141-v6sm2651536wmc.3.2018.09.14.09.09.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 14 Sep 2018 09:09:41 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 14 Sep 2018 17:09:30 +0100 Message-Id: <20180914160932.16457-5-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> References: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 4/6] drm/i915: Add timeline barrier support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Timeline barrier allows serialization between different timelines. After calling i915_timeline_set_barrier with a request, all following submissions on this timeline will be set up as depending on this request, or barrier. Once the barrier has been completed it automatically gets cleared and things continue as normal. This facility will be used by the upcoming context SSEU code. v2: * Assert barrier has been retired on timeline_fini. (Chris Wilson) * Fix mock_timeline. Signed-off-by: Tvrtko Ursulin Suggested-by: Chris Wilson Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_request.c | 13 +++++++++ drivers/gpu/drm/i915/i915_timeline.c | 3 +++ drivers/gpu/drm/i915/i915_timeline.h | 27 +++++++++++++++++++ .../gpu/drm/i915/selftests/mock_timeline.c | 2 ++ 4 files changed, 45 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index a492385b2089..76fc80330c85 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -644,6 +644,15 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } +static int add_timeline_barrier(struct i915_request *rq) +{ + struct i915_request *barrier = + i915_gem_active_raw(&rq->timeline->barrier, + &rq->i915->drm.struct_mutex); + + return barrier ? i915_request_await_dma_fence(rq, &barrier->fence) : 0; +} + /** * i915_request_alloc - allocate a request structure * @@ -808,6 +817,10 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx) */ rq->head = rq->ring->emit; + ret = add_timeline_barrier(rq); + if (ret) + goto err_unwind; + /* Unconditionally invalidate GPU caches and TLBs. */ ret = engine->emit_flush(rq, EMIT_INVALIDATE); if (ret) diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c index 4667cc08c416..5a87c5bd5154 100644 --- a/drivers/gpu/drm/i915/i915_timeline.c +++ b/drivers/gpu/drm/i915/i915_timeline.c @@ -37,6 +37,8 @@ void i915_timeline_init(struct drm_i915_private *i915, INIT_LIST_HEAD(&timeline->requests); i915_syncmap_init(&timeline->sync); + + init_request_active(&timeline->barrier, NULL); } /** @@ -69,6 +71,7 @@ void i915_timelines_park(struct drm_i915_private *i915) void i915_timeline_fini(struct i915_timeline *timeline) { GEM_BUG_ON(!list_empty(&timeline->requests)); + GEM_BUG_ON(i915_gem_active_isset(&timeline->barrier)); i915_syncmap_free(&timeline->sync); diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h index a2c2c3ab5fb0..c8526ab44dbc 100644 --- a/drivers/gpu/drm/i915/i915_timeline.h +++ b/drivers/gpu/drm/i915/i915_timeline.h @@ -72,6 +72,16 @@ struct i915_timeline { */ u32 global_sync[I915_NUM_ENGINES]; + /** + * Barrier provides the ability to serialize ordering between different + * timelines. + * + * Users can call i915_timeline_set_barrier which will make all + * subsequent submissions be executed only after this barrier has been + * completed. + */ + struct i915_gem_active barrier; + struct list_head link; const char *name; @@ -125,4 +135,21 @@ static inline bool i915_timeline_sync_is_later(struct i915_timeline *tl, void i915_timelines_park(struct drm_i915_private *i915); +/** + * i915_timeline_set_barrier - orders submission between different timelines + * @timeline: timeline to set the barrier on + * @rq: request after which new submissions can proceed + * + * Sets the passed in request as the serialization point for all subsequent + * submissions on @timeline. Subsequent requests will not be submitted to GPU + * until the barrier has been completed. + */ +static inline void +i915_timeline_set_barrier(struct i915_timeline *timeline, + struct i915_request *rq) +{ + GEM_BUG_ON(timeline->fence_context == rq->timeline->fence_context); + i915_gem_active_set(&timeline->barrier, rq); +} + #endif diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c index dcf3b16f5a07..a718b64c988e 100644 --- a/drivers/gpu/drm/i915/selftests/mock_timeline.c +++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c @@ -19,6 +19,8 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context) i915_syncmap_init(&timeline->sync); + init_request_active(&timeline->barrier, NULL); + INIT_LIST_HEAD(&timeline->link); } From patchwork Fri Sep 14 16:09:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10601003 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ADD7E15E8 for ; Fri, 14 Sep 2018 16:09:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 96A452BB06 for ; Fri, 14 Sep 2018 16:09:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8AAEF2BB30; Fri, 14 Sep 2018 16:09:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7D0282BB06 for ; Fri, 14 Sep 2018 16:09:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F1D0F6E940; Fri, 14 Sep 2018 16:09:52 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by gabe.freedesktop.org (Postfix) with ESMTPS id B27BE6E93F for ; Fri, 14 Sep 2018 16:09:45 +0000 (UTC) Received: by mail-wm1-x344.google.com with SMTP id y139-v6so2478163wmc.2 for ; Fri, 14 Sep 2018 09:09:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=H12cWJf1fSUmdlRx3zw+oZOdiHKn3RJJm5JGvfS6zYQ=; b=EBIX0bBgEmP6TN0SGI3Q6SsKGk0fBBHFpqbOO/M6nO4p0ob+z0ZjcGBSpxHfhtdyG6 hPAOY/GlYqxNp8ZIg184INniOleSfql6PX7QxZXMme2sTwBn8eXyggEnDzng8P2cAeB/ aNz3LvK2qQxpwFm9l0d4R9kGTPDtB9BDkxDvG1o8+3m5+Zw0RCQhDgF7eow6djB6WkHJ OAsfymcg7udPBBMgBSMwOVI6u+wo8YnDuwu4CNI6K42jbEyMoU/yatO1OvR648oBunll GxiyRjD2VaHblgvO3zpoz4goRO8ZBRANy5Zk5PqSWKFZsk+0elqhWB4QVIt1CxVxg7hh pu3A== X-Gm-Message-State: APzg51CQVdoMwtor1sj6ICuKlIalDtToTcA6xNpstRQkkwBJJPx2j8ty KixZmbMkqvsNfV9rzt7sFMzVaJVLZHs= X-Google-Smtp-Source: ANB0VdbDvEmyMqsiqCTMfa9bpWHnEbvYigQeIxNzrUJGfktXIz3DfBfs3LJIXEppE8a6PPycUIyU6w== X-Received: by 2002:a1c:bb0a:: with SMTP id l10-v6mr2947332wmf.32.1536941383799; Fri, 14 Sep 2018 09:09:43 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id z141-v6sm2651536wmc.3.2018.09.14.09.09.42 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 14 Sep 2018 09:09:43 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 14 Sep 2018 17:09:31 +0100 Message-Id: <20180914160932.16457-6-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> References: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 5/6] drm/i915: Expose RPCS (SSEU) configuration to userspace X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin We want to allow userspace to reconfigure the subslice configuration for its own use case. To do so, we expose a context parameter to allow adjustment of the RPCS register stored within the context image (and currently not accessible via LRI). If the context is adjusted before first use, the adjustment is for "free"; otherwise if the context is active we flush the context off the GPU (stalling all users) and forcing the GPU to save the context to memory where we can modify it and so ensure that the register is reloaded on next execution. The overhead of managing additional EU subslices can be significant, especially in multi-context workloads. Non-GPGPU contexts should preferably disable the subslices it is not using, and others should fine-tune the number to match their workload. We expose complete control over the RPCS register, allowing configuration of slice/subslice, via masks packed into a u64 for simplicity. For example, struct drm_i915_gem_context_param arg; struct drm_i915_gem_context_param_sseu sseu = { .class = 0, .instance = 0, }; memset(&arg, 0, sizeof(arg)); arg.ctx_id = ctx; arg.param = I915_CONTEXT_PARAM_SSEU; arg.value = (uintptr_t) &sseu; if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &arg) == 0) { sseu.packed.subslice_mask = 0; drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg); } could be used to disable all subslices where supported. v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel) v3: Add ability to program this per engine (Chris) v4: Move most get_sseu() into i915_gem_context.c (Lionel) v5: Validate sseu configuration against the device's capabilities (Lionel) v6: Change context powergating settings through MI_SDM on kernel context (Chris) v7: Synchronize the requests following a powergating setting change using a global dependency (Chris) Iterate timelines through dev_priv.gt.active_rings (Tvrtko) Disable RPCS configuration setting for non capable users (Lionel/Tvrtko) v8: s/union intel_sseu/struct intel_sseu/ (Lionel) s/dev_priv/i915/ (Tvrtko) Change uapi class/instance fields to u16 (Tvrtko) Bump mask fields to 64bits (Lionel) Don't return EPERM when dynamic sseu is disabled (Tvrtko) v9: Import context image into kernel context's ppgtt only when reconfiguring powergated slice/subslices (Chris) Use aliasing ppgtt when needed (Michel) Tvrtko Ursulin: v10: * Update for upstream changes. * Request submit needs a RPM reference. * Reject on !FULL_PPGTT for simplicity. * Pull out get/set param to helpers for readability and less indent. * Use i915_request_await_dma_fence in add_global_barrier to skip waits on the same timeline and avoid GEM_BUG_ON. * No need to explicitly assign a NULL pointer to engine in legacy mode. * No need to move gen8_make_rpcs up. * Factored out global barrier as prep patch. * Allow to only CAP_SYS_ADMIN if !Gen11. v11: * Remove engine vfunc in favour of local helper. (Chris Wilson) * Stop retiring requests before updates since it is not needed (Chris Wilson) * Implement direct CPU update path for idle contexts. (Chris Wilson) * Left side dependency needs only be on the same context timeline. (Chris Wilson) * It is sufficient to order the timeline. (Chris Wilson) * Reject !RCS configuration attempts with -ENODEV for now. v12: * Rebase for make_rpcs. v13: * Centralize SSEU normalization to make_rpcs. * Type width checking (uAPI <-> implementation). * Gen11 restrictions uAPI checks. * Gen11 subslice count differences handling. Chris Wilson: * args->size handling fixes. * Update context image from GGTT. * Postpone context image update to pinning. * Use i915_gem_active_raw instead of last_request_on_engine. v14: * Add activity tracker on intel_context to fix the lifetime issues and simplify the code. (Chris Wilson) v15: * Fix context pin leak if no space in ring by simplifying the context pinning sequence. v16: * Rebase for context get/set param locking changes. * Just -ENODEV on !Gen11. (Joonas) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107634 Issue: https://github.com/intel/media-driver/issues/267 Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Cc: Dmitry Rogozhkin Cc: Tvrtko Ursulin Cc: Zhipeng Gong Cc: Joonas Lahtinen Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson # v15 --- drivers/gpu/drm/i915/i915_gem_context.c | 313 +++++++++++++++++++++++- drivers/gpu/drm/i915/i915_gem_context.h | 6 + drivers/gpu/drm/i915/intel_lrc.c | 4 +- include/uapi/drm/i915_drm.h | 43 ++++ 4 files changed, 363 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 0b8cc748648b..092c282a8ec2 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -90,6 +90,7 @@ #include #include "i915_drv.h" #include "i915_trace.h" +#include "intel_lrc_reg.h" #include "intel_workarounds.h" #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1 @@ -322,6 +323,14 @@ static u32 default_desc_template(const struct drm_i915_private *i915, return desc; } +static void intel_context_retire(struct i915_gem_active *active, + struct i915_request *rq) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + + intel_context_unpin(ce); +} + static struct i915_gem_context * __create_hw_context(struct drm_i915_private *dev_priv, struct drm_i915_file_private *file_priv) @@ -345,6 +354,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, ce->gem_context = ctx; /* Use the whole device by default */ ce->sseu = intel_device_default_sseu(dev_priv); + + init_request_active(&ce->active, intel_context_retire); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); @@ -846,6 +857,56 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, return 0; } +static int get_sseu(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + struct drm_i915_gem_context_param_sseu user_sseu; + struct intel_engine_cs *engine; + struct intel_context *ce; + int ret; + + if (args->size == 0) + goto out; + else if (args->size < sizeof(user_sseu)) + return -EINVAL; + + if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value), + sizeof(user_sseu))) + return -EFAULT; + + if (user_sseu.rsvd1 || user_sseu.rsvd2) + return -EINVAL; + + engine = intel_engine_lookup_user(ctx->i915, + user_sseu.class, + user_sseu.instance); + if (!engine) + return -EINVAL; + + /* Only use for mutex here is to serialize get_param and set_param. */ + ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex); + if (ret) + return ret; + + ce = to_intel_context(ctx, engine); + + user_sseu.slice_mask = ce->sseu.slice_mask; + user_sseu.subslice_mask = ce->sseu.subslice_mask; + user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice; + user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice; + + mutex_unlock(&ctx->i915->drm.struct_mutex); + + if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu, + sizeof(user_sseu))) + return -EFAULT; + +out: + args->size = sizeof(user_sseu); + + return 0; +} + int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -858,15 +919,17 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, if (!ctx) return -ENOENT; - args->size = 0; switch (args->param) { case I915_CONTEXT_PARAM_BAN_PERIOD: ret = -EINVAL; break; case I915_CONTEXT_PARAM_NO_ZEROMAP: + args->size = 0; args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags); break; case I915_CONTEXT_PARAM_GTT_SIZE: + args->size = 0; + if (ctx->ppgtt) args->value = ctx->ppgtt->vm.total; else if (to_i915(dev)->mm.aliasing_ppgtt) @@ -875,14 +938,20 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, args->value = to_i915(dev)->ggtt.vm.total; break; case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE: + args->size = 0; args->value = i915_gem_context_no_error_capture(ctx); break; case I915_CONTEXT_PARAM_BANNABLE: + args->size = 0; args->value = i915_gem_context_is_bannable(ctx); break; case I915_CONTEXT_PARAM_PRIORITY: + args->size = 0; args->value = ctx->sched.priority; break; + case I915_CONTEXT_PARAM_SSEU: + ret = get_sseu(ctx, args); + break; default: ret = -EINVAL; break; @@ -892,6 +961,244 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, return ret; } +static int gen8_emit_rpcs_config(struct i915_request *rq, + struct intel_context *ce, + struct intel_sseu sseu) +{ + u64 offset; + u32 *cs; + + cs = intel_ring_begin(rq, 4); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + offset = ce->state->node.start + + LRC_STATE_PN * PAGE_SIZE + + (CTX_R_PWR_CLK_STATE + 1) * 4; + + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; + *cs++ = lower_32_bits(offset); + *cs++ = upper_32_bits(offset); + *cs++ = gen8_make_rpcs(rq->i915, &sseu); + + intel_ring_advance(rq, cs); + + return 0; +} + +static int +gen8_modify_rpcs_gpu(struct intel_context *ce, + struct intel_engine_cs *engine, + struct intel_sseu sseu) +{ + struct drm_i915_private *i915 = engine->i915; + struct i915_request *rq, *prev; + int ret; + + GEM_BUG_ON(!ce->pin_count); + + lockdep_assert_held(&i915->drm.struct_mutex); + + /* Submitting requests etc needs the hw awake. */ + intel_runtime_pm_get(i915); + + rq = i915_request_alloc(engine, i915->kernel_context); + if (IS_ERR(rq)) { + ret = PTR_ERR(rq); + goto out_put; + } + + ret = gen8_emit_rpcs_config(rq, ce, sseu); + if (ret) + goto out_add; + + /* Queue this switch after all other activity by this context. */ + prev = i915_gem_active_raw(&ce->ring->timeline->last_request, + &i915->drm.struct_mutex); + if (prev && !i915_request_completed(prev)) + i915_sw_fence_await_sw_fence_gfp(&rq->submit, + &prev->submit, + I915_FENCE_GFP); + + /* Order all following requests to be after. */ + i915_timeline_set_barrier(ce->ring->timeline, rq); + + /* + * Guarantee context image and the timeline remains pinned until the + * modifying request is retired by setting the ce activity tracker. + * + * But we only need to take one pin on the account of it. Or in other + * words transfer the pinned ce object to tracked active request. + */ + if (!i915_gem_active_isset(&ce->active)) + __intel_context_pin(ce); + i915_gem_active_set(&ce->active, rq); + +out_add: + i915_request_add(rq); +out_put: + intel_runtime_pm_put(i915); + + return ret; +} + +static int +i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx, + struct intel_engine_cs *engine, + struct intel_sseu sseu) +{ + struct intel_context *ce = to_intel_context(ctx, engine); + int ret; + + GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8); + GEM_BUG_ON(engine->id != RCS); + + ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex); + if (ret) + return ret; + + /* Nothing to do if unmodified. */ + if (!memcmp(&ce->sseu, &sseu, sizeof(sseu))) + goto out; + + /* + * If context is not idle we have to submit an ordered request to modify + * its context image via the kernel context. Pristine and idle contexts + * will be configured on pinning. + */ + if (ce->pin_count) + ret = gen8_modify_rpcs_gpu(ce, engine, sseu); + + if (!ret) + ce->sseu = sseu; + +out: + mutex_unlock(&ctx->i915->drm.struct_mutex); + + return ret; +} + +static int +user_to_context_sseu(struct drm_i915_private *i915, + const struct drm_i915_gem_context_param_sseu *user, + struct intel_sseu *context) +{ + const struct sseu_dev_info *device = &INTEL_INFO(i915)->sseu; + + /* No zeros in any field. */ + if (!user->slice_mask || !user->subslice_mask || + !user->min_eus_per_subslice || !user->max_eus_per_subslice) + return -EINVAL; + + /* Max > min. */ + if (user->max_eus_per_subslice < user->min_eus_per_subslice) + return -EINVAL; + + /* Check validity against hardware. */ + if (user->slice_mask & ~device->slice_mask) + return -EINVAL; + + if (user->subslice_mask & ~device->subslice_mask[0]) + return -EINVAL; + + if (user->max_eus_per_subslice > device->max_eus_per_subslice) + return -EINVAL; + + /* + * Some future proofing on the types since the uAPI is wider than the + * current internal implementation. + */ + if (WARN_ON((fls(user->slice_mask) > + sizeof(context->slice_mask) * BITS_PER_BYTE) || + (fls(user->subslice_mask) > + sizeof(context->subslice_mask) * BITS_PER_BYTE) || + overflows_type(user->min_eus_per_subslice, + context->min_eus_per_subslice) || + overflows_type(user->max_eus_per_subslice, + context->max_eus_per_subslice))) + return -EINVAL; + + context->slice_mask = user->slice_mask; + context->subslice_mask = user->subslice_mask; + context->min_eus_per_subslice = user->min_eus_per_subslice; + context->max_eus_per_subslice = user->max_eus_per_subslice; + + /* Part specific restrictions. */ + if (IS_GEN11(i915)) { + unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]); + unsigned int req_s = hweight8(context->slice_mask); + unsigned int req_ss = hweight8(context->subslice_mask); + + /* + * Only full subslice enablement is possible if more than one + * slice is turned on. + */ + if (req_s > 1 && req_ss != hw_ss_per_s) + return -EINVAL; + + /* + * If more than four (SScount bitfield limit) subslices are + * requested then the number has to be even. + */ + if (req_ss > 4 && (req_ss & 1)) + return -EINVAL; + + /* + * If only one slice is enabled subslice count must be at most + * half of the all available subslices. + */ + if (req_s == 1 && req_ss > (hw_ss_per_s / 2)) + return -EINVAL; + } + + return 0; +} + +static int set_sseu(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + struct drm_i915_private *i915 = ctx->i915; + struct drm_i915_gem_context_param_sseu user_sseu; + struct intel_engine_cs *engine; + struct intel_sseu sseu; + int ret; + + if (args->size < sizeof(user_sseu)) + return -EINVAL; + + if (!IS_GEN11(i915)) + return -ENODEV; + + if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value), + sizeof(user_sseu))) + return -EFAULT; + + if (user_sseu.rsvd1 || user_sseu.rsvd2) + return -EINVAL; + + engine = intel_engine_lookup_user(i915, + user_sseu.class, + user_sseu.instance); + if (!engine) + return -EINVAL; + + /* Only render engine supports RPCS configuration. */ + if (engine->class != RENDER_CLASS) + return -ENODEV; + + ret = user_to_context_sseu(i915, &user_sseu, &sseu); + if (ret) + return ret; + + ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu); + if (ret) + return ret; + + args->size = sizeof(user_sseu); + + return 0; +} + int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -953,7 +1260,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, ctx->sched.priority = priority; } break; - + case I915_CONTEXT_PARAM_SSEU: + ret = set_sseu(ctx, args); + break; default: ret = -EINVAL; break; diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index 7510de738b35..c4aebfe7d4d2 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -170,6 +170,12 @@ struct i915_gem_context { u64 lrc_desc; int pin_count; + /** + * active: Active tracker for the external rq activity on this + * intel_context object. + */ + struct i915_gem_active active; + const struct intel_context_ops *ops; /** sseu: Control eu/slice partitioning */ diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3b379c7295a3..0cfa99a13522 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2558,7 +2558,9 @@ u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu) * subslices are enabled, or a count between one and four on the first * slice. */ - if (IS_GEN11(i915) && slices == 1 && subslices >= 4) { + if (IS_GEN11(i915) && + slices == 1 && + subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) { GEM_BUG_ON(subslices & 1); subslice_pg = false; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index a4446f452040..e195c38b15a6 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1478,9 +1478,52 @@ struct drm_i915_gem_context_param { #define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */ #define I915_CONTEXT_DEFAULT_PRIORITY 0 #define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */ + /* + * When using the following param, value should be a pointer to + * drm_i915_gem_context_param_sseu. + */ +#define I915_CONTEXT_PARAM_SSEU 0x7 __u64 value; }; +struct drm_i915_gem_context_param_sseu { + /* + * Engine class & instance to be configured or queried. + */ + __u16 class; + __u16 instance; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd1; + + /* + * Mask of slices to enable for the context. Valid values are a subset + * of the bitmask value returned for I915_PARAM_SLICE_MASK. + */ + __u64 slice_mask; + + /* + * Mask of subslices to enable for the context. Valid values are a + * subset of the bitmask value return by I915_PARAM_SUBSLICE_MASK. + */ + __u64 subslice_mask; + + /* + * Minimum/Maximum number of EUs to enable per subslice for the + * context. min_eus_per_subslice must be inferior or equal to + * max_eus_per_subslice. + */ + __u16 min_eus_per_subslice; + __u16 max_eus_per_subslice; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd2; +}; + enum drm_i915_oa_format { I915_OA_FORMAT_A13 = 1, /* HSW only */ I915_OA_FORMAT_A29, /* HSW only */ From patchwork Fri Sep 14 16:09:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10601001 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4303615E8 for ; Fri, 14 Sep 2018 16:09:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2AC612BB06 for ; Fri, 14 Sep 2018 16:09:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1F54D2BB30; Fri, 14 Sep 2018 16:09:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CD4C72BB06 for ; Fri, 14 Sep 2018 16:09:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1764F6E93E; Fri, 14 Sep 2018 16:09:49 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2251E6E93E for ; Fri, 14 Sep 2018 16:09:46 +0000 (UTC) Received: by mail-wm1-x341.google.com with SMTP id y2-v6so2497851wma.1 for ; Fri, 14 Sep 2018 09:09:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=EQtOo3I/PIzGR+AdPIHMARRUHKv9IzHAwiJC03NaQO4=; b=j/RTb8Vypr/6QxTJkbjswd39JClhs6nu1aFi2Bajw/v0S1ZtFKELtLAApe/ht1Atnr +FlNlGEkmxkRRb1luBShOMPMORMG2ma5aMNpqS/Cn7Fq0y+eiy+WwqxwTl/HoxnbHsTk aNBI6dUEITji/RZBDz0QDpDUoBLdoKpXx6fhifmQI8rOXcng9lpFrnKwMQ+wD7sm7OLE eu7kEaPfF4Dzf7iuZ0b9HVA/iMGdphkVi66g4rGgM8Ln7IhBdJ4XDbr8m99HbPcbYeN6 LgKWgwkW4/GaeJf56OY68I0bdiNVQggXH4UozWYU6DJbbtNN6F0N6+X+NmreqBNsM80J dAWA== X-Gm-Message-State: APzg51B18izC4Hv60UKaLu4IgL3b63RjeP1DVLCaVUpK16CcWGqHfAW8 ExJ+0u3YC2Q1D/02jhnCBpJ0Zd6C4rY= X-Google-Smtp-Source: ANB0VdYvKxaK72ZCopsOCokOYEGeFD+b/uMUI2cGrgdem6J+yfZ5XBMlpwKqQg3uIi0uEaoqssUg/A== X-Received: by 2002:a1c:1d87:: with SMTP id d129-v6mr2937520wmd.34.1536941384597; Fri, 14 Sep 2018 09:09:44 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id z141-v6sm2651536wmc.3.2018.09.14.09.09.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 14 Sep 2018 09:09:44 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 14 Sep 2018 17:09:32 +0100 Message-Id: <20180914160932.16457-7-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> References: <20180914160932.16457-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 6/6] drm/i915/icl: Support co-existance between per-context SSEU and OA X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin When OA is active we want to lock the powergating configuration, but on Icelake users like media stack will have issues if we lock to the full device configuration. Instead lock to a subset of (sub)slices which are currently a known working configuration for all users. Signed-off-by: Tvrtko Ursulin Cc: Lionel Landwerlin --- drivers/gpu/drm/i915/intel_lrc.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0cfa99a13522..2aaf2237a2b0 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2522,13 +2522,28 @@ u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu) /* * If i915/perf is active, we want a stable powergating configuration - * on the system. The most natural configuration to take in that case - * is the default (i.e maximum the hardware can do). + * on the system. + * + * We could choose full enablement, but on ICL we know there are use + * cases which disable slices for functional, apart for performance + * reasons. So in this case we select a known stable subset. */ - if (unlikely(i915->perf.oa.exclusive_stream)) - ctx_sseu = intel_device_default_sseu(i915); - else + if (!i915->perf.oa.exclusive_stream) { ctx_sseu = *req_sseu; + } else { + ctx_sseu = intel_device_default_sseu(i915); + + if (IS_GEN11(i915)) { + /* + * We only need subslice count so it doesn't matter + * which ones we select - just turn of low bits in the + * amount of half of all available subslices per slice. + */ + ctx_sseu.subslice_mask = + ~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2)); + ctx_sseu.slice_mask = 0x1; + } + } slices = hweight8(ctx_sseu.slice_mask); subslices = hweight8(ctx_sseu.subslice_mask);