From patchwork Wed Sep 5 14:22:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588971 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4B1E112B for ; Wed, 5 Sep 2018 14:22:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BF7272A2FC for ; Wed, 5 Sep 2018 14:22:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B341B2A2DA; Wed, 5 Sep 2018 14:22:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6BAD42A2DA for ; Wed, 5 Sep 2018 14:22:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 422FD6E4C4; Wed, 5 Sep 2018 14:22:35 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) by gabe.freedesktop.org (Postfix) with ESMTPS id 641CB6E4BD for ; Wed, 5 Sep 2018 14:22:33 +0000 (UTC) Received: by mail-wm0-x241.google.com with SMTP id n11-v6so7927361wmc.2 for ; Wed, 05 Sep 2018 07:22:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=lNOSq62IzwEbWURvB608LNDbusaDOstHXemnYnCltYY=; b=gPr4BzyD32aESMdjbArzgZsPVR8D8+qJyt3EZkCzZ6gBwwgrpjDEmdGKO6bEbT+ewL jwZyPfkXqgMyFxAVcNuIkTlk2hq6vQNxlvEiVQD1ijNM3PLK/W+C/Jay8IDCdHkjKsoB 8uA4wjHySEDy5DlmVcoaPFeBGF1xIiiVm01JTrE0khDrtv5yhK0/1F2eDGeEXQco6lhh Vg1U/kjRHYU9Ghadt/WvzIPwSQ+RRBtljm1HBx/gaQYB9B2MfpsJgyx5RxaCxAvURKaV WZ2IGHcR0thhehQzNNA3gf3g8oeEMc0G+ObryYykLa7TVtO6MxMjUjRY+Sj6m3IJqIq9 w1BQ== X-Gm-Message-State: APzg51DTiHPJM43qm/rE1IcpVgxIE8k0iRs93XDZHrRYX1XOU8w9Fjh4 43ccDl2vDJTpSlyPV4NbmLmFS2EJIbo= X-Google-Smtp-Source: ANB0VdZEj9HucXJ/kWditpx4bhIz5swI6vy9pUAHrdY++F/NeG1vTk78B+EmTGX1kI6Uz8E4/+QFRw== X-Received: by 2002:a1c:7015:: with SMTP id l21-v6mr372349wmc.81.1536157351900; Wed, 05 Sep 2018 07:22:31 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.30 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:31 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:16 +0100 Message-Id: <20180905142222.3251-2-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 1/7] drm/i915/execlists: Move RPCS setup to context pin X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Configuring RPCS in context image just before pin is sufficient and will come extra handy in one of the following patches. v2: * Split image setup a bit differently. (Chris Wilson) Signed-off-by: Tvrtko Ursulin Suggested-by: Chris Wilson Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 9b1f0e5211a0..358fad63564c 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,6 +1305,8 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } +static u32 make_rpcs(struct drm_i915_private *dev_priv); + static struct intel_context * __execlists_context_pin(struct intel_engine_cs *engine, struct i915_gem_context *ctx, @@ -1344,6 +1346,12 @@ __execlists_context_pin(struct intel_engine_cs *engine, GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head)); ce->lrc_reg_state[CTX_RING_HEAD+1] = ce->ring->head; + /* RPCS */ + if (engine->class == RENDER_CLASS) { + ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = + make_rpcs(engine->i915); + } + ce->state->obj->pin_global++; i915_gem_context_get(ctx); return ce; @@ -2706,8 +2714,7 @@ static void execlists_init_reg_state(u32 *regs, if (rcs) { regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1); - CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, - make_rpcs(dev_priv)); + CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, 0); i915_oa_init_reg_state(engine, ctx, regs); } From patchwork Wed Sep 5 14:22:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588973 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2868112B for ; Wed, 5 Sep 2018 14:22:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C8D012A2DA for ; Wed, 5 Sep 2018 14:22:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BB57B2A306; Wed, 5 Sep 2018 14:22:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 776222A2DA for ; Wed, 5 Sep 2018 14:22:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C66346E4C7; Wed, 5 Sep 2018 14:22:35 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by gabe.freedesktop.org (Postfix) with ESMTPS id 682F66E4C4 for ; Wed, 5 Sep 2018 14:22:34 +0000 (UTC) Received: by mail-wr1-x441.google.com with SMTP id m27-v6so7912347wrf.3 for ; Wed, 05 Sep 2018 07:22:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Oq0OnZ6b+w9UIW71Fin4f20WD8FI3vcqoM+hgRUxRLM=; b=CsTngInDEwRQkita3Vd35SX4IPXWBw8a+Q8LG4THroExtssAkRN8pArc5WXkoav/3M zmH/82CK6EPeda7cRtGH1BGMwaaAB/VuIVwn/KLtUpzf+2tJloE+z1wA+V7SPXGGP3cj I429l1rjIGrBW5ezVpKS0y6wpdNsrIa2A4OvVk0gw8SufV2GsB5GgsqDjeQToHprI6z4 15UVmVEuh+YZN/naVaR4EQjiyaqfoeHMzgKXf42xLGMk/dZTEkzSJXFNJ/AEWkuXQDqK jpJmxHMPc9cfF5XtWR+5HqNeJ/oa5mfUmWczfSNZc50IEYdRGoLdq8LUQcBNX4raGfze d+cA== X-Gm-Message-State: APzg51DAgZGXUP5xoZLqKkTFLlw3OWtaqp9liwokOkMyCmWVslnbp8iK Lu+77GrymtuoIW1GY0l38AxlX/S1+QA= X-Google-Smtp-Source: ANB0VdYwMWYtU9zM8Ht0rBBN6cdoLyGkELkc4MOQK5hBMQsfjdS6aFoTxdEkwHvng1PaZ4Efj95xNQ== X-Received: by 2002:a5d:6841:: with SMTP id o1-v6mr26483486wrw.159.1536157352953; Wed, 05 Sep 2018 07:22:32 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:32 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:17 +0100 Message-Id: <20180905142222.3251-3-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 2/7] drm/i915: Program RPCS for Broadwell X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Wilson Currently we only configure the power gating for Skylake and above, but the configuration should equally apply to Broadwell and Braswell. Even though, there is not as much variation as for later generations, we want to expose control over the configuration to userspace and may want to opt out of the "always-enabled" setting. Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_lrc.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 358fad63564c..3bdc1ac3e926 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2501,13 +2501,6 @@ make_rpcs(struct drm_i915_private *dev_priv) u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]); u32 rpcs = 0; - /* - * No explicit RPCS request is needed to ensure full - * slice/subslice/EU enablement prior to Gen9. - */ - if (INTEL_GEN(dev_priv) < 9) - return 0; - /* * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits * wide and Icelake has up to eight subslices, specfial programming is From patchwork Wed Sep 5 14:22:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588977 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D8665A4 for ; Wed, 5 Sep 2018 14:22:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5784C2A2DA for ; Wed, 5 Sep 2018 14:22:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4BC7D2A306; Wed, 5 Sep 2018 14:22:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id BEF282A2DA for ; Wed, 5 Sep 2018 14:22:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AFD506E4CF; Wed, 5 Sep 2018 14:22:39 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9CB466E4C5 for ; Wed, 5 Sep 2018 14:22:35 +0000 (UTC) Received: by mail-wm0-x243.google.com with SMTP id n11-v6so7927496wmc.2 for ; Wed, 05 Sep 2018 07:22:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gjihDMGblY78jnRWMDaKq0mcppUaC9v7Xo1067rNaeY=; b=QuG2bBSBOEc7lrez6oNHuF2nIDMoqR6QRjZpFSJ3JSapK96sYOcUpyaDWqV53R3uIp nXXRnDJNZ4oePfgFHdvnwx1s5CgwLnXjQ5v02Yow3sRc5dOisVBdFlTtodO/l8XJQbsA X9cK2pX1LZbV0tN5094gBhAClqlgM15KFaN+t8oDG1mODcgUxKDUN8iktx9VaIFGFhCE EjWrHeO9Xae/ugNgx1os2BCnf8Kt4K+pnI+JwzWlxza7m16nXNJF5Q+0ChsINWnsKD/F ephqFsf3n+Jqkf+/YozjB2OYG4bDVZKh0EEn5iIJDyFH6huFdbVD7NDR72ZvhUVipK8C kYyA== X-Gm-Message-State: APzg51A/etEd5bi63OhhQSmq/KcmD8X439c1O9RuwlgONIVDouM0rpBX lVdyPkLplGUIwWZeVwsWi/6EDrrZ03c= X-Google-Smtp-Source: ANB0VdbAi9KCg8I7aL1eJuNQ9CheDcoCtcu2gU2LSctBQTlQwkcDwQY8yAcSFv2o+nF3+00SKkxmmg== X-Received: by 2002:a1c:5f85:: with SMTP id t127-v6mr403004wmb.16.1536157353829; Wed, 05 Sep 2018 07:22:33 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.32 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:33 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:18 +0100 Message-Id: <20180905142222.3251-4-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 3/7] drm/i915: Record the sseu configuration per-context & engine X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Wilson We want to expose the ability to reconfigure the slices, subslice and eu per context and per engine. To facilitate that, store the current configuration on the context for each engine, which is initially set to the device default upon creation. v2: record sseu configuration per context & engine (Chris) v3: introduce the i915_gem_context_sseu to store powergating programming, sseu_dev_info has grown quite a bit (Lionel) v4: rename i915_gem_sseu into intel_sseu (Chris) use to_intel_context() (Chris) v5: More to_intel_context() (Tvrtko) Switch intel_sseu from union to struct (Tvrtko) Move context default sseu in existing loop (Chris) v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko) Tvrtko Ursulin: v7: * Pass intel_sseu by pointer instead of value to make_rpcs. * Rebase for make_rpcs changes. v8: * Rebase for RPCS edit on pin. v9: * Rebase for context image setup changes. Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 14 +++++++++++++ drivers/gpu/drm/i915/i915_gem_context.c | 2 ++ drivers/gpu/drm/i915/i915_gem_context.h | 4 ++++ drivers/gpu/drm/i915/i915_request.h | 10 ++++++++++ drivers/gpu/drm/i915/intel_lrc.c | 26 ++++++++++++------------- 5 files changed, 43 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 767615ecdea5..e4682fc572e6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3449,6 +3449,20 @@ mkwrite_device_info(struct drm_i915_private *dev_priv) return (struct intel_device_info *)&dev_priv->info; } +static inline struct intel_sseu +intel_device_default_sseu(struct drm_i915_private *i915) +{ + const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; + struct intel_sseu value = { + .slice_mask = sseu->slice_mask, + .subslice_mask = sseu->subslice_mask[0], + .min_eus_per_subslice = sseu->max_eus_per_subslice, + .max_eus_per_subslice = sseu->max_eus_per_subslice, + }; + + return value; +} + /* modesetting */ extern void intel_modeset_init_hw(struct drm_device *dev); extern int intel_modeset_init(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 747b8170a15a..ca2c8fcd1090 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -343,6 +343,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, struct intel_context *ce = &ctx->__engine[n]; ce->gem_context = ctx; + /* Use the whole device by default */ + ce->sseu = intel_device_default_sseu(dev_priv); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index e09673ca731d..79d2e8f62ad1 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -31,6 +31,7 @@ #include "i915_gem.h" #include "i915_scheduler.h" +#include "intel_device_info.h" struct pid; @@ -165,6 +166,9 @@ struct i915_gem_context { int pin_count; const struct intel_context_ops *ops; + + /** sseu: Control eu/slice partitioning */ + struct intel_sseu sseu; } __engine[I915_NUM_ENGINES]; /** ring_size: size for allocating the per-engine ring buffer */ diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 9898301ab7ef..eb6f8cce16c4 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -39,6 +39,16 @@ struct drm_i915_gem_object; struct i915_request; struct i915_timeline; +/* + * Powergating configuration for a particular (context,engine). + */ +struct intel_sseu { + u8 slice_mask; + u8 subslice_mask; + u8 min_eus_per_subslice; + u8 max_eus_per_subslice; +}; + struct intel_wait { struct rb_node node; struct task_struct *tsk; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3bdc1ac3e926..8a477e43dbca 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,7 +1305,8 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 make_rpcs(struct drm_i915_private *dev_priv); +static u32 make_rpcs(struct drm_i915_private *dev_priv, + struct intel_sseu *ctx_sseu); static struct intel_context * __execlists_context_pin(struct intel_engine_cs *engine, @@ -1349,7 +1350,7 @@ __execlists_context_pin(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) { ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = - make_rpcs(engine->i915); + make_rpcs(engine->i915, &ce->sseu); } ce->state->obj->pin_global++; @@ -2493,12 +2494,13 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) return logical_ring_init(engine); } -static u32 -make_rpcs(struct drm_i915_private *dev_priv) +static u32 make_rpcs(struct drm_i915_private *dev_priv, + struct intel_sseu *ctx_sseu) { - bool subslice_pg = INTEL_INFO(dev_priv)->sseu.has_subslice_pg; - u8 slices = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask); - u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]); + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu; + bool subslice_pg = sseu->has_subslice_pg; + u8 slices = hweight8(ctx_sseu->slice_mask); + u8 subslices = hweight8(ctx_sseu->subslice_mask); u32 rpcs = 0; /* @@ -2539,7 +2541,7 @@ make_rpcs(struct drm_i915_private *dev_priv) * must make an explicit request through RPCS for full * enablement. */ - if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) { + if (sseu->has_slice_pg) { u32 mask, val = slices; if (INTEL_GEN(dev_priv) >= 11) { @@ -2567,18 +2569,16 @@ make_rpcs(struct drm_i915_private *dev_priv) rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val; } - if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) { + if (sseu->has_eu_pg) { u32 val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK; From patchwork Wed Sep 5 14:22:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588975 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 89C0A5A4 for ; Wed, 5 Sep 2018 14:22:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 75F232A2DA for ; Wed, 5 Sep 2018 14:22:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6A47C2A306; Wed, 5 Sep 2018 14:22:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id EB81D2A2DA for ; Wed, 5 Sep 2018 14:22:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3F6476E4CA; Wed, 5 Sep 2018 14:22:38 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) by gabe.freedesktop.org (Postfix) with ESMTPS id A450B6E4C8 for ; Wed, 5 Sep 2018 14:22:36 +0000 (UTC) Received: by mail-wm0-x241.google.com with SMTP id j192-v6so7996224wmj.1 for ; Wed, 05 Sep 2018 07:22:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=C1NiMgXz8wIhGYoUZpuOmkRONK4SARc/IapNq/hdrBc=; b=Nnzwu72OPC/HehkJxisUK4F9T5jlt4tDTRYZj4IFTju+WbJP+GCuidQOmnYy2T33Zd 49oD9O+lKh+9tY80A59H0skL/1Mza3Mfjuotkzkb5Ur7UmRScBxhIjkEwc7FalWrwiAJ rsMNbQ34gHeXLSIA56Jgpqg+Sykon1EZGeGMGOjR5mF8KDPMRL/+9i1HIjgW8S312PE5 YziSHi7r+d73S+ZubQzXKtBt7QtK+4+JoknLWnbWQ84SkRmclKURXUuKXCwdpxPKInul rTo5HLAVv8oGQKn+4d8QmZGrUq/urLlyI9hQKTSfglbqb4PoxEsZnxjBE1WTGtm0hPV/ Tenw== X-Gm-Message-State: APzg51AdaslNTBDISWVcNWI7m66NjmmAfEkpROl2qi4XKWnCvt8XkRNk ugYtHsAIcdiW6gIz8XE3YoX1K/TqOaY= X-Google-Smtp-Source: ANB0VdY5FbAEJMYsO/O2Gxe9FZOLDzdZtxwKM0dODrwdWWUiGsK8CeWqSmdeV99/3YtQpVgQHDdw0w== X-Received: by 2002:a1c:b157:: with SMTP id a84-v6mr396988wmf.18.1536157354906; Wed, 05 Sep 2018 07:22:34 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:34 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:19 +0100 Message-Id: <20180905142222.3251-5-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 4/7] drm/i915/perf: lock powergating configuration to default when active X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Lionel Landwerlin If some of the contexts submitting workloads to the GPU have been configured to shutdown slices/subslices, we might loose the NOA configurations written in the NOA muxes. One possible solution to this problem is to reprogram the NOA muxes when we switch to a new context. We initially tried this in the workaround batchbuffer but some concerns where raised about the cost of reprogramming at every context switch. This solution is also not without consequences from the userspace point of view. Reprogramming of the muxes can only happen once the powergating configuration has changed (which happens after context switch). This means for a window of time during the recording, counters recorded by the OA unit might be invalid. This requires userspace dealing with OA reports to discard the invalid values. Minimizing the reprogramming could be implemented by tracking of the last programmed configuration somewhere in GGTT and use MI_PREDICATE to discard some of the programming commands, but the command streamer would still have to parse all the MI_LRI instructions in the workaround batchbuffer. Another solution, which this change implements, is to simply disregard the user requested configuration for the period of time when i915/perf is active. There is no known issue with this apart from a performance penality for some media workloads that benefit from running on a partially powergated GPU. We already prevent RC6 from affecting the programming so it doesn't sound completely unreasonable to hold on powergating for the same reason. v2: Leave RPCS programming in intel_lrc.c (Lionel) v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel) More to_intel_context() (Tvrtko) s/dev_priv/i915/ (Tvrtko) Tvrtko Ursulin: v4: * Rebase for make_rpcs changes. v5: * Apply OA restriction from make_rpcs directly. v6: * Rebase for context image setup changes. Signed-off-by: Lionel Landwerlin Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_perf.c | 5 +++++ drivers/gpu/drm/i915/intel_lrc.c | 30 ++++++++++++++++++++---------- drivers/gpu/drm/i915/intel_lrc.h | 3 +++ 3 files changed, 28 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ccb20230df2c..dd65b72bddd4 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1677,6 +1677,11 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, CTX_REG(reg_state, state_offset, flex_regs[i], value); } + + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, + gen8_make_rpcs(dev_priv, + &to_intel_context(ctx, + dev_priv->engine[RCS])->sseu)); } /* diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 8a477e43dbca..9709c1fbe836 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,9 +1305,6 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 make_rpcs(struct drm_i915_private *dev_priv, - struct intel_sseu *ctx_sseu); - static struct intel_context * __execlists_context_pin(struct intel_engine_cs *engine, struct i915_gem_context *ctx, @@ -1350,7 +1347,7 @@ __execlists_context_pin(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) { ce->lrc_reg_state[CTX_R_PWR_CLK_STATE + 1] = - make_rpcs(engine->i915, &ce->sseu); + gen8_make_rpcs(engine->i915, &ce->sseu); } ce->state->obj->pin_global++; @@ -2494,15 +2491,28 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) return logical_ring_init(engine); } -static u32 make_rpcs(struct drm_i915_private *dev_priv, - struct intel_sseu *ctx_sseu) +u32 gen8_make_rpcs(struct drm_i915_private *dev_priv, + struct intel_sseu *req_sseu) { const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu; bool subslice_pg = sseu->has_subslice_pg; - u8 slices = hweight8(ctx_sseu->slice_mask); - u8 subslices = hweight8(ctx_sseu->subslice_mask); + struct intel_sseu ctx_sseu; + u8 slices, subslices; u32 rpcs = 0; + /* + * If i915/perf is active, we want a stable powergating configuration + * on the system. The most natural configuration to take in that case + * is the default (i.e maximum the hardware can do). + */ + if (unlikely(dev_priv->perf.oa.exclusive_stream)) + ctx_sseu = intel_device_default_sseu(dev_priv); + else + ctx_sseu = *req_sseu; + + slices = hweight8(ctx_sseu.slice_mask); + subslices = hweight8(ctx_sseu.subslice_mask); + /* * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits * wide and Icelake has up to eight subslices, specfial programming is @@ -2572,13 +2582,13 @@ static u32 make_rpcs(struct drm_i915_private *dev_priv, if (sseu->has_eu_pg) { u32 val; - val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index f5a5502ecf70..11da6fc0002d 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -104,4 +104,7 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv); void intel_execlists_set_default_submission(struct intel_engine_cs *engine); +u32 gen8_make_rpcs(struct drm_i915_private *dev_priv, + struct intel_sseu *ctx_sseu); + #endif /* _INTEL_LRC_H_ */ From patchwork Wed Sep 5 14:22:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588979 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 38960112B for ; Wed, 5 Sep 2018 14:22:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 23C092A2DA for ; Wed, 5 Sep 2018 14:22:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 17F2C2A306; Wed, 5 Sep 2018 14:22:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 90BDE2A2DA for ; Wed, 5 Sep 2018 14:22:41 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A36CB6E4CE; Wed, 5 Sep 2018 14:22:39 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm0-x244.google.com (mail-wm0-x244.google.com [IPv6:2a00:1450:400c:c09::244]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0A56D6E4C8 for ; Wed, 5 Sep 2018 14:22:38 +0000 (UTC) Received: by mail-wm0-x244.google.com with SMTP id b19-v6so8213917wme.3 for ; Wed, 05 Sep 2018 07:22:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=P/bSNVqgejZVzS7PqoeL8elFQ2QUJ2Qc9GJhqk87mF8=; b=tgUvX+GB4EHJRtBG/fluSqLjCKHwKXvfFmdMBFnxYELt2lfayX/KNm7McTQCylPTMh EYk3ErUwIZ1DrwyjVSyopfsIaKPcRKWYTR2mq/4HYY43fClEsNduAgphUem6jLUc7mfP Yd88XQ1neGkno/jXv6tSpUvzgTLlc/UKJ7wQBdELdHO8oMTFfAVti2PrBPu8Sbr2UhVf mPTn61SS6oi9/XgXWpZ04bZ+63Vn2O0piLs6Vtv+1sfVdGjklRl0k7FShH/z7ezC+nWT tN9RJRQdC9t1I1NSBwForB7TkDXBjFjI4z5bjF2lWpT3W+JzFHzq4xxbbgu8JJfJn+7h BkWw== X-Gm-Message-State: APzg51BcV+6FztftWrBNLB0mEob3+WL5Ri9sd0PISaTxGXy8CVXT5E8o zx3O6EV79Eqm2OObUYrbeDPn/3xyCn8= X-Google-Smtp-Source: ANB0Vdb5Gnxk3RxRIEQAia41sUj9KDGjJvqxmiUCsGd2CTKuXdgodZgGq+kdQdcEepp7luyHpw4TCg== X-Received: by 2002:a1c:8291:: with SMTP id e139-v6mr415122wmd.39.1536157356394; Wed, 05 Sep 2018 07:22:36 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.34 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:35 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:20 +0100 Message-Id: <20180905142222.3251-6-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 5/7] drm/i915: Add timeline barrier support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Timeline barrier allows serialization between different timelines. After calling i915_timeline_set_barrier with a request, all following submissions on this timeline will be set up as depending on this request, or barrier. Once the barrier has been completed it automatically gets cleared and things continue as normal. This facility will be used by the upcoming context SSEU code. v2: * Assert barrier has been retired on timeline_fini. (Chris Wilson) * Fix mock_timeline. Signed-off-by: Tvrtko Ursulin Suggested-by: Chris Wilson Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_request.c | 13 +++++++++ drivers/gpu/drm/i915/i915_timeline.c | 3 +++ drivers/gpu/drm/i915/i915_timeline.h | 27 +++++++++++++++++++ .../gpu/drm/i915/selftests/mock_timeline.c | 2 ++ 4 files changed, 45 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 09ed48833b54..245f43f3bcc1 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -644,6 +644,15 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } +static int add_timeline_barrier(struct i915_request *rq) +{ + struct i915_request *barrier = + i915_gem_active_raw(&rq->timeline->barrier, + &rq->i915->drm.struct_mutex); + + return barrier ? i915_request_await_dma_fence(rq, &barrier->fence) : 0; +} + /** * i915_request_alloc - allocate a request structure * @@ -806,6 +815,10 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx) */ rq->head = rq->ring->emit; + ret = add_timeline_barrier(rq); + if (ret) + goto err_unwind; + /* Unconditionally invalidate GPU caches and TLBs. */ ret = engine->emit_flush(rq, EMIT_INVALIDATE); if (ret) diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c index 4667cc08c416..5a87c5bd5154 100644 --- a/drivers/gpu/drm/i915/i915_timeline.c +++ b/drivers/gpu/drm/i915/i915_timeline.c @@ -37,6 +37,8 @@ void i915_timeline_init(struct drm_i915_private *i915, INIT_LIST_HEAD(&timeline->requests); i915_syncmap_init(&timeline->sync); + + init_request_active(&timeline->barrier, NULL); } /** @@ -69,6 +71,7 @@ void i915_timelines_park(struct drm_i915_private *i915) void i915_timeline_fini(struct i915_timeline *timeline) { GEM_BUG_ON(!list_empty(&timeline->requests)); + GEM_BUG_ON(i915_gem_active_isset(&timeline->barrier)); i915_syncmap_free(&timeline->sync); diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h index a2c2c3ab5fb0..c8526ab44dbc 100644 --- a/drivers/gpu/drm/i915/i915_timeline.h +++ b/drivers/gpu/drm/i915/i915_timeline.h @@ -72,6 +72,16 @@ struct i915_timeline { */ u32 global_sync[I915_NUM_ENGINES]; + /** + * Barrier provides the ability to serialize ordering between different + * timelines. + * + * Users can call i915_timeline_set_barrier which will make all + * subsequent submissions be executed only after this barrier has been + * completed. + */ + struct i915_gem_active barrier; + struct list_head link; const char *name; @@ -125,4 +135,21 @@ static inline bool i915_timeline_sync_is_later(struct i915_timeline *tl, void i915_timelines_park(struct drm_i915_private *i915); +/** + * i915_timeline_set_barrier - orders submission between different timelines + * @timeline: timeline to set the barrier on + * @rq: request after which new submissions can proceed + * + * Sets the passed in request as the serialization point for all subsequent + * submissions on @timeline. Subsequent requests will not be submitted to GPU + * until the barrier has been completed. + */ +static inline void +i915_timeline_set_barrier(struct i915_timeline *timeline, + struct i915_request *rq) +{ + GEM_BUG_ON(timeline->fence_context == rq->timeline->fence_context); + i915_gem_active_set(&timeline->barrier, rq); +} + #endif diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c index dcf3b16f5a07..a718b64c988e 100644 --- a/drivers/gpu/drm/i915/selftests/mock_timeline.c +++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c @@ -19,6 +19,8 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context) i915_syncmap_init(&timeline->sync); + init_request_active(&timeline->barrier, NULL); + INIT_LIST_HEAD(&timeline->link); } From patchwork Wed Sep 5 14:22:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588981 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D6C8B112B for ; Wed, 5 Sep 2018 14:22:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C267C2A2DA for ; Wed, 5 Sep 2018 14:22:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B6FD42A309; Wed, 5 Sep 2018 14:22:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id AA1DF2A2DA for ; Wed, 5 Sep 2018 14:22:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CB88C6E4AA; Wed, 5 Sep 2018 14:22:41 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) by gabe.freedesktop.org (Postfix) with ESMTPS id E021F6E4D0 for ; Wed, 5 Sep 2018 14:22:39 +0000 (UTC) Received: by mail-wm0-x243.google.com with SMTP id y2-v6so8207866wma.1 for ; Wed, 05 Sep 2018 07:22:39 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=h0L3kALhXqSczb+k65deUCpbquUK9N01z5B9VYphpbA=; b=LTfVtF8LE+mPuz23E0jZu+T7Oe7hZTrka0qzP/uiSeHznQGHFt5gvLHFOu6GcBt1c1 LkI8K0zazGbJY7Gt3W9U9BlB+D4+JRt/28TReFoLvUlpmEpn6gXfR7QuruQ+yuINhue1 /UppQIQBPaMNDjf2j/qzn+dzE4ecucb6JSAnoRN2dGnjnohDwoXvIEnI3stT0r+FEtRh BJK5FdBx9DvaoK6kxO55FPO6xDHG4LqnI1uCfhHh6Fyoq2wSH5ChpJmNKDWXPTuFP55/ uQ/Za7QcEZbgLWAkeZt9m/qDa3YF0TO3g5pvFVSq2MVLr0tDgdNg4qSuHsYM6RImxDjn mshA== X-Gm-Message-State: APzg51DcF7xGz7KulfGIbifVLU27bIgIDCSuEjeLEsNlUrl/FXlrbAi1 dqytXjORJtPboSk+OsHrq3xNbIYwzJ8= X-Google-Smtp-Source: ANB0VdbNxgFdTWRgeAso/pYkmtPX6i6cJE/10BJ6zXAu6gIkSwX3vlBhZDfH5s+/9GTVxHIs5ktgJA== X-Received: by 2002:a1c:dac9:: with SMTP id r192-v6mr369507wmg.141.1536157358097; Wed, 05 Sep 2018 07:22:38 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:37 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:21 +0100 Message-Id: <20180905142222.3251-7-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 6/7] drm/i915: Expose RPCS (SSEU) configuration to userspace X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Wilson We want to allow userspace to reconfigure the subslice configuration for its own use case. To do so, we expose a context parameter to allow adjustment of the RPCS register stored within the context image (and currently not accessible via LRI). If the context is adjusted before first use, the adjustment is for "free"; otherwise if the context is active we flush the context off the GPU (stalling all users) and forcing the GPU to save the context to memory where we can modify it and so ensure that the register is reloaded on next execution. The overhead of managing additional EU subslices can be significant, especially in multi-context workloads. Non-GPGPU contexts should preferably disable the subslices it is not using, and others should fine-tune the number to match their workload. We expose complete control over the RPCS register, allowing configuration of slice/subslice, via masks packed into a u64 for simplicity. For example, struct drm_i915_gem_context_param arg; struct drm_i915_gem_context_param_sseu sseu = { .class = 0, .instance = 0, }; memset(&arg, 0, sizeof(arg)); arg.ctx_id = ctx; arg.param = I915_CONTEXT_PARAM_SSEU; arg.value = (uintptr_t) &sseu; if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &arg) == 0) { sseu.packed.subslice_mask = 0; drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg); } could be used to disable all subslices where supported. v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel) v3: Add ability to program this per engine (Chris) v4: Move most get_sseu() into i915_gem_context.c (Lionel) v5: Validate sseu configuration against the device's capabilities (Lionel) v6: Change context powergating settings through MI_SDM on kernel context (Chris) v7: Synchronize the requests following a powergating setting change using a global dependency (Chris) Iterate timelines through dev_priv.gt.active_rings (Tvrtko) Disable RPCS configuration setting for non capable users (Lionel/Tvrtko) v8: s/union intel_sseu/struct intel_sseu/ (Lionel) s/dev_priv/i915/ (Tvrtko) Change uapi class/instance fields to u16 (Tvrtko) Bump mask fields to 64bits (Lionel) Don't return EPERM when dynamic sseu is disabled (Tvrtko) v9: Import context image into kernel context's ppgtt only when reconfiguring powergated slice/subslices (Chris) Use aliasing ppgtt when needed (Michel) Tvrtko Ursulin: v10: * Update for upstream changes. * Request submit needs a RPM reference. * Reject on !FULL_PPGTT for simplicity. * Pull out get/set param to helpers for readability and less indent. * Use i915_request_await_dma_fence in add_global_barrier to skip waits on the same timeline and avoid GEM_BUG_ON. * No need to explicitly assign a NULL pointer to engine in legacy mode. * No need to move gen8_make_rpcs up. * Factored out global barrier as prep patch. * Allow to only CAP_SYS_ADMIN if !Gen11. v11: * Remove engine vfunc in favour of local helper. (Chris Wilson) * Stop retiring requests before updates since it is not needed (Chris Wilson) * Implement direct CPU update path for idle contexts. (Chris Wilson) * Left side dependency needs only be on the same context timeline. (Chris Wilson) * It is sufficient to order the timeline. (Chris Wilson) * Reject !RCS configuration attempts with -ENODEV for now. v12: * Rebase for make_rpcs. v13: * Centralize SSEU normalization to make_rpcs. * Type width checking (uAPI <-> implementation). * Gen11 restrictions uAPI checks. * Gen11 subslice count differences handling. Chris Wilson: * args->size handling fixes. * Update context image from GGTT. * Postpone context image update to pinning. * Use i915_gem_active_raw instead of last_request_on_engine. v14: * Add activity tracker on intel_context to fix the lifetime issues and simplify the code. (Chris Wilson) v15: * Fix context pin leak if no space in ring by simplifying the context pinning sequence. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899 Issue: https://github.com/intel/media-driver/issues/267 Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Cc: Dmitry Rogozhkin Cc: Tvrtko Ursulin Cc: Zhipeng Gong Cc: Joonas Lahtinen Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_context.c | 303 +++++++++++++++++++++++- drivers/gpu/drm/i915/i915_gem_context.h | 6 + drivers/gpu/drm/i915/intel_lrc.c | 4 +- include/uapi/drm/i915_drm.h | 43 ++++ 4 files changed, 353 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index ca2c8fcd1090..aa1f34e63080 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -90,6 +90,7 @@ #include #include "i915_drv.h" #include "i915_trace.h" +#include "intel_lrc_reg.h" #include "intel_workarounds.h" #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1 @@ -322,6 +323,14 @@ static u32 default_desc_template(const struct drm_i915_private *i915, return desc; } +static void intel_context_retire(struct i915_gem_active *active, + struct i915_request *rq) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + + intel_context_unpin(ce); +} + static struct i915_gem_context * __create_hw_context(struct drm_i915_private *dev_priv, struct drm_i915_file_private *file_priv) @@ -345,6 +354,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, ce->gem_context = ctx; /* Use the whole device by default */ ce->sseu = intel_device_default_sseu(dev_priv); + + init_request_active(&ce->active, intel_context_retire); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); @@ -846,6 +857,48 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, return 0; } +static int get_sseu(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + struct drm_i915_gem_context_param_sseu user_sseu; + struct intel_engine_cs *engine; + struct intel_context *ce; + + if (args->size == 0) + goto out; + else if (args->size < sizeof(user_sseu)) + return -EINVAL; + + if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value), + sizeof(user_sseu))) + return -EFAULT; + + if (user_sseu.rsvd1 || user_sseu.rsvd2) + return -EINVAL; + + engine = intel_engine_lookup_user(ctx->i915, + user_sseu.class, + user_sseu.instance); + if (!engine) + return -EINVAL; + + ce = to_intel_context(ctx, engine); + + user_sseu.slice_mask = ce->sseu.slice_mask; + user_sseu.subslice_mask = ce->sseu.subslice_mask; + user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice; + user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice; + + if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu, + sizeof(user_sseu))) + return -EFAULT; + +out: + args->size = sizeof(user_sseu); + + return 0; +} + int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -858,15 +911,17 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, if (!ctx) return -ENOENT; - args->size = 0; switch (args->param) { case I915_CONTEXT_PARAM_BAN_PERIOD: ret = -EINVAL; break; case I915_CONTEXT_PARAM_NO_ZEROMAP: + args->size = 0; args->value = ctx->flags & CONTEXT_NO_ZEROMAP; break; case I915_CONTEXT_PARAM_GTT_SIZE: + args->size = 0; + if (ctx->ppgtt) args->value = ctx->ppgtt->vm.total; else if (to_i915(dev)->mm.aliasing_ppgtt) @@ -875,14 +930,20 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, args->value = to_i915(dev)->ggtt.vm.total; break; case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE: + args->size = 0; args->value = i915_gem_context_no_error_capture(ctx); break; case I915_CONTEXT_PARAM_BANNABLE: + args->size = 0; args->value = i915_gem_context_is_bannable(ctx); break; case I915_CONTEXT_PARAM_PRIORITY: + args->size = 0; args->value = ctx->sched.priority; break; + case I915_CONTEXT_PARAM_SSEU: + ret = get_sseu(ctx, args); + break; default: ret = -EINVAL; break; @@ -892,6 +953,242 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, return ret; } +static int gen8_emit_rpcs_config(struct i915_request *rq, + struct intel_context *ce, + struct intel_sseu sseu) +{ + u64 offset; + u32 *cs; + + cs = intel_ring_begin(rq, 4); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + offset = ce->state->node.start + + LRC_STATE_PN * PAGE_SIZE + + (CTX_R_PWR_CLK_STATE + 1) * 4; + + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; + *cs++ = lower_32_bits(offset); + *cs++ = upper_32_bits(offset); + *cs++ = gen8_make_rpcs(rq->i915, &sseu); + + intel_ring_advance(rq, cs); + + return 0; +} + +static int +gen8_modify_rpcs_gpu(struct intel_context *ce, + struct intel_engine_cs *engine, + struct intel_sseu sseu) +{ + struct drm_i915_private *i915 = engine->i915; + struct i915_request *rq, *prev; + int ret; + + GEM_BUG_ON(!ce->pin_count); + + lockdep_assert_held(&i915->drm.struct_mutex); + + /* Submitting requests etc needs the hw awake. */ + intel_runtime_pm_get(i915); + + rq = i915_request_alloc(engine, i915->kernel_context); + if (IS_ERR(rq)) { + ret = PTR_ERR(rq); + goto out_put; + } + + ret = gen8_emit_rpcs_config(rq, ce, sseu); + if (ret) + goto out_add; + + /* Queue this switch after all other activity by this context. */ + prev = i915_gem_active_raw(&ce->ring->timeline->last_request, + &i915->drm.struct_mutex); + if (prev && !i915_request_completed(prev)) + i915_sw_fence_await_sw_fence_gfp(&rq->submit, + &prev->submit, + I915_FENCE_GFP); + + /* Order all following requests to be after. */ + i915_timeline_set_barrier(ce->ring->timeline, rq); + + /* + * Guarantee context image and the timeline remains pinned until the + * modifying request is retired by setting the ce activity tracker. + * + * But we only need to take one pin on the account of it. Or in other + * words transfer the pinned ce object to tracked active request. + */ + if (!i915_gem_active_isset(&ce->active)) + __intel_context_pin(ce); + i915_gem_active_set(&ce->active, rq); + +out_add: + i915_request_add(rq); +out_put: + intel_runtime_pm_put(i915); + + return ret; +} + +static int +i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx, + struct intel_engine_cs *engine, + struct intel_sseu sseu) +{ + struct intel_context *ce = to_intel_context(ctx, engine); + int ret = 0; + + GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8); + GEM_BUG_ON(engine->id != RCS); + + lockdep_assert_held(&ctx->i915->drm.struct_mutex); + + /* Nothing to do if unmodified. */ + if (!memcmp(&ce->sseu, &sseu, sizeof(sseu))) + return 0; + + /* + * If context is not idle we have to submit an ordered request to modify + * its context image via the kernel context. Pristine and idle contexts + * will be configured on pinning. + */ + if (ce->pin_count) + ret = gen8_modify_rpcs_gpu(ce, engine, sseu); + + if (!ret) + ce->sseu = sseu; + + return ret; +} + +static int +user_to_context_sseu(struct drm_i915_private *i915, + const struct drm_i915_gem_context_param_sseu *user, + struct intel_sseu *context) +{ + const struct sseu_dev_info *device = &INTEL_INFO(i915)->sseu; + + /* No zeros in any field. */ + if (!user->slice_mask || !user->subslice_mask || + !user->min_eus_per_subslice || !user->max_eus_per_subslice) + return -EINVAL; + + /* Max > min. */ + if (user->max_eus_per_subslice < user->min_eus_per_subslice) + return -EINVAL; + + /* Check validity against hardware. */ + if (user->slice_mask & ~device->slice_mask) + return -EINVAL; + + if (user->subslice_mask & ~device->subslice_mask[0]) + return -EINVAL; + + if (user->max_eus_per_subslice > device->max_eus_per_subslice) + return -EINVAL; + + /* + * Some future proofing on the types since the uAPI is wider than the + * current internal implementation. + */ + if (WARN_ON((fls(user->slice_mask) > + sizeof(context->slice_mask) * BITS_PER_BYTE) || + (fls(user->subslice_mask) > + sizeof(context->subslice_mask) * BITS_PER_BYTE) || + overflows_type(user->min_eus_per_subslice, + context->min_eus_per_subslice) || + overflows_type(user->max_eus_per_subslice, + context->max_eus_per_subslice))) + return -EINVAL; + + context->slice_mask = user->slice_mask; + context->subslice_mask = user->subslice_mask; + context->min_eus_per_subslice = user->min_eus_per_subslice; + context->max_eus_per_subslice = user->max_eus_per_subslice; + + /* Part specific restrictions. */ + if (IS_GEN11(i915)) { + unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]); + unsigned int req_s = hweight8(context->slice_mask); + unsigned int req_ss = hweight8(context->subslice_mask); + + /* + * Only full subslice enablement is possible if more than one + * slice is turned on. + */ + if (req_s > 1 && req_ss != hw_ss_per_s) + return -EINVAL; + + /* + * If more than four (SScount bitfield limit) subslices are + * requested then the number has to be even. + */ + if (req_ss > 4 && (req_ss & 1)) + return -EINVAL; + + /* + * If only one slice is enabled subslice count must be at most + * half of the all available subslices. + */ + if (req_s == 1 && req_ss > (hw_ss_per_s / 2)) + return -EINVAL; + } + + return 0; +} + +static int set_sseu(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + struct drm_i915_private *i915 = ctx->i915; + struct drm_i915_gem_context_param_sseu user_sseu; + struct intel_engine_cs *engine; + struct intel_sseu sseu; + int ret; + + if (args->size < sizeof(user_sseu)) + return -EINVAL; + + if (INTEL_GEN(i915) < 8) + return -ENODEV; + + if (!IS_GEN11(i915) && !capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value), + sizeof(user_sseu))) + return -EFAULT; + + if (user_sseu.rsvd1 || user_sseu.rsvd2) + return -EINVAL; + + engine = intel_engine_lookup_user(i915, + user_sseu.class, + user_sseu.instance); + if (!engine) + return -EINVAL; + + /* Only render engine supports RPCS configuration. */ + if (engine->class != RENDER_CLASS) + return -ENODEV; + + ret = user_to_context_sseu(i915, &user_sseu, &sseu); + if (ret) + return ret; + + ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu); + if (ret) + return ret; + + args->size = sizeof(user_sseu); + + return 0; +} + int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -957,7 +1254,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, ctx->sched.priority = priority; } break; - + case I915_CONTEXT_PARAM_SSEU: + ret = set_sseu(ctx, args); + break; default: ret = -EINVAL; break; diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index 79d2e8f62ad1..968e1d47d944 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -165,6 +165,12 @@ struct i915_gem_context { u64 lrc_desc; int pin_count; + /** + * active: Active tracker for the external rq activity on this + * intel_context object. + */ + struct i915_gem_active active; + const struct intel_context_ops *ops; /** sseu: Control eu/slice partitioning */ diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 9709c1fbe836..3c85392a3109 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2538,7 +2538,9 @@ u32 gen8_make_rpcs(struct drm_i915_private *dev_priv, * subslices are enabled, or a count between one and four on the first * slice. */ - if (IS_GEN11(dev_priv) && slices == 1 && subslices >= 4) { + if (IS_GEN11(dev_priv) && + slices == 1 && + subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) { GEM_BUG_ON(subslices & 1); subslice_pg = false; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index a4446f452040..e195c38b15a6 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1478,9 +1478,52 @@ struct drm_i915_gem_context_param { #define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */ #define I915_CONTEXT_DEFAULT_PRIORITY 0 #define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */ + /* + * When using the following param, value should be a pointer to + * drm_i915_gem_context_param_sseu. + */ +#define I915_CONTEXT_PARAM_SSEU 0x7 __u64 value; }; +struct drm_i915_gem_context_param_sseu { + /* + * Engine class & instance to be configured or queried. + */ + __u16 class; + __u16 instance; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd1; + + /* + * Mask of slices to enable for the context. Valid values are a subset + * of the bitmask value returned for I915_PARAM_SLICE_MASK. + */ + __u64 slice_mask; + + /* + * Mask of subslices to enable for the context. Valid values are a + * subset of the bitmask value return by I915_PARAM_SUBSLICE_MASK. + */ + __u64 subslice_mask; + + /* + * Minimum/Maximum number of EUs to enable per subslice for the + * context. min_eus_per_subslice must be inferior or equal to + * max_eus_per_subslice. + */ + __u16 min_eus_per_subslice; + __u16 max_eus_per_subslice; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd2; +}; + enum drm_i915_oa_format { I915_OA_FORMAT_A13 = 1, /* HSW only */ I915_OA_FORMAT_A29, /* HSW only */ From patchwork Wed Sep 5 14:22:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10588983 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0DF165A4 for ; Wed, 5 Sep 2018 14:22:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EB0602A2DA for ; Wed, 5 Sep 2018 14:22:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DFDF22A2FC; Wed, 5 Sep 2018 14:22:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 9A4A92A306 for ; Wed, 5 Sep 2018 14:22:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1EBE66E4D0; Wed, 5 Sep 2018 14:22:42 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id 77F886E4D4 for ; Wed, 5 Sep 2018 14:22:40 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id g33-v6so7913669wrd.1 for ; Wed, 05 Sep 2018 07:22:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cQ43TgWJR2o87BI18sXJ8UmJqeDOxTLmpxHwgoUNmUM=; b=eFFc/oj4lUZ0Wdx2hcr1h5ukj0Q+P93F7MONkciOa64sUom70PoVPCq1HwdALXhVsi IIxlyv4V/6PJsmHLSGSBY0jGUzq3ENkm2/6W4Y+vmfEU2lFSrEf5SrunnbmEXG4GsIj2 U1drhzs5PHQn1v/rSywO5khAtoXfI/wbUaAxWoWqMiTqAGFEUTAw40N7ynnmxigQuhPx d4mIGW69OubAktTVrS3PErWyLdxXLkeEQyqPp2AMaUwNiRcmsx9zbLy/WAUD0GX+uigu S4y4XC6sJIBu2wTz/oG8pgd/3Urbcv4Iu1ot3kjwdYvxj9f5NuxTq45g9AqDMUILe0j6 4YJQ== X-Gm-Message-State: APzg51D9sLgY0ipbZ8g6GEcmy6tynDB/ipCOt/EGtOAcLP4TnvKIuhHT Gl7fCDzZLC5Yb/NaBh5wZ4F9TdH5/CQ= X-Google-Smtp-Source: ANB0VdZGGOaAAK2BOHjX24W2IWVESLu3CLiXEjgFf5lmGynwiaO2dzGFIH+FdtQsH8OMGD4ALtBsng== X-Received: by 2002:adf:eb87:: with SMTP id t7-v6mr26955315wrn.123.1536157359033; Wed, 05 Sep 2018 07:22:39 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id x125-v6sm2851438wmg.27.2018.09.05.07.22.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Sep 2018 07:22:38 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Wed, 5 Sep 2018 15:22:22 +0100 Message-Id: <20180905142222.3251-8-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> References: <20180905142222.3251-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 7/7] drm/i915/icl: Support co-existance between per-context SSEU and OA X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin When OA is active we want to lock the powergating configuration, but on Icelake users like media stack will have issues if we lock to the full device configuration. Instead lock to a subset of (sub)slices which are currently a known working configuration for all users. Signed-off-by: Tvrtko Ursulin Cc: Lionel Landwerlin --- drivers/gpu/drm/i915/intel_lrc.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3c85392a3109..19c9c46308e5 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2502,13 +2502,28 @@ u32 gen8_make_rpcs(struct drm_i915_private *dev_priv, /* * If i915/perf is active, we want a stable powergating configuration - * on the system. The most natural configuration to take in that case - * is the default (i.e maximum the hardware can do). + * on the system. + * + * We could choose full enablement, but on ICL we know there are use + * cases which disable slices for functional, apart for performance + * reasons. So in this case we select a known stable subset. */ - if (unlikely(dev_priv->perf.oa.exclusive_stream)) - ctx_sseu = intel_device_default_sseu(dev_priv); - else + if (!dev_priv->perf.oa.exclusive_stream) { ctx_sseu = *req_sseu; + } else { + ctx_sseu = intel_device_default_sseu(dev_priv); + + if (IS_GEN11(dev_priv)) { + /* + * We only need subslice count so it doesn't matter + * which ones we select - just turn of low bits in the + * amount of half of all available subslices per slice. + */ + ctx_sseu.subslice_mask = + ~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2)); + ctx_sseu.slice_mask = 0x1; + } + } slices = hweight8(ctx_sseu.slice_mask); subslices = hweight8(ctx_sseu.subslice_mask);