From patchwork Fri Aug 31 11:53:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10583641 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1B3C14E1 for ; Fri, 31 Aug 2018 11:53:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DA58F283C7 for ; Fri, 31 Aug 2018 11:53:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CE9A529D58; Fri, 31 Aug 2018 11:53:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4BCCA2BA56 for ; Fri, 31 Aug 2018 11:53:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id ACECB6E8BD; Fri, 31 Aug 2018 11:53:43 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) by gabe.freedesktop.org (Postfix) with ESMTPS id AC7BC6E8BD for ; Fri, 31 Aug 2018 11:53:41 +0000 (UTC) Received: by mail-wm0-x241.google.com with SMTP id s12-v6so4983724wmc.0 for ; Fri, 31 Aug 2018 04:53:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=Qfaqn+RPzbRbmqVlx+H8i5hxC592eIrRHq9D9yy69fw=; b=oy+dI4PK7qufAvlj+bqMOJStFYSM2SRx0cLvwRnU5+DIFd6O+yHlvHXBoihWLa6Wyv UGCNZmwvmrq/hDiRfKHDdmi3GOU3Bkh+IODuseWqPI7gQmNyqJQ0kcKCb/IUBzUP4kXS 01G+Go1uqptPyn7pBFD3iyOLatkF85fSkBEEVuTmF6E4o3agmskVE8nNFgj8Jz1BEhQi DY+A6ikRk6HMVFspcSmk/QZ8o+x9Sl5NEiv40X5Rzqj1Y6Ys1/odVDcxpZs3mRdEAR+t 1S+AROLoySb8Vd45i+yJ9hyOQbSQVWGn1YY6l5vk7y5NqANpWUadMCYJ8GGscGDqmMf0 39ZA== X-Gm-Message-State: APzg51BZxCXmzQ1NtPk+bLrFLDFM8IB4OgqOJIfyIHjGK37Cn7CzyFdH muXhdk+QqFriS/l39ylbIxcE/F9PSJI= X-Google-Smtp-Source: ANB0VdaJ5jzTJgq0gDECdUDjkx2GKJYUkRIgB3JJmTf7C+1i82JAQi7YVES7xFdJbyuivgKNxKv7uQ== X-Received: by 2002:a1c:7915:: with SMTP id l21-v6mr4870102wme.136.1535716420195; Fri, 31 Aug 2018 04:53:40 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id t9-v6sm24319063wra.91.2018.08.31.04.53.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Aug 2018 04:53:39 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Fri, 31 Aug 2018 12:53:34 +0100 Message-Id: <20180831115334.2274-1-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 Subject: [Intel-gfx] [PATCH] drm/i915/icl: Fix context RPCS programming X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin There are two issues with the current RPCS programming for Icelake: Expansion of the slice count bitfield has been missed, as well as the required programming workaround for the subslice count bitfield size limitation. 1) Bitfield width for configuring the active slice count has grown so we need to program the GEN8_R_PWR_CLK_STATE accordingly. Current code was always requesting eight times the number of slices (due writting to a bitfield starting three bits higher than it should). These requests were luckily a) capped by the hardware to the available number of slices, and b) we haven't yet exported the code to ask for reduced slice configurations. Due both of the above there was no impact from this incorrect programming but we should still fix it. 2) Due subslice count bitfield being only three bits wide and furthermore capped to a maximum documented value of four, special programming workaround is needed to enable more than four subslices. With this programming driver has to consider the GT configuration as 2x4x8, while the hardware internally translates this to 1x8x8. A limitation stemming from this is that either a subslice count between one and four can be selected, or a subslice count equaling the total number of subslices in all selected slices. In other words, odd subslice counts greater than four are impossible, as are odd subslice counts greater than a single slice subslice count. This also had no impact in the current code base due breakage from 1) always reqesting more than one slice. While fixing this we also add some asserts to flag up any future bitfield overflows. Signed-off-by: Tvrtko Ursulin Bspec: 12247 Reported-by: tony.ye@intel.com Suggested-by: Lionel Landwerlin Cc: Lionel Landwerlin Cc: tony.ye@intel.com Cc: Mika Kuoppala Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_reg.h | 2 + drivers/gpu/drm/i915/intel_lrc.c | 89 +++++++++++++++++++++++++++----- 2 files changed, 78 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index f2321785cbd6..09bc8e730ee1 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -344,6 +344,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN8_RPCS_S_CNT_ENABLE (1 << 18) #define GEN8_RPCS_S_CNT_SHIFT 15 #define GEN8_RPCS_S_CNT_MASK (0x7 << GEN8_RPCS_S_CNT_SHIFT) +#define GEN11_RPCS_S_CNT_SHIFT 12 +#define GEN11_RPCS_S_CNT_MASK (0x3f << GEN11_RPCS_S_CNT_SHIFT) #define GEN8_RPCS_SS_CNT_ENABLE (1 << 11) #define GEN8_RPCS_SS_CNT_SHIFT 8 #define GEN8_RPCS_SS_CNT_MASK (0x7 << GEN8_RPCS_SS_CNT_SHIFT) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index f8ceb9c99dd6..323c46319cb8 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2480,6 +2480,9 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) static u32 make_rpcs(struct drm_i915_private *dev_priv) { + bool subslice_pg = INTEL_INFO(dev_priv)->sseu.has_subslice_pg; + u8 slices = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask); + u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]); u32 rpcs = 0; /* @@ -2489,6 +2492,38 @@ make_rpcs(struct drm_i915_private *dev_priv) if (INTEL_GEN(dev_priv) < 9) return 0; + /* + * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits + * wide and Icelake has up to eight subslices, specfial programming is + * needed in order to correctly enable all subslices. + * + * According to documentation software must consider the configuration + * as 2x4x8 and hardware will translate this to 1x8x8. + * + * Furthemore, even though SScount is three bits, maximum documented + * value for it is four. From this some rules/restrictions follow: + * + * 1. + * If enabled subslice count is greater than four, two whole slices must + * be enabled instead. + * + * 2. + * When more than one slice is enabled, hardware ignores the subslice + * count altogether. + * + * From these restrictions it follows that it is not possible to enable + * a count of subslices between the SScount maximum of four restriction, + * and the maximum available number on a particular SKU. Either all + * subslices are enabled, or a count between one and four on the first + * slice. + */ + if (IS_GEN11(dev_priv) && slices == 1 && subslices >= 4) { + GEM_BUG_ON(subslices & 1); + + subslice_pg = false; + slices *= 2; + } + /* * Starting in Gen9, render power gating can leave * slice/subslice/EU in a partially enabled state. We @@ -2496,24 +2531,52 @@ make_rpcs(struct drm_i915_private *dev_priv) * enablement. */ if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) { - rpcs |= GEN8_RPCS_S_CNT_ENABLE; - rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask) << - GEN8_RPCS_S_CNT_SHIFT; - rpcs |= GEN8_RPCS_ENABLE; + u32 mask; + + rpcs = slices; + + if (INTEL_GEN(dev_priv) >= 11) { + mask = GEN11_RPCS_S_CNT_MASK; + rpcs <<= GEN11_RPCS_S_CNT_SHIFT; + } else { + mask = GEN8_RPCS_S_CNT_MASK; + rpcs <<= GEN8_RPCS_S_CNT_SHIFT; + } + + GEM_BUG_ON(rpcs & ~mask); + rpcs &= mask; + + rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE; } - if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) { - rpcs |= GEN8_RPCS_SS_CNT_ENABLE; - rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]) << - GEN8_RPCS_SS_CNT_SHIFT; - rpcs |= GEN8_RPCS_ENABLE; + if (subslice_pg) { + u32 val = subslices; + + val <<= GEN8_RPCS_SS_CNT_SHIFT; + + GEM_BUG_ON(val & ~GEN8_RPCS_SS_CNT_MASK); + val &= GEN8_RPCS_SS_CNT_MASK; + + rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val; } if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) { - rpcs |= INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MIN_SHIFT; - rpcs |= INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MAX_SHIFT; + u32 val; + + val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << + GEN8_RPCS_EU_MIN_SHIFT; + GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); + val &= GEN8_RPCS_EU_MIN_MASK; + + rpcs |= val; + + val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << + GEN8_RPCS_EU_MAX_SHIFT; + GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); + val &= GEN8_RPCS_EU_MAX_MASK; + + rpcs |= val; + rpcs |= GEN8_RPCS_ENABLE; }