From patchwork Wed Apr 27 23:07:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12829828 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3B280C433FE for ; Wed, 27 Apr 2022 23:08:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0521510F823; Wed, 27 Apr 2022 23:07:54 +0000 (UTC) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6713910EDBA; Wed, 27 Apr 2022 23:07:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651100873; x=1682636873; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e1BxofhpDeBUp2vIZ0Sw+QC5nMLkRAec7DLj2JtpQj8=; b=LXfsXWfp2TvCFRyuD2w0TtLR/KUd6fdCYJ5Y+fDCPuD5AUSSjgdQT6BJ /Z2ueUALqzY9/Lgmb/FethVS0bYmJLuXd6qSg8Mhltd34O+hNbw3sTwNW Yh9cipy1ZfpNIhCOozSyRyKBXm8PWn/eTEnICTV3v5gLYny1pF1Ir2sES LQxWYFZa300lUehhKrAFjhTtmRJyTW3iTxrn1ZaVBzsrgnqE3oD9CiLTW wUdckgdw024N2rcvF41VshwzWi7SMBy/gptuOGUIxwulntTo14PKXM1Tv Sal34D23tmKRL1k0po5qrD/uIndnN9z7cQOp3udfkXld4RA0kZrhXe5PA Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10330"; a="265912013" X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="265912013" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="533495681" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:52 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH 1/5] drm/i915/sseu: Don't try to store EU mask internally in UAPI format Date: Wed, 27 Apr 2022 16:07:43 -0700 Message-Id: <20220427230747.906625-2-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427230747.906625-1-matthew.d.roper@intel.com> References: <20220427230747.906625-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tvrtko.ursulin@linux.intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Storing the EU mask internally in the same format the I915_QUERY topology queries use makes the final copy_to_user() a bit simpler, but makes the rest of the driver's SSEU more complicated. Given that modern platforms (gen11 and beyond) are architecturally guaranteed to have equivalent EU masks for every subslice, it also wastes quite a bit of space since we're storing a duplicate copy of the EU mask for every single subslice where we really only need to store one instance. Let's add a has_common_ss_eumask flag to the SSEU structure to determine which type of hardware we're working on. For the older pre-gen11 platforms the various subslices can have different EU masks so we use an array of u16[] to store each subslice's copy. For gen11 and beyond we'll only use index [0] of the array and not worry about copying the repeated value, except when converting into uapi form for the I915_QUERY ioctl. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +++++++++++++++++++++------- drivers/gpu/drm/i915/gt/intel_sseu.h | 21 ++++++++- drivers/gpu/drm/i915/i915_query.c | 8 ++-- 3 files changed, 73 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 9881a6790574..13387b4024ea 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -91,36 +91,70 @@ static int sseu_eu_idx(const struct sseu_dev_info *sseu, int slice, static u16 sseu_get_eus(const struct sseu_dev_info *sseu, int slice, int subslice) { - int i, offset = sseu_eu_idx(sseu, slice, subslice); - u16 eu_mask = 0; - - for (i = 0; i < sseu->eu_stride; i++) - eu_mask |= - ((u16)sseu->eu_mask[offset + i]) << (i * BITS_PER_BYTE); + if (!intel_sseu_has_subslice(sseu, slice, subslice)) + return 0; - return eu_mask; + if (sseu->has_common_ss_eumask) + return sseu->eu_mask[0]; + else + return sseu->eu_mask[slice * sseu->max_subslices + subslice]; } static void sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice, u16 eu_mask) { - int i, offset = sseu_eu_idx(sseu, slice, subslice); + WARN_ON(sseu->has_common_ss_eumask); + WARN_ON(sseu->max_eus_per_subslice > sizeof(sseu->eu_mask[0]) * BITS_PER_BYTE); - for (i = 0; i < sseu->eu_stride; i++) - sseu->eu_mask[offset + i] = - (eu_mask >> (BITS_PER_BYTE * i)) & 0xff; + sseu->eu_mask[slice * sseu->max_subslices + subslice] = + eu_mask & GENMASK(sseu->max_eus_per_subslice - 1, 0); } static u16 compute_eu_total(const struct sseu_dev_info *sseu) { u16 i, total = 0; + if (sseu->has_common_ss_eumask) + return intel_sseu_subslices_per_slice(sseu, 0) * + hweight16(sseu->eu_mask[0]); + for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++) - total += hweight8(sseu->eu_mask[i]); + total += hweight16(sseu->eu_mask[i]); return total; } +/** + * intel_sseu_copy_eumask_to_user - Copy EU mask into a userspace buffer + * @to: Pointer to userspace buffer to copy to + * @sseu: SSEU structure containing EU mask to copy + * + * Copies the EU mask to a userspace buffer in the format expected by + * the query ioctl's topology queries. + * + * Returns the result of the copy_to_user() operation. + */ +int intel_sseu_copy_eumask_to_user(void __user *to, + const struct sseu_dev_info *sseu) +{ + u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE] = {}; + int len = sseu->max_slices * sseu->max_subslices * sseu->eu_stride; + int s, ss, i; + + for (s = 0; s < sseu->max_slices; s++) { + for (ss = 0; ss < sseu->max_subslices; ss++) { + int offset = sseu_eu_idx(sseu, s, ss); + u16 mask = sseu_get_eus(sseu, s, ss); + + for (i = 0; i < sseu->eu_stride; i++) + eu_mask[offset + i] = + (mask >> (BITS_PER_BYTE * i)) & 0xff; + } + } + + return copy_to_user(to, eu_mask, len); +} + static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en) { u32 ss_mask; @@ -134,7 +168,7 @@ static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en) static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, u32 g_ss_en, u32 c_ss_en, u16 eu_en) { - int s, ss; + int s; /* g_ss_en/c_ss_en represent entire subslice mask across all slices */ GEM_BUG_ON(sseu->max_slices * sseu->max_subslices > @@ -162,11 +196,9 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, get_ss_stride_mask(sseu, s, g_ss_en | c_ss_en)); - - for (ss = 0; ss < sseu->max_subslices; ss++) - if (intel_sseu_has_subslice(sseu, s, ss)) - sseu_set_eus(sseu, s, ss, eu_en); } + sseu->has_common_ss_eumask = 1; + sseu->eu_mask[0] = eu_en; sseu->eu_per_subslice = hweight16(eu_en); sseu->eu_total = compute_eu_total(sseu); } diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 5c078df4729c..106726a2244e 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -57,7 +57,21 @@ struct sseu_dev_info { u8 subslice_mask[GEN_SS_MASK_SIZE]; u8 geometry_subslice_mask[GEN_SS_MASK_SIZE]; u8 compute_subslice_mask[GEN_SS_MASK_SIZE]; - u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE]; + + /* + * EU masks. Use has_common_ss_eumask to determine how the field + * will be interpreted. + * + * On pre-gen11 platforms, each subslice has independent EU fusing, so + * we store an array of u16's that are sufficient to represent each + * subslice's EU mask on pre-gen11 platforms. + * + * For gen11 and beyond, all subslices will always have the same set of + * enabled/disabled EUs so only eu_mask[0] is utilized; all other array + * entries are ignored. + */ + u16 eu_mask[GEN_MAX_HSW_SLICES * GEN_MAX_SS_PER_HSW_SLICE]; + u16 eu_total; u8 eu_per_subslice; u8 min_eu_in_pool; @@ -66,6 +80,8 @@ struct sseu_dev_info { u8 has_slice_pg:1; u8 has_subslice_pg:1; u8 has_eu_pg:1; + /* All subslices have the same set of enabled/disabled EUs? */ + u8 has_common_ss_eumask:1; /* Topology fields */ u8 max_slices; @@ -145,4 +161,7 @@ void intel_sseu_print_topology(struct drm_i915_private *i915, u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice); +int intel_sseu_copy_eumask_to_user(void __user *to, + const struct sseu_dev_info *sseu); + #endif /* __INTEL_SSEU_H__ */ diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index 7584cec53d5d..16f43bf32a05 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -76,10 +76,10 @@ static int fill_topology_info(const struct sseu_dev_info *sseu, subslice_mask, subslice_length)) return -EFAULT; - if (copy_to_user(u64_to_user_ptr(query_item->data_ptr + - sizeof(topo) + - slice_length + subslice_length), - sseu->eu_mask, eu_length)) + if (intel_sseu_copy_eumask_to_user(u64_to_user_ptr(query_item->data_ptr + + sizeof(topo) + + slice_length + subslice_length), + sseu)) return -EFAULT; return total_length; From patchwork Wed Apr 27 23:07:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12829826 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8D083C433EF for ; Wed, 27 Apr 2022 23:07:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D96A810F822; Wed, 27 Apr 2022 23:07:54 +0000 (UTC) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9E89E10E6A8; Wed, 27 Apr 2022 23:07:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651100873; x=1682636873; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Dr6RzPQTLcFgeteKkg2fEvYdVvXVCQCCLv4Mrji2CyQ=; b=P99r2MR58AXkAXQACvx7e/Id6PAKQ9EgZXiRjxk+91zlQ+fRCLzMJSu5 GvrK08alriUJAKHzuzMelQXEqAcQAT5pdmadCGkAsWuangNlIHgfrq3OC /s7uWV+jPYQnxekV72D0XpiTJUJx5UBJIuKJ05oFhNsuLd3FFSogvpUs5 KrhTUF9XcG7Y0cksBgfwHO3g7uS0JY4jKMn0uPKkiclheyAYDNcVJbJ5p 3QqJcVgEtMIYo+seq9nUFYBXqxQZrIH78jHdCYgmEa0s4hnAOSYq5jkj1 h4YgdpBNs9gTN0XYQ/bi3DxfMsg7GYV48yvN9LXbAXhO91pN9P0GdteLg Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10330"; a="265912014" X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="265912014" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="533495685" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH 2/5] drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK Date: Wed, 27 Apr 2022 16:07:44 -0700 Message-Id: <20220427230747.906625-3-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427230747.906625-1-matthew.d.roper@intel.com> References: <20220427230747.906625-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tvrtko.ursulin@linux.intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Slice/subslice/EU information should be obtained via the topology queries provided by the I915_QUERY interface; let's turn off support for the old GETPARAM lookups on Xe_HP and beyond where we can't return meaningful values. The slice mask lookup is meaningless since Xe_HP doesn't support traditional slices (and we make no attempt to return the various new units like gslices, cslices, mslices, etc.) here. The subslice mask lookup is even more problematic; given the distinct masks for geometry vs compute purposes, the combined mask returned here is likely not what userspace would want to act upon anyway. The value is also limited to 32-bits by the nature of the GETPARAM ioctl which is sufficient for the initial Xe_HP platforms, but is unable to convey the larger masks that will be needed on other upcoming platforms. Finally, the value returned here becomes even less meaningful when used on multi-tile platforms where each tile will have its own masks. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_getparam.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index c12a0adefda5..ac9767c56619 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -148,11 +148,19 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, value = intel_engines_has_context_isolation(i915); break; case I915_PARAM_SLICE_MASK: + /* Not supported from Xe_HP onward; use topology queries */ + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + return -EINVAL; + value = sseu->slice_mask; if (!value) return -ENODEV; break; case I915_PARAM_SUBSLICE_MASK: + /* Not supported from Xe_HP onward; use topology queries */ + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + return -EINVAL; + /* Only copy bits from the first slice */ memcpy(&value, sseu->subslice_mask, min(sseu->ss_stride, (u8)sizeof(value))); From patchwork Wed Apr 27 23:07:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12829831 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3CD61C433F5 for ; Wed, 27 Apr 2022 23:08:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2B1BE10F832; Wed, 27 Apr 2022 23:08:10 +0000 (UTC) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id CF75E10EDBA; Wed, 27 Apr 2022 23:07:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651100873; x=1682636873; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=lUf66ZnAVhl/rYa+kuUa5DCiytTqc7HQSC0tlHA4nBE=; b=k9hhy5jhSQPitKkVbGp/3wsN5HT3olNkN+4MByypn6TOI/DjBB+pERpv dC0atdvUKJSZsN5vz/m2HRqdJCj9HsutKrIPDN+NFomg9nd5ll/5ZRzds X2kDYLvZrfMkX6LKrgMPwzDxCfk+K1xxOE2LSeW8y4n2If1YUqftFX/Wh EUe/vOeuQ6qq+uKs5BkSl8alpQDlqepiO2Io/bHfGImyFpDcBNAllej9D rqP9jewSLZWBOAaFEig/i0vY4REyCXCWVn0XuLRBp20H7NDYNCMgYUqM+ qJtHDdmyoTbt3lZur9GRZPuAv5fXO15S+7NYEq9vpPLM1UkKf29GuO596 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10330"; a="265912015" X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="265912015" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="533495690" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH 3/5] drm/i915/xehp: Use separate sseu init function Date: Wed, 27 Apr 2022 16:07:45 -0700 Message-Id: <20220427230747.906625-4-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427230747.906625-1-matthew.d.roper@intel.com> References: <20220427230747.906625-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tvrtko.ursulin@linux.intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Xe_HP has enough fundamental differences from previous platforms that it makes sense to use a separate SSEU init function to keep things straightforward and easy to understand. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 85 ++++++++++++++++------------ 1 file changed, 48 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 13387b4024ea..ef66c2b8861a 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -203,13 +203,42 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, sseu->eu_total = compute_eu_total(sseu); } -static void gen12_sseu_info_init(struct intel_gt *gt) +static void xehp_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = >->info.sseu; struct intel_uncore *uncore = gt->uncore; u32 g_dss_en, c_dss_en = 0; u16 eu_en = 0; u8 eu_en_fuse; + int eu; + + /* + * The concept of slice has been removed in Xe_HP. To be compatible + * with prior generations, assume a single slice across the entire + * device. Then calculate out the DSS for each workload type within + * that software slice. + */ + intel_sseu_set_info(sseu, 1, 32, 16); + + g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); + c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE); + + eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; + + for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++) + if (eu_en_fuse & BIT(eu)) + eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); + + gen11_compute_sseu_info(sseu, 0x1, g_dss_en, c_dss_en, eu_en); +} + +static void gen12_sseu_info_init(struct intel_gt *gt) +{ + struct sseu_dev_info *sseu = >->info.sseu; + struct intel_uncore *uncore = gt->uncore; + u32 g_dss_en; + u16 eu_en = 0; + u8 eu_en_fuse; u8 s_en; int eu; @@ -217,43 +246,23 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS. * Instead of splitting these, provide userspace with an array * of DSS to more closely represent the hardware resource. - * - * In addition, the concept of slice has been removed in Xe_HP. - * To be compatible with prior generations, assume a single slice - * across the entire device. Then calculate out the DSS for each - * workload type within that software slice. */ - if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) - intel_sseu_set_info(sseu, 1, 32, 16); - else - intel_sseu_set_info(sseu, 1, 6, 16); + intel_sseu_set_info(sseu, 1, 6, 16); - /* - * As mentioned above, Xe_HP does not have the concept of a slice. - * Enable one for software backwards compatibility. - */ - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - s_en = 0x1; - else - s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & - GEN11_GT_S_ENA_MASK; + s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & + GEN11_GT_S_ENA_MASK; g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE); /* one bit per pair of EUs */ - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; - else - eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & - GEN11_EU_DIS_MASK); + eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & + GEN11_EU_DIS_MASK); for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); - gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en); + gen11_compute_sseu_info(sseu, s_en, g_dss_en, 0, eu_en); /* TGL only supports slice-level power gating */ sseu->has_slice_pg = 1; @@ -608,18 +617,20 @@ void intel_sseu_info_init(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; - if (IS_HASWELL(i915)) - hsw_sseu_info_init(gt); - else if (IS_CHERRYVIEW(i915)) - cherryview_sseu_info_init(gt); - else if (IS_BROADWELL(i915)) - bdw_sseu_info_init(gt); - else if (GRAPHICS_VER(i915) == 9) - gen9_sseu_info_init(gt); - else if (GRAPHICS_VER(i915) == 11) - gen11_sseu_info_init(gt); + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_sseu_info_init(gt); else if (GRAPHICS_VER(i915) >= 12) gen12_sseu_info_init(gt); + else if (GRAPHICS_VER(i915) >= 11) + gen11_sseu_info_init(gt); + else if (GRAPHICS_VER(i915) >= 9) + gen9_sseu_info_init(gt); + else if (IS_BROADWELL(i915)) + bdw_sseu_info_init(gt); + else if (IS_CHERRYVIEW(i915)) + cherryview_sseu_info_init(gt); + else if (IS_HASWELL(i915)) + hsw_sseu_info_init(gt); } u32 intel_sseu_make_rpcs(struct intel_gt *gt, From patchwork Wed Apr 27 23:07:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12829829 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1BB0BC433F5 for ; Wed, 27 Apr 2022 23:08:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B79FA10E697; Wed, 27 Apr 2022 23:08:08 +0000 (UTC) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0D7F110E6A8; Wed, 27 Apr 2022 23:07:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651100874; x=1682636874; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7GAo4EUPy8Xww+4o/i5/Q7d5Z82BEOCd27NtXVLtP08=; b=VpySa96iZrdlYRtmUoYLuv7WvnqIOdx0JOX8OWlDN8xHBK2PccBGbblF V8A61u7ZXi0Q6X7DeI/LfCIgBFjdepvNu++moD9NDQxmK/7boC5Ay/gXy iDywO1SBpZfpMFAOvTlDt5gefGLYsBfiqZVsGie8oAe8HyWXSKWAe9OQd aYFelisHkB3mcCyElkAPXFiZFhTuTQDMri9HpF6HvkSGmhe6N/dUvvlzX EB7fDmqUBzWp7CfGWDKwgERYBIfBqboJlTA8WGhg7z7C9/fl7EI7ey1xw Gezzg/VVH/QcEYc+dByB+DNgwIP4jf7k1GDHuJQ8NrqVGEgSe6F8xyPwr Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10330"; a="265912016" X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="265912016" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="533495694" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH 4/5] drm/i915/sseu: Simplify gen11+ SSEU handling Date: Wed, 27 Apr 2022 16:07:46 -0700 Message-Id: <20220427230747.906625-5-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427230747.906625-1-matthew.d.roper@intel.com> References: <20220427230747.906625-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tvrtko.ursulin@linux.intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Although gen11 and gen12 architectures supported the concept of multiple slices, in practice all the platforms that were actually designed only had a single slice (i.e., note the parameters to 'intel_sseu_set_info' that we pass for each platform). We can simplify the code slightly by dropping the multi-slice logic from gen11+ platforms. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 73 ++++++++++++++-------------- 1 file changed, 36 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index ef66c2b8861a..f7ff6a9f67b0 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -155,48 +155,32 @@ int intel_sseu_copy_eumask_to_user(void __user *to, return copy_to_user(to, eu_mask, len); } -static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en) -{ - u32 ss_mask; - - ss_mask = ss_en >> (s * sseu->max_subslices); - ss_mask &= GENMASK(sseu->max_subslices - 1, 0); - - return ss_mask; -} - -static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, +static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u32 g_ss_en, u32 c_ss_en, u16 eu_en) { - int s; + u32 valid_ss_mask = GENMASK(sseu->max_subslices - 1, 0); /* g_ss_en/c_ss_en represent entire subslice mask across all slices */ GEM_BUG_ON(sseu->max_slices * sseu->max_subslices > sizeof(g_ss_en) * BITS_PER_BYTE); - for (s = 0; s < sseu->max_slices; s++) { - if ((s_en & BIT(s)) == 0) - continue; + sseu->slice_mask |= BIT(0); + + /* + * XeHP introduces the concept of compute vs geometry DSS. To reduce + * variation between GENs around subslice usage, store a mask for both + * the geometry and compute enabled masks since userspace will need to + * be able to query these masks independently. Also compute a total + * enabled subslice count for the purposes of selecting subslices to + * use in a particular GEM context. + */ + intel_sseu_set_subslices(sseu, 0, sseu->compute_subslice_mask, + c_ss_en & valid_ss_mask); + intel_sseu_set_subslices(sseu, 0, sseu->geometry_subslice_mask, + g_ss_en & valid_ss_mask); + intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, + (g_ss_en | c_ss_en) & valid_ss_mask); - sseu->slice_mask |= BIT(s); - - /* - * XeHP introduces the concept of compute vs geometry DSS. To - * reduce variation between GENs around subslice usage, store a - * mask for both the geometry and compute enabled masks since - * userspace will need to be able to query these masks - * independently. Also compute a total enabled subslice count - * for the purposes of selecting subslices to use in a - * particular GEM context. - */ - intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask, - get_ss_stride_mask(sseu, s, c_ss_en)); - intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask, - get_ss_stride_mask(sseu, s, g_ss_en)); - intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, - get_ss_stride_mask(sseu, s, - g_ss_en | c_ss_en)); - } sseu->has_common_ss_eumask = 1; sseu->eu_mask[0] = eu_en; sseu->eu_per_subslice = hweight16(eu_en); @@ -229,7 +213,7 @@ static void xehp_sseu_info_init(struct intel_gt *gt) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); - gen11_compute_sseu_info(sseu, 0x1, g_dss_en, c_dss_en, eu_en); + gen11_compute_sseu_info(sseu, g_dss_en, c_dss_en, eu_en); } static void gen12_sseu_info_init(struct intel_gt *gt) @@ -249,8 +233,15 @@ static void gen12_sseu_info_init(struct intel_gt *gt) */ intel_sseu_set_info(sseu, 1, 6, 16); + /* + * Although gen12 architecture supported multiple slices, TGL, RKL, + * DG1, and ADL only had a single slice. + */ s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK; + if (s_en != 0x1) + drm_dbg(>->i915->drm, "Slice mask %#x is not the expected 0x1!\n", + s_en); g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); @@ -262,7 +253,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); - gen11_compute_sseu_info(sseu, s_en, g_dss_en, 0, eu_en); + gen11_compute_sseu_info(sseu, g_dss_en, 0, eu_en); /* TGL only supports slice-level power gating */ sseu->has_slice_pg = 1; @@ -281,14 +272,22 @@ static void gen11_sseu_info_init(struct intel_gt *gt) else intel_sseu_set_info(sseu, 1, 8, 8); + /* + * Although gen11 architecture supported multiple slices, ICL and + * EHL/JSL only had a single slice in practice. + */ s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK; + if (s_en != 0x1) + drm_dbg(>->i915->drm, "Slice mask %#x is not the expected 0x1!\n", + s_en); + ss_en = ~intel_uncore_read(uncore, GEN11_GT_SUBSLICE_DISABLE); eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & GEN11_EU_DIS_MASK); - gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en); + gen11_compute_sseu_info(sseu, ss_en, 0, eu_en); /* ICL has no power gating restrictions. */ sseu->has_slice_pg = 1; From patchwork Wed Apr 27 23:07:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12829830 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5037EC433EF for ; Wed, 27 Apr 2022 23:08:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2B8DD10F835; Wed, 27 Apr 2022 23:08:10 +0000 (UTC) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8A96710EE38; Wed, 27 Apr 2022 23:07:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651100874; x=1682636874; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vfgMjXLkB879LNZ1hYiJL2duTAW16jVh1S1D++sQFDs=; b=AhFTSzMyQErrBhAghtbC/RYDGVhjPuzrpbJMtHPC4odZxgU4/7rgd6a6 jHwIB3AIt4Lxl+PH6JwB1TBr2md3164wYnOAfWdZtP9zKCsikksIA2yzD JSsY/a/yT62REu53uamebLwVkdydE663Bn2VGbV9/Cxa2uwAGCUzWch1a 8b0/fsgibowblWkZOfNyizWu0K5t/K5npyjILH3EmSDech9cwcx2YaZD6 IiWpceujO/8UoaVpVv4QA89zeMr/F2mBiaKxImw5FTCDewN3s4IZ37EwM vgXZtbJ9R1OvjUWMpjqmhAP+XQx0K2E5hhSZnLo34GBcAbH49vlp0Pk0u A==; X-IronPort-AV: E=McAfee;i="6400,9594,10330"; a="265912017" X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="265912017" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:54 -0700 X-IronPort-AV: E=Sophos;i="5.90,294,1643702400"; d="scan'208";a="533495698" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2022 16:07:53 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH 5/5] drm/i915/sseu: Disassociate internal subslice mask representation from uapi Date: Wed, 27 Apr 2022 16:07:47 -0700 Message-Id: <20220427230747.906625-6-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427230747.906625-1-matthew.d.roper@intel.com> References: <20220427230747.906625-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tvrtko.ursulin@linux.intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than storing subslice masks internally as u8[] (inside the sseu structure) and u32 (everywhere else), let's move over to using an intel_sseu_ss_mask_t typedef compatible with the operations in linux/bitmap.h. We're soon going to start adding code for a new platform where subslice masks are spread across two 32-bit registers (requiring 64 bits to represent), and we expect future platforms will likely take this even farther, requiring bitmask storage larger than a simple u64 can hold. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/intel_gt.c | 14 +- drivers/gpu/drm/i915/gt/intel_sseu.c | 197 +++++++++++-------- drivers/gpu/drm/i915/gt/intel_sseu.h | 48 ++--- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 28 +-- drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 ++- drivers/gpu/drm/i915/i915_getparam.c | 2 +- drivers/gpu/drm/i915/i915_query.c | 8 +- 9 files changed, 183 insertions(+), 148 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index ab4c5ab28e4d..ea012ee3a8de 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1901,7 +1901,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt, if (user->slice_mask & ~device->slice_mask) return -EINVAL; - if (user->subslice_mask & ~device->subslice_mask[0]) + if (user->subslice_mask & ~device->subslice_mask.b[0]) return -EINVAL; if (user->max_eus_per_subslice > device->max_eus_per_subslice) @@ -1915,7 +1915,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt, /* Part specific restrictions. */ if (GRAPHICS_VER(i915) == 11) { unsigned int hw_s = hweight8(device->slice_mask); - unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]); + unsigned int hw_ss_per_s = hweight8(device->subslice_mask.b[0]); unsigned int req_s = hweight8(context->slice_mask); unsigned int req_ss = hweight8(context->subslice_mask); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 14c6ddbbfde8..39c09963b3c7 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -610,7 +610,7 @@ static void engine_mask_apply_compute_fuses(struct intel_gt *gt) if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) return; - ccs_mask = intel_slicemask_from_dssmask(intel_sseu_get_compute_subslices(&info->sseu), + ccs_mask = intel_slicemask_from_dssmask(info->sseu.compute_subslice_mask, ss_per_ccs); /* * If all DSS in a quadrant are fused off, the corresponding CCS diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 92394f13b42f..cc03512d59ba 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -133,13 +133,6 @@ static const struct intel_mmio_range dg2_lncf_steering_table[] = { {}, }; -static u16 slicemask(struct intel_gt *gt, int count) -{ - u64 dss_mask = intel_sseu_get_subslices(>->info.sseu, 0); - - return intel_slicemask_from_dssmask(dss_mask, count); -} - int intel_gt_init_mmio(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; @@ -153,11 +146,14 @@ int intel_gt_init_mmio(struct intel_gt *gt) * An mslice is unavailable only if both the meml3 for the slice is * disabled *and* all of the DSS in the slice (quadrant) are disabled. */ - if (HAS_MSLICES(i915)) + if (HAS_MSLICES(i915)) { gt->info.mslice_mask = - slicemask(gt, GEN_DSS_PER_MSLICE) | + intel_slicemask_from_dssmask(gt->info.sseu.subslice_mask, + GEN_DSS_PER_MSLICE); + gt->info.mslice_mask |= (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN12_MEML3_EN_MASK); + } if (IS_DG2(i915)) { gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index f7ff6a9f67b0..466505d6bd18 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -28,56 +28,49 @@ void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices, unsigned int intel_sseu_subslice_total(const struct sseu_dev_info *sseu) { - unsigned int i, total = 0; - - for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++) - total += hweight8(sseu->subslice_mask[i]); - - return total; + return bitmap_weight(sseu->subslice_mask.b, I915_MAX_SS_FUSE_BITS); } -static u32 -sseu_get_subslices(const struct sseu_dev_info *sseu, - const u8 *subslice_mask, u8 slice) +static intel_sseu_ss_mask_t +get_ss_mask_for_slice(const struct sseu_dev_info *sseu, + intel_sseu_ss_mask_t all_ss, + int slice) { - int i, offset = slice * sseu->ss_stride; - u32 mask = 0; + intel_sseu_ss_mask_t mask = {}, slice_ss = {}; + int offset = slice * sseu->max_subslices; GEM_BUG_ON(slice >= sseu->max_slices); - for (i = 0; i < sseu->ss_stride; i++) - mask |= (u32)subslice_mask[offset + i] << i * BITS_PER_BYTE; + if (sseu->max_slices == 1) + return all_ss; - return mask; -} + bitmap_set(mask.b, offset, sseu->max_subslices); + bitmap_and(slice_ss.b, all_ss.b, mask.b, I915_MAX_SS_FUSE_BITS); + bitmap_shift_right(slice_ss.b, slice_ss.b, offset, I915_MAX_SS_FUSE_BITS); -u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice) -{ - return sseu_get_subslices(sseu, sseu->subslice_mask, slice); + return slice_ss; } -static u32 sseu_get_geometry_subslices(const struct sseu_dev_info *sseu) +intel_sseu_ss_mask_t +intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice) { - return sseu_get_subslices(sseu, sseu->geometry_subslice_mask, 0); + return get_ss_mask_for_slice(sseu, sseu->subslice_mask, slice); } -u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu) -{ - return sseu_get_subslices(sseu, sseu->compute_subslice_mask, 0); -} - -void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice, - u8 *subslice_mask, u32 ss_mask) -{ - int offset = slice * sseu->ss_stride; - - memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride); -} - -unsigned int -intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice) +/* + * Set the subslice mask associated with a specific slice. Only needed on + * pre-gen11 platforms that have multiple slices. + */ +static void set_subslices(struct sseu_dev_info *sseu, int slice, u32 ss_mask) { - return hweight32(intel_sseu_get_subslices(sseu, slice)); + intel_sseu_ss_mask_t mask = {}, newbits = {}; + int offset = slice * sseu->max_subslices; + + bitmap_set(mask.b, offset, sseu->max_subslices); + bitmap_from_arr32(newbits.b, &ss_mask, 32); + bitmap_shift_left(newbits.b, newbits.b, offset, I915_MAX_SS_FUSE_BITS); + bitmap_replace(sseu->subslice_mask.b, sseu->subslice_mask.b, + newbits.b, mask.b, I915_MAX_SS_FUSE_BITS); } static int sseu_eu_idx(const struct sseu_dev_info *sseu, int slice, @@ -115,7 +108,7 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu) u16 i, total = 0; if (sseu->has_common_ss_eumask) - return intel_sseu_subslices_per_slice(sseu, 0) * + return bitmap_weight(sseu->subslice_mask.b, I915_MAX_SS_FUSE_BITS) * hweight16(sseu->eu_mask[0]); for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++) @@ -155,11 +148,42 @@ int intel_sseu_copy_eumask_to_user(void __user *to, return copy_to_user(to, eu_mask, len); } -static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, - u32 g_ss_en, u32 c_ss_en, u16 eu_en) +/** + * intel_sseu_copy_ssmask_to_user - Copy subslice mask into a userspace buffer + * @to: Pointer to userspace buffer to copy to + * @sseu: SSEU structure containing subslice mask to copy + * + * Copies the subslice mask to a userspace buffer in the format expected by + * the query ioctl's topology queries. + * + * Returns the result of the copy_to_user() operation. + */ +int intel_sseu_copy_ssmask_to_user(void __user *to, + const struct sseu_dev_info *sseu) { - u32 valid_ss_mask = GENMASK(sseu->max_subslices - 1, 0); + u8 ss_mask[GEN_SS_MASK_SIZE] = {}; + int len = sseu->max_slices * sseu->ss_stride; + int s, ss, i; + for (s = 0; s < sseu->max_slices; s++) { + for (ss = 0; ss < sseu->max_subslices; ss++) { + i = s * sseu->ss_stride + ss; + + if (!intel_sseu_has_subslice(sseu, s, ss)) + continue; + + ss_mask[i / BITS_PER_BYTE] |= BIT(i % BITS_PER_BYTE); + } + } + + return copy_to_user(to, ss_mask, len); +} + +static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, + intel_sseu_ss_mask_t g_ss_en, + intel_sseu_ss_mask_t c_ss_en, + u16 eu_en) +{ /* g_ss_en/c_ss_en represent entire subslice mask across all slices */ GEM_BUG_ON(sseu->max_slices * sseu->max_subslices > sizeof(g_ss_en) * BITS_PER_BYTE); @@ -174,12 +198,12 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, * enabled subslice count for the purposes of selecting subslices to * use in a particular GEM context. */ - intel_sseu_set_subslices(sseu, 0, sseu->compute_subslice_mask, - c_ss_en & valid_ss_mask); - intel_sseu_set_subslices(sseu, 0, sseu->geometry_subslice_mask, - g_ss_en & valid_ss_mask); - intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, - (g_ss_en | c_ss_en) & valid_ss_mask); + bitmap_copy(sseu->compute_subslice_mask.b, c_ss_en.b, I915_MAX_SS_FUSE_BITS); + bitmap_copy(sseu->geometry_subslice_mask.b, g_ss_en.b, I915_MAX_SS_FUSE_BITS); + bitmap_or(sseu->subslice_mask.b, + sseu->compute_subslice_mask.b, + sseu->geometry_subslice_mask.b, + I915_MAX_SS_FUSE_BITS); sseu->has_common_ss_eumask = 1; sseu->eu_mask[0] = eu_en; @@ -191,7 +215,8 @@ static void xehp_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = >->info.sseu; struct intel_uncore *uncore = gt->uncore; - u32 g_dss_en, c_dss_en = 0; + intel_sseu_ss_mask_t g_dss_en = {}, c_dss_en = {}; + u32 val; u16 eu_en = 0; u8 eu_en_fuse; int eu; @@ -204,8 +229,11 @@ static void xehp_sseu_info_init(struct intel_gt *gt) */ intel_sseu_set_info(sseu, 1, 32, 16); - g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); - c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE); + val = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); + bitmap_from_arr32(g_dss_en.b, &val, 32); + + val = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE); + bitmap_from_arr32(c_dss_en.b, &val, 32); eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; @@ -220,7 +248,8 @@ static void gen12_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = >->info.sseu; struct intel_uncore *uncore = gt->uncore; - u32 g_dss_en; + intel_sseu_ss_mask_t g_dss_en = {}, empty_mask = {}; + u32 val; u16 eu_en = 0; u8 eu_en_fuse; u8 s_en; @@ -243,7 +272,8 @@ static void gen12_sseu_info_init(struct intel_gt *gt) drm_dbg(>->i915->drm, "Slice mask %#x is not the expected 0x1!\n", s_en); - g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); + val = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); + bitmap_from_arr32(g_dss_en.b, &val, 32); /* one bit per pair of EUs */ eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & @@ -253,7 +283,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); - gen11_compute_sseu_info(sseu, g_dss_en, 0, eu_en); + gen11_compute_sseu_info(sseu, g_dss_en, empty_mask, eu_en); /* TGL only supports slice-level power gating */ sseu->has_slice_pg = 1; @@ -263,7 +293,8 @@ static void gen11_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = >->info.sseu; struct intel_uncore *uncore = gt->uncore; - u32 ss_en; + intel_sseu_ss_mask_t ss_en = {}, empty_mask = {}; + u32 val; u8 eu_en; u8 s_en; @@ -282,12 +313,13 @@ static void gen11_sseu_info_init(struct intel_gt *gt) drm_dbg(>->i915->drm, "Slice mask %#x is not the expected 0x1!\n", s_en); - ss_en = ~intel_uncore_read(uncore, GEN11_GT_SUBSLICE_DISABLE); + val = ~intel_uncore_read(uncore, GEN11_GT_SUBSLICE_DISABLE); + bitmap_from_arr32(ss_en.b, &val, 32); eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & GEN11_EU_DIS_MASK); - gen11_compute_sseu_info(sseu, ss_en, 0, eu_en); + gen11_compute_sseu_info(sseu, ss_en, empty_mask, eu_en); /* ICL has no power gating restrictions. */ sseu->has_slice_pg = 1; @@ -328,7 +360,7 @@ static void cherryview_sseu_info_init(struct intel_gt *gt) sseu_set_eus(sseu, 0, 1, ~disabled_mask); } - intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask); + set_subslices(sseu, 0, subslice_mask); sseu->eu_total = compute_eu_total(sseu); @@ -384,8 +416,7 @@ static void gen9_sseu_info_init(struct intel_gt *gt) /* skip disabled slice */ continue; - intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, - subslice_mask); + set_subslices(sseu, s, subslice_mask); eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s)); for (ss = 0; ss < sseu->max_subslices; ss++) { @@ -442,8 +473,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt) sseu->has_eu_pg = sseu->eu_per_subslice > 2; if (IS_GEN9_LP(i915)) { -#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask[0] & BIT(ss))) - info->has_pooled_eu = hweight8(sseu->subslice_mask[0]) == 3; +#define IS_SS_DISABLED(ss) test_bit(ss, sseu->subslice_mask.b) + info->has_pooled_eu = hweight8(sseu->subslice_mask.b[0]) == 3; sseu->min_eu_in_pool = 0; if (info->has_pooled_eu) { @@ -497,8 +528,7 @@ static void bdw_sseu_info_init(struct intel_gt *gt) /* skip disabled slice */ continue; - intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, - subslice_mask); + set_subslices(sseu, s, subslice_mask); for (ss = 0; ss < sseu->max_subslices; ss++) { u8 eu_disabled_mask; @@ -595,8 +625,7 @@ static void hsw_sseu_info_init(struct intel_gt *gt) sseu->eu_per_subslice); for (s = 0; s < sseu->max_slices; s++) { - intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, - subslice_mask); + set_subslices(sseu, s, subslice_mask); for (ss = 0; ss < sseu->max_subslices; ss++) { sseu_set_eus(sseu, s, ss, @@ -685,7 +714,7 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt, */ if (GRAPHICS_VER(i915) == 11 && slices == 1 && - subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) { + subslices > min_t(u8, 4, hweight8(sseu->subslice_mask.b[0]) / 2)) { GEM_BUG_ON(subslices & 1); subslice_pg = false; @@ -755,9 +784,11 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p) hweight8(sseu->slice_mask), sseu->slice_mask); drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu)); for (s = 0; s < sseu->max_slices; s++) { - drm_printf(p, "slice%d: %u subslices, mask=%08x\n", - s, intel_sseu_subslices_per_slice(sseu, s), - intel_sseu_get_subslices(sseu, s)); + intel_sseu_ss_mask_t ssmask = intel_sseu_get_subslices(sseu, s); + + drm_printf(p, "slice%d: %u subslices, mask=%*pb\n", + s, bitmap_weight(ssmask.b, I915_MAX_SS_FUSE_BITS), + I915_MAX_SS_FUSE_BITS, ssmask.b); } drm_printf(p, "EU total: %u\n", sseu->eu_total); drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice); @@ -775,9 +806,11 @@ static void sseu_print_hsw_topology(const struct sseu_dev_info *sseu, int s, ss; for (s = 0; s < sseu->max_slices; s++) { - drm_printf(p, "slice%d: %u subslice(s) (0x%08x):\n", - s, intel_sseu_subslices_per_slice(sseu, s), - intel_sseu_get_subslices(sseu, s)); + intel_sseu_ss_mask_t ssmask = intel_sseu_get_subslices(sseu, s); + + drm_printf(p, "slice%d: %u subslice(s) (0x%*pb):\n", + s, bitmap_weight(ssmask.b, I915_MAX_SS_FUSE_BITS), + I915_MAX_SS_FUSE_BITS, ssmask.b); for (ss = 0; ss < sseu->max_subslices; ss++) { u16 enabled_eus = sseu_get_eus(sseu, s, ss); @@ -791,16 +824,14 @@ static void sseu_print_hsw_topology(const struct sseu_dev_info *sseu, static void sseu_print_xehp_topology(const struct sseu_dev_info *sseu, struct drm_printer *p) { - u32 g_dss_mask = sseu_get_geometry_subslices(sseu); - u32 c_dss_mask = intel_sseu_get_compute_subslices(sseu); int dss; for (dss = 0; dss < sseu->max_subslices; dss++) { u16 enabled_eus = sseu_get_eus(sseu, 0, dss); drm_printf(p, "DSS_%02d: G:%3s C:%3s, %2u EUs (0x%04hx)\n", dss, - str_yes_no(g_dss_mask & BIT(dss)), - str_yes_no(c_dss_mask & BIT(dss)), + str_yes_no(test_bit(dss, sseu->geometry_subslice_mask.b)), + str_yes_no(test_bit(dss, sseu->compute_subslice_mask.b)), hweight16(enabled_eus), enabled_eus); } } @@ -818,20 +849,24 @@ void intel_sseu_print_topology(struct drm_i915_private *i915, } } -u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice) +u16 intel_slicemask_from_dssmask(intel_sseu_ss_mask_t dss_mask, + int dss_per_slice) { + intel_sseu_ss_mask_t per_slice_mask = {}; u16 slice_mask = 0; int i; - WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask)); + WARN_ON(DIV_ROUND_UP(I915_MAX_SS_FUSE_BITS, dss_per_slice) > + 8 * sizeof(slice_mask)); - for (i = 0; dss_mask; i++) { - if (dss_mask & GENMASK(dss_per_slice - 1, 0)) + bitmap_fill(per_slice_mask.b, dss_per_slice); + for (i = 0; !bitmap_empty(dss_mask.b, I915_MAX_SS_FUSE_BITS); i++) { + if (bitmap_intersects(dss_mask.b, per_slice_mask.b, dss_per_slice)) slice_mask |= BIT(i); - dss_mask >>= dss_per_slice; + bitmap_shift_right(dss_mask.b, dss_mask.b, dss_per_slice, + I915_MAX_SS_FUSE_BITS); } return slice_mask; } - diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 106726a2244e..455f085093ae 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -52,11 +52,22 @@ struct drm_printer; #define GEN_MAX_GSLICES (GEN_MAX_DSS / GEN_DSS_PER_GSLICE) #define GEN_MAX_CSLICES (GEN_MAX_DSS / GEN_DSS_PER_CSLICE) +/* + * Maximum number of 32-bit registers used by hardware to express the + * enabled/disabled subslices. + */ +#define I915_MAX_SS_FUSE_REGS 1 +#define I915_MAX_SS_FUSE_BITS (I915_MAX_SS_FUSE_REGS * 32) + +typedef struct { + unsigned long b[BITS_TO_LONGS(I915_MAX_SS_FUSE_BITS)]; +} intel_sseu_ss_mask_t; + struct sseu_dev_info { u8 slice_mask; - u8 subslice_mask[GEN_SS_MASK_SIZE]; - u8 geometry_subslice_mask[GEN_SS_MASK_SIZE]; - u8 compute_subslice_mask[GEN_SS_MASK_SIZE]; + intel_sseu_ss_mask_t subslice_mask; + intel_sseu_ss_mask_t geometry_subslice_mask; + intel_sseu_ss_mask_t compute_subslice_mask; /* * EU masks. Use has_common_ss_eumask to determine how the field @@ -107,7 +118,7 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu) { struct intel_sseu value = { .slice_mask = sseu->slice_mask, - .subslice_mask = sseu->subslice_mask[0], + .subslice_mask = sseu->subslice_mask.b[0], .min_eus_per_subslice = sseu->max_eus_per_subslice, .max_eus_per_subslice = sseu->max_eus_per_subslice, }; @@ -119,18 +130,8 @@ static inline bool intel_sseu_has_subslice(const struct sseu_dev_info *sseu, int slice, int subslice) { - u8 mask; - int ss_idx = subslice / BITS_PER_BYTE; - - if (slice >= sseu->max_slices || - subslice >= sseu->max_subslices) - return false; - - GEM_BUG_ON(ss_idx >= sseu->ss_stride); - - mask = sseu->subslice_mask[slice * sseu->ss_stride + ss_idx]; - - return mask & BIT(subslice % BITS_PER_BYTE); + return test_bit(slice * sseu->ss_stride + subslice, + sseu->subslice_mask.b); } void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices, @@ -139,15 +140,14 @@ void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices, unsigned int intel_sseu_subslice_total(const struct sseu_dev_info *sseu); -unsigned int -intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice); +intel_sseu_ss_mask_t +intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice); -u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice); - -u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu); +intel_sseu_ss_mask_t +intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu); void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice, - u8 *subslice_mask, u32 ss_mask); + u32 ss_mask); void intel_sseu_info_init(struct intel_gt *gt); @@ -159,9 +159,11 @@ void intel_sseu_print_topology(struct drm_i915_private *i915, const struct sseu_dev_info *sseu, struct drm_printer *p); -u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice); +u16 intel_slicemask_from_dssmask(intel_sseu_ss_mask_t dss_mask, int dss_per_slice); int intel_sseu_copy_eumask_to_user(void __user *to, const struct sseu_dev_info *sseu); +int intel_sseu_copy_ssmask_to_user(void __user *to, + const struct sseu_dev_info *sseu); #endif /* __INTEL_SSEU_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c index 2d5d011e01db..1f77bd52a3a6 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c @@ -4,6 +4,7 @@ * Copyright © 2020 Intel Corporation */ +#include #include #include "i915_drv.h" @@ -12,11 +13,15 @@ #include "intel_sseu_debugfs.h" static void sseu_copy_subslices(const struct sseu_dev_info *sseu, - int slice, u8 *to_mask) + int slice, + intel_sseu_ss_mask_t *to_mask) { - int offset = slice * sseu->ss_stride; + int offset = slice * sseu->max_subslices; - memcpy(&to_mask[offset], &sseu->subslice_mask[offset], sseu->ss_stride); + bitmap_fill(to_mask->b, sseu->max_subslices); + bitmap_shift_left(to_mask->b, to_mask->b, offset, I915_MAX_SS_FUSE_BITS); + bitmap_and(to_mask->b, to_mask->b, sseu->subslice_mask.b, I915_MAX_SS_FUSE_BITS); + bitmap_shift_right(to_mask->b, to_mask->b, offset, I915_MAX_SS_FUSE_BITS); } static void cherryview_sseu_device_status(struct intel_gt *gt, @@ -41,7 +46,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt, continue; sseu->slice_mask = BIT(0); - sseu->subslice_mask[0] |= BIT(ss); + set_bit(0, sseu->subslice_mask.b); eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) + ((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) + ((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) + @@ -92,7 +97,7 @@ static void gen11_sseu_device_status(struct intel_gt *gt, continue; sseu->slice_mask |= BIT(s); - sseu_copy_subslices(&info->sseu, s, sseu->subslice_mask); + sseu_copy_subslices(&info->sseu, s, &sseu->subslice_mask); for (ss = 0; ss < info->sseu.max_subslices; ss++) { unsigned int eu_cnt; @@ -148,20 +153,17 @@ static void gen9_sseu_device_status(struct intel_gt *gt, if (IS_GEN9_BC(gt->i915)) sseu_copy_subslices(&info->sseu, s, - sseu->subslice_mask); + &sseu->subslice_mask); for (ss = 0; ss < info->sseu.max_subslices; ss++) { unsigned int eu_cnt; - u8 ss_idx = s * info->sseu.ss_stride + - ss / BITS_PER_BYTE; if (IS_GEN9_LP(gt->i915)) { if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss)))) /* skip disabled subslice */ continue; - sseu->subslice_mask[ss_idx] |= - BIT(ss % BITS_PER_BYTE); + set_bit(ss, sseu->subslice_mask.b); } eu_cnt = eu_reg[2 * s + ss / 2] & eu_mask[ss % 2]; @@ -189,7 +191,7 @@ static void bdw_sseu_device_status(struct intel_gt *gt, sseu->eu_per_subslice = info->sseu.eu_per_subslice; for (s = 0; s < fls(sseu->slice_mask); s++) sseu_copy_subslices(&info->sseu, s, - sseu->subslice_mask); + &sseu->subslice_mask); sseu->eu_total = sseu->eu_per_subslice * intel_sseu_subslice_total(sseu); @@ -217,8 +219,10 @@ static void i915_print_sseu_info(struct seq_file *m, seq_printf(m, " %s Subslice Total: %u\n", type, intel_sseu_subslice_total(sseu)); for (s = 0; s < fls(sseu->slice_mask); s++) { + intel_sseu_ss_mask_t ssmask = intel_sseu_get_subslices(sseu, s); + seq_printf(m, " %s Slice%i subslices: %u\n", type, - s, intel_sseu_subslices_per_slice(sseu, s)); + s, bitmap_weight(ssmask.b, I915_MAX_SS_FUSE_BITS)); } seq_printf(m, " %s EU Total: %u\n", type, sseu->eu_total); diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index a05c4b99b3fb..5db492072f99 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -931,6 +931,7 @@ static void gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) { const struct sseu_dev_info *sseu = &to_gt(i915)->info.sseu; + intel_sseu_ss_mask_t ssmask; unsigned int slice, subslice; u32 mcr, mcr_mask; @@ -948,9 +949,9 @@ gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) * on s/ss combo, the read should be done with read_subslice_reg. */ slice = ffs(sseu->slice_mask) - 1; - GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); - subslice = ffs(intel_sseu_get_subslices(sseu, slice)); - GEM_BUG_ON(!subslice); + ssmask = intel_sseu_get_subslices(sseu, slice); + subslice = find_first_bit(ssmask.b, I915_MAX_SS_FUSE_BITS); + GEM_BUG_ON(subslice == I915_MAX_SS_FUSE_BITS); subslice--; /* @@ -1087,11 +1088,10 @@ static void icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) { const struct sseu_dev_info *sseu = >->info.sseu; - unsigned int slice, subslice; + unsigned int subslice; GEM_BUG_ON(GRAPHICS_VER(gt->i915) < 11); GEM_BUG_ON(hweight8(sseu->slice_mask) > 1); - slice = 0; /* * Although a platform may have subslices, we need to always steer @@ -1102,7 +1102,7 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) * one of the higher subslices, we run the risk of reading back 0's or * random garbage. */ - subslice = __ffs(intel_sseu_get_subslices(sseu, slice)); + subslice = find_first_bit(sseu->subslice_mask.b, I915_MAX_SS_FUSE_BITS); /* * If the subslice we picked above also steers us to a valid L3 bank, @@ -1112,7 +1112,7 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) if (gt->info.l3bank_mask & BIT(subslice)) gt->steering_table[L3BANK] = NULL; - __add_mcr_wa(gt, wal, slice, subslice); + __add_mcr_wa(gt, wal, 0, subslice); } static void @@ -1120,7 +1120,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) { const struct sseu_dev_info *sseu = >->info.sseu; unsigned long slice, subslice = 0, slice_mask = 0; - u64 dss_mask = 0; u32 lncf_mask = 0; int i; @@ -1151,8 +1150,8 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) */ /* Find the potential gslice candidates */ - dss_mask = intel_sseu_get_subslices(sseu, 0); - slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE); + slice_mask = intel_slicemask_from_dssmask(sseu->subslice_mask, + GEN_DSS_PER_GSLICE); /* * Find the potential LNCF candidates. Either LNCF within a valid @@ -1177,9 +1176,9 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) } slice = __ffs(slice_mask); - subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE)); + subslice = find_next_bit(sseu->subslice_mask.b, I915_MAX_SS_FUSE_BITS, + slice * GEN_DSS_PER_GSLICE); WARN_ON(subslice > GEN_DSS_PER_GSLICE); - WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0); __add_mcr_wa(gt, wal, slice, subslice); @@ -2012,9 +2011,8 @@ engine_fake_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) static bool needs_wa_1308578152(struct intel_engine_cs *engine) { - u64 dss_mask = intel_sseu_get_subslices(&engine->gt->info.sseu, 0); - - return (dss_mask & GENMASK(GEN_DSS_PER_GSLICE - 1, 0)) == 0; + return find_first_bit(engine->gt->info.sseu.subslice_mask.b, + I915_MAX_SS_FUSE_BITS) >= GEN_DSS_PER_GSLICE; } static void diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index ac9767c56619..349d496d164a 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -162,7 +162,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, return -EINVAL; /* Only copy bits from the first slice */ - memcpy(&value, sseu->subslice_mask, + memcpy(&value, sseu->subslice_mask.b, min(sseu->ss_stride, (u8)sizeof(value))); if (!value) return -ENODEV; diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index 16f43bf32a05..9afa6d1eaf95 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -31,7 +31,7 @@ static int copy_query_item(void *query_hdr, size_t query_sz, static int fill_topology_info(const struct sseu_dev_info *sseu, struct drm_i915_query_item *query_item, - const u8 *subslice_mask) + intel_sseu_ss_mask_t subslice_mask) { struct drm_i915_query_topology_info topo; u32 slice_length, subslice_length, eu_length, total_length; @@ -71,9 +71,9 @@ static int fill_topology_info(const struct sseu_dev_info *sseu, &sseu->slice_mask, slice_length)) return -EFAULT; - if (copy_to_user(u64_to_user_ptr(query_item->data_ptr + - sizeof(topo) + slice_length), - subslice_mask, subslice_length)) + if (intel_sseu_copy_ssmask_to_user(u64_to_user_ptr(query_item->data_ptr + + sizeof(topo) + slice_length), + sseu)) return -EFAULT; if (intel_sseu_copy_eumask_to_user(u64_to_user_ptr(query_item->data_ptr +