From patchwork Fri Mar 11 06:15:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matt Roper
X-Patchwork-Id: 12777490
From: Matt Roper
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: [PATCH 1/2] drm/i915/sseu: Don't overallocate subslice storage
Date: Thu, 10 Mar 2022 22:15:42 -0800
Message-Id: <20220311061543.153611-1-matthew.d.roper@intel.com>
X-Mailer: git-send-email 2.34.1

Xe_HP removed "slice" as a first-class unit in the hardware design.
Instead we now have a single pool of subslices (which are now referred
to as "DSS") that different hardware units have different ways of
grouping ("compute slices," "geometry slices," etc.).  For the purposes
of topology representation, we treat Xe_HP-based platforms as having a
single slice that contains all of the platform's DSS.  There's no need
to allocate storage space for (max legacy slices * max dss); let's
update some of our macros to minimize the storage requirement for sseu
topology.  We'll also document some of the constants to make it a little
bit more clear what they represent.

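For reference, the storage arithmetic can be checked outside the kernel.
The following is a minimal userspace sketch, not driver code: it copies
the constant values from the intel_sseu.h hunk of this patch, while
BITS_PER_BYTE, DIV_ROUND_UP, SKETCH_MAX, the OLD_/NEW_ prefixes, the two
structs and the two helper functions are stand-ins invented for the
illustration. Only the four mask arrays of struct sseu_dev_info are
modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel helpers (assumptions of this sketch). */
#define BITS_PER_BYTE 8
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Constant values as they were before the patch. */
#define OLD_MAX_SLICES 3
#define OLD_MAX_SUBSLICES 32
#define OLD_MAX_EUS 16
#define OLD_SUBSLICE_STRIDE SSEU_STRIDE(OLD_MAX_SUBSLICES)
#define OLD_EU_STRIDE SSEU_STRIDE(OLD_MAX_EUS)

/* Constant values as they are after the patch. */
#define NEW_MAX_LEGACY_SLICES 3
#define NEW_MAX_LEGACY_SUBSLICES 6
#define NEW_MAX_DSS 32
#define NEW_MAX_EUS_PER_SS 16
#define NEW_SS_MASK_SIZE \
	SKETCH_MAX(NEW_MAX_DSS, NEW_MAX_LEGACY_SLICES * NEW_MAX_LEGACY_SUBSLICES)
#define NEW_EU_STRIDE SSEU_STRIDE(NEW_MAX_EUS_PER_SS)

/* 3 legacy slices * 6 subslices = 18 bits, so the 32 DSS bound wins. */
static_assert(NEW_SS_MASK_SIZE == 32, "DSS count is the larger bound");

/* Mask arrays sized the way the old struct sseu_dev_info sized them. */
struct old_masks {
	uint8_t subslice_mask[OLD_MAX_SLICES * OLD_SUBSLICE_STRIDE];
	uint8_t geometry_subslice_mask[OLD_MAX_SLICES * OLD_SUBSLICE_STRIDE];
	uint8_t compute_subslice_mask[OLD_MAX_SLICES * OLD_SUBSLICE_STRIDE];
	uint8_t eu_mask[OLD_MAX_SLICES * OLD_MAX_SUBSLICES * OLD_EU_STRIDE];
};

/* Mask arrays sized the way the patched struct sizes them. */
struct new_masks {
	uint8_t subslice_mask[NEW_SS_MASK_SIZE];
	uint8_t geometry_subslice_mask[NEW_SS_MASK_SIZE];
	uint8_t compute_subslice_mask[NEW_SS_MASK_SIZE];
	uint8_t eu_mask[NEW_SS_MASK_SIZE * NEW_EU_STRIDE];
};

/* Total bytes used by the four mask arrays before the patch. */
size_t old_mask_bytes(void)
{
	return sizeof(struct old_masks);
}

/* Total bytes used by the four mask arrays after the patch. */
size_t new_mask_bytes(void)
{
	return sizeof(struct new_masks);
}
```

With these values the four arrays shrink from 228 to 160 bytes in total.
The saving comes from eu_mask (192 to 64 bytes); each of the three
subslice masks, now sized by GEN_SS_MASK_SIZE rather than by a stride,
goes from 12 to 32 bytes.
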
Signed-off-by: Matt Roper
Reviewed-by: Lucas De Marchi
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  2 +-
 drivers/gpu/drm/i915/gt/intel_sseu.h         | 47 +++++++++++++++-----
 2 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 4fbf45a74ec0..f9e246004bc0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -645,7 +645,7 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine)

 #define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, dss_) \
 	for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
-	     (iter_) < GEN_MAX_SUBSLICES; \
+	     (iter_) < GEN_SS_MASK_SIZE; \
 	     (iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
 	     (dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
 		for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 8a79cd8eaab4..4f59eadbb61a 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -15,26 +15,49 @@ struct drm_i915_private;
 struct intel_gt;
 struct drm_printer;

-#define GEN_MAX_SLICES (3) /* SKL upper bound */
-#define GEN_MAX_SUBSLICES (32) /* XEHPSDV upper bound */
-#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
-#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
-#define GEN_MAX_EUS (16) /* TGL upper bound */
-#define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
+/*
+ * Maximum number of legacy slices. Legacy slices no longer exist starting on
+ * Xe_HP ("gslices," "cslices," etc. on Xe_HP and beyond are a different
+ * concept and are not expressed through fusing).
+ */
+#define GEN_MAX_LEGACY_SLICES 3
+
+/*
+ * Maximum number of subslices that can exist within a legacy slice. This is
+ * only relevant to pre-Xe_HP platforms (Xe_HP and beyond use the GEN_MAX_DSS
+ * value below).
+ */
+#define GEN_MAX_LEGACY_SUBSLICES 6
+
+/* Maximum number of DSS on newer platforms (Xe_HP and beyond). */
+#define GEN_MAX_DSS 32
+
+/* Maximum number of EUs that can exist within a subslice or DSS. */
+#define GEN_MAX_EUS_PER_SS 16
+
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* The maximum number of bits needed to express each subslice/DSS independently */
+#define GEN_SS_MASK_SIZE MAX(GEN_MAX_DSS, \
+			     GEN_MAX_LEGACY_SLICES * GEN_MAX_LEGACY_SUBSLICES)
+
+#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
+#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_SS_MASK_SIZE)
+#define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS_PER_SS)

 #define GEN_DSS_PER_GSLICE 4
 #define GEN_DSS_PER_CSLICE 8
 #define GEN_DSS_PER_MSLICE 8

-#define GEN_MAX_GSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_GSLICE)
-#define GEN_MAX_CSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_CSLICE)
+#define GEN_MAX_GSLICES (GEN_MAX_DSS / GEN_DSS_PER_GSLICE)
+#define GEN_MAX_CSLICES (GEN_MAX_DSS / GEN_DSS_PER_CSLICE)

 struct sseu_dev_info {
 	u8 slice_mask;
-	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
-	u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
-	u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
-	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE];
+	u8 subslice_mask[GEN_SS_MASK_SIZE];
+	u8 geometry_subslice_mask[GEN_SS_MASK_SIZE];
+	u8 compute_subslice_mask[GEN_SS_MASK_SIZE];
+	u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE];
 	u16 eu_total;
 	u8 eu_per_subslice;
 	u8 min_eu_in_pool;

From patchwork Fri Mar 11 06:15:43 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matt Roper
X-Patchwork-Id: 12777489
From: Matt Roper
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: [PATCH 2/2] drm/i915/xehp: Update topology dumps for Xe_HP
Date: Thu, 10 Mar 2022 22:15:43 -0800
Message-Id: <20220311061543.153611-2-matthew.d.roper@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220311061543.153611-1-matthew.d.roper@intel.com>
References: <20220311061543.153611-1-matthew.d.roper@intel.com>

When running on Xe_HP or beyond, let's use an updated format for
describing topology in our error state dumps and debugfs to give a more
accurate view of the hardware:

 - Just report DSS directly without the legacy "slice0" output that's no
   longer meaningful.
 - Indicate whether each DSS is accessible for geometry and/or compute.
 - Rename "rcs_topology" to "sseu_topology" since the information
   reported is common to both RCS and CCS engines now.

Signed-off-by: Matt Roper
Reported-by: kernel test robot
Reviewed-by: Lucas De Marchi
---
 drivers/gpu/drm/i915/gt/intel_sseu.c         | 48 +++++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  3 +-
 drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c |  8 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c        |  2 +-
 4 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 614915ffbd37..4d28458ab768 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -10,6 +10,8 @@
 #include "intel_gt_regs.h"
 #include "intel_sseu.h"

+#include "linux/string_helpers.h"
+
 void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
 			 u8 max_subslices, u8 max_eus_per_subslice)
 {
@@ -54,6 +56,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
 	return _intel_sseu_get_subslices(sseu, sseu->subslice_mask, slice);
 }

+u32 intel_sseu_get_geometry_subslices(const struct sseu_dev_info *sseu)
+{
+	return _intel_sseu_get_subslices(sseu, sseu->geometry_subslice_mask, 0);
+}
+
 u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu)
 {
 	return _intel_sseu_get_subslices(sseu, sseu->compute_subslice_mask, 0);
@@ -720,16 +727,11 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
 		   str_yes_no(sseu->has_eu_pg));
 }

-void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
-			       struct drm_printer *p)
+static void intel_sseu_print_legacy_topology(const struct sseu_dev_info *sseu,
+					     struct drm_printer *p)
 {
 	int s, ss;

-	if (sseu->max_slices == 0) {
-		drm_printf(p, "Unavailable\n");
-		return;
-	}
-
 	for (s = 0; s < sseu->max_slices; s++) {
 		drm_printf(p, "slice%d: %u subslice(s) (0x%08x):\n",
 			   s, intel_sseu_subslices_per_slice(sseu, s),
@@ -744,6 +746,38 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
 	}
 }

+static void intel_sseu_print_xehp_topology(const struct sseu_dev_info *sseu,
+					   struct drm_printer *p)
+{
+	u32 g_dss_mask = intel_sseu_get_geometry_subslices(sseu);
+	u32 c_dss_mask = intel_sseu_get_compute_subslices(sseu);
+	int dss;
+
+	for (dss = 0; dss < sseu->max_subslices; dss++) {
+		u16 enabled_eus = sseu_get_eus(sseu, 0, dss);
+
+		drm_printf(p, "DSS%02d: G:%3s C:%3s, %2u EUs (0x%04hx)\n", dss,
+			   str_yes_no(g_dss_mask & BIT(dss)),
+			   str_yes_no(c_dss_mask & BIT(dss)),
+			   hweight16(enabled_eus), enabled_eus);
+	}
+}
+
+
+void intel_sseu_print_topology(struct drm_i915_private *i915,
+			       const struct sseu_dev_info *sseu,
+			       struct drm_printer *p)
+{
+	if (sseu->max_slices == 0) {
+		drm_printf(p, "Unavailable\n");
+		return;
+	} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		intel_sseu_print_xehp_topology(sseu, p);
+	} else {
+		intel_sseu_print_legacy_topology(sseu, p);
+	}
+}
+
 u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
 {
 	u16 slice_mask = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 4f59eadbb61a..fe22ea9bb213 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -139,7 +139,8 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt,
 			 const struct intel_sseu *req_sseu);

 void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p);
-void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
+void intel_sseu_print_topology(struct drm_i915_private *i915,
+			       const struct sseu_dev_info *sseu,
 			       struct drm_printer *p);

 u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
index a9d5bc49f361..6b944de48666 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
@@ -287,22 +287,22 @@ static int sseu_status_show(struct seq_file *m, void *unused)
 }
 DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(sseu_status);

-static int rcs_topology_show(struct seq_file *m, void *unused)
+static int sseu_topology_show(struct seq_file *m, void *unused)
 {
 	struct intel_gt *gt = m->private;
 	struct drm_printer p = drm_seq_file_printer(m);

-	intel_sseu_print_topology(&gt->info.sseu, &p);
+	intel_sseu_print_topology(gt->i915, &gt->info.sseu, &p);

 	return 0;
 }
-DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rcs_topology);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(sseu_topology);

 void intel_sseu_debugfs_register(struct intel_gt *gt, struct dentry *root)
 {
 	static const struct intel_gt_debugfs_file files[] = {
 		{ "sseu_status", &sseu_status_fops, NULL },
-		{ "rcs_topology", &rcs_topology_fops, NULL },
+		{ "sseu_topology", &sseu_topology_fops, NULL },
 	};

 	intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), gt);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 5e09a4e4b01a..44ff2b899893 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -712,7 +712,7 @@ static void err_print_gt_info(struct drm_i915_error_state_buf *m,
 	struct drm_printer p = i915_error_printer(m);

 	intel_gt_info_print(&gt->info, &p);
-	intel_sseu_print_topology(&gt->info.sseu, &p);
+	intel_sseu_print_topology(gt->_gt->i915, &gt->info.sseu, &p);
 }

 static void err_print_gt(struct drm_i915_error_state_buf *m,
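
To make the new per-DSS line format easier to picture, here is a small
userspace sketch that reuses the format string from
intel_sseu_print_xehp_topology() with snprintf(). It is only an
illustration: format_dss_line(), yes_no() and popcount16() are
hypothetical stand-ins for the driver function, str_yes_no() and
hweight16(), and the mask values in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the kernel's str_yes_no() helper. */
static const char *yes_no(int v)
{
	return v ? "yes" : "no";
}

/* Stand-in for the kernel's hweight16(): count the set bits in v. */
static unsigned int popcount16(uint16_t v)
{
	unsigned int n = 0;

	while (v) {
		n += v & 1u;
		v >>= 1;
	}
	return n;
}

/*
 * Format one DSS line the way the Xe_HP topology dump does.
 * g_dss_mask and c_dss_mask hold one bit per DSS (geometry and compute
 * availability); enabled_eus is the EU mask of this DSS.
 * Returns what snprintf() returns.
 */
int format_dss_line(char *buf, size_t len, int dss,
		    uint32_t g_dss_mask, uint32_t c_dss_mask,
		    uint16_t enabled_eus)
{
	/* A u32 mask can only describe DSS 0..31. */
	assert(dss >= 0 && dss < 32);

	return snprintf(buf, len,
			"DSS%02d: G:%3s C:%3s, %2u EUs (0x%04hx)\n", dss,
			yes_no((g_dss_mask & (UINT32_C(1) << dss)) != 0),
			yes_no((c_dss_mask & (UINT32_C(1) << dss)) != 0),
			popcount16(enabled_eus), enabled_eus);
}
```

For example, format_dss_line(buf, sizeof(buf), 3, 0x8, 0x0, 0x00ff)
writes "DSS03: G:yes C: no,  8 EUs (0x00ff)" followed by a newline: DSS 3
is usable for geometry but not compute and has eight EUs enabled.
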