From patchwork Tue Oct 25 20:16:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 71948FA3741 for ; Tue, 25 Oct 2022 20:19:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C2FF110E259; Tue, 25 Oct 2022 20:19:45 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id E33B510E195 for ; Tue, 25 Oct 2022 20:17:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=wyTprWxRDAJtaGf1vt1QfI6lpLCf4CizLzJL1rV+Pa4=; b=c/OamzphzeJiJYuTFbc9YA0wtFLBhkBtBuYzG4v+rQTKm1go6u/MOGhU T+mqvB8fNzTEJjYwGrrDgiiKyrLnAA+9zI6idNbwRme2dmp9ya60j4UwH 2uGHq9V/9awJ6z6FIry17hVrjbLzvLiL3db7yJmBnQpwD25+YSeHr+65q SbEirSM3NFTvdWkSaDZgnwSaPLNHRVYSutWl8ghK0fpUS6lqjkHcJaYaY xpLTS9Qghoe1czzMjOXldvh6aOLGJhNGzA8O3BUSvlH7oXopE7nm0qScO xZYyZ90x7XcbedzH1yvIkizH7z/hY/m16bbpBikpsL4QXzCBCEOeTBJ2N Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="287498118" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="287498118" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699709" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699709" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:08 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:53 +0000 Message-Id: <20221025201708.84018-2-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 01/16] drm/i915/perf: Fix OA filtering logic for GuC mode X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" With GuC mode of submission, GuC is in control of defining the context id field that is part of the OA reports. To filter reports, UMD and KMD must know what sw context id was chosen by GuC. There is not interface between KMD and GuC to determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA reports for the specific context. v2: Explain guc id stealing w.r.t OA use case Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_lrc.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 144 ++++++++++++++++++++++++---- 2 files changed, 127 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h index a390f0813c8b..7111bae759f3 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h @@ -110,6 +110,8 @@ enum { #define XEHP_SW_CTX_ID_WIDTH 16 #define XEHP_SW_COUNTER_SHIFT 58 #define XEHP_SW_COUNTER_WIDTH 6 +#define GEN12_GUC_SW_CTX_ID_SHIFT 39 +#define GEN12_GUC_SW_CTX_ID_WIDTH 16 static inline void lrc_runtime_start(struct intel_context *ce) { diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 15816df916c7..255335868b6a 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1231,6 +1231,128 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream) return stream->pinned_ctx; } +static int +__store_reg_to_mem(struct i915_request *rq, i915_reg_t reg, u32 ggtt_offset) +{ + u32 *cs, cmd; + + cmd = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT; + if (GRAPHICS_VER(rq->engine->i915) >= 8) + cmd++; + + cs = intel_ring_begin(rq, 4); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + *cs++ = cmd; + *cs++ = i915_mmio_reg_offset(reg); + *cs++ = ggtt_offset; + *cs++ = 0; + + intel_ring_advance(rq, cs); + + return 0; +} + +static int +__read_reg(struct intel_context *ce, i915_reg_t reg, u32 ggtt_offset) +{ + struct i915_request *rq; + int err; + + rq = i915_request_create(ce); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + i915_request_get(rq); + + err = __store_reg_to_mem(rq, reg, ggtt_offset); + + i915_request_add(rq); + if (!err && i915_request_wait(rq, 0, HZ / 2) < 0) + err = -ETIME; + + i915_request_put(rq); + + return err; +} + +static int +gen12_guc_sw_ctx_id(struct intel_context *ce, u32 *ctx_id) +{ + struct i915_vma *scratch; + u32 *val; + int err; + + scratch = __vm_create_scratch_for_read_pinned(&ce->engine->gt->ggtt->vm, 4); + if (IS_ERR(scratch)) + return PTR_ERR(scratch); + + err = i915_vma_sync(scratch); + if (err) + goto err_scratch; + + err = __read_reg(ce, RING_EXECLIST_STATUS_HI(ce->engine->mmio_base), + i915_ggtt_offset(scratch)); + if (err) + goto err_scratch; + + val = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB); + if (IS_ERR(val)) { + err = PTR_ERR(val); + goto err_scratch; + } + + *ctx_id = *val; + i915_gem_object_unpin_map(scratch->obj); + +err_scratch: + i915_vma_unpin_and_release(&scratch, 0); + return err; +} + +/* + * For execlist mode of submission, pick an unused context id + * 0 - (NUM_CONTEXT_TAG -1) are used by other contexts + * XXX_MAX_CONTEXT_HW_ID is used by idle context + * + * For GuC mode of submission read context id from the upper dword of the + * EXECLIST_STATUS register. Note that we read this value only once and expect + * that the value stays fixed for the entire OA use case. There are cases where + * GuC KMD implementation may deregister a context to reuse it's context id, but + * we prevent that from happening to the OA context by pinning it. + */ +static int gen12_get_render_context_id(struct i915_perf_stream *stream) +{ + u32 ctx_id, mask; + int ret; + + if (intel_engine_uses_guc(stream->engine)) { + ret = gen12_guc_sw_ctx_id(stream->pinned_ctx, &ctx_id); + if (ret) + return ret; + + mask = ((1U << GEN12_GUC_SW_CTX_ID_WIDTH) - 1) << + (GEN12_GUC_SW_CTX_ID_SHIFT - 32); + } else if (GRAPHICS_VER_FULL(stream->engine->i915) >= IP_VER(12, 50)) { + ctx_id = (XEHP_MAX_CONTEXT_HW_ID - 1) << + (XEHP_SW_CTX_ID_SHIFT - 32); + + mask = ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) << + (XEHP_SW_CTX_ID_SHIFT - 32); + } else { + ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) << + (GEN11_SW_CTX_ID_SHIFT - 32); + + mask = ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << + (GEN11_SW_CTX_ID_SHIFT - 32); + } + stream->specific_ctx_id = ctx_id & mask; + stream->specific_ctx_id_mask = mask; + + return 0; +} + /** * oa_get_render_ctx_id - determine and hold ctx hw id * @stream: An i915-perf stream opened for OA metrics @@ -1244,6 +1366,7 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream) static int oa_get_render_ctx_id(struct i915_perf_stream *stream) { struct intel_context *ce; + int ret = 0; ce = oa_pin_context(stream); if (IS_ERR(ce)) @@ -1290,24 +1413,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) case 11: case 12: - if (GRAPHICS_VER_FULL(ce->engine->i915) >= IP_VER(12, 50)) { - stream->specific_ctx_id_mask = - ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) << - (XEHP_SW_CTX_ID_SHIFT - 32); - stream->specific_ctx_id = - (XEHP_MAX_CONTEXT_HW_ID - 1) << - (XEHP_SW_CTX_ID_SHIFT - 32); - } else { - stream->specific_ctx_id_mask = - ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); - /* - * Pick an unused context id - * 0 - BITS_PER_LONG are used by other contexts - * GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context - */ - stream->specific_ctx_id = - (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); - } + ret = gen12_get_render_context_id(stream); break; default: @@ -1321,7 +1427,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) stream->specific_ctx_id, stream->specific_ctx_id_mask); - return 0; + return ret; } /** From patchwork Tue Oct 25 20:16:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019842 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8D41FA3741 for ; Tue, 25 Oct 2022 20:17:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E0EFD10E195; Tue, 25 Oct 2022 20:17:14 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1BB3810E1B9 for ; Tue, 25 Oct 2022 20:17:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=CZSFrJdR5WQWV4kIfwWVPzi99EJdBto1wxq7x7IOHx8=; b=d4nhB8IIykQEo1kza2n2ttwuB8y9snAHH1hh91hVaCXujz5Ow6WxV+1c lbSEPZQRId+uQkLFrq7HJ8eAXVqvJ/WCJm/UCfncxmRXEUROPBUwqiboP aWvEsL+u40eT3E3pREJmHzYVaDWsaCnrXS7e5dShhwC3xMI0IRtulpXa+ p3h1jjlVJIEYjYSuZDkjraenyUe6cIiNCTa9hDCMRHrAHVg3qgdVoTmsj YWt5K9nTNamBqtNMORwICCrPLgzTGjZjBjyO6AcKIWLzZqMX4DF4BgYIS gxBVzKgmhSnC4SnJxxWqmo/HkUF3FwTNBRDP0FLDrUxXMLEvv0J4adfd8 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="287498120" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="287498120" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699711" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699711" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:08 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:54 +0000 Message-Id: <20221025201708.84018-3-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 02/16] drm/i915/perf: Add 32-bit OAG and OAR formats for DG2 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Add new OA formats for DG2. MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893 v2: - Update commit title (Ashutosh) - Coding style fixes (Lionel) - 64 bit OA formats need UMD changes in GPUvis, drop for now and send in a separate series with UMD changes v3: - Update commit message to drop 64 bit related description Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin #1 --- drivers/gpu/drm/i915/i915_perf.c | 7 +++++++ include/uapi/drm/i915_drm.h | 4 ++++ 2 files changed, 11 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 255335868b6a..2b772a6b1cd6 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -320,6 +320,8 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = { [I915_OA_FORMAT_A12] = { 0, 64 }, [I915_OA_FORMAT_A12_B8_C8] = { 2, 128 }, [I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 }, + [I915_OAR_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 }, + [I915_OA_FORMAT_A24u40_A14u32_B8_C8] = { 5, 256 }, }; #define SAMPLE_OA_REPORT (1<<0) @@ -4515,6 +4517,11 @@ static void oa_init_supported_formats(struct i915_perf *perf) oa_format_add(perf, I915_OA_FORMAT_C4_B8); break; + case INTEL_DG2: + oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8); + oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8); + break; + default: MISSING_CASE(platform); } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 2e613109356b..158b35fb28f3 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -2666,6 +2666,10 @@ enum drm_i915_oa_format { I915_OA_FORMAT_A12_B8_C8, I915_OA_FORMAT_A32u40_A4u32_B8_C8, + /* DG2 */ + I915_OAR_FORMAT_A32u40_A4u32_B8_C8, + I915_OA_FORMAT_A24u40_A14u32_B8_C8, + I915_OA_FORMAT_MAX /* non-ABI */ }; From patchwork Tue Oct 25 20:16:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019843 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3D8B0FA373E for ; Tue, 25 Oct 2022 20:17:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 42F6F10E1E9; Tue, 25 Oct 2022 20:17:43 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 35F7110E1E4 for ; Tue, 25 Oct 2022 20:17:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=LKa8xEu2kt29oumvTiSbWgVwoup4BlirQcKxvwTygKw=; b=ce9xT7lYV6mncqQtDKXzBJPbXrQIz8+wAupdfQajBHUL3zOOkA7re/DV j70e1YpTHt1TuVFyVTntpo6giTtDLRp2R1I6y/A1lTZ3FO7c2/dVyAU3D lm8W/f19aByGK9XorlaQ3QiA+fzB6Tc0tmiqiGyMaiCYpd2jzwih279Rr VQardtyLzKhRVoydXKNA+MyFkgWvneq3129d0Z+raI8L3GVYdCGkwsA0u qcDEoKXkM3J79XoCbYvnkVLvJJbWOArHy/8cCkqZl/he2v9P4giJfLJUs wfWTzkbZ7ZMHbUG4uN5Z8HBcOmVE+mrpLyAQSMXaI5B3/zwULvbNyUG5s A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="287498121" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="287498121" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699713" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699713" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:08 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:55 +0000 Message-Id: <20221025201708.84018-4-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 03/16] drm/i915/perf: Fix noa wait predication for DG2 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Predication for batch buffer commands changed in XEHPSDV. MI_BATCH_BUFFER_START predicates based on MI_SET_PREDICATE_RESULT register. The MI_SET_PREDICATE_RESULT register can only be modified with MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE command sets MI_SET_PREDICATE_RESULT based on bit 0 of MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_engine_regs.h | 1 + drivers/gpu/drm/i915/i915_perf.c | 24 +++++++++++++++++---- 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_regs.h b/drivers/gpu/drm/i915/gt/intel_engine_regs.h index fe1a0d5fd4b1..ee3efd06ee54 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_regs.h @@ -201,6 +201,7 @@ #define RING_CONTEXT_STATUS_PTR(base) _MMIO((base) + 0x3a0) #define RING_CTX_TIMESTAMP(base) _MMIO((base) + 0x3a8) /* gen8+ */ #define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8) +#define MI_PREDICATE_RESULT_2_ENGINE(base) _MMIO((base) + 0x3bc) #define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base) + 0x4D0) + (i) * 4) #define RING_FORCE_TO_NONPRIV_DENY REG_BIT(30) #define RING_FORCE_TO_NONPRIV_ADDRESS_MASK REG_GENMASK(25, 2) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2b772a6b1cd6..e68666b44a72 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -286,6 +286,7 @@ static u32 i915_perf_stream_paranoid = true; #define OAREPORT_REASON_CTX_SWITCH (1<<3) #define OAREPORT_REASON_CLK_RATIO (1<<5) +#define HAS_MI_SET_PREDICATE(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) /* For sysctl proc_dointvec_minmax of i915_oa_max_sample_rate * @@ -1760,6 +1761,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) DELTA_TARGET, N_CS_GPR }; + i915_reg_t mi_predicate_result = HAS_MI_SET_PREDICATE(i915) ? + MI_PREDICATE_RESULT_2_ENGINE(base) : + MI_PREDICATE_RESULT_1(RENDER_RING_BASE); bo = i915_gem_object_create_internal(i915, 4096); if (IS_ERR(bo)) { @@ -1797,7 +1801,7 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) stream, cs, true /* save */, CS_GPR(i), INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR + 8 * i, 2); cs = save_restore_register( - stream, cs, true /* save */, MI_PREDICATE_RESULT_1(RENDER_RING_BASE), + stream, cs, true /* save */, mi_predicate_result, INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1, 1); /* First timestamp snapshot location. */ @@ -1851,7 +1855,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ *cs++ = MI_LOAD_REGISTER_REG | (3 - 2); *cs++ = i915_mmio_reg_offset(CS_GPR(JUMP_PREDICATE)); - *cs++ = i915_mmio_reg_offset(MI_PREDICATE_RESULT_1(RENDER_RING_BASE)); + *cs++ = i915_mmio_reg_offset(mi_predicate_result); + + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE | 1; /* Restart from the beginning if we had timestamps roll over. */ *cs++ = (GRAPHICS_VER(i915) < 8 ? @@ -1861,6 +1868,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) *cs++ = i915_ggtt_offset(vma) + (ts0 - batch) * 4; *cs++ = 0; + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE; + /* * Now add the diff between to previous timestamps and add it to : * (((1 * << 64) - 1) - delay_ns) @@ -1888,7 +1898,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ *cs++ = MI_LOAD_REGISTER_REG | (3 - 2); *cs++ = i915_mmio_reg_offset(CS_GPR(JUMP_PREDICATE)); - *cs++ = i915_mmio_reg_offset(MI_PREDICATE_RESULT_1(RENDER_RING_BASE)); + *cs++ = i915_mmio_reg_offset(mi_predicate_result); + + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE | 1; /* Predicate the jump. */ *cs++ = (GRAPHICS_VER(i915) < 8 ? @@ -1898,13 +1911,16 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) *cs++ = i915_ggtt_offset(vma) + (jump - batch) * 4; *cs++ = 0; + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE; + /* Restore registers. */ for (i = 0; i < N_CS_GPR; i++) cs = save_restore_register( stream, cs, false /* restore */, CS_GPR(i), INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR + 8 * i, 2); cs = save_restore_register( - stream, cs, false /* restore */, MI_PREDICATE_RESULT_1(RENDER_RING_BASE), + stream, cs, false /* restore */, mi_predicate_result, INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1, 1); /* And return to the ring. */ From patchwork Tue Oct 25 20:16:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019848 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 91CFDFA373E for ; Tue, 25 Oct 2022 20:17:58 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 910A310E1EE; Tue, 25 Oct 2022 20:17:47 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 546CA10E1E6 for ; Tue, 25 Oct 2022 20:17:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=cFjxVIr5lKSorgveWNmdJqhprO2Fp73LjgBlMnH2AlE=; b=WGjXEKPyzuDZryPIUdVGTWsccdpi/yZdlR+IbGeb2H7R3X3idC08LBLE s+8AUppR/5rKOgi1aWtNX5aj9JuXbBtFnvzROg4BfQ0CJQvhFAsvDlktB 2Rqry76Nb2+G0eSxNpBLZRKw/3AbYK/9eGHcvDxh08w5Y1+ax/FcM2wkK Djg15p7glJ+BzrKCpGbHqBLYWJFrKGGQkC+Li3CXUBdXMJK+vV3Fd4deA jhUgf8wR3SMefPNp1F5Al+HdYboFxs/U33X48zmNziyE4xMQYm6Vk0X69 g8tv6n7K7BTJmR+XSnT9VrMJMDYbzPa4Nn5Cy1OQiGgdPf+xetFjB4J8a g==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="287498123" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="287498123" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699714" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699714" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:56 +0000 Message-Id: <20221025201708.84018-5-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 04/16] drm/i915/perf: Determine gen12 oa ctx offset at runtime X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Some SKUs of same gen12 platform may have different oactxctrl offsets. For gen12, determine oactxctrl offsets at runtime. v2: (Lionel) - Move MI definitions to intel_gpu_commands.h - Ensure __find_reg_in_lri does read past context image size v3: (Ashutosh) - Drop unnecessary use of double underscores - fix find_reg_in_lri - Return error if oa context offset is U32_MAX - Error out if oa_ctx_ctrl_offset does not find offset v4: (Ashutosh) - Warn on odd MI LRI_LEN - Remove unnecessary check for valid_oactxctrl_offset - Drop valid_oactxctrl_offset macro v5: Drop unrelated comment Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 4 + drivers/gpu/drm/i915/i915_perf.c | 146 ++++++++++++++++--- drivers/gpu/drm/i915/i915_perf_oa_regs.h | 2 +- 3 files changed, 127 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index d4e9702d3c8e..f50ea92910d9 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -187,6 +187,10 @@ #define MI_BATCH_RESOURCE_STREAMER REG_BIT(10) #define MI_BATCH_PREDICATE REG_BIT(15) /* HSW+ on RCS only*/ +#define MI_OPCODE(x) (((x) >> 23) & 0x3f) +#define IS_MI_LRI_CMD(x) (MI_OPCODE(x) == MI_OPCODE(MI_INSTR(0x22, 0))) +#define MI_LRI_LEN(x) (((x) & 0xff) + 1) + /* * 3D instructions used by the kernel */ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index e68666b44a72..b71b5cf21176 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1356,6 +1356,74 @@ static int gen12_get_render_context_id(struct i915_perf_stream *stream) return 0; } +static bool oa_find_reg_in_lri(u32 *state, u32 reg, u32 *offset, u32 end) +{ + u32 idx = *offset; + u32 len = min(MI_LRI_LEN(state[idx]) + idx, end); + bool found = false; + + idx++; + for (; idx < len; idx += 2) { + if (state[idx] == reg) { + found = true; + break; + } + } + + *offset = idx; + return found; +} + +static u32 oa_context_image_offset(struct intel_context *ce, u32 reg) +{ + u32 offset, len = (ce->engine->context_size - PAGE_SIZE) / 4; + u32 *state = ce->lrc_reg_state; + + for (offset = 0; offset < len; ) { + if (IS_MI_LRI_CMD(state[offset])) { + /* + * We expect reg-value pairs in MI_LRI command, so + * MI_LRI_LEN() should be even, if not, issue a warning. + */ + drm_WARN_ON(&ce->engine->i915->drm, + MI_LRI_LEN(state[offset]) & 0x1); + + if (oa_find_reg_in_lri(state, reg, &offset, len)) + break; + } else { + offset++; + } + } + + return offset < len ? offset : U32_MAX; +} + +static int set_oa_ctx_ctrl_offset(struct intel_context *ce) +{ + i915_reg_t reg = GEN12_OACTXCONTROL(ce->engine->mmio_base); + struct i915_perf *perf = &ce->engine->i915->perf; + u32 offset = perf->ctx_oactxctrl_offset; + + /* Do this only once. Failure is stored as offset of U32_MAX */ + if (offset) + goto exit; + + offset = oa_context_image_offset(ce, i915_mmio_reg_offset(reg)); + perf->ctx_oactxctrl_offset = offset; + + drm_dbg(&ce->engine->i915->drm, + "%s oa ctx control at 0x%08x dword offset\n", + ce->engine->name, offset); + +exit: + return offset && offset != U32_MAX ? 0 : -ENODEV; +} + +static bool engine_supports_mi_query(struct intel_engine_cs *engine) +{ + return engine->class == RENDER_CLASS; +} + /** * oa_get_render_ctx_id - determine and hold ctx hw id * @stream: An i915-perf stream opened for OA metrics @@ -1375,6 +1443,21 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) if (IS_ERR(ce)) return PTR_ERR(ce); + if (engine_supports_mi_query(stream->engine)) { + /* + * We are enabling perf query here. If we don't find the context + * offset here, just return an error. + */ + ret = set_oa_ctx_ctrl_offset(ce); + if (ret) { + intel_context_unpin(ce); + drm_err(&stream->perf->i915->drm, + "Enabling perf query failed for %s\n", + stream->engine->name); + return ret; + } + } + switch (GRAPHICS_VER(ce->engine->i915)) { case 7: { /* @@ -2406,10 +2489,11 @@ static int gen12_configure_oar_context(struct i915_perf_stream *stream, int err; struct intel_context *ce = stream->pinned_ctx; u32 format = stream->oa_buffer.format; + u32 offset = stream->perf->ctx_oactxctrl_offset; struct flex regs_context[] = { { GEN8_OACTXCONTROL, - stream->perf->ctx_oactxctrl_offset + 1, + offset + 1, active ? GEN8_OA_COUNTER_RESUME : 0, }, }; @@ -2434,12 +2518,13 @@ static int gen12_configure_oar_context(struct i915_perf_stream *stream, }, }; - /* Modify the context image of pinned context with regs_context*/ + /* Modify the context image of pinned context with regs_context */ err = intel_context_lock_pinned(ce); if (err) return err; - err = gen8_modify_context(ce, regs_context, ARRAY_SIZE(regs_context)); + err = gen8_modify_context(ce, regs_context, + ARRAY_SIZE(regs_context)); intel_context_unlock_pinned(ce); if (err) return err; @@ -2564,6 +2649,7 @@ lrc_configure_all_contexts(struct i915_perf_stream *stream, const struct i915_oa_config *oa_config, struct i915_active *active) { + u32 ctx_oactxctrl = stream->perf->ctx_oactxctrl_offset; /* The MMIO offsets for Flex EU registers aren't contiguous */ const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset; #define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1) @@ -2574,7 +2660,7 @@ lrc_configure_all_contexts(struct i915_perf_stream *stream, }, { GEN8_OACTXCONTROL, - stream->perf->ctx_oactxctrl_offset + 1, + ctx_oactxctrl + 1, }, { EU_PERF_CNTL0, ctx_flexeuN(0) }, { EU_PERF_CNTL1, ctx_flexeuN(1) }, @@ -4543,6 +4629,37 @@ static void oa_init_supported_formats(struct i915_perf *perf) } } +static void i915_perf_init_info(struct drm_i915_private *i915) +{ + struct i915_perf *perf = &i915->perf; + + switch (GRAPHICS_VER(i915)) { + case 8: + perf->ctx_oactxctrl_offset = 0x120; + perf->ctx_flexeu0_offset = 0x2ce; + perf->gen8_valid_ctx_bit = BIT(25); + break; + case 9: + perf->ctx_oactxctrl_offset = 0x128; + perf->ctx_flexeu0_offset = 0x3de; + perf->gen8_valid_ctx_bit = BIT(16); + break; + case 11: + perf->ctx_oactxctrl_offset = 0x124; + perf->ctx_flexeu0_offset = 0x78e; + perf->gen8_valid_ctx_bit = BIT(16); + break; + case 12: + /* + * Calculate offset at runtime in oa_pin_context for gen12 and + * cache the value in perf->ctx_oactxctrl_offset. + */ + break; + default: + MISSING_CASE(GRAPHICS_VER(i915)); + } +} + /** * i915_perf_init - initialize i915-perf state on module bind * @i915: i915 device instance @@ -4581,6 +4698,7 @@ void i915_perf_init(struct drm_i915_private *i915) * execlist mode by default. */ perf->ops.read = gen8_oa_read; + i915_perf_init_info(i915); if (IS_GRAPHICS_VER(i915, 8, 9)) { perf->ops.is_valid_b_counter_reg = @@ -4600,18 +4718,6 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen8_enable_metric_set; perf->ops.disable_metric_set = gen8_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; - - if (GRAPHICS_VER(i915) == 8) { - perf->ctx_oactxctrl_offset = 0x120; - perf->ctx_flexeu0_offset = 0x2ce; - - perf->gen8_valid_ctx_bit = BIT(25); - } else { - perf->ctx_oactxctrl_offset = 0x128; - perf->ctx_flexeu0_offset = 0x3de; - - perf->gen8_valid_ctx_bit = BIT(16); - } } else if (GRAPHICS_VER(i915) == 11) { perf->ops.is_valid_b_counter_reg = gen7_is_valid_b_counter_addr; @@ -4625,11 +4731,6 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen8_enable_metric_set; perf->ops.disable_metric_set = gen11_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; - - perf->ctx_oactxctrl_offset = 0x124; - perf->ctx_flexeu0_offset = 0x78e; - - perf->gen8_valid_ctx_bit = BIT(16); } else if (GRAPHICS_VER(i915) == 12) { perf->ops.is_valid_b_counter_reg = gen12_is_valid_b_counter_addr; @@ -4643,9 +4744,6 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen12_enable_metric_set; perf->ops.disable_metric_set = gen12_disable_metric_set; perf->ops.oa_hw_tail_read = gen12_oa_hw_tail_read; - - perf->ctx_flexeu0_offset = 0; - perf->ctx_oactxctrl_offset = 0x144; } } diff --git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h index f31c9f13a9fc..0ef3562ff4aa 100644 --- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h +++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h @@ -97,7 +97,7 @@ #define GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT 1 #define GEN12_OAR_OACONTROL_COUNTER_ENABLE (1 << 0) -#define GEN12_OACTXCONTROL _MMIO(0x2360) +#define GEN12_OACTXCONTROL(base) _MMIO((base) + 0x360) #define GEN12_OAR_OASTATUS _MMIO(0x2968) /* Gen12 OAG unit */ From patchwork Tue Oct 25 20:16:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019850 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5C162FA373E for ; Tue, 25 Oct 2022 20:18:02 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E578A10E204; Tue, 25 Oct 2022 20:17:48 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id EC7C610E1A0 for ; Tue, 25 Oct 2022 20:17:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=QNa1fAgZRmoghrzYtP3NECj4ODDCLvmpK3e70sTj00M=; b=CBb/t9nzRO8jjJC6btqOUxn7eXxRfb3vcgLYNJ2EmdjTC5xoPik089cz CcCKNIRdTWd2+td19T4INSbP+nTwEl3Q+ziJlJt9O701p/RAdmBdT4zmh g3xSrUoe0oP1c4WKM647nHUdCp2Ny+3C7VJzKgPdkENdF8Xk3YtV6+aui 1OJm10zDvSbhn1oH45/qVD/PfIkHfQMvXlNSEjKXI+BZ3iC+vhoFMSaHi NXFIL+OGT78nCIEDNaDO0Vn3AwIAs07W+JmbI6cXRn349zNZn2IFfuEvJ 9+hQJY7yFuda4ocxOkNDR2CGk+MHDb71+cDIQjSZUwdsQbXx/oVtaoBwf g==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847670" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847670" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699716" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699716" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:57 +0000 Message-Id: <20221025201708.84018-6-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 05/16] drm/i915/perf: Enable bytes per clock reporting in OA X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" XEHPSDV and DG2 provide a way to configure bytes per clock vs commands per clock reporting. Enable bytes per clock setting on enabling OA. Bspec: 51762 Bspec: 52201 v2: - Fix commit msg (Ashutosh) - Fix checkpatch issues v3: - s/commands/bytes/ in code comment and commmit msg Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/i915_perf.c | 20 ++++++++++++++++++++ drivers/gpu/drm/i915/i915_perf_oa_regs.h | 4 ++++ drivers/gpu/drm/i915/intel_device_info.h | 1 + 5 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7c64f8a17493..438aebeea103 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -898,6 +898,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm) #define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc) +#define HAS_OA_BPC_REPORTING(dev_priv) \ + (INTEL_INFO(dev_priv)->has_oa_bpc_reporting) + /* * Set this flag, when platform requires 64K GTT page sizes or larger for * device local memory access. diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 496df0f547f4..cbced3f3db17 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1023,6 +1023,7 @@ static const struct intel_device_info adl_p_info = { .has_logical_ring_contexts = 1, \ .has_logical_ring_elsq = 1, \ .has_mslice_steering = 1, \ + .has_oa_bpc_reporting = 1, \ .has_rc6 = 1, \ .has_reset_engine = 1, \ .has_rps = 1, \ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index b71b5cf21176..d11cc949c9be 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -2748,10 +2748,12 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream, struct i915_active *active) { + struct drm_i915_private *i915 = stream->perf->i915; struct intel_uncore *uncore = stream->uncore; struct i915_oa_config *oa_config = stream->oa_config; bool periodic = stream->periodic; u32 period_exponent = stream->period_exponent; + u32 sqcnt1; int ret; intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG, @@ -2770,6 +2772,16 @@ gen12_enable_metric_set(struct i915_perf_stream *stream, (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT)) : 0); + /* + * Initialize Super Queue Internal Cnt Register + * Set PMON Enable in order to collect valid metrics. + * Enable byets per clock reporting in OA for XEHPSDV onward. + */ + sqcnt1 = GEN12_SQCNT1_PMON_ENABLE | + (HAS_OA_BPC_REPORTING(i915) ? GEN12_SQCNT1_OABPC : 0); + + intel_uncore_rmw(uncore, GEN12_SQCNT1, 0, sqcnt1); + /* * Update all contexts prior writing the mux configurations as we need * to make sure all slices/subslices are ON before writing to NOA @@ -2819,6 +2831,8 @@ static void gen11_disable_metric_set(struct i915_perf_stream *stream) static void gen12_disable_metric_set(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; + struct drm_i915_private *i915 = stream->perf->i915; + u32 sqcnt1; /* Reset all contexts' slices/subslices configurations. */ gen12_configure_all_contexts(stream, NULL, NULL); @@ -2829,6 +2843,12 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream) /* Make sure we disable noa to save power. */ intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0); + + sqcnt1 = GEN12_SQCNT1_PMON_ENABLE | + (HAS_OA_BPC_REPORTING(i915) ? GEN12_SQCNT1_OABPC : 0); + + /* Reset PMON Enable to save power. */ + intel_uncore_rmw(uncore, GEN12_SQCNT1, sqcnt1, 0); } static void gen7_oa_enable(struct i915_perf_stream *stream) diff --git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h index 0ef3562ff4aa..381d94101610 100644 --- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h +++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h @@ -134,4 +134,8 @@ #define GDT_CHICKEN_BITS _MMIO(0x9840) #define GT_NOA_ENABLE 0x00000080 +#define GEN12_SQCNT1 _MMIO(0x8718) +#define GEN12_SQCNT1_PMON_ENABLE REG_BIT(30) +#define GEN12_SQCNT1_OABPC REG_BIT(29) + #endif /* __INTEL_PERF_OA_REGS__ */ diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index cdf78728dcad..42218c8d85f2 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -164,6 +164,7 @@ enum intel_ppgtt_type { func(has_logical_ring_elsq); \ func(has_media_ratio_mode); \ func(has_mslice_steering); \ + func(has_oa_bpc_reporting); \ func(has_one_eu_per_fuse_bit); \ func(has_pxp); \ func(has_rc6); \ From patchwork Tue Oct 25 20:16:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 89DC0FA3740 for ; Tue, 25 Oct 2022 20:18:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A953710E1F8; Tue, 25 Oct 2022 20:17:47 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id BF6FC10E195 for ; Tue, 25 Oct 2022 20:17:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=8fhMJUnphS85HIbgGhFVJCpLu1IjGd4WHwEfzhps+eQ=; b=dq3XcL/6uOWBFpGlaT5Z1UkrBIaVq5Zc35a0ZBRKt7xr1yyRDFqo/lxx klwfkGAx50po1CkHyCaWcjV+ToAiINjdypBD+6/b4QJgYDHqiFcyr/D/O mrer3XEh5s/sldkr0uQEFn59pbQoS1AO0UcOuXvSrZVZTlWQ9pwpoc8vG ego/tbO2MGYm8mU+BXoj0sS6UgTTq//aEPANwH/mMjscodQgHXv4dkKRF klX++sVMmKHC1zMDcpPcS/KiBf6HB+ArbXxclAUxqhzv93ZnxacHqcXx8 OAItQXoWn+7rdtviVDaCHhwV+wtAIn/QZz+UhYn0QdyXhREOJ/GQjL941 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847671" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847671" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699717" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699717" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:58 +0000 Message-Id: <20221025201708.84018-7-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 06/16] drm/i915/perf: Simply use stream->ctx X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Earlier code used exclusive_stream to check for user passed context. Simplify this by accessing stream->ctx. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index d11cc949c9be..75d320b2c1f8 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -776,7 +776,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * switches since it's not-uncommon for periodic samples to * identify a switch before any 'context switch' report. */ - if (!stream->perf->exclusive_stream->ctx || + if (!stream->ctx || stream->specific_ctx_id == ctx_id || stream->oa_buffer.last_ctx_id == stream->specific_ctx_id || reason & OAREPORT_REASON_CTX_SWITCH) { @@ -785,7 +785,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * While filtering for a single context we avoid * leaking the IDs of other contexts. */ - if (stream->perf->exclusive_stream->ctx && + if (stream->ctx && stream->specific_ctx_id != ctx_id) { report32[2] = INVALID_CTX_ID; } From patchwork Tue Oct 25 20:16:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 35B5BC38A2D for ; Tue, 25 Oct 2022 20:17:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E16C010E1A0; Tue, 25 Oct 2022 20:17:45 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 25D4910E1B9 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=L1cDWYQev2o99tMCAGdSsoFL/Y5cYR2yrKa1j7zp/Zo=; b=H7j0/8V2bFjLMn78VY59UD7Z1GaSi1LcR600thL6I6P096mLfMydVFRd dPM0TU+nZqbfocpJdWY0snk9pg7YjZMOz/In4Yqnzc1fMnKplyDdbAtey D53AIMcxE0pZ1c0vOBvGxRXQUn/fs1vi2PqSUV/AgFqXmCRXZGJCjie5M Q5MI7JMU9fVSWWRcLzXqon5eAsxTaT3JI66GmudNlm2D9OhhQQ+KqOJO+ VGUM2Ys/CLfkE/RSTTkLDiesXVIsyXPKXEmulFVH/uiaLNlArFNnyMrQx KauyaOu5DLIDqA/lgWuFC/gI5/HTClyecYPNj5TnFKgklE9ivdjx3NdMh A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847673" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847673" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699721" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699721" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:16:59 +0000 Message-Id: <20221025201708.84018-8-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 07/16] drm/i915/perf: Move gt-specific data from i915->perf to gt->perf X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Make perf part of gt as the OAG buffer is specific to a gt. The refactor eventually simplifies programming the right OA buffer and the right HW registers when supporting multiple gts. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 + drivers/gpu/drm/i915/gt/intel_sseu.c | 4 +- drivers/gpu/drm/i915/i915_perf.c | 75 +++++++++++++--------- drivers/gpu/drm/i915/i915_perf_types.h | 39 +++++------ drivers/gpu/drm/i915/selftests/i915_perf.c | 16 +++-- 5 files changed, 80 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 64aa2ba624fc..6f686a4244f0 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -20,6 +20,7 @@ #include "intel_gsc.h" #include "i915_vma.h" +#include "i915_perf_types.h" #include "intel_engine_types.h" #include "intel_gt_buffer_pool_types.h" #include "intel_hwconfig.h" @@ -289,6 +290,8 @@ struct intel_gt { /* sysfs defaults per gt */ struct gt_defaults defaults; struct kobject *sysfs_defaults; + + struct i915_perf_gt perf; }; struct intel_gt_definition { diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 66f21c735d54..6c6198a257ac 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -677,8 +677,8 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt, * If i915/perf is active, we want a stable powergating configuration * on the system. Use the configuration pinned by i915/perf. */ - if (i915->perf.exclusive_stream) - req_sseu = &i915->perf.sseu; + if (gt->perf.exclusive_stream) + req_sseu = >->perf.sseu; slices = hweight8(req_sseu->slice_mask); subslices = hweight8(req_sseu->subslice_mask); diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 75d320b2c1f8..83c5dc043261 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1565,8 +1565,9 @@ free_noa_wait(struct i915_perf_stream *stream) static void i915_oa_stream_destroy(struct i915_perf_stream *stream) { struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; - if (WARN_ON(stream != perf->exclusive_stream)) + if (WARN_ON(stream != gt->perf.exclusive_stream)) return; /* @@ -1575,7 +1576,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream) * * See i915_oa_init_reg_state() and lrc_configure_all_contexts() */ - WRITE_ONCE(perf->exclusive_stream, NULL); + WRITE_ONCE(gt->perf.exclusive_stream, NULL); perf->ops.disable_metric_set(stream); free_oa_buffer(stream); @@ -2566,10 +2567,11 @@ oa_configure_all_contexts(struct i915_perf_stream *stream, { struct drm_i915_private *i915 = stream->perf->i915; struct intel_engine_cs *engine; + struct intel_gt *gt = stream->engine->gt; struct i915_gem_context *ctx, *cn; int err; - lockdep_assert_held(&stream->perf->lock); + lockdep_assert_held(>->perf.lock); /* * The OA register config is setup through the context image. This image @@ -3090,6 +3092,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, { struct drm_i915_private *i915 = stream->perf->i915; struct i915_perf *perf = stream->perf; + struct intel_gt *gt; int format_size; int ret; @@ -3098,6 +3101,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, "OA engine not specified\n"); return -EINVAL; } + gt = props->engine->gt; /* * If the sysfs metrics/ directory wasn't registered for some @@ -3128,7 +3132,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, * counter reports and marshal to the appropriate client * we currently only allow exclusive access */ - if (perf->exclusive_stream) { + if (gt->perf.exclusive_stream) { drm_dbg(&stream->perf->i915->drm, "OA unit already in use\n"); return -EBUSY; @@ -3208,8 +3212,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->ops = &i915_oa_stream_ops; - perf->sseu = props->sseu; - WRITE_ONCE(perf->exclusive_stream, stream); + stream->engine->gt->perf.sseu = props->sseu; + WRITE_ONCE(gt->perf.exclusive_stream, stream); ret = i915_perf_stream_enable_sync(stream); if (ret) { @@ -3231,7 +3235,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, return 0; err_enable: - WRITE_ONCE(perf->exclusive_stream, NULL); + WRITE_ONCE(gt->perf.exclusive_stream, NULL); perf->ops.disable_metric_set(stream); free_oa_buffer(stream); @@ -3261,7 +3265,7 @@ void i915_oa_init_reg_state(const struct intel_context *ce, return; /* perf.exclusive_stream serialised by lrc_configure_all_contexts() */ - stream = READ_ONCE(engine->i915->perf.exclusive_stream); + stream = READ_ONCE(engine->gt->perf.exclusive_stream); if (stream && GRAPHICS_VER(stream->perf->i915) < 12) gen8_update_reg_state_unlocked(ce, stream); } @@ -3290,7 +3294,7 @@ static ssize_t i915_perf_read(struct file *file, loff_t *ppos) { struct i915_perf_stream *stream = file->private_data; - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; size_t offset = 0; int ret; @@ -3314,14 +3318,14 @@ static ssize_t i915_perf_read(struct file *file, if (ret) return ret; - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); } while (!offset && !ret); } else { - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); } /* We allow the poll checking to sometimes report false positive EPOLLIN @@ -3368,7 +3372,7 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer) * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that * will be woken for new stream data. * - * Note: The &perf->lock mutex has been taken to serialize + * Note: The >->perf.lock mutex has been taken to serialize * with any non-file-operation driver hooks. * * Returns: any poll events that are ready without sleeping @@ -3409,12 +3413,12 @@ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream, static __poll_t i915_perf_poll(struct file *file, poll_table *wait) { struct i915_perf_stream *stream = file->private_data; - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; __poll_t ret; - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); ret = i915_perf_poll_locked(stream, file, wait); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); return ret; } @@ -3513,7 +3517,7 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, * @cmd: the ioctl request * @arg: the ioctl data * - * Note: The &perf->lock mutex has been taken to serialize + * Note: The >->perf.lock mutex has been taken to serialize * with any non-file-operation driver hooks. * * Returns: zero on success or a negative error code. Returns -EINVAL for @@ -3553,12 +3557,12 @@ static long i915_perf_ioctl(struct file *file, unsigned long arg) { struct i915_perf_stream *stream = file->private_data; - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; long ret; - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); ret = i915_perf_ioctl_locked(stream, cmd, arg); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); return ret; } @@ -3570,7 +3574,7 @@ static long i915_perf_ioctl(struct file *file, * Frees all resources associated with the given i915 perf @stream, disabling * any associated data capture in the process. * - * Note: The &perf->lock mutex has been taken to serialize + * Note: The >->perf.lock mutex has been taken to serialize * with any non-file-operation driver hooks. */ static void i915_perf_destroy_locked(struct i915_perf_stream *stream) @@ -3602,10 +3606,11 @@ static int i915_perf_release(struct inode *inode, struct file *file) { struct i915_perf_stream *stream = file->private_data; struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); i915_perf_destroy_locked(stream); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); /* Release the reference the perf stream kept on the driver. */ drm_dev_put(&perf->i915->drm); @@ -3638,7 +3643,7 @@ static const struct file_operations fops = { * See i915_perf_ioctl_open() for interface details. * * Implements further stream config validation and stream initialization on - * behalf of i915_perf_open_ioctl() with the &perf->lock mutex + * behalf of i915_perf_open_ioctl() with the >->perf.lock mutex * taken to serialize with any non-file-operation driver hooks. * * Note: at this point the @props have only been validated in isolation and @@ -4022,7 +4027,7 @@ static int read_properties_unlocked(struct i915_perf *perf, * mutex to avoid an awkward lockdep with mmap_lock. * * Most of the implementation details are handled by - * i915_perf_open_ioctl_locked() after taking the &perf->lock + * i915_perf_open_ioctl_locked() after taking the >->perf.lock * mutex for serializing with any non-file-operation driver hooks. * * Return: A newly opened i915 Perf stream file descriptor or negative @@ -4033,6 +4038,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, { struct i915_perf *perf = &to_i915(dev)->perf; struct drm_i915_perf_open_param *param = data; + struct intel_gt *gt; struct perf_open_properties props; u32 known_open_flags; int ret; @@ -4059,9 +4065,11 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, if (ret) return ret; - mutex_lock(&perf->lock); + gt = props.engine->gt; + + mutex_lock(>->perf.lock); ret = i915_perf_open_ioctl_locked(perf, param, &props, file); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); return ret; } @@ -4077,6 +4085,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, void i915_perf_register(struct drm_i915_private *i915) { struct i915_perf *perf = &i915->perf; + struct intel_gt *gt = to_gt(i915); if (!perf->i915) return; @@ -4085,13 +4094,13 @@ void i915_perf_register(struct drm_i915_private *i915) * i915_perf_open_ioctl(); considering that we register after * being exposed to userspace. */ - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); perf->metrics_kobj = kobject_create_and_add("metrics", &i915->drm.primary->kdev->kobj); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); } /** @@ -4768,7 +4777,11 @@ void i915_perf_init(struct drm_i915_private *i915) } if (perf->ops.enable_metric_set) { - mutex_init(&perf->lock); + struct intel_gt *gt; + int i; + + for_each_gt(gt, i915, i) + mutex_init(>->perf.lock); /* Choose a representative limit */ oa_sample_rate_hard_limit = to_gt(i915)->clock_frequency / 2; diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index 05cb9a335a97..e888bfab478f 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -380,6 +380,26 @@ struct i915_oa_ops { u32 (*oa_hw_tail_read)(struct i915_perf_stream *stream); }; +struct i915_perf_gt { + /* + * Lock associated with anything below within this structure. + */ + struct mutex lock; + + /** + * @sseu: sseu configuration selected to run while perf is active, + * applies to all contexts. + */ + struct intel_sseu sseu; + + /* + * @exclusive_stream: The stream currently using the OA unit. This is + * sometimes accessed outside a syscall associated to its file + * descriptor. + */ + struct i915_perf_stream *exclusive_stream; +}; + struct i915_perf { struct drm_i915_private *i915; @@ -397,25 +417,6 @@ struct i915_perf { */ struct idr metrics_idr; - /* - * Lock associated with anything below within this structure - * except exclusive_stream. - */ - struct mutex lock; - - /* - * The stream currently using the OA unit. If accessed - * outside a syscall associated to its file - * descriptor. - */ - struct i915_perf_stream *exclusive_stream; - - /** - * @sseu: sseu configuration selected to run while perf is active, - * applies to all contexts. - */ - struct intel_sseu sseu; - /** * For rate limiting any notifications of spurious * invalid OA reports diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c index 429c6d73b159..24dde5531423 100644 --- a/drivers/gpu/drm/i915/selftests/i915_perf.c +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c @@ -102,6 +102,12 @@ test_stream(struct i915_perf *perf) I915_OA_FORMAT_A32u40_A4u32_B8_C8 : I915_OA_FORMAT_C4_B8, }; struct i915_perf_stream *stream; + struct intel_gt *gt; + + if (!props.engine) + return NULL; + + gt = props.engine->gt; if (!oa_config) return NULL; @@ -116,12 +122,12 @@ test_stream(struct i915_perf *perf) stream->perf = perf; - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); if (i915_oa_stream_init(stream, ¶m, &props)) { kfree(stream); stream = NULL; } - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); i915_oa_config_put(oa_config); @@ -130,11 +136,11 @@ test_stream(struct i915_perf *perf) static void stream_destroy(struct i915_perf_stream *stream) { - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; - mutex_lock(&perf->lock); + mutex_lock(>->perf.lock); i915_perf_destroy_locked(stream); - mutex_unlock(&perf->lock); + mutex_unlock(>->perf.lock); } static int live_sanitycheck(void *arg) From patchwork Tue Oct 25 20:17:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7111AC38A2D for ; Tue, 25 Oct 2022 20:19:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 915DF10E248; Tue, 25 Oct 2022 20:19:45 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5819310E1E4 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=Iu/sQAT2Cw1JVTbWsP1Bo9Ez2sbIg9ZzGNjLknBxvM4=; b=BZXbE03olq6txuv0P1K9Qgcb49aa/AhKvn+u6/3Gtm+TkSP2E2NlUdEZ H8bMzYpFtG/+64/LV+8jP6cCjL8/ZWdqwaF9K+CnZVBpV4zE92WTYaSNf SiuIQvyWGPkn+erEqubbvTo/PHq9p0kb5u9r56cXMFpdcv0Z8EBwLA0ns abB0AHFfXMtkMeGWzR1dViXizXjx5yj58za/HfwROAGPyyKcxwv9nlTTj W4uve4vNEtRyNeTXO6P9byHgwkkfmlVxkYevrfTRBuMj+eeCTP0FiGeOv bV2x7yFCmvOHykaPCaJL316h60Gc+/uFhax4ko8/f8gGEmpABWeO2SFl8 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847674" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847674" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699724" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699724" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:00 +0000 Message-Id: <20221025201708.84018-9-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 08/16] drm/i915/perf: Replace gt->perf.lock with stream->lock for file ops X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" With multi-gt, user can access multiple OA buffers concurrently. Use stream->lock instead of gt->perf.lock to serialize file operations. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 31 ++++++++++++-------------- drivers/gpu/drm/i915/i915_perf_types.h | 5 +++++ 2 files changed, 19 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 83c5dc043261..9a00398ae25f 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3231,6 +3231,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->poll_check_timer.function = oa_poll_check_timer_cb; init_waitqueue_head(&stream->poll_wq); spin_lock_init(&stream->oa_buffer.ptr_lock); + mutex_init(&stream->lock); return 0; @@ -3294,7 +3295,6 @@ static ssize_t i915_perf_read(struct file *file, loff_t *ppos) { struct i915_perf_stream *stream = file->private_data; - struct intel_gt *gt = stream->engine->gt; size_t offset = 0; int ret; @@ -3318,14 +3318,14 @@ static ssize_t i915_perf_read(struct file *file, if (ret) return ret; - mutex_lock(>->perf.lock); + mutex_lock(&stream->lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(>->perf.lock); + mutex_unlock(&stream->lock); } while (!offset && !ret); } else { - mutex_lock(>->perf.lock); + mutex_lock(&stream->lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(>->perf.lock); + mutex_unlock(&stream->lock); } /* We allow the poll checking to sometimes report false positive EPOLLIN @@ -3372,9 +3372,6 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer) * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that * will be woken for new stream data. * - * Note: The >->perf.lock mutex has been taken to serialize - * with any non-file-operation driver hooks. - * * Returns: any poll events that are ready without sleeping */ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream, @@ -3413,12 +3410,11 @@ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream, static __poll_t i915_perf_poll(struct file *file, poll_table *wait) { struct i915_perf_stream *stream = file->private_data; - struct intel_gt *gt = stream->engine->gt; __poll_t ret; - mutex_lock(>->perf.lock); + mutex_lock(&stream->lock); ret = i915_perf_poll_locked(stream, file, wait); - mutex_unlock(>->perf.lock); + mutex_unlock(&stream->lock); return ret; } @@ -3517,9 +3513,6 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, * @cmd: the ioctl request * @arg: the ioctl data * - * Note: The >->perf.lock mutex has been taken to serialize - * with any non-file-operation driver hooks. - * * Returns: zero on success or a negative error code. Returns -EINVAL for * an unknown ioctl request. */ @@ -3557,12 +3550,11 @@ static long i915_perf_ioctl(struct file *file, unsigned long arg) { struct i915_perf_stream *stream = file->private_data; - struct intel_gt *gt = stream->engine->gt; long ret; - mutex_lock(>->perf.lock); + mutex_lock(&stream->lock); ret = i915_perf_ioctl_locked(stream, cmd, arg); - mutex_unlock(>->perf.lock); + mutex_unlock(&stream->lock); return ret; } @@ -3608,6 +3600,11 @@ static int i915_perf_release(struct inode *inode, struct file *file) struct i915_perf *perf = stream->perf; struct intel_gt *gt = stream->engine->gt; + /* + * Within this call, we know that the fd is being closed and we have no + * other user of stream->lock. Use the perf lock to destroy the stream + * here. + */ mutex_lock(>->perf.lock); i915_perf_destroy_locked(stream); mutex_unlock(>->perf.lock); diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index e888bfab478f..dc9bfd8086cf 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -146,6 +146,11 @@ struct i915_perf_stream { */ struct intel_engine_cs *engine; + /* + * Lock associated with operations on stream + */ + struct mutex lock; + /** * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*` * properties given when opening a stream, representing the contents From patchwork Tue Oct 25 20:17:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3BFC3C38A2D for ; Tue, 25 Oct 2022 20:19:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CE99210E265; Tue, 25 Oct 2022 20:19:50 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6E09010E1E9 for ; Tue, 25 Oct 2022 20:17:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729030; x=1698265030; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=ZZfFCkoh6MPPUxL/bCeTS2yNAhT4KZgxAlEzIIlXMWQ=; b=fPRju2DSnwSGEzTk6jDKLsgNxMnP3iN10/9Jl3F3MnO+bv8BNee0oAWm lvAw+695K/S2L5mt9sOBiB88PJlbIztthITeDmultSKTFUDHYI9joaRze Dg7DHhCBcIlSJYPTS3JrvBAxaAhtMbx4DEznzo2CKhI9N/1wIP3anYkhn wKNWZDC2s4miffPRbGdJx7HhD0yVvoodWnWTaIyNNowAtcA9hqJ38Gfui 8WaLst8OxhhwBLn4Oj+MnHPpl0e1+4Nm3CYCj+DCkQoNEoXevP1Rr8pPk L51FQfopac0kwl3a6vR/ESjBdRLRtDzJR2g3bbfXnmCFIKRlld31WL85k A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="287498125" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="287498125" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699727" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699727" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:01 +0000 Message-Id: <20221025201708.84018-10-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 09/16] drm/i915/perf: Use gt-specific ggtt for OA and noa-wait buffers X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" User passes uabi engine class and instance to the perf OA interface. Use gt corresponding to the engine to pin the buffers to the right ggtt. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 9a00398ae25f..2c8727253f0d 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1754,6 +1754,7 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream) static int alloc_oa_buffer(struct i915_perf_stream *stream) { struct drm_i915_private *i915 = stream->perf->i915; + struct intel_gt *gt = stream->engine->gt; struct drm_i915_gem_object *bo; struct i915_vma *vma; int ret; @@ -1773,11 +1774,22 @@ static int alloc_oa_buffer(struct i915_perf_stream *stream) i915_gem_object_set_cache_coherency(bo, I915_CACHE_LLC); /* PreHSW required 512K alignment, HSW requires 16M */ - vma = i915_gem_object_ggtt_pin(bo, NULL, 0, SZ_16M, 0); + vma = i915_vma_instance(bo, >->ggtt->vm, NULL); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto err_unref; } + + /* + * PreHSW required 512K alignment. + * HSW and onwards, align to requested size of OA buffer. + */ + ret = i915_vma_pin(vma, 0, SZ_16M, PIN_GLOBAL | PIN_HIGH); + if (ret) { + drm_err(>->i915->drm, "Failed to pin OA buffer %d\n", ret); + goto err_unref; + } + stream->oa_buffer.vma = vma; stream->oa_buffer.vaddr = @@ -1827,6 +1839,7 @@ static u32 *save_restore_register(struct i915_perf_stream *stream, u32 *cs, static int alloc_noa_wait(struct i915_perf_stream *stream) { struct drm_i915_private *i915 = stream->perf->i915; + struct intel_gt *gt = stream->engine->gt; struct drm_i915_gem_object *bo; struct i915_vma *vma; const u64 delay_ticks = 0xffffffffffffffff - @@ -1867,12 +1880,16 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) * multiple OA config BOs will have a jump to this address and it * needs to be fixed during the lifetime of the i915/perf stream. */ - vma = i915_gem_object_ggtt_pin_ww(bo, &ww, NULL, 0, 0, PIN_HIGH); + vma = i915_vma_instance(bo, >->ggtt->vm, NULL); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto out_ww; } + ret = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_GLOBAL | PIN_HIGH); + if (ret) + goto out_ww; + batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); if (IS_ERR(batch)) { ret = PTR_ERR(batch); From patchwork Tue Oct 25 20:17:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019846 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0A0C4FA373E for ; Tue, 25 Oct 2022 20:17:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9123410E1E6; Tue, 25 Oct 2022 20:17:45 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7F06210E1EA for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=FnCQd6/dfYi+SC8Zcyus2h0Pl1aP5tGzcTd0NpaAvgg=; b=h3mMJMliArBFfJCsH/2OP8fKkHva6lMSbWRpeijpdAGyNgLkgX9ZYIJe IbCik6Q6ppdnflknLPWZ+arWpOLzjSUEEISjv/4o59Z7IFb3LxsYduh1j 32ew8pyUjcPgkOvaZJf0MkzKWE2n9xjrsurHVuiQXYEDxp4JAfN6u1CrA BQUMgxGknlHdCKVCK5R53q9oQx+k/2dJYm30TVhp0DA0bnR25Pl2epCyb ab/L14zn6nXsHDfQU8loPb31fUvt2Zk6ej191s1A6D+gPRxV/E70IWNif qVBYIEaa6fc+VPp7CbQQIfoReuWriv0oj79hepDolagFEFq8Ubtj4BnIo A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847676" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847676" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699728" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699728" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:02 +0000 Message-Id: <20221025201708.84018-11-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 10/16] drm/i915/perf: Store a pointer to oa_format in oa_buffer X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" DG2 introduces OA reports with 64 bit report header fields. Perf OA would need more information about the OA format in order to process such reports. Store all OA format info in oa_buffer instead of just the size and format-id. v2: Drop format_size variable (Ashutosh) Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 30 +++++++++++--------------- drivers/gpu/drm/i915/i915_perf_types.h | 3 +-- 2 files changed, 13 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2c8727253f0d..585079ae5f03 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -465,7 +465,7 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream) static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) { u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; unsigned long flags; bool pollin; u32 hw_tail; @@ -602,7 +602,7 @@ static int append_oa_sample(struct i915_perf_stream *stream, size_t *offset, const u8 *report) { - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; struct drm_i915_perf_record_header header; header.type = DRM_I915_PERF_RECORD_SAMPLE; @@ -652,7 +652,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, size_t *offset) { struct intel_uncore *uncore = stream->uncore; - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; u8 *oa_buf_base = stream->oa_buffer.vaddr; u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); u32 mask = (OA_BUFFER_SIZE - 1); @@ -945,7 +945,7 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, size_t *offset) { struct intel_uncore *uncore = stream->uncore; - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; u8 *oa_buf_base = stream->oa_buffer.vaddr; u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); u32 mask = (OA_BUFFER_SIZE - 1); @@ -2506,7 +2506,7 @@ static int gen12_configure_oar_context(struct i915_perf_stream *stream, { int err; struct intel_context *ce = stream->pinned_ctx; - u32 format = stream->oa_buffer.format; + u32 format = stream->oa_buffer.format->format; u32 offset = stream->perf->ctx_oactxctrl_offset; struct flex regs_context[] = { { @@ -2877,7 +2877,7 @@ static void gen7_oa_enable(struct i915_perf_stream *stream) u32 ctx_id = stream->specific_ctx_id; bool periodic = stream->periodic; u32 period_exponent = stream->period_exponent; - u32 report_format = stream->oa_buffer.format; + u32 report_format = stream->oa_buffer.format->format; /* * Reset buf pointers so we don't forward reports from before now. @@ -2903,7 +2903,7 @@ static void gen7_oa_enable(struct i915_perf_stream *stream) static void gen8_oa_enable(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; - u32 report_format = stream->oa_buffer.format; + u32 report_format = stream->oa_buffer.format->format; /* * Reset buf pointers so we don't forward reports from before now. @@ -2929,7 +2929,7 @@ static void gen8_oa_enable(struct i915_perf_stream *stream) static void gen12_oa_enable(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; - u32 report_format = stream->oa_buffer.format; + u32 report_format = stream->oa_buffer.format->format; /* * If we don't want OA reports from the OA buffer, then we don't even @@ -3110,7 +3110,6 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, struct drm_i915_private *i915 = stream->perf->i915; struct i915_perf *perf = stream->perf; struct intel_gt *gt; - int format_size; int ret; if (!props->engine) { @@ -3166,20 +3165,15 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->sample_size = sizeof(struct drm_i915_perf_record_header); - format_size = perf->oa_formats[props->oa_format].size; + stream->oa_buffer.format = &perf->oa_formats[props->oa_format]; + if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0)) + return -EINVAL; stream->sample_flags = props->sample_flags; - stream->sample_size += format_size; - - stream->oa_buffer.format_size = format_size; - if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format_size == 0)) - return -EINVAL; + stream->sample_size += stream->oa_buffer.format->size; stream->hold_preemption = props->hold_preemption; - stream->oa_buffer.format = - perf->oa_formats[props->oa_format].format; - stream->periodic = props->oa_periodic; if (stream->periodic) stream->period_exponent = props->oa_period_exponent; diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index dc9bfd8086cf..e0c96b44eda8 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -250,11 +250,10 @@ struct i915_perf_stream { * @oa_buffer: State of the OA buffer. */ struct { + const struct i915_oa_format *format; struct i915_vma *vma; u8 *vaddr; u32 last_ctx_id; - int format; - int format_size; int size_exponent; /** From patchwork Tue Oct 25 20:17:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019856 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5062C38A2D for ; Tue, 25 Oct 2022 20:19:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A851D10E24B; Tue, 25 Oct 2022 20:19:46 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id B158710E1F8 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=LFdrl0lGOYrI5u+rfnfxRAGWg4ZEvNiB417xWKBCnoM=; b=hyvidkn/1tv3ROWiUkyq1bw+L0eL1HwlVgu6CLD9xHEXelCFOjt5NgQL THebs9uYpuZTKde+CtvbeYoknE3m+d6GqwPu/hZsdtJEh6OI5A8DDY1hG e/vJUvMRNr3Scb3kW6SPFGMqAJrnQk08RnJ78iFfDRWNTOmvhyghg4sE2 3Q6wGBT5tT4PgHdHAdeLTrZZDAevmFihXD7ELfPKRHDjvdJrOhBgeK886 fQoswaPY2Ij6TknPFfzqShkydc7ScM38S9sduYN1a67trLU9X3dtcWYqt TgVqmW1Ye7TFIc/iFojh6wl5Gj+hcwCjeSAc4qTGPnf+iBZMKDO5rhmYM A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847677" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847677" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699729" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699729" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:03 +0000 Message-Id: <20221025201708.84018-12-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 11/16] drm/i915/perf: Add Wa_1508761755:dg2 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Disable Clock gating in EU when gathering the events so that EU events are not lost. v2: Fix checkpatch issues v3: User MCR helpers to write to MC reg Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 1 + drivers/gpu/drm/i915/i915_perf.c | 24 ++++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 36d95b79022c..b101e31df61c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -1164,6 +1164,7 @@ #define GEN12_DISABLE_EARLY_READ REG_BIT(14) #define GEN12_ENABLE_LARGE_GRF_MODE REG_BIT(12) #define GEN12_PUSH_CONST_DEREF_HOLD_DIS REG_BIT(8) +#define GEN12_DISABLE_DOP_GATING REG_BIT(0) #define RT_CTRL MCR_REG(0xe530) #define DIS_NULL_QUERY REG_BIT(10) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 585079ae5f03..18619eb19769 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -204,6 +204,7 @@ #include "gt/intel_gpu_commands.h" #include "gt/intel_gt.h" #include "gt/intel_gt_clock_utils.h" +#include "gt/intel_gt_mcr.h" #include "gt/intel_gt_regs.h" #include "gt/intel_lrc.h" #include "gt/intel_lrc_reg.h" @@ -2775,6 +2776,18 @@ gen12_enable_metric_set(struct i915_perf_stream *stream, u32 sqcnt1; int ret; + /* + * Wa_1508761755:xehpsdv, dg2 + * EU NOA signals behave incorrectly if EU clock gating is enabled. + * Disable thread stall DOP gating and EU DOP gating. + */ + if (IS_XEHPSDV(i915) || IS_DG2(i915)) { + intel_gt_mcr_multicast_write(uncore->gt, GEN8_ROW_CHICKEN, + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)); + intel_uncore_write(uncore, GEN7_ROW_CHICKEN2, + _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING)); + } + intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG, /* Disable clk ratio reports, like previous Gens. */ _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS | @@ -2853,6 +2866,17 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream) struct drm_i915_private *i915 = stream->perf->i915; u32 sqcnt1; + /* + * Wa_1508761755:xehpsdv, dg2 + * Enable thread stall DOP gating and EU DOP gating. + */ + if (IS_XEHPSDV(i915) || IS_DG2(i915)) { + intel_gt_mcr_multicast_write(uncore->gt, GEN8_ROW_CHICKEN, + _MASKED_BIT_DISABLE(STALL_DOP_GATING_DISABLE)); + intel_uncore_write(uncore, GEN7_ROW_CHICKEN2, + _MASKED_BIT_DISABLE(GEN12_DISABLE_DOP_GATING)); + } + /* Reset all contexts' slices/subslices configurations. */ gen12_configure_all_contexts(stream, NULL, NULL); From patchwork Tue Oct 25 20:17:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019844 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A6B24FA373E for ; Tue, 25 Oct 2022 20:17:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5FFFB10E1B9; Tue, 25 Oct 2022 20:17:45 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id BAFBA10E195 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=sOj7czvw8Iyz7/T7VEZBcnCMddbpFGlsUmMOJFz4Lx8=; b=Z3195UXFqUNTfZB9QPR6TiMA/o0Q+4/WPZpWdTxtQtMMBgxvKa7ibUCi Hlu2lTDftKDZQF1tKZQM93W/JbHNOYsRLduj36e3HYUapPO2xrG0wYxNI IGuUMNz7EjdOSg6VjS7aZq25UYPMgRzM4XuQLLFbOUgx+cPwMPvGLY7u7 vKQzeAbESvc6YY43Z0V1a+DXU4Xe1kJ1yIwdhNgv7T2x0V2PWaxX5zyiy qcQxdsXQyRD9bMZxyac0945VrlLWeytjC4u6IyNaiAjan7akdiEkB0i+x aIm+ZSifqX2eYgF4Ni8Lmk9Lu+c74+pjHxVub0yubSw3fHQpgs9nzuM+6 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847678" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847678" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699731" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699731" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:04 +0000 Message-Id: <20221025201708.84018-13-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 12/16] drm/i915/perf: Apply Wa_18013179988 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" OA reports in the OA buffer contain an OA timestamp field that helps user calculate delta between 2 OA reports. The calculation relies on the CS timestamp frequency to convert the timestamp value to nanoseconds. The CS timestamp frequency is a function of the CTC_SHIFT value in RPM_CONFIG0. In DG2, OA unit assumes that the CTC_SHIFT is 3, instead of using the actual value from RPM_CONFIG0. At the user level, this results in an error in calculating delta between 2 OA reports since the OA timestamp is not shifted in the same manner as CS timestamp. Also the periodicity of the reports is different from what the user configured because of mismatch in the CS and OA frequencies. The issue also affects MI_REPORT_PERF_COUNT command. To resolve this, return actual OA timestamp frequency to the user in i915_getparam_ioctl, so that user can calculate the right OA exponent as well as interpret the reports correctly. MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893 v2: - Use REG_FIELD_GET (Ashutosh) - Update commit msg Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_getparam.c | 3 +++ drivers/gpu/drm/i915/i915_perf.c | 30 ++++++++++++++++++++++++++-- drivers/gpu/drm/i915/i915_perf.h | 2 ++ include/uapi/drm/i915_drm.h | 6 ++++++ 4 files changed, 39 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index 342c8ca6414e..3047e80e1163 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, case I915_PARAM_PERF_REVISION: value = i915_perf_ioctl_version(); break; + case I915_PARAM_OA_TIMESTAMP_FREQUENCY: + value = i915_perf_oa_timestamp_frequency(i915); + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 18619eb19769..8540eb6156e4 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3109,6 +3109,30 @@ get_sseu_config(struct intel_sseu *out_sseu, return i915_gem_user_to_context_sseu(engine->gt, drm_sseu, out_sseu); } +/* + * OA timestamp frequency = CS timestamp frequency in most platforms. On some + * platforms OA unit ignores the CTC_SHIFT and the 2 timestamps differ. In such + * cases, return the adjusted CS timestamp frequency to the user. + */ +u32 i915_perf_oa_timestamp_frequency(struct drm_i915_private *i915) +{ + /* Wa_18013179988:dg2 */ + if (IS_DG2(i915)) { + intel_wakeref_t wakeref; + u32 reg, shift; + + with_intel_runtime_pm(to_gt(i915)->uncore->rpm, wakeref) + reg = intel_uncore_read(to_gt(i915)->uncore, RPM_CONFIG0); + + shift = REG_FIELD_GET(GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK, + reg); + + return to_gt(i915)->clock_frequency << (3 - shift); + } + + return to_gt(i915)->clock_frequency; +} + /** * i915_oa_stream_init - validate combined props for OA stream and init * @stream: An i915 perf stream @@ -3830,8 +3854,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf, static u64 oa_exponent_to_ns(struct i915_perf *perf, int exponent) { - return intel_gt_clock_interval_to_ns(to_gt(perf->i915), - 2ULL << exponent); + u64 nom = (2ULL << exponent) * NSEC_PER_SEC; + u32 den = i915_perf_oa_timestamp_frequency(perf->i915); + + return div_u64(nom + den - 1, den); } static __always_inline bool diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h index 1d1329e5af3a..f96e09a4af04 100644 --- a/drivers/gpu/drm/i915/i915_perf.h +++ b/drivers/gpu/drm/i915/i915_perf.h @@ -57,4 +57,6 @@ static inline void i915_oa_config_put(struct i915_oa_config *oa_config) kref_put(&oa_config->ref, i915_oa_config_release); } +u32 i915_perf_oa_timestamp_frequency(struct drm_i915_private *i915); + #endif /* __I915_PERF_H__ */ diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 158b35fb28f3..c346b1923d11 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -765,6 +765,12 @@ typedef struct drm_i915_irq_wait { /* Query if the kernel supports the I915_USERPTR_PROBE flag. */ #define I915_PARAM_HAS_USERPTR_PROBE 56 +/* + * Frequency of the timestamps in OA reports. This used to be the same as the CS + * timestamp frequency, but differs on some platforms. + */ +#define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57 + /* Must be kept compact -- no holes and well documented */ /** From patchwork Tue Oct 25 20:17:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0FA02FA373E for ; Tue, 25 Oct 2022 20:17:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8488410E1E4; Tue, 25 Oct 2022 20:17:45 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id E3C5610E1E9 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=S6TU5IPCKqugKuXPsfWSoRIT1EXNM/yiDDISPsvQZ4M=; b=CLsuKkxvv6riwHZxIB0xiN7HEqTRwMozH2hxX/stecL3ZzzYFUBgCxan VydFXpIbLwLSdrzZZpikporXI4JlecY+sXmTb6oLRranaIz5eybL+jLsd C56nkRjV+k++NvlgMOOBreUXWTQ645CwJ499J9EF1S2HkRSpYtr6ho4kK 1CP0PdTEln4SRAbZpNvfKDMpR4BkGGKgLMf7c4EkiSoD9cNq/HdEnVGiE nMt/5obHVVwqHbDXjt60jMuzNJtSDX30Ts+a5+RC9nSjKzt3PF/L3M8aY 2W5Ir9hwijgUB5BWqTrfjjFi2sBrIiObvqIL33LKf0O48GBBrZ6xLRqt6 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="369847679" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="369847679" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699732" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699732" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:05 +0000 Message-Id: <20221025201708.84018-14-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 13/16] drm/i915/perf: Save/restore EU flex counters across reset X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If a drm client is killed, then hw contexts used by the client are reset immediately. This reset clears the EU flex counter configuration. If an OA use case is running in parallel, it would start seeing zeroed eu counter values following the reset even if the drm client is restarted. Save/restore the EU flex counter config so that the EU counters can be monitored continuously across resets. v2: - Save/restore eu flex config only for gen12, as for pre-gen12, these are saved and restored in the context image. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 34ef4f36e660..a419d60166c8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -392,6 +392,16 @@ static int guc_mmio_regset_init(struct temp_regset *regset, else ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); + if (GRAPHICS_VER(engine->i915) >= 12) { + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL0, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL1, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL2, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL3, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL4, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL5, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL6, false); + } + return ret ? -1 : 0; } From patchwork Tue Oct 25 20:17:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019852 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C1C08FA3740 for ; Tue, 25 Oct 2022 20:18:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0A51A10E21C; Tue, 25 Oct 2022 20:17:50 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 51E9610E178 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=zM7GuTbjWMlzRANswteKnJnfR56kggIRCBhe4j+YOFg=; b=BhPRo8mDLL/6xBv52lOn6kynTs6meAAEdrfA6voZAWQbjAO/LrTqzTRe +3Cyjuc1wEXykm+eUJUKyhqNo6Bb7dJQXLDbbe76Rf8V2Z7B/OJLKzu9u ZAbhYHsNigBW8OpJXiVxHb0N9XarukQbY4FBiHOwDIOUIM538DY3is861 b++j9emYtcH7YQlSB79Y1kS4ZR8i+eUjLJEXpUWuYzQnX1TxcZRNvXAdr 39JIoyg0r7KXkTFJt4AxI3tQqHs5xsCenMorgFkuBZcCdtGPg7Y8YyMAj bJTPDh7laSaUtk43c6s6kQOXUPA+EUWF7PP/S0aYvNqgu4yqLnuX1s62F w==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="288177916" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="288177916" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:10 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699733" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699733" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:09 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:06 +0000 Message-Id: <20221025201708.84018-15-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 14/16] drm/i915/guc: Support OA when Wa_16011777198 is enabled X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Vinay Belgaumkar On DG2, a w/a resets RCS/CCS before it goes into RC6. This breaks OA since OA does not expect engine resets during its use. Fix it by disabling RC6. v2: (Ashutosh) - Bring back slpc_unset_param helper - Update commit msg - Use with_intel_runtime_pm helper for set/unset v3: (Ashutosh) - Just use intel_uc_uses_guc_rc Signed-off-by: Vinay Belgaumkar Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- .../drm/i915/gt/uc/abi/guc_actions_slpc_abi.h | 9 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 66 +++++++++++++++++++ drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 27 ++++++++ 4 files changed, 104 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h index 4c840a2639dc..811add10c30d 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h @@ -128,6 +128,15 @@ enum slpc_media_ratio_mode { SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO = 2, }; +enum slpc_gucrc_mode { + SLPC_GUCRC_MODE_HW = 0, + SLPC_GUCRC_MODE_GUCRC_NO_RC6 = 1, + SLPC_GUCRC_MODE_GUCRC_STATIC_TIMEOUT = 2, + SLPC_GUCRC_MODE_GUCRC_DYNAMIC_HYSTERESIS = 3, + + SLPC_GUCRC_MODE_MAX, +}; + enum slpc_event_id { SLPC_EVENT_RESET = 0, SLPC_EVENT_SHUTDOWN = 1, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c index fdd895f73f9f..b3a4fb9e021f 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c @@ -137,6 +137,17 @@ static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value) return ret > 0 ? -EPROTO : ret; } +static int guc_action_slpc_unset_param(struct intel_guc *guc, u8 id) +{ + u32 request[] = { + GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST, + SLPC_EVENT(SLPC_EVENT_PARAMETER_UNSET, 1), + id, + }; + + return intel_guc_send(guc, request, ARRAY_SIZE(request)); +} + static bool slpc_is_running(struct intel_guc_slpc *slpc) { return slpc_get_state(slpc) == SLPC_GLOBAL_STATE_RUNNING; @@ -190,6 +201,15 @@ static int slpc_set_param(struct intel_guc_slpc *slpc, u8 id, u32 value) return ret; } +static int slpc_unset_param(struct intel_guc_slpc *slpc, u8 id) +{ + struct intel_guc *guc = slpc_to_guc(slpc); + + GEM_BUG_ON(id >= SLPC_MAX_PARAM); + + return guc_action_slpc_unset_param(guc, id); +} + static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) { struct drm_i915_private *i915 = slpc_to_i915(slpc); @@ -610,6 +630,52 @@ static void slpc_get_rp_values(struct intel_guc_slpc *slpc) slpc->boost_freq = slpc->rp0_freq; } +/** + * intel_guc_slpc_override_gucrc_mode() - override GUCRC mode + * @slpc: pointer to intel_guc_slpc. + * @mode: new value of the mode. + * + * This function will override the GUCRC mode. + * + * Return: 0 on success, non-zero error code on failure. + */ +int intel_guc_slpc_override_gucrc_mode(struct intel_guc_slpc *slpc, u32 mode) +{ + int ret; + struct drm_i915_private *i915 = slpc_to_i915(slpc); + intel_wakeref_t wakeref; + + if (mode >= SLPC_GUCRC_MODE_MAX) + return -EINVAL; + + with_intel_runtime_pm(&i915->runtime_pm, wakeref) { + ret = slpc_set_param(slpc, SLPC_PARAM_PWRGATE_RC_MODE, mode); + if (ret) + drm_err(&i915->drm, + "Override gucrc mode %d failed %d\n", + mode, ret); + } + + return ret; +} + +int intel_guc_slpc_unset_gucrc_mode(struct intel_guc_slpc *slpc) +{ + struct drm_i915_private *i915 = slpc_to_i915(slpc); + intel_wakeref_t wakeref; + int ret = 0; + + with_intel_runtime_pm(&i915->runtime_pm, wakeref) { + ret = slpc_unset_param(slpc, SLPC_PARAM_PWRGATE_RC_MODE); + if (ret) + drm_err(&i915->drm, + "Unsetting gucrc mode failed %d\n", + ret); + } + + return ret; +} + /* * intel_guc_slpc_enable() - Start SLPC * @slpc: pointer to intel_guc_slpc. diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h index 82a98f78f96c..ccf483730d9d 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h @@ -42,5 +42,7 @@ int intel_guc_slpc_set_media_ratio_mode(struct intel_guc_slpc *slpc, u32 val); void intel_guc_pm_intrmsk_enable(struct intel_gt *gt); void intel_guc_slpc_boost(struct intel_guc_slpc *slpc); void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc); +int intel_guc_slpc_unset_gucrc_mode(struct intel_guc_slpc *slpc); +int intel_guc_slpc_override_gucrc_mode(struct intel_guc_slpc *slpc, u32 mode); #endif diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 8540eb6156e4..bc0c486cf7d4 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -209,6 +209,7 @@ #include "gt/intel_lrc.h" #include "gt/intel_lrc_reg.h" #include "gt/intel_ring.h" +#include "gt/uc/intel_guc_slpc.h" #include "i915_drv.h" #include "i915_file_private.h" @@ -1582,6 +1583,15 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream) free_oa_buffer(stream); + /* + * Wa_16011777198:dg2: Unset the override of GUCRC mode to enable rc6. + */ + if (intel_uc_uses_guc_rc(>->uc) && + (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || + IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) + drm_WARN_ON(>->i915->drm, + intel_guc_slpc_unset_gucrc_mode(>->uc.guc.slpc)); + intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL); intel_engine_pm_put(stream->engine); @@ -3265,6 +3275,23 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, intel_engine_pm_get(stream->engine); intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL); + /* + * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes + * OA to lose the configuration state. Prevent this by overriding GUCRC + * mode. + */ + if (intel_uc_uses_guc_rc(>->uc) && + (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || + IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) { + ret = intel_guc_slpc_override_gucrc_mode(>->uc.guc.slpc, + SLPC_GUCRC_MODE_GUCRC_NO_RC6); + if (ret) { + drm_dbg(&stream->perf->i915->drm, + "Unable to override gucrc mode\n"); + goto err_config; + } + } + ret = alloc_oa_buffer(stream); if (ret) goto err_oa_buf_alloc; From patchwork Tue Oct 25 20:17:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9561FFA373E for ; Tue, 25 Oct 2022 20:19:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 671F1890BB; Tue, 25 Oct 2022 20:19:45 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6B6B910E1E6 for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=DJafpqtsWp+wpO3jNcUzovjADb2i4yEJI1/tstnegV0=; b=mbQllwaG5FdRm+s8PnRed98nAiSdGYr8AaoqY2Hi8apqCvDDyhDL7W3m spJtWIpMi7tAqo48wwUQz4F1ijy2dnCAlzyJB4uovstyzHiHUYZJlKTvn FVIqI2j+EMcEQU5WFA2e03FsjKhEDr/mO8s6+j+Mo+wu63trV6BDv1rUc zySLXVW3NSXrvHYGrijhnG/SS5ajQEfkXj4V7JOKLhFR7IauGq1tpgQEr 5y0cgHU3J1bc8NIWrU4JNqs+bW+qhhoQxphI6EhCpzaRCZIKm0Z1SXKvp X38I4h5c07f8QwOxiclkpr9ozUOSs9ejqW6iYboPnV3yUQSaX61C0bIy2 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="288177917" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="288177917" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:10 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699735" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699735" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:10 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:07 +0000 Message-Id: <20221025201708.84018-16-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 15/16] drm/i915/perf: complete programming whitelisting for XEHPSDV X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Lionel Landwerlin We have an additional register to select which slices contribute to OAG/OAG counter increments. Signed-off-by: Lionel Landwerlin Signed-off-by: Matt Roper Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/i915_perf.c | 13 +++++++++++++ drivers/gpu/drm/i915/intel_device_info.h | 1 + 4 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 438aebeea103..3bbcd726c2da 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -900,6 +900,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_OA_BPC_REPORTING(dev_priv) \ (INTEL_INFO(dev_priv)->has_oa_bpc_reporting) +#define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \ + (INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits) /* * Set this flag, when platform requires 64K GTT page sizes or larger for diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index cbced3f3db17..3f505ee15d66 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1024,6 +1024,7 @@ static const struct intel_device_info adl_p_info = { .has_logical_ring_elsq = 1, \ .has_mslice_steering = 1, \ .has_oa_bpc_reporting = 1, \ + .has_oa_slice_contrib_limits = 1, \ .has_rc6 = 1, \ .has_reset_engine = 1, \ .has_rps = 1, \ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index bc0c486cf7d4..176442d5e57e 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4261,6 +4261,11 @@ static const struct i915_range gen12_oa_b_counters[] = { {} }; +static const struct i915_range xehp_oa_b_counters[] = { + { .start = 0xdc48, .end = 0xdc48 }, /* OAA_ENABLE_REG */ + { .start = 0xdd00, .end = 0xdd48 }, /* OAG_LCE0_0 - OAA_LENABLE_REG */ +}; + static const struct i915_range gen7_oa_mux_regs[] = { { .start = 0x91b8, .end = 0x91cc }, /* OA_PERFCNT[1-2], OA_PERFMATRIX */ { .start = 0x9800, .end = 0x9888 }, /* MICRO_BP0_0 - NOA_WRITE */ @@ -4335,6 +4340,12 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr) return reg_in_range_table(addr, gen12_oa_b_counters); } +static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr) +{ + return reg_in_range_table(addr, xehp_oa_b_counters) || + reg_in_range_table(addr, gen12_oa_b_counters); +} + static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr) { return reg_in_range_table(addr, gen12_oa_mux_regs); @@ -4847,6 +4858,8 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; } else if (GRAPHICS_VER(i915) == 12) { perf->ops.is_valid_b_counter_reg = + HAS_OA_SLICE_CONTRIB_LIMITS(i915) ? + xehp_is_valid_b_counter_addr : gen12_is_valid_b_counter_addr; perf->ops.is_valid_mux_reg = gen12_is_valid_mux_addr; diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 42218c8d85f2..e292c1ee7c93 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -165,6 +165,7 @@ enum intel_ppgtt_type { func(has_media_ratio_mode); \ func(has_mslice_steering); \ func(has_oa_bpc_reporting); \ + func(has_oa_slice_contrib_limits); \ func(has_one_eu_per_fuse_bit); \ func(has_pxp); \ func(has_rc6); \ From patchwork Tue Oct 25 20:17:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13019851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0D6AEC38A2D for ; Tue, 25 Oct 2022 20:18:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 527D410E23E; Tue, 25 Oct 2022 20:17:50 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 882C610E1EE for ; Tue, 25 Oct 2022 20:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666729031; x=1698265031; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=YlZ+eGnldfao8Uwr+68ouknknGbnfLgIQe4oLSHh4Rk=; b=eBB03qpLs5Rk5Q2TGthczyVxnQABZM+7NmA0iHIfhT9ivtK2m8Jsvhfk WCnLJrmc4ymhixWMPpHmpmm8wY9u1SAkfPjMfHTC8cQ9aZzdd9VcmMHdB IPqIzFSDU9r441AUjgA4UGBAaL/nH3/J3TsB4WXDF1ZDjrmgHIl3QyKv+ 8n2t4yfC0fDum8rXPXMpKnq+UJvP/gnZVrZQ1U5eLVgWD8IlMUryBWncK eI2Y9LDgHYK4jgeZ0As1sbNTaxbBmN3wTWJYjtwCcfWWbqeZFf4L9+m5p qTqhJr5SuwnUakbixWf98ab4WxmMDTeSTuBcl4Gi/cD9kPko9+kLH/cgJ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="288177919" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="288177919" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:10 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10511"; a="609699736" X-IronPort-AV: E=Sophos;i="5.95,212,1661842800"; d="scan'208";a="609699736" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Oct 2022 13:17:10 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Tue, 25 Oct 2022 20:17:08 +0000 Message-Id: <20221025201708.84018-17-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> References: <20221025201708.84018-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v5 16/16] drm/i915/perf: Enable OA for DG2 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" OA was disabled for DG2 as support was missing. Enable it back now. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 176442d5e57e..3438cff13f38 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4798,12 +4798,6 @@ void i915_perf_init(struct drm_i915_private *i915) { struct i915_perf *perf = &i915->perf; - /* XXX const struct i915_perf_ops! */ - - /* i915_perf is not enabled for DG2 yet */ - if (IS_DG2(i915)) - return; - perf->oa_formats = oa_formats; if (IS_HASWELL(i915)) { perf->ops.is_valid_b_counter_reg = gen7_is_valid_b_counter_addr;