From patchwork Mon Oct 10 18:14:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002814 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BC46FC433F5 for ; Mon, 10 Oct 2022 18:14:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DA16610E6C1; Mon, 10 Oct 2022 18:14:45 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id B4B1E10E6BC for ; Mon, 10 Oct 2022 18:14:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425676; x=1696961676; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=aKUQRDaNC5maBOvn+F4hd9NgmZZSg+GG1W5y7xIMCss=; b=WSYYRPf+HoX49pOSwIMCfNsb0EQYFLGsOToSzX1LFEq7AycXICxLQVsU ur+DVxo3pGpbybY3vOQVMr5wmE8nZnc5x4x4kuSZ5E7JPPHX/isawluUN WlqJPS4QY9kqf3t24qO5cqg64Kw0WjmRB5WqT6wIsjul9De00MywmQZXD UUIMjuCsO6X/bSlf9A6f/cvsMQKpC1jGsSHkpg0M00Fy3Ja0J1RGCPryG 5EkljvyO7lehsjamAdsdDXCYljENsNx4kkzIqoNP0tI2nV3nUuspznKYr d+rjywFubs+OqyFey7q7KT96gft+ntPBYhY6ONLp2M2VM+0+OZbJ2yhoh A==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="301909902" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="301909902" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820256" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820256" Received: from 
dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:35 -0700
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin, Ashutosh Dixit
Date: Mon, 10 Oct 2022 18:14:19 +0000
Message-Id: <20221010181434.513477-2-umesh.nerlige.ramappa@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH v3 01/16] drm/i915/perf: Fix OA filtering logic for GuC mode
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel graphics driver community testing & development
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"

With GuC mode of submission, GuC is in control of defining the context id field that is part of the OA reports. To filter reports, UMD and KMD must know what sw context id was chosen by GuC. There is no interface between KMD and GuC to determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA reports for the specific context.

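As a rough illustration of the filtering described above, the sketch below builds the mask for the sw context id field as it appears in the upper dword of EXECLIST_STATUS and compares an OA report's context id against it. The shift and width match the GEN12_GUC_SW_CTX_ID_* values added by this patch; the helper names, and the idea of passing the report's context id dword in directly, are assumptions made for illustration and are not the driver's interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field position from the patch: bits 39..54 of the 64-bit EXECLIST_STATUS. */
#define GUC_SW_CTX_ID_SHIFT 39
#define GUC_SW_CTX_ID_WIDTH 16

/* Mask for the sw context id within the upper dword (hence the "- 32"). */
static uint32_t guc_ctx_id_mask(void)
{
	return ((1U << GUC_SW_CTX_ID_WIDTH) - 1) << (GUC_SW_CTX_ID_SHIFT - 32);
}

/*
 * Hypothetical helper: true when the context id field of an OA report
 * matches the id read back from the upper dword of EXECLIST_STATUS.
 * Bits outside the mask are ignored on both sides.
 */
static bool report_matches_ctx(uint32_t report_ctx_id, uint32_t status_hi)
{
	uint32_t mask = guc_ctx_id_mask();

	return (report_ctx_id & mask) == (status_hi & mask);
}
```

With these values the mask covers bits 7..22 of the upper dword, so low-order bits that differ between a report and the stored id do not affect the comparison.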
v2: Explain guc id stealing w.r.t OA use case Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_lrc.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 144 ++++++++++++++++++++++++---- 2 files changed, 127 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h index a390f0813c8b..7111bae759f3 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h @@ -110,6 +110,8 @@ enum { #define XEHP_SW_CTX_ID_WIDTH 16 #define XEHP_SW_COUNTER_SHIFT 58 #define XEHP_SW_COUNTER_WIDTH 6 +#define GEN12_GUC_SW_CTX_ID_SHIFT 39 +#define GEN12_GUC_SW_CTX_ID_WIDTH 16 static inline void lrc_runtime_start(struct intel_context *ce) { diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 0defbb43ceea..315662329be3 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1233,6 +1233,128 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream) return stream->pinned_ctx; } +static int +__store_reg_to_mem(struct i915_request *rq, i915_reg_t reg, u32 ggtt_offset) +{ + u32 *cs, cmd; + + cmd = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT; + if (GRAPHICS_VER(rq->engine->i915) >= 8) + cmd++; + + cs = intel_ring_begin(rq, 4); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + *cs++ = cmd; + *cs++ = i915_mmio_reg_offset(reg); + *cs++ = ggtt_offset; + *cs++ = 0; + + intel_ring_advance(rq, cs); + + return 0; +} + +static int +__read_reg(struct intel_context *ce, i915_reg_t reg, u32 ggtt_offset) +{ + struct i915_request *rq; + int err; + + rq = i915_request_create(ce); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + i915_request_get(rq); + + err = __store_reg_to_mem(rq, reg, ggtt_offset); + + i915_request_add(rq); + if (!err && i915_request_wait(rq, 0, HZ / 2) < 0) + err = -ETIME; + + i915_request_put(rq); + + return err; +} + +static int +gen12_guc_sw_ctx_id(struct intel_context *ce, u32 
*ctx_id) +{ + struct i915_vma *scratch; + u32 *val; + int err; + + scratch = __vm_create_scratch_for_read_pinned(&ce->engine->gt->ggtt->vm, 4); + if (IS_ERR(scratch)) + return PTR_ERR(scratch); + + err = i915_vma_sync(scratch); + if (err) + goto err_scratch; + + err = __read_reg(ce, RING_EXECLIST_STATUS_HI(ce->engine->mmio_base), + i915_ggtt_offset(scratch)); + if (err) + goto err_scratch; + + val = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB); + if (IS_ERR(val)) { + err = PTR_ERR(val); + goto err_scratch; + } + + *ctx_id = *val; + i915_gem_object_unpin_map(scratch->obj); + +err_scratch: + i915_vma_unpin_and_release(&scratch, 0); + return err; +} + +/* + * For execlist mode of submission, pick an unused context id + * 0 - (NUM_CONTEXT_TAG -1) are used by other contexts + * XXX_MAX_CONTEXT_HW_ID is used by idle context + * + * For GuC mode of submission read context id from the upper dword of the + * EXECLIST_STATUS register. Note that we read this value only once and expect + * that the value stays fixed for the entire OA use case. There are cases where + * GuC KMD implementation may deregister a context to reuse its context id, but + * we prevent that from happening to the OA context by pinning it. 
+ */ +static int gen12_get_render_context_id(struct i915_perf_stream *stream) +{ + u32 ctx_id, mask; + int ret; + + if (intel_engine_uses_guc(stream->engine)) { + ret = gen12_guc_sw_ctx_id(stream->pinned_ctx, &ctx_id); + if (ret) + return ret; + + mask = ((1U << GEN12_GUC_SW_CTX_ID_WIDTH) - 1) << + (GEN12_GUC_SW_CTX_ID_SHIFT - 32); + } else if (GRAPHICS_VER_FULL(stream->engine->i915) >= IP_VER(12, 50)) { + ctx_id = (XEHP_MAX_CONTEXT_HW_ID - 1) << + (XEHP_SW_CTX_ID_SHIFT - 32); + + mask = ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) << + (XEHP_SW_CTX_ID_SHIFT - 32); + } else { + ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) << + (GEN11_SW_CTX_ID_SHIFT - 32); + + mask = ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << + (GEN11_SW_CTX_ID_SHIFT - 32); + } + stream->specific_ctx_id = ctx_id & mask; + stream->specific_ctx_id_mask = mask; + + return 0; +} + /** * oa_get_render_ctx_id - determine and hold ctx hw id * @stream: An i915-perf stream opened for OA metrics @@ -1246,6 +1368,7 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream) static int oa_get_render_ctx_id(struct i915_perf_stream *stream) { struct intel_context *ce; + int ret = 0; ce = oa_pin_context(stream); if (IS_ERR(ce)) @@ -1292,24 +1415,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) case 11: case 12: - if (GRAPHICS_VER_FULL(ce->engine->i915) >= IP_VER(12, 50)) { - stream->specific_ctx_id_mask = - ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) << - (XEHP_SW_CTX_ID_SHIFT - 32); - stream->specific_ctx_id = - (XEHP_MAX_CONTEXT_HW_ID - 1) << - (XEHP_SW_CTX_ID_SHIFT - 32); - } else { - stream->specific_ctx_id_mask = - ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); - /* - * Pick an unused context id - * 0 - BITS_PER_LONG are used by other contexts - * GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context - */ - stream->specific_ctx_id = - (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); - } + ret = gen12_get_render_context_id(stream); break; default: @@ -1323,7 
+1429,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) stream->specific_ctx_id, stream->specific_ctx_id_mask); - return 0; + return ret; } /**

From patchwork Mon Oct 10 18:14:20 2022
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 13002818
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin, Ashutosh Dixit
Date: Mon, 10 Oct 2022 18:14:20 +0000
Message-Id: <20221010181434.513477-3-umesh.nerlige.ramappa@intel.com>
In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
Subject: [Intel-gfx] [PATCH v3 02/16] drm/i915/perf: Add 32-bit OAG and OAR formats for DG2

Add new OA formats for DG2.

v2: - Update commit title (Ashutosh) - Coding style fixes (Lionel) - 64 bit OA formats need UMD changes in GPUvis, drop for now and send in a separate series with UMD changes v3: - Update commit message to drop 64 bit related description Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin #1 --- drivers/gpu/drm/i915/i915_perf.c | 7 +++++++ include/uapi/drm/i915_drm.h | 4 ++++ 2 files changed, 11 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 315662329be3..41e9f620ee31 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -320,6 +320,8 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = { [I915_OA_FORMAT_A12] = { 0, 64 }, [I915_OA_FORMAT_A12_B8_C8] = { 2, 128 }, [I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 }, + [I915_OAR_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 }, + [I915_OA_FORMAT_A24u40_A14u32_B8_C8] = { 5, 256 }, }; #define SAMPLE_OA_REPORT (1<<0) @@ -4517,6 +4519,11 @@ static void oa_init_supported_formats(struct i915_perf *perf) oa_format_add(perf, I915_OA_FORMAT_C4_B8); break; + case INTEL_DG2: + oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8); + oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8); + break; + default: MISSING_CASE(platform); } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 520ad2691a99..8b59590e06d4 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -2650,6 +2650,10 @@ enum drm_i915_oa_format { I915_OA_FORMAT_A12_B8_C8, I915_OA_FORMAT_A32u40_A4u32_B8_C8, + /* DG2 */ + I915_OAR_FORMAT_A32u40_A4u32_B8_C8, + I915_OA_FORMAT_A24u40_A14u32_B8_C8, + I915_OA_FORMAT_MAX /* non-ABI */ };

From patchwork Mon Oct 10 18:14:21 2022
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 13002813
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin, Ashutosh Dixit
Date: Mon, 10 Oct 2022 18:14:21 +0000
Message-Id: <20221010181434.513477-4-umesh.nerlige.ramappa@intel.com>
In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
Subject: [Intel-gfx] [PATCH v3 03/16] drm/i915/perf: Fix noa wait predication for DG2

Predication for batch buffer commands changed in XEHPSDV. MI_BATCH_BUFFER_START predicates based on the MI_SET_PREDICATE_RESULT register. The MI_SET_PREDICATE_RESULT register can only be modified with the MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE command sets MI_SET_PREDICATE_RESULT based on bit 0 of MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait.

Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_engine_regs.h | 1 + drivers/gpu/drm/i915/i915_perf.c | 24 +++++++++++++++++---- 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_regs.h b/drivers/gpu/drm/i915/gt/intel_engine_regs.h index fe1a0d5fd4b1..ee3efd06ee54 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_regs.h @@ -201,6 +201,7 @@ #define RING_CONTEXT_STATUS_PTR(base) _MMIO((base) + 0x3a0) #define RING_CTX_TIMESTAMP(base) _MMIO((base) + 0x3a8) /* gen8+ */ #define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8) +#define MI_PREDICATE_RESULT_2_ENGINE(base) _MMIO((base) + 0x3bc) #define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base) + 0x4D0) + (i) * 4) #define RING_FORCE_TO_NONPRIV_DENY REG_BIT(30) #define RING_FORCE_TO_NONPRIV_ADDRESS_MASK REG_GENMASK(25, 2) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 41e9f620ee31..cd57b5836386 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -286,6 +286,7 @@ static u32 i915_perf_stream_paranoid = true; #define OAREPORT_REASON_CTX_SWITCH (1<<3) #define OAREPORT_REASON_CLK_RATIO (1<<5) +#define HAS_MI_SET_PREDICATE(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) /* For sysctl proc_dointvec_minmax of i915_oa_max_sample_rate * @@ -1762,6 +1763,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) DELTA_TARGET, N_CS_GPR }; + i915_reg_t mi_predicate_result = HAS_MI_SET_PREDICATE(i915) ? 
+ MI_PREDICATE_RESULT_2_ENGINE(base) : + MI_PREDICATE_RESULT_1(RENDER_RING_BASE); bo = i915_gem_object_create_internal(i915, 4096); if (IS_ERR(bo)) { @@ -1799,7 +1803,7 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) stream, cs, true /* save */, CS_GPR(i), INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR + 8 * i, 2); cs = save_restore_register( - stream, cs, true /* save */, MI_PREDICATE_RESULT_1(RENDER_RING_BASE), + stream, cs, true /* save */, mi_predicate_result, INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1, 1); /* First timestamp snapshot location. */ @@ -1853,7 +1857,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ *cs++ = MI_LOAD_REGISTER_REG | (3 - 2); *cs++ = i915_mmio_reg_offset(CS_GPR(JUMP_PREDICATE)); - *cs++ = i915_mmio_reg_offset(MI_PREDICATE_RESULT_1(RENDER_RING_BASE)); + *cs++ = i915_mmio_reg_offset(mi_predicate_result); + + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE | 1; /* Restart from the beginning if we had timestamps roll over. */ *cs++ = (GRAPHICS_VER(i915) < 8 ? @@ -1863,6 +1870,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) *cs++ = i915_ggtt_offset(vma) + (ts0 - batch) * 4; *cs++ = 0; + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE; + /* * Now add the diff between to previous timestamps and add it to : * (((1 * << 64) - 1) - delay_ns) @@ -1890,7 +1900,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ *cs++ = MI_LOAD_REGISTER_REG | (3 - 2); *cs++ = i915_mmio_reg_offset(CS_GPR(JUMP_PREDICATE)); - *cs++ = i915_mmio_reg_offset(MI_PREDICATE_RESULT_1(RENDER_RING_BASE)); + *cs++ = i915_mmio_reg_offset(mi_predicate_result); + + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE | 1; /* Predicate the jump. */ *cs++ = (GRAPHICS_VER(i915) < 8 ? 
@@ -1900,13 +1913,16 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) *cs++ = i915_ggtt_offset(vma) + (jump - batch) * 4; *cs++ = 0; + if (HAS_MI_SET_PREDICATE(i915)) + *cs++ = MI_SET_PREDICATE; + /* Restore registers. */ for (i = 0; i < N_CS_GPR; i++) cs = save_restore_register( stream, cs, false /* restore */, CS_GPR(i), INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR + 8 * i, 2); cs = save_restore_register( - stream, cs, false /* restore */, MI_PREDICATE_RESULT_1(RENDER_RING_BASE), + stream, cs, false /* restore */, mi_predicate_result, INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1, 1); /* And return to the ring. */

From patchwork Mon Oct 10 18:14:22 2022
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 13002820
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin, Ashutosh Dixit
Date: Mon, 10 Oct 2022 18:14:22 +0000
Message-Id: <20221010181434.513477-5-umesh.nerlige.ramappa@intel.com>
In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
Subject: [Intel-gfx] [PATCH v3 04/16] drm/i915/perf: Determine gen12 oa ctx offset at runtime

Some SKUs of the same gen12 platform may have different oactxctrl offsets. For gen12, determine oactxctrl offsets at runtime.

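The runtime lookup can be pictured as a scan over the dwords of the context image: skip ahead until a load-register-immediate (LRI) command is found, then walk its (register, value) pairs looking for the wanted register offset. The sketch below is a simplified, standalone version of that idea. The opcode and length decoding follow the MI_OPCODE and MI_LRI_LEN macros added by this patch; the function name, the flat array standing in for the context image, and the OFFSET_NOT_FOUND constant are assumptions for illustration only.

```c
#include <stdint.h>

/* Decoding as in the patch: opcode in bits 23..28, length in the low byte. */
#define MI_OPCODE(x)     (((x) >> 23) & 0x3f)
#define MI_LRI_OPCODE    0x22u
#define MI_LRI_LEN(x)    (((x) & 0xff) + 1)
#define OFFSET_NOT_FOUND UINT32_MAX

/*
 * Hypothetical helper: return the dword index at which `reg` appears as
 * the register half of an LRI (register, value) pair, or OFFSET_NOT_FOUND.
 * `ndw` is the number of dwords in `state`; the scan never reads past it.
 */
static uint32_t find_reg_in_image(const uint32_t *state, uint32_t ndw,
				  uint32_t reg)
{
	uint32_t i = 0;

	while (i < ndw) {
		uint32_t end;

		if (MI_OPCODE(state[i]) != MI_LRI_OPCODE) {
			i++;
			continue;
		}

		/* Clamp the payload end to the image size. */
		end = i + MI_LRI_LEN(state[i]);
		if (end > ndw)
			end = ndw;

		/* Step over the header, then visit register dwords only. */
		for (i++; i < end; i += 2)
			if (state[i] == reg)
				return i;
	}

	return OFFSET_NOT_FOUND;
}
```

Only the register halves of each pair are compared, so a value dword that happens to equal the register offset does not produce a false match.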
v2: (Lionel) - Move MI definitions to intel_gpu_commands.h - Ensure __find_reg_in_lri does not read past context image size v3: (Ashutosh) - Drop unnecessary use of double underscores - fix find_reg_in_lri - Return error if oa context offset is U32_MAX - Error out if oa_ctx_ctrl_offset does not find offset Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 4 + drivers/gpu/drm/i915/i915_perf.c | 154 +++++++++++++++---- drivers/gpu/drm/i915/i915_perf_oa_regs.h | 2 +- 3 files changed, 129 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index d4e9702d3c8e..f50ea92910d9 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -187,6 +187,10 @@ #define MI_BATCH_RESOURCE_STREAMER REG_BIT(10) #define MI_BATCH_PREDICATE REG_BIT(15) /* HSW+ on RCS only*/ +#define MI_OPCODE(x) (((x) >> 23) & 0x3f) +#define IS_MI_LRI_CMD(x) (MI_OPCODE(x) == MI_OPCODE(MI_INSTR(0x22, 0))) +#define MI_LRI_LEN(x) (((x) & 0xff) + 1) + /* * 3D instructions used by the kernel */ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index cd57b5836386..b292aa39633e 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1358,6 +1358,68 @@ static int gen12_get_render_context_id(struct i915_perf_stream *stream) return 0; } +#define valid_oactxctrl_offset(x) ((x) && (x) != U32_MAX) +static bool oa_find_reg_in_lri(u32 *state, u32 reg, u32 *offset, u32 end) +{ + u32 idx = *offset; + u32 len = min(MI_LRI_LEN(state[idx]) + idx, end); + bool found = false; + + idx++; + for (; idx < len; idx += 2) { + if (state[idx] == reg) { + found = true; + break; + } + } + + *offset = idx; + return found; +} + +static u32 oa_context_image_offset(struct intel_context *ce, u32 reg) +{ + u32 offset, len = (ce->engine->context_size - PAGE_SIZE) / 4; + u32 *state = 
ce->lrc_reg_state; + + for (offset = 0; offset < len; ) { + if (IS_MI_LRI_CMD(state[offset])) { + if (oa_find_reg_in_lri(state, reg, &offset, len)) + break; + } else { + offset++; + } + } + + return offset < len ? offset : U32_MAX; +} + +static int set_oa_ctx_ctrl_offset(struct intel_context *ce) +{ + i915_reg_t reg = GEN12_OACTXCONTROL(ce->engine->mmio_base); + struct i915_perf *perf = &ce->engine->i915->perf; + u32 offset = perf->ctx_oactxctrl_offset; + + /* Do this only once. Failure is stored as offset of U32_MAX */ + if (offset) + goto exit; + + offset = oa_context_image_offset(ce, i915_mmio_reg_offset(reg)); + perf->ctx_oactxctrl_offset = offset; + + drm_dbg(&ce->engine->i915->drm, + "%s oa ctx control at 0x%08x dword offset\n", + ce->engine->name, offset); + +exit: + return valid_oactxctrl_offset(offset) ? 0 : -ENODEV; +} + +static bool engine_supports_mi_query(struct intel_engine_cs *engine) +{ + return engine->class == RENDER_CLASS; +} + /** * oa_get_render_ctx_id - determine and hold ctx hw id * @stream: An i915-perf stream opened for OA metrics @@ -1377,6 +1439,21 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) if (IS_ERR(ce)) return PTR_ERR(ce); + if (engine_supports_mi_query(stream->engine)) { + /* + * We are enabling perf query here. If we don't find the context + * offset here, just return an error. + */ + ret = set_oa_ctx_ctrl_offset(ce); + if (ret) { + intel_context_unpin(ce); + drm_err(&stream->perf->i915->drm, + "Enabling perf query failed for %s\n", + stream->engine->name); + return ret; + } + } + switch (GRAPHICS_VER(ce->engine->i915)) { case 7: { /* @@ -2408,10 +2485,11 @@ static int gen12_configure_oar_context(struct i915_perf_stream *stream, int err; struct intel_context *ce = stream->pinned_ctx; u32 format = stream->oa_buffer.format; + u32 offset = stream->perf->ctx_oactxctrl_offset; struct flex regs_context[] = { { GEN8_OACTXCONTROL, - stream->perf->ctx_oactxctrl_offset + 1, + offset + 1, active ? 
GEN8_OA_COUNTER_RESUME : 0, }, }; @@ -2436,15 +2514,18 @@ static int gen12_configure_oar_context(struct i915_perf_stream *stream, }, }; - /* Modify the context image of pinned context with regs_context*/ - err = intel_context_lock_pinned(ce); - if (err) - return err; + /* Modify the context image of pinned context with regs_context */ + if (valid_oactxctrl_offset(offset)) { + err = intel_context_lock_pinned(ce); + if (err) + return err; - err = gen8_modify_context(ce, regs_context, ARRAY_SIZE(regs_context)); - intel_context_unlock_pinned(ce); - if (err) - return err; + err = gen8_modify_context(ce, regs_context, + ARRAY_SIZE(regs_context)); + intel_context_unlock_pinned(ce); + if (err) + return err; + } /* Apply regs_lri using LRI with pinned context */ return gen8_modify_self(ce, regs_lri, ARRAY_SIZE(regs_lri), active); @@ -2566,6 +2647,7 @@ lrc_configure_all_contexts(struct i915_perf_stream *stream, const struct i915_oa_config *oa_config, struct i915_active *active) { + u32 ctx_oactxctrl = stream->perf->ctx_oactxctrl_offset; /* The MMIO offsets for Flex EU registers aren't contiguous */ const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset; #define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1) @@ -2576,7 +2658,7 @@ lrc_configure_all_contexts(struct i915_perf_stream *stream, }, { GEN8_OACTXCONTROL, - stream->perf->ctx_oactxctrl_offset + 1, + ctx_oactxctrl + 1, }, { EU_PERF_CNTL0, ctx_flexeuN(0) }, { EU_PERF_CNTL1, ctx_flexeuN(1) }, @@ -4545,6 +4627,37 @@ static void oa_init_supported_formats(struct i915_perf *perf) } } +static void i915_perf_init_info(struct drm_i915_private *i915) +{ + struct i915_perf *perf = &i915->perf; + + switch (GRAPHICS_VER(i915)) { + case 8: + perf->ctx_oactxctrl_offset = 0x120; + perf->ctx_flexeu0_offset = 0x2ce; + perf->gen8_valid_ctx_bit = BIT(25); + break; + case 9: + perf->ctx_oactxctrl_offset = 0x128; + perf->ctx_flexeu0_offset = 0x3de; + perf->gen8_valid_ctx_bit = BIT(16); + break; + case 11: + perf->ctx_oactxctrl_offset = 0x124; + 
perf->ctx_flexeu0_offset = 0x78e; + perf->gen8_valid_ctx_bit = BIT(16); + break; + case 12: + /* + * Calculate offset at runtime in oa_pin_context for gen12 and + * cache the value in perf->ctx_oactxctrl_offset. + */ + break; + default: + MISSING_CASE(GRAPHICS_VER(i915)); + } +} + /** * i915_perf_init - initialize i915-perf state on module bind * @i915: i915 device instance @@ -4583,6 +4696,7 @@ void i915_perf_init(struct drm_i915_private *i915) * execlist mode by default. */ perf->ops.read = gen8_oa_read; + i915_perf_init_info(i915); if (IS_GRAPHICS_VER(i915, 8, 9)) { perf->ops.is_valid_b_counter_reg = @@ -4602,18 +4716,6 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen8_enable_metric_set; perf->ops.disable_metric_set = gen8_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; - - if (GRAPHICS_VER(i915) == 8) { - perf->ctx_oactxctrl_offset = 0x120; - perf->ctx_flexeu0_offset = 0x2ce; - - perf->gen8_valid_ctx_bit = BIT(25); - } else { - perf->ctx_oactxctrl_offset = 0x128; - perf->ctx_flexeu0_offset = 0x3de; - - perf->gen8_valid_ctx_bit = BIT(16); - } } else if (GRAPHICS_VER(i915) == 11) { perf->ops.is_valid_b_counter_reg = gen7_is_valid_b_counter_addr; @@ -4627,11 +4729,6 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen8_enable_metric_set; perf->ops.disable_metric_set = gen11_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; - - perf->ctx_oactxctrl_offset = 0x124; - perf->ctx_flexeu0_offset = 0x78e; - - perf->gen8_valid_ctx_bit = BIT(16); } else if (GRAPHICS_VER(i915) == 12) { perf->ops.is_valid_b_counter_reg = gen12_is_valid_b_counter_addr; @@ -4645,9 +4742,6 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen12_enable_metric_set; perf->ops.disable_metric_set = gen12_disable_metric_set; perf->ops.oa_hw_tail_read = gen12_oa_hw_tail_read; - - perf->ctx_flexeu0_offset = 0; - perf->ctx_oactxctrl_offset = 0x144; } } diff 
--git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h index f31c9f13a9fc..0ef3562ff4aa 100644 --- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h +++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h @@ -97,7 +97,7 @@ #define GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT 1 #define GEN12_OAR_OACONTROL_COUNTER_ENABLE (1 << 0) -#define GEN12_OACTXCONTROL _MMIO(0x2360) +#define GEN12_OACTXCONTROL(base) _MMIO((base) + 0x360) #define GEN12_OAR_OASTATUS _MMIO(0x2968) /* Gen12 OAG unit */

From patchwork Mon Oct 10 18:14:23 2022
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 13002822
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin, Ashutosh Dixit
Date: Mon, 10 Oct 2022 18:14:23 +0000
Message-Id: <20221010181434.513477-6-umesh.nerlige.ramappa@intel.com>
In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com>
Subject: [Intel-gfx] [PATCH v3 05/16] drm/i915/perf: Enable bytes per clock reporting in OA

XEHPSDV and DG2 provide a way to configure bytes per clock vs commands per clock reporting. Enable bytes per clock setting on enabling OA.

Bspec: 51762 Bspec: 52201 v2: - Fix commit msg (Ashutosh) - Fix checkpatch issues v3: - s/commands/bytes/ in code comment and commit msg Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/i915_perf.c | 20 ++++++++++++++++++++ drivers/gpu/drm/i915/i915_perf_oa_regs.h | 4 ++++ drivers/gpu/drm/i915/intel_device_info.h | 1 + 5 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9f9372931fd2..ccd54ff54002 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -902,6 +902,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm) #define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc) +#define HAS_OA_BPC_REPORTING(dev_priv) \ + (INTEL_INFO(dev_priv)->has_oa_bpc_reporting) + /* * Set this flag, when platform requires 64K GTT page sizes or larger for * device local memory access. 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 77e7df21f539..6b25e4cb6221 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1022,6 +1022,7 @@ static const struct intel_device_info adl_p_info = { .has_logical_ring_contexts = 1, \ .has_logical_ring_elsq = 1, \ .has_mslice_steering = 1, \ + .has_oa_bpc_reporting = 1, \ .has_rc6 = 1, \ .has_reset_engine = 1, \ .has_rps = 1, \ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index b292aa39633e..a22f18554ab9 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -2746,10 +2746,12 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream, struct i915_active *active) { + struct drm_i915_private *i915 = stream->perf->i915; struct intel_uncore *uncore = stream->uncore; struct i915_oa_config *oa_config = stream->oa_config; bool periodic = stream->periodic; u32 period_exponent = stream->period_exponent; + u32 sqcnt1; int ret; intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG, @@ -2768,6 +2770,16 @@ gen12_enable_metric_set(struct i915_perf_stream *stream, (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT)) : 0); + /* + * Initialize Super Queue Internal Cnt Register + * Set PMON Enable in order to collect valid metrics. + * Enable bytes per clock reporting in OA for XEHPSDV onward. + */ + sqcnt1 = GEN12_SQCNT1_PMON_ENABLE | + (HAS_OA_BPC_REPORTING(i915) ? 
GEN12_SQCNT1_OABPC : 0); + + intel_uncore_rmw(uncore, GEN12_SQCNT1, 0, sqcnt1); + /* * Update all contexts prior writing the mux configurations as we need * to make sure all slices/subslices are ON before writing to NOA @@ -2817,6 +2829,8 @@ static void gen11_disable_metric_set(struct i915_perf_stream *stream) static void gen12_disable_metric_set(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; + struct drm_i915_private *i915 = stream->perf->i915; + u32 sqcnt1; /* Reset all contexts' slices/subslices configurations. */ gen12_configure_all_contexts(stream, NULL, NULL); @@ -2827,6 +2841,12 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream) /* Make sure we disable noa to save power. */ intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0); + + sqcnt1 = GEN12_SQCNT1_PMON_ENABLE | + (HAS_OA_BPC_REPORTING(i915) ? GEN12_SQCNT1_OABPC : 0); + + /* Reset PMON Enable to save power. */ + intel_uncore_rmw(uncore, GEN12_SQCNT1, sqcnt1, 0); } static void gen7_oa_enable(struct i915_perf_stream *stream) diff --git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h index 0ef3562ff4aa..381d94101610 100644 --- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h +++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h @@ -134,4 +134,8 @@ #define GDT_CHICKEN_BITS _MMIO(0x9840) #define GT_NOA_ENABLE 0x00000080 +#define GEN12_SQCNT1 _MMIO(0x8718) +#define GEN12_SQCNT1_PMON_ENABLE REG_BIT(30) +#define GEN12_SQCNT1_OABPC REG_BIT(29) + #endif /* __INTEL_PERF_OA_REGS__ */ diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 09b18910d3ab..1f7a842cd408 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -164,6 +164,7 @@ enum intel_ppgtt_type { func(has_logical_ring_elsq); \ func(has_media_ratio_mode); \ func(has_mslice_steering); \ + func(has_oa_bpc_reporting); \ func(has_one_eu_per_fuse_bit); \ func(has_pxp); \ 
func(has_rc6); \ From patchwork Mon Oct 10 18:14:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 44AAEC433FE for ; Mon, 10 Oct 2022 18:15:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B529D10E6D6; Mon, 10 Oct 2022 18:15:05 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8FFE110E6B7 for ; Mon, 10 Oct 2022 18:14:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425677; x=1696961677; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=Ju3WKNqd9BntqpJAXz08NLYdeTk1TRcREUfk2pAzLH0=; b=Ra6UDSol/khHJk+l+L7JrxAWEymDcVErXVeXtgGeafI9gLY0HbOaRqHf mtWHAYmA5Z//1LiuwbplhDZzkmHh8C9EG8QjQRtkZX9eNFqjCc4oeKBYS UqnHLLy0S5CXaZHjPg2PhoaBAJi1r4SKrUsB54kkCKQdxTpw5iWCB/7LC G2PSmZx5bejsMJOnOtK4E95ke5+0dTfZzEjJdk79EFSdfBGe4jHsFvvQ6 DAqoeSkAolfBA/xjSTYjdVsO+63wXw+NaIN/5AGIlxt3+aVUExHlcQlsi rzP65y8JiGQjrYIwWV5UgQFbz0uYGqqVHoN2e+qIAVOmIXjbF7WpsEFmp Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439653" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439653" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820275" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; 
d="scan'208";a="603820275" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:35 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:24 +0000 Message-Id: <20221010181434.513477-7-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 06/16] drm/i915/perf: Simply use stream->ctx X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Earlier code used exclusive_stream to check for user passed context. Simplify this by accessing stream->ctx. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index a22f18554ab9..b60846f5d33c 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -777,7 +777,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * switches since it's not-uncommon for periodic samples to * identify a switch before any 'context switch' report. 
*/ - if (!stream->perf->exclusive_stream->ctx || + if (!stream->ctx || stream->specific_ctx_id == ctx_id || stream->oa_buffer.last_ctx_id == stream->specific_ctx_id || reason & OAREPORT_REASON_CTX_SWITCH) { @@ -786,7 +786,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * While filtering for a single context we avoid * leaking the IDs of other contexts. */ - if (stream->perf->exclusive_stream->ctx && + if (stream->ctx && stream->specific_ctx_id != ctx_id) { report32[2] = INVALID_CTX_ID; } From patchwork Mon Oct 10 18:14:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002826 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3DCB8C433FE for ; Mon, 10 Oct 2022 18:15:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6C72B10E6E9; Mon, 10 Oct 2022 18:15:20 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5CB3110E6D6 for ; Mon, 10 Oct 2022 18:14:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425677; x=1696961677; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=nGtYpD6+HjRBKowpL2I+d7ZW7B4GYkVKXsw926aW2Dw=; b=bT9z0G5PRZhEryuPJmB7HsRxhtPsM4z8OyMnymA46B9YHD4OLbTQalj+ bGwyTjrAKo6Y8gtSqtq5gvT6RrBYokTvyhyW5RuquHibQHLR33/MwpZ8g GwnLZ5a9i8V5a0nqmXKNeUw7MD6LmyGp4i/j2UQ5XJwvaYANZB/cM9MLW JnGaZElmQBv3jeW4MVi7dE0Imee+Gbg/GJjQJhM+RULibqX0+oymTTTgL 
Z/SMm6rObtQnTMrNvspmn+L/FUN5yEwnk5BH8sWd921c77dAk3rD1EZt/ Ub/ud8yV6IbYdWogRPCXm+uW4rSupVNblj9rVGGIFzE8u1i8qZry0taaY w==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439655" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439655" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820279" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820279" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:25 +0000 Message-Id: <20221010181434.513477-8-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 07/16] drm/i915/perf: Move gt-specific data from i915->perf to gt->perf X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Make perf part of gt as the OAG buffer is specific to a gt. The refactor eventually simplifies programming the right OA buffer and the right HW registers when supporting multiple gts. 
Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 + drivers/gpu/drm/i915/gt/intel_sseu.c | 4 +- drivers/gpu/drm/i915/i915_perf.c | 75 +++++++++++++--------- drivers/gpu/drm/i915/i915_perf_types.h | 39 +++++------ drivers/gpu/drm/i915/selftests/i915_perf.c | 16 +++-- 5 files changed, 80 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index f19c2de77ff6..9f653a347cad 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -20,6 +20,7 @@ #include "intel_gsc.h" #include "i915_vma.h" +#include "i915_perf_types.h" #include "intel_engine_types.h" #include "intel_gt_buffer_pool_types.h" #include "intel_hwconfig.h" @@ -286,6 +287,8 @@ struct intel_gt { /* sysfs defaults per gt */ struct gt_defaults defaults; struct kobject *sysfs_defaults; + + struct i915_perf_gt perf; }; struct intel_gt_definition { diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 66f21c735d54..6c6198a257ac 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -677,8 +677,8 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt, * If i915/perf is active, we want a stable powergating configuration * on the system. Use the configuration pinned by i915/perf. 
*/ - if (i915->perf.exclusive_stream) - req_sseu = &i915->perf.sseu; + if (gt->perf.exclusive_stream) + req_sseu = &gt->perf.sseu; slices = hweight8(req_sseu->slice_mask); subslices = hweight8(req_sseu->subslice_mask); diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index b60846f5d33c..446c9c97c3a4 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1561,8 +1561,9 @@ free_noa_wait(struct i915_perf_stream *stream) static void i915_oa_stream_destroy(struct i915_perf_stream *stream) { struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; - if (WARN_ON(stream != perf->exclusive_stream)) + if (WARN_ON(stream != gt->perf.exclusive_stream)) return; /* @@ -1571,7 +1572,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream) * * See i915_oa_init_reg_state() and lrc_configure_all_contexts() */ - WRITE_ONCE(perf->exclusive_stream, NULL); + WRITE_ONCE(gt->perf.exclusive_stream, NULL); perf->ops.disable_metric_set(stream); free_oa_buffer(stream); @@ -2564,10 +2565,11 @@ oa_configure_all_contexts(struct i915_perf_stream *stream, { struct drm_i915_private *i915 = stream->perf->i915; struct intel_engine_cs *engine; + struct intel_gt *gt = stream->engine->gt; struct i915_gem_context *ctx, *cn; int err; - lockdep_assert_held(&stream->perf->lock); + lockdep_assert_held(&gt->perf.lock); /* * The OA register config is setup through the context image. 
This image @@ -3088,6 +3090,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, { struct drm_i915_private *i915 = stream->perf->i915; struct i915_perf *perf = stream->perf; + struct intel_gt *gt; int format_size; int ret; @@ -3096,6 +3099,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, "OA engine not specified\n"); return -EINVAL; } + gt = props->engine->gt; /* * If the sysfs metrics/ directory wasn't registered for some @@ -3126,7 +3130,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, * counter reports and marshal to the appropriate client * we currently only allow exclusive access */ - if (perf->exclusive_stream) { + if (gt->perf.exclusive_stream) { drm_dbg(&stream->perf->i915->drm, "OA unit already in use\n"); return -EBUSY; @@ -3206,8 +3210,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->ops = &i915_oa_stream_ops; - perf->sseu = props->sseu; - WRITE_ONCE(perf->exclusive_stream, stream); + stream->engine->gt->perf.sseu = props->sseu; + WRITE_ONCE(gt->perf.exclusive_stream, stream); ret = i915_perf_stream_enable_sync(stream); if (ret) { @@ -3229,7 +3233,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, return 0; err_enable: - WRITE_ONCE(perf->exclusive_stream, NULL); + WRITE_ONCE(gt->perf.exclusive_stream, NULL); perf->ops.disable_metric_set(stream); free_oa_buffer(stream); @@ -3259,7 +3263,7 @@ void i915_oa_init_reg_state(const struct intel_context *ce, return; /* perf.exclusive_stream serialised by lrc_configure_all_contexts() */ - stream = READ_ONCE(engine->i915->perf.exclusive_stream); + stream = READ_ONCE(engine->gt->perf.exclusive_stream); if (stream && GRAPHICS_VER(stream->perf->i915) < 12) gen8_update_reg_state_unlocked(ce, stream); } @@ -3288,7 +3292,7 @@ static ssize_t i915_perf_read(struct file *file, loff_t *ppos) { struct i915_perf_stream *stream = file->private_data; - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = 
stream->engine->gt; size_t offset = 0; int ret; @@ -3312,14 +3316,14 @@ static ssize_t i915_perf_read(struct file *file, if (ret) return ret; - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); } while (!offset && !ret); } else { - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); } /* We allow the poll checking to sometimes report false positive EPOLLIN @@ -3366,7 +3370,7 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer) * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that * will be woken for new stream data. * - * Note: The &perf->lock mutex has been taken to serialize + * Note: The &gt->perf.lock mutex has been taken to serialize * with any non-file-operation driver hooks. * * Returns: any poll events that are ready without sleeping @@ -3407,12 +3411,12 @@ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream, static __poll_t i915_perf_poll(struct file *file, poll_table *wait) { struct i915_perf_stream *stream = file->private_data; - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; __poll_t ret; - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); ret = i915_perf_poll_locked(stream, file, wait); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); return ret; } @@ -3511,7 +3515,7 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, * @cmd: the ioctl request * @arg: the ioctl data * - * Note: The &perf->lock mutex has been taken to serialize + * Note: The &gt->perf.lock mutex has been taken to serialize * with any non-file-operation driver hooks. * * Returns: zero on success or a negative error code. 
Returns -EINVAL for @@ -3551,12 +3555,12 @@ static long i915_perf_ioctl(struct file *file, unsigned long arg) { struct i915_perf_stream *stream = file->private_data; - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; long ret; - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); ret = i915_perf_ioctl_locked(stream, cmd, arg); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); return ret; } @@ -3568,7 +3572,7 @@ static long i915_perf_ioctl(struct file *file, * Frees all resources associated with the given i915 perf @stream, disabling * any associated data capture in the process. * - * Note: The &perf->lock mutex has been taken to serialize + * Note: The &gt->perf.lock mutex has been taken to serialize * with any non-file-operation driver hooks. */ static void i915_perf_destroy_locked(struct i915_perf_stream *stream) @@ -3600,10 +3604,11 @@ static int i915_perf_release(struct inode *inode, struct file *file) { struct i915_perf_stream *stream = file->private_data; struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); i915_perf_destroy_locked(stream); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); /* Release the reference the perf stream kept on the driver. */ drm_dev_put(&perf->i915->drm); @@ -3636,7 +3641,7 @@ static const struct file_operations fops = { * See i915_perf_ioctl_open() for interface details. * * Implements further stream config validation and stream initialization on - * behalf of i915_perf_open_ioctl() with the &perf->lock mutex + * behalf of i915_perf_open_ioctl() with the &gt->perf.lock mutex * taken to serialize with any non-file-operation driver hooks. * * Note: at this point the @props have only been validated in isolation and @@ -4020,7 +4025,7 @@ static int read_properties_unlocked(struct i915_perf *perf, * mutex to avoid an awkward lockdep with mmap_lock. 
* * Most of the implementation details are handled by - * i915_perf_open_ioctl_locked() after taking the &perf->lock + * i915_perf_open_ioctl_locked() after taking the &gt->perf.lock * mutex for serializing with any non-file-operation driver hooks. * * Return: A newly opened i915 Perf stream file descriptor or negative @@ -4031,6 +4036,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, { struct i915_perf *perf = &to_i915(dev)->perf; struct drm_i915_perf_open_param *param = data; + struct intel_gt *gt; struct perf_open_properties props; u32 known_open_flags; int ret; @@ -4057,9 +4063,11 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, if (ret) return ret; - mutex_lock(&perf->lock); + gt = props.engine->gt; + + mutex_lock(&gt->perf.lock); ret = i915_perf_open_ioctl_locked(perf, param, &props, file); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); return ret; } @@ -4075,6 +4083,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, void i915_perf_register(struct drm_i915_private *i915) { struct i915_perf *perf = &i915->perf; + struct intel_gt *gt = to_gt(i915); if (!perf->i915) return; @@ -4083,13 +4092,13 @@ void i915_perf_register(struct drm_i915_private *i915) * i915_perf_open_ioctl(); considering that we register after * being exposed to userspace. 
*/ - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); perf->metrics_kobj = kobject_create_and_add("metrics", &i915->drm.primary->kdev->kobj); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); } /** @@ -4766,7 +4775,11 @@ void i915_perf_init(struct drm_i915_private *i915) } if (perf->ops.enable_metric_set) { - mutex_init(&perf->lock); + struct intel_gt *gt; + int i; + + for_each_gt(gt, i915, i) + mutex_init(&gt->perf.lock); /* Choose a representative limit */ oa_sample_rate_hard_limit = to_gt(i915)->clock_frequency / 2; diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index 05cb9a335a97..e888bfab478f 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -380,6 +380,26 @@ struct i915_oa_ops { u32 (*oa_hw_tail_read)(struct i915_perf_stream *stream); }; +struct i915_perf_gt { + /* + * Lock associated with anything below within this structure. + */ + struct mutex lock; + + /** + * @sseu: sseu configuration selected to run while perf is active, + * applies to all contexts. + */ + struct intel_sseu sseu; + + /* + * @exclusive_stream: The stream currently using the OA unit. This is + * sometimes accessed outside a syscall associated to its file + * descriptor. + */ + struct i915_perf_stream *exclusive_stream; +}; + struct i915_perf { struct drm_i915_private *i915; @@ -397,25 +417,6 @@ struct i915_perf { */ struct idr metrics_idr; - /* - * Lock associated with anything below within this structure - * except exclusive_stream. - */ - struct mutex lock; - - /* - * The stream currently using the OA unit. If accessed - * outside a syscall associated to its file - * descriptor. - */ - struct i915_perf_stream *exclusive_stream; - - /** - * @sseu: sseu configuration selected to run while perf is active, - * applies to all contexts. 
- */ - struct intel_sseu sseu; - /** * For rate limiting any notifications of spurious * invalid OA reports diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c index 429c6d73b159..24dde5531423 100644 --- a/drivers/gpu/drm/i915/selftests/i915_perf.c +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c @@ -102,6 +102,12 @@ test_stream(struct i915_perf *perf) I915_OA_FORMAT_A32u40_A4u32_B8_C8 : I915_OA_FORMAT_C4_B8, }; struct i915_perf_stream *stream; + struct intel_gt *gt; + + if (!props.engine) + return NULL; + + gt = props.engine->gt; if (!oa_config) return NULL; @@ -116,12 +122,12 @@ test_stream(struct i915_perf *perf) stream->perf = perf; - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); if (i915_oa_stream_init(stream, &param, &props)) { kfree(stream); stream = NULL; } - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); i915_oa_config_put(oa_config); @@ -130,11 +136,11 @@ test_stream(struct i915_perf *perf) static void stream_destroy(struct i915_perf_stream *stream) { - struct i915_perf *perf = stream->perf; + struct intel_gt *gt = stream->engine->gt; - mutex_lock(&perf->lock); + mutex_lock(&gt->perf.lock); i915_perf_destroy_locked(stream); - mutex_unlock(&perf->lock); + mutex_unlock(&gt->perf.lock); } static int live_sanitycheck(void *arg) From patchwork Mon Oct 10 18:14:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002824 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34E92C433F5 for ; Mon, 10 Oct 2022 18:15:20 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by 
gabe.freedesktop.org (Postfix) with ESMTP id D628E10E6D8; Mon, 10 Oct 2022 18:15:07 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4E23D10E6D5 for ; Mon, 10 Oct 2022 18:14:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425677; x=1696961677; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=ANnqzMZFUKwWmwNgg2VinokEtJEXBivAoemdsiCEZK8=; b=OKeOrYSomQ1YW0Kf0LlH3QzmxRqbjXVtCXUxmENNoEg1kNkt0/vO60Qs cT/xEGpN29up/v2UFSm0pnrBzqvjPT+H++1+z6QV0Em53KScv4zwgxHZL 2HAZtXQT923UjY32wIE6XPSEC3neG1IZxBWkaPXy7UavEf4rfQExEU7Bp iTuZ6FBs0UsZQmbOSJGlpH+qRHE36lZ5VaRY2dh1HCsaBPjW+3jhQNNW/ kATbOyKUwHzz9tuY3bDgw66q+s04kK20iTMhdB4clVO94u0khNfFuERzw 3hrAAIOF4wxc6+UIbzya7ROxyLroA2d5peA9Tz3GDCHoYZiOvqSRaZO9n A==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="301909905" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="301909905" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820282" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820282" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:26 +0000 Message-Id: <20221010181434.513477-9-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 08/16] drm/i915/perf: Replace gt->perf.lock with stream->lock for file ops X-BeenThere: 
intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" With multi-gt, user can access multiple OA buffers concurrently. Use stream->lock instead of gt->perf.lock to serialize file operations. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 31 ++++++++++++-------------- drivers/gpu/drm/i915/i915_perf_types.h | 5 +++++ 2 files changed, 19 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 446c9c97c3a4..3961e9c9e97b 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3229,6 +3229,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->poll_check_timer.function = oa_poll_check_timer_cb; init_waitqueue_head(&stream->poll_wq); spin_lock_init(&stream->oa_buffer.ptr_lock); + mutex_init(&stream->lock); return 0; @@ -3292,7 +3293,6 @@ static ssize_t i915_perf_read(struct file *file, loff_t *ppos) { struct i915_perf_stream *stream = file->private_data; - struct intel_gt *gt = stream->engine->gt; size_t offset = 0; int ret; @@ -3316,14 +3316,14 @@ static ssize_t i915_perf_read(struct file *file, if (ret) return ret; - mutex_lock(&gt->perf.lock); + mutex_lock(&stream->lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(&gt->perf.lock); + mutex_unlock(&stream->lock); } while (!offset && !ret); } else { - mutex_lock(&gt->perf.lock); + mutex_lock(&stream->lock); ret = stream->ops->read(stream, buf, count, &offset); - mutex_unlock(&gt->perf.lock); + mutex_unlock(&stream->lock); } /* We allow the poll checking to sometimes report false positive EPOLLIN @@ -3370,9 +3370,6 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer) * 
&i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that * will be woken for new stream data. * - * Note: The &gt->perf.lock mutex has been taken to serialize - * with any non-file-operation driver hooks. - * * Returns: any poll events that are ready without sleeping */ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream, @@ -3411,12 +3408,11 @@ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream, static __poll_t i915_perf_poll(struct file *file, poll_table *wait) { struct i915_perf_stream *stream = file->private_data; - struct intel_gt *gt = stream->engine->gt; __poll_t ret; - mutex_lock(&gt->perf.lock); + mutex_lock(&stream->lock); ret = i915_perf_poll_locked(stream, file, wait); - mutex_unlock(&gt->perf.lock); + mutex_unlock(&stream->lock); return ret; } @@ -3515,9 +3511,6 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, * @cmd: the ioctl request * @arg: the ioctl data * - * Note: The &gt->perf.lock mutex has been taken to serialize - * with any non-file-operation driver hooks. - * * Returns: zero on success or a negative error code. Returns -EINVAL for * an unknown ioctl request. */ @@ -3555,12 +3548,11 @@ static long i915_perf_ioctl(struct file *file, unsigned long arg) { struct i915_perf_stream *stream = file->private_data; - struct intel_gt *gt = stream->engine->gt; long ret; - mutex_lock(&gt->perf.lock); + mutex_lock(&stream->lock); ret = i915_perf_ioctl_locked(stream, cmd, arg); - mutex_unlock(&gt->perf.lock); + mutex_unlock(&stream->lock); return ret; } @@ -3606,6 +3598,11 @@ static int i915_perf_release(struct inode *inode, struct file *file) struct i915_perf *perf = stream->perf; struct intel_gt *gt = stream->engine->gt; + /* + * Within this call, we know that the fd is being closed and we have no + * other user of stream->lock. Use the perf lock to destroy the stream + * here. 
+ */ mutex_lock(&gt->perf.lock); i915_perf_destroy_locked(stream); mutex_unlock(&gt->perf.lock); diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index e888bfab478f..dc9bfd8086cf 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -146,6 +146,11 @@ struct i915_perf_stream { */ struct intel_engine_cs *engine; + /* + * Lock associated with operations on stream + */ + struct mutex lock; + /** * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*` * properties given when opening a stream, representing the contents From patchwork Mon Oct 10 18:14:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C56E4C433F5 for ; Mon, 10 Oct 2022 18:14:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0027D10E6B7; Mon, 10 Oct 2022 18:14:44 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id BA32910E6D9 for ; Mon, 10 Oct 2022 18:14:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425677; x=1696961677; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=EiG7y2mNkmcIZo24nc1z2FovK9JX3IgrM+eH0bulPmo=; b=b3z0oj+LWtAScUuUxfOrnuPiNZN2oZxCNyt6r5hF/60zo2W5zS89LUsY qFVW7Y6UCiDJH4x4EDL2x70HDKm+VMhd0Q02CZjknxlOBIGFLylq6volD 
rfb9RCY/clstT/4rJlilxMdtjsUDDkHfzH/saeksh6Ct/KhZcAQJC2Ovh qLhSUpLQsslvbp7nWraxa3mKcy4moEpsWCrpeMjcqGJYiIJ7zOgD6fR16 uSHyVWOiDt65rsIh+WbKhFjqYM6DkcrCRifOTaR6zl2jmfz2Mn8HQZH6T rZZJ8lWRvyr+PmVp3uExkcAPRXDh/Ji7mQIkxsbp8ZsfW74LdYQpZpbY5 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439656" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439656" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820286" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820286" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:27 +0000 Message-Id: <20221010181434.513477-10-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 09/16] drm/i915/perf: Use gt-specific ggtt for OA and noa-wait buffers X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" User passes uabi engine class and instance to the perf OA interface. Use gt corresponding to the engine to pin the buffers to the right ggtt. 
Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 3961e9c9e97b..2f9e18ee0aab 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1750,6 +1750,7 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream) static int alloc_oa_buffer(struct i915_perf_stream *stream) { struct drm_i915_private *i915 = stream->perf->i915; + struct intel_gt *gt = stream->engine->gt; struct drm_i915_gem_object *bo; struct i915_vma *vma; int ret; @@ -1769,11 +1770,22 @@ static int alloc_oa_buffer(struct i915_perf_stream *stream) i915_gem_object_set_cache_coherency(bo, I915_CACHE_LLC); /* PreHSW required 512K alignment, HSW requires 16M */ - vma = i915_gem_object_ggtt_pin(bo, NULL, 0, SZ_16M, 0); + vma = i915_vma_instance(bo, &gt->ggtt->vm, NULL); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto err_unref; } + + /* + * PreHSW required 512K alignment. + * HSW and onwards, align to requested size of OA buffer. + */ + ret = i915_vma_pin(vma, 0, SZ_16M, PIN_GLOBAL | PIN_HIGH); + if (ret) { + drm_err(&gt->i915->drm, "Failed to pin OA buffer %d\n", ret); + goto err_unref; + } + stream->oa_buffer.vma = vma; stream->oa_buffer.vaddr = @@ -1823,6 +1835,7 @@ static u32 *save_restore_register(struct i915_perf_stream *stream, u32 *cs, static int alloc_noa_wait(struct i915_perf_stream *stream) { struct drm_i915_private *i915 = stream->perf->i915; + struct intel_gt *gt = stream->engine->gt; struct drm_i915_gem_object *bo; struct i915_vma *vma; const u64 delay_ticks = 0xffffffffffffffff - @@ -1863,12 +1876,16 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) * multiple OA config BOs will have a jump to this address and it * needs to be fixed during the lifetime of the i915/perf stream. 
*/ - vma = i915_gem_object_ggtt_pin_ww(bo, &ww, NULL, 0, 0, PIN_HIGH); + vma = i915_vma_instance(bo, &gt->ggtt->vm, NULL); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto out_ww; } + ret = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_GLOBAL | PIN_HIGH); + if (ret) + goto out_ww; + batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); if (IS_ERR(batch)) { ret = PTR_ERR(batch); From patchwork Mon Oct 10 18:14:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002811 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 504DEC433FE for ; Mon, 10 Oct 2022 18:14:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5DFD710E6B5; Mon, 10 Oct 2022 18:14:44 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id EB41610E6BC for ; Mon, 10 Oct 2022 18:14:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425677; x=1696961677; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=xdL5g16hvWoRXtIK01mqJBwVRFsir67mkS11mGon/As=; b=OTc/1tNZT/ImZ5BftcxPoGWBOzvMfVj+QM/gc4MorS0/wNe9bKoEc/Th Tbs5MzBDIf0CoClCrQPUaLAtOcU1+oJpOAOJe2rM5mEodJFVaPBQ2VIz2 Fp+VrYN6aQ3bAkS3P57VAA3yLgZbxbOZbm8VQxsQBQQj1TM2O5BWhrxWg c9aUyoAcSMaIE5ycQA2fnupqaDMQo1+otZBy92CYrqXWt7j/ea7oHI0LI Ik+eYDzF/Ybz6hfF8N1C6ZTBPjJ28CpGKt8cEI4uHXKKRsATAe5ESqxKm +Zsw8kqaJ2rTLOXa/9F1MegdbXc79ZNjp2zApJsbiBfvIRM1TeF3OLvLA A==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439657" 
X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439657" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820289" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820289" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:28 +0000 Message-Id: <20221010181434.513477-11-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 10/16] drm/i915/perf: Store a pointer to oa_format in oa_buffer X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" DG2 introduces OA reports with 64 bit report header fields. Perf OA would need more information about the OA format in order to process such reports. Store all OA format info in oa_buffer instead of just the size and format-id. 
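For illustration only, the idea of keeping one pointer to a format descriptor, rather than copying individual fields into the stream, can be sketched in plain userspace C. This is not driver code: the names `oa_format_desc`, `oa_formats`, `stream_state`, `stream_set_format` and the `header_64bit` field are hypothetical, and the table values are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-format description. */
struct oa_format_desc {
	uint32_t format;	/* value that would be programmed into hardware */
	int size;		/* report size in bytes; 0 marks an unused slot */
	int header_64bit;	/* hypothetical extra attribute a consumer may need */
};

/* Hypothetical format table; the values are invented for this sketch. */
static const struct oa_format_desc oa_formats[] = {
	{ .format = 0, .size = 0,   .header_64bit = 0 },	/* unused slot */
	{ .format = 5, .size = 256, .header_64bit = 0 },
	{ .format = 1, .size = 192, .header_64bit = 1 },
};

struct stream_state {
	/* One pointer replaces separate "format" and "format_size" copies. */
	const struct oa_format_desc *format;
	int sample_size;
};

/* Returns 0 on success, -1 if idx is out of range or names an unused slot. */
static int stream_set_format(struct stream_state *s, size_t idx, int header_size)
{
	if (idx >= sizeof(oa_formats) / sizeof(oa_formats[0]))
		return -1;
	if (oa_formats[idx].size == 0)
		return -1;

	s->format = &oa_formats[idx];
	s->sample_size = header_size + s->format->size;
	return 0;
}
```

With this layout, any later attribute added to the descriptor is reachable through `s->format` without adding another field to the stream state.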
v2: Drop format_size variable (Ashutosh) Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 30 +++++++++++--------------- drivers/gpu/drm/i915/i915_perf_types.h | 3 +-- 2 files changed, 13 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2f9e18ee0aab..ad74fa5847f7 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -465,7 +465,7 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream) static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) { u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; unsigned long flags; bool pollin; u32 hw_tail; @@ -602,7 +602,7 @@ static int append_oa_sample(struct i915_perf_stream *stream, size_t *offset, const u8 *report) { - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; struct drm_i915_perf_record_header header; header.type = DRM_I915_PERF_RECORD_SAMPLE; @@ -652,7 +652,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, size_t *offset) { struct intel_uncore *uncore = stream->uncore; - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; u8 *oa_buf_base = stream->oa_buffer.vaddr; u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); u32 mask = (OA_BUFFER_SIZE - 1); @@ -946,7 +946,7 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, size_t *offset) { struct intel_uncore *uncore = stream->uncore; - int report_size = stream->oa_buffer.format_size; + int report_size = stream->oa_buffer.format->size; u8 *oa_buf_base = stream->oa_buffer.vaddr; u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); u32 mask = (OA_BUFFER_SIZE - 1); @@ -2502,7 +2502,7 @@ static int 
gen12_configure_oar_context(struct i915_perf_stream *stream, { int err; struct intel_context *ce = stream->pinned_ctx; - u32 format = stream->oa_buffer.format; + u32 format = stream->oa_buffer.format->format; u32 offset = stream->perf->ctx_oactxctrl_offset; struct flex regs_context[] = { { @@ -2875,7 +2875,7 @@ static void gen7_oa_enable(struct i915_perf_stream *stream) u32 ctx_id = stream->specific_ctx_id; bool periodic = stream->periodic; u32 period_exponent = stream->period_exponent; - u32 report_format = stream->oa_buffer.format; + u32 report_format = stream->oa_buffer.format->format; /* * Reset buf pointers so we don't forward reports from before now. @@ -2901,7 +2901,7 @@ static void gen7_oa_enable(struct i915_perf_stream *stream) static void gen8_oa_enable(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; - u32 report_format = stream->oa_buffer.format; + u32 report_format = stream->oa_buffer.format->format; /* * Reset buf pointers so we don't forward reports from before now. 
@@ -2927,7 +2927,7 @@ static void gen8_oa_enable(struct i915_perf_stream *stream) static void gen12_oa_enable(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; - u32 report_format = stream->oa_buffer.format; + u32 report_format = stream->oa_buffer.format->format; /* * If we don't want OA reports from the OA buffer, then we don't even @@ -3108,7 +3108,6 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, struct drm_i915_private *i915 = stream->perf->i915; struct i915_perf *perf = stream->perf; struct intel_gt *gt; - int format_size; int ret; if (!props->engine) { @@ -3164,20 +3163,15 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->sample_size = sizeof(struct drm_i915_perf_record_header); - format_size = perf->oa_formats[props->oa_format].size; + stream->oa_buffer.format = &perf->oa_formats[props->oa_format]; + if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0)) + return -EINVAL; stream->sample_flags = props->sample_flags; - stream->sample_size += format_size; - - stream->oa_buffer.format_size = format_size; - if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format_size == 0)) - return -EINVAL; + stream->sample_size += stream->oa_buffer.format->size; stream->hold_preemption = props->hold_preemption; - stream->oa_buffer.format = - perf->oa_formats[props->oa_format].format; - stream->periodic = props->oa_periodic; if (stream->periodic) stream->period_exponent = props->oa_period_exponent; diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index dc9bfd8086cf..e0c96b44eda8 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -250,11 +250,10 @@ struct i915_perf_stream { * @oa_buffer: State of the OA buffer. 
*/ struct { + const struct i915_oa_format *format; struct i915_vma *vma; u8 *vaddr; u32 last_ctx_id; - int format; - int format_size; int size_exponent; /** From patchwork Mon Oct 10 18:14:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30C0FC433FE for ; Mon, 10 Oct 2022 18:14:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 42D5810E6C5; Mon, 10 Oct 2022 18:14:47 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1FE5610E6E2 for ; Mon, 10 Oct 2022 18:14:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425678; x=1696961678; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=RBNJA5Ki+xlYSfFoUv15Te5LTjyVoGrRKN8FF0M4TOE=; b=B+ALQajh7i0fycd1alA0pnePAxi7neaKHguYDJUVGi5BH/ZjAhSWwvjG PXUKN6SMPYPyJtwd4BY+hbjonwNyek2+6i+xItn96Q5NDQIwHrOlTNCoV 6CqKYZ0dLeH4V2jwUdCs8fAQBpAu+5yykP0bL5DvtgfvJJTuVJxCzdpeZ iaX3xcGRVxk8ZgJdbl+e5BTwRMHwcVYSHauWO2Xju/nTfhKr0gGZ08mm1 lWPjzSmUKzKmz9mdoLO8n4kOQgAXwpEdhMZQPsSd6KxAVuTJwA2n1Ge6/ 6gdXAi1kRks3/QkkoHE1g+kbEH6JhCZsxq5yfahTyrsLTnzezUXd4i/UP g==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439658" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439658" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 
11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820292" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820292" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:29 +0000 Message-Id: <20221010181434.513477-12-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 11/16] drm/i915/perf: Add Wa_1508761755:dg2 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Disable Clock gating in EU when gathering the events so that EU events are not lost. 
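The diff for this patch toggles the gating bits through `_MASKED_BIT_ENABLE()` and `_MASKED_BIT_DISABLE()`. As a rough userspace sketch of that convention, assuming the usual i915 masked-register layout in which the upper 16 bits of the written value select which of the lower 16 bits take effect: the helper names below (`masked_bit_enable`, `masked_bit_disable`, `apply_masked_write`) are hypothetical and only model the behaviour.

```c
#include <stdint.h>

/* Write value that sets `bit`: mask in the high half, value in the low half. */
static inline uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}

/* Write value that clears `bit`: mask in the high half, zero in the low half. */
static inline uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/*
 * Model of how a masked register would apply a write: only the low bits
 * selected by the mask change, all other low bits keep their current value.
 */
static uint32_t apply_masked_write(uint32_t current, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffff;

	return (current & ~mask & 0xffff) | (value & mask);
}
```

Because unselected bits are preserved, the enable path and the disable path can each be a single write with no read-modify-write cycle.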
v2: Fix checkpatch issues Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 1 + drivers/gpu/drm/i915/i915_perf.c | 23 +++++++++++++++++++++++ 2 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 2275ee47da95..8bb1694035a6 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -1131,6 +1131,7 @@ #define GEN12_DISABLE_EARLY_READ REG_BIT(14) #define GEN12_ENABLE_LARGE_GRF_MODE REG_BIT(12) #define GEN12_PUSH_CONST_DEREF_HOLD_DIS REG_BIT(8) +#define GEN12_DISABLE_DOP_GATING REG_BIT(0) #define RT_CTRL _MMIO(0xe530) #define DIS_NULL_QUERY REG_BIT(10) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ad74fa5847f7..ccc0bd9f2969 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -2773,6 +2773,18 @@ gen12_enable_metric_set(struct i915_perf_stream *stream, u32 sqcnt1; int ret; + /* + * Wa_1508761755:xehpsdv, dg2 + * EU NOA signals behave incorrectly if EU clock gating is enabled. + * Disable thread stall DOP gating and EU DOP gating. + */ + if (IS_XEHPSDV(i915) || IS_DG2(i915)) { + intel_uncore_write(uncore, GEN8_ROW_CHICKEN, + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE)); + intel_uncore_write(uncore, GEN7_ROW_CHICKEN2, + _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING)); + } + intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG, /* Disable clk ratio reports, like previous Gens. */ _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS | @@ -2851,6 +2863,17 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream) struct drm_i915_private *i915 = stream->perf->i915; u32 sqcnt1; + /* + * Wa_1508761755:xehpsdv, dg2 + * Enable thread stall DOP gating and EU DOP gating. 
+ */ + if (IS_XEHPSDV(i915) || IS_DG2(i915)) { + intel_uncore_write(uncore, GEN8_ROW_CHICKEN, + _MASKED_BIT_DISABLE(STALL_DOP_GATING_DISABLE)); + intel_uncore_write(uncore, GEN7_ROW_CHICKEN2, + _MASKED_BIT_DISABLE(GEN12_DISABLE_DOP_GATING)); + } + /* Reset all contexts' slices/subslices configurations. */ gen12_configure_all_contexts(stream, NULL, NULL); From patchwork Mon Oct 10 18:14:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AAB77C433FE for ; Mon, 10 Oct 2022 18:15:02 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0FDFB10E501; Mon, 10 Oct 2022 18:15:00 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2EB7C10E6E4 for ; Mon, 10 Oct 2022 18:14:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425678; x=1696961678; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=/+VDP6srtww/gQL/Uaky7uZsLBaJneCJ+ayKBE1OfEk=; b=j/T0hzVzoRJcXoCql1wzf59e3izghdJMchns6q7H2Fur4d1EAmoO/LHw IFslruyFR8CQtTuYFKMmdBkbcQ+W3sKoEVqMJLEdE445+RNSrqavUyQdj rpLRpHd2WXBJIl/2ejtxdXltlsLA96A/YawDQVNmUpRK4bEsR6pE9MApv PodEMDyR1bEIucPAvsXcHCO7gbKfAmfqmA3ocSGcxd705pw2UIm2nYHS+ wH8PT+JyLJSN6SiLT9fDpiZx0Jk+RzqADm7WPU3QmWNeUukpXy+Xw8ZyH eErKkq3EhE0zEMotJOVmHeX64UvfdGCf9g+WBA/YpDwNiSG2DeCwLdNmn w==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439661" X-IronPort-AV: 
E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439661" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820297" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820297" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:30 +0000 Message-Id: <20221010181434.513477-13-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 12/16] drm/i915/perf: Apply Wa_18013179988 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" OA reports in the OA buffer contain an OA timestamp field that helps user calculate delta between 2 OA reports. The calculation relies on the CS timestamp frequency to convert the timestamp value to nanoseconds. The CS timestamp frequency is a function of the CTC_SHIFT value in RPM_CONFIG0. In DG2, OA unit assumes that the CTC_SHIFT is 3, instead of using the actual value from RPM_CONFIG0. At the user level, this results in an error in calculating delta between 2 OA reports since the OA timestamp is not shifted in the same manner as CS timestamp. Also the periodicity of the reports is different from what the user configured because of mismatch in the CS and OA frequencies. 
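To make the mismatch concrete, here is a small userspace sketch of the arithmetic, modelled on the `clock_frequency << (3 - shift)` adjustment and the rounded-up division that appear in the diff of this patch. The function names are hypothetical, the frequencies in the usage note are example numbers rather than measured hardware values, and the sketch assumes the shift field is in the range 0..3.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Frequency the OA timestamp would tick at if the OA unit behaves as
 * though CTC_SHIFT were 3 while the CS timestamp uses ctc_shift.
 */
static uint32_t oa_timestamp_frequency(uint32_t cs_freq_hz, uint32_t ctc_shift)
{
	assert(ctc_shift <= 3);	/* assumption: 2-bit shift field */
	return cs_freq_hz << (3 - ctc_shift);
}

/*
 * Sampling period in nanoseconds for a given OA exponent, rounded up.
 * The period is (2 << exponent) timestamp ticks; for exponents up to 31
 * the intermediate product fits in 64 bits.
 */
static uint64_t oa_exponent_to_ns(uint32_t oa_freq_hz, unsigned int exponent)
{
	uint64_t ticks = 2ULL << exponent;

	return (ticks * NSEC_PER_SEC + oa_freq_hz - 1) / oa_freq_hz;
}
```

For example, with a CS frequency of 19200000 Hz and a shift of 1, the OA frequency in this model is 76800000 Hz, so exponent 5 (64 ticks) corresponds to 834 ns rather than the 3334 ns obtained by using the CS frequency directly.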
The issue also affects MI_REPORT_PERF_COUNT command. To resolve this, return actual OA timestamp frequency to the user in i915_getparam_ioctl, so that user can calculate the right OA exponent as well as interpret the reports correctly. v2: - Use REG_FIELD_GET (Ashutosh) - Update commit msg Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_getparam.c | 3 +++ drivers/gpu/drm/i915/i915_perf.c | 30 ++++++++++++++++++++++++++-- drivers/gpu/drm/i915/i915_perf.h | 2 ++ include/uapi/drm/i915_drm.h | 6 ++++++ 4 files changed, 39 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index 342c8ca6414e..3047e80e1163 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, case I915_PARAM_PERF_REVISION: value = i915_perf_ioctl_version(); break; + case I915_PARAM_OA_TIMESTAMP_FREQUENCY: + value = i915_perf_oa_timestamp_frequency(i915); + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ccc0bd9f2969..ece4b34461ad 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3106,6 +3106,30 @@ get_sseu_config(struct intel_sseu *out_sseu, return i915_gem_user_to_context_sseu(engine->gt, drm_sseu, out_sseu); } +/* + * OA timestamp frequency = CS timestamp frequency in most platforms. On some + * platforms OA unit ignores the CTC_SHIFT and the 2 timestamps differ. In such + * cases, return the adjusted CS timestamp frequency to the user. 
+ */ +u32 i915_perf_oa_timestamp_frequency(struct drm_i915_private *i915) +{ + /* Wa_18013179988:dg2 */ + if (IS_DG2(i915)) { + intel_wakeref_t wakeref; + u32 reg, shift; + + with_intel_runtime_pm(to_gt(i915)->uncore->rpm, wakeref) + reg = intel_uncore_read(to_gt(i915)->uncore, RPM_CONFIG0); + + shift = REG_FIELD_GET(GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK, + reg); + + return to_gt(i915)->clock_frequency << (3 - shift); + } + + return to_gt(i915)->clock_frequency; +} + /** * i915_oa_stream_init - validate combined props for OA stream and init * @stream: An i915 perf stream @@ -3827,8 +3851,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf, static u64 oa_exponent_to_ns(struct i915_perf *perf, int exponent) { - return intel_gt_clock_interval_to_ns(to_gt(perf->i915), - 2ULL << exponent); + u64 nom = (2ULL << exponent) * NSEC_PER_SEC; + u32 den = i915_perf_oa_timestamp_frequency(perf->i915); + + return div_u64(nom + den - 1, den); } static __always_inline bool diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h index 1d1329e5af3a..f96e09a4af04 100644 --- a/drivers/gpu/drm/i915/i915_perf.h +++ b/drivers/gpu/drm/i915/i915_perf.h @@ -57,4 +57,6 @@ static inline void i915_oa_config_put(struct i915_oa_config *oa_config) kref_put(&oa_config->ref, i915_oa_config_release); } +u32 i915_perf_oa_timestamp_frequency(struct drm_i915_private *i915); + #endif /* __I915_PERF_H__ */ diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 8b59590e06d4..3a714980b9f5 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -749,6 +749,12 @@ typedef struct drm_i915_irq_wait { /* Query if the kernel supports the I915_USERPTR_PROBE flag. */ #define I915_PARAM_HAS_USERPTR_PROBE 56 +/* + * Frequency of the timestamps in OA reports. This used to be the same as the CS + * timestamp frequency, but differs on some platforms. 
+ */ +#define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57 + /* Must be kept compact -- no holes and well documented */ /** From patchwork Mon Oct 10 18:14:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2E93AC4332F for ; Mon, 10 Oct 2022 18:15:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 68B3610E6BE; Mon, 10 Oct 2022 18:15:03 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4789210E6E5 for ; Mon, 10 Oct 2022 18:14:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425678; x=1696961678; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=x2lHFOUnq8PbtpW6UnPhOShY0313BeMQbob4BegoYx4=; b=WnINdEFQGpt5jxtS+JhbO6U9d+zRKTpJMhXRN+fF44GfsR5NU4zqiKtx cvQbIHUrSJVlKH86VAY2M8M+lURKyB4yJ5RiFWy15aCaA6sGTg+raaBnt T8YMLUZzphXMIQlD8OCJCAn7q7+vUEbw7pBR7xGoRa2VUWV7JQ6Ptj9wk +E/1MAOClSaJapOod00uZQl2Xu5/n8jxoprvYImmXBPkMp9l/1iOAx/RU 1yTjDvdsvcuoHiJjDqJDE+q5JZ2JvGFcgGEJX2YQDNHIAdwxrI7FhFwJQ bl8yGPuqDQDYZj4WR7kE/+LKs8MLEB9WkxzvY01Q5U2EVqbfi8LJr4CxB g==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439662" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439662" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:37 -0700 X-IronPort-AV: 
E=McAfee;i="6500,9779,10496"; a="603820300" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820300" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:31 +0000 Message-Id: <20221010181434.513477-14-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 13/16] drm/i915/perf: Save/restore EU flex counters across reset X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If a drm client is killed, then hw contexts used by the client are reset immediately. This reset clears the EU flex counter configuration. If an OA use case is running in parallel, it would start seeing zeroed eu counter values following the reset even if the drm client is restarted. Save/restore the EU flex counter config so that the EU counters can be monitored continuously across resets. 
Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 74cbe8eaf531..3e152219fcb2 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -375,6 +375,14 @@ static int guc_mmio_regset_init(struct temp_regset *regset, for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++) ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL0, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL1, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL2, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL3, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL4, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL5, false); + ret |= GUC_MMIO_REG_ADD(gt, regset, EU_PERF_CNTL6, false); + return ret ? 
-1 : 0; } From patchwork Mon Oct 10 18:14:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30FF2C433FE for ; Mon, 10 Oct 2022 18:15:10 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6348C10E6C9; Mon, 10 Oct 2022 18:15:04 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5847410E6E9 for ; Mon, 10 Oct 2022 18:14:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425678; x=1696961678; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=rABf1Gqu3zQNeDgbhuOePoNoCgcZksIfXzeG70KdXvM=; b=R4o/6am6GNU0y7CjGTt6hV3N53IKWO1jcwztIlaRDHpS+9EQ/9ZLnwb7 i/e3rZiWsRoD4yEm5nRIteAGsWAh1SV+L8vEDMSp8pTuKJi6V3480i+GD HaY92CPzLyVz5gSZQF7VYfhpDnskvUXc0BkEw5Ky/zWTVA0ji48OdwsLY rC03SQn80f77e8b2TY269C/NzCKTkRXUuswgrZ0ooPV218xnRPuni1Nvh S7jz4K+WoyhhIfhYJDyp0ZduhZN40bYPufn4GgUNe6r5GnLAucrmJKC4b DslwTrVtJ/j2HsKpD6wUCfMR4Azi7eu4O5x10D9m6tSYfprhO7oyqPaY4 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439665" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439665" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:37 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820304" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820304" 
Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:36 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:32 +0000 Message-Id: <20221010181434.513477-15-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 14/16] drm/i915/guc: Support OA when Wa_16011777198 is enabled X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Vinay Belgaumkar On DG2, a workaround (Wa_16011777198) resets RCS/CCS before it goes into RC6. This breaks OA, which does not expect engine resets while it is in use. Fix it by disabling RC6 while an OA stream is open. 
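As a rough illustration of the fix described above, the override can be modelled as a pair of calls bracketing the lifetime of an OA stream: force the no-RC6 mode when the stream is opened and drop the override when it is closed. The sketch below is a standalone userspace model, not driver code. Every demo_* name is hypothetical, only the mode values mirror the slpc_gucrc_mode enum added by this patch, and it assumes that RC6 is permitted whenever no override is in place.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode values mirror the enum added by this patch; the names are shortened. */
enum demo_gucrc_mode {
	DEMO_GUCRC_MODE_HW = 0,
	DEMO_GUCRC_MODE_NO_RC6 = 1,
	DEMO_GUCRC_MODE_STATIC_TIMEOUT = 2,
	DEMO_GUCRC_MODE_DYNAMIC_HYSTERESIS = 3,

	DEMO_GUCRC_MODE_MAX,
};

/* Hypothetical stand-in for the power-gating state owned by the firmware. */
struct demo_slpc {
	bool overridden;	/* true while a host override is in place */
	unsigned int mode;	/* only meaningful when overridden is true */
};

/* Reject out-of-range modes, otherwise record the override. */
static int demo_override_gucrc_mode(struct demo_slpc *slpc, unsigned int mode)
{
	assert(slpc != NULL);

	if (mode >= DEMO_GUCRC_MODE_MAX)
		return -EINVAL;

	slpc->overridden = true;
	slpc->mode = mode;
	return 0;
}

/* Drop the override so the (assumed) default behaviour applies again. */
static void demo_unset_gucrc_mode(struct demo_slpc *slpc)
{
	assert(slpc != NULL);
	slpc->overridden = false;
}

/* Assumption: RC6 is allowed unless the no-RC6 override is active. */
static bool demo_rc6_allowed(const struct demo_slpc *slpc)
{
	assert(slpc != NULL);
	return !(slpc->overridden && slpc->mode == DEMO_GUCRC_MODE_NO_RC6);
}

/* Opening a stream blocks RC6 only when the workaround applies. */
static int demo_stream_open(struct demo_slpc *slpc, bool wa_applies)
{
	if (!wa_applies)
		return 0;

	return demo_override_gucrc_mode(slpc, DEMO_GUCRC_MODE_NO_RC6);
}

/* Closing the stream undoes exactly what the open path did. */
static void demo_stream_close(struct demo_slpc *slpc, bool wa_applies)
{
	if (wa_applies)
		demo_unset_gucrc_mode(slpc);
}
```

Calling demo_stream_open() and demo_stream_close() with the same wa_applies value keeps the two paths symmetric, which is the property the destroy and init hunks in this patch rely on by repeating the same platform check.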
v2: (Ashutosh) - Bring back slpc_unset_param helper - Update commit msg - Use with_intel_runtime_pm helper for set/unset v3: (Ashutosh) - Just use intel_uc_uses_guc_rc Signed-off-by: Vinay Belgaumkar Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- .../drm/i915/gt/uc/abi/guc_actions_slpc_abi.h | 9 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 66 +++++++++++++++++++ drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 27 ++++++++ 4 files changed, 104 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h index 4c840a2639dc..811add10c30d 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h @@ -128,6 +128,15 @@ enum slpc_media_ratio_mode { SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO = 2, }; +enum slpc_gucrc_mode { + SLPC_GUCRC_MODE_HW = 0, + SLPC_GUCRC_MODE_GUCRC_NO_RC6 = 1, + SLPC_GUCRC_MODE_GUCRC_STATIC_TIMEOUT = 2, + SLPC_GUCRC_MODE_GUCRC_DYNAMIC_HYSTERESIS = 3, + + SLPC_GUCRC_MODE_MAX, +}; + enum slpc_event_id { SLPC_EVENT_RESET = 0, SLPC_EVENT_SHUTDOWN = 1, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c index fdd895f73f9f..b3a4fb9e021f 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c @@ -137,6 +137,17 @@ static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value) return ret > 0 ? 
-EPROTO : ret; } +static int guc_action_slpc_unset_param(struct intel_guc *guc, u8 id) +{ + u32 request[] = { + GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST, + SLPC_EVENT(SLPC_EVENT_PARAMETER_UNSET, 1), + id, + }; + + return intel_guc_send(guc, request, ARRAY_SIZE(request)); +} + static bool slpc_is_running(struct intel_guc_slpc *slpc) { return slpc_get_state(slpc) == SLPC_GLOBAL_STATE_RUNNING; @@ -190,6 +201,15 @@ static int slpc_set_param(struct intel_guc_slpc *slpc, u8 id, u32 value) return ret; } +static int slpc_unset_param(struct intel_guc_slpc *slpc, u8 id) +{ + struct intel_guc *guc = slpc_to_guc(slpc); + + GEM_BUG_ON(id >= SLPC_MAX_PARAM); + + return guc_action_slpc_unset_param(guc, id); +} + static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) { struct drm_i915_private *i915 = slpc_to_i915(slpc); @@ -610,6 +630,52 @@ static void slpc_get_rp_values(struct intel_guc_slpc *slpc) slpc->boost_freq = slpc->rp0_freq; } +/** + * intel_guc_slpc_override_gucrc_mode() - override GUCRC mode + * @slpc: pointer to intel_guc_slpc. + * @mode: new value of the mode. + * + * This function will override the GUCRC mode. + * + * Return: 0 on success, non-zero error code on failure. 
+ */ +int intel_guc_slpc_override_gucrc_mode(struct intel_guc_slpc *slpc, u32 mode) +{ + int ret; + struct drm_i915_private *i915 = slpc_to_i915(slpc); + intel_wakeref_t wakeref; + + if (mode >= SLPC_GUCRC_MODE_MAX) + return -EINVAL; + + with_intel_runtime_pm(&i915->runtime_pm, wakeref) { + ret = slpc_set_param(slpc, SLPC_PARAM_PWRGATE_RC_MODE, mode); + if (ret) + drm_err(&i915->drm, + "Override gucrc mode %d failed %d\n", + mode, ret); + } + + return ret; +} + +int intel_guc_slpc_unset_gucrc_mode(struct intel_guc_slpc *slpc) +{ + struct drm_i915_private *i915 = slpc_to_i915(slpc); + intel_wakeref_t wakeref; + int ret = 0; + + with_intel_runtime_pm(&i915->runtime_pm, wakeref) { + ret = slpc_unset_param(slpc, SLPC_PARAM_PWRGATE_RC_MODE); + if (ret) + drm_err(&i915->drm, + "Unsetting gucrc mode failed %d\n", + ret); + } + + return ret; +} + /* * intel_guc_slpc_enable() - Start SLPC * @slpc: pointer to intel_guc_slpc. diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h index 82a98f78f96c..ccf483730d9d 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h @@ -42,5 +42,7 @@ int intel_guc_slpc_set_media_ratio_mode(struct intel_guc_slpc *slpc, u32 val); void intel_guc_pm_intrmsk_enable(struct intel_gt *gt); void intel_guc_slpc_boost(struct intel_guc_slpc *slpc); void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc); +int intel_guc_slpc_unset_gucrc_mode(struct intel_guc_slpc *slpc); +int intel_guc_slpc_override_gucrc_mode(struct intel_guc_slpc *slpc, u32 mode); #endif diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ece4b34461ad..ae7efe8a831e 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -208,6 +208,7 @@ #include "gt/intel_lrc.h" #include "gt/intel_lrc_reg.h" #include "gt/intel_ring.h" +#include "gt/uc/intel_guc_slpc.h" #include "i915_drv.h" #include "i915_file_private.h" @@ -1577,6 
+1578,15 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream) free_oa_buffer(stream); + /* + * Wa_16011777198:dg2: Unset the override of GUCRC mode to enable rc6. + */ + if (intel_uc_uses_guc_rc(&gt->uc) && + (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || + IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) + drm_WARN_ON(&gt->i915->drm, + intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc)); + intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL); intel_engine_pm_put(stream->engine); @@ -3262,6 +3272,23 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, intel_engine_pm_get(stream->engine); intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL); + /* + * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes + * OA to lose the configuration state. Prevent this by overriding GUCRC + * mode. + */ + if (intel_uc_uses_guc_rc(&gt->uc) && + (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || + IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) { + ret = intel_guc_slpc_override_gucrc_mode(&gt->uc.guc.slpc, + SLPC_GUCRC_MODE_GUCRC_NO_RC6); + if (ret) { + drm_dbg(&stream->perf->i915->drm, + "Unable to override gucrc mode\n"); + goto err_config; + } + } + ret = alloc_oa_buffer(stream); if (ret) goto err_oa_buf_alloc; From patchwork Mon Oct 10 18:14:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3797AC433FE for ; Mon, 10 Oct 2022 18:15:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) 
with ESMTP id 930AD10E6D5; Mon, 10 Oct 2022 18:15:14 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8BC4610E6C9 for ; Mon, 10 Oct 2022 18:14:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425678; x=1696961678; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=Trv08wtV+/dc1BCY535i1NYObvrVDrfoXrEIeEV7/Ho=; b=iHts47Um1Oofuz0YLlHQYiRnHE26k7T+1a1jOqcowAtp6R1gp0nnS3Cc enVvumAox0lnKXUC687zWO+VEur/g1XEY9J3wfASXqEcEiw5f46BszN7y +FsVp7YtMuxFDAzp2vT6ltMBcHLz+Vx6S7HtmWP0ptvX6pURzFRcJVKKQ 8pcdqyzp4zhkdarYMLfjoVzQ77C1ggvLjzaINa5X+kr+69MeQXcdA3iSh ywr1mxUfb7+p3TWbNWbrhlmFBFVilyi+Zxd5EJA+Bo0EvJnBIXpLwT2ta Mpn+a4cSDzUGgjb3wgTexO7zZZhbRoZr6wifHyZOwkZuu7AnqL4y6tNqg w==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439667" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439667" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:37 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820307" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820307" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:37 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:33 +0000 Message-Id: <20221010181434.513477-16-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 15/16] drm/i915/perf: complete programming whitelisting for XEHPSDV X-BeenThere: intel-gfx@lists.freedesktop.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Lionel Landwerlin We have an additional register to select which slices contribute to OAG/OAG counter increments. Signed-off-by: Lionel Landwerlin Signed-off-by: Matt Roper Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/i915_perf.c | 13 +++++++++++++ drivers/gpu/drm/i915/intel_device_info.h | 1 + 4 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ccd54ff54002..992ca7b0aea0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -904,6 +904,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_OA_BPC_REPORTING(dev_priv) \ (INTEL_INFO(dev_priv)->has_oa_bpc_reporting) +#define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \ + (INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits) /* * Set this flag, when platform requires 64K GTT page sizes or larger for diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 6b25e4cb6221..f5de5ae0b800 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1023,6 +1023,7 @@ static const struct intel_device_info adl_p_info = { .has_logical_ring_elsq = 1, \ .has_mslice_steering = 1, \ .has_oa_bpc_reporting = 1, \ + .has_oa_slice_contrib_limits = 1, \ .has_rc6 = 1, \ .has_reset_engine = 1, \ .has_rps = 1, \ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ae7efe8a831e..c9ab7eaa15ea 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4258,6 +4258,11 @@ static const struct i915_range gen12_oa_b_counters[] = { {} }; +static const struct i915_range xehp_oa_b_counters[] = { + 
{ .start = 0xdc48, .end = 0xdc48 }, /* OAA_ENABLE_REG */ + { .start = 0xdd00, .end = 0xdd48 }, /* OAG_LCE0_0 - OAA_LENABLE_REG */ +}; + static const struct i915_range gen7_oa_mux_regs[] = { { .start = 0x91b8, .end = 0x91cc }, /* OA_PERFCNT[1-2], OA_PERFMATRIX */ { .start = 0x9800, .end = 0x9888 }, /* MICRO_BP0_0 - NOA_WRITE */ @@ -4332,6 +4337,12 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr) return reg_in_range_table(addr, gen12_oa_b_counters); } +static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr) +{ + return reg_in_range_table(addr, xehp_oa_b_counters) || + reg_in_range_table(addr, gen12_oa_b_counters); +} + static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr) { return reg_in_range_table(addr, gen12_oa_mux_regs); @@ -4844,6 +4855,8 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; } else if (GRAPHICS_VER(i915) == 12) { perf->ops.is_valid_b_counter_reg = + HAS_OA_SLICE_CONTRIB_LIMITS(i915) ? 
+ xehp_is_valid_b_counter_addr : gen12_is_valid_b_counter_addr; perf->ops.is_valid_mux_reg = gen12_is_valid_mux_addr; diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 1f7a842cd408..9975f736bc41 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -165,6 +165,7 @@ enum intel_ppgtt_type { func(has_media_ratio_mode); \ func(has_mslice_steering); \ func(has_oa_bpc_reporting); \ + func(has_oa_slice_contrib_limits); \ func(has_one_eu_per_fuse_bit); \ func(has_pxp); \ func(has_rc6); \ From patchwork Mon Oct 10 18:14:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13002816 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5C9C5C433F5 for ; Mon, 10 Oct 2022 18:15:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C91A310E454; Mon, 10 Oct 2022 18:14:59 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9978A10E700 for ; Mon, 10 Oct 2022 18:14:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665425678; x=1696961678; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=GHH0xba73IqimeUFidh9YYyt+dABFULfI58cLzbIzXE=; b=dH2MIQjDRB55RBz+lFGjC93rpvbtQWYp84UyeZGlM9iZH8VrAnDlqVtX bkcGgnWsvb2fplZpF0vTdFxIkLlZtH3Bj2h5w6phOWIjZKRjtQZD//8iK m1v2+hUNyYpuH36vDkoa1Csoc/iOzM266syRvbkkVeYkk/Wc6APpawPyv 
GUgf5Rxbwyyl5bFItV8OQJRZbizQnEQ4LpXHOnL4CHgtMglTIL0sKYeuh XL6wTwD5fYlFUnBemuSBqT9iB3dysLsTN4BZhUT5OaSqffY00OIB+QyUs 41xUMx+75UHzfFfutCZUjBMlJFMBRphUAjdmbCVdp8mnePf9nj10gTso+ g==; X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="368439668" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="368439668" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:37 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10496"; a="603820311" X-IronPort-AV: E=Sophos;i="5.95,173,1661842800"; d="scan'208";a="603820311" Received: from dut042-dg2frd.fm.intel.com ([10.105.19.4]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Oct 2022 11:14:37 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org, Lionel G Landwerlin , Ashutosh Dixit Date: Mon, 10 Oct 2022 18:14:34 +0000 Message-Id: <20221010181434.513477-17-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> References: <20221010181434.513477-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 16/16] drm/i915/perf: Enable OA for DG2 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" OA was disabled for DG2 because support was missing. Re-enable it now that the support is in place. 
Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index c9ab7eaa15ea..cb3c67abb82e 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4795,12 +4795,6 @@ void i915_perf_init(struct drm_i915_private *i915) { struct i915_perf *perf = &i915->perf; - /* XXX const struct i915_perf_ops! */ - - /* i915_perf is not enabled for DG2 yet */ - if (IS_DG2(i915)) - return; - perf->oa_formats = oa_formats; if (IS_HASWELL(i915)) { perf->ops.is_valid_b_counter_reg = gen7_is_valid_b_counter_addr;
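
For readers following the whitelist change in patch 15/16, the B-counter check there consults the new XeHP ranges first and then falls back to the gen12 ranges. Below is a minimal standalone sketch of that kind of range-table lookup. It assumes, which this series does not show, that the table walk stops at an all-zero sentinel entry like the `{}` that closes gen12_oa_b_counters in the hunk context. All demo_* names are hypothetical, and the second table holds a made-up range rather than the driver's gen12 values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive register address range. */
struct demo_range {
	uint32_t start;
	uint32_t end;
};

/* The two ranges added by patch 15/16, followed by an explicit sentinel. */
static const struct demo_range demo_xehp_b_counters[] = {
	{ .start = 0xdc48, .end = 0xdc48 },
	{ .start = 0xdd00, .end = 0xdd48 },
	{ .start = 0, .end = 0 },	/* sentinel: terminates the walk */
};

/* Made-up range standing in for the older gen12 table. */
static const struct demo_range demo_gen12_b_counters[] = {
	{ .start = 0x1000, .end = 0x10fc },
	{ .start = 0, .end = 0 },	/* sentinel */
};

/* Walk the table until the all-zero sentinel; true if addr is in any range. */
static bool demo_reg_in_range_table(uint32_t addr,
				    const struct demo_range *table)
{
	assert(table != NULL);

	for (; table->start || table->end; table++) {
		if (addr >= table->start && addr <= table->end)
			return true;
	}

	return false;
}

/* Accept an address found in either the XeHP table or the fallback table. */
static bool demo_is_valid_b_counter_addr(uint32_t addr)
{
	return demo_reg_in_range_table(addr, demo_xehp_b_counters) ||
	       demo_reg_in_range_table(addr, demo_gen12_b_counters);
}
```

With a sentinel-terminated walk like this one, every table passed to the lookup needs its terminating entry; a table without one would let the loop read past the end of the array.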