From patchwork Fri Aug 16 08:04:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Lucas De Marchi X-Patchwork-Id: 11097159 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0788214DB for ; Fri, 16 Aug 2019 08:07:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EDD4928841 for ; Fri, 16 Aug 2019 08:07:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E1B6A289DC; Fri, 16 Aug 2019 08:07:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C606B28841 for ; Fri, 16 Aug 2019 08:07:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 098836EB2C; Fri, 16 Aug 2019 08:07:19 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 47C526EB2A for ; Fri, 16 Aug 2019 08:07:18 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Aug 2019 01:06:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,391,1559545200"; d="scan'208";a="184851645" Received: from miyoungj-mobl1.amr.corp.intel.com (HELO ldmartin-desk1.intel.com) ([10.254.105.68]) by FMSMGA003.fm.intel.com with ESMTP; 16 Aug 2019 01:06:15 -0700 From: Lucas De Marchi To: intel-gfx@lists.freedesktop.org Date: Fri, 16 Aug 2019 01:04:58 -0700 Message-Id: <20190816080503.28594-35-lucas.demarchi@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190816080503.28594-1-lucas.demarchi@intel.com> References: <20190816080503.28594-1-lucas.demarchi@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 34/39] drm/i915/tgl: Add perf support on TGL X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Lionel Landwerlin The design of the OA unit has been split into several units. We now have a global unit (OAG) and a render specific unit (OAR). This leads to some changes on how we program things. Some details : OAR: - has its own set of counter registers, they are per-context saved/restored - counters are not written to the circular OA buffer - a snapshot of the counters can be acquired with MI_RECORD_PERF_COUNT, or a single counter can be read with MI_STORE_REGISTER_MEM. OAG: - has global counters that increment across context switches - counters are written into the circular OA buffer (if requested) BSpec: 28727, 30021 Signed-off-by: Lionel Landwerlin Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/i915_perf.c | 236 ++++++++++++++++++++++++-- drivers/gpu/drm/i915/i915_reg.h | 103 +++++++++++ drivers/gpu/drm/i915/oa/i915_oa_tgl.c | 113 ++++++++++++ drivers/gpu/drm/i915/oa/i915_oa_tgl.h | 17 ++ 5 files changed, 458 insertions(+), 14 deletions(-) create mode 100644 drivers/gpu/drm/i915/oa/i915_oa_tgl.c create mode 100644 drivers/gpu/drm/i915/oa/i915_oa_tgl.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 45add812048b..6d9040cdf431 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -234,7 +234,8 @@ i915-y += \ oa/i915_oa_cflgt2.o \ oa/i915_oa_cflgt3.o \ oa/i915_oa_cnl.o \ - oa/i915_oa_icl.o + oa/i915_oa_icl.o \ + oa/i915_oa_tgl.o i915-y += i915_perf.o # Post-mortem debug and GPU hang state capture diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 9386d9c82930..250061cdad5c 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -215,6 +215,7 @@ #include "oa/i915_oa_cflgt3.h" #include "oa/i915_oa_cnl.h" #include "oa/i915_oa_icl.h" +#include "oa/i915_oa_tgl.h" #define OA_TAKEN(tail, head) (((tail) - (head)) & (stream->oa_buffer.vma->size - 1)) @@ -1496,6 +1497,73 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream) stream->pollin = false; } +static void gen12_init_oa_buffer(struct i915_perf_stream *stream) +{ + struct drm_i915_private *dev_priv = stream->dev_priv; + u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); + unsigned long flags; + + spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); + + I915_WRITE(GEN12_OAG_OASTATUS, 0); + I915_WRITE(GEN12_OAG_OAHEADPTR, gtt_offset & GEN12_OAG_OAHEADPTR_MASK); + stream->oa_buffer.head = gtt_offset; + + /* + * PRM says: + * + * "This MMIO must be set before the OATAILPTR + * register and after the OAHEADPTR register. This is + * to enable proper functionality of the overflow + * bit." + * + * On hardware that supports it, OA buffer size goes up to 128Mb by + * toggling a bit in the OAG_OA_DEBUG register meaning multiply base + * value by 8. OA buffer size is already clamped between 128K and max + * supported size when validating properties passed by the user, so no + * need to check for specific hardware here. + */ + I915_WRITE(GEN12_OAG_OABUFFER, gtt_offset | + ((stream->oa_buffer.size_exponent - 17) << + GEN12_OAG_OABUFFER_BUFFER_SIZE_SHIFT) | + GEN8_OABUFFER_MEM_SELECT_GGTT); + I915_WRITE(GEN12_OAG_OATAILPTR, gtt_offset & GEN12_OAG_OATAILPTR_MASK); + + /* Mark that we need updated tail pointers to read from... */ + stream->oa_buffer.tails[0].offset = INVALID_TAIL_PTR; + stream->oa_buffer.tails[1].offset = INVALID_TAIL_PTR; + + /* + * Reset state used to recognise context switches, affecting which + * reports we will forward to userspace while filtering for a single + * context. + */ + stream->oa_buffer.last_ctx_id = INVALID_CTX_ID; + + spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); + + /* + * NB: although the OA buffer will initially be allocated + * zeroed via shmfs (and so this memset is redundant when + * first allocating), we may re-init the OA buffer, either + * when re-enabling a stream or in error/reset paths. + * + * The reason we clear the buffer for each re-init is for the + * sanity check in gen8_append_oa_reports() that looks at the + * reason field to make sure it's non-zero which relies on + * the assumption that new reports are being written to zeroed + * memory... + */ + memset(stream->oa_buffer.vaddr, 0, + stream->oa_buffer.vma->size); + + /* + * Maybe make ->pollin per-stream state if we support multiple + * concurrent streams in the future. + */ + stream->pollin = false; +} + static int alloc_oa_buffer(struct i915_perf_stream *stream, int size_exponent) { struct drm_i915_gem_object *bo; @@ -1691,10 +1759,21 @@ gen8_update_reg_state_unlocked(struct i915_perf_stream *stream, }; int i; - CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, - (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | - (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) | - GEN8_OA_COUNTER_RESUME); + if (IS_GEN(i915, 12)) { + u32 format = stream->oa_buffer.format; + + CTX_REG(reg_state, ctx_oactxctrl, GEN12_OAR_OACONTROL, + (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) | + (oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0)); + } else { + u32 period_exponent = stream->period_exponent; + bool periodic = stream->periodic; + + CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL, + (period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) | + (periodic ? GEN8_OA_TIMER_ENABLE : 0) | + GEN8_OA_COUNTER_RESUME); + } for (i = 0; i < ARRAY_SIZE(flex_regs); i++) { CTX_REG(reg_state, ctx_flexeu0 + i * 2, flex_regs[i], @@ -1855,8 +1934,8 @@ static int gen8_configure_context(struct i915_gem_context *ctx, * * Note: it's only the RCS/Render context that has any OA state. */ -static int gen8_configure_all_contexts(struct i915_perf_stream *stream, - const struct i915_oa_config *oa_config) +static int lrc_configure_all_contexts(struct i915_perf_stream *stream, + const struct i915_oa_config *oa_config) { struct drm_i915_private *i915 = stream->dev_priv; /* The MMIO offsets for Flex EU registers aren't contiguous */ @@ -1981,7 +2060,50 @@ static int gen8_enable_metric_set(struct i915_perf_stream *stream) * to make sure all slices/subslices are ON before writing to NOA * registers. */ - ret = gen8_configure_all_contexts(stream, oa_config); + ret = lrc_configure_all_contexts(stream, oa_config); + if (ret) + return ret; + + config_oa_regs(dev_priv, oa_config->mux_regs, oa_config->mux_regs_len); + + config_oa_regs(dev_priv, oa_config->b_counter_regs, + oa_config->b_counter_regs_len); + + return 0; +} + +static int gen12_enable_metric_set(struct i915_perf_stream *stream) +{ + struct drm_i915_private *dev_priv = stream->dev_priv; + struct i915_oa_config *oa_config = stream->oa_config; + bool periodic = stream->periodic; + u32 period_exponent = stream->period_exponent; + int ret; + + I915_WRITE(GEN12_OAG_OA_DEBUG, + /* Disable clk ratio reports, like previous Gens. */ + _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS | + GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO) | + /* + * If the user didn't require OA reports, instruct the + * hardware not to emit ctx switch reports. + */ + (stream->sample_flags & SAMPLE_OA_REPORT) ? + _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CTX_SWITCH_REPORTS) : + _MASKED_BIT_DISABLE(GEN12_OAG_OA_DEBUG_DISABLE_CTX_SWITCH_REPORTS)); + + I915_WRITE(GEN12_OAG_OAGLBCTXCTRL, + periodic ? + (GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE | + period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT) + : 0); + + /* + * Update all contexts prior writing the mux configurations as we need + * to make sure all slices/subslices are ON before writing to NOA + * registers. + */ + ret = lrc_configure_all_contexts(stream, oa_config); if (ret) return ret; @@ -1999,7 +2121,7 @@ static void gen8_disable_metric_set(struct i915_perf_stream *stream) struct drm_i915_private *dev_priv = stream->dev_priv; /* Reset all contexts' slices/subslices configurations. */ - gen8_configure_all_contexts(stream, NULL); + lrc_configure_all_contexts(stream, NULL); I915_WRITE(GDT_CHICKEN_BITS, (I915_READ(GDT_CHICKEN_BITS) & ~GT_NOA_ENABLE)); @@ -2010,7 +2132,19 @@ static void gen10_disable_metric_set(struct i915_perf_stream *stream) struct drm_i915_private *dev_priv = stream->dev_priv; /* Reset all contexts' slices/subslices configurations. */ - gen8_configure_all_contexts(stream, NULL); + lrc_configure_all_contexts(stream, NULL); + + /* Make sure we disable noa to save power. */ + I915_WRITE(RPM_CONFIG1, + I915_READ(RPM_CONFIG1) & ~GEN10_GT_NOA_ENABLE); +} + +static void gen12_disable_metric_set(struct i915_perf_stream *stream) +{ + struct drm_i915_private *dev_priv = stream->dev_priv; + + /* Reset all contexts' slices/subslices configurations. */ + lrc_configure_all_contexts(stream, NULL); /* Make sure we disable noa to save power. */ I915_WRITE(RPM_CONFIG1, @@ -2073,6 +2207,26 @@ static void gen8_oa_enable(struct i915_perf_stream *stream) GEN8_OA_COUNTER_ENABLE); } +static void gen12_oa_enable(struct i915_perf_stream *stream) +{ + struct drm_i915_private *dev_priv = stream->dev_priv; + u32 report_format = stream->oa_buffer.format; + + /* + * If we don't want OA reports from the OA buffer, then we don't even + * need to program the OAG unit. + */ + if (!(stream->sample_flags & SAMPLE_OA_REPORT)) + return; + + gen12_init_oa_buffer(stream); + + I915_WRITE(GEN12_OAG_OACONTROL, + (report_format << + GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) | + GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE); +} + /** * i915_oa_stream_enable - handle `I915_PERF_IOCTL_ENABLE` for OA stream * @stream: An i915 perf stream opened for OA metrics @@ -2116,6 +2270,13 @@ static void gen8_oa_disable(struct i915_perf_stream *stream) DRM_ERROR("wait for OA to be disabled timed out\n"); } +static void gen12_oa_disable(struct i915_perf_stream *stream) +{ + struct drm_i915_private *dev_priv = stream->dev_priv; + + I915_WRITE(GEN12_OAG_OACONTROL, 0); +} + /** * i915_oa_stream_disable - handle `I915_PERF_IOCTL_DISABLE` for OA stream * @stream: An i915 perf stream opened for OA metrics @@ -2733,16 +2894,24 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, * rest of the system, which we consider acceptable for a * non-privileged client. * - * For Gen8+ the OA unit no longer supports clock gating off for a + * For Gen8->11 the OA unit no longer supports clock gating off for a * specific context and the kernel can't securely stop the counters * from updating as system-wide / global values. Even though we can * filter reports based on the included context ID we can't block * clients from seeing the raw / global counter values via * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to * enable the OA unit by default. + * + * For Gen12+ we gain a new OAR unit that only monitors the RCS on a + * per context basis. So we can relax requirements there if the user + * doesn't request global stream access (i.e. query based sampling + * using MI_RECORD_PERF_COUNT. */ if (IS_HASWELL(dev_priv) && specific_ctx) privileged_op = false; + else if (IS_GEN(dev_priv, 12) && specific_ctx && + (props->sample_flags & SAMPLE_OA_REPORT) == 0) + privileged_op = false; /* Similar to perf's kernel.perf_paranoid_cpu sysctl option * we check a dev.i915.perf_stream_paranoid sysctl option @@ -3082,7 +3251,9 @@ void i915_perf_register(struct drm_i915_private *dev_priv) sysfs_attr_init(&dev_priv->perf.test_config.sysfs_metric_id.attr); - if (INTEL_GEN(dev_priv) >= 11) { + if (IS_GEN(dev_priv, 12)) { + i915_perf_load_test_config_tgl(dev_priv); + } else if (INTEL_GEN(dev_priv) >= 11) { i915_perf_load_test_config_icl(dev_priv); } else if (IS_CANNONLAKE(dev_priv)) { i915_perf_load_test_config_cnl(dev_priv); @@ -3228,6 +3399,26 @@ static bool chv_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr) (addr >= 0x182300 && addr <= 0x1823A4); } +static bool gen12_is_valid_b_counter_addr(struct drm_i915_private *dev_priv, + u32 addr) +{ + return (addr >= i915_mmio_reg_offset(GEN12_OAG_CEC0_0) && + addr <= i915_mmio_reg_offset(GEN12_OAG_CEC7_1)) || + (addr >= i915_mmio_reg_offset(GEN12_OAG_SCEC0_0) && + addr <= i915_mmio_reg_offset(GEN12_OAG_SCEC7_1)) || + addr == i915_mmio_reg_offset(GEN12_OAA_DBG_REG) || + addr == i915_mmio_reg_offset(GEN12_OAG_OA_PESS) || + addr == i915_mmio_reg_offset(GEN12_OAG_SPCTR_CNF); +} + +static bool gen12_is_valid_mux_addr(struct drm_i915_private *dev_priv, + u32 addr) +{ + return addr == i915_mmio_reg_offset(NOA_WRITE) || + (addr >= i915_mmio_reg_offset(NOA_CONFIG(0)) && + addr <= i915_mmio_reg_offset(NOA_CONFIG(8))); +} + static u32 mask_reg_value(u32 reg, u32 val) { /* HALF_SLICE_CHICKEN2 is programmed with a the @@ -3617,8 +3808,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv) */ dev_priv->perf.oa_formats = gen8_plus_oa_formats; - dev_priv->perf.ops.oa_enable = gen8_oa_enable; - dev_priv->perf.ops.oa_disable = gen8_oa_disable; dev_priv->perf.ops.read = gen8_oa_read; dev_priv->perf.ops.oa_hw_tail_read = gen8_oa_hw_tail_read; @@ -3635,6 +3824,8 @@ void i915_perf_init(struct drm_i915_private *dev_priv) chv_is_valid_mux_addr; } + dev_priv->perf.ops.oa_enable = gen8_oa_enable; + dev_priv->perf.ops.oa_disable = gen8_oa_disable; dev_priv->perf.ops.enable_metric_set = gen8_enable_metric_set; dev_priv->perf.ops.disable_metric_set = gen8_disable_metric_set; @@ -3657,6 +3848,8 @@ void i915_perf_init(struct drm_i915_private *dev_priv) dev_priv->perf.ops.is_valid_flex_reg = gen8_is_valid_flex_addr; + dev_priv->perf.ops.oa_enable = gen8_oa_enable; + dev_priv->perf.ops.oa_disable = gen8_oa_disable; dev_priv->perf.ops.enable_metric_set = gen8_enable_metric_set; dev_priv->perf.ops.disable_metric_set = gen10_disable_metric_set; @@ -3668,6 +3861,23 @@ void i915_perf_init(struct drm_i915_private *dev_priv) dev_priv->perf.ctx_flexeu0_offset = 0x78e; } dev_priv->perf.gen8_valid_ctx_bit = BIT(16); + } else if (IS_GEN(dev_priv, 12)) { + dev_priv->perf.ops.is_valid_b_counter_reg = + gen12_is_valid_b_counter_addr; + dev_priv->perf.ops.is_valid_mux_reg = + gen12_is_valid_mux_addr; + dev_priv->perf.ops.is_valid_flex_reg = + gen8_is_valid_flex_addr; + + dev_priv->perf.ops.oa_enable = gen12_oa_enable; + dev_priv->perf.ops.oa_disable = gen12_oa_disable; + dev_priv->perf.ops.enable_metric_set = gen12_enable_metric_set; + dev_priv->perf.ops.disable_metric_set = gen12_disable_metric_set; + + dev_priv->perf.ctx_oactxctrl_offset = 0x14a; /* To verify */ + dev_priv->perf.ctx_flexeu0_offset = 0x3de; /* To verify*/ + + dev_priv->perf.gen8_valid_ctx_bit = (1<<16); } } diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d35f385f8288..ae4485b60b53 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -699,6 +699,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN8_OATAILPTR _MMIO(0x2B10) #define GEN8_OATAILPTR_MASK 0xffffffc0 + #define OABUFFER_SIZE_128K (0 << 3) #define OABUFFER_SIZE_256K (1 << 3) #define OABUFFER_SIZE_512K (2 << 3) @@ -708,6 +709,44 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define OABUFFER_SIZE_8M (6 << 3) #define OABUFFER_SIZE_16M (7 << 3) +/* Gen12 OAR unit */ +#define GEN12_OAR_OACONTROL _MMIO(0x2960) +#define GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT 2 +#define GEN12_OAR_OACONTROL_COUNTER_ENABLE (1 << 0) + +#define GEN12_OAR_OASTATUS _MMIO(0x2968) + +/* Gen12 OAG unit */ +#define GEN12_OAG_OAHEADPTR _MMIO(0xdb00) +#define GEN12_OAG_OAHEADPTR_MASK 0xffffffc0 +#define GEN12_OAG_OATAILPTR _MMIO(0xdb04) +#define GEN12_OAG_OATAILPTR_MASK 0xffffffc0 + +#define GEN12_OAG_OABUFFER _MMIO(0xdb08) +#define GEN12_OAG_OABUFFER_BUFFER_SIZE_MASK (0x7) +#define GEN12_OAG_OABUFFER_BUFFER_SIZE_SHIFT (3) +#define GEN12_OAG_OABUFFER_MEMORY_SELECT (1 << 0) /* 0: PPGTT, 1: GGTT */ + +#define GEN12_OAG_OAGLBCTXCTRL _MMIO(0x2b28) +#define GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT 2 +#define GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE (1 << 1) +#define GEN12_OAG_OAGLBCTXCTRL_COUNTER_RESUME (1 << 0) + +#define GEN12_OAG_OACONTROL _MMIO(0xdaf4) +#define GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT 2 +#define GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE (1 << 0) + +#define GEN12_OAG_OA_DEBUG _MMIO(0xdaf8) +#define GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO (1 << 6) +#define GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS (1 << 5) +#define GEN12_OAG_OA_DEBUG_DISABLE_GO_1_0_REPORTS (1 << 2) +#define GEN12_OAG_OA_DEBUG_DISABLE_CTX_SWITCH_REPORTS (1 << 1) + +#define GEN12_OAG_OASTATUS _MMIO(0xdafc) +#define GEN12_OAG_OASTATUS_COUNTER_OVERFLOW (1 << 2) +#define GEN12_OAG_OASTATUS_BUFFER_OVERFLOW (1 << 1) +#define GEN12_OAG_OASTATUS_REPORT_LOST (1 << 0) + /* * Flexible, Aggregate EU Counter Registers. * Note: these aren't contiguous @@ -944,6 +983,26 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define OAREPORTTRIG8_NOA_SELECT_6_SHIFT 24 #define OAREPORTTRIG8_NOA_SELECT_7_SHIFT 28 +/* Same layout as OASTARTTRIGX */ +#define GEN12_OAG_OASTARTTRIG1 _MMIO(0xd900) +#define GEN12_OAG_OASTARTTRIG2 _MMIO(0xd904) +#define GEN12_OAG_OASTARTTRIG3 _MMIO(0xd908) +#define GEN12_OAG_OASTARTTRIG4 _MMIO(0xd90c) +#define GEN12_OAG_OASTARTTRIG5 _MMIO(0xd910) +#define GEN12_OAG_OASTARTTRIG6 _MMIO(0xd914) +#define GEN12_OAG_OASTARTTRIG7 _MMIO(0xd918) +#define GEN12_OAG_OASTARTTRIG8 _MMIO(0xd91c) + +/* Same layout as OAREPORTTRIGX */ +#define GEN12_OAG_OAREPORTTRIG1 _MMIO(0xd920) +#define GEN12_OAG_OAREPORTTRIG2 _MMIO(0xd924) +#define GEN12_OAG_OAREPORTTRIG3 _MMIO(0xd928) +#define GEN12_OAG_OAREPORTTRIG4 _MMIO(0xd92c) +#define GEN12_OAG_OAREPORTTRIG5 _MMIO(0xd930) +#define GEN12_OAG_OAREPORTTRIG6 _MMIO(0xd934) +#define GEN12_OAG_OAREPORTTRIG7 _MMIO(0xd938) +#define GEN12_OAG_OAREPORTTRIG8 _MMIO(0xd93c) + /* CECX_0 */ #define OACEC_COMPARE_LESS_OR_EQUAL 6 #define OACEC_COMPARE_NOT_EQUAL 5 @@ -960,6 +1019,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define OACEC_SELECT_PREV (1 << 19) #define OACEC_SELECT_BOOLEAN (2 << 19) +/* 11-bit array 0: pass-through, 1: negated */ +#define GEN12_OASCEC_NEGATE_MASK 0x7ff +#define GEN12_OASCEC_NEGATE_SHIFT 21 + /* CECX_1 */ #define OACEC_MASK_MASK 0xffff #define OACEC_CONSIDERATIONS_MASK 0xffff @@ -982,6 +1045,42 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define OACEC7_0 _MMIO(0x27a8) #define OACEC7_1 _MMIO(0x27ac) +/* Same layout as CECX_Y */ +#define GEN12_OAG_CEC0_0 _MMIO(0xd940) +#define GEN12_OAG_CEC0_1 _MMIO(0xd944) +#define GEN12_OAG_CEC1_0 _MMIO(0xd948) +#define GEN12_OAG_CEC1_1 _MMIO(0xd94c) +#define GEN12_OAG_CEC2_0 _MMIO(0xd950) +#define GEN12_OAG_CEC2_1 _MMIO(0xd954) +#define GEN12_OAG_CEC3_0 _MMIO(0xd958) +#define GEN12_OAG_CEC3_1 _MMIO(0xd95c) +#define GEN12_OAG_CEC4_0 _MMIO(0xd960) +#define GEN12_OAG_CEC4_1 _MMIO(0xd964) +#define GEN12_OAG_CEC5_0 _MMIO(0xd968) +#define GEN12_OAG_CEC5_1 _MMIO(0xd96c) +#define GEN12_OAG_CEC6_0 _MMIO(0xd970) +#define GEN12_OAG_CEC6_1 _MMIO(0xd974) +#define GEN12_OAG_CEC7_0 _MMIO(0xd978) +#define GEN12_OAG_CEC7_1 _MMIO(0xd97c) + +/* Same layout as CECX_Y + negate 11-bit array */ +#define GEN12_OAG_SCEC0_0 _MMIO(0xdc00) +#define GEN12_OAG_SCEC0_1 _MMIO(0xdc04) +#define GEN12_OAG_SCEC1_0 _MMIO(0xdc08) +#define GEN12_OAG_SCEC1_1 _MMIO(0xdc0c) +#define GEN12_OAG_SCEC2_0 _MMIO(0xdc10) +#define GEN12_OAG_SCEC2_1 _MMIO(0xdc14) +#define GEN12_OAG_SCEC3_0 _MMIO(0xdc18) +#define GEN12_OAG_SCEC3_1 _MMIO(0xdc1c) +#define GEN12_OAG_SCEC4_0 _MMIO(0xdc20) +#define GEN12_OAG_SCEC4_1 _MMIO(0xdc24) +#define GEN12_OAG_SCEC5_0 _MMIO(0xdc28) +#define GEN12_OAG_SCEC5_1 _MMIO(0xdc2c) +#define GEN12_OAG_SCEC6_0 _MMIO(0xdc30) +#define GEN12_OAG_SCEC6_1 _MMIO(0xdc34) +#define GEN12_OAG_SCEC7_0 _MMIO(0xdc38) +#define GEN12_OAG_SCEC7_1 _MMIO(0xdc3c) + /* OA perf counters */ #define OA_PERFCNT1_LO _MMIO(0x91B8) #define OA_PERFCNT1_HI _MMIO(0x91BC) @@ -1062,6 +1161,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define MICRO_BP3_COUNT_STATUS23 _MMIO(0x9838) #define MICRO_BP_FIRED_ARMED _MMIO(0x983C) +#define GEN12_OAA_DBG_REG _MMIO(0xdc44) +#define GEN12_OAG_OA_PESS _MMIO(0x2b2c) +#define GEN12_OAG_SPCTR_CNF _MMIO(0xdc40) + #define GDT_CHICKEN_BITS _MMIO(0x9840) #define GT_NOA_ENABLE 0x00000080 diff --git a/drivers/gpu/drm/i915/oa/i915_oa_tgl.c b/drivers/gpu/drm/i915/oa/i915_oa_tgl.c new file mode 100644 index 000000000000..d803407a799e --- /dev/null +++ b/drivers/gpu/drm/i915/oa/i915_oa_tgl.c @@ -0,0 +1,113 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2018 Intel Corporation + * + * Autogenerated file by GPU Top : https://github.com/rib/gputop + * DO NOT EDIT manually! + */ + +#include + +#include "i915_drv.h" +#include "i915_oa_tgl.h" + +static const struct i915_oa_reg b_counter_config_test_oa[] = { + { _MMIO(0xd920), 0x00000000 }, + { _MMIO(0xd900), 0x00000000 }, + { _MMIO(0xd904), 0xf0800000 }, + { _MMIO(0xd910), 0x00000000 }, + { _MMIO(0xd914), 0xf0800000 }, + { _MMIO(0xdc40), 0x00ff0000 }, + { _MMIO(0xd940), 0x00000004 }, + { _MMIO(0xd944), 0x0000ffff }, + { _MMIO(0xdc00), 0x00000004 }, + { _MMIO(0xdc04), 0x0000ffff }, + { _MMIO(0xd948), 0x00000003 }, + { _MMIO(0xd94c), 0x0000ffff }, + { _MMIO(0xdc08), 0x00000003 }, + { _MMIO(0xdc0c), 0x0000ffff }, + { _MMIO(0xd950), 0x00000007 }, + { _MMIO(0xd954), 0x0000ffff }, + { _MMIO(0xdc10), 0x00000007 }, + { _MMIO(0xdc14), 0x0000ffff }, + { _MMIO(0xd958), 0x00100002 }, + { _MMIO(0xd95c), 0x0000fff7 }, + { _MMIO(0xdc18), 0x00100002 }, + { _MMIO(0xdc1c), 0x0000fff7 }, + { _MMIO(0xd960), 0x00100002 }, + { _MMIO(0xd964), 0x0000ffcf }, + { _MMIO(0xdc20), 0x00100002 }, + { _MMIO(0xdc24), 0x0000ffcf }, + { _MMIO(0xd968), 0x00100082 }, + { _MMIO(0xd96c), 0x0000ffef }, + { _MMIO(0xdc28), 0x00100082 }, + { _MMIO(0xdc2c), 0x0000ffef }, + { _MMIO(0xd970), 0x001000c2 }, + { _MMIO(0xd974), 0x0000ffe7 }, + { _MMIO(0xdc30), 0x001000c2 }, + { _MMIO(0xdc34), 0x0000ffe7 }, + { _MMIO(0xd978), 0x00100001 }, + { _MMIO(0xd97c), 0x0000ffe7 }, + { _MMIO(0xdc38), 0x00100001 }, + { _MMIO(0xdc3c), 0x0000ffe7 }, + { _MMIO(0x2b2c), 0x00000000 }, + { _MMIO(0xd920), 0x00000000 }, + { _MMIO(0xd920), 0x00000000 }, + { _MMIO(0xd920), 0x00000000 }, + { _MMIO(0xd920), 0x00000000 }, +}; + +static const struct i915_oa_reg flex_eu_config_test_oa[] = { +}; + +static const struct i915_oa_reg mux_config_test_oa[] = { + { _MMIO(0xd04), 0x00000200 }, + { _MMIO(0x9840), 0x00000000 }, + { _MMIO(0x9884), 0x00000003 }, + { _MMIO(0x9888), 0x49110000 }, + { _MMIO(0x9888), 0x5d100400 }, + { _MMIO(0x9888), 0x1d1103a3 }, + { _MMIO(0x9888), 0x01110000 }, + { _MMIO(0x9888), 0x61110000 }, + { _MMIO(0x9888), 0x17100000 }, + { _MMIO(0x9888), 0x55100000 }, + { _MMIO(0x9888), 0x31100000 }, + { _MMIO(0x9884), 0x00000003 }, + { _MMIO(0x9888), 0x65100002 }, + { _MMIO(0x9884), 0x00000000 }, + { _MMIO(0x9888), 0x42000001 }, +}; + +static ssize_t +show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "1\n"); +} + +void +i915_perf_load_test_config_tgl(struct drm_i915_private *dev_priv) +{ + strlcpy(dev_priv->perf.test_config.uuid, + "5338e21d-f79c-49c2-88d6-47e5b8589291", + sizeof(dev_priv->perf.test_config.uuid)); + dev_priv->perf.test_config.id = 1; + + dev_priv->perf.test_config.mux_regs = mux_config_test_oa; + dev_priv->perf.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa); + + dev_priv->perf.test_config.b_counter_regs = b_counter_config_test_oa; + dev_priv->perf.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa); + + dev_priv->perf.test_config.flex_regs = flex_eu_config_test_oa; + dev_priv->perf.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa); + + dev_priv->perf.test_config.sysfs_metric.name = "5338e21d-f79c-49c2-88d6-47e5b8589291"; + dev_priv->perf.test_config.sysfs_metric.attrs = dev_priv->perf.test_config.attrs; + + dev_priv->perf.test_config.attrs[0] = &dev_priv->perf.test_config.sysfs_metric_id.attr; + + dev_priv->perf.test_config.sysfs_metric_id.attr.name = "id"; + dev_priv->perf.test_config.sysfs_metric_id.attr.mode = 0444; + dev_priv->perf.test_config.sysfs_metric_id.show = show_test_oa_id; +} diff --git a/drivers/gpu/drm/i915/oa/i915_oa_tgl.h b/drivers/gpu/drm/i915/oa/i915_oa_tgl.h new file mode 100644 index 000000000000..64aa47ce2e55 --- /dev/null +++ b/drivers/gpu/drm/i915/oa/i915_oa_tgl.h @@ -0,0 +1,17 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2018 Intel Corporation + * + * Autogenerated file by GPU Top : https://github.com/rib/gputop + * DO NOT EDIT manually! + */ + +#ifndef __I915_OA_TGL_H__ +#define __I915_OA_TGL_H__ + +struct drm_i915_private; + +extern void i915_perf_load_test_config_tgl(struct drm_i915_private *dev_priv); + +#endif