From patchwork Wed May 31 12:33:54 2017
X-Patchwork-Submitter: Lionel Landwerlin
X-Patchwork-Id: 9756793
From: Lionel Landwerlin
To: intel-gfx@lists.freedesktop.org
Date: Wed, 31 May 2017 13:33:54 +0100
Message-Id: <20170531123355.6363-14-lionel.g.landwerlin@intel.com>
In-Reply-To: <20170531123355.6363-1-lionel.g.landwerlin@intel.com>
References: <20170424071751.2416-1-lionel.g.landwerlin@intel.com>
 <20170531123355.6363-1-lionel.g.landwerlin@intel.com>
Subject: [Intel-gfx] [PATCH v15 13/14] drm/i915/perf: reprogram NOA muxes at
 the beginning of each workload

Dynamic slice/subslice shutdown will effectively lose the NOA
configuration uploaded to the slices/subslices. When i915 perf is in
use, we therefore need to reprogram it.

v2: Make sure we handle configs with more register writes than the max
    MI_LOAD_REGISTER_IMM can do (Lionel)

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h         |  2 +
 drivers/gpu/drm/i915/i915_perf.c        | 77 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  3 ++
 3 files changed, 82 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c3acb0e9eb5d..499b2f9aa4be 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2400,6 +2400,7 @@ struct drm_i915_private {
 			const struct i915_oa_reg *mux_regs[6];
 			int mux_regs_lens[6];
 			int n_mux_configs;
+			int total_n_mux_regs;
 
 			const struct i915_oa_reg *b_counter_regs;
 			int b_counter_regs_len;
@@ -3535,6 +3536,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct i915_gem_context *ctx,
 			    uint32_t *reg_state);
+int i915_oa_emit_noa_config_locked(struct drm_i915_gem_request *req);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c281847eb56b..4229c74baa22 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1438,11 +1438,24 @@ static void config_oa_regs(struct drm_i915_private *dev_priv,
 	}
 }
 
+static void count_total_mux_regs(struct drm_i915_private *dev_priv)
+{
+	int i;
+
+	dev_priv->perf.oa.total_n_mux_regs = 0;
+	for (i = 0; i < dev_priv->perf.oa.n_mux_configs; i++) {
+		dev_priv->perf.oa.total_n_mux_regs +=
+			dev_priv->perf.oa.mux_regs_lens[i];
+	}
+}
+
 static int hsw_enable_metric_set(struct drm_i915_private *dev_priv)
 {
 	int ret = i915_oa_select_metric_set_hsw(dev_priv);
 	int i;
 
+	count_total_mux_regs(dev_priv);
+
 	if (ret)
 		return ret;
 
@@ -1756,6 +1769,8 @@ static int gen8_enable_metric_set(struct drm_i915_private *dev_priv)
 	int ret = dev_priv->perf.oa.ops.select_metric_set(dev_priv);
 	int i;
 
+	count_total_mux_regs(dev_priv);
+
 	if (ret)
 		return ret;
 
@@ -2094,6 +2109,68 @@ void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 	gen8_update_reg_state_unlocked(ctx, reg_state);
 }
 
+int i915_oa_emit_noa_config_locked(struct drm_i915_gem_request *req)
+{
+	struct drm_i915_private *dev_priv = req->i915;
+	int max_loads = 125;
+	int n_load, n_registers, n_loaded_register;
+	int i, j;
+	u32 *cs;
+
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
+	if (!IS_GEN(dev_priv, 8, 9))
+		return 0;
+
+	/* Perf not supported or not enabled. */
+	if (!dev_priv->perf.initialized ||
+	    !dev_priv->perf.oa.exclusive_stream)
+		return 0;
+
+	n_registers = dev_priv->perf.oa.total_n_mux_regs;
+	n_load = (n_registers / max_loads) +
+		 ((n_registers % max_loads) != 0);
+
+	cs = intel_ring_begin(req,
+			      3 * 2 + /* MI_LOAD_REGISTER_IMM for chicken registers */
+			      n_load + /* MI_LOAD_REGISTER_IMM for mux registers */
+			      n_registers * 2 + /* offset & value for mux registers */
+			      1 /* NOOP */);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(GDT_CHICKEN_BITS);
+	*cs++ = 0xA0;
+
+	n_loaded_register = 0;
+	for (i = 0; i < dev_priv->perf.oa.n_mux_configs; i++) {
+		const struct i915_oa_reg *mux_regs =
+			dev_priv->perf.oa.mux_regs[i];
+		const int mux_regs_len = dev_priv->perf.oa.mux_regs_lens[i];
+
+		for (j = 0; j < mux_regs_len; j++) {
+			if ((n_loaded_register % max_loads) == 0) {
+				n_load = min(n_registers - n_loaded_register, max_loads);
+				*cs++ = MI_LOAD_REGISTER_IMM(n_load);
+			}
+
+			*cs++ = i915_mmio_reg_offset(mux_regs[j].addr);
+			*cs++ = mux_regs[j].value;
+			n_loaded_register++;
+		}
+	}
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(GDT_CHICKEN_BITS);
+	*cs++ = 0x80;
+
+	*cs++ = MI_NOOP;
+	intel_ring_advance(req, cs);
+
+	return 0;
+}
+
 /**
  * i915_perf_read_locked - &i915_perf_stream_ops->read with error normalisation
  * @stream: An i915 perf stream
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index acd1da9b62a3..67aaaebb194b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1874,6 +1874,9 @@ gen8_emit_bb_start(struct drm_i915_gem_request *req,
 		!(dispatch_flags & I915_DISPATCH_SECURE);
 	u32 *cs;
 
+	/* Emit NOA config */
+	i915_oa_emit_noa_config_locked(req);
+
 	cs = intel_ring_begin(req, 4);
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
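
The dword budget handed to intel_ring_begin() has to match what the emit loop in i915_oa_emit_noa_config_locked() actually writes, including one MI_LOAD_REGISTER_IMM header per chunk of at most 125 registers. The standalone sketch below is only an illustration of that arithmetic under stated assumptions: it uses no kernel APIs, the helper names (lri_header_count, noa_config_dwords, count_emitted_dwords) are hypothetical and not part of the patch, and the limit of 125 is simply the max_loads value taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: same per-command limit as max_loads in the patch. */
#define MAX_LOADS 125

/*
 * Number of MI_LOAD_REGISTER_IMM headers needed to load n_registers
 * registers, i.e. the ceiling of n_registers / MAX_LOADS.
 */
static int lri_header_count(int n_registers)
{
	return (n_registers + MAX_LOADS - 1) / MAX_LOADS;
}

/*
 * Dwords to reserve: two single-register loads for the chicken bits
 * (3 dwords each), one header per chunk, an (offset, value) pair per
 * mux register, and one trailing NOOP.
 */
static int noa_config_dwords(int n_registers)
{
	return 3 * 2 + lri_header_count(n_registers) + n_registers * 2 + 1;
}

/*
 * Walks the same chunking loop as the patch, but counts dwords instead
 * of writing them. lens[] plays the role of mux_regs_lens[].
 */
static int count_emitted_dwords(const int *lens, int n_configs)
{
	int n_loaded = 0;
	int dwords = 0;
	int i, j;

	assert(n_configs >= 0);
	assert(lens != NULL || n_configs == 0);

	dwords += 3;			/* chicken bits: header, offset, value */

	for (i = 0; i < n_configs; i++) {
		for (j = 0; j < lens[i]; j++) {
			if ((n_loaded % MAX_LOADS) == 0)
				dwords += 1;	/* new chunk header */
			dwords += 2;		/* offset and value */
			n_loaded++;
		}
	}

	dwords += 3;			/* chicken bits restored */
	dwords += 1;			/* NOOP */

	return dwords;
}
```

Comparing noa_config_dwords() on the summed lengths against count_emitted_dwords() on the per-config lengths is a cheap way to check that the reservation and the loop agree, for example with two configs of 100 and 26 registers, which cross the 125-register boundary in the middle of the second config.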