From patchwork Wed Oct 10 16:55:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lionel Landwerlin X-Patchwork-Id: 10634845 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D4DA14BD for ; Wed, 10 Oct 2018 16:55:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 22E092A005 for ; Wed, 10 Oct 2018 16:55:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 173EB2A08B; Wed, 10 Oct 2018 16:55:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B5F092A005 for ; Wed, 10 Oct 2018 16:55:41 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CA666892D3; Wed, 10 Oct 2018 16:55:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 23C6B8929B for ; Wed, 10 Oct 2018 16:55:38 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Oct 2018 09:55:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,365,1534834800"; d="scan'208";a="270206345" Received: from delly.ld.intel.com ([10.103.239.197]) by fmsmga005.fm.intel.com with ESMTP; 10 Oct 2018 09:55:36 -0700 From: Lionel Landwerlin To: intel-gfx@lists.freedesktop.org Date: Wed, 10 Oct 2018 17:55:30 +0100 Message-Id: <20181010165533.23345-2-lionel.g.landwerlin@intel.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> References: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/4] drm/i915/perf: remove redundant oa buffer initialization X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We initialize the OA buffer everytime we enable the OA unit (first call in gen[78]_oa_enable), so we don't need to initialize when preparing the metric set. Signed-off-by: Lionel Landwerlin Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_drv.h | 17 ----------------- drivers/gpu/drm/i915/i915_perf.c | 6 +----- 2 files changed, 1 insertion(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 63ce0da4e723..eef7c811bd8f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1529,23 +1529,6 @@ struct i915_oa_ops { */ bool (*is_valid_flex_reg)(struct drm_i915_private *dev_priv, u32 addr); - /** - * @init_oa_buffer: Resets the head and tail pointers of the - * circular buffer for periodic OA reports. - * - * Called when first opening a stream for OA metrics, but also may be - * called in response to an OA buffer overflow or other error - * condition. - * - * Note it may be necessary to clear the full OA buffer here as part of - * maintaining the invariable that new reports must be written to - * zeroed memory for us to be able to reliable detect if an expected - * report has not yet landed in memory. (At least on Haswell the OA - * buffer tail pointer is not synchronized with reports being visible - * to the CPU) - */ - void (*init_oa_buffer)(struct drm_i915_private *dev_priv); - /** * @enable_metric_set: Selects and applies any MUX configuration to set * up the Boolean and Custom (B/C) counters that are part of the diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 30911efd2cf7..14f7d03aabcf 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1530,8 +1530,6 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv) goto err_unpin; } - dev_priv->perf.oa.ops.init_oa_buffer(dev_priv); - DRM_DEBUG_DRIVER("OA Buffer initialized, gtt offset = 0x%x, vaddr = %p\n", i915_ggtt_offset(dev_priv->perf.oa.oa_buffer.vma), dev_priv->perf.oa.oa_buffer.vaddr); @@ -2000,7 +1998,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, return -EINVAL; } - if (!dev_priv->perf.oa.ops.init_oa_buffer) { + if (!dev_priv->perf.oa.ops.enable_metric_set) { DRM_DEBUG("OA unit not supported\n"); return -ENODEV; } @@ -3389,7 +3387,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv) dev_priv->perf.oa.ops.is_valid_mux_reg = hsw_is_valid_mux_addr; dev_priv->perf.oa.ops.is_valid_flex_reg = NULL; - dev_priv->perf.oa.ops.init_oa_buffer = gen7_init_oa_buffer; dev_priv->perf.oa.ops.enable_metric_set = hsw_enable_metric_set; dev_priv->perf.oa.ops.disable_metric_set = hsw_disable_metric_set; dev_priv->perf.oa.ops.oa_enable = gen7_oa_enable; @@ -3408,7 +3405,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv) */ dev_priv->perf.oa.oa_formats = gen8_plus_oa_formats; - dev_priv->perf.oa.ops.init_oa_buffer = gen8_init_oa_buffer; dev_priv->perf.oa.ops.oa_enable = gen8_oa_enable; dev_priv->perf.oa.ops.oa_disable = gen8_oa_disable; dev_priv->perf.oa.ops.read = gen8_oa_read; From patchwork Wed Oct 10 16:55:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Lionel Landwerlin X-Patchwork-Id: 10634849 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 63E35933 for ; Wed, 10 Oct 2018 16:55:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5B2182A005 for ; Wed, 10 Oct 2018 16:55:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4F6932A08B; Wed, 10 Oct 2018 16:55:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D15682A005 for ; Wed, 10 Oct 2018 16:55:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1AA896E31E; Wed, 10 Oct 2018 16:55:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 378778982D for ; Wed, 10 Oct 2018 16:55:39 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Oct 2018 09:55:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,365,1534834800"; d="scan'208";a="270206353" Received: from delly.ld.intel.com ([10.103.239.197]) by fmsmga005.fm.intel.com with ESMTP; 10 Oct 2018 09:55:37 -0700 From: Lionel Landwerlin To: intel-gfx@lists.freedesktop.org Date: Wed, 10 Oct 2018 17:55:31 +0100 Message-Id: <20181010165533.23345-3-lionel.g.landwerlin@intel.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> References: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/4] drm/i915/perf: pass stream to vfuncs when possible X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We want to use some of the properties of the perf stream to program the hardware in a later commit. v2: Pass only perf stream as argument (Matthew) Signed-off-by: Lionel Landwerlin Reviewed-by: Matthew Auld (v1) Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_drv.h | 7 +++--- drivers/gpu/drm/i915/i915_perf.c | 43 +++++++++++++++++++------------- 2 files changed, 28 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index eef7c811bd8f..65eaac2d7e3c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1535,8 +1535,7 @@ struct i915_oa_ops { * counter reports being sampled. May apply system constraints such as * disabling EU clock gating as required. */ - int (*enable_metric_set)(struct drm_i915_private *dev_priv, - const struct i915_oa_config *oa_config); + int (*enable_metric_set)(struct i915_perf_stream *stream); /** * @disable_metric_set: Remove system constraints associated with using @@ -1547,12 +1546,12 @@ struct i915_oa_ops { /** * @oa_enable: Enable periodic sampling */ - void (*oa_enable)(struct drm_i915_private *dev_priv); + void (*oa_enable)(struct i915_perf_stream *stream); /** * @oa_disable: Disable periodic sampling */ - void (*oa_disable)(struct drm_i915_private *dev_priv); + void (*oa_disable)(struct i915_perf_stream *stream); /** * @read: Copy data from the circular OA buffer into a given userspace diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 14f7d03aabcf..88f3f9b6a353 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -890,8 +890,8 @@ static int gen8_oa_read(struct i915_perf_stream *stream, DRM_DEBUG("OA buffer overflow (exponent = %d): force restart\n", dev_priv->perf.oa.period_exponent); - dev_priv->perf.oa.ops.oa_disable(dev_priv); - dev_priv->perf.oa.ops.oa_enable(dev_priv); + dev_priv->perf.oa.ops.oa_disable(stream); + dev_priv->perf.oa.ops.oa_enable(stream); /* * Note: .oa_enable() is expected to re-init the oabuffer and @@ -1114,8 +1114,8 @@ static int gen7_oa_read(struct i915_perf_stream *stream, DRM_DEBUG("OA buffer overflow (exponent = %d): force restart\n", dev_priv->perf.oa.period_exponent); - dev_priv->perf.oa.ops.oa_disable(dev_priv); - dev_priv->perf.oa.ops.oa_enable(dev_priv); + dev_priv->perf.oa.ops.oa_disable(stream); + dev_priv->perf.oa.ops.oa_enable(stream); oastatus1 = I915_READ(GEN7_OASTATUS1); } @@ -1563,9 +1563,11 @@ static void config_oa_regs(struct drm_i915_private *dev_priv, } } -static int hsw_enable_metric_set(struct drm_i915_private *dev_priv, - const struct i915_oa_config *oa_config) +static int hsw_enable_metric_set(struct i915_perf_stream *stream) { + struct drm_i915_private *dev_priv = stream->dev_priv; + const struct i915_oa_config *oa_config = stream->oa_config; + /* PRM: * * OA unit is using “crclk” for its functionality. When trunk @@ -1767,9 +1769,10 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv, return 0; } -static int gen8_enable_metric_set(struct drm_i915_private *dev_priv, - const struct i915_oa_config *oa_config) +static int gen8_enable_metric_set(struct i915_perf_stream *stream) { + struct drm_i915_private *dev_priv = stream->dev_priv; + const struct i915_oa_config *oa_config = stream->oa_config; int ret; /* @@ -1837,10 +1840,10 @@ static void gen10_disable_metric_set(struct drm_i915_private *dev_priv) I915_READ(RPM_CONFIG1) & ~GEN10_GT_NOA_ENABLE); } -static void gen7_oa_enable(struct drm_i915_private *dev_priv) +static void gen7_oa_enable(struct i915_perf_stream *stream) { - struct i915_gem_context *ctx = - dev_priv->perf.oa.exclusive_stream->ctx; + struct drm_i915_private *dev_priv = stream->dev_priv; + struct i915_gem_context *ctx = stream->ctx; u32 ctx_id = dev_priv->perf.oa.specific_ctx_id; bool periodic = dev_priv->perf.oa.periodic; u32 period_exponent = dev_priv->perf.oa.period_exponent; @@ -1867,8 +1870,9 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv) GEN7_OACONTROL_ENABLE); } -static void gen8_oa_enable(struct drm_i915_private *dev_priv) +static void gen8_oa_enable(struct i915_perf_stream *stream) { + struct drm_i915_private *dev_priv = stream->dev_priv; u32 report_format = dev_priv->perf.oa.oa_buffer.format; /* @@ -1905,7 +1909,7 @@ static void i915_oa_stream_enable(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; - dev_priv->perf.oa.ops.oa_enable(dev_priv); + dev_priv->perf.oa.ops.oa_enable(stream); if (dev_priv->perf.oa.periodic) hrtimer_start(&dev_priv->perf.oa.poll_check_timer, @@ -1913,8 +1917,10 @@ static void i915_oa_stream_enable(struct i915_perf_stream *stream) HRTIMER_MODE_REL_PINNED); } -static void gen7_oa_disable(struct drm_i915_private *dev_priv) +static void gen7_oa_disable(struct i915_perf_stream *stream) { + struct drm_i915_private *dev_priv = stream->dev_priv; + I915_WRITE(GEN7_OACONTROL, 0); if (intel_wait_for_register(dev_priv, GEN7_OACONTROL, GEN7_OACONTROL_ENABLE, 0, @@ -1922,8 +1928,10 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv) DRM_ERROR("wait for OA to be disabled timed out\n"); } -static void gen8_oa_disable(struct drm_i915_private *dev_priv) +static void gen8_oa_disable(struct i915_perf_stream *stream) { + struct drm_i915_private *dev_priv = stream->dev_priv; + I915_WRITE(GEN8_OACONTROL, 0); if (intel_wait_for_register(dev_priv, GEN8_OACONTROL, GEN8_OA_COUNTER_ENABLE, 0, @@ -1943,7 +1951,7 @@ static void i915_oa_stream_disable(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; - dev_priv->perf.oa.ops.oa_disable(dev_priv); + dev_priv->perf.oa.ops.oa_disable(stream); if (dev_priv->perf.oa.periodic) hrtimer_cancel(&dev_priv->perf.oa.poll_check_timer); @@ -2092,8 +2100,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (ret) goto err_lock; - ret = dev_priv->perf.oa.ops.enable_metric_set(dev_priv, - stream->oa_config); + ret = dev_priv->perf.oa.ops.enable_metric_set(stream); if (ret) { DRM_DEBUG("Unable to enable metric set\n"); goto err_enable; From patchwork Wed Oct 10 16:55:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lionel Landwerlin X-Patchwork-Id: 10634851 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C178A46E4 for ; Wed, 10 Oct 2018 16:55:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B57DA2A005 for ; Wed, 10 Oct 2018 16:55:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A9BF22A08B; Wed, 10 Oct 2018 16:55:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6E33A2A005 for ; Wed, 10 Oct 2018 16:55:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id ECE1E6E32B; Wed, 10 Oct 2018 16:55:44 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 38B0A6E31E for ; Wed, 10 Oct 2018 16:55:40 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Oct 2018 09:55:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,365,1534834800"; d="scan'208";a="270206360" Received: from delly.ld.intel.com ([10.103.239.197]) by fmsmga005.fm.intel.com with ESMTP; 10 Oct 2018 09:55:38 -0700 From: Lionel Landwerlin To: intel-gfx@lists.freedesktop.org Date: Wed, 10 Oct 2018 17:55:32 +0100 Message-Id: <20181010165533.23345-4-lionel.g.landwerlin@intel.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> References: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/4] drm/i915/perf: do not warn when OA buffer is already allocated X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP If 2 processes race to open the perf stream, it's possible that one of them will see that OA buffer has already been allocated, while a previous process is still finishing to reprogram the hardware (on gen8+). The opening sequence has been reworked a few times and we probably lost track of the order in which things are supposed to happen. Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 88f3f9b6a353..a648ded97969 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1494,13 +1494,15 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv) struct i915_vma *vma; int ret; - if (WARN_ON(dev_priv->perf.oa.oa_buffer.vma)) - return -ENODEV; - ret = i915_mutex_lock_interruptible(&dev_priv->drm); if (ret) return ret; + if (dev_priv->perf.oa.oa_buffer.vma) { + ret = -EBUSY; + goto unlock; + } + BUILD_BUG_ON_NOT_POWER_OF_2(OA_BUFFER_SIZE); BUILD_BUG_ON(OA_BUFFER_SIZE < SZ_128K || OA_BUFFER_SIZE > SZ_16M); From patchwork Wed Oct 10 16:55:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lionel Landwerlin X-Patchwork-Id: 10634853 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6CBB69B1 for ; Wed, 10 Oct 2018 16:55:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E3EF2A005 for ; Wed, 10 Oct 2018 16:55:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 72E022A08B; Wed, 10 Oct 2018 16:55:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id ADACB2A005 for ; Wed, 10 Oct 2018 16:55:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6B4F06E337; Wed, 10 Oct 2018 16:55:45 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4182F6E32B for ; Wed, 10 Oct 2018 16:55:41 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Oct 2018 09:55:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,365,1534834800"; d="scan'208";a="270206374" Received: from delly.ld.intel.com ([10.103.239.197]) by fmsmga005.fm.intel.com with ESMTP; 10 Oct 2018 09:55:39 -0700 From: Lionel Landwerlin To: intel-gfx@lists.freedesktop.org Date: Wed, 10 Oct 2018 17:55:33 +0100 Message-Id: <20181010165533.23345-5-lionel.g.landwerlin@intel.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> References: <20181010165533.23345-1-lionel.g.landwerlin@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/4] drm/i915/perf: add a parameter to control the size of OA buffer X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP The way our hardware is designed doesn't seem to let us use the MI_RECORD_PERF_COUNT command without setting up a circular buffer. In the case where the user didn't request OA reports to be available through the i915 perf stream, we can set the OA buffer to the minimum size to avoid consuming memory which won't be used by the driver. Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 111 +++++++++++++++++++++++-------- drivers/gpu/drm/i915/i915_reg.h | 2 + include/uapi/drm/i915_drm.h | 8 +++ 4 files changed, 95 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 65eaac2d7e3c..a0069add9d39 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2053,6 +2053,8 @@ struct drm_i915_private { u32 last_ctx_id; int format; int format_size; + u32 size; + int size_exponent; /** * Locks reads and writes to all head/tail state diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index a648ded97969..1eaef711160a 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -212,13 +212,7 @@ #include "i915_oa_icl.h" #include "intel_lrc_reg.h" -/* HW requires this to be a power of two, between 128k and 16M, though driver - * is currently generally designed assuming the largest 16M size is used such - * that the overflow cases are unlikely in normal operation. - */ -#define OA_BUFFER_SIZE SZ_16M - -#define OA_TAKEN(tail, head) ((tail - head) & (OA_BUFFER_SIZE - 1)) +#define OA_TAKEN(tail, head) ((tail - head) & (dev_priv->perf.oa.oa_buffer.size - 1)) /** * DOC: OA Tail Pointer Race @@ -361,6 +355,8 @@ struct perf_open_properties { int oa_format; bool oa_periodic; int oa_period_exponent; + u32 oa_buffer_size; + u32 oa_buffer_size_exponent; }; static void free_oa_config(struct drm_i915_private *dev_priv, @@ -523,7 +519,7 @@ static bool oa_buffer_check_unlocked(struct drm_i915_private *dev_priv) * could put the tail out of bounds... */ if (hw_tail >= gtt_offset && - hw_tail < (gtt_offset + OA_BUFFER_SIZE)) { + hw_tail < (gtt_offset + dev_priv->perf.oa.oa_buffer.size)) { dev_priv->perf.oa.oa_buffer.tails[!aged_idx].offset = aging_tail = hw_tail; dev_priv->perf.oa.oa_buffer.aging_timestamp = now; @@ -652,7 +648,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, int report_size = dev_priv->perf.oa.oa_buffer.format_size; u8 *oa_buf_base = dev_priv->perf.oa.oa_buffer.vaddr; u32 gtt_offset = i915_ggtt_offset(dev_priv->perf.oa.oa_buffer.vma); - u32 mask = (OA_BUFFER_SIZE - 1); + u32 mask = (dev_priv->perf.oa.oa_buffer.size - 1); size_t start_offset = *offset; unsigned long flags; unsigned int aged_tail_idx; @@ -692,8 +688,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * only be incremented by multiples of the report size (notably also * all a power of two). */ - if (WARN_ONCE(head > OA_BUFFER_SIZE || head % report_size || - tail > OA_BUFFER_SIZE || tail % report_size, + if (WARN_ONCE(head > dev_priv->perf.oa.oa_buffer.size || head % report_size || + tail > dev_priv->perf.oa.oa_buffer.size || tail % report_size, "Inconsistent OA buffer pointers: head = %u, tail = %u\n", head, tail)) return -EIO; @@ -716,7 +712,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * here would imply a driver bug that would result * in an overrun. */ - if (WARN_ON((OA_BUFFER_SIZE - head) < report_size)) { + if (WARN_ON((dev_priv->perf.oa.oa_buffer.size - head) < report_size)) { DRM_ERROR("Spurious OA head ptr: non-integral report offset\n"); break; } @@ -941,7 +937,7 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, int report_size = dev_priv->perf.oa.oa_buffer.format_size; u8 *oa_buf_base = dev_priv->perf.oa.oa_buffer.vaddr; u32 gtt_offset = i915_ggtt_offset(dev_priv->perf.oa.oa_buffer.vma); - u32 mask = (OA_BUFFER_SIZE - 1); + u32 mask = (dev_priv->perf.oa.oa_buffer.size - 1); size_t start_offset = *offset; unsigned long flags; unsigned int aged_tail_idx; @@ -978,8 +974,8 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, * only be incremented by multiples of the report size (notably also * all a power of two). */ - if (WARN_ONCE(head > OA_BUFFER_SIZE || head % report_size || - tail > OA_BUFFER_SIZE || tail % report_size, + if (WARN_ONCE(head > dev_priv->perf.oa.oa_buffer.size || head % report_size || + tail > dev_priv->perf.oa.oa_buffer.size || tail % report_size, "Inconsistent OA buffer pointers: head = %u, tail = %u\n", head, tail)) return -EIO; @@ -999,7 +995,7 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, * here would imply a driver bug that would result * in an overrun. */ - if (WARN_ON((OA_BUFFER_SIZE - head) < report_size)) { + if (WARN_ON((dev_priv->perf.oa.oa_buffer.size - head) < report_size)) { DRM_ERROR("Spurious OA head ptr: non-integral report offset\n"); break; } @@ -1396,7 +1392,9 @@ static void gen7_init_oa_buffer(struct drm_i915_private *dev_priv) I915_WRITE(GEN7_OABUFFER, gtt_offset); - I915_WRITE(GEN7_OASTATUS1, gtt_offset | OABUFFER_SIZE_16M); /* tail */ + I915_WRITE(GEN7_OASTATUS1, gtt_offset | + (dev_priv->perf.oa.oa_buffer.size_exponent << + GEN7_OASTATUS1_BUFFER_SIZE_SHIFT)); /* tail */ /* Mark that we need updated tail pointers to read from... */ dev_priv->perf.oa.oa_buffer.tails[0].offset = INVALID_TAIL_PTR; @@ -1421,7 +1419,8 @@ static void gen7_init_oa_buffer(struct drm_i915_private *dev_priv) * the assumption that new reports are being written to zeroed * memory... */ - memset(dev_priv->perf.oa.oa_buffer.vaddr, 0, OA_BUFFER_SIZE); + memset(dev_priv->perf.oa.oa_buffer.vaddr, 0, + dev_priv->perf.oa.oa_buffer.vma->size); /* Maybe make ->pollin per-stream state if we support multiple * concurrent streams in the future. @@ -1451,7 +1450,9 @@ static void gen8_init_oa_buffer(struct drm_i915_private *dev_priv) * bit." */ I915_WRITE(GEN8_OABUFFER, gtt_offset | - OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT); + (dev_priv->perf.oa.oa_buffer.size_exponent << + GEN8_OABUFFER_BUFFER_SIZE_SHIFT) | + GEN8_OABUFFER_MEM_SELECT_GGTT); I915_WRITE(GEN8_OATAILPTR, gtt_offset & GEN8_OATAILPTR_MASK); /* Mark that we need updated tail pointers to read from... */ @@ -1479,7 +1480,8 @@ static void gen8_init_oa_buffer(struct drm_i915_private *dev_priv) * the assumption that new reports are being written to zeroed * memory... */ - memset(dev_priv->perf.oa.oa_buffer.vaddr, 0, OA_BUFFER_SIZE); + memset(dev_priv->perf.oa.oa_buffer.vaddr, 0, + dev_priv->perf.oa.oa_buffer.vma->size); /* * Maybe make ->pollin per-stream state if we support multiple @@ -1488,7 +1490,8 @@ static void gen8_init_oa_buffer(struct drm_i915_private *dev_priv) dev_priv->perf.oa.pollin = false; } -static int alloc_oa_buffer(struct drm_i915_private *dev_priv) +static int alloc_oa_buffer(struct drm_i915_private *dev_priv, u32 size, + int size_exponent) { struct drm_i915_gem_object *bo; struct i915_vma *vma; @@ -1503,10 +1506,9 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv) goto unlock; } - BUILD_BUG_ON_NOT_POWER_OF_2(OA_BUFFER_SIZE); - BUILD_BUG_ON(OA_BUFFER_SIZE < SZ_128K || OA_BUFFER_SIZE > SZ_16M); + BUG_ON(size < SZ_128K || size > SZ_16M); - bo = i915_gem_object_create(dev_priv, OA_BUFFER_SIZE); + bo = i915_gem_object_create(dev_priv, size); if (IS_ERR(bo)) { DRM_ERROR("Failed to allocate OA buffer\n"); ret = PTR_ERR(bo); @@ -1518,12 +1520,14 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv) goto err_unref; /* PreHSW required 512K alignment, HSW requires 16M */ - vma = i915_gem_object_ggtt_pin(bo, NULL, 0, SZ_16M, 0); + vma = i915_gem_object_ggtt_pin(bo, NULL, 0, size, 0); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto err_unref; } dev_priv->perf.oa.oa_buffer.vma = vma; + dev_priv->perf.oa.oa_buffer.size = size; + dev_priv->perf.oa.oa_buffer.size_exponent = size_exponent; dev_priv->perf.oa.oa_buffer.vaddr = i915_gem_object_pin_map(bo, I915_MAP_WB); @@ -1532,9 +1536,10 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv) goto err_unpin; } - DRM_DEBUG_DRIVER("OA Buffer initialized, gtt offset = 0x%x, vaddr = %p\n", + DRM_DEBUG_DRIVER("OA Buffer initialized, gtt offset = 0x%x, vaddr = %p, size = %u\n", i915_ggtt_offset(dev_priv->perf.oa.oa_buffer.vma), - dev_priv->perf.oa.oa_buffer.vaddr); + dev_priv->perf.oa.oa_buffer.vaddr, + dev_priv->perf.oa.oa_buffer.size); goto unlock; @@ -2094,7 +2099,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, intel_runtime_pm_get(dev_priv); intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); - ret = alloc_oa_buffer(dev_priv); + ret = alloc_oa_buffer(dev_priv, props->oa_buffer_size, + props->oa_buffer_size_exponent); if (ret) goto err_oa_buf_alloc; @@ -2653,6 +2659,39 @@ static u64 oa_exponent_to_ns(struct drm_i915_private *dev_priv, int exponent) 1000ULL * INTEL_INFO(dev_priv)->cs_timestamp_frequency_khz); } +static int +select_oa_buffer_size(struct drm_i915_private *i915, + u64 requested_size, + u32 *selected_size, + int *selected_exponent) +{ + const u32 max_size = SZ_16M; + + /* + * When no size is specified, use the largest size supported + * by all generations. + */ + if (!requested_size) { + *selected_size = SZ_16M; + *selected_exponent = 7; + return 0; + } + + /* Start with the smallest OA buffer size. */ + *selected_size = 128 * 1024; + *selected_exponent = 0; + while (requested_size > *selected_size && + *selected_size < max_size) { + *selected_size *= 2; + *selected_exponent += 1; + } + + if (requested_size > *selected_size) + return -EINVAL; /* TODO: ENOMEM? ENODEV? */ + + return 0; +} + /** * read_properties_unlocked - validate + copy userspace stream open properties * @dev_priv: i915 device instance @@ -2780,6 +2819,13 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv, props->oa_periodic = true; props->oa_period_exponent = value; break; + case DRM_I915_PERF_PROP_OA_BUFFER_SIZE: + ret = select_oa_buffer_size(dev_priv, value, + &props->oa_buffer_size, + &props->oa_buffer_size_exponent); + if (ret) + return ret; + break; case DRM_I915_PERF_PROP_MAX: MISSING_CASE(id); return -EINVAL; @@ -2788,6 +2834,15 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv, uprop += 2; } + /* If no buffer size was requested, select the default one. */ + if (!props->oa_buffer_size) { + if (select_oa_buffer_size(dev_priv, 0, &props->oa_buffer_size, + &props->oa_buffer_size_exponent)) { + DRM_ERROR("Unable to select default OA buffer size\n"); + return -EINVAL; + } + } + return 0; } diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 3bbf21ea5c57..e9fc5e4f6d87 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -595,12 +595,14 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN8_OABUFFER_UDW _MMIO(0x23b4) #define GEN8_OABUFFER _MMIO(0x2b14) #define GEN8_OABUFFER_MEM_SELECT_GGTT (1 << 0) /* 0: PPGTT, 1: GGTT */ +#define GEN8_OABUFFER_BUFFER_SIZE_SHIFT 3 #define GEN7_OASTATUS1 _MMIO(0x2364) #define GEN7_OASTATUS1_TAIL_MASK 0xffffffc0 #define GEN7_OASTATUS1_COUNTER_OVERFLOW (1 << 2) #define GEN7_OASTATUS1_OABUFFER_OVERFLOW (1 << 1) #define GEN7_OASTATUS1_REPORT_LOST (1 << 0) +#define GEN7_OASTATUS1_BUFFER_SIZE_SHIFT 3 #define GEN7_OASTATUS2 _MMIO(0x2368) #define GEN7_OASTATUS2_HEAD_MASK 0xffffffc0 diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index f22c4ee82871..b51a5777f627 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1558,6 +1558,14 @@ enum drm_i915_perf_property_id { */ DRM_I915_PERF_PROP_OA_EXPONENT, + /** + * Specify a global OA buffer size to be allocated in bytes. + * The driver will allocate a HW supported size that is at + * least as large as specified by this property. Larger sizes + * than what the HW supports will fail. + */ + DRM_I915_PERF_PROP_OA_BUFFER_SIZE, + DRM_I915_PERF_PROP_MAX /* non-ABI */ };