From patchwork Fri Apr 22 11:34:02 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: sourab.gupta@intel.com
X-Patchwork-Id: 8911071
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Fri, 22 Apr 2016 17:04:02 +0530
Message-Id: <1461324845-25755-14-git-send-email-sourab.gupta@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1461324845-25755-1-git-send-email-sourab.gupta@intel.com>
References: <1461324845-25755-1-git-send-email-sourab.gupta@intel.com>
Cc: Daniel Vetter, Sourab Gupta, Deepak S
Subject:
 [Intel-gfx] [PATCH 13/16] drm/i915: Extract raw GPU timestamps from OA reports to forward in perf samples
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Intel graphics driver community testing & development
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"

From: Sourab Gupta

The OA reports contain only the least significant 32 bits of the GPU
timestamp. This patch retrieves the timestamp field from OA reports and
extends it to 64 bits, so that raw 64-bit GPU timestamps can be forwarded
in the perf samples.

Signed-off-by: Sourab Gupta
---
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/i915_perf.c | 44 ++++++++++++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_reg.h  |  4 ++++
 3 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdc7ad4..2ac07fb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2179,6 +2179,7 @@ struct drm_i915_private {
 			u32 ctx_flexeu0_off;
 			u32 n_pending_periodic_samples;
 			u32 pending_periodic_ts;
+			u64 last_gpu_ts;
 
 			struct i915_oa_ops ops;
 			const struct i915_oa_format *oa_formats;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index f1c26e5..2bf9cf0 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -775,6 +775,24 @@ static int append_sample(struct i915_perf_stream *stream,
 	return 0;
 }
 
+static u64 get_gpu_ts_from_oa_report(struct drm_i915_private *dev_priv,
+					const u8 *report)
+{
+	u32 sample_ts = *(u32 *)(report + 4);
+	u32 delta;
+
+	/*
+	 * NB: We have to assume that last_gpu_ts is updated frequently
+	 * enough that the 32-bit timestamp can never overflow more than once
+	 * before sample_ts is compared with last_gpu_ts. Since an overflow
+	 * takes a long time (~6min for an 80ns ts base), this is safe.
+	 */
+	delta = sample_ts - (u32)dev_priv->perf.oa.last_gpu_ts;
+	dev_priv->perf.oa.last_gpu_ts += delta;
+
+	return dev_priv->perf.oa.last_gpu_ts;
+}
+
 static int append_oa_buffer_sample(struct i915_perf_stream *stream,
 				   struct i915_perf_read_state *read_state,
 				   const u8 *report)
@@ -811,10 +829,9 @@ static int append_oa_buffer_sample(struct i915_perf_stream *stream,
 	if (sample_flags & SAMPLE_TAG)
 		data.tag = dev_priv->perf.last_tag;
 
-	/* Derive timestamp from OA report, after scaling with the ts base */
-#warning "FIXME: append_oa_buffer_sample: derive the timestamp from OA report"
+	/* Derive timestamp from OA report */
 	if (sample_flags & SAMPLE_TS)
-		data.ts = 0;
+		data.ts = get_gpu_ts_from_oa_report(dev_priv, report);
 
 	if (sample_flags & SAMPLE_OA_REPORT)
 		data.report = report;
@@ -1226,6 +1243,7 @@ static int append_one_cs_sample(struct i915_perf_stream *stream,
 	enum intel_engine_id id = stream->engine;
 	struct sample_data data = { 0 };
 	u32 sample_flags = stream->sample_flags;
+	u64 gpu_ts = 0;
 	int ret = 0;
 
 	if (sample_flags & SAMPLE_OA_REPORT) {
@@ -1242,6 +1260,9 @@ static int append_one_cs_sample(struct i915_perf_stream *stream,
 						U32_MAX);
 		if (ret)
 			return ret;
+
+		if (sample_flags & SAMPLE_TS)
+			gpu_ts = get_gpu_ts_from_oa_report(dev_priv, report);
 	}
 
 	if (sample_flags & SAMPLE_OA_SOURCE_INFO)
@@ -1263,17 +1284,14 @@ static int append_one_cs_sample(struct i915_perf_stream *stream,
 	}
 
 	if (sample_flags & SAMPLE_TS) {
-		/* For RCS, if OA samples are also being collected, derive the
-		 * timestamp from OA report, after scaling with the TS base.
+		/* If OA sampling is enabled, derive the ts from OA report.
 		 * Else, forward the timestamp collected via command stream.
 		 */
-#warning "FIXME: append_one_cs_sample: derive the timestamp from OA report"
-		if (sample_flags & SAMPLE_OA_REPORT)
-			data.ts = 0;
-		else
-			data.ts = *(u64 *)
+		if (!(sample_flags & SAMPLE_OA_REPORT))
+			gpu_ts = *(u64 *)
 				(dev_priv->perf.command_stream_buf[id].addr +
 					node->ts_offset);
+		data.ts = gpu_ts;
 	}
 
 	return append_sample(stream, read_state, &data);
@@ -2025,8 +2043,12 @@ static void i915_ring_stream_enable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 
-	if (stream->sample_flags & SAMPLE_OA_REPORT)
+	if (stream->sample_flags & SAMPLE_OA_REPORT) {
+		dev_priv->perf.oa.last_gpu_ts =
+			((u64)I915_READ(GT_TIMESTAMP_COUNT_UDW) << 32) |
+			I915_READ(GT_TIMESTAMP_COUNT);
 		dev_priv->perf.oa.ops.oa_enable(dev_priv);
+	}
 
 	if (stream->cs_mode)
 		stream->command_stream_hook = i915_ring_stream_cs_hook;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0924e4f..2584c0b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -573,6 +573,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define PS_DEPTH_COUNT                  _MMIO(0x2350)
 #define PS_DEPTH_COUNT_UDW		_MMIO(0x2350 + 4)
 
+/* Timestamp count register */
+#define GT_TIMESTAMP_COUNT		_MMIO(0x2358)
+#define GT_TIMESTAMP_COUNT_UDW		_MMIO(0x2358 + 4)
+
 /* There are the 4 64-bit counter registers, one for each stream output */
 #define GEN7_SO_NUM_PRIMS_WRITTEN(n)		_MMIO(0x5200 + (n) * 8)
 #define GEN7_SO_NUM_PRIMS_WRITTEN_UDW(n)	_MMIO(0x5200 + (n) * 8 + 4)
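
For anyone who wants to try the wraparound arithmetic from
get_gpu_ts_from_oa_report() outside the driver, here is a minimal userspace
sketch of the same idea. It is not part of the patch. The names
struct gpu_ts_state, extend_gpu_ts() and wrap_period_seconds() are
hypothetical and exist only for illustration; the 80ns tick period is the
figure quoted in the patch comment and is passed in as a parameter rather
than assumed for any particular hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for dev_priv->perf.oa.last_gpu_ts in the patch. */
struct gpu_ts_state {
	uint64_t last_gpu_ts;
};

/*
 * Extend a 32-bit timestamp sample to 64 bits.
 *
 * The subtraction is unsigned and therefore performed modulo 2^32, so
 * delta is still the elapsed tick count when the 32-bit counter has
 * wrapped once since last_gpu_ts was updated. More than one wrap
 * between updates cannot be detected.
 */
static uint64_t extend_gpu_ts(struct gpu_ts_state *state, uint32_t sample_ts)
{
	uint32_t delta = sample_ts - (uint32_t)state->last_gpu_ts;

	state->last_gpu_ts += delta;
	return state->last_gpu_ts;
}

/* Seconds until a 32-bit counter wraps, for a tick period given in ns. */
static double wrap_period_seconds(double ts_base_ns)
{
	assert(ts_base_ns > 0.0);
	return 4294967296.0 * ts_base_ns / 1e9;
}
```

With an 80ns tick, wrap_period_seconds(80.0) comes to roughly 344 seconds,
which is where the "~6min" in the comment comes from. Note that the sketch,
like the helper in the patch, assumes samples arrive in increasing order: a
sample older than last_gpu_ts would be interpreted as a forward jump of
almost a full 32-bit wrap.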