From patchwork Wed Nov 15 12:13:54 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: sagar.a.kamble@intel.com
X-Patchwork-Id: 10059281
From: Sagar Arun Kamble
To: intel-gfx@lists.freedesktop.org
Date: Wed, 15 Nov 2017 17:43:54 +0530
Message-Id: <1510748034-14034-5-git-send-email-sagar.a.kamble@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1510748034-14034-1-git-send-email-sagar.a.kamble@intel.com>
References: <1510748034-14034-1-git-send-email-sagar.a.kamble@intel.com>
Cc: Sourab Gupta, Matthew Auld
Subject: [Intel-gfx] [RFC 4/4] drm/i915/perf: Send system clock monotonic time in perf samples

From: Sourab Gupta

Currently, we can forward only the GPU timestamps in the samples (which
are generated via OA reports). This limits the ability to correlate these
samples with system events. We therefore need a way to report timestamps
in other clock domains, such as CLOCK_MONOTONIC, in the perf samples so
that they are of more practical use to userspace. This becomes important
when we want to correlate or plot GPU events/samples with other system
events on the same timeline (e.g. vblank events, or timestamps of when
work was submitted to the kernel).

This patch proposes a mechanism to achieve this. The correlation between
GPU time and system time is established using the timestamp clock
associated with the command stream, abstracted as a
timecounter/cyclecounter pair from which correlated GPU/system time
values are retrieved.

v2: Added i915_driver_init_late() function to capture the new late init
    phase for perf (Chris)

v3: Removed cross-timestamp changes.

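As a rough illustration of the timecounter-based correlation described
above, the following is a minimal userspace sketch that models how a raw
GPU cycle value could be turned into a system-time value relative to the
last synchronisation point. It is not the kernel implementation: the
model_* names are hypothetical, fractional-nanosecond accumulation is
ignored, and the multiplication is assumed not to overflow 64 bits. It
does reproduce the behaviour where a delta larger than half the counter
range is treated as a timestamp in the past.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model. Field names loosely follow the kernel's
 * struct cyclecounter / struct timecounter, but this is only a sketch.
 */
struct model_timecounter {
	uint64_t mask;       /* wrap mask of the hardware counter */
	uint32_t mult;       /* ns = (cycles * mult) >> shift */
	uint32_t shift;
	uint64_t cycle_last; /* GPU cycles at the last synchronisation point */
	uint64_t nsec;       /* system time (ns) at the last synchronisation point */
};

/* Convert a cycle delta to nanoseconds (no fractional carry, no overflow check). */
static uint64_t model_cyc2ns(const struct model_timecounter *tc, uint64_t cycles)
{
	return (cycles * tc->mult) >> tc->shift;
}

/* Map a raw GPU cycle timestamp onto the system-time axis. */
static uint64_t model_cyc2time(const struct model_timecounter *tc,
			       uint64_t cycle_tstamp)
{
	uint64_t delta;

	assert(tc && tc->shift < 64);

	delta = (cycle_tstamp - tc->cycle_last) & tc->mask;

	/* More than half the counter range ahead is read as "in the past". */
	if (delta > tc->mask / 2) {
		delta = (tc->cycle_last - cycle_tstamp) & tc->mask;
		return tc->nsec - model_cyc2ns(tc, delta);
	}

	return tc->nsec + model_cyc2ns(tc, delta);
}
```

With this model, a timestamp slightly older than cycle_last maps to a time
before nsec, while a counter wraparound between cycle_last and the sample
is still handled as forward progress as long as the delta stays below half
the counter range.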
Signed-off-by: Sourab Gupta
Signed-off-by: Sagar Arun Kamble
Cc: Lionel Landwerlin
Cc: Chris Wilson
Cc: Sourab Gupta
Cc: Matthew Auld
---
 drivers/gpu/drm/i915/i915_perf.c | 27 +++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h      |  7 +++++++
 2 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 3b721d7..94ee924 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -336,6 +336,7 @@
 
 #define SAMPLE_OA_REPORT BIT(0)
 #define SAMPLE_GPU_TS BIT(1)
+#define SAMPLE_SYSTEM_TS BIT(2)
 
 /**
  * struct perf_open_properties - for validated properties given to open a stream
@@ -622,6 +623,7 @@ static int append_oa_sample(struct i915_perf_stream *stream,
 	struct drm_i915_perf_record_header header;
 	u32 sample_flags = stream->sample_flags;
 	u64 gpu_ts = 0;
+	u64 system_ts = 0;
 
 	header.type = DRM_I915_PERF_RECORD_SAMPLE;
 	header.pad = 0;
@@ -647,6 +649,23 @@ static int append_oa_sample(struct i915_perf_stream *stream,
 
 		if (copy_to_user(buf, &gpu_ts, I915_PERF_TS_SAMPLE_SIZE))
 			return -EFAULT;
+		buf += I915_PERF_TS_SAMPLE_SIZE;
+	}
+
+	if (sample_flags & SAMPLE_SYSTEM_TS) {
+		gpu_ts = get_gpu_ts_from_oa_report(stream, report);
+		/*
+		 * XXX: timecounter_cyc2time considers time backwards if delta
+		 * timestamp is more than half the max ns time covered by
+		 * counter. It will be ~35min for 36 bit counter. If this much
+		 * sampling duration is needed we will have to update tc->nsec
+		 * by explicitly reading the timecounter (timecounter_read)
+		 * before this duration.
+		 */
+		system_ts = timecounter_cyc2time(&stream->tc, gpu_ts);
+
+		if (copy_to_user(buf, &system_ts, I915_PERF_TS_SAMPLE_SIZE))
+			return -EFAULT;
 	}
 
 	(*offset) += header.size;
@@ -2137,6 +2156,11 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		stream->sample_size += I915_PERF_TS_SAMPLE_SIZE;
 	}
 
+	if (props->sample_flags & SAMPLE_SYSTEM_TS) {
+		stream->sample_flags |= SAMPLE_SYSTEM_TS;
+		stream->sample_size += I915_PERF_TS_SAMPLE_SIZE;
+	}
+
 	dev_priv->perf.oa.oa_buffer.format_size = format_size;
 	if (WARN_ON(dev_priv->perf.oa.oa_buffer.format_size == 0))
 		return -EINVAL;
@@ -2857,6 +2881,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 		case DRM_I915_PERF_PROP_SAMPLE_GPU_TS:
 			props->sample_flags |= SAMPLE_GPU_TS;
 			break;
+		case DRM_I915_PERF_PROP_SAMPLE_SYSTEM_TS:
+			props->sample_flags |= SAMPLE_SYSTEM_TS;
+			break;
 		case DRM_I915_PERF_PROP_OA_METRICS_SET:
 			if (value == 0) {
 				DRM_DEBUG("Unknown OA metric set ID\n");
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 0b9249e..283859c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1453,6 +1453,12 @@ enum drm_i915_perf_property_id {
 	DRM_I915_PERF_PROP_SAMPLE_GPU_TS,
 
 	/**
+	 * This property requests inclusion of CLOCK_MONOTONIC system time in
+	 * the perf sample data.
+	 */
+	DRM_I915_PERF_PROP_SAMPLE_SYSTEM_TS,
+
+	/**
 	 * The value specifies which set of OA unit metrics should be
 	 * be configured, defining the contents of any OA unit reports.
 	 */
@@ -1539,6 +1545,7 @@ enum drm_i915_perf_record_type {
 	 *
 	 *     { u32 oa_report[]; } && DRM_I915_PERF_PROP_SAMPLE_OA
 	 *     { u64 gpu_timestamp; } && DRM_I915_PERF_PROP_SAMPLE_GPU_TS
+	 *     { u64 system_timestamp; } && DRM_I915_PERF_PROP_SAMPLE_SYSTEM_TS
 	 * };
 	 */
 	DRM_I915_PERF_RECORD_SAMPLE = 1,
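
For reference, here is a rough sketch of how userspace might walk a
DRM_I915_PERF_RECORD_SAMPLE record laid out as documented in the uapi
comment above (header, then optional OA report, GPU timestamp and system
timestamp, in that order). The ex_* names and EX_SAMPLE_* flags are
hypothetical local stand-ins, not part of the uapi. The header struct
mirrors the u32 type / u16 pad / u16 size layout of struct
drm_i915_perf_record_header, and the sketch assumes header.size covers
the header itself plus all payload fields. The caller is assumed to know
which sample properties it requested when opening the stream and the OA
report size for the chosen format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local flags describing which fields the stream was opened with. */
#define EX_SAMPLE_OA_REPORT  (1u << 0)
#define EX_SAMPLE_GPU_TS     (1u << 1)
#define EX_SAMPLE_SYSTEM_TS  (1u << 2)

/* Local mirror of the record header layout (u32 type, u16 pad, u16 size). */
struct ex_record_header {
	uint32_t type;
	uint16_t pad;
	uint16_t size; /* assumed: total record size in bytes, header included */
};

struct ex_sample {
	const uint8_t *oa_report; /* points into the record, or NULL */
	uint64_t gpu_ts;          /* raw GPU timestamp, 0 if not requested */
	uint64_t system_ts;       /* CLOCK_MONOTONIC time in ns, 0 if not requested */
};

/* Copy one u64 field out of the record, advancing *off; -1 if it does not fit. */
static int ex_read_u64(const uint8_t *rec, size_t limit, size_t *off, uint64_t *val)
{
	if (*off + sizeof(*val) > limit)
		return -1;
	memcpy(val, rec + *off, sizeof(*val));
	*off += sizeof(*val);
	return 0;
}

/* Returns 0 on success, -1 if the record is shorter than the flags imply. */
static int ex_parse_sample(const uint8_t *rec, size_t len, uint32_t flags,
			   size_t oa_report_size, struct ex_sample *out)
{
	struct ex_record_header hdr;
	size_t off = sizeof(hdr);

	assert(rec && out);
	memset(out, 0, sizeof(*out));

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, rec, sizeof(hdr));
	if (hdr.size < sizeof(hdr) || hdr.size > len)
		return -1;

	if (flags & EX_SAMPLE_OA_REPORT) {
		if (off + oa_report_size > hdr.size)
			return -1;
		out->oa_report = rec + off;
		off += oa_report_size;
	}

	if ((flags & EX_SAMPLE_GPU_TS) &&
	    ex_read_u64(rec, hdr.size, &off, &out->gpu_ts))
		return -1;

	if ((flags & EX_SAMPLE_SYSTEM_TS) &&
	    ex_read_u64(rec, hdr.size, &off, &out->system_ts))
		return -1;

	return 0;
}
```

The fields are copied with memcpy rather than read through casts so that
the sketch does not depend on the alignment of the buffer returned by
read().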