From patchwork Thu Feb 15 15:34:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tvrtko Ursulin <tursulin@ursulin.net>
X-Patchwork-Id: 10221807
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	7A3C260600 for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 15 Feb 2018 15:35:07 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6A6B729416
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 15 Feb 2018 15:35:07 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 5A6652941D; Thu, 15 Feb 2018 15:35:07 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 679C729416
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Thu, 15 Feb 2018 15:35:06 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id B1E3E6E534;
	Thu, 15 Feb 2018 15:35:04 +0000 (UTC)
X-Original-To: Intel-gfx@lists.freedesktop.org
Delivered-To: Intel-gfx@lists.freedesktop.org
Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com
	[IPv6:2a00:1450:400c:c09::242])
	by gabe.freedesktop.org (Postfix) with ESMTPS id B4E056E534
	for <Intel-gfx@lists.freedesktop.org>;
	Thu, 15 Feb 2018 15:35:03 +0000 (UTC)
Received: by mail-wm0-x242.google.com with SMTP id j199so1596469wmj.2
	for <Intel-gfx@lists.freedesktop.org>;
	Thu, 15 Feb 2018 07:35:03 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=ursulin-net.20150623.gappssmtp.com; s=20150623;
	h=from:to:cc:subject:date:message-id:in-reply-to:references;
	bh=cAO2YhJYblIhJzmaLVgUbu89xOGbSeXF0X9ILiXeYZk=;
	b=Fs2K8lZ+EYotS562ef9VwxvBi3jrQZgRj/batRW36N7WNmXsR9kh0r7TNs8XNAM87r
	HCVQVOxoCRL4J+kavyS/zsW3v5jrUkTTYNu4Y/IvQXcRqNAlkSinBxrFB7ReoEloLv2E
	8y9hTx9yUfpJsNDHD3s1nE2U8IOxaY6GkH0kOvmgYFZNTxiddxhxow3GDdLjEd1YZPko
	GH81A8G8xkS8EaVYtXG8TBw0ZGtLMvX2cfh7/KeU0w/nf1LqBHYJuxjqU4qMZaFm1LsJ
	vkqY8llwwzc2Qf2SUF0ykCSATkVc8UA4YCbYnmFcslzdA5w8z5DYzcOZUaGj1XouFC78
	5CJg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
	:references;
	bh=cAO2YhJYblIhJzmaLVgUbu89xOGbSeXF0X9ILiXeYZk=;
	b=Zw0owCpQdCs1efIWHm5UN00aB1kHlpAw6CikgFHyHAMk3c2++sEKMmS25jSGxzYKCJ
	yXKUa1EO421Y8ITsed+acOPIDD0cNv7OtBQh/SjkiZMghL2dxcfY5DgtT2L1CKVfqc/p
	rlHIvW6XjPW/IMLuaEeXYKnwhxatYZ11ivv7rF+uwYTwRN4gu39wT2kNDu6L5kZYHA3J
	JM4LUlbfD2lJF7lA/jRsifoCUSztXP+MfYcTRq+L5NBqqKvCQJvkB0msCWQVSjwAf8ny
	j+c2vxzqGdwtobBRcBYNpw00pS9l7mxb905q8KB+GDMzwq3AWXDRNehqOOJMh3MFcaDD
	/J1w==
X-Gm-Message-State: APf1xPCsB41QnQZz9WbDJ36xfteXVb+oSCiUTs6ML7oKszWpSJ3NOrGd
	YLzWupFamTbCjV0i4lb3PlI1Sg==
X-Google-Smtp-Source: 
 AH8x226eAP0NAMbWfuhNJ0615aPRXznW36HZbVbrCG0k50YWfK9k/Wby3+1c8rjrfGx1vsFhcof27A==
X-Received: by 10.28.216.149 with SMTP id p143mr2407599wmg.80.1518708902110;
	Thu, 15 Feb 2018 07:35:02 -0800 (PST)
Received: from localhost.localdomain ([95.146.144.186])
	by smtp.gmail.com with ESMTPSA id
	g13sm18194903wrh.51.2018.02.15.07.35.01
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Thu, 15 Feb 2018 07:35:01 -0800 (PST)
From: Tvrtko Ursulin <tursulin@ursulin.net>
X-Google-Original-From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: igt-dev@lists.freedesktop.org
Date: Thu, 15 Feb 2018 15:34:53 +0000
Message-Id: <20180215153453.657-1-tvrtko.ursulin@linux.intel.com>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <151869859558.11261.17638690172075680562@mail.alporthouse.com>
References: <151869859558.11261.17638690172075680562@mail.alporthouse.com>
Subject: [Intel-gfx] [PATCH i-g-t v6] tests/perf_pmu: Verify engine busyness
	accuracy
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Intel graphics driver community testing & development
	<intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Cc: Intel-gfx@lists.freedesktop.org
MIME-Version: 1.0
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
X-Virus-Scanned: ClamAV using ClamSMTP

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A subtest to verify that the engine busyness is reported with expected
accuracy on platforms where the feature is available.

We test three patterns: 2%, 50% and 98% load per engine.

v2:
 * Use spin batch instead of nop calibration.
 * Various tweaks.

v3:
 * Change loops to be time based.
 * Use __igt_spin_batch_new inside timing sensitive loops.
 * Fixed PWM sleep handling.

v4:
 * Use restarting spin batch.
 * Calibrate more carefully by looking at the real PWM loop.

v5:
 * Made standalone.
 * Better info messages.
 * Tweak sleep compensation.

v6:
 * Some final tweaks. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/perf_pmu.c | 199 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 181 insertions(+), 18 deletions(-)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 7d9c42d16a08..82053416cbea 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -35,6 +35,7 @@
 #include <dirent.h>
 #include <time.h>
 #include <poll.h>
+#include <sched.h>
 
 #include "igt.h"
 #include "igt_core.h"
@@ -385,6 +386,22 @@ busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
 	gem_quiescent_gpu(gem_fd);
 }
 
+static void
+__submit_spin_batch(int gem_fd, igt_spin_t *spin,
+		    const struct intel_execution_engine2 *e)
+{
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = spin->handle
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffer_count = 1,
+		.buffers_ptr = to_user_pointer(&obj),
+		.flags = e2ring(gem_fd, e),
+	};
+
+	gem_execbuf(gem_fd, &eb);
+}
+
 static void
 most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
 		    const unsigned int num_engines, unsigned int flags)
@@ -405,15 +422,7 @@ most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
 		if (e == e_) {
 			idle_idx = i;
 		} else if (spin) {
-			struct drm_i915_gem_exec_object2 obj = {
-				.handle = spin->handle
-			};
-			struct drm_i915_gem_execbuffer2 eb = {
-				.buffer_count = 1,
-				.buffers_ptr = to_user_pointer(&obj),
-				.flags = e2ring(gem_fd, e_),
-			};
-			gem_execbuf(gem_fd, &eb);
+			__submit_spin_batch(gem_fd, spin, e_);
 		} else {
 			spin = igt_spin_batch_new(gem_fd, 0,
 						  e2ring(gem_fd, e_), 0);
@@ -469,15 +478,7 @@ all_busy_check_all(int gem_fd, const unsigned int num_engines,
 			continue;
 
 		if (spin) {
-			struct drm_i915_gem_exec_object2 obj = {
-				.handle = spin->handle
-			};
-			struct drm_i915_gem_execbuffer2 eb = {
-				.buffer_count = 1,
-				.buffers_ptr = to_user_pointer(&obj),
-				.flags = e2ring(gem_fd, e),
-			};
-			gem_execbuf(gem_fd, &eb);
+			__submit_spin_batch(gem_fd, spin, e);
 		} else {
 			spin = igt_spin_batch_new(gem_fd, 0,
 						  e2ring(gem_fd, e), 0);
@@ -1392,6 +1393,157 @@ test_enable_race(int gem_fd, const struct intel_execution_engine2 *e)
 	gem_quiescent_gpu(gem_fd);
 }
 
+static double __error(double val, double ref)
+{
+	igt_assert(ref > 1e-5 /* smallval */);
+	return (100.0 * val / ref) - 100.0;
+}
+
+static void __rearm_spin_batch(igt_spin_t *spin)
+{
+	const uint32_t mi_arb_chk = 0x5 << 23;
+
+       *spin->batch = mi_arb_chk;
+       __sync_synchronize();
+}
+
+#define div_round_up(a, b) (((a) + (b) - 1) / (b))
+
+static void
+accuracy(int gem_fd, const struct intel_execution_engine2 *e,
+	 unsigned long target_busy_pct)
+{
+	const unsigned int min_test_loops = 7;
+	const unsigned long min_test_us = 1e6;
+	unsigned long busy_us = 2500;
+	unsigned long idle_us = 100 * (busy_us - target_busy_pct *
+				busy_us / 100) / target_busy_pct;
+	unsigned long pwm_calibration_us;
+	unsigned long test_us;
+	double busy_r;
+	uint64_t val[2];
+	uint64_t ts[2];
+	int fd;
+
+	/* Sampling platforms cannot reach the high accuracy criteria. */
+	igt_require(gem_has_execlists(gem_fd));
+
+	while (idle_us < 2500) {
+		busy_us *= 2;
+		idle_us *= 2;
+	}
+
+	pwm_calibration_us = min_test_loops * (busy_us + idle_us);
+	while (pwm_calibration_us < min_test_us)
+		pwm_calibration_us += busy_us + idle_us;
+	test_us = min_test_loops * (idle_us + busy_us);
+	while (test_us < min_test_us)
+		test_us += busy_us + idle_us;
+
+	igt_info("calibration=%luus, test=%luus; ratio=%.2f%% (%luus/%luus)\n",
+		 pwm_calibration_us, test_us,
+		 (double)busy_us / (busy_us + idle_us) * 100.0,
+		 busy_us, idle_us);
+
+	assert_within_epsilon((double)busy_us / (busy_us + idle_us),
+				(double)target_busy_pct / 100.0, tolerance);
+
+	/* Emit PWM pattern on the engine from a child. */
+	igt_fork(child, 1) {
+		struct sched_param rt = { .sched_priority = 99 };
+		const unsigned long timeout[] = { pwm_calibration_us * 1000,
+						  test_us * 2 * 1000 };
+		unsigned long sleep_busy = busy_us;
+		unsigned long sleep_idle = idle_us;
+		igt_spin_t *spin;
+		int ret;
+
+		/* We need the best sleep accuracy we can get. */
+		ret = sched_setscheduler(0,
+					 SCHED_FIFO | SCHED_RESET_ON_FORK,
+					 &rt);
+		if (ret)
+			igt_warn("Failed to set scheduling policy!\n");
+
+		/* Allocate our spin batch and idle it. */
+		spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
+		igt_spin_batch_end(spin);
+		gem_sync(gem_fd, spin->handle);
+
+		/* 1st pass is calibration, second pass is the test. */
+		for (int pass = 0; pass < ARRAY_SIZE(timeout); pass++) {
+			unsigned long busy_ns = 0, idle_ns = 0;
+			struct timespec test_start = { };
+			unsigned long loops = 0;
+			double err_busy, err_idle;
+
+			igt_nsec_elapsed(&test_start);
+			do {
+				struct timespec t_busy = { };
+
+				igt_nsec_elapsed(&t_busy);
+
+				/* Restart the spinbatch. */
+				__rearm_spin_batch(spin);
+				__submit_spin_batch(gem_fd, spin, e);
+				measured_usleep(sleep_busy);
+				igt_spin_batch_end(spin);
+				gem_sync(gem_fd, spin->handle);
+
+				busy_ns += igt_nsec_elapsed(&t_busy);
+
+				idle_ns += measured_usleep(sleep_idle);
+
+				loops++;
+			} while (igt_nsec_elapsed(&test_start) < timeout[pass]);
+
+			busy_ns = div_round_up(busy_ns, loops);
+			idle_ns = div_round_up(idle_ns, loops);
+
+			err_busy = __error(busy_ns / 1000, busy_us);
+			err_idle = __error(idle_ns / 1000, idle_us);
+
+			igt_info("%u: busy %lu/%lu %.2f%%, idle %lu/%lu %.2f%%\n",
+				 pass,
+				 busy_ns / 1000, busy_us, err_busy,
+				 idle_ns / 1000, idle_us, err_idle);
+
+			if (pass == 0) {
+				sleep_busy = (double)busy_us -
+					     (double)busy_us * err_busy / 100.0;
+				sleep_idle = (double)idle_us -
+					     (double)idle_us * err_idle / 100.0;
+				igt_info("calibrated sleeps ratio %.2f%% (%lu/%lu)\n",
+					 (double)sleep_busy /
+					 (sleep_busy + sleep_idle) * 100.0,
+					 sleep_busy, sleep_idle);
+			}
+		}
+
+		igt_spin_batch_free(gem_fd, spin);
+	}
+
+	/* Let the child run. */
+	usleep(pwm_calibration_us * 2);
+
+	/* Collect engine busyness for an interesting part of child runtime. */
+	fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));
+	val[0] = __pmu_read_single(fd, &ts[0]);
+	usleep(test_us / 2);
+	val[1] = __pmu_read_single(fd, &ts[1]);
+	close(fd);
+
+	igt_waitchildren();
+
+	busy_r = (double)(val[1] - val[0]) / (ts[1] - ts[0]);
+
+	igt_info("error=%.2f%% (%.2f%% vs %lu%%)\n",
+		 __error(busy_r, target_busy_pct / 100.0),
+		 busy_r * 100.0, target_busy_pct);
+
+	assert_within_epsilon(busy_r, (double)target_busy_pct / 100.0, 0.15);
+}
+
 igt_main
 {
 	const unsigned int num_other_metrics =
@@ -1420,6 +1572,8 @@ igt_main
 		invalid_init();
 
 	for_each_engine_class_instance(fd, e) {
+		const unsigned int pct[] = { 2, 50, 98 };
+
 		/**
 		 * Test that a single engine metric can be initialized or it
 		 * is correctly rejected.
@@ -1526,6 +1680,15 @@ igt_main
 			 */
 			igt_subtest_f("enable-race-%s", e->name)
 				test_enable_race(fd, e);
+
+			/**
+			 * Check engine busyness accuracy is as expected.
+			 */
+			for (i = 0; i < ARRAY_SIZE(pct); i++) {
+				igt_subtest_f("busy-accuracy-%u-%s",
+					      pct[i], e->name)
+					accuracy(fd, e, pct[i]);
+			}
 		}
 
 		/**