From patchwork Thu Feb 15 11:53:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10220939 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A6AA3601E7 for ; Thu, 15 Feb 2018 11:53:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 96FFB29123 for ; Thu, 15 Feb 2018 11:53:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8B50B29128; Thu, 15 Feb 2018 11:53:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id F200629123 for ; Thu, 15 Feb 2018 11:53:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 755566E461; Thu, 15 Feb 2018 11:53:56 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr0-x244.google.com (mail-wr0-x244.google.com [IPv6:2a00:1450:400c:c0c::244]) by gabe.freedesktop.org (Postfix) with ESMTPS id D94F06E461 for ; Thu, 15 Feb 2018 11:53:54 +0000 (UTC) Received: by mail-wr0-x244.google.com with SMTP id s5so3080919wra.0 for ; Thu, 15 Feb 2018 03:53:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ursulin-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id; bh=mDEU0gFAXoMciuZSYOxP808w9n99hGL5LS/n96OluUM=; b=cK0rMdMMsTlW2Oi/kjthOhcRz6Fr5dlg0vJhti9/8/0h6mJm/Lx5WUrbn8bUEuDGcU sTs0NkKx8u+xw2GPTGEzcPzgToS3LXongwRgWKTZ+aSaTgQURgoq7eoe4sWgpjRp9FLr uW5xj5kA8+RtEaHaIKdCT3nT3k8Ao71hgfsZYtDOJk6TeukCY74vvm9HeJBEyVl3ebbV BsIkIBVabaA7lU0o15Og85tHbnHPnDVs2zRyYGolKj7CxhpqI2TvaS2FDXytd5WdKyie r62OP3y96cHQIYwTxJBq58LqCxyu1xBjv5LRRqKDGFSHwz0Izgm0qTyCSXD2EK61aqYB stZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=mDEU0gFAXoMciuZSYOxP808w9n99hGL5LS/n96OluUM=; b=FomDFzh/5pezpEqr6oS287KULO925M559HfcO+JJetWH83f4r5nWfBUcDOu7OGI4dF ODu2R2KaN/9EfFe3SsoEwQac0YvXZjaH57TOr65XcEC5S+el+L9NbzwhGzKgzOEg662O fBdVZZX2k1Diurz3WjUh3JLRxdu+NMop6uojy8mX7E5Q5BMnAYZyZA+ZAXnbuSEOH9eF f9GriIObSJUIX2xhEC3iC7lSlhkxIbwLiYEklSbyfMzp+/scSP8KNa4Hul43ZR+TwPbA WrPYvb9hySslN7IUn8QQMFMr/twlsO46QY7kcskrp0K5mToyic7OM0RTGZIeCmGk1TXt y18w== X-Gm-Message-State: APf1xPDvLou0v4b+AB6+HHvIdb9St7w/tv0+uD3bStlK6ejoZp8xoJA+ kWNYC0GOdnnTkNp0z3z2h/k+taMh X-Google-Smtp-Source: AH8x224UBm09HBAPj2qBNM/UpzyJGWsojwv5kTfpgAiKXhzYscHJjxcbKxK5mtwTrBienaTHUvA+mA== X-Received: by 10.223.161.15 with SMTP id o15mr2126688wro.274.1518695633265; Thu, 15 Feb 2018 03:53:53 -0800 (PST) Received: from localhost.localdomain ([95.146.144.186]) by smtp.gmail.com with ESMTPSA id v202sm3544028wmv.29.2018.02.15.03.53.52 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 15 Feb 2018 03:53:52 -0800 (PST) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 15 Feb 2018 11:53:44 +0000 Message-Id: <20180215115344.27625-1-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.14.1 Subject: [Intel-gfx] [PATCH i-g-t] tests/perf_pmu: Verify engine busyness accuracy X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin A subtest to verify that the engine busyness is reported with expected accuracy on platforms where the feature is available. We test three patterns: 2%, 50% and 98% load per engine. v2: * Use spin batch instead of nop calibration. * Various tweaks. v3: * Change loops to be time based. * Use __igt_spin_batch_new inside timing sensitive loops. * Fixed PWM sleep handling. v4: * Use restarting spin batch. * Calibrate more carefully by looking at the real PWM loop. v5: * Made standalone. * Better info messages. * Tweak sleep compensation. Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson --- tests/perf_pmu.c | 192 +++++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 174 insertions(+), 18 deletions(-) diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c index a7501ca5f7a4..fa9b54793a52 100644 --- a/tests/perf_pmu.c +++ b/tests/perf_pmu.c @@ -35,6 +35,7 @@ #include #include #include +#include #include "igt.h" #include "igt_core.h" @@ -385,6 +386,22 @@ busy_check_all(int gem_fd, const struct intel_execution_engine2 *e, gem_quiescent_gpu(gem_fd); } +static void +__submit_spin_batch(int gem_fd, igt_spin_t *spin, + const struct intel_execution_engine2 *e) +{ + struct drm_i915_gem_exec_object2 obj = { + .handle = spin->handle + }; + struct drm_i915_gem_execbuffer2 eb = { + .buffer_count = 1, + .buffers_ptr = to_user_pointer(&obj), + .flags = e2ring(gem_fd, e), + }; + + gem_execbuf(gem_fd, &eb); +} + static void most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e, const unsigned int num_engines, unsigned int flags) @@ -405,15 +422,7 @@ most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e, if (e == e_) { idle_idx = i; } else if (spin) { - struct drm_i915_gem_exec_object2 obj = { - .handle = spin->handle - }; - struct drm_i915_gem_execbuffer2 eb = { - .buffer_count = 1, - .buffers_ptr = to_user_pointer(&obj), - .flags = e2ring(gem_fd, e_), - }; - gem_execbuf(gem_fd, &eb); + __submit_spin_batch(gem_fd, spin, e_); } else { spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e_), 0); @@ -469,15 +478,7 @@ all_busy_check_all(int gem_fd, const unsigned int num_engines, continue; if (spin) { - struct drm_i915_gem_exec_object2 obj = { - .handle = spin->handle - }; - struct drm_i915_gem_execbuffer2 eb = { - .buffer_count = 1, - .buffers_ptr = to_user_pointer(&obj), - .flags = e2ring(gem_fd, e), - }; - gem_execbuf(gem_fd, &eb); + __submit_spin_batch(gem_fd, spin, e); } else { spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0); @@ -1390,6 +1391,150 @@ test_enable_race(int gem_fd, const struct intel_execution_engine2 *e) gem_quiescent_gpu(gem_fd); } +static double __error(double val, double ref) +{ + igt_assert(ref != 0.0); + return (100.0 * val / ref) - 100.0; +} + +static void __rearm_spin_batch(igt_spin_t *spin) +{ + const uint32_t mi_arb_chk = 0x5 << 23; + + *spin->batch = mi_arb_chk; + __sync_synchronize(); +} + +#define div_round_up(a, b) (((a) + (b) - 1) / (b)) + +static void +accuracy(int gem_fd, const struct intel_execution_engine2 *e, + unsigned long target_busy_pct) +{ + const unsigned int min_test_loops = 7; + const unsigned long min_test_us = 1e6; + unsigned long busy_us = 2500; + unsigned long idle_us = 100 * (busy_us - target_busy_pct * + busy_us / 100) / target_busy_pct; + unsigned long pwm_calibration_us; + unsigned long test_us; + double busy_r; + uint64_t val[2]; + uint64_t ts[2]; + int fd; + + /* Sampling platforms cannot reach the high accuracy criteria. */ + igt_require(intel_gen(intel_get_drm_devid(gem_fd)) >= 8); + + while (idle_us < 2500) { + busy_us *= 2; + idle_us *= 2; + } + + pwm_calibration_us = min_test_loops * (busy_us + idle_us); + while (pwm_calibration_us < min_test_us) + pwm_calibration_us += busy_us + idle_us; + test_us = min_test_loops * (idle_us + busy_us); + while (test_us < min_test_us) + test_us += busy_us + idle_us; + + igt_info("calibration=%luus, test=%luus; busy=%luus, idle=%luus\n", + pwm_calibration_us, test_us, busy_us, idle_us); + + assert_within_epsilon((double)busy_us / (busy_us + idle_us), + (double)target_busy_pct / 100.0, tolerance); + + /* Emit PWM pattern on the engine from a child. */ + igt_fork(child, 1) { + struct sched_param rt = { .sched_priority = 99 }; + const unsigned long timeout[] = { pwm_calibration_us * 1000, + test_us * 2 * 1000 }; + unsigned long sleep_busy = busy_us; + unsigned long sleep_idle = idle_us; + igt_spin_t *spin; + + /* We need the best sleep accuracy we can get. */ + igt_require(sched_setscheduler(0, + SCHED_FIFO | SCHED_RESET_ON_FORK, + &rt) == 0); + + /* Allocate our spin batch and idle it. */ + spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0); + igt_spin_batch_end(spin); + gem_sync(gem_fd, spin->handle); + + /* 1st pass is calibration, second pass is the test. */ + for (int pass = 0; pass < ARRAY_SIZE(timeout); pass++) { + unsigned long busy_ns = 0, idle_ns = 0; + struct timespec test_start = { }; + unsigned long loops = 0; + double err_busy, err_idle; + + igt_nsec_elapsed(&test_start); + do { + struct timespec t_busy = { }; + + igt_nsec_elapsed(&t_busy); + + /* Restart the spinbatch. */ + __rearm_spin_batch(spin); + __submit_spin_batch(gem_fd, spin, e); + measured_usleep(sleep_busy); + igt_spin_batch_end(spin); + gem_sync(gem_fd, spin->handle); + + busy_ns += igt_nsec_elapsed(&t_busy); + + idle_ns += measured_usleep(sleep_idle); + + loops++; + } while (igt_nsec_elapsed(&test_start) < timeout[pass]); + + busy_ns = div_round_up(busy_ns, loops); + idle_ns = div_round_up(idle_ns, loops); + + err_busy = __error(busy_ns / 1000, busy_us); + err_idle = __error(idle_ns / 1000, idle_us); + + igt_info("%u: busy %lu/%lu %.2f%%, idle %lu/%lu %.2f%%\n", + pass, + busy_ns / 1000, busy_us, err_busy, + idle_ns / 1000, idle_us, err_idle); + + if (pass == 0) { + sleep_busy = (double)busy_us - + (double)busy_us * err_busy / 100.0; + sleep_idle = (double)idle_us - + (double)idle_us * err_idle / 100.0; + igt_info("calibrated sleeps: busy=%lu, idle=%lu\n", + sleep_busy, sleep_idle); + } + } + + igt_spin_batch_free(gem_fd, spin); + } + + /* Let the child run. */ + usleep(pwm_calibration_us * 2); + + /* Collect engine busyness for an interesting part of child runtime. */ + fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance)); + val[0] = __pmu_read_single(fd, &ts[0]); + usleep(test_us / 2); + val[1] = __pmu_read_single(fd, &ts[1]); + close(fd); + + igt_waitchildren(); + + busy_r = (double)(val[1] - val[0]) / (ts[1] - ts[0]); + + igt_info("error=%.2f%% (%.2f%% vs %lu%%)\n", + __error(busy_r, target_busy_pct / 100.0), + busy_r * 100.0, target_busy_pct); + + assert_within_epsilon(busy_r, (double)target_busy_pct / 100.0, 0.15); +} + igt_main { const unsigned int num_other_metrics = @@ -1418,6 +1563,8 @@ igt_main invalid_init(); for_each_engine_class_instance(fd, e) { + const unsigned int pct[] = { 2, 50, 98 }; + /** * Test that a single engine metric can be initialized or it * is correctly rejected. @@ -1524,6 +1671,15 @@ igt_main */ igt_subtest_f("enable-race-%s", e->name) test_enable_race(fd, e); + + /** + * Check engine busyness accuracy is as expected. + */ + for (i = 0; i < ARRAY_SIZE(pct); i++) { + igt_subtest_f("busy-accuracy-%u-%s", + pct[i], e->name) + accuracy(fd, e, pct[i]); + } } /**