From patchwork Fri Sep 29 12:39:36 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 9977955
From: Tvrtko Ursulin
To: Intel-gfx@lists.freedesktop.org
Date: Fri, 29 Sep 2017 13:39:36 +0100
Message-Id: <20170929123939.3312-5-tvrtko.ursulin@linux.intel.com>
X-Mailer: git-send-email 2.9.5
In-Reply-To: <20170929123939.3312-1-tvrtko.ursulin@linux.intel.com>
References: <20170929123939.3312-1-tvrtko.ursulin@linux.intel.com>
Subject:
 [Intel-gfx] [PATCH i-g-t 4/7] intel-gpu-overlay: Catch-up to new i915 PMU

From: Tvrtko Ursulin

v2: Update for i915 changes.

Signed-off-by: Tvrtko Ursulin
Reviewed-by: Chris Wilson
---
 lib/igt_perf.h           | 89 +++++++++++++++++++++++++++++++++---------------
 overlay/gem-interrupts.c |  2 +-
 overlay/gpu-freq.c       |  8 ++---
 overlay/gpu-top.c        | 68 ++++++++++++++++++++----------------
 overlay/power.c          |  4 +--
 overlay/rc6.c            | 20 +++++------
 6 files changed, 116 insertions(+), 75 deletions(-)

diff --git a/lib/igt_perf.h b/lib/igt_perf.h
index 8e674c3a3755..e38171da5261 100644
--- a/lib/igt_perf.h
+++ b/lib/igt_perf.h
@@ -1,3 +1,27 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
 #ifndef I915_PERF_H
 #define I915_PERF_H
 
@@ -5,41 +29,52 @@
 
 #include
 
-#define I915_SAMPLE_BUSY 0
-#define I915_SAMPLE_WAIT 1
-#define I915_SAMPLE_SEMA 2
+enum drm_i915_gem_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+enum drm_i915_pmu_engine_sample {
+	I915_SAMPLE_BUSY = 0,
+	I915_SAMPLE_WAIT = 1,
+	I915_SAMPLE_SEMA = 2,
+	I915_ENGINE_SAMPLE_MAX /* non-ABI */
+};
 
-#define I915_SAMPLE_RCS 0
-#define I915_SAMPLE_VCS 1
-#define I915_SAMPLE_BCS 2
-#define I915_SAMPLE_VECS 3
+#define I915_PMU_SAMPLE_BITS (4)
+#define I915_PMU_SAMPLE_MASK (0xf)
+#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
+#define I915_PMU_CLASS_SHIFT \
+	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
 
-#define __I915_PERF_COUNT(ring, id) ((ring) << 4 | (id))
+#define __I915_PMU_ENGINE(class, instance, sample) \
+	((class) << I915_PMU_CLASS_SHIFT | \
+	(instance) << I915_PMU_SAMPLE_BITS | \
+	(sample))
 
-#define I915_PERF_COUNT_RCS_BUSY __I915_PERF_COUNT(I915_SAMPLE_RCS, I915_SAMPLE_BUSY)
-#define I915_PERF_COUNT_RCS_WAIT __I915_PERF_COUNT(I915_SAMPLE_RCS, I915_SAMPLE_WAIT)
-#define I915_PERF_COUNT_RCS_SEMA __I915_PERF_COUNT(I915_SAMPLE_RCS, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_BUSY(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
 
-#define I915_PERF_COUNT_VCS_BUSY __I915_PERF_COUNT(I915_SAMPLE_VCS, I915_SAMPLE_BUSY)
-#define I915_PERF_COUNT_VCS_WAIT __I915_PERF_COUNT(I915_SAMPLE_VCS, I915_SAMPLE_WAIT)
-#define I915_PERF_COUNT_VCS_SEMA __I915_PERF_COUNT(I915_SAMPLE_VCS, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_WAIT(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
 
-#define I915_PERF_COUNT_BCS_BUSY __I915_PERF_COUNT(I915_SAMPLE_BCS, I915_SAMPLE_BUSY)
-#define I915_PERF_COUNT_BCS_WAIT __I915_PERF_COUNT(I915_SAMPLE_BCS, I915_SAMPLE_WAIT)
-#define I915_PERF_COUNT_BCS_SEMA __I915_PERF_COUNT(I915_SAMPLE_BCS, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_SEMA(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
 
-#define I915_PERF_COUNT_VECS_BUSY __I915_PERF_COUNT(I915_SAMPLE_VECS, I915_SAMPLE_BUSY)
-#define I915_PERF_COUNT_VECS_WAIT __I915_PERF_COUNT(I915_SAMPLE_VECS, I915_SAMPLE_WAIT)
-#define I915_PERF_COUNT_VECS_SEMA __I915_PERF_COUNT(I915_SAMPLE_VECS, I915_SAMPLE_SEMA)
+#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
-#define I915_PERF_ACTUAL_FREQUENCY 32
-#define I915_PERF_REQUESTED_FREQUENCY 33
-#define I915_PERF_ENERGY 34
-#define I915_PERF_INTERRUPTS 35
+#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
+#define I915_PMU_REQUESTED_FREQUENCY __I915_PMU_OTHER(1)
+#define I915_PMU_INTERRUPTS __I915_PMU_OTHER(2)
+#define I915_PMU_RC6_RESIDENCY __I915_PMU_OTHER(3)
+#define I915_PMU_RC6p_RESIDENCY __I915_PMU_OTHER(4)
+#define I915_PMU_RC6pp_RESIDENCY __I915_PMU_OTHER(5)
 
-#define I915_PERF_RC6_RESIDENCY 40
-#define I915_PERF_RC6p_RESIDENCY 41
-#define I915_PERF_RC6pp_RESIDENCY 42
+#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
 
 static inline int
 perf_event_open(struct perf_event_attr *attr,
diff --git a/overlay/gem-interrupts.c b/overlay/gem-interrupts.c
index 3eda24f4d7eb..add4a9dfd725 100644
--- a/overlay/gem-interrupts.c
+++ b/overlay/gem-interrupts.c
@@ -113,7 +113,7 @@ int gem_interrupts_init(struct gem_interrupts *irqs)
 {
 	memset(irqs, 0, sizeof(*irqs));
 
-	irqs->fd = perf_i915_open(I915_PERF_INTERRUPTS);
+	irqs->fd = perf_i915_open(I915_PMU_INTERRUPTS);
 	if (irqs->fd < 0 && interrupts_read() < 0)
 		irqs->error = ENODEV;
 
diff --git a/overlay/gpu-freq.c b/overlay/gpu-freq.c
index 76c5ed9acfd1..2a8e02f68ce5 100644
--- a/overlay/gpu-freq.c
+++ b/overlay/gpu-freq.c
@@ -37,8 +37,8 @@ static int perf_open(void)
 {
 	int fd;
 
-	fd = perf_i915_open_group(I915_PERF_ACTUAL_FREQUENCY, -1);
-	if (perf_i915_open_group(I915_PERF_REQUESTED_FREQUENCY, fd) < 0) {
+	fd = perf_i915_open_group(I915_PMU_ACTUAL_FREQUENCY, -1);
+	if (perf_i915_open_group(I915_PMU_REQUESTED_FREQUENCY, fd) < 0) {
 		close(fd);
 		fd = -1;
 	}
@@ -176,8 +176,8 @@ int gpu_freq_update(struct gpu_freq *gf)
 			return EAGAIN;
 		}
 
-		gf->current = (s->act - d->act) / d_time;
-		gf->request = (s->req - d->req) / d_time;
+		gf->current = (s->act - d->act) * 1000000000 / d_time;
+		gf->request = (s->req - d->req) * 1000000000 / d_time;
 	}
 
 	return 0;
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 812f47d5aced..61b8f62fd78c 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -43,49 +43,57 @@
 #define RING_WAIT (1<<11)
 #define RING_WAIT_SEMAPHORE (1<<10)
 
-#define __I915_PERF_RING(n) (4*n)
-#define I915_PERF_RING_BUSY(n) (__I915_PERF_RING(n) + 0)
-#define I915_PERF_RING_WAIT(n) (__I915_PERF_RING(n) + 1)
-#define I915_PERF_RING_SEMA(n) (__I915_PERF_RING(n) + 2)
-
 static int perf_init(struct gpu_top *gt)
 {
-	const char *names[] = {
-		"RCS",
-		"BCS",
-		"VCS0",
-		"VCS1",
-		NULL,
+	struct engine_desc {
+		unsigned class, inst;
+		const char *name;
+	} *d, engines[] = {
+		{ I915_ENGINE_CLASS_RENDER, 0, "rcs0" },
+		{ I915_ENGINE_CLASS_COPY, 0, "bcs0" },
+		{ I915_ENGINE_CLASS_VIDEO, 0, "vcs0" },
+		{ I915_ENGINE_CLASS_VIDEO, 1, "vcs1" },
+		{ I915_ENGINE_CLASS_VIDEO_ENHANCE, 0, "vecs0" },
+		{ 0, 0, NULL }
 	};
-	int n;
 
-	gt->fd = perf_i915_open_group(I915_PERF_RING_BUSY(0), -1);
+	d = &engines[0];
+
+	gt->fd = perf_i915_open_group(I915_PMU_ENGINE_BUSY(d->class, d->inst),
+				      -1);
 	if (gt->fd < 0)
 		return -1;
 
-	if (perf_i915_open_group(I915_PERF_RING_WAIT(0), gt->fd) >= 0)
+	if (perf_i915_open_group(I915_PMU_ENGINE_WAIT(d->class, d->inst),
+				 gt->fd) >= 0)
 		gt->have_wait = 1;
 
-	if (perf_i915_open_group(I915_PERF_RING_SEMA(0), gt->fd) >= 0)
+	if (perf_i915_open_group(I915_PMU_ENGINE_SEMA(d->class, d->inst),
+				 gt->fd) >= 0)
 		gt->have_sema = 1;
 
-	gt->ring[0].name = names[0];
+	gt->ring[0].name = d->name;
 	gt->num_rings = 1;
 
-	for (n = 1; names[n]; n++) {
-		if (perf_i915_open_group(I915_PERF_RING_BUSY(n), gt->fd) >= 0) {
-			if (gt->have_wait &&
-			    perf_i915_open_group(I915_PERF_RING_WAIT(n),
-						 gt->fd) < 0)
-				return -1;
-
-			if (gt->have_sema &&
-			    perf_i915_open_group(I915_PERF_RING_SEMA(n),
-						 gt->fd) < 0)
-				return -1;
-
-			gt->ring[gt->num_rings++].name = names[n];
-		}
+	for (d++; d->name; d++) {
+		if (perf_i915_open_group(I915_PMU_ENGINE_BUSY(d->class,
+							      d->inst),
+					 gt->fd) < 0)
+			continue;
+
+		if (gt->have_wait &&
+		    perf_i915_open_group(I915_PMU_ENGINE_WAIT(d->class,
+							      d->inst),
+					 gt->fd) < 0)
+			return -1;
+
+		if (gt->have_sema &&
+		    perf_i915_open_group(I915_PMU_ENGINE_SEMA(d->class,
+							      d->inst),
+					 gt->fd) < 0)
+			return -1;
+
+		gt->ring[gt->num_rings++].name = d->name;
 	}
 
 	return 0;
diff --git a/overlay/power.c b/overlay/power.c
index dd4aec6bffd9..805f4ca7805c 100644
--- a/overlay/power.c
+++ b/overlay/power.c
@@ -45,9 +45,7 @@ int power_init(struct power *power)
 
 	memset(power, 0, sizeof(*power));
 
-	power->fd = perf_i915_open(I915_PERF_ENERGY);
-	if (power->fd != -1)
-		return 0;
+	power->fd = -1;
 
 	sprintf(buf, "%s/i915_energy_uJ", debugfs_dri_path);
 	fd = open(buf, 0);
diff --git a/overlay/rc6.c b/overlay/rc6.c
index 46c975a557ff..57abea41b3c6 100644
--- a/overlay/rc6.c
+++ b/overlay/rc6.c
@@ -43,15 +43,15 @@ static int perf_open(unsigned *flags)
 {
 	int fd;
 
-	fd = perf_i915_open_group(I915_PERF_RC6_RESIDENCY, -1);
+	fd = perf_i915_open_group(I915_PMU_RC6_RESIDENCY, -1);
 	if (fd < 0)
 		return -1;
 
 	*flags |= RC6;
-	if (perf_i915_open_group(I915_PERF_RC6p_RESIDENCY, fd) >= 0)
+	if (perf_i915_open_group(I915_PMU_RC6p_RESIDENCY, fd) >= 0)
 		*flags |= RC6p;
 
-	if (perf_i915_open_group(I915_PERF_RC6pp_RESIDENCY, fd) >= 0)
+	if (perf_i915_open_group(I915_PMU_RC6pp_RESIDENCY, fd) >= 0)
 		*flags |= RC6pp;
 
 	return fd;
@@ -132,11 +132,11 @@ int rc6_update(struct rc6 *rc6)
 
 		len = 2;
 		if (rc6->flags & RC6)
-			s->rc6_residency = data[len++];
+			s->rc6_residency = data[len++] / 1000000;
 		if (rc6->flags & RC6p)
-			s->rc6p_residency = data[len++];
+			s->rc6p_residency = data[len++] / 1000000;
 		if (rc6->flags & RC6pp)
-			s->rc6pp_residency = data[len++];
+			s->rc6pp_residency = data[len++] / 1000000;
 	}
 
 	if (rc6->count == 1)
@@ -149,14 +149,14 @@ int rc6_update(struct rc6 *rc6)
 	}
 
 	d_rc6 = s->rc6_residency - d->rc6_residency;
-	rc6->rc6 = (100 * d_rc6 + d_time/2) / d_time;
+	rc6->rc6 = 100 * d_rc6 / d_time;
 
 	d_rc6p = s->rc6p_residency - d->rc6p_residency;
-	rc6->rc6p = (100 * d_rc6p + d_time/2) / d_time;
+	rc6->rc6p = 100 * d_rc6p / d_time;
 
 	d_rc6pp = s->rc6pp_residency - d->rc6pp_residency;
-	rc6->rc6pp = (100 * d_rc6pp + d_time/2) / d_time;
+	rc6->rc6pp = 100 * d_rc6pp / d_time;
 
-	rc6->rc6_combined = (100 * (d_rc6 + d_rc6p + d_rc6pp) + d_time/2) / d_time;
+	rc6->rc6_combined = 100 * (d_rc6 + d_rc6p + d_rc6pp) / d_time;
 	return 0;
 }
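
Note for readers checking the new config layout: the following is a minimal sketch, not code from this patch. It assumes only the field widths given in the lib/igt_perf.h hunk (4 sample bits, 8 instance bits, class above those). The helper names pmu_engine_config(), pmu_other_config() and the pmu_config_*() decoders are hypothetical and exist neither in i-g-t nor in the kernel uAPI. They only spell out what __I915_PMU_ENGINE() and __I915_PMU_OTHER() evaluate to.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the lib/igt_perf.h hunk of this patch. */
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
#define I915_PMU_CLASS_SHIFT \
	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)

/* Hypothetical helper: pack class/instance/sample like __I915_PMU_ENGINE(). */
static inline uint64_t pmu_engine_config(unsigned int engine_class,
					 unsigned int instance,
					 unsigned int sample)
{
	/* Reject values that would spill into a neighbouring field. */
	assert(sample <= I915_PMU_SAMPLE_MASK);
	assert(instance < (1u << I915_PMU_SAMPLE_INSTANCE_BITS));

	return (uint64_t)engine_class << I915_PMU_CLASS_SHIFT |
	       (uint64_t)instance << I915_PMU_SAMPLE_BITS |
	       sample;
}

/* Hypothetical helper: non-engine counters start one past the engine range. */
static inline uint64_t pmu_other_config(unsigned int x)
{
	return pmu_engine_config(0xff, 0xff, 0xf) + 1 + x;
}

/* Hypothetical decoders, the inverse of pmu_engine_config(). */
static inline unsigned int pmu_config_class(uint64_t config)
{
	return (unsigned int)(config >> I915_PMU_CLASS_SHIFT);
}

static inline unsigned int pmu_config_instance(uint64_t config)
{
	return (unsigned int)(config >> I915_PMU_SAMPLE_BITS) &
	       ((1u << I915_PMU_SAMPLE_INSTANCE_BITS) - 1);
}

static inline unsigned int pmu_config_sample(uint64_t config)
{
	return (unsigned int)config & I915_PMU_SAMPLE_MASK;
}
```

With this layout, busy on vcs1 (class 3, instance 1, sample 0) packs to 0x3010, and the first non-engine counter, I915_PMU_ACTUAL_FREQUENCY, evaluates to 0x100000, so the two ranges cannot collide.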