From patchwork Wed Oct 3 12:03:59 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10624723
From: Tvrtko Ursulin
To: Intel-gfx@lists.freedesktop.org
Date: Wed, 3 Oct 2018 13:03:59 +0100
Message-Id: <20181003120406.6784-7-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181003120406.6784-1-tvrtko.ursulin@linux.intel.com>
References: <20181003120406.6784-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [RFC 06/13] drm/i915/pmu: Add running counter

From: Tvrtko Ursulin

We add a PMU counter to expose the number of requests currently executing
on the GPU.

This is useful to analyze the overall load of the system.

v2:
 * Rebase.
 * Drop floating point constant. (Chris Wilson)

v3:
 * Change scale to 1024 for faster arithmetic. (Chris Wilson)

v4:
 * Refactored for timer period accounting.

v5:
 * Avoid 64-bit division. (Chris Wilson)

v6:
 * Do fewer divisions by accumulating in qd.ns units. (Chris Wilson)
 * Change counter scale to avoid multiplication in readout and increase
   counter headroom.

Signed-off-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_pmu.c         | 20 ++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 include/uapi/drm/i915_drm.h             |  5 +++++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index b01a2e66d33a..7435bce23b8f 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -18,7 +18,8 @@
 	 BIT(I915_SAMPLE_WAIT) | \
 	 BIT(I915_SAMPLE_SEMA) | \
 	 BIT(I915_SAMPLE_QUEUED) | \
-	 BIT(I915_SAMPLE_RUNNABLE))
+	 BIT(I915_SAMPLE_RUNNABLE) | \
+	 BIT(I915_SAMPLE_RUNNING))
 
 #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
 
@@ -223,6 +224,11 @@ engines_sample(struct drm_i915_private *dev_priv, unsigned int period_ns)
 			add_sample_mult(&engine->pmu.sample[I915_SAMPLE_RUNNABLE],
 					engine->request_stats.runnable,
 					period_ns);
+
+		if (engine->pmu.enable & BIT(I915_SAMPLE_RUNNING))
+			add_sample_mult(&engine->pmu.sample[I915_SAMPLE_RUNNING],
+					last_seqno - current_seqno,
+					period_ns);
 	}
 
 	if (fw)
@@ -338,6 +344,7 @@ engine_event_status(struct intel_engine_cs *engine,
 	case I915_SAMPLE_WAIT:
 	case I915_SAMPLE_QUEUED:
 	case I915_SAMPLE_RUNNABLE:
+	case I915_SAMPLE_RUNNING:
 		break;
 	case I915_SAMPLE_SEMA:
 		if (INTEL_GEN(engine->i915) < 6)
@@ -557,11 +564,14 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 			val = engine->pmu.sample[sample].cur;
 
 			if (sample == I915_SAMPLE_QUEUED ||
-			    sample == I915_SAMPLE_RUNNABLE) {
+			    sample == I915_SAMPLE_RUNNABLE ||
+			    sample == I915_SAMPLE_RUNNING) {
 				BUILD_BUG_ON(NSEC_PER_SEC %
 					     I915_SAMPLE_QUEUED_DIVISOR);
 				BUILD_BUG_ON(I915_SAMPLE_QUEUED_DIVISOR !=
 					     I915_SAMPLE_RUNNABLE_DIVISOR);
+				BUILD_BUG_ON(I915_SAMPLE_QUEUED_DIVISOR !=
+					     I915_SAMPLE_RUNNING_DIVISOR);
 
 				/* to qd */
 				val = div_u64(val, NSEC_PER_SEC /
@@ -863,6 +873,7 @@ add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
 /* No brackets or quotes below please. */
 #define I915_SAMPLE_QUEUED_SCALE 0.001
 #define I915_SAMPLE_RUNNABLE_SCALE 0.001
+#define I915_SAMPLE_RUNNING_SCALE 0.001
 
 static struct attribute **
 create_event_attributes(struct drm_i915_private *i915)
@@ -890,6 +901,8 @@ create_event_attributes(struct drm_i915_private *i915)
 				     __stringify(I915_SAMPLE_QUEUED_SCALE)),
 		__engine_event_scale(I915_SAMPLE_RUNNABLE, "runnable",
 				     __stringify(I915_SAMPLE_RUNNABLE_SCALE)),
+		__engine_event_scale(I915_SAMPLE_RUNNING, "running",
+				     __stringify(I915_SAMPLE_RUNNING_SCALE)),
 	};
 	unsigned int count = 0;
 	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
@@ -905,6 +918,9 @@ create_event_attributes(struct drm_i915_private *i915)
 	BUILD_BUG_ON(I915_SAMPLE_RUNNABLE_DIVISOR !=
 		     (1 / I915_SAMPLE_RUNNABLE_SCALE));
 
+	BUILD_BUG_ON(I915_SAMPLE_RUNNING_DIVISOR !=
+		     (1 / I915_SAMPLE_RUNNING_SCALE));
+
 	/* Count how many counters we will be exposing. */
 	for (i = 0; i < ARRAY_SIZE(events); i++) {
 		if (!config_status(i915, events[i].config))
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 50914b0ed826..8b53ed069063 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -455,7 +455,7 @@ struct intel_engine_cs {
 		 *
 		 * Our internal timer stores the current counters in this field.
 		 */
-#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_RUNNABLE + 1)
+#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_RUNNING + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
 	} pmu;
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 5bb7f53f1a3d..10279c0ef94c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -113,11 +113,13 @@ enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_SEMA = 2,
 	I915_SAMPLE_QUEUED = 3,
 	I915_SAMPLE_RUNNABLE = 4,
+	I915_SAMPLE_RUNNING = 5,
 };
 
 /* Divide counter value by divisor to get the real value. */
 #define I915_SAMPLE_QUEUED_DIVISOR (1000)
 #define I915_SAMPLE_RUNNABLE_DIVISOR (1000)
+#define I915_SAMPLE_RUNNING_DIVISOR (1000)
 
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
@@ -145,6 +147,9 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_RUNNABLE(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
 
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
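
Note for user-space consumers of the new counter: the readout arithmetic
implied by I915_SAMPLE_RUNNING_DIVISOR can be sketched as below. This is a
minimal sketch and not part of the patch. avg_running() and its parameters
are hypothetical names, SAMPLE_RUNNING_DIVISOR is a local copy of the uAPI
constant, and the sketch assumes the caller has already obtained two raw
counter values (for example through perf_event_open(2)) together with the
wall-clock time elapsed between the two reads.

```c
#include <stdint.h>

/* Local mirror of I915_SAMPLE_RUNNING_DIVISOR from include/uapi/drm/i915_drm.h. */
#define SAMPLE_RUNNING_DIVISOR 1000

/*
 * Hypothetical helper, not part of the kernel or of any library.
 *
 * The raw counter grows by (running requests * seconds * divisor), so the
 * delta between two reads, divided by the divisor and by the elapsed time
 * in seconds, gives the average number of requests executing on the engine
 * over that interval.
 */
static inline double avg_running(uint64_t prev, uint64_t cur,
				 uint64_t elapsed_ns)
{
	if (elapsed_ns == 0)
		return 0.0; /* no interval, nothing to average over */

	return (double)(cur - prev) / SAMPLE_RUNNING_DIVISOR /
	       ((double)elapsed_ns / 1e9);
}
```

With these assumptions, a raw delta of 2000 over one second corresponds to
an average of two requests running on the engine during that second.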