From patchwork Tue Feb 6 18:33:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10204711 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 143BB602D8 for ; Wed, 7 Feb 2018 07:42:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 00E2F28417 for ; Wed, 7 Feb 2018 07:42:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E9A34285EE; Wed, 7 Feb 2018 07:42:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4B72B28536 for ; Wed, 7 Feb 2018 07:42:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B3A246E314; Wed, 7 Feb 2018 07:42:31 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7F6D46E0D9 for ; Tue, 6 Feb 2018 18:33:22 +0000 (UTC) Received: by mail-wm0-x243.google.com with SMTP id i186so5578081wmi.4 for ; Tue, 06 Feb 2018 10:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ursulin-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6IG/TTW+f+mYwKETWiYcezEbdCRtOd+kli/2aiDcBf0=; b=noCXvJw3WeuaUZ+rVUk3nJC4k56oxgHcCYMO5oBBe1o7vent95VCcUvHkBZtFtikup VFnecobYwSVnJkubRJc+e0k5jURiL6rqtR9M6IH4GWZ/Tx14pKPPwgUq2JmHEjeZkDgY cZubxUm4JoFtxwClW0OsOku0A/Ra1VUSQEoCMxj3jwGDkalTek9/m3/TksUC0SYEJ9Qg BmWBkV8IXT1r/Y7Dz+kwQwmk5fKks1cAcDoVKrQuc49NbnFA3oRhZlgA/7qj/uxCxJyy 0n61Ggl1QMRnP9m4o/hHesq/fy1NNXGhy9YraFh2s+oZ4oy7SZoYCnwuRYqiGMQKFkbF BZ2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6IG/TTW+f+mYwKETWiYcezEbdCRtOd+kli/2aiDcBf0=; b=h0FVgQZo7psSNTpSG+LDHYOE9jICkruso7waWTgaEUgm8uXl9ONw99+yi6ahsAAj18 OSQsjGbfmIzTFv6yVuHe/GhSt2M+NfS7J3kyGMw4e5jISKIfDmDOjVUW0Bf3SwTjINaS P/6OWMnZ7LtbotAWUfb6zJSC8DcwwnFgmng6EUAf6aVu2/i52g+unBFQhMfarX+Dj67K slnNXMVepZZsEvu1y4AYEFzl+6a9leEJeQSXk43ueCyGrSas3l3TWgO+5MJkkSglnHvd H6KOPlxi30cncOm75gYx+sS2PjnBqH4FVXKqbgww8otqiJymVFS4uMHBbb1N3rHEvS1Z NJwQ== X-Gm-Message-State: APf1xPATfYMCdN2vYiWdqyFAs4RLXwhenrVyRJZ/ZBR0RVe9WmXE60Ty zteynfL2YrYL8TOFElYUuHlRaA== X-Google-Smtp-Source: AH8x225QLky4a1AeQBrJmlojG3KzMYApWsJ6eJ5b8kfivNSu7vMcqCS57Prs/XsptC9YnwpeKZBqaQ== X-Received: by 10.28.0.212 with SMTP id 203mr2437700wma.128.1517942001055; Tue, 06 Feb 2018 10:33:21 -0800 (PST) Received: from localhost.localdomain ([95.146.144.186]) by smtp.gmail.com with ESMTPSA id 89sm14127688wrq.16.2018.02.06.10.33.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 06 Feb 2018 10:33:20 -0800 (PST) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Subject: [PATCH v2] drm/i915/pmu: Fix sleep under atomic in RC6 readout Date: Tue, 6 Feb 2018 18:33:11 +0000 Message-Id: <20180206183311.17924-1-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20180206161058.nuc2cfilodtqk32a@ideak-desk.fi.intel.com> References: <20180206161058.nuc2cfilodtqk32a@ideak-desk.fi.intel.com> X-Mailman-Approved-At: Wed, 07 Feb 2018 07:42:28 +0000 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tvrtko Ursulin , David Airlie , dri-devel@lists.freedesktop.org, Rodrigo Vivi , tursulin@ursulin.net, intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin We are not allowed to call intel_runtime_pm_get from the PMU counter read callback since the former can sleep, and the latter is running under IRQ context. To workaround this, we record the last known RC6 and while runtime suspended estimate its increase by querying the runtime PM core timestamps. Downside of this approach is that we can temporarily lose a chunk of RC6 time, from the last PMU read-out to runtime suspend entry, but that will eventually catch up, once device comes back online and in the presence of PMU queries. Also, we have to be careful not to overshoot the RC6 estimate, so once resumed after a period of approximation, we only update the counter once it catches up. With the observation that RC6 is increasing while the device is suspended, this should not pose a problem and can only cause slight inaccuracies due clock base differences. v2: Simplify by estimating on top of PM core counters. (Imre) Signed-off-by: Tvrtko Ursulin Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943 Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics") Testcase: igt/perf_pmu/rc6-runtime-pm Cc: Tvrtko Ursulin Cc: Chris Wilson Cc: Imre Deak Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: David Airlie Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++++++++++++++++++------- drivers/gpu/drm/i915/i915_pmu.h | 6 +++ 2 files changed, 84 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 1c440460255d..bfc402d47609 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -415,7 +415,81 @@ static int i915_pmu_event_init(struct perf_event *event) return 0; } -static u64 __i915_pmu_event_read(struct perf_event *event) +static u64 get_rc6(struct drm_i915_private *i915, bool locked) +{ + unsigned long flags; + u64 val; + + if (intel_runtime_pm_get_if_in_use(i915)) { + val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ? + VLV_GT_RENDER_RC6 : + GEN6_GT_GFX_RC6); + + if (HAS_RC6p(i915)) + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p); + + if (HAS_RC6pp(i915)) + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp); + + intel_runtime_pm_put(i915); + + /* + * If we are coming back from being runtime suspended we must + * be careful not to report a larger value than returned + * previously. + */ + + if (!locked) + spin_lock_irqsave(&i915->pmu.lock, flags); + + if (val >= i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) { + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0; + i915->pmu.sample[__I915_SAMPLE_RC6].cur = val; + } else { + val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur; + } + + if (!locked) + spin_unlock_irqrestore(&i915->pmu.lock, flags); + } else { + struct pci_dev *pdev = i915->drm.pdev; + struct device *kdev = &pdev->dev; + unsigned long flags2; + + /* + * We are runtime suspended. + * + * Report the delta from when the device was suspended to now, + * on top of the last known real value, as the approximated RC6 + * counter value. + */ + if (!locked) + spin_lock_irqsave(&i915->pmu.lock, flags); + + spin_lock_irqsave(&kdev->power.lock, flags2); + + if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) + i915->pmu.suspended_jiffies_last = + kdev->power.suspended_jiffies; + + val = kdev->power.suspended_jiffies - + i915->pmu.suspended_jiffies_last; + val += jiffies - kdev->power.accounting_timestamp; + + spin_unlock_irqrestore(&kdev->power.lock, flags2); + + val = jiffies_to_nsecs(val); + val += i915->pmu.sample[__I915_SAMPLE_RC6].cur; + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; + + if (!locked) + spin_unlock_irqrestore(&i915->pmu.lock, flags); + } + + return val; +} + +static u64 __i915_pmu_event_read(struct perf_event *event, bool locked) { struct drm_i915_private *i915 = container_of(event->pmu, typeof(*i915), pmu.base); @@ -453,18 +527,7 @@ static u64 __i915_pmu_event_read(struct perf_event *event) val = count_interrupts(i915); break; case I915_PMU_RC6_RESIDENCY: - intel_runtime_pm_get(i915); - val = intel_rc6_residency_ns(i915, - IS_VALLEYVIEW(i915) ? - VLV_GT_RENDER_RC6 : - GEN6_GT_GFX_RC6); - if (HAS_RC6p(i915)) - val += intel_rc6_residency_ns(i915, - GEN6_GT_GFX_RC6p); - if (HAS_RC6pp(i915)) - val += intel_rc6_residency_ns(i915, - GEN6_GT_GFX_RC6pp); - intel_runtime_pm_put(i915); + val = get_rc6(i915, locked); break; } } @@ -479,7 +542,7 @@ static void i915_pmu_event_read(struct perf_event *event) again: prev = local64_read(&hwc->prev_count); - new = __i915_pmu_event_read(event); + new = __i915_pmu_event_read(event, false); if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev) goto again; @@ -534,7 +597,7 @@ static void i915_pmu_enable(struct perf_event *event) * for all listeners. Even when the event was already enabled and has * an existing non-zero value. */ - local64_set(&event->hw.prev_count, __i915_pmu_event_read(event)); + local64_set(&event->hw.prev_count, __i915_pmu_event_read(event, true)); spin_unlock_irqrestore(&i915->pmu.lock, flags); } diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index 5a2e013a56bb..aa1b1a987ea1 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -27,6 +27,8 @@ enum { __I915_SAMPLE_FREQ_ACT = 0, __I915_SAMPLE_FREQ_REQ, + __I915_SAMPLE_RC6, + __I915_SAMPLE_RC6_ESTIMATED, __I915_NUM_PMU_SAMPLERS }; @@ -94,6 +96,10 @@ struct i915_pmu { * struct intel_engine_cs. */ struct i915_pmu_sample sample[__I915_NUM_PMU_SAMPLERS]; + /** + * @suspended_jiffies_last: Cached suspend time from PM core. + */ + unsigned long suspended_jiffies_last; /** * @i915_attr: Memory block holding device attributes. */