From patchwork Tue Aug 6 09:05:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11078497 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1C60D14DB for ; Tue, 6 Aug 2019 09:06:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0B6A226AE3 for ; Tue, 6 Aug 2019 09:06:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F348D28927; Tue, 6 Aug 2019 09:06:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 45F4826AE3 for ; Tue, 6 Aug 2019 09:06:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BD32A6E34E; Tue, 6 Aug 2019 09:06:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 31F8C6E343 for ; Tue, 6 Aug 2019 09:06:03 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17901461-1500050 for multiple; Tue, 06 Aug 2019 10:05:37 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 6 Aug 2019 10:05:17 +0100 Message-Id: <20190806090535.14807-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.23.0.rc1 In-Reply-To: <20190806090535.14807-1-chris@chris-wilson.co.uk> References: <20190806090535.14807-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/22] drm/i915/pmu: Use GT parked for estimating RC6 while asleep X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP As we track when we put the GT device to sleep upon idling, we can use that callback to sample the current rc6 counters and record the timestamp for estimating samples after that point while asleep. v2: Stick to using ktime_t v3: Track user_wakerefs that interfere with the new intel_gt_pm_wait_for_idle Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010 Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gem/i915_gem_pm.c | 19 ++++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 1 + drivers/gpu/drm/i915/i915_debugfs.c | 30 +++--- drivers/gpu/drm/i915/i915_pmu.c | 120 +++++++++++------------ drivers/gpu/drm/i915/i915_pmu.h | 4 +- 5 files changed, 96 insertions(+), 78 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index 17e3618241c5..4b866db46d52 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -141,6 +141,21 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915) return switch_to_kernel_context_sync(&i915->gt); } +static void user_forcewake(struct intel_gt *gt, bool suspend) +{ + int count = atomic_read(>->user_wakeref); + + if (likely(!count)) + return; + + intel_gt_pm_get(gt); + if (suspend) + atomic_sub(count, >->wakeref.count); + else + atomic_add(count, >->wakeref.count); + intel_gt_pm_put(gt); +} + void i915_gem_suspend(struct drm_i915_private *i915) { GEM_TRACE("\n"); @@ -148,6 +163,8 @@ void i915_gem_suspend(struct drm_i915_private *i915) intel_wakeref_auto(&i915->ggtt.userfault_wakeref, 0); flush_workqueue(i915->wq); + user_forcewake(&i915->gt, true); + mutex_lock(&i915->drm.struct_mutex); /* @@ -262,6 +279,8 @@ void i915_gem_resume(struct drm_i915_private *i915) if (!i915_gem_load_power_context(i915)) goto err_wedged; + user_forcewake(&i915->gt, false); + out_unlock: intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL); mutex_unlock(&i915->drm.struct_mutex); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 5fd11e361d03..9f36eac9e23b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -51,6 +51,7 @@ struct intel_gt { struct list_head active_rings; struct intel_wakeref wakeref; + atomic_t user_wakeref; struct list_head closed_vma; spinlock_t closed_lock; /* guards the list of closed_vma */ diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 68a6b002c9ba..6659b616344a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3666,8 +3666,12 @@ i915_drop_caches_set(void *data, u64 val) mutex_unlock(&i915->drm.struct_mutex); - if (ret == 0 && val & DROP_IDLE) - ret = intel_gt_pm_wait_for_idle(&i915->gt); + if (ret == 0 && val & DROP_IDLE) { + if (atomic_read(&i915->gt.user_wakeref)) + ret = -EBUSY; + else + ret = intel_gt_pm_wait_for_idle(&i915->gt); + } } if (val & DROP_RESET_ACTIVE && intel_gt_terminally_wedged(&i915->gt)) @@ -3995,13 +3999,12 @@ static int i915_sseu_status(struct seq_file *m, void *unused) static int i915_forcewake_open(struct inode *inode, struct file *file) { struct drm_i915_private *i915 = inode->i_private; + struct intel_gt *gt = &i915->gt; - if (INTEL_GEN(i915) < 6) - return 0; - - file->private_data = - (void *)(uintptr_t)intel_runtime_pm_get(&i915->runtime_pm); - intel_uncore_forcewake_user_get(&i915->uncore); + atomic_inc(>->user_wakeref); + intel_gt_pm_get(gt); + if (INTEL_GEN(i915) >= 6) + intel_uncore_forcewake_user_get(gt->uncore); return 0; } @@ -4009,13 +4012,12 @@ static int i915_forcewake_open(struct inode *inode, struct file *file) static int i915_forcewake_release(struct inode *inode, struct file *file) { struct drm_i915_private *i915 = inode->i_private; + struct intel_gt *gt = &i915->gt; - if (INTEL_GEN(i915) < 6) - return 0; - - intel_uncore_forcewake_user_put(&i915->uncore); - intel_runtime_pm_put(&i915->runtime_pm, - (intel_wakeref_t)(uintptr_t)file->private_data); + if (INTEL_GEN(i915) >= 6) + intel_uncore_forcewake_user_put(&i915->uncore); + intel_gt_pm_put(gt); + atomic_dec(>->user_wakeref); return 0; } diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index c7ee0ab180e8..2faefe9e630e 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -116,19 +116,51 @@ static bool pmu_needs_timer(struct i915_pmu *pmu, bool gpu_active) return enable; } +static u64 __get_rc6(struct intel_gt *gt) +{ + struct drm_i915_private *i915 = gt->i915; + u64 val; + + val = intel_rc6_residency_ns(i915, + IS_VALLEYVIEW(i915) ? + VLV_GT_RENDER_RC6 : + GEN6_GT_GFX_RC6); + + if (HAS_RC6p(i915)) + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p); + + if (HAS_RC6pp(i915)) + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp); + + return val; +} + void i915_pmu_gt_parked(struct drm_i915_private *i915) { struct i915_pmu *pmu = &i915->pmu; + u64 val; if (!pmu->base.event_init) return; spin_lock_irq(&pmu->lock); + + val = 0; + if (pmu->sample[__I915_SAMPLE_RC6].cur) + val = __get_rc6(&i915->gt); + + if (val >= pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur) { + pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0; + pmu->sample[__I915_SAMPLE_RC6].cur = val; + } + pmu->sleep_last = ktime_get(); + /* * Signal sampling timer to stop if only engine events are enabled and * GPU went idle. */ pmu->timer_enabled = pmu_needs_timer(pmu, false); + spin_unlock_irq(&pmu->lock); } @@ -143,6 +175,11 @@ static void __i915_pmu_maybe_start_timer(struct i915_pmu *pmu) } } +static inline s64 ktime_since(const ktime_t kt) +{ + return ktime_to_ns(ktime_sub(ktime_get(), kt)); +} + void i915_pmu_gt_unparked(struct drm_i915_private *i915) { struct i915_pmu *pmu = &i915->pmu; @@ -151,10 +188,22 @@ void i915_pmu_gt_unparked(struct drm_i915_private *i915) return; spin_lock_irq(&pmu->lock); + /* * Re-enable sampling timer when GPU goes active. */ __i915_pmu_maybe_start_timer(pmu); + + /* Estimate how long we slept and accumulate that into rc6 counters */ + if (pmu->sample[__I915_SAMPLE_RC6].cur) { + u64 val; + + val = ktime_since(pmu->sleep_last); + val += pmu->sample[__I915_SAMPLE_RC6].cur; + + pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; + } + spin_unlock_irq(&pmu->lock); } @@ -426,39 +475,18 @@ static int i915_pmu_event_init(struct perf_event *event) return 0; } -static u64 __get_rc6(struct intel_gt *gt) -{ - struct drm_i915_private *i915 = gt->i915; - u64 val; - - val = intel_rc6_residency_ns(i915, - IS_VALLEYVIEW(i915) ? - VLV_GT_RENDER_RC6 : - GEN6_GT_GFX_RC6); - - if (HAS_RC6p(i915)) - val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p); - - if (HAS_RC6pp(i915)) - val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp); - - return val; -} - static u64 get_rc6(struct intel_gt *gt) { -#if IS_ENABLED(CONFIG_PM) struct drm_i915_private *i915 = gt->i915; - struct intel_runtime_pm *rpm = &i915->runtime_pm; struct i915_pmu *pmu = &i915->pmu; - intel_wakeref_t wakeref; unsigned long flags; u64 val; - wakeref = intel_runtime_pm_get_if_in_use(rpm); - if (wakeref) { + spin_lock_irqsave(&pmu->lock, flags); + + if (intel_gt_pm_get_if_awake(gt)) { val = __get_rc6(gt); - intel_runtime_pm_put(rpm, wakeref); + intel_gt_pm_put(gt); /* * If we are coming back from being runtime suspended we must @@ -466,19 +494,13 @@ static u64 get_rc6(struct intel_gt *gt) * previously. */ - spin_lock_irqsave(&pmu->lock, flags); - if (val >= pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur) { pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0; pmu->sample[__I915_SAMPLE_RC6].cur = val; } else { val = pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur; } - - spin_unlock_irqrestore(&pmu->lock, flags); } else { - struct device *kdev = rpm->kdev; - /* * We are runtime suspended. * @@ -486,42 +508,16 @@ static u64 get_rc6(struct intel_gt *gt) * on top of the last known real value, as the approximated RC6 * counter value. */ - spin_lock_irqsave(&pmu->lock, flags); - /* - * After the above branch intel_runtime_pm_get_if_in_use failed - * to get the runtime PM reference we cannot assume we are in - * runtime suspend since we can either: a) race with coming out - * of it before we took the power.lock, or b) there are other - * states than suspended which can bring us here. - * - * We need to double-check that we are indeed currently runtime - * suspended and if not we cannot do better than report the last - * known RC6 value. - */ - if (pm_runtime_status_suspended(kdev)) { - val = pm_runtime_suspended_time(kdev); + val = ktime_since(pmu->sleep_last); + val += pmu->sample[__I915_SAMPLE_RC6].cur; - if (!pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur) - pmu->suspended_time_last = val; - - val -= pmu->suspended_time_last; - val += pmu->sample[__I915_SAMPLE_RC6].cur; - - pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; - } else if (pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur) { - val = pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur; - } else { - val = pmu->sample[__I915_SAMPLE_RC6].cur; - } - - spin_unlock_irqrestore(&pmu->lock, flags); + pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; } + spin_unlock_irqrestore(&pmu->lock, flags); + return val; -#else - return __get_rc6(gt); -#endif } static u64 __i915_pmu_event_read(struct perf_event *event) diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index 4fc4f2478301..067dbbf3bdff 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -97,9 +97,9 @@ struct i915_pmu { */ struct i915_pmu_sample sample[__I915_NUM_PMU_SAMPLERS]; /** - * @suspended_time_last: Cached suspend time from PM core. + * @sleep_last: Last time GT parked for RC6 estimation. */ - u64 suspended_time_last; + ktime_t sleep_last; /** * @i915_attr: Memory block holding device attributes. */