From patchwork Tue Feb 6 14:31:07 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10204719
From: Tvrtko Ursulin
To: Intel-gfx@lists.freedesktop.org
Cc: Tvrtko Ursulin, David Airlie, dri-devel@lists.freedesktop.org,
 Rodrigo Vivi, tursulin@ursulin.net, intel-gfx@lists.freedesktop.org
Subject: [PATCH] drm/i915/pmu: Fix sleep under atomic in RC6 readout
Date: Tue, 6 Feb 2018 14:31:07 +0000
Message-Id: <20180206143107.25786-1-tvrtko.ursulin@linux.intel.com>

From: Tvrtko Ursulin

We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.

To work around this, we start our timer when we detect that we have failed
to obtain a runtime PM reference during read, and approximate the growing
RC6 counter from the timer.

Once the timer manages to obtain the runtime PM reference, we stop the
timer and go back to the above described behaviour.

We have to be careful not to overshoot the RC6 estimate, so once resumed
after a period of approximation, we only update the counter once it
catches up. With the observation that RC6 is increasing while the device
is suspended, this should not pose a problem and can only cause slight
inaccuracies due to clock base differences.

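The catch-up rule described above can be sketched in isolation. This is a
minimal user-space model under stated assumptions, not the driver code:
struct rc6_model and both helper names are hypothetical, values are plain
nanosecond counts, and the pmu.lock handling is left out.

```c
#include <stdint.h>

/* Hypothetical model of the stored RC6 sample; not a driver structure. */
struct rc6_model {
	uint64_t cur;	/* residency reported to readers, in ns */
};

/*
 * Device is awake and the hardware counter could be read: accept the
 * hardware value only once it has caught up with what we already report,
 * so the reported counter never goes backwards.
 */
static void rc6_model_update(struct rc6_model *m, uint64_t hw_ns)
{
	if (hw_ns > m->cur)
		m->cur = hw_ns;
}

/*
 * Device is runtime suspended: assume the whole sampling period was spent
 * in RC6 and advance the estimate by that period.
 */
static void rc6_model_estimate(struct rc6_model *m, uint64_t period_ns)
{
	m->cur += period_ns;
}
```

If the estimate ran ahead of the hardware during suspend, readers see a flat
counter until the hardware value overtakes it, which is the slight
inaccuracy the description mentions.
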
Signed-off-by: Tvrtko Ursulin
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm
Cc: Tvrtko Ursulin
Cc: Chris Wilson
Cc: Imre Deak
Cc: Jani Nikula
Cc: Joonas Lahtinen
Cc: Rodrigo Vivi
Cc: David Airlie
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/i915/i915_pmu.c | 149 ++++++++++++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_pmu.h |   1 +
 2 files changed, 114 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 1c440460255d..dca41c072a7c 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -90,23 +90,16 @@ static unsigned int event_enabled_bit(struct perf_event *event)
 	return config_enabled_bit(event->attr.config);
 }
 
-static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
+static bool
+__pmu_needs_timer(struct drm_i915_private *i915, u64 enable, bool gpu_active)
 {
-	u64 enable;
-
-	/*
-	 * Only some counters need the sampling timer.
-	 *
-	 * We start with a bitmask of all currently enabled events.
-	 */
-	enable = i915->pmu.enable;
-
 	/*
-	 * Mask out all the ones which do not need the timer, or in
+	 * Mask out all events which do not need the timer, or in
 	 * other words keep all the ones that could need the timer.
 	 */
 	enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
 		  config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
+		  config_enabled_mask(I915_PMU_RC6_RESIDENCY) |
 		  ENGINE_SAMPLE_MASK;
 
 	/*
@@ -130,6 +123,11 @@ static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
 	return enable;
 }
 
+static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
+{
+	return __pmu_needs_timer(i915, i915->pmu.enable, gpu_active);
+}
+
 void i915_pmu_gt_parked(struct drm_i915_private *i915)
 {
 	if (!i915->pmu.base.event_init)
@@ -181,20 +179,20 @@ update_sample(struct i915_pmu_sample *sample, u32 unit, u32 val)
 	sample->cur += mul_u32_u32(val, unit);
 }
 
-static void engines_sample(struct drm_i915_private *dev_priv)
+static bool engines_sample(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	bool fw = false;
 
 	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
-		return;
+		return false;
 
 	if (!dev_priv->gt.awake)
-		return;
+		return false;
 
 	if (!intel_runtime_pm_get_if_in_use(dev_priv))
-		return;
+		return false;
 
 	for_each_engine(engine, dev_priv, id) {
 		u32 current_seqno = intel_engine_get_seqno(engine);
@@ -225,10 +223,51 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 	if (fw)
 		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
-	intel_runtime_pm_put(dev_priv);
+	return true;
+}
+
+static u64 read_rc6_residency(struct drm_i915_private *i915)
+{
+	u64 val;
+
+	val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
+					   VLV_GT_RENDER_RC6 : GEN6_GT_GFX_RC6);
+	if (HAS_RC6p(i915))
+		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+	if (HAS_RC6pp(i915))
+		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+
+	return val;
+}
+
+static void
+update_rc6_sample(struct drm_i915_private *i915, u64 val, bool locked)
+{
+	unsigned long flags;
+
+	if (!locked)
+		spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	/*
+	 * Update stored RC6 counter only if it is greater than the current
+	 * value. This deals with periods of runtime suspend during which we are
+	 * estimating the RC6 residency, so do not want to overshoot the real
+	 * value read once the device is woken up.
+	 */
+	if (val > i915->pmu.sample[__I915_SAMPLE_RC6].cur)
+		i915->pmu.sample[__I915_SAMPLE_RC6].cur = val;
+
+	/* We don't need to sample RC6 from the timer any more. */
+	i915->pmu.timer_enabled =
+		__pmu_needs_timer(i915,
+				  i915->pmu.enable & ~config_enabled_mask(I915_PMU_RC6_RESIDENCY),
+				  READ_ONCE(i915->gt.awake));
+
+	if (!locked)
+		spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
 
-static void frequency_sample(struct drm_i915_private *dev_priv)
+static bool others_sample(struct drm_i915_private *dev_priv, bool pm)
 {
 	if (dev_priv->pmu.enable &
 	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
@@ -236,10 +275,10 @@ static void frequency_sample(struct drm_i915_private *dev_priv)
 
 		val = dev_priv->gt_pm.rps.cur_freq;
 		if (dev_priv->gt.awake &&
-		    intel_runtime_pm_get_if_in_use(dev_priv)) {
+		    (pm || intel_runtime_pm_get_if_in_use(dev_priv))) {
+			pm = true;
 			val = intel_get_cagf(dev_priv,
 					     I915_READ_NOTRACE(GEN6_RPSTAT1));
-			intel_runtime_pm_put(dev_priv);
 		}
 
 		update_sample(&dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT],
@@ -252,18 +291,48 @@ static void frequency_sample(struct drm_i915_private *dev_priv)
 			      intel_gpu_freq(dev_priv,
 					     dev_priv->gt_pm.rps.cur_freq));
 	}
+
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_RC6_RESIDENCY)) {
+		if (pm || intel_runtime_pm_get_if_in_use(dev_priv)) {
+			update_rc6_sample(dev_priv,
+					  read_rc6_residency(dev_priv),
+					  false);
+			pm = true;
+		} else {
+			unsigned long flags;
+
+			/*
+			 * When device is runtime suspended we assume RC6
+			 * residency is increasing by the sampling timer period.
+			 */
+			spin_lock_irqsave(&dev_priv->pmu.lock, flags);
+			dev_priv->pmu.sample[__I915_SAMPLE_RC6].cur += PERIOD;
+			spin_unlock_irqrestore(&dev_priv->pmu.lock, flags);
+		}
+	}
+
+	return pm;
+
 }
 
 static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
 {
 	struct drm_i915_private *i915 =
 		container_of(hrtimer, struct drm_i915_private, pmu.timer);
+	bool pm;
 
 	if (!READ_ONCE(i915->pmu.timer_enabled))
 		return HRTIMER_NORESTART;
 
-	engines_sample(i915);
-	frequency_sample(i915);
+	pm = engines_sample(i915);
+	pm = others_sample(i915, pm);
+
+	if (pm)
+		intel_runtime_pm_put(i915);
+
+	if (!READ_ONCE(i915->pmu.timer_enabled))
+		return HRTIMER_NORESTART;
 
 	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
 	return HRTIMER_RESTART;
@@ -415,7 +484,7 @@ static int i915_pmu_event_init(struct perf_event *event)
 	return 0;
 }
 
-static u64 __i915_pmu_event_read(struct perf_event *event)
+static u64 __i915_pmu_event_read(struct perf_event *event, bool locked)
 {
 	struct drm_i915_private *i915 =
 		container_of(event->pmu, typeof(*i915), pmu.base);
@@ -453,18 +522,26 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 			val = count_interrupts(i915);
 			break;
 		case I915_PMU_RC6_RESIDENCY:
-			intel_runtime_pm_get(i915);
-			val = intel_rc6_residency_ns(i915,
-						     IS_VALLEYVIEW(i915) ?
-						     VLV_GT_RENDER_RC6 :
-						     GEN6_GT_GFX_RC6);
-			if (HAS_RC6p(i915))
-				val += intel_rc6_residency_ns(i915,
-							      GEN6_GT_GFX_RC6p);
-			if (HAS_RC6pp(i915))
-				val += intel_rc6_residency_ns(i915,
-							      GEN6_GT_GFX_RC6pp);
-			intel_runtime_pm_put(i915);
+			if (intel_runtime_pm_get_if_in_use(i915)) {
+				update_rc6_sample(i915,
+						  read_rc6_residency(i915),
+						  locked);
+				intel_runtime_pm_put(i915);
+			} else {
+				unsigned long flags;
+
+				/*
+				 * If we failed to read the actual value, start
+				 * the timer which will be estimating it while
+				 * device is suspended.
+				 */
+				if (!locked)
+					spin_lock_irqsave(&i915->pmu.lock, flags);
+				__i915_pmu_maybe_start_timer(i915);
+				if (!locked)
+					spin_unlock_irqrestore(&i915->pmu.lock, flags);
+			}
+			val = i915->pmu.sample[__I915_SAMPLE_RC6].cur;
 			break;
 		}
 	}
@@ -479,7 +556,7 @@ static void i915_pmu_event_read(struct perf_event *event)
 
 again:
 	prev = local64_read(&hwc->prev_count);
-	new = __i915_pmu_event_read(event);
+	new = __i915_pmu_event_read(event, false);
 
 	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
 		goto again;
@@ -534,7 +611,7 @@ static void i915_pmu_enable(struct perf_event *event)
 	 * for all listeners. Even when the event was already enabled and has
 	 * an existing non-zero value.
 	 */
-	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event, true));
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index 5a2e013a56bb..249983ed3f08 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -27,6 +27,7 @@
 
 enum {
 	__I915_SAMPLE_FREQ_ACT = 0,
 	__I915_SAMPLE_FREQ_REQ,
+	__I915_SAMPLE_RC6,
 	__I915_NUM_PMU_SAMPLERS
 };
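
The way i915_sample() in this patch threads one runtime PM reference through
engines_sample() and others_sample() can be illustrated with a small
user-space model. Everything below is a hypothetical stand-in (struct
dev_model, pm_get_if_in_use, pm_put, sampler, tick); it only mimics the
reference counting and does not touch hardware or the real runtime PM API.

```c
#include <stdbool.h>

/* Hypothetical device model with a runtime PM style usage count. */
struct dev_model {
	int usage;	/* > 0 means the device is already in use (awake) */
};

/* Take a reference only if the device is already in use; never wakes it. */
static bool pm_get_if_in_use(struct dev_model *d)
{
	if (d->usage <= 0)
		return false;
	d->usage++;
	return true;
}

/* Drop a reference taken by pm_get_if_in_use(). */
static void pm_put(struct dev_model *d)
{
	d->usage--;
}

/*
 * A sampler reuses a reference obtained earlier in the same tick (pm is
 * true) and otherwise tries to get one itself. It returns whether a
 * reference is now held so the caller can pass it on.
 */
static bool sampler(struct dev_model *d, bool pm)
{
	if (pm || pm_get_if_in_use(d))
		pm = true;	/* registers could be read at this point */
	return pm;
}

/* One timer tick: at most one get, released exactly once at the end. */
static bool tick(struct dev_model *d)
{
	bool pm = sampler(d, false);

	pm = sampler(d, pm);
	if (pm)
		pm_put(d);
	return pm;
}
```

The point of the sketch is that the usage count is balanced after each tick
whether or not the device was awake, which is what moving the single
intel_runtime_pm_put() call into the timer callback is meant to achieve.
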