Message ID: 20200114105648.2172026-1-chris@chris-wilson.co.uk (mailing list archive)
State: New, archived
Series: [1/2] drm/i915/pmu: Correct the rc6 offset upon enabling
Quoting Chris Wilson (2020-01-14 10:56:47)
> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
> starts measuring the time from its boot. If we start measuring
> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
> 0 and then upon first activity (park/unpark) add in all the rc6
> residency since boot. After the first park with the sampler engaged, the
> sleep/active counters are aligned.
>
> v2: With a wakeref to be sure
>
> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 28a82c849bac..ec0299490dd4 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>  		container_of(event->pmu, typeof(*i915), pmu.base);
>  	unsigned int bit = event_enabled_bit(event);
>  	struct i915_pmu *pmu = &i915->pmu;
> +	intel_wakeref_t wakeref;
>  	unsigned long flags;
>
> +	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>  	spin_lock_irqsave(&pmu->lock, flags);
>
>  	/*
> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>  	BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>  	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>  	GEM_BUG_ON(pmu->enable_count[bit] == ~0);
> +
> +	if (pmu->enable_count[bit] == 0 &&
> +	    config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
> +		pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;

I can't decide if it's better to have discrete sampling appear
monotonic, or to reset just in case we drifted far off.
-Chris
On 14/01/2020 11:06, Chris Wilson wrote:
> Quoting Chris Wilson (2020-01-14 10:56:47)
>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>> starts measuring the time from its boot. If we start measuring
>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>> 0 and then upon first activity (park/unpark) add in all the rc6
>> residency since boot. After the first park with the sampler engaged, the
>> sleep/active counters are aligned.
>>
>> v2: With a wakeref to be sure
>>
>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>> index 28a82c849bac..ec0299490dd4 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>>  		container_of(event->pmu, typeof(*i915), pmu.base);
>>  	unsigned int bit = event_enabled_bit(event);
>>  	struct i915_pmu *pmu = &i915->pmu;
>> +	intel_wakeref_t wakeref;
>>  	unsigned long flags;
>>
>> +	wakeref = intel_runtime_pm_get(&i915->runtime_pm);

I think it would be nicer to use with_intel_runtime_pm directly at the
__get_rc6 call site. That would show/localise where it is actually needed.

>>  	spin_lock_irqsave(&pmu->lock, flags);
>>
>>  	/*
>> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>>  	BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>>  	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>>  	GEM_BUG_ON(pmu->enable_count[bit] == ~0);
>> +
>> +	if (pmu->enable_count[bit] == 0 &&
>> +	    config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
>> +		pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
>
> I can't decide if it's better to have discrete sampling appear
> monotonic, or to reset just in case we drifted far off.

What do you mean?

This looks correct to me as you implemented it. On enable it samples the
real RC6 and updates pmu->sleep_last. So regardless of whether the next
event read comes with the device awake or suspended, it will report
monotonically and without adding up any time outside the enabled window.

Drift can normally come when we overestimate, because hw RC6 can be less
than our time between park and unpark. I don't see how to reset that and
stay monotonic. Or are you thinking it doesn't need to be monotonic?

Regards,
Tvrtko
On 14/01/2020 11:37, Tvrtko Ursulin wrote:
>
> On 14/01/2020 11:06, Chris Wilson wrote:
>> Quoting Chris Wilson (2020-01-14 10:56:47)
>>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>>> starts measuring the time from its boot. If we start measuring
>>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>>> 0 and then upon first activity (park/unpark) add in all the rc6
>>> residency since boot. After the first park with the sampler engaged, the
>>> sleep/active counters are aligned.
>>>
>>> v2: With a wakeref to be sure
>>>
>>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>>> 1 file changed, 12 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
>>> b/drivers/gpu/drm/i915/i915_pmu.c
>>> index 28a82c849bac..ec0299490dd4 100644
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event
>>> *event)
>>>  		container_of(event->pmu, typeof(*i915), pmu.base);
>>>  	unsigned int bit = event_enabled_bit(event);
>>>  	struct i915_pmu *pmu = &i915->pmu;
>>> +	intel_wakeref_t wakeref;
>>>  	unsigned long flags;
>>> +	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>
> I think it would be nicer to use with_intel_runtime_pm directly at the
> __get_rc6 call site. That would show/localise where it is actually needed.

Spinlock though, my bad.

Regards,
Tvrtko
Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
>
> On 14/01/2020 11:06, Chris Wilson wrote:
> > Quoting Chris Wilson (2020-01-14 10:56:47)
> >> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
> >> starts measuring the time from its boot. If we start measuring
> >> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
> >> 0 and then upon first activity (park/unpark) add in all the rc6
> >> residency since boot. After the first park with the sampler engaged, the
> >> sleep/active counters are aligned.
> >>
> >> v2: With a wakeref to be sure
> >>
> >> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
> >> 1 file changed, 12 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> >> index 28a82c849bac..ec0299490dd4 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
> >>  		container_of(event->pmu, typeof(*i915), pmu.base);
> >>  	unsigned int bit = event_enabled_bit(event);
> >>  	struct i915_pmu *pmu = &i915->pmu;
> >> +	intel_wakeref_t wakeref;
> >>  	unsigned long flags;
> >>
> >> +	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>
> I think it would be nicer to use with_intel_runtime_pm directly at the
> __get_rc6 call site. That would show/localise where it is actually needed.

We can't, as it gets called under the spinlock :(

And I don't see a way around that, as we require the fixup to be applied
while idle and so require the wakeref.

> >>  	spin_lock_irqsave(&pmu->lock, flags);
> >>
> >>  	/*
> >> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
> >>  	BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
> >>  	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
> >>  	GEM_BUG_ON(pmu->enable_count[bit] == ~0);
> >> +
> >> +	if (pmu->enable_count[bit] == 0 &&
> >> +	    config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
> >> +		pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
> >
> > I can't decide if it's better to have discrete sampling appear
> > monotonic, or to reset just in case we drifted far off.
>
> What do you mean?
>
> This looks correct to me as you implemented it. On enable it samples the
> real RC6 and updates pmu->sleep_last. So regardless of whether the next
> event read comes with the device awake or suspended, it will report
> monotonically and without adding up any time outside the enabled window.

u64 sample[2];

fd = perf_open(RC6);
sample[0] = read(fd);
close(fd);

fd = perf_open(RC6);
sample[1] = read(fd);
close(fd);

/* assume idle system */
assert(sample[1] > sample[0]);

Do we want that? I don't think that's required by the perf API, as the
counters are only valid while the event is enabled (iirc).

> Drift can normally come when we overestimate, because hw RC6 can be less
> than our time between park and unpark. I don't see how to reset that and
> stay monotonic. Or are you thinking it doesn't need to be monotonic?

I was mostly thinking of bugs, e.g. across suspend. We also have a problem
if we wait longer than 2*counter_wrap between perf_event_reads, and we
probably should install a very lax timer for rc6 to ensure the
__get_rc6() calls remain monotonic. There was a fdo bug for that :)
-Chris
On 14/01/2020 11:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
>> On 14/01/2020 11:06, Chris Wilson wrote:
>>> Quoting Chris Wilson (2020-01-14 10:56:47)
>>>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>>>> starts measuring the time from its boot. If we start measuring
>>>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>>>> 0 and then upon first activity (park/unpark) add in all the rc6
>>>> residency since boot. After the first park with the sampler engaged, the
>>>> sleep/active counters are aligned.
>>>>
>>>> v2: With a wakeref to be sure
>>>>
>>>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>>>> 1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>>> index 28a82c849bac..ec0299490dd4 100644
>>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>>>>  		container_of(event->pmu, typeof(*i915), pmu.base);
>>>>  	unsigned int bit = event_enabled_bit(event);
>>>>  	struct i915_pmu *pmu = &i915->pmu;
>>>> +	intel_wakeref_t wakeref;
>>>>  	unsigned long flags;
>>>>
>>>> +	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>>
>> I think it would be nicer to use with_intel_runtime_pm directly at the
>> __get_rc6 call site. That would show/localise where it is actually needed.
>
> We can't, as it gets called under the spinlock :(
>
> And I don't see a way around that, as we require the fixup to be applied
> while idle and so require the wakeref.

>>>>  	spin_lock_irqsave(&pmu->lock, flags);
>>>>
>>>>  	/*
>>>> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>>>>  	BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>>>>  	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>>>>  	GEM_BUG_ON(pmu->enable_count[bit] == ~0);
>>>> +
>>>> +	if (pmu->enable_count[bit] == 0 &&
>>>> +	    config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
>>>> +		pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
>>>
>>> I can't decide if it's better to have discrete sampling appear
>>> monotonic, or to reset just in case we drifted far off.
>>
>> What do you mean?
>>
>> This looks correct to me as you implemented it. On enable it samples the
>> real RC6 and updates pmu->sleep_last. So regardless of whether the next
>> event read comes with the device awake or suspended, it will report
>> monotonically and without adding up any time outside the enabled window.
>
> u64 sample[2];
>
> fd = perf_open(RC6);
> sample[0] = read(fd);
> close(fd);
>
> fd = perf_open(RC6);
> sample[1] = read(fd);
> close(fd);
>
> /* assume idle system */
> assert(sample[1] > sample[0]);
>
> Do we want that? I don't think that's required by the perf API, as the
> counters are only valid while the event is enabled (iirc).

Yeah, I'd say this is not interesting.

>> Drift can normally come when we overestimate, because hw RC6 can be less
>> than our time between park and unpark. I don't see how to reset that and
>> stay monotonic. Or are you thinking it doesn't need to be monotonic?
>
> I was mostly thinking of bugs, e.g. across suspend. We also have a problem
> if we wait longer than 2*counter_wrap between perf_event_reads, and we
> probably should install a very lax timer for rc6 to ensure the
> __get_rc6() calls remain monotonic. There was a fdo bug for that :)

Yes, I remember now. With a 1.28us tick the hw overflow is ~5500 seconds,
so a perf_event_read every ~3hrs is enough to keep outside overflow. It's
probably enough. :)

For the patch:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,
Tvrtko
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 28a82c849bac..ec0299490dd4 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
 		container_of(event->pmu, typeof(*i915), pmu.base);
 	unsigned int bit = event_enabled_bit(event);
 	struct i915_pmu *pmu = &i915->pmu;
+	intel_wakeref_t wakeref;
 	unsigned long flags;
 
+	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
 	spin_lock_irqsave(&pmu->lock, flags);
 
 	/*
@@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
 	BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
 	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
 	GEM_BUG_ON(pmu->enable_count[bit] == ~0);
+
+	if (pmu->enable_count[bit] == 0 &&
+	    config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
+		pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
+		pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(&i915->gt);
+		pmu->sleep_last = ktime_get();
+	}
+
 	pmu->enable |= BIT_ULL(bit);
 	pmu->enable_count[bit]++;
 
@@ -688,6 +698,8 @@ static void i915_pmu_enable(struct perf_event *event)
 	 * an existing non-zero value.
 	 */
 	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 }
 
 static void i915_pmu_disable(struct perf_event *event)
The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
starts measuring the time from its boot. If we start measuring
I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
0 and then upon first activity (park/unpark) add in all the rc6
residency since boot. After the first park with the sampler engaged, the
sleep/active counters are aligned.

v2: With a wakeref to be sure

Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)