diff mbox series

[1/2] drm/i915/pmu: Correct the rc6 offset upon enabling

Message ID 20200114105648.2172026-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [1/2] drm/i915/pmu: Correct the rc6 offset upon enabling | expand

Commit Message

Chris Wilson Jan. 14, 2020, 10:56 a.m. UTC
The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
starts measuring the time from its boot. If we start measuruing
I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
0 and then upon first activity (park/unpark) add in all the rc6
residency since boot. After the first park with the sampler engaged, the
sleep/active counters are aligned.

v2: With a wakeref to be sure

Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Chris Wilson Jan. 14, 2020, 11:06 a.m. UTC | #1
Quoting Chris Wilson (2020-01-14 10:56:47)
> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
> starts measuring the time from its boot. If we start measuruing
> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
> 0 and then upon first activity (park/unpark) add in all the rc6
> residency since boot. After the first park with the sampler engaged, the
> sleep/active counters are aligned.
> 
> v2: With a wakeref to be sure
> 
> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 28a82c849bac..ec0299490dd4 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>                 container_of(event->pmu, typeof(*i915), pmu.base);
>         unsigned int bit = event_enabled_bit(event);
>         struct i915_pmu *pmu = &i915->pmu;
> +       intel_wakeref_t wakeref;
>         unsigned long flags;
>  
> +       wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>         spin_lock_irqsave(&pmu->lock, flags);
>  
>         /*
> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>         BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>         GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>         GEM_BUG_ON(pmu->enable_count[bit] == ~0);
> +
> +       if (pmu->enable_count[bit] == 0 &&
> +           config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
> +               pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;

I can't decide if it's better to have discrete sampling appear
monotonic, or to reset just in case we drifted far off.
-Chris
Tvrtko Ursulin Jan. 14, 2020, 11:37 a.m. UTC | #2
On 14/01/2020 11:06, Chris Wilson wrote:
> Quoting Chris Wilson (2020-01-14 10:56:47)
>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>> starts measuring the time from its boot. If we start measuruing
>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>> 0 and then upon first activity (park/unpark) add in all the rc6
>> residency since boot. After the first park with the sampler engaged, the
>> sleep/active counters are aligned.
>>
>> v2: With a wakeref to be sure
>>
>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>> index 28a82c849bac..ec0299490dd4 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>>                  container_of(event->pmu, typeof(*i915), pmu.base);
>>          unsigned int bit = event_enabled_bit(event);
>>          struct i915_pmu *pmu = &i915->pmu;
>> +       intel_wakeref_t wakeref;
>>          unsigned long flags;
>>   
>> +       wakeref = intel_runtime_pm_get(&i915->runtime_pm);

I think it would be nicer to use with_intel_runtime_pm directly at the 
__get_rc6 call site. That would show/localise where it is actually needed.

>>          spin_lock_irqsave(&pmu->lock, flags);
>>   
>>          /*
>> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>>          BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>>          GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>>          GEM_BUG_ON(pmu->enable_count[bit] == ~0);
>> +
>> +       if (pmu->enable_count[bit] == 0 &&
>> +           config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
>> +               pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
> 
> I can't decide if it's better to have discrete sampling appear
> monotonic, or to reset just in case we drifted far off.

What do you mean?

This looks correct to me as you implemented it. On enable it samples the 
real RC6 and updates pmu->sleep_last. So regardless if the next even 
read comes with device awake or suspended it will report monotonic and 
without adding up any time outside the enabled window.

Drift can normally come when we overestimate because hw RC6 can be less 
than our time between park and unpark. I don't see how to reset that and 
stay monotonic. Or you are thinking it doesn't need to be monotonic?

Regards,

Tvrtko
Tvrtko Ursulin Jan. 14, 2020, 11:45 a.m. UTC | #3
On 14/01/2020 11:37, Tvrtko Ursulin wrote:
> 
> On 14/01/2020 11:06, Chris Wilson wrote:
>> Quoting Chris Wilson (2020-01-14 10:56:47)
>>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>>> starts measuring the time from its boot. If we start measuruing
>>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>>> 0 and then upon first activity (park/unpark) add in all the rc6
>>> residency since boot. After the first park with the sampler engaged, the
>>> sleep/active counters are aligned.
>>>
>>> v2: With a wakeref to be sure
>>>
>>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>>>   1 file changed, 12 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
>>> b/drivers/gpu/drm/i915/i915_pmu.c
>>> index 28a82c849bac..ec0299490dd4 100644
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event 
>>> *event)
>>>                  container_of(event->pmu, typeof(*i915), pmu.base);
>>>          unsigned int bit = event_enabled_bit(event);
>>>          struct i915_pmu *pmu = &i915->pmu;
>>> +       intel_wakeref_t wakeref;
>>>          unsigned long flags;
>>> +       wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> 
> I think it would be nicer to use with_intel_runtime_pm directly at the 
> __get_rc6 call site. That would show/localise where it is actually needed.

Spinlock though, my bad.

Regards,

Tvrtko
Chris Wilson Jan. 14, 2020, 11:49 a.m. UTC | #4
Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
> 
> On 14/01/2020 11:06, Chris Wilson wrote:
> > Quoting Chris Wilson (2020-01-14 10:56:47)
> >> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
> >> starts measuring the time from its boot. If we start measuruing
> >> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
> >> 0 and then upon first activity (park/unpark) add in all the rc6
> >> residency since boot. After the first park with the sampler engaged, the
> >> sleep/active counters are aligned.
> >>
> >> v2: With a wakeref to be sure
> >>
> >> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
> >>   1 file changed, 12 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> >> index 28a82c849bac..ec0299490dd4 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
> >>                  container_of(event->pmu, typeof(*i915), pmu.base);
> >>          unsigned int bit = event_enabled_bit(event);
> >>          struct i915_pmu *pmu = &i915->pmu;
> >> +       intel_wakeref_t wakeref;
> >>          unsigned long flags;
> >>   
> >> +       wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> 
> I think it would be nicer to use with_intel_runtime_pm directly at the 
> __get_rc6 call site. That would show/localise where it is actually needed.

We can't, as it gets called under the spinlock :(

And I don't see a way around that, as we require the fixup to be applied
while idle and so require the wakeref.

> >>          spin_lock_irqsave(&pmu->lock, flags);
> >>   
> >>          /*
> >> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
> >>          BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
> >>          GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
> >>          GEM_BUG_ON(pmu->enable_count[bit] == ~0);
> >> +
> >> +       if (pmu->enable_count[bit] == 0 &&
> >> +           config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
> >> +               pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
> > 
> > I can't decide if it's better to have discrete sampling appear
> > monotonic, or to reset just in case we drifted far off.
> 
> What do you mean?
> 
> This looks correct to me as you implemented it. On enable it samples the 
> real RC6 and updates pmu->sleep_last. So regardless if the next even 
> read comes with device awake or suspended it will report monotonic and 
> without adding up any time outside the enabled window.

u64 sample[2];

fd = perf_open(RC6);
sample[0] = read(fd);
close(fd);

fd = perf_open(RC6);
sample[1] = read(fd);
close(fd);

/* assume idle system */
assert(sample[1] > sample[0]);

Do we want that? I don't think that's required by the perf API, as the
counters are only valid while the event is enabled (iirc).

> Drift can normally come when we overestimate because hw RC6 can be less 
> than our time between park and unpark. I don't see how to reset that and 
> stay monotonic. Or you are thinking it doesn't need to be monotonic?

I was mostly thinking of bugs, e.g. across suspend. We also have a problem
if we wait longer than 2*counter_wrap between perf_event_reads, and we
probably should install a very lax timer for rc6 to ensure the
__get_rc6() calls remain monotonic. There was a fdo bug for that :)
-Chris
Tvrtko Ursulin Jan. 14, 2020, 12:37 p.m. UTC | #5
On 14/01/2020 11:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
>> On 14/01/2020 11:06, Chris Wilson wrote:
>>> Quoting Chris Wilson (2020-01-14 10:56:47)
>>>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>>>> starts measuring the time from its boot. If we start measuruing
>>>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>>>> 0 and then upon first activity (park/unpark) add in all the rc6
>>>> residency since boot. After the first park with the sampler engaged, the
>>>> sleep/active counters are aligned.
>>>>
>>>> v2: With a wakeref to be sure
>>>>
>>>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>>>>    1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>>> index 28a82c849bac..ec0299490dd4 100644
>>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>>>>                   container_of(event->pmu, typeof(*i915), pmu.base);
>>>>           unsigned int bit = event_enabled_bit(event);
>>>>           struct i915_pmu *pmu = &i915->pmu;
>>>> +       intel_wakeref_t wakeref;
>>>>           unsigned long flags;
>>>>    
>>>> +       wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>>
>> I think it would be nicer to use with_intel_runtime_pm directly at the
>> __get_rc6 call site. That would show/localise where it is actually needed.
> 
> We can't, as it gets called under the spinlock :(
> 
> And I don't see a way around that, as we require the fixup to be applied
> while idle and so require the wakeref.
> 
>>>>           spin_lock_irqsave(&pmu->lock, flags);
>>>>    
>>>>           /*
>>>> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>>>>           BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>>>>           GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>>>>           GEM_BUG_ON(pmu->enable_count[bit] == ~0);
>>>> +
>>>> +       if (pmu->enable_count[bit] == 0 &&
>>>> +           config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
>>>> +               pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
>>>
>>> I can't decide if it's better to have discrete sampling appear
>>> monotonic, or to reset just in case we drifted far off.
>>
>> What do you mean?
>>
>> This looks correct to me as you implemented it. On enable it samples the
>> real RC6 and updates pmu->sleep_last. So regardless if the next even
>> read comes with device awake or suspended it will report monotonic and
>> without adding up any time outside the enabled window.
> 
> u64 sample[2];
> 
> fd = perf_open(RC6);
> sample[0] = read(fd);
> close(fd);
> 
> fd = perf_open(RC6);
> sample[1] = read(fd);
> close(fd);
> 
> /* assume idle system */
> assert(sample[1] > sample[0]);
> 
> Do we want that? I don't think that's required by the perf API, as the
> counters are only valid while the event is enabled (iirc).

Yeah I'd say this is not interesting.

> 
>> Drift can normally come when we overestimate because hw RC6 can be less
>> than our time between park and unpark. I don't see how to reset that and
>> stay monotonic. Or you are thinking it doesn't need to be monotonic?
> 
> I was mostly thinking of bugs, e.g. across suspend. We also have a problem
> if we wait longer than 2*counter_wrap between perf_event_reads, and we
> probably should install a very lax timer for rc6 to ensure the
> __get_rc6() calls remain monotonic. There was a fdo bug for that :)

Yes I remember now. With 1.28us tick hw overflow is ~5500 seconds so 
perf_event_read every ~3hrs is enough to keep outside overflow. It's 
probably enough. :)

For the patch:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 28a82c849bac..ec0299490dd4 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -637,8 +637,10 @@  static void i915_pmu_enable(struct perf_event *event)
 		container_of(event->pmu, typeof(*i915), pmu.base);
 	unsigned int bit = event_enabled_bit(event);
 	struct i915_pmu *pmu = &i915->pmu;
+	intel_wakeref_t wakeref;
 	unsigned long flags;
 
+	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
 	spin_lock_irqsave(&pmu->lock, flags);
 
 	/*
@@ -648,6 +650,14 @@  static void i915_pmu_enable(struct perf_event *event)
 	BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
 	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
 	GEM_BUG_ON(pmu->enable_count[bit] == ~0);
+
+	if (pmu->enable_count[bit] == 0 &&
+	    config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
+		pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
+		pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(&i915->gt);
+		pmu->sleep_last = ktime_get();
+	}
+
 	pmu->enable |= BIT_ULL(bit);
 	pmu->enable_count[bit]++;
 
@@ -688,6 +698,8 @@  static void i915_pmu_enable(struct perf_event *event)
 	 * an existing non-zero value.
 	 */
 	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 }
 
 static void i915_pmu_disable(struct perf_event *event)