diff mbox series

[v3,2/4] drm/i915/pmu: Use kstat_irqs to get interrupt count

Message ID 20201205014340.148235-3-jsnitsel@redhat.com (mailing list archive)
State New, archived
Headers show
Series tpm_tis: Detect interrupt storms | expand

Commit Message

Jerry Snitselaar Dec. 5, 2020, 1:43 a.m. UTC
Now that kstat_irqs is exported, get rid of count_interrupts in
i915_pmu.c

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org 
Cc: dri-devel@lists.freedesktop.org
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Peter Huewe <peterhuewe@gmx.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

Comments

Thomas Gleixner Dec. 6, 2020, 4:38 p.m. UTC | #1
On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:

> Now that kstat_irqs is exported, get rid of count_interrupts in
> i915_pmu.c
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -423,22 +423,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
>  	return HRTIMER_RESTART;
>  }
>  
> -static u64 count_interrupts(struct drm_i915_private *i915)
> -{
> -	/* open-coded kstat_irqs() */
> -	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
> -	u64 sum = 0;
> -	int cpu;
> -
> -	if (!desc || !desc->kstat_irqs)
> -		return 0;
> -
> -	for_each_possible_cpu(cpu)
> -		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
> -
> -	return sum;
> -}

May I ask why this has been merged in the first place?

Nothing in a driver has ever to fiddle with the internals of an irq
descriptor. We have functions for properly accessing them. Just because
C allows to fiddle with everything is not a justification. If the
required function is not exported then adding the export with a proper
explanation is not asked too much.

Also this lacks protection or at least a comment why this can be called
safely and is not subject to a concurrent removal of the irq descriptor.
The same problem exists when calling kstat_irqs(). It's even documented
at the top of the function.

Thanks,

        tglx
Thomas Gleixner Dec. 6, 2020, 9:33 p.m. UTC | #2
On Sun, Dec 06 2020 at 17:38, Thomas Gleixner wrote:
> On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
>> Now that kstat_irqs is exported, get rid of count_interrupts in
>> i915_pmu.c
>
> May I ask why this has been merged in the first place?
>
> Nothing in a driver has ever to fiddle with the internals of an irq
> descriptor. We have functions for properly accessing them. Just because
> C allows to fiddle with everything is not a justification. If the
> required function is not exported then adding the export with a proper
> explanation is not asked too much.
>
> Also this lacks protection or at least a comment why this can be called
> safely and is not subject to a concurrent removal of the irq descriptor.
> The same problem exists when calling kstat_irqs(). It's even documented
> at the top of the function.

And as pointed out vs. that TPM thing this really could have been a
trivial

    i915->irqs++;

in the interrupt handler and a read of that instead of iterating over
all possible cpus and summing it up. Oh well...

Thanks,

        tglx
Jerry Snitselaar Dec. 6, 2020, 9:47 p.m. UTC | #3
Thomas Gleixner @ 2020-12-06 09:38 MST:

> On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
>
>> Now that kstat_irqs is exported, get rid of count_interrupts in
>> i915_pmu.c
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -423,22 +423,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
>>  	return HRTIMER_RESTART;
>>  }
>>  
>> -static u64 count_interrupts(struct drm_i915_private *i915)
>> -{
>> -	/* open-coded kstat_irqs() */
>> -	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
>> -	u64 sum = 0;
>> -	int cpu;
>> -
>> -	if (!desc || !desc->kstat_irqs)
>> -		return 0;
>> -
>> -	for_each_possible_cpu(cpu)
>> -		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>> -
>> -	return sum;
>> -}
>
> May I ask why this has been merged in the first place?
>
> Nothing in a driver has ever to fiddle with the internals of an irq
> descriptor. We have functions for properly accessing them. Just because
> C allows to fiddle with everything is not a justification. If the
> required function is not exported then adding the export with a proper
> explanation is not asked too much.
>
> Also this lacks protection or at least a comment why this can be called
> safely and is not subject to a concurrent removal of the irq descriptor.
> The same problem exists when calling kstat_irqs(). It's even documented
> at the top of the function.
>
> Thanks,
>
>         tglx

I don't know the history behind this bit. I stumbled across it in cscope
when looking for places using kstat_irqs.
Thomas Gleixner Dec. 6, 2020, 11:38 p.m. UTC | #4
On Sun, Dec 06 2020 at 14:47, Jerry Snitselaar wrote:
> Thomas Gleixner @ 2020-12-06 09:38 MST:
>
> I don't know the history behind this bit. I stumbled across it in cscope
> when looking for places using kstat_irqs.

I'm not ranting at you. The i915 people are on Cc.
Jarkko Sakkinen Dec. 8, 2020, 9:54 a.m. UTC | #5
On Sun, Dec 06, 2020 at 10:33:09PM +0100, Thomas Gleixner wrote:
> On Sun, Dec 06 2020 at 17:38, Thomas Gleixner wrote:
> > On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
> >> Now that kstat_irqs is exported, get rid of count_interrupts in
> >> i915_pmu.c
> >
> > May I ask why this has been merged in the first place?
> >
> > Nothing in a driver has ever to fiddle with the internals of an irq
> > descriptor. We have functions for properly accessing them. Just because
> > C allows to fiddle with everything is not a justification. If the
> > required function is not exported then adding the export with a proper
> > explanation is not asked too much.
> >
> > Also this lacks protection or at least a comment why this can be called
> > safely and is not subject to a concurrent removal of the irq descriptor.
> > The same problem exists when calling kstat_irqs(). It's even documented
> > at the top of the function.
> 
> And as pointed out vs. that TPM thing this really could have been a
> trivial
> 
>     i915->irqs++;
> 
> in the interrupt handler and a read of that instead of iterating over
> all possible cpus and summing it up. Oh well...

I'm fine with that. 

> Thanks,
> 
>         tglx

/Jarkko
Joonas Lahtinen Dec. 10, 2020, 7:53 a.m. UTC | #6
+ Tvrtko and Chris for comments

Code seems to be added in:

commit 0cd4684d6ea9a4ffec33fc19de4dd667bb90d0a5
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Tue Nov 21 18:18:50 2017 +0000

    drm/i915/pmu: Add interrupt count metric

I think later in the thread there was a suggestion to replace this with
simple counter increment in IRQ handler.

Regards, Joonas

Quoting Thomas Gleixner (2020-12-06 18:38:44)
> On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
> 
> > Now that kstat_irqs is exported, get rid of count_interrupts in
> > i915_pmu.c
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -423,22 +423,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
> >       return HRTIMER_RESTART;
> >  }
> >  
> > -static u64 count_interrupts(struct drm_i915_private *i915)
> > -{
> > -     /* open-coded kstat_irqs() */
> > -     struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
> > -     u64 sum = 0;
> > -     int cpu;
> > -
> > -     if (!desc || !desc->kstat_irqs)
> > -             return 0;
> > -
> > -     for_each_possible_cpu(cpu)
> > -             sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
> > -
> > -     return sum;
> > -}
> 
> May I ask why this has been merged in the first place?
> 
> Nothing in a driver has ever to fiddle with the internals of an irq
> descriptor. We have functions for properly accessing them. Just because
> C allows to fiddle with everything is not a justification. If the
> required function is not exported then adding the export with a proper
> explanation is not asked too much.
> 
> Also this lacks protection or at least a comment why this can be called
> safely and is not subject to a concurrent removal of the irq descriptor.
> The same problem exists when calling kstat_irqs(). It's even documented
> at the top of the function.
> 
> Thanks,
> 
>         tglx
> 
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Tvrtko Ursulin Dec. 10, 2020, 10:45 a.m. UTC | #7
On 10/12/2020 07:53, Joonas Lahtinen wrote:
> + Tvrtko and Chris for comments
> 
> Code seems to be added in:
> 
> commit 0cd4684d6ea9a4ffec33fc19de4dd667bb90d0a5
> Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Date:   Tue Nov 21 18:18:50 2017 +0000
> 
>      drm/i915/pmu: Add interrupt count metric
> 
> I think later in the thread there was a suggestion to replace this with
> simple counter increment in IRQ handler.

It was indeed unsafe until recent b00bccb3f0bb ("drm/i915/pmu: Handle 
PCI unbind") but now should be fine.

If kstat_irqs does not get exported it is easy enough for i915 to keep a 
local counter. Reasoning was very infrequent per cpu summation is much 
cheaper than very frequent atomic add. Up to thousands of interrupts per 
second vs "once per second" PMU read kind of thing.

Regards,

Tvrtko

> Quoting Thomas Gleixner (2020-12-06 18:38:44)
>> On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
>>
>>> Now that kstat_irqs is exported, get rid of count_interrupts in
>>> i915_pmu.c
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -423,22 +423,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
>>>        return HRTIMER_RESTART;
>>>   }
>>>   
>>> -static u64 count_interrupts(struct drm_i915_private *i915)
>>> -{
>>> -     /* open-coded kstat_irqs() */
>>> -     struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
>>> -     u64 sum = 0;
>>> -     int cpu;
>>> -
>>> -     if (!desc || !desc->kstat_irqs)
>>> -             return 0;
>>> -
>>> -     for_each_possible_cpu(cpu)
>>> -             sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>>> -
>>> -     return sum;
>>> -}
>>
>> May I ask why this has been merged in the first place?
>>
>> Nothing in a driver has ever to fiddle with the internals of an irq
>> descriptor. We have functions for properly accessing them. Just because
>> C allows to fiddle with everything is not a justification. If the
>> required function is not exported then adding the export with a proper
>> explanation is not asked too much.
>>
>> Also this lacks protection or at least a comment why this can be called
>> safely and is not subject to a concurrent removal of the irq descriptor.
>> The same problem exists when calling kstat_irqs(). It's even documented
>> at the top of the function.
>>
>> Thanks,
>>
>>          tglx
>>
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
Thomas Gleixner Dec. 10, 2020, 4:35 p.m. UTC | #8
On Thu, Dec 10 2020 at 10:45, Tvrtko Ursulin wrote:
> On 10/12/2020 07:53, Joonas Lahtinen wrote:
>> I think later in the thread there was a suggestion to replace this with
>> simple counter increment in IRQ handler.
>
> It was indeed unsafe until recent b00bccb3f0bb ("drm/i915/pmu: Handle 
> PCI unbind") but now should be fine.
>
> If kstat_irqs does not get exported it is easy enough for i915 to keep a 
> local counter. Reasoning was very infrequent per cpu summation is much 
> cheaper than very frequent atomic add. Up to thousands of interrupts per 
> second vs "once per second" PMU read kind of thing.

Why do you need a atomic_add? It's ONE interrupt which can only be
executed on ONE CPU at a time. Interrupt handlers are non-reentrant.

The core code function will just return an accumulated counter nowadays
which is only 32bit wide, which is what the interface provided forever.
That needs to be fixed first.

Aside of that the accounting is wrong when the interrupt line is shared
because the core accounts interrupt per line not per device sharing the
line. Don't know whether you care or not.

I'll send out a series addressing irq_to_desc() (ab)use all over the
place shortly. i915 is in there...

Thanks,

        tglx
Tvrtko Ursulin Dec. 10, 2020, 5:09 p.m. UTC | #9
On 10/12/2020 16:35, Thomas Gleixner wrote:
> On Thu, Dec 10 2020 at 10:45, Tvrtko Ursulin wrote:
>> On 10/12/2020 07:53, Joonas Lahtinen wrote:
>>> I think later in the thread there was a suggestion to replace this with
>>> simple counter increment in IRQ handler.
>>
>> It was indeed unsafe until recent b00bccb3f0bb ("drm/i915/pmu: Handle
>> PCI unbind") but now should be fine.
>>
>> If kstat_irqs does not get exported it is easy enough for i915 to keep a
>> local counter. Reasoning was very infrequent per cpu summation is much
>> cheaper than very frequent atomic add. Up to thousands of interrupts per
>> second vs "once per second" PMU read kind of thing.
> 
> Why do you need a atomic_add? It's ONE interrupt which can only be
> executed on ONE CPU at a time. Interrupt handlers are non-reentrant.
> 
> The core code function will just return an accumulated counter nowadays
> which is only 32bit wide, which is what the interface provided forever.
> That needs to be fixed first.
> 
> Aside of that the accounting is wrong when the interrupt line is shared
> because the core accounts interrupt per line not per device sharing the
> line. Don't know whether you care or not.
> 
> I'll send out a series addressing irq_to_desc() (ab)use all over the
> place shortly. i915 is in there...

Yep we don't need atomic, my bad. And we would care about the shared 
interrupt line. And without atomic the extra accounting falls way below 
noise.

So in the light of it all, it sounds best I just quickly replace our 
abuse with private counting and then you don't have to deal with it in 
your series.

Regards,

Tvrtko
Thomas Gleixner Dec. 10, 2020, 5:44 p.m. UTC | #10
On Thu, Dec 10 2020 at 17:09, Tvrtko Ursulin wrote:
> On 10/12/2020 16:35, Thomas Gleixner wrote:
>> I'll send out a series addressing irq_to_desc() (ab)use all over the
>> place shortly. i915 is in there...
>
> Yep we don't need atomic, my bad. And we would care about the shared 
> interrupt line. And without atomic the extra accounting falls way below 
> noise.

You have to be careful though. If you make the accumulated counter 64
bit wide then you need to be careful vs. 32bit machines.

> So in the light of it all, it sounds best I just quickly replace our 
> abuse with private counting and then you don't have to deal with it in 
> your series.

I mostly have it. Still chewing on the 32bit vs. 64bit thing. And
keeping it in my series allows me to remove the export of irq_to_desc()
at the end without waiting for your tree to be merged.

Give me a few.

Thanks,

        tglx
Tvrtko Ursulin Dec. 10, 2020, 5:51 p.m. UTC | #11
On 10/12/2020 17:44, Thomas Gleixner wrote:
> On Thu, Dec 10 2020 at 17:09, Tvrtko Ursulin wrote:
>> On 10/12/2020 16:35, Thomas Gleixner wrote:
>>> I'll send out a series addressing irq_to_desc() (ab)use all over the
>>> place shortly. i915 is in there...
>>
>> Yep we don't need atomic, my bad. And we would care about the shared
>> interrupt line. And without atomic the extra accounting falls way below
>> noise.
> 
> You have to be careful though. If you make the accumulated counter 64
> bit wide then you need to be careful vs. 32bit machines.

Yep, thanks, I am bad jumping from one thing to another. Forgot about 
the read side atomicity completely..

>> So in the light of it all, it sounds best I just quickly replace our
>> abuse with private counting and then you don't have to deal with it in
>> your series.
> 
> I mostly have it. Still chewing on the 32bit vs. 64bit thing. And
> keeping it in my series allows me to remove the export of irq_to_desc()
> at the end without waiting for your tree to be merged.
> 
> Give me a few.

Ok.

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 69c0fa20eba1..a3e63f03da8c 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -423,22 +423,6 @@  static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
 	return HRTIMER_RESTART;
 }
 
-static u64 count_interrupts(struct drm_i915_private *i915)
-{
-	/* open-coded kstat_irqs() */
-	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
-	u64 sum = 0;
-	int cpu;
-
-	if (!desc || !desc->kstat_irqs)
-		return 0;
-
-	for_each_possible_cpu(cpu)
-		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
-
-	return sum;
-}
-
 static void i915_pmu_event_destroy(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -581,7 +565,7 @@  static u64 __i915_pmu_event_read(struct perf_event *event)
 				   USEC_PER_SEC /* to MHz */);
 			break;
 		case I915_PMU_INTERRUPTS:
-			val = count_interrupts(i915);
+			val = kstat_irqs(i915->drm.pdev->irq);
 			break;
 		case I915_PMU_RC6_RESIDENCY:
 			val = get_rc6(&i915->gt);