diff mbox series

perf: arm_cspmu: Don't touch interrupt registers if no interrupt was assigned

Message ID 20240307193104.58302-1-ilkka@os.amperecomputing.com (mailing list archive)
State New, archived
Headers show
Series perf: arm_cspmu: Don't touch interrupt registers if no interrupt was assigned | expand

Commit Message

Ilkka Koskinen March 7, 2024, 7:31 p.m. UTC
The driver enabled and disabled interrupts even if no interrupt was
assigned to the device.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Ilkka Koskinen April 4, 2024, 8:35 p.m. UTC | #1
Hi,

Could someone take a look at this patch and give some feedback?

Cheers, Ilkka

On Thu, 7 Mar 2024, Ilkka Koskinen wrote:
> The driver enabled and disabled interrupts even if no interrupt was
> assigned to the device.
>
> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> ---
> drivers/perf/arm_cspmu/arm_cspmu.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 50b89b989ce7..2cbdb5dcb6ff 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -795,7 +795,8 @@ static void arm_cspmu_enable_counter(struct arm_cspmu *cspmu, int idx)
> 	inten_off = PMINTENSET + (4 * reg_id);
> 	cnten_off = PMCNTENSET + (4 * reg_id);
>
> -	writel(BIT(reg_bit), cspmu->base0 + inten_off);
> +	if (cspmu->irq)
> +		writel(BIT(reg_bit), cspmu->base0 + inten_off);
> 	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
> }
>
> @@ -810,7 +811,8 @@ static void arm_cspmu_disable_counter(struct arm_cspmu *cspmu, int idx)
> 	cnten_off = PMCNTENCLR + (4 * reg_id);
>
> 	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
> -	writel(BIT(reg_bit), cspmu->base0 + inten_off);
> +	if (cspmu->irq)
> +		writel(BIT(reg_bit), cspmu->base0 + inten_off);
> }
>
> static void arm_cspmu_event_update(struct perf_event *event)
> -- 
> 2.40.1
>
>
Robin Murphy April 5, 2024, 10:33 a.m. UTC | #2
On 2024-03-07 7:31 pm, Ilkka Koskinen wrote:
> The driver enabled and disabled interrupts even if no interrupt was
> assigned to the device.

Why's that a concern - if the interrupt isn't routed anywhere, surely it 
makes no difference what happens at the source end?

Thanks,
Robin.

> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> ---
>   drivers/perf/arm_cspmu/arm_cspmu.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 50b89b989ce7..2cbdb5dcb6ff 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -795,7 +795,8 @@ static void arm_cspmu_enable_counter(struct arm_cspmu *cspmu, int idx)
>   	inten_off = PMINTENSET + (4 * reg_id);
>   	cnten_off = PMCNTENSET + (4 * reg_id);
>   
> -	writel(BIT(reg_bit), cspmu->base0 + inten_off);
> +	if (cspmu->irq)
> +		writel(BIT(reg_bit), cspmu->base0 + inten_off);
>   	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
>   }
>   
> @@ -810,7 +811,8 @@ static void arm_cspmu_disable_counter(struct arm_cspmu *cspmu, int idx)
>   	cnten_off = PMCNTENCLR + (4 * reg_id);
>   
>   	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
> -	writel(BIT(reg_bit), cspmu->base0 + inten_off);
> +	if (cspmu->irq)
> +		writel(BIT(reg_bit), cspmu->base0 + inten_off);
>   }
>   
>   static void arm_cspmu_event_update(struct perf_event *event)
Ilkka Koskinen April 5, 2024, 10:33 p.m. UTC | #3
On Fri, 5 Apr 2024, Robin Murphy wrote:
> On 2024-03-07 7:31 pm, Ilkka Koskinen wrote:
>> The driver enabled and disabled interrupts even if no interrupt was
>> assigned to the device.
>
> Why's that a concern - if the interrupt isn't routed anywhere, surely it 
> makes no difference what happens at the source end?

The issue is that we have two PMUs attached to the same interrupt line.
Unfortunately, I just don't seem to find time to add support for shared 
interrupts to the cspmu driver. Meanwhile, I assigned the interrupt 
to one of the PMUs while the other one has zero in the APMT table. Without 
the patch, I can trigger "ghost interrupt" in the latter PMU.

Cheers, Ilkka

>
> Thanks,
> Robin.
>
>> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
>> ---
>>   drivers/perf/arm_cspmu/arm_cspmu.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c 
>> b/drivers/perf/arm_cspmu/arm_cspmu.c
>> index 50b89b989ce7..2cbdb5dcb6ff 100644
>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
>> @@ -795,7 +795,8 @@ static void arm_cspmu_enable_counter(struct arm_cspmu 
>> *cspmu, int idx)
>>   	inten_off = PMINTENSET + (4 * reg_id);
>>   	cnten_off = PMCNTENSET + (4 * reg_id);
>>   -	writel(BIT(reg_bit), cspmu->base0 + inten_off);
>> +	if (cspmu->irq)
>> +		writel(BIT(reg_bit), cspmu->base0 + inten_off);
>>   	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
>>   }
>>   @@ -810,7 +811,8 @@ static void arm_cspmu_disable_counter(struct 
>> arm_cspmu *cspmu, int idx)
>>   	cnten_off = PMCNTENCLR + (4 * reg_id);
>>     	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
>> -	writel(BIT(reg_bit), cspmu->base0 + inten_off);
>> +	if (cspmu->irq)
>> +		writel(BIT(reg_bit), cspmu->base0 + inten_off);
>>   }
>>     static void arm_cspmu_event_update(struct perf_event *event)
>
Robin Murphy April 8, 2024, 12:05 p.m. UTC | #4
On 2024-04-05 11:33 pm, Ilkka Koskinen wrote:
> 
> On Fri, 5 Apr 2024, Robin Murphy wrote:
>> On 2024-03-07 7:31 pm, Ilkka Koskinen wrote:
>>> The driver enabled and disabled interrupts even if no interrupt was
>>> assigned to the device.
>>
>> Why's that a concern - if the interrupt isn't routed anywhere, surely 
>> it makes no difference what happens at the source end?
> 
> The issue is that we have two PMUs attached to the same interrupt line.
> Unfortunately, I just don't seem to find time to add support for shared 
> interrupts to the cspmu driver. Meanwhile, I assigned the interrupt to 
> one of the PMUs while the other one has zero in the APMT table.

I suspected something like that ;)

> Without 
> the patch, I can trigger "ghost interrupt" in the latter PMU.

An occasional spurious interrupt should be no big deal. If it ends up as 
a screaming spurious interrupt because we never handle the overflow 
condition on the "other" PMU, then what matters most is that we never 
handle the overflow, thus the "other" PMU is still useless since you 
can't assume the user is going to read it frequently enough to avoid 
losing information and getting nonsense counts back. So this hack really 
isn't a viable solution for anything.

Thanks,
Robin.
Ilkka Koskinen April 9, 2024, 1:05 a.m. UTC | #5
On Mon, 8 Apr 2024, Robin Murphy wrote:
> On 2024-04-05 11:33 pm, Ilkka Koskinen wrote:
>> 
>> On Fri, 5 Apr 2024, Robin Murphy wrote:
>>> On 2024-03-07 7:31 pm, Ilkka Koskinen wrote:
>>>> The driver enabled and disabled interrupts even if no interrupt was
>>>> assigned to the device.
>>> 
>>> Why's that a concern - if the interrupt isn't routed anywhere, surely it 
>>> makes no difference what happens at the source end?
>> 
>> The issue is that we have two PMUs attached to the same interrupt line.
>> Unfortunately, I just don't seem to find time to add support for shared 
>> interrupts to the cspmu driver. Meanwhile, I assigned the interrupt to one 
>> of the PMUs while the other one has zero in the APMT table.
>
> I suspected something like that ;)
>
>> Without the patch, I can trigger "ghost interrupt" in the latter PMU.
>
> An occasional spurious interrupt should be no big deal. If it ends up as a 
> screaming spurious interrupt because we never handle the overflow condition 
> on the "other" PMU, then what matters most is that we never handle the 
> overflow, thus the "other" PMU is still useless since you can't assume the 
> user is going to read it frequently enough to avoid losing information and 
> getting nonsense counts back. So this hack really isn't a viable solution for 
> anything.

IIRC, what happens is that kernel will disable the interrupt eventually 
due to unhandled spurious interrupts making the "working" PMU also 
useless.

Cheers, Ilkka

>
> Thanks,
> Robin.
>
Robin Murphy April 9, 2024, 12:53 p.m. UTC | #6
On 09/04/2024 2:05 am, Ilkka Koskinen wrote:
> 
> On Mon, 8 Apr 2024, Robin Murphy wrote:
>> On 2024-04-05 11:33 pm, Ilkka Koskinen wrote:
>>>
>>> On Fri, 5 Apr 2024, Robin Murphy wrote:
>>>> On 2024-03-07 7:31 pm, Ilkka Koskinen wrote:
>>>>> The driver enabled and disabled interrupts even if no interrupt was
>>>>> assigned to the device.
>>>>
>>>> Why's that a concern - if the interrupt isn't routed anywhere, 
>>>> surely it makes no difference what happens at the source end?
>>>
>>> The issue is that we have two PMUs attached to the same interrupt line.
>>> Unfortunately, I just don't seem to find time to add support for 
>>> shared interrupts to the cspmu driver. Meanwhile, I assigned the 
>>> interrupt to one of the PMUs while the other one has zero in the APMT 
>>> table.
>>
>> I suspected something like that ;)
>>
>>> Without the patch, I can trigger "ghost interrupt" in the latter PMU.
>>
>> An occasional spurious interrupt should be no big deal. If it ends up 
>> as a screaming spurious interrupt because we never handle the overflow 
>> condition on the "other" PMU, then what matters most is that we never 
>> handle the overflow, thus the "other" PMU is still useless since you 
>> can't assume the user is going to read it frequently enough to avoid 
>> losing information and getting nonsense counts back. So this hack 
>> really isn't a viable solution for anything.
> 
> IIRC, what happens is that kernel will disable the interrupt eventually 
> due to unhandled spurious interrupts making the "working" PMU also useless.

Indeed, but if having one inaccurate PMU is fine, having more than one 
is no big deal either, right? The moral of the story is that hacking the 
firmware to lie about the hardware is just not a great idea.

TBH it's always seemed a bit broken that we allow probing without an IRQ 
but then have no accommodation for overflow if so. Fixing that would be 
a good thing in itself, and would at least have the side-effect of 
allowing your hack to work, however much I may disapprove of that :)

FWIW it is still lingering some way down my to-do list to factor out the 
fiddly IRQ-sharing/migration code into at least a helper library (if not 
further into perf core itself) before it gets copy-pasted much more, and 
it occurs to me that I could then easily factor the IRQ-substitute timer 
approach from e.g. arm-ccn into that as well... The more I think about 
it the more I might just convince myself that I want it for the driver 
I'm currently working on and justify bumping it up the list, let's see...

Cheers,
Robin.
Ilkka Koskinen April 11, 2024, 7:35 a.m. UTC | #7
On Tue, 9 Apr 2024, Robin Murphy wrote:
> On 09/04/2024 2:05 am, Ilkka Koskinen wrote:
>> 
>> On Mon, 8 Apr 2024, Robin Murphy wrote:
>>> On 2024-04-05 11:33 pm, Ilkka Koskinen wrote:
>>>> 
>>>> On Fri, 5 Apr 2024, Robin Murphy wrote:
>>>>> On 2024-03-07 7:31 pm, Ilkka Koskinen wrote:
>>>>>> The driver enabled and disabled interrupts even if no interrupt was
>>>>>> assigned to the device.
>>>>> 
>>>>> Why's that a concern - if the interrupt isn't routed anywhere, surely it 
>>>>> makes no difference what happens at the source end?
>>>> 
>>>> The issue is that we have two PMUs attached to the same interrupt line.
>>>> Unfortunately, I just don't seem to find time to add support for shared 
>>>> interrupts to the cspmu driver. Meanwhile, I assigned the interrupt to 
>>>> one of the PMUs while the other one has zero in the APMT table.
>>> 
>>> I suspected something like that ;)
>>> 
>>>> Without the patch, I can trigger "ghost interrupt" in the latter PMU.
>>> 
>>> An occasional spurious interrupt should be no big deal. If it ends up as a 
>>> screaming spurious interrupt because we never handle the overflow 
>>> condition on the "other" PMU, then what matters most is that we never 
>>> handle the overflow, thus the "other" PMU is still useless since you can't 
>>> assume the user is going to read it frequently enough to avoid losing 
>>> information and getting nonsense counts back. So this hack really isn't a 
>>> viable solution for anything.
>> 
>> IIRC, what happens is that kernel will disable the interrupt eventually due 
>> to unhandled spurious interrupts making the "working" PMU also useless.
>
> Indeed, but if having one inaccurate PMU is fine, having more than one is no 
> big deal either, right? The moral of the story is that hacking the firmware 
> to lie about the hardware is just not a great idea.

Depends on the use case, of course :D

>
> TBH it's always seemed a bit broken that we allow probing without an IRQ but 
> then have no accommodation for overflow if so. Fixing that would be a good 
> thing in itself, and would at least have the side-effect of allowing your 
> hack to work, however much I may disapprove of that :)
>
> FWIW it is still lingering some way down my to-do list to factor out the 
> fiddly IRQ-sharing/migration code into at least a helper library (if not 
> further into perf core itself) before it gets copy-pasted much more, and it 
> occurs to me that I could then easily factor the IRQ-substitute timer 
> approach from e.g. arm-ccn into that as well... The more I think about it the 
> more I might just convince myself that I want it for the driver I'm currently 
> working on and justify bumping it up the list, let's see...

That sounds like a great idea. Even plain helper library would keep the 
drivers a lot cleaner.

Cheers, Ilkka
diff mbox series

Patch

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 50b89b989ce7..2cbdb5dcb6ff 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -795,7 +795,8 @@  static void arm_cspmu_enable_counter(struct arm_cspmu *cspmu, int idx)
 	inten_off = PMINTENSET + (4 * reg_id);
 	cnten_off = PMCNTENSET + (4 * reg_id);
 
-	writel(BIT(reg_bit), cspmu->base0 + inten_off);
+	if (cspmu->irq)
+		writel(BIT(reg_bit), cspmu->base0 + inten_off);
 	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
 }
 
@@ -810,7 +811,8 @@  static void arm_cspmu_disable_counter(struct arm_cspmu *cspmu, int idx)
 	cnten_off = PMCNTENCLR + (4 * reg_id);
 
 	writel(BIT(reg_bit), cspmu->base0 + cnten_off);
-	writel(BIT(reg_bit), cspmu->base0 + inten_off);
+	if (cspmu->irq)
+		writel(BIT(reg_bit), cspmu->base0 + inten_off);
 }
 
 static void arm_cspmu_event_update(struct perf_event *event)