diff mbox series

drm/panfrost: fix power transition timeout warnings

Message ID 20240322164525.2617508-1-christianshewitt@gmail.com (mailing list archive)
State New, archived
Headers show
Series drm/panfrost: fix power transition timeout warnings | expand

Commit Message

Christian Hewitt March 22, 2024, 4:45 p.m. UTC
Increase the timeout value to prevent system logs on Amlogic boards flooding
with power transition warnings:

[   13.047638] panfrost ffe40000.gpu: shader power transition timeout
[   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
[   13.937324] panfrost ffe40000.gpu: shader power transition timeout
[   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
...
[39829.506904] panfrost ffe40000.gpu: shader power transition timeout
[39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
[39949.508369] panfrost ffe40000.gpu: shader power transition timeout
[39949.509405] panfrost ffe40000.gpu: l2 power transition timeout

The 2000 value has been found through trial and error testing with devices
using G52 and G31 GPUs.

Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
---
 drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

AngeloGioacchino Del Regno March 25, 2024, 8:20 a.m. UTC | #1
Il 22/03/24 17:45, Christian Hewitt ha scritto:
> Increase the timeout value to prevent system logs on Amlogic boards flooding
> with power transition warnings:
> 
> [   13.047638] panfrost ffe40000.gpu: shader power transition timeout
> [   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
> [   13.937324] panfrost ffe40000.gpu: shader power transition timeout
> [   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
> ...
> [39829.506904] panfrost ffe40000.gpu: shader power transition timeout
> [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
> [39949.508369] panfrost ffe40000.gpu: shader power transition timeout
> [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout
> 
> The 2000 value has been found through trial and error testing with devices
> using G52 and G31 GPUs.
> 
> Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
> ---
>   drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 9063ce254642..fd8e44992184 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>   
>   	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>   	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
> -					 val, !val, 1, 1000);
> +					 val, !val, 1, 2000);
>   	if (ret)
>   		dev_err(pfdev->dev, "shader power transition timeout");
>   
>   	gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
>   	ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
> -					 val, !val, 1, 1000);
> +					 val, !val, 1, 2000);

Are you sure that you need to raise the timeout for TILER as well?

Cheers,
Angelo

>   	if (ret)
>   		dev_err(pfdev->dev, "tiler power transition timeout");
>   
>   	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>   	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
> -				 val, !val, 0, 1000);
> +				 val, !val, 0, 2000);
>   	if (ret)
>   		dev_err(pfdev->dev, "l2 power transition timeout");
>   }
Steven Price March 25, 2024, 10:28 a.m. UTC | #2
On 22/03/2024 16:45, Christian Hewitt wrote:
> Increase the timeout value to prevent system logs on Amlogic boards flooding
> with power transition warnings:
> 
> [   13.047638] panfrost ffe40000.gpu: shader power transition timeout
> [   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
> [   13.937324] panfrost ffe40000.gpu: shader power transition timeout
> [   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
> ...
> [39829.506904] panfrost ffe40000.gpu: shader power transition timeout
> [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
> [39949.508369] panfrost ffe40000.gpu: shader power transition timeout
> [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout
> 
> The 2000 value has been found through trial and error testing with devices
> using G52 and G31 GPUs.

How close to 2ms did you need in your trial and error testing? I'm
wondering if we should increase it further in case this might still
trigger occasionally?

kbase seems to have a 5s (5000ms!) timeout before it will actually
complain. But equally it doesn't busy wait on the registers in the same
way as panfrost, so the impact to the rest of the system of a long wait
is less.

But 2ms doesn't sound an unreasonable timeout so:

Reviewed-by: Steven Price <steven.price@arm.com>

> Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>> ---
>  drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 9063ce254642..fd8e44992184 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>  
>  	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>  	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
> -					 val, !val, 1, 1000);
> +					 val, !val, 1, 2000);
>  	if (ret)
>  		dev_err(pfdev->dev, "shader power transition timeout");
>  
>  	gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
>  	ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
> -					 val, !val, 1, 1000);
> +					 val, !val, 1, 2000);
>  	if (ret)
>  		dev_err(pfdev->dev, "tiler power transition timeout");

As Angelo points out the tiler probably doesn't need such a long
timeout, but I can't see the harm in consistency so I'm happy with this
change. If my memory of the hardware is correct then the tiler power off
actually does very little and so I wouldn't expect it to take very long.

Steve

>  	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>  	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
> -				 val, !val, 0, 1000);
> +				 val, !val, 0, 2000);
>  	if (ret)
>  		dev_err(pfdev->dev, "l2 power transition timeout");
>  }
Christian Hewitt March 25, 2024, 11:36 a.m. UTC | #3
> On 25 Mar 2024, at 2:28 pm, Steven Price <steven.price@arm.com> wrote:
> 
> On 22/03/2024 16:45, Christian Hewitt wrote:
>> Increase the timeout value to prevent system logs on Amlogic boards flooding
>> with power transition warnings:
>> 
>> [   13.047638] panfrost ffe40000.gpu: shader power transition timeout
>> [   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
>> [   13.937324] panfrost ffe40000.gpu: shader power transition timeout
>> [   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
>> ...
>> [39829.506904] panfrost ffe40000.gpu: shader power transition timeout
>> [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
>> [39949.508369] panfrost ffe40000.gpu: shader power transition timeout
>> [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout
>> 
>> The 2000 value has been found through trial and error testing with devices
>> using G52 and G31 GPUs.
> 
> How close to 2ms did you need in your trial and error testing? I'm
> wondering if we should increase it further in case this might still
> trigger occasionally?

I backed it off progressively but still saw occasional messages at 1.6ms
so padded it a little with 2ms, and those systems haven’t shown errors
since so I currently see it as a ’safe’ value. The one possible wildcard
is testing with older T820/T628 boards; but that needs to wait until I’m
back home from a long trip and able to test them. The possible theory
being that older/slower systems might require more time. Worst case I’ll
have to send another change.

> kbase seems to have a 5s (5000ms!) timeout before it will actually
> complain. But equally it doesn't busy wait on the registers in the same
> way as panfrost, so the impact to the rest of the system of a long wait
> is less.
> 
> But 2ms doesn't sound an unreasonable timeout so:
> 
> Reviewed-by: Steven Price <steven.price@arm.com>
> 
>> Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
>> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>> ---
>> drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> index 9063ce254642..fd8e44992184 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> @@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>> 
>> gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>> ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>> -  val, !val, 1, 1000);
>> +  val, !val, 1, 2000);
>> if (ret)
>> dev_err(pfdev->dev, "shader power transition timeout");
>> 
>> gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
>> ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
>> -  val, !val, 1, 1000);
>> +  val, !val, 1, 2000);
>> if (ret)
>> dev_err(pfdev->dev, "tiler power transition timeout");
> 
> As Angelo points out the tiler probably doesn't need such a long
> timeout, but I can't see the harm in consistency so I'm happy with this
> change. If my memory of the hardware is correct then the tiler power off
> actually does very little and so I wouldn't expect it to take very long.

I’ve seen tiler timeouts once I think and thus included it, but not since
the values were increased. As long as it’s acceptable I won’t over-think
it but if more testing is needed I can look at it more.

> Steve
> 
>> gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>> ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>> -  val, !val, 0, 1000);
>> +  val, !val, 0, 2000);
>> if (ret)
>> dev_err(pfdev->dev, "l2 power transition timeout");
>> }

Christian
AngeloGioacchino Del Regno March 25, 2024, 1:07 p.m. UTC | #4
Il 25/03/24 12:36, Christian Hewitt ha scritto:
>> On 25 Mar 2024, at 2:28 pm, Steven Price <steven.price@arm.com> wrote:
>>
>> On 22/03/2024 16:45, Christian Hewitt wrote:
>>> Increase the timeout value to prevent system logs on Amlogic boards flooding
>>> with power transition warnings:
>>>
>>> [   13.047638] panfrost ffe40000.gpu: shader power transition timeout
>>> [   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
>>> [   13.937324] panfrost ffe40000.gpu: shader power transition timeout
>>> [   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
>>> ...
>>> [39829.506904] panfrost ffe40000.gpu: shader power transition timeout
>>> [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
>>> [39949.508369] panfrost ffe40000.gpu: shader power transition timeout
>>> [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout
>>>
>>> The 2000 value has been found through trial and error testing with devices
>>> using G52 and G31 GPUs.
>>
>> How close to 2ms did you need in your trial and error testing? I'm
>> wondering if we should increase it further in case this might still
>> trigger occasionally?
> 
> I backed it off progressively but still saw occasional messages at 1.6ms
> so padded it a little with 2ms, and those systems haven’t shown errors
> since so I currently see it as a ’safe’ value. The one possible wildcard
> is testing with older T820/T628 boards; but that needs to wait until I’m
> back home from a long trip and able to test them. The possible theory
> being that older/slower systems might require more time. Worst case I’ll
> have to send another change.
> 
>> kbase seems to have a 5s (5000ms!) timeout before it will actually
>> complain. But equally it doesn't busy wait on the registers in the same
>> way as panfrost, so the impact to the rest of the system of a long wait
>> is less.
>>
>> But 2ms doesn't sound an unreasonable timeout so:
>>
>> Reviewed-by: Steven Price <steven.price@arm.com>
>>
>>> Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
>>> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>> ---
>>> drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> index 9063ce254642..fd8e44992184 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> @@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>
>>> gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>> ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>> -  val, !val, 1, 1000);
>>> +  val, !val, 1, 2000);
>>> if (ret)
>>> dev_err(pfdev->dev, "shader power transition timeout");
>>>
>>> gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
>>> ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
>>> -  val, !val, 1, 1000);
>>> +  val, !val, 1, 2000);
>>> if (ret)
>>> dev_err(pfdev->dev, "tiler power transition timeout");
>>
>> As Angelo points out the tiler probably doesn't need such a long
>> timeout, but I can't see the harm in consistency so I'm happy with this
>> change. If my memory of the hardware is correct then the tiler power off
>> actually does very little and so I wouldn't expect it to take very long.
> 
> I’ve seen tiler timeouts once I think and thus included it, but not since
> the values were increased. As long as it’s acceptable I won’t over-think
> it but if more testing is needed I can look at it more.
> 

Thanks for clarifying.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Steven Price March 28, 2024, 4:30 p.m. UTC | #5
On 22/03/2024 16:45, Christian Hewitt wrote:
> Increase the timeout value to prevent system logs on Amlogic boards flooding
> with power transition warnings:
> 
> [   13.047638] panfrost ffe40000.gpu: shader power transition timeout
> [   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
> [   13.937324] panfrost ffe40000.gpu: shader power transition timeout
> [   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
> ...
> [39829.506904] panfrost ffe40000.gpu: shader power transition timeout
> [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
> [39949.508369] panfrost ffe40000.gpu: shader power transition timeout
> [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout
> 
> The 2000 value has been found through trial and error testing with devices
> using G52 and G31 GPUs.
> 
> Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>

Queued to drm-misc-fixes.

Thanks,

Steve

> ---
>  drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 9063ce254642..fd8e44992184 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>  
>  	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>  	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
> -					 val, !val, 1, 1000);
> +					 val, !val, 1, 2000);
>  	if (ret)
>  		dev_err(pfdev->dev, "shader power transition timeout");
>  
>  	gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
>  	ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
> -					 val, !val, 1, 1000);
> +					 val, !val, 1, 2000);
>  	if (ret)
>  		dev_err(pfdev->dev, "tiler power transition timeout");
>  
>  	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>  	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
> -				 val, !val, 0, 1000);
> +				 val, !val, 0, 2000);
>  	if (ret)
>  		dev_err(pfdev->dev, "l2 power transition timeout");
>  }
diff mbox series

Patch

diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 9063ce254642..fd8e44992184 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -441,19 +441,19 @@  void panfrost_gpu_power_off(struct panfrost_device *pfdev)
 
 	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
 	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
-					 val, !val, 1, 1000);
+					 val, !val, 1, 2000);
 	if (ret)
 		dev_err(pfdev->dev, "shader power transition timeout");
 
 	gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
 	ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
-					 val, !val, 1, 1000);
+					 val, !val, 1, 2000);
 	if (ret)
 		dev_err(pfdev->dev, "tiler power transition timeout");
 
 	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
 	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
-				 val, !val, 0, 1000);
+				 val, !val, 0, 2000);
 	if (ret)
 		dev_err(pfdev->dev, "l2 power transition timeout");
 }