diff mbox series

soc: qcom: icc-bwmon: Update zone1_thres_count to 3

Message ID 20240522081508.1488592-1-quic_kshivnan@quicinc.com (mailing list archive)
State Changes Requested
Headers show
Series soc: qcom: icc-bwmon: Update zone1_thres_count to 3 | expand

Commit Message

Shivnandan Kumar May 22, 2024, 8:15 a.m. UTC
Update zone1_thres_count to 3 from 16 so that
driver can reduce bus vote in 3 sample windows instead
of waiting for 16 windows. This is in line with downstream
implementation.

Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
---
 drivers/soc/qcom/icc-bwmon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
2.25.1

Comments

Krzysztof Kozlowski May 22, 2024, 8:28 a.m. UTC | #1
On 22/05/2024 10:15, Shivnandan Kumar wrote:
> Update zone1_thres_count to 3 from 16 so that
> driver can reduce bus vote in 3 sample windows instead
> of waiting for 16 windows. This is in line with downstream
> implementation.
> 

This might make bwmon quite jittery. I don't think downstream is the
source of truth here. Please provide some measurements *on mainline tree*.

Best regards,
Krzysztof
Shivnandan Kumar May 22, 2024, 9:05 a.m. UTC | #2
On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
> On 22/05/2024 10:15, Shivnandan Kumar wrote:
>> Update zone1_thres_count to 3 from 16 so that
>> driver can reduce bus vote in 3 sample windows instead
>> of waiting for 16 windows. This is in line with downstream
>> implementation.
>>
> 
> This might make bwmon quite jittery. I don't think downstream is the
> source of truth here. Please provide some measurements *on mainline tree*.
> 

Hi Krzysztof,

The 16-window (64 ms) waiting time is too long to reduce the bus vote.
At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at 
60FPS in 64ms. Hence, delay of 64ms in decision making will lead to 
higher power regression. I’ve tested this change for 4K video playback 
on mainline tree, and there’s a significant power-saving.

I propose to make it a tunable,so that user space can tune it
based on runtime depending on fps.

USECASE                     zone1_thres_count=16     zone1_thres_count=3
4K video playback           236.15 mA	             203.15 mA

Thanks,
Shivnandan

> Best regards,
> Krzysztof
>
Krzysztof Kozlowski May 22, 2024, 9:19 a.m. UTC | #3
On 22/05/2024 11:05, Shivnandan Kumar wrote:
> 
> 
> On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
>> On 22/05/2024 10:15, Shivnandan Kumar wrote:
>>> Update zone1_thres_count to 3 from 16 so that
>>> driver can reduce bus vote in 3 sample windows instead
>>> of waiting for 16 windows. This is in line with downstream
>>> implementation.
>>>
>>
>> This might make bwmon quite jittery. I don't think downstream is the
>> source of truth here. Please provide some measurements *on mainline tree*.
>>
> 
> Hi Krzysztof,
> 
> The 16-window (64 ms) waiting time is too long to reduce the bus vote.
> At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at 
> 60FPS in 64ms. Hence, delay of 64ms in decision making will lead to 
> higher power regression. I’ve tested this change for 4K video playback 
> on mainline tree, and there’s a significant power-saving.

Please include it, with measurement below, in the commit msg.

> I propose to make it a tunable,so that user space can tune it
> based on runtime depending on fps.>
> USECASE                     zone1_thres_count=16     zone1_thres_count=3
> 4K video playback           236.15 mA	             203.15 mA
> 
> Thanks,
> Shivnandan
> 
>> Best regards,
>> Krzysztof
>>

Best regards,
Krzysztof
Bjorn Andersson May 22, 2024, 4:57 p.m. UTC | #4
On Wed, May 22, 2024 at 02:35:21PM +0530, Shivnandan Kumar wrote:
> 
> 
> On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
> > On 22/05/2024 10:15, Shivnandan Kumar wrote:
> > > Update zone1_thres_count to 3 from 16 so that
> > > driver can reduce bus vote in 3 sample windows instead
> > > of waiting for 16 windows. This is in line with downstream
> > > implementation.
> > > 
> > 
> > This might make bwmon quite jittery. I don't think downstream is the
> > source of truth here. Please provide some measurements *on mainline tree*.
> > 
> 
> Hi Krzysztof,
> 
> The 16-window (64 ms) waiting time is too long to reduce the bus vote.
> At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at 60FPS
> in 64ms. Hence, delay of 64ms in decision making will lead to higher power
> regression. I’ve tested this change for 4K video playback on mainline tree,
> and there’s a significant power-saving.
> 

As requested by Krzysztof already, update your commit message to capture
such motivation.

Please read and follow this:
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

> I propose to make it a tunable,so that user space can tune it
> based on runtime depending on fps.
> 

I presume that in e.g. Android there could be some sort of power HAL
that tweaks this value dynamically? In a general purpose system, how do
we make sure that this value stays relevant for multiple types of use
cases?

> USECASE                     zone1_thres_count=16     zone1_thres_count=3
> 4K video playback           236.15 mA	             203.15 mA

4k video playback is a fairly specific (and generally unusual) use case.
Is there any impact (negative or positive) for other use
cases/workloads?

Regards,
Bjorn

> 
> Thanks,
> Shivnandan
> 
> > Best regards,
> > Krzysztof
> >
diff mbox series

Patch

diff --git a/drivers/soc/qcom/icc-bwmon.c b/drivers/soc/qcom/icc-bwmon.c
index 656706259353..f1065427bb80 100644
--- a/drivers/soc/qcom/icc-bwmon.c
+++ b/drivers/soc/qcom/icc-bwmon.c
@@ -815,7 +815,7 @@  static const struct icc_bwmon_data msm8998_bwmon_data = {
 static const struct icc_bwmon_data sdm845_cpu_bwmon_data = {
 	.sample_ms = 4,
 	.count_unit_kb = 64,
-	.zone1_thres_count = 16,
+	.zone1_thres_count = 3,
 	.zone3_thres_count = 1,
 	.quirks = BWMON_HAS_GLOBAL_IRQ,
 	.regmap_fields = sdm845_cpu_bwmon_reg_fields,
@@ -834,7 +834,7 @@  static const struct icc_bwmon_data sdm845_llcc_bwmon_data = {
 static const struct icc_bwmon_data sc7280_llcc_bwmon_data = {
 	.sample_ms = 4,
 	.count_unit_kb = 64,
-	.zone1_thres_count = 16,
+	.zone1_thres_count = 3,
 	.zone3_thres_count = 1,
 	.quirks = BWMON_NEEDS_FORCE_CLEAR,
 	.regmap_fields = sdm845_llcc_bwmon_reg_fields,