diff mbox series

[2/3] drm/i915/hwmon: Enable PL1 limit when writing limit value to HW

Message ID 20230213210049.1900681-3-ashutosh.dixit@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/hwmon: PL1 power limit fixes for ATSM | expand

Commit Message

Dixit, Ashutosh Feb. 13, 2023, 9 p.m. UTC
Previous documentation suggested that the PL1 power limit is always enabled
in HW. However we now find this not to be the case on some platforms (such
as ATSM). Therefore enable the PL1 power limit (by setting the enable bit)
when writing the PL1 limit value to HW.

Bspec: 51864

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Rodrigo Vivi Feb. 14, 2023, 2:51 p.m. UTC | #1
On Mon, Feb 13, 2023 at 01:00:48PM -0800, Ashutosh Dixit wrote:
> Previous documentation suggested that the PL1 power limit is always enabled
> in HW. However we now find this not to be the case on some platforms (such
> as ATSM). Therefore enable the PL1 power limit (by setting the enable bit)
> when writing the PL1 limit value to HW.
> 
> Bspec: 51864
> 
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_hwmon.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 85195d61f89c7..7c20a6f47b92e 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  
>  	/* Computation in 64-bits to avoid overflow. Round to nearest. */
>  	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
> +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
>  
>  	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> -					    PKG_PWR_LIM_1,
> -					    REG_FIELD_PREP(PKG_PWR_LIM_1, nval));
> +					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
> +					    nval);

This patch looks right to me. But could you please open up on what exactly
failed on that reverted patch? Why do we need to set the bits together?

>  	return 0;
>  }
>  
> -- 
> 2.38.0
>
Dixit, Ashutosh Feb. 14, 2023, 8:47 p.m. UTC | #2
On Tue, 14 Feb 2023 06:51:37 -0800, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Mon, Feb 13, 2023 at 01:00:48PM -0800, Ashutosh Dixit wrote:
> > Previous documentation suggested that the PL1 power limit is always enabled
> > in HW. However we now find this not to be the case on some platforms (such
> > as ATSM). Therefore enable the PL1 power limit (by setting the enable bit)
> > when writing the PL1 limit value to HW.
> >
> > Bspec: 51864
> >
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 85195d61f89c7..7c20a6f47b92e 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >
> >	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> >	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
> > +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
> >
> >	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> > -					    PKG_PWR_LIM_1,
> > -					    REG_FIELD_PREP(PKG_PWR_LIM_1, nval));
> > +					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
> > +					    nval);
>
> This patch looks right to me. But could you please open up on what exactly
> failed on that reverted patch? Why do we need to set the bits together?

I had explained a bit here:

https://gitlab.freedesktop.org/drm/intel/-/issues/8062#note_1761070

But will repeat. On ATSM, at power-up, PCODE sets the PL1 power limit to 0
but disables the PL1 power limit. The earlier patch had enabled the the PL1
power limit during module load itself but had left the PL1 power limit set
to 0 (since there is no easy way to find out what it should be set to, on
ATSM PCODE sets even the max power (which could have been used to set the
PL1 limit) to 0). You can see that patch here:

https://patchwork.freedesktop.org/patch/521321/?series=113578&rev=4

Now the PL1 power limit being 0 (and enabled) implies that HW will work
with minimum power and therefore the lowest effective frequency. This means
all workloads will run slower and this was resulting in various IGT tests
timing out and GuC FW load (on resets) timing out on ATSM and that is why
we had to revert that patch.

In this patch I have changed the strategy and instead of enabling the PL1
power limit on module load we now enable it only when the limit is set by
userspace. So at least the default CI will not be affected in this case. We
can see that there no regressions on ATSM this time here:

https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113984v1/bat-all.html?

Thanks.
--
Ashutosh



>
> >	return 0;
> >  }
> >
> > --
> > 2.38.0
> >
Rodrigo Vivi Feb. 14, 2023, 8:53 p.m. UTC | #3
On Tue, Feb 14, 2023 at 12:47:23PM -0800, Dixit, Ashutosh wrote:
> On Tue, 14 Feb 2023 06:51:37 -0800, Rodrigo Vivi wrote:
> >
> 
> Hi Rodrigo,
> 
> > On Mon, Feb 13, 2023 at 01:00:48PM -0800, Ashutosh Dixit wrote:
> > > Previous documentation suggested that the PL1 power limit is always enabled
> > > in HW. However we now find this not to be the case on some platforms (such
> > > as ATSM). Therefore enable the PL1 power limit (by setting the enable bit)
> > > when writing the PL1 limit value to HW.
> > >
> > > Bspec: 51864
> > >
> > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_hwmon.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > > index 85195d61f89c7..7c20a6f47b92e 100644
> > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > >
> > >	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> > >	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
> > > +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
> > >
> > >	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> > > -					    PKG_PWR_LIM_1,
> > > -					    REG_FIELD_PREP(PKG_PWR_LIM_1, nval));
> > > +					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
> > > +					    nval);
> >
> > This patch looks right to me. But could you please open up on what exactly
> > failed on that reverted patch? Why do we need to set the bits together?
> 
> I had explained a bit here:
> 
> https://gitlab.freedesktop.org/drm/intel/-/issues/8062#note_1761070
> 
> But will repeat. On ATSM, at power-up, PCODE sets the PL1 power limit to 0
> but disables the PL1 power limit. The earlier patch had enabled the the PL1
> power limit during module load itself but had left the PL1 power limit set
> to 0 (since there is no easy way to find out what it should be set to, on
> ATSM PCODE sets even the max power (which could have been used to set the
> PL1 limit) to 0). You can see that patch here:
> 
> https://patchwork.freedesktop.org/patch/521321/?series=113578&rev=4
> 
> Now the PL1 power limit being 0 (and enabled) implies that HW will work
> with minimum power and therefore the lowest effective frequency. This means
> all workloads will run slower and this was resulting in various IGT tests
> timing out and GuC FW load (on resets) timing out on ATSM and that is why
> we had to revert that patch.
> 
> In this patch I have changed the strategy and instead of enabling the PL1
> power limit on module load we now enable it only when the limit is set by
> userspace. So at least the default CI will not be affected in this case. We
> can see that there no regressions on ATSM this time here:
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113984v1/bat-all.html?
> 

this makes total sense. Thank you!

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


> Thanks.
> --
> Ashutosh
> 
> 
> 
> >
> > >	return 0;
> > >  }
> > >
> > > --
> > > 2.38.0
> > >
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 85195d61f89c7..7c20a6f47b92e 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -385,10 +385,11 @@  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 
 	/* Computation in 64-bits to avoid overflow. Round to nearest. */
 	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
+	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
 
 	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
-					    PKG_PWR_LIM_1,
-					    REG_FIELD_PREP(PKG_PWR_LIM_1, nval));
+					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
+					    nval);
 	return 0;
 }