Message ID | 20230213210049.1900681-3-ashutosh.dixit@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/hwmon: PL1 power limit fixes for ATSM | expand |
On Mon, Feb 13, 2023 at 01:00:48PM -0800, Ashutosh Dixit wrote: > Previous documentation suggested that the PL1 power limit is always enabled > in HW. However we now find this not to be the case on some platforms (such > as ATSM). Therefore enable the PL1 power limit (by setting the enable bit) > when writing the PL1 limit value to HW. > > Bspec: 51864 > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > --- > drivers/gpu/drm/i915/i915_hwmon.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > index 85195d61f89c7..7c20a6f47b92e 100644 > --- a/drivers/gpu/drm/i915/i915_hwmon.c > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > /* Computation in 64-bits to avoid overflow. Round to nearest. */ > nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER); > + nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval); > > hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit, > - PKG_PWR_LIM_1, > - REG_FIELD_PREP(PKG_PWR_LIM_1, nval)); > + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, > + nval); This patch looks right to me. But could you please open up on what exactly failed on that reverted patch? Why do we need to set the bits together? > return 0; > } > > -- > 2.38.0 >
On Tue, 14 Feb 2023 06:51:37 -0800, Rodrigo Vivi wrote: > Hi Rodrigo, > On Mon, Feb 13, 2023 at 01:00:48PM -0800, Ashutosh Dixit wrote: > > Previous documentation suggested that the PL1 power limit is always enabled > > in HW. However we now find this not to be the case on some platforms (such > > as ATSM). Therefore enable the PL1 power limit (by setting the enable bit) > > when writing the PL1 limit value to HW. > > > > Bspec: 51864 > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > --- > > drivers/gpu/drm/i915/i915_hwmon.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > index 85195d61f89c7..7c20a6f47b92e 100644 > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > /* Computation in 64-bits to avoid overflow. Round to nearest. */ > > nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER); > > + nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval); > > > > hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit, > > - PKG_PWR_LIM_1, > > - REG_FIELD_PREP(PKG_PWR_LIM_1, nval)); > > + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, > > + nval); > > This patch looks right to me. But could you please open up on what exactly > failed on that reverted patch? Why do we need to set the bits together? I had explained a bit here: https://gitlab.freedesktop.org/drm/intel/-/issues/8062#note_1761070 But will repeat. On ATSM, at power-up, PCODE sets the PL1 power limit to 0 but disables the PL1 power limit. The earlier patch had enabled the the PL1 power limit during module load itself but had left the PL1 power limit set to 0 (since there is no easy way to find out what it should be set to, on ATSM PCODE sets even the max power (which could have been used to set the PL1 limit) to 0). You can see that patch here: https://patchwork.freedesktop.org/patch/521321/?series=113578&rev=4 Now the PL1 power limit being 0 (and enabled) implies that HW will work with minimum power and therefore the lowest effective frequency. This means all workloads will run slower and this was resulting in various IGT tests timing out and GuC FW load (on resets) timing out on ATSM and that is why we had to revert that patch. In this patch I have changed the strategy and instead of enabling the PL1 power limit on module load we now enable it only when the limit is set by userspace. So at least the default CI will not be affected in this case. We can see that there no regressions on ATSM this time here: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113984v1/bat-all.html? Thanks. -- Ashutosh > > > return 0; > > } > > > > -- > > 2.38.0 > >
On Tue, Feb 14, 2023 at 12:47:23PM -0800, Dixit, Ashutosh wrote: > On Tue, 14 Feb 2023 06:51:37 -0800, Rodrigo Vivi wrote: > > > > Hi Rodrigo, > > > On Mon, Feb 13, 2023 at 01:00:48PM -0800, Ashutosh Dixit wrote: > > > Previous documentation suggested that the PL1 power limit is always enabled > > > in HW. However we now find this not to be the case on some platforms (such > > > as ATSM). Therefore enable the PL1 power limit (by setting the enable bit) > > > when writing the PL1 limit value to HW. > > > > > > Bspec: 51864 > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > > --- > > > drivers/gpu/drm/i915/i915_hwmon.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > index 85195d61f89c7..7c20a6f47b92e 100644 > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > > > /* Computation in 64-bits to avoid overflow. Round to nearest. */ > > > nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER); > > > + nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval); > > > > > > hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit, > > > - PKG_PWR_LIM_1, > > > - REG_FIELD_PREP(PKG_PWR_LIM_1, nval)); > > > + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, > > > + nval); > > > > This patch looks right to me. But could you please open up on what exactly > > failed on that reverted patch? Why do we need to set the bits together? > > I had explained a bit here: > > https://gitlab.freedesktop.org/drm/intel/-/issues/8062#note_1761070 > > But will repeat. On ATSM, at power-up, PCODE sets the PL1 power limit to 0 > but disables the PL1 power limit. The earlier patch had enabled the the PL1 > power limit during module load itself but had left the PL1 power limit set > to 0 (since there is no easy way to find out what it should be set to, on > ATSM PCODE sets even the max power (which could have been used to set the > PL1 limit) to 0). You can see that patch here: > > https://patchwork.freedesktop.org/patch/521321/?series=113578&rev=4 > > Now the PL1 power limit being 0 (and enabled) implies that HW will work > with minimum power and therefore the lowest effective frequency. This means > all workloads will run slower and this was resulting in various IGT tests > timing out and GuC FW load (on resets) timing out on ATSM and that is why > we had to revert that patch. > > In this patch I have changed the strategy and instead of enabling the PL1 > power limit on module load we now enable it only when the limit is set by > userspace. So at least the default CI will not be affected in this case. We > can see that there no regressions on ATSM this time here: > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113984v1/bat-all.html? > this makes total sense. Thank you! Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > Thanks. > -- > Ashutosh > > > > > > > > return 0; > > > } > > > > > > -- > > > 2.38.0 > > >
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index 85195d61f89c7..7c20a6f47b92e 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -385,10 +385,11 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) /* Computation in 64-bits to avoid overflow. Round to nearest. */ nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER); + nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval); hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit, - PKG_PWR_LIM_1, - REG_FIELD_PREP(PKG_PWR_LIM_1, nval)); + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, + nval); return 0; }
Previous documentation suggested that the PL1 power limit is always enabled in HW. However we now find this not to be the case on some platforms (such as ATSM). Therefore enable the PL1 power limit (by setting the enable bit) when writing the PL1 limit value to HW. Bspec: 51864 Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> --- drivers/gpu/drm/i915/i915_hwmon.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)