diff mbox series

[v8,06/17] pwm: lpss: Use pwm_lpss_restore() when restoring state on resume

Message ID 20200830125753.230420-7-hdegoede@redhat.com (mailing list archive)
State New, archived
Headers show
Series acpi/pwm/i915: Convert pwm-crc and i915 driver's PWM code to use the atomic PWM API | expand

Commit Message

Hans de Goede Aug. 30, 2020, 12:57 p.m. UTC
Before this commit a suspend + resume of the LPSS PWM controller
would result in the controller being reset to its defaults of
output-freq = clock/256, duty-cycle=100%, until someone changes
to the output-freq and/or duty-cycle are made.

This problem has been masked so far because the main consumer
(the i915 driver) was always making duty-cycle changes on resume.
With the conversion of the i915 driver to the atomic PWM API the
driver now only disables/enables the PWM on suspend/resume leaving
the output-freq and duty as is, triggering this problem.

The LPSS PWM controller has a mechanism where the ctrl register value
and the actual base-unit and on-time-div values used are latched. When
software sets the SW_UPDATE bit then at the end of the current PWM cycle,
the new values from the ctrl-register will be latched into the actual
registers, and the SW_UPDATE bit will be cleared.

The problem is that before this commit our suspend/resume handling
consisted of simply saving the PWM ctrl register on suspend and
restoring it on resume, without setting the PWM_SW_UPDATE bit.
When the controller has lost its state over a suspend/resume and thus
has been reset to the defaults, just restoring the register is not
enough. We must also set the SW_UPDATE bit to tell the controller to
latch the restored values into the actual registers.

Fixing this problem is not as simple as just or-ing in the value which
is being restored with SW_UPDATE. If the PWM was enabled before we must
write the new settings + PWM_SW_UPDATE before setting PWM_ENABLE.
We must also wait for PWM_SW_UPDATE to become 0 again and depending on the
model we must do this either before or after the setting of PWM_ENABLE.

All the necessary logic for doing this is already present inside
pwm_lpss_apply(), so instead of duplicating this inside the resume
handler, this commit adds a new pwm_lpss_restore() helper which mirrors
pwm_lpss_apply() minus the runtime-pm reference handling (which we should
not change on resume).

This fixes the output-freq and duty-cycle being reset to their defaults
on resume.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
Changes in v8:
- Drop optimization to skip restore if current ctrl reg is the same as our saved
  ctrl reg value (because this causes issues on some devices)
- Simplify pwm_lpss_restore_state() to not rely on the current state
- Modify commit message to mention the new pwm_lpss_restore_state() helper

Changes in v6:
- Add a pwm_lpss_restore_state() helper for re-applying the PWM state on resume

Changes in v5:
- The changes to pwm_lpss_apply() are much cleaner now thanks to the new
  pwm_lpss_prepare_enable() helper.

Changes in v3:
- This replaces the "pwm: lpss: Set SW_UPDATE bit when enabling the PWM"
  patch from previous versions of this patch-set, which really was a hack
  working around the resume issue which this patch fixes properly.
---
 drivers/pwm/pwm-lpss.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

Comments

Thierry Reding Aug. 31, 2020, 11:10 a.m. UTC | #1
On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
> Before this commit a suspend + resume of the LPSS PWM controller
> would result in the controller being reset to its defaults of
> output-freq = clock/256, duty-cycle=100%, until someone changes
> to the output-freq and/or duty-cycle are made.
> 
> This problem has been masked so far because the main consumer
> (the i915 driver) was always making duty-cycle changes on resume.
> With the conversion of the i915 driver to the atomic PWM API the
> driver now only disables/enables the PWM on suspend/resume leaving
> the output-freq and duty as is, triggering this problem.

Doesn't this imply that there's another bug at play here? At the PWM API
level you're applying a state and it's up to the driver to ensure that
the hardware state after ->apply() is what the software has requested.

If you only switch the enable state and that doesn't cause period and
duty cycle to be updated it means that your driver isn't writing those
registers when it should be.

> The LPSS PWM controller has a mechanism where the ctrl register value
> and the actual base-unit and on-time-div values used are latched. When
> software sets the SW_UPDATE bit then at the end of the current PWM cycle,
> the new values from the ctrl-register will be latched into the actual
> registers, and the SW_UPDATE bit will be cleared.
> 
> The problem is that before this commit our suspend/resume handling
> consisted of simply saving the PWM ctrl register on suspend and
> restoring it on resume, without setting the PWM_SW_UPDATE bit.
> When the controller has lost its state over a suspend/resume and thus
> has been reset to the defaults, just restoring the register is not
> enough. We must also set the SW_UPDATE bit to tell the controller to
> latch the restored values into the actual registers.
> 
> Fixing this problem is not as simple as just or-ing in the value which
> is being restored with SW_UPDATE. If the PWM was enabled before we must
> write the new settings + PWM_SW_UPDATE before setting PWM_ENABLE.
> We must also wait for PWM_SW_UPDATE to become 0 again and depending on the
> model we must do this either before or after the setting of PWM_ENABLE.
> 
> All the necessary logic for doing this is already present inside
> pwm_lpss_apply(), so instead of duplicating this inside the resume
> handler, this commit adds a new pwm_lpss_restore() helper which mirrors
> pwm_lpss_apply() minus the runtime-pm reference handling (which we should
> not change on resume).

If this is all already implemented in pwm_lpss_apply(), why isn't it
working for the suspend/resume case? It shouldn't matter that the
consumer only changes the enable/disable state. After ->apply()
successfully returns your hardware should be programmed with exactly
the state that the consumer requested.

Thierry
Hans de Goede Aug. 31, 2020, 11:46 a.m. UTC | #2
Hi,

On 8/31/20 1:10 PM, Thierry Reding wrote:
> On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
>> Before this commit a suspend + resume of the LPSS PWM controller
>> would result in the controller being reset to its defaults of
>> output-freq = clock/256, duty-cycle=100%, until someone changes
>> to the output-freq and/or duty-cycle are made.
>>
>> This problem has been masked so far because the main consumer
>> (the i915 driver) was always making duty-cycle changes on resume.
>> With the conversion of the i915 driver to the atomic PWM API the
>> driver now only disables/enables the PWM on suspend/resume leaving
>> the output-freq and duty as is, triggering this problem.
> 
> Doesn't this imply that there's another bug at play here? At the PWM API
> level you're applying a state and it's up to the driver to ensure that
> the hardware state after ->apply() is what the software has requested.
> 
> If you only switch the enable state and that doesn't cause period and
> duty cycle to be updated it means that your driver isn't writing those
> registers when it should be.

Right, the driver was not committing those as it should *on resume*,
that and it skips setting the update bit on the subsequent enable,
which is an optimization which gets removed in 7/17.

Before switching the i915 driver over to atomic, when the LPSS-PWM
was used for the backlight we got the following order on suspend/resume

1. Set duty-cycle to 0%
2. Set enabled to 0
3. Save ctrl reg
4. Power-off PWM controller, it now looses all its state
5. Power-on PWM ctrl
6. Restore ctrl reg (as a single reg write)
7. Set enabled to 1, at this point one would expect the
duty/freq from the restored ctrl-reg to apply, but:
a) The resume code never sets the update bit (which this commit fixes); and
b) On applying the pwm_state with enabled=1 the code applying the
state does this (before setting the enabled bit in the ctrl reg):

	if (orig_ctrl != ctrl) {
		pwm_lpss_write(pwm, ctrl);
		pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
	}
and since the restore of the ctrl reg set the old duty/freq the
writes are skipped, so the update bit never gets set.

8. Set duty-cycle to the pre-suspend value (which is not 0)
this does cause a change in the ctrl-reg, so now the update flag
does get set.

Note that 1-2 and 7-8 are both done by the non atomic i915 code,
when moving the i915 code to atomic I decided that having these
2 separate steps here is non-sense, so the new i915 code just
toggles the enable bit. So in essence the new atomic PWM
i915 code drops step 1 and 8.

Dropping steps 8 means that the update bit never gets set and we
end up with the PWM running at its power-on-reset duty cycle.

You are correct in your remark to patch 7/17 that since that removes
the if (orig_ctrl != ctrl) for the writes that now step 7 will be
sufficient to get the PWM to work again. But that only takes the i915
usage into account.

What if the PWM is used through the sysfs userspace API?
Then only steps 3-6 will happen on suspend-resume and without
fixing step 6 to properly restore the PWM controller in its
pre-resume state (this patch) it will once again be running at
its power-on-reset defaults instead of the values from the
restored control register.

So at step 6, if the PWM was enabled before, we must set the update
bit, and then wait for it to clear again so the controller is
ready for subsequent updates. The waiting for it to clear again
needs to happen before or after setting the enable bit depending
on the hw generation, which leads to this patch.

I hope that helps explain why this patch is the correct thing
to do.


>> The LPSS PWM controller has a mechanism where the ctrl register value
>> and the actual base-unit and on-time-div values used are latched. When
>> software sets the SW_UPDATE bit then at the end of the current PWM cycle,
>> the new values from the ctrl-register will be latched into the actual
>> registers, and the SW_UPDATE bit will be cleared.
>>
>> The problem is that before this commit our suspend/resume handling
>> consisted of simply saving the PWM ctrl register on suspend and
>> restoring it on resume, without setting the PWM_SW_UPDATE bit.
>> When the controller has lost its state over a suspend/resume and thus
>> has been reset to the defaults, just restoring the register is not
>> enough. We must also set the SW_UPDATE bit to tell the controller to
>> latch the restored values into the actual registers.
>>
>> Fixing this problem is not as simple as just or-ing in the value which
>> is being restored with SW_UPDATE. If the PWM was enabled before we must
>> write the new settings + PWM_SW_UPDATE before setting PWM_ENABLE.
>> We must also wait for PWM_SW_UPDATE to become 0 again and depending on the
>> model we must do this either before or after the setting of PWM_ENABLE.
>>
>> All the necessary logic for doing this is already present inside
>> pwm_lpss_apply(), so instead of duplicating this inside the resume
>> handler, this commit adds a new pwm_lpss_restore() helper which mirrors
>> pwm_lpss_apply() minus the runtime-pm reference handling (which we should
>> not change on resume).
> 
> If this is all already implemented in pwm_lpss_apply(), why isn't it
> working for the suspend/resume case? It shouldn't matter that the
> consumer only changes the enable/disable state. After ->apply()
> successfully returns your hardware should be programmed with exactly
> the state that the consumer requested.

See above, apply() was trying to be smart but the restore of ctrl
on resume without setting the update bit was tricking it. That
being too smart for its own good is removed in 7/16 as you
rightfully point out. But this patch is still necessary for the
PWM controller to be in the expected state between resume and the
first apply() after resume (which may be quite a long time in
the future when using e.g. the sysfs API).

Regards,

Hans
Thierry Reding Aug. 31, 2020, 1:15 p.m. UTC | #3
On Mon, Aug 31, 2020 at 01:46:28PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 8/31/20 1:10 PM, Thierry Reding wrote:
> > On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
> > > Before this commit a suspend + resume of the LPSS PWM controller
> > > would result in the controller being reset to its defaults of
> > > output-freq = clock/256, duty-cycle=100%, until someone changes
> > > to the output-freq and/or duty-cycle are made.
> > > 
> > > This problem has been masked so far because the main consumer
> > > (the i915 driver) was always making duty-cycle changes on resume.
> > > With the conversion of the i915 driver to the atomic PWM API the
> > > driver now only disables/enables the PWM on suspend/resume leaving
> > > the output-freq and duty as is, triggering this problem.
> > 
> > Doesn't this imply that there's another bug at play here? At the PWM API
> > level you're applying a state and it's up to the driver to ensure that
> > the hardware state after ->apply() is what the software has requested.
> > 
> > If you only switch the enable state and that doesn't cause period and
> > duty cycle to be updated it means that your driver isn't writing those
> > registers when it should be.
> 
> Right, the driver was not committing those as it should *on resume*,
> that and it skips setting the update bit on the subsequent enable,
> which is an optimization which gets removed in 7/17.
> 
> Before switching the i915 driver over to atomic, when the LPSS-PWM
> was used for the backlight we got the following order on suspend/resume
> 
> 1. Set duty-cycle to 0%
> 2. Set enabled to 0
> 3. Save ctrl reg
> 4. Power-off PWM controller, it now looses all its state
> 5. Power-on PWM ctrl
> 6. Restore ctrl reg (as a single reg write)
> 7. Set enabled to 1, at this point one would expect the
> duty/freq from the restored ctrl-reg to apply, but:
> a) The resume code never sets the update bit (which this commit fixes); and
> b) On applying the pwm_state with enabled=1 the code applying the
> state does this (before setting the enabled bit in the ctrl reg):
> 
> 	if (orig_ctrl != ctrl) {
> 		pwm_lpss_write(pwm, ctrl);
> 		pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
> 	}
> and since the restore of the ctrl reg set the old duty/freq the
> writes are skipped, so the update bit never gets set.
> 
> 8. Set duty-cycle to the pre-suspend value (which is not 0)
> this does cause a change in the ctrl-reg, so now the update flag
> does get set.
> 
> Note that 1-2 and 7-8 are both done by the non atomic i915 code,
> when moving the i915 code to atomic I decided that having these
> 2 separate steps here is non-sense, so the new i915 code just
> toggles the enable bit. So in essence the new atomic PWM
> i915 code drops step 1 and 8.
> 
> Dropping steps 8 means that the update bit never gets set and we
> end up with the PWM running at its power-on-reset duty cycle.
> 
> You are correct in your remark to patch 7/17 that since that removes
> the if (orig_ctrl != ctrl) for the writes that now step 7 will be
> sufficient to get the PWM to work again. But that only takes the i915
> usage into account.
> 
> What if the PWM is used through the sysfs userspace API?
> Then only steps 3-6 will happen on suspend-resume and without
> fixing step 6 to properly restore the PWM controller in its
> pre-resume state (this patch) it will once again be running at
> its power-on-reset defaults instead of the values from the
> restored control register.

Actually PWM's sysfs code has suspend/resume callbacks that basically
make sysfs just a regular consumer of PWMs. So they do end up doing a
pwm_apply_state() on the PWM as well on suspend and restore the state
from before suspend on resume.

This was done very specifically because the suspend/resume order can be
unexpected under some circumstances, so for PWM we really want for the
consumer to always have ultimate control over when precisely the PWM is
restored on resume.

The reason why we did this was because people observed weird glitches on
suspend/resume with different severity. In some cases a backlight would
be resumed before the display controller had had a chance to start
sending frames, causing on-screen corruption in some cases (such as
smart displays) and in other cases a PWM-controller regulator would be
resumed too late or too early, which I think was causing some issue with
the CPUs not working properly on resume.

So I'd prefer not to have any PWM driver save and restore its own
context on suspend/resume, because that's inevitably going to cause
unexpected behaviour at some point. If it's absolutely necessary we can
of course still do that, but I think in that case we need to at least
add a comment in the code about why context save/restore is needed in
this particular case and make it clear that this is not something that
other drivers should copy because they most likely won't be needing it.

Given the above it also doesn't sound to me like there's a real problem,
or at least that the bug is somewhere else. A consumer should always be
responsible for applying the pre-suspend state upon resume and it sounds
like that would be true after patch 7. Since sysfs is just a regular
consumer, the same should apply for sysfs-controlled PWMs as well.

> So at step 6, if the PWM was enabled before, we must set the update
> bit, and then wait for it to clear again so the controller is
> ready for subsequent updates. The waiting for it to clear again
> needs to happen before or after setting the enable bit depending
> on the hw generation, which leads to this patch.

But all of that should be happening as part of the call to
pwm_apply_state(), right? That path should be taken for all consumers on
resume, including sysfs.

> I hope that helps explain why this patch is the correct thing
> to do.
> 
> 
> > > The LPSS PWM controller has a mechanism where the ctrl register value
> > > and the actual base-unit and on-time-div values used are latched. When
> > > software sets the SW_UPDATE bit then at the end of the current PWM cycle,
> > > the new values from the ctrl-register will be latched into the actual
> > > registers, and the SW_UPDATE bit will be cleared.
> > > 
> > > The problem is that before this commit our suspend/resume handling
> > > consisted of simply saving the PWM ctrl register on suspend and
> > > restoring it on resume, without setting the PWM_SW_UPDATE bit.
> > > When the controller has lost its state over a suspend/resume and thus
> > > has been reset to the defaults, just restoring the register is not
> > > enough. We must also set the SW_UPDATE bit to tell the controller to
> > > latch the restored values into the actual registers.
> > > 
> > > Fixing this problem is not as simple as just or-ing in the value which
> > > is being restored with SW_UPDATE. If the PWM was enabled before we must
> > > write the new settings + PWM_SW_UPDATE before setting PWM_ENABLE.
> > > We must also wait for PWM_SW_UPDATE to become 0 again and depending on the
> > > model we must do this either before or after the setting of PWM_ENABLE.
> > > 
> > > All the necessary logic for doing this is already present inside
> > > pwm_lpss_apply(), so instead of duplicating this inside the resume
> > > handler, this commit adds a new pwm_lpss_restore() helper which mirrors
> > > pwm_lpss_apply() minus the runtime-pm reference handling (which we should
> > > not change on resume).
> > 
> > If this is all already implemented in pwm_lpss_apply(), why isn't it
> > working for the suspend/resume case? It shouldn't matter that the
> > consumer only changes the enable/disable state. After ->apply()
> > successfully returns your hardware should be programmed with exactly
> > the state that the consumer requested.
> 
> See above, apply() was trying to be smart but the restore of ctrl
> on resume without setting the update bit was tricking it. That
> being too smart for its own good is removed in 7/16 as you
> rightfully point out. But this patch is still necessary for the
> PWM controller to be in the expected state between resume and the
> first apply() after resume (which may be quite a long time in
> the future when using e.g. the sysfs API).

Like I said, the sysfs code should be resuming any exported PWMs on
resume just like any other consumer.

Obviously it's always up to the consumer to call pwm_apply_state() at
the right time. If that's "too late" for some reason, then that's a bug
in the consumer driver. But as I explained above there are a number of
cases where restoring context in the PWM driver itself doesn't work
because it can cause sequencing issues.

Thierry
Hans de Goede Aug. 31, 2020, 5:57 p.m. UTC | #4
Hi,

On 8/31/20 3:15 PM, Thierry Reding wrote:
> On Mon, Aug 31, 2020 at 01:46:28PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 8/31/20 1:10 PM, Thierry Reding wrote:
>>> On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
>>>> Before this commit a suspend + resume of the LPSS PWM controller
>>>> would result in the controller being reset to its defaults of
>>>> output-freq = clock/256, duty-cycle=100%, until someone changes
>>>> to the output-freq and/or duty-cycle are made.
>>>>
>>>> This problem has been masked so far because the main consumer
>>>> (the i915 driver) was always making duty-cycle changes on resume.
>>>> With the conversion of the i915 driver to the atomic PWM API the
>>>> driver now only disables/enables the PWM on suspend/resume leaving
>>>> the output-freq and duty as is, triggering this problem.
>>>
>>> Doesn't this imply that there's another bug at play here? At the PWM API
>>> level you're applying a state and it's up to the driver to ensure that
>>> the hardware state after ->apply() is what the software has requested.
>>>
>>> If you only switch the enable state and that doesn't cause period and
>>> duty cycle to be updated it means that your driver isn't writing those
>>> registers when it should be.
>>
>> Right, the driver was not committing those as it should *on resume*,
>> that and it skips setting the update bit on the subsequent enable,
>> which is an optimization which gets removed in 7/17.
>>
>> Before switching the i915 driver over to atomic, when the LPSS-PWM
>> was used for the backlight we got the following order on suspend/resume
>>
>> 1. Set duty-cycle to 0%
>> 2. Set enabled to 0
>> 3. Save ctrl reg
>> 4. Power-off PWM controller, it now looses all its state
>> 5. Power-on PWM ctrl
>> 6. Restore ctrl reg (as a single reg write)
>> 7. Set enabled to 1, at this point one would expect the
>> duty/freq from the restored ctrl-reg to apply, but:
>> a) The resume code never sets the update bit (which this commit fixes); and
>> b) On applying the pwm_state with enabled=1 the code applying the
>> state does this (before setting the enabled bit in the ctrl reg):
>>
>> 	if (orig_ctrl != ctrl) {
>> 		pwm_lpss_write(pwm, ctrl);
>> 		pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
>> 	}
>> and since the restore of the ctrl reg set the old duty/freq the
>> writes are skipped, so the update bit never gets set.
>>
>> 8. Set duty-cycle to the pre-suspend value (which is not 0)
>> this does cause a change in the ctrl-reg, so now the update flag
>> does get set.
>>
>> Note that 1-2 and 7-8 are both done by the non atomic i915 code,
>> when moving the i915 code to atomic I decided that having these
>> 2 separate steps here is non-sense, so the new i915 code just
>> toggles the enable bit. So in essence the new atomic PWM
>> i915 code drops step 1 and 8.
>>
>> Dropping steps 8 means that the update bit never gets set and we
>> end up with the PWM running at its power-on-reset duty cycle.
>>
>> You are correct in your remark to patch 7/17 that since that removes
>> the if (orig_ctrl != ctrl) for the writes that now step 7 will be
>> sufficient to get the PWM to work again. But that only takes the i915
>> usage into account.
>>
>> What if the PWM is used through the sysfs userspace API?
>> Then only steps 3-6 will happen on suspend-resume and without
>> fixing step 6 to properly restore the PWM controller in its
>> pre-resume state (this patch) it will once again be running at
>> its power-on-reset defaults instead of the values from the
>> restored control register.
> 
> Actually PWM's sysfs code has suspend/resume callbacks that basically
> make sysfs just a regular consumer of PWMs. So they do end up doing a
> pwm_apply_state() on the PWM as well on suspend and restore the state
> from before suspend on resume.
> 
> This was done very specifically because the suspend/resume order can be
> unexpected under some circumstances, so for PWM we really want for the
> consumer to always have ultimate control over when precisely the PWM is
> restored on resume.
>
> The reason why we did this was because people observed weird glitches on
> suspend/resume with different severity. In some cases a backlight would
> be resumed before the display controller had had a chance to start
> sending frames, causing on-screen corruption in some cases (such as
> smart displays) and in other cases a PWM-controller regulator would be
> resumed too late or too early, which I think was causing some issue with
> the CPUs not working properly on resume.
> 
> So I'd prefer not to have any PWM driver save and restore its own
> context on suspend/resume, because that's inevitably going to cause
> unexpected behaviour at some point. If it's absolutely necessary we can
> of course still do that, but I think in that case we need to at least
> add a comment in the code about why context save/restore is needed in
> this particular case and make it clear that this is not something that
> other drivers should copy because they most likely won't be needing it.
> 
> Given the above it also doesn't sound to me like there's a real problem,
> or at least that the bug is somewhere else. A consumer should always be
> responsible for applying the pre-suspend state upon resume and it sounds
> like that would be true after patch 7. Since sysfs is just a regular
> consumer, the same should apply for sysfs-controlled PWMs as well.

Ok, I was not aware that for PWM the consumer is supposed to always
be the one to restore the state. If that is the rule then we should probably
just drop the save/restore suspend/resume code from pwm-lpss.

It seems that I'm actually responsible for adding that suspend/resume
code in the first place, see commit 1d375b58c12f ("pwm: lpss: platform:
Save/restore the ctrl register over a suspend/resume") although the
ctrl-register was already being saved/restored before that commit
but then by the acpi/acpi_lpss.c code.

One worry after dropping the suspend/resume save/restore code is what
happens if the controller was enabled at the moment the system suspends
and the consumers first post resume apply() call has pwm_state.enabled
set too.

Currently pwm_lpss_apply() looks like this:

         if (state->enabled) {
                 if (!pwm_is_enabled(pwm)) {
                         pm_runtime_get_sync(chip->dev);
                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
                         if (ret)
                                 pm_runtime_put(chip->dev);
                 } else {
                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, false);
                 }
         } else if (pwm_is_enabled(pwm)) {

Where the true / false parameter to pwm_lpss_prepare_enable()
decides if pwm_lpss_prepare_enable() sets the enable bit in the controllers
ctrl register, or if it skips that.

If we then come from a full system suspend (controller loses state,
comes up with enable bit cleared) we will still enter the:

                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, false);

Path since the !pwm_is_enabled(pwm) check checks the pwm_state struct,
not the actual hw-enabled bit and then we do not (re)set the enabled after
resume as we should when apply() is called with pwm_state.enabled set.

Fixing this is easy though, we still need to check for the disabled ->
enabled transition for runtime pm refcounting, but we can also tell
pwm_lpss_prepare_enable() to set the enable bit in the other path, this
will be a no-op in case it is already set.

So then the new apply() code would become:

         if (state->enabled) {
                 if (!pwm_is_enabled(pwm)) {
                         pm_runtime_get_sync(chip->dev);
                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
                         if (ret)
                                 pm_runtime_put(chip->dev);
                 } else {
                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
                 }
         } else if (pwm_is_enabled(pwm)) {

(and we can even optimize out the enable parameter to pwm_lpss_prepare_enable
then and always make it set the enable bit).

Together with patch 07/16 will make apply() always work independent of
the state the controller was in before it got called. Which in light of
all the subtle issues we have seen surrounding this is likely a good thing.

And with the fix to make apply() fully independent of the previous state
of the controller, I'm all for dropping the suspend/resume state
save/restore code.  Doing that makes things more KISS so I like it :)

>> So at step 6, if the PWM was enabled before, we must set the update
>> bit, and then wait for it to clear again so the controller is
>> ready for subsequent updates. The waiting for it to clear again
>> needs to happen before or after setting the enable bit depending
>> on the hw generation, which leads to this patch.
> 
> But all of that should be happening as part of the call to
> pwm_apply_state(), right? That path should be taken for all consumers on
> resume, including sysfs.

Ack.

<snip>

>> See above, apply() was trying to be smart but the restore of ctrl
>> on resume without setting the update bit was tricking it. That
>> being too smart for its own good is removed in 7/16 as you
>> rightfully point out. But this patch is still necessary for the
>> PWM controller to be in the expected state between resume and the
>> first apply() after resume (which may be quite a long time in
>> the future when using e.g. the sysfs API).
> 
> Like I said, the sysfs code should be resuming any exported PWMs on
> resume just like any other consumer.
> 
> Obviously it's always up to the consumer to call pwm_apply_state() at
> the right time. If that's "too late" for some reason, then that's a bug
> in the consumer driver. But as I explained above there are a number of
> cases where restoring context in the PWM driver itself doesn't work
> because it can cause sequencing issues.

Ack, I was not aware that PWM consumers are responsible for restoring
their own state on resume. If that is the case then:

TL;DR:

1. We (I) should make apply() work independent of the current
hardware state, instead of having it make various assumptions
about that state as it now does.

2. We (I) should drop the suspend/resume save/restore state
handlers from pwm-lpss completely.

So I believe that I should prepare yet another version of this
patch-set replacing 06/17 and 07/17 with 2 patches doing these
2 things.

Thierry, Andy, does that sound good to you ?

Regards,

Hans
Andy Shevchenko Sept. 1, 2020, 8:09 a.m. UTC | #5
On Mon, Aug 31, 2020 at 07:57:30PM +0200, Hans de Goede wrote:
> On 8/31/20 3:15 PM, Thierry Reding wrote:
> > On Mon, Aug 31, 2020 at 01:46:28PM +0200, Hans de Goede wrote:
> > > On 8/31/20 1:10 PM, Thierry Reding wrote:
> > > > On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
> > > > > Before this commit a suspend + resume of the LPSS PWM controller
> > > > > would result in the controller being reset to its defaults of
> > > > > output-freq = clock/256, duty-cycle=100%, until someone changes
> > > > > to the output-freq and/or duty-cycle are made.
> > > > > 
> > > > > This problem has been masked so far because the main consumer
> > > > > (the i915 driver) was always making duty-cycle changes on resume.
> > > > > With the conversion of the i915 driver to the atomic PWM API the
> > > > > driver now only disables/enables the PWM on suspend/resume leaving
> > > > > the output-freq and duty as is, triggering this problem.
> > > > 
> > > > Doesn't this imply that there's another bug at play here? At the PWM API
> > > > level you're applying a state and it's up to the driver to ensure that
> > > > the hardware state after ->apply() is what the software has requested.
> > > > 
> > > > If you only switch the enable state and that doesn't cause period and
> > > > duty cycle to be updated it means that your driver isn't writing those
> > > > registers when it should be.
> > > 
> > > Right, the driver was not committing those as it should *on resume*,
> > > that and it skips setting the update bit on the subsequent enable,
> > > which is an optimization which gets removed in 7/17.
> > > 
> > > Before switching the i915 driver over to atomic, when the LPSS-PWM
> > > was used for the backlight we got the following order on suspend/resume
> > > 
> > > 1. Set duty-cycle to 0%
> > > 2. Set enabled to 0
> > > 3. Save ctrl reg
> > > 4. Power-off PWM controller, it now looses all its state
> > > 5. Power-on PWM ctrl
> > > 6. Restore ctrl reg (as a single reg write)
> > > 7. Set enabled to 1, at this point one would expect the
> > > duty/freq from the restored ctrl-reg to apply, but:
> > > a) The resume code never sets the update bit (which this commit fixes); and
> > > b) On applying the pwm_state with enabled=1 the code applying the
> > > state does this (before setting the enabled bit in the ctrl reg):
> > > 
> > > 	if (orig_ctrl != ctrl) {
> > > 		pwm_lpss_write(pwm, ctrl);
> > > 		pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
> > > 	}
> > > and since the restore of the ctrl reg set the old duty/freq the
> > > writes are skipped, so the update bit never gets set.
> > > 
> > > 8. Set duty-cycle to the pre-suspend value (which is not 0)
> > > this does cause a change in the ctrl-reg, so now the update flag
> > > does get set.
> > > 
> > > Note that 1-2 and 7-8 are both done by the non atomic i915 code,
> > > when moving the i915 code to atomic I decided that having these
> > > 2 separate steps here is non-sense, so the new i915 code just
> > > toggles the enable bit. So in essence the new atomic PWM
> > > i915 code drops step 1 and 8.
> > > 
> > > Dropping steps 8 means that the update bit never gets set and we
> > > end up with the PWM running at its power-on-reset duty cycle.
> > > 
> > > You are correct in your remark to patch 7/17 that since that removes
> > > the if (orig_ctrl != ctrl) for the writes that now step 7 will be
> > > sufficient to get the PWM to work again. But that only takes the i915
> > > usage into account.
> > > 
> > > What if the PWM is used through the sysfs userspace API?
> > > Then only steps 3-6 will happen on suspend-resume and without
> > > fixing step 6 to properly restore the PWM controller in its
> > > pre-resume state (this patch) it will once again be running at
> > > its power-on-reset defaults instead of the values from the
> > > restored control register.
> > 
> > Actually PWM's sysfs code has suspend/resume callbacks that basically
> > make sysfs just a regular consumer of PWMs. So they do end up doing a
> > pwm_apply_state() on the PWM as well on suspend and restore the state
> > from before suspend on resume.
> > 
> > This was done very specifically because the suspend/resume order can be
> > unexpected under some circumstances, so for PWM we really want for the
> > consumer to always have ultimate control over when precisely the PWM is
> > restored on resume.
> > 
> > The reason why we did this was because people observed weird glitches on
> > suspend/resume with different severity. In some cases a backlight would
> > be resumed before the display controller had had a chance to start
> > sending frames, causing on-screen corruption in some cases (such as
> > smart displays) and in other cases a PWM-controller regulator would be
> > resumed too late or too early, which I think was causing some issue with
> > the CPUs not working properly on resume.
> > 
> > So I'd prefer not to have any PWM driver save and restore its own
> > context on suspend/resume, because that's inevitably going to cause
> > unexpected behaviour at some point. If it's absolutely necessary we can
> > of course still do that, but I think in that case we need to at least
> > add a comment in the code about why context save/restore is needed in
> > this particular case and make it clear that this is not something that
> > other drivers should copy because they most likely won't be needing it.
> > 
> > Given the above it also doesn't sound to me like there's a real problem,
> > or at least that the bug is somewhere else. A consumer should always be
> > responsible for applying the pre-suspend state upon resume and it sounds
> > like that would be true after patch 7. Since sysfs is just a regular
> > consumer, the same should apply for sysfs-controlled PWMs as well.
> 
> Ok, I was not aware that for PWM the consumer is supposed to always
> be the one to restore the state. If that is the rule then we should probably
> just drop the save/restore suspend/resume code from pwm-lpss.
> 
> It seems that I'm actually responsible for adding that suspend/resume
> code in the first place, see commit 1d375b58c12f ("pwm: lpss: platform:
> Save/restore the ctrl register over a suspend/resume") although the
> ctrl-register was already being saved/restored before that commit
> but then by the acpi/acpi_lpss.c code.
> 
> One worry after dropping the suspend/resume save/restore code is what
> happens if the controller was enabled at the moment the system suspends
> and the consumers first post resume apply() call has pwm_state.enabled
> set too.
> 
> Currently pwm_lpss_apply() looks like this:
> 
>         if (state->enabled) {
>                 if (!pwm_is_enabled(pwm)) {
>                         pm_runtime_get_sync(chip->dev);
>                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
>                         if (ret)
>                                 pm_runtime_put(chip->dev);
>                 } else {
>                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, false);
>                 }
>         } else if (pwm_is_enabled(pwm)) {
> 
> Where the true / false parameter to pwm_lpss_prepare_enable()
> decides if pwm_lpss_prepare_enable() sets the enable bit in the controllers
> ctrl register, or if it skips that.
> 
> If we then come from a full system suspend (controller loses state,
> comes up with enable bit cleared) we will still enter the:
> 
>                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, false);
> 
> Path since the !pwm_is_enabled(pwm) check checks the pwm_state struct,
> not the actual hw-enabled bit and then we do not (re)set the enabled after
> resume as we should when apply() is called with pwm_state.enabled set.
> 
> Fixing this is easy though, we still need to check for the disabled ->
> enabled transition for runtime pm refcounting, but we can also tell
> pwm_lpss_prepare_enable() to set the enable bit in the other path, this
> will be a no-op in case it is already set.
> 
> So then the new apply() code would become:
> 
>         if (state->enabled) {
>                 if (!pwm_is_enabled(pwm)) {
>                         pm_runtime_get_sync(chip->dev);
>                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
>                         if (ret)
>                                 pm_runtime_put(chip->dev);
>                 } else {
>                         ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
>                 }
>         } else if (pwm_is_enabled(pwm)) {
> 
> (and we can even optimize out the enable parameter to pwm_lpss_prepare_enable
> then and always make it set the enable bit).
> 
> Together with patch 07/16 will make apply() always work independent of
> the state the controller was in before it got called. Which in light of
> all the subtle issues we have seen surrounding this is likely a good thing.
> 
> And with the fix to make apply() fully independent of the previous state
> of the controller, I'm all for dropping the suspend/resume state
> save/restore code.  Doing that makes things more KISS so I like it :)
> 
> > > So at step 6, if the PWM was enabled before, we must set the update
> > > bit, and then wait for it to clear again so the controller is
> > > ready for subsequent updates. The waiting for it to clear again
> > > needs to happen before or after setting the enable bit depending
> > > on the hw generation, which leads to this patch.
> > 
> > But all of that should be happening as part of the call to
> > pwm_apply_state(), right? That path should be taken for all consumers on
> > resume, including sysfs.
> 
> Ack.
> 
> <snip>
> 
> > > See above, apply() was trying to be smart but the restore of ctrl
> > > on resume without setting the update bit was tricking it. That
> > > being too smart for its own good is removed in 7/16 as you
> > > rightfully point out. But this patch is still necessary for the
> > > PWM controller to be in the expected state between resume and the
> > > first apply() after resume (which may be quite a long time in
> > > the future when using e.g. the sysfs API).
> > 
> > Like I said, the sysfs code should be resuming any exported PWMs on
> > resume just like any other consumer.
> > 
> > Obviously it's always up to the consumer to call pwm_apply_state() at
> > the right time. If that's "too late" for some reason, then that's a bug
> > in the consumer driver. But as I explained above there are a number of
> > cases where restoring context in the PWM driver itself doesn't work
> > because it can cause sequencing issues.
> 
> Ack, I was not aware that PWM consumers are responsible for restoring
> their own state on resume. If that is the case then:
> 
> TL;DR:
> 
> 1. We (I) should make apply() work independent of the current
> hardware state, instead of having it make various assumptions
> about that state as it now does.
> 
> 2. We (I) should drop the suspend/resume save/restore state
> handlers from pwm-lpss completely.
> 
> So I believe that I should prepare yet another version of this
> patch-set replacing 06/17 and 07/17 with 2 patches doing these
> 2 things.
> 
> Thierry, Andy, does that sound good to you ?

Good to me, thanks!
diff mbox series

Patch

diff --git a/drivers/pwm/pwm-lpss.c b/drivers/pwm/pwm-lpss.c
index 8a136ba2a583..9a7400c6fb6e 100644
--- a/drivers/pwm/pwm-lpss.c
+++ b/drivers/pwm/pwm-lpss.c
@@ -166,6 +166,25 @@  static int pwm_lpss_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 	return ret;
 }
 
+/*
+ * This is a mirror of pwm_lpss_apply() without relying on the current state
+ * (no pwm_is_enabled() calls) and without pm_runtime reference handling,
+ * for restoring the PWM state on resume.
+ */
+static int pwm_lpss_restore_state(struct pwm_lpss_chip *lpwm,
+				  struct pwm_device *pwm,
+				  const struct pwm_state *state)
+{
+	int ret = 0;
+
+	if (state->enabled)
+		ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
+	else
+		pwm_lpss_write(pwm, pwm_lpss_read(pwm) & ~PWM_ENABLE);
+
+	return ret;
+}
+
 static void pwm_lpss_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
 			       struct pwm_state *state)
 {
@@ -278,10 +297,25 @@  EXPORT_SYMBOL_GPL(pwm_lpss_suspend);
 int pwm_lpss_resume(struct device *dev)
 {
 	struct pwm_lpss_chip *lpwm = dev_get_drvdata(dev);
-	int i;
+	struct pwm_device *pwm;
+	int i, ret;
 
-	for (i = 0; i < lpwm->info->npwm; i++)
-		writel(lpwm->saved_ctrl[i], lpwm->regs + i * PWM_SIZE + PWM);
+	for (i = 0; i < lpwm->info->npwm; i++) {
+		pwm = &lpwm->chip.pwms[i];
+
+		/*
+		 * We cannot just blindly restore the old value here. Since we
+		 * are changing the settings we must set SW_UPDATE and if the
+		 * PWM was enabled before we must write the new settings +
+		 * PWM_SW_UPDATE before setting PWM_ENABLE. We must also wait
+		 * for PWM_SW_UPDATE to become 0 again and depending on the
+		 * model we must do this either before or after the setting of
+		 * PWM_ENABLE.
+		 */
+		ret = pwm_lpss_restore_state(lpwm, pwm, &pwm->state);
+		if (ret)
+			dev_err(dev, "Error restoring state on resume\n");
+	}
 
 	return 0;
 }