diff mbox series

drm/panel: simple: Initialize unprepared_time in probe

Message ID 20230709135231.449636-1-marex@denx.de (mailing list archive)
State New, archived
Headers show
Series drm/panel: simple: Initialize unprepared_time in probe | expand

Commit Message

Marek Vasut July 9, 2023, 1:52 p.m. UTC
The unprepared_time has to be initialized during probe to probe time
ktime, otherwise panel_simple_resume() panel_simple_wait() call may
wait too short time, or no time at all, which would violate the panel
timing specification. Initializing the unprepared_time() to probe time
ktime assures the delay is at least what the panel requires from the
time kernel started. The unprepared_time is then updated every time
the panel is suspended in panel_simple_suspend() too.

Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
Signed-off-by: Marek Vasut <marex@denx.de>
---
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/panel/panel-simple.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Sam Ravnborg July 9, 2023, 3:08 p.m. UTC | #1
Hi Marek.

On Sun, Jul 09, 2023 at 03:52:31PM +0200, Marek Vasut wrote:
> The unprepared_time has to be initialized during probe to probe time
> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
> wait too short time, or no time at all, which would violate the panel
> timing specification. Initializing the unprepared_time() to probe time
> ktime assures the delay is at least what the panel requires from the
> time kernel started. The unprepared_time is then updated every time
> the panel is suspended in panel_simple_suspend() too.
> 
> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
> Signed-off-by: Marek Vasut <marex@denx.de>

Looks OK,

Reviewed-by: Sam Ravnborg <sam@ravnborg.org>

It looks like prepared_time is not used anymore.
Could you dig a little into this while you are in the waiting area.

	Sam

> ---
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: David Airlie <airlied@gmail.com>
> Cc: Douglas Anderson <dianders@chromium.org>
> Cc: Neil Armstrong <neil.armstrong@linaro.org>
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Cc: dri-devel@lists.freedesktop.org
> ---
>  drivers/gpu/drm/panel/panel-simple.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
> index d3238088b7f80..37afed67fea7e 100644
> --- a/drivers/gpu/drm/panel/panel-simple.c
> +++ b/drivers/gpu/drm/panel/panel-simple.c
> @@ -567,6 +567,7 @@ static int panel_simple_probe(struct device *dev, const struct panel_desc *desc)
>  
>  	panel->enabled = false;
>  	panel->prepared_time = 0;
> +	panel->unprepared_time = ktime_get_boottime();
>  	panel->desc = desc;
>  
>  	panel->supply = devm_regulator_get(dev, "power");
> -- 
> 2.40.1
Marek Vasut July 9, 2023, 4:19 p.m. UTC | #2
On 7/9/23 17:08, Sam Ravnborg wrote:
> Hi Marek.

Hi,

> On Sun, Jul 09, 2023 at 03:52:31PM +0200, Marek Vasut wrote:
>> The unprepared_time has to be initialized during probe to probe time
>> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
>> wait too short time, or no time at all, which would violate the panel
>> timing specification. Initializing the unprepared_time() to probe time
>> ktime assures the delay is at least what the panel requires from the
>> time kernel started. The unprepared_time is then updated every time
>> the panel is suspended in panel_simple_suspend() too.
>>
>> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
>> Signed-off-by: Marek Vasut <marex@denx.de>
> 
> Looks OK,
> 
> Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
> 
> It looks like prepared_time is not used anymore.
> Could you dig a little into this while you are in the waiting area.

Good catch, seems that has been replaced by RPM so whatever is left is 
just a remnant and should be dropped . I can prepare a patch, but let's 
see if Douglas has any further comment on this.
Doug Anderson July 18, 2023, 2:17 p.m. UTC | #3
Hi,

On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
>
> The unprepared_time has to be initialized during probe to probe time
> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
> wait too short time, or no time at all, which would violate the panel
> timing specification. Initializing the unprepared_time() to probe time
> ktime assures the delay is at least what the panel requires from the
> time kernel started. The unprepared_time is then updated every time
> the panel is suspended in panel_simple_suspend() too.
>
> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
> Signed-off-by: Marek Vasut <marex@denx.de>

Can you talk in more detail about the problem you're seeing? Your
patch will likely cause boot speed regressions. While correctness
trumps performance, I'd like to make sure this is right before landing
it.

Specifically, I think your patch is nearly the opposite as what I did
in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
always-on/boot-on by time since booted"). I think many of the same
arguments I made in that commit message argue against your patch.

...however, I guess in the case of the panel, things could be
different because regulators aren't directly controlled by the panel
code. Thus, I could imagine that your situation is this:

1. Bootloader runs and leaves the panel powered on.

2. Linux boots. Time starts at 0.

3. Simple fixed regulator (GPIO-based) probes and doesn't know GPIO
state of regulator, so turns it off. We'll call this time "a"

4. Panel probes at time "b" and tries to turn the panel on.

With the existing code, when we try to turn the panel code on for the
first time we'll wait "min(unprepared_time, b)". In other words, if
the panel's probe was called so early at boot that it was shorter than
unprepared_time then we'd delay. Otherwise we wouldn't. In the case
described above, this is obviously a violation.

The more correct delay would be to wait "min(unprepared_time, b-a)".
In other words, make sure the regulator is off for a certain amount of
time.

Your patch would make us always wait "unprepared_time", which is still
correct but less performant.

Did I describe your situation correctly? If so, then IMO a more
correct fix than this is actually:

a) Don't rely on the panel code to enforce your regulator constraints.
It's OK for the panel code to have this logic as a failsafe, but it's
actually better to specify "off-on-delay-us" for the regulator itself.
This means that the regulator framework can handle things correctly.
It'll work better for deferred probes and shared regulator rails,
among other things. Note that "off-on-delay-us" is currently only
implemented for fixed regulators, but IMO it would be an easy sell to
make it generic.

b) Assuming your panel is OK with it, consider using
"regulator-boot-on" to optimize your boot speed.

...one further note is that, I believe, not all regulator drivers will
force regulators off at probe time. If your regulator is backed by a
PMIC instead of a simple fixed regulator and the bootloader left the
regulator on then I believe you could end up with a situation very
similar to the "regulator-boot-on" case.

-Doug
Marek Vasut July 18, 2023, 3:36 p.m. UTC | #4
On 7/18/23 16:17, Doug Anderson wrote:
> Hi,

Hi,

> On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
>>
>> The unprepared_time has to be initialized during probe to probe time
>> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
>> wait too short time, or no time at all, which would violate the panel
>> timing specification. Initializing the unprepared_time() to probe time
>> ktime assures the delay is at least what the panel requires from the
>> time kernel started. The unprepared_time is then updated every time
>> the panel is suspended in panel_simple_suspend() too.
>>
>> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
>> Signed-off-by: Marek Vasut <marex@denx.de>
> 
> Can you talk in more detail about the problem you're seeing? Your
> patch will likely cause boot speed regressions. While correctness
> trumps performance, I'd like to make sure this is right before landing
> it.

With AUO T215HVN01 panel, connected to LT9211 DSI-to-LVDS bridge, 
connected to MX8M Mini DSIM , the panel just would not come up correctly 
because this unprepare_time is not observed. The panel would only show 
blue stripe on the left side, instead of actual image.

> Specifically, I think your patch is nearly the opposite as what I did
> in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
> always-on/boot-on by time since booted"). I think many of the same
> arguments I made in that commit message argue against your patch.

You cannot guarantee in which state the panel is after boot/reboot , so 
I believe the kernel has to shut it down, and then bring it up, with the 
correct timings.

> ...however, I guess in the case of the panel, things could be
> different because regulators aren't directly controlled by the panel
> code. Thus, I could imagine that your situation is this:
> 
> 1. Bootloader runs and leaves the panel powered on.

Bootloader does not touch the panel at all.

> 2. Linux boots. Time starts at 0.
> 
> 3. Simple fixed regulator (GPIO-based) probes and doesn't know GPIO
> state of regulator, so turns it off. We'll call this time "a"
> 
> 4. Panel probes at time "b" and tries to turn the panel on.
> 
> With the existing code, when we try to turn the panel code on for the
> first time we'll wait "min(unprepared_time, b)". In other words, if
> the panel's probe was called so early at boot that it was shorter than
> unprepared_time then we'd delay. Otherwise we wouldn't. In the case
> described above, this is obviously a violation.
> 
> The more correct delay would be to wait "min(unprepared_time, b-a)".
> In other words, make sure the regulator is off for a certain amount of
> time.
> 
> Your patch would make us always wait "unprepared_time", which is still
> correct but less performant.
> 
> Did I describe your situation correctly?

Partly.

I believe the better approach would be to fix this such that we do not 
operate panels out of specification right now, since panel vendors are 
very sensitive about that, and any sort of optimization is a topic for 
separate patch.

But please do keep in mind that depending on the state of the system in 
which bootloader left it is likely a bad idea.

> If so, then IMO a more
> correct fix than this is actually:
> 
> a) Don't rely on the panel code to enforce your regulator constraints.
> It's OK for the panel code to have this logic as a failsafe, but it's
> actually better to specify "off-on-delay-us" for the regulator itself.
> This means that the regulator framework can handle things correctly.
> It'll work better for deferred probes and shared regulator rails,
> among other things. Note that "off-on-delay-us" is currently only
> implemented for fixed regulators, but IMO it would be an easy sell to
> make it generic.
> 
> b) Assuming your panel is OK with it, consider using
> "regulator-boot-on" to optimize your boot speed.

This is dangerous, since the panel has power sequencing requirements 
which must be observed, i.e. which supplies get flipped on in specific 
order with various delays between each step. That very much rules out 
any such regulator-boot-on shenanigans.

> ...one further note is that, I believe, not all regulator drivers will
> force regulators off at probe time. If your regulator is backed by a
> PMIC instead of a simple fixed regulator and the bootloader left the
> regulator on then I believe you could end up with a situation very
> similar to the "regulator-boot-on" case.
Doug Anderson July 18, 2023, 4:15 p.m. UTC | #5
Hi,

On Tue, Jul 18, 2023 at 8:36 AM Marek Vasut <marex@denx.de> wrote:
>
> On 7/18/23 16:17, Doug Anderson wrote:
> > Hi,
>
> Hi,
>
> > On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
> >>
> >> The unprepared_time has to be initialized during probe to probe time
> >> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
> >> wait too short time, or no time at all, which would violate the panel
> >> timing specification. Initializing the unprepared_time() to probe time
> >> ktime assures the delay is at least what the panel requires from the
> >> time kernel started. The unprepared_time is then updated every time
> >> the panel is suspended in panel_simple_suspend() too.
> >>
> >> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
> >> Signed-off-by: Marek Vasut <marex@denx.de>
> >
> > Can you talk in more detail about the problem you're seeing? Your
> > patch will likely cause boot speed regressions. While correctness
> > trumps performance, I'd like to make sure this is right before landing
> > it.
>
> With AUO T215HVN01 panel, connected to LT9211 DSI-to-LVDS bridge,
> connected to MX8M Mini DSIM , the panel just would not come up correctly
> because this unprepare_time is not observed. The panel would only show
> blue stripe on the left side, instead of actual image.
>
> > Specifically, I think your patch is nearly the opposite as what I did
> > in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
> > always-on/boot-on by time since booted"). I think many of the same
> > arguments I made in that commit message argue against your patch.
>
> You cannot guarantee in which state the panel is after boot/reboot,

Agreed. To the best extent possible, whatever solution we arrive at
should work regardless of how the bootloader left things.


> so
> I believe the kernel has to shut it down, and then bring it up, with the
> correct timings.

If that's required for your panel then the driver should do what it
needs to do to ensure this. As indicated by my other comments, I
actually don't think your patch currently does in all cases. If the
panel is powered by a PMIC and the bootloader left the power on, your
patch series _won't_ shut it down and bring it back up, will it?

In any case, if your panel requires extra delays, it would be ideal if
this didn't inflict a penalty on all panels. I haven't personally
worked on any panels currently serviced by panel-simple, but for most
eDP panels the only strong timing requirement is that once you turn
off the main power rail that you don't turn it on again for ~500ms.
For most panels it's OK to turn it on early (like as soon as the
regulator proves) and also OK if the main power rail stays on between
the bootloader and the kernel. For eDP the one exception I've seen was
the "samsung-atna33xc20" panel and that panel has its own driver
specifically to deal with quirks like this. I talk about this a little
bit in commit 23ff866987de ("arm64: dts: qcom: sc7180: Start the
trogdor eDP/touchscreen regulator on") since homestar uses
"samsung-atna33xc20"


> > ...however, I guess in the case of the panel, things could be
> > different because regulators aren't directly controlled by the panel
> > code. Thus, I could imagine that your situation is this:
> >
> > 1. Bootloader runs and leaves the panel powered on.
>
> Bootloader does not touch the panel at all.

Huh, then I'm pretty confused. Where is the timing violation then? If
the panel was off when the device started booting and the bootloader
didn't touch the panel, then the existing code should work fine. The
current code will make sure that we delay at least "unprepare" ms
since the kernel booted and so no specs should be violated.

Are you sure you aren't running into something like a case of
-EPROBE_DEFER where panel-simple powers the regulator on, then
un-probes, and then tries probing again? ...or maybe the default state
of the regulator at bootup _is_ powered on and that's the problem? In
either case, it feels like the regulator "off-on-delay" constraint
might be better here.


> > 2. Linux boots. Time starts at 0.
> >
> > 3. Simple fixed regulator (GPIO-based) probes and doesn't know GPIO
> > state of regulator, so turns it off. We'll call this time "a"
> >
> > 4. Panel probes at time "b" and tries to turn the panel on.
> >
> > With the existing code, when we try to turn the panel code on for the
> > first time we'll wait "min(unprepared_time, b)". In other words, if
> > the panel's probe was called so early at boot that it was shorter than
> > unprepared_time then we'd delay. Otherwise we wouldn't. In the case
> > described above, this is obviously a violation.
> >
> > The more correct delay would be to wait "min(unprepared_time, b-a)".
> > In other words, make sure the regulator is off for a certain amount of
> > time.
> >
> > Your patch would make us always wait "unprepared_time", which is still
> > correct but less performant.
> >
> > Did I describe your situation correctly?
>
> Partly.
>
> I believe the better approach would be to fix this such that we do not
> operate panels out of specification right now, since panel vendors are
> very sensitive about that, and any sort of optimization is a topic for
> separate patch.
>
> But please do keep in mind that depending on the state of the system in
> which bootloader left it is likely a bad idea.

Right that we want to match the panel spec and right that we should
work regardless of if the bootloader left the panel off or left it on.
If you look at my commit message in commit 691c1fcda535 ("regulator:
core: Shorten off-on-delay-us for always-on/boot-on by time since
booted") you can see that I made sure to consider both situations.


> > If so, then IMO a more
> > correct fix than this is actually:
> >
> > a) Don't rely on the panel code to enforce your regulator constraints.
> > It's OK for the panel code to have this logic as a failsafe, but it's
> > actually better to specify "off-on-delay-us" for the regulator itself.
> > This means that the regulator framework can handle things correctly.
> > It'll work better for deferred probes and shared regulator rails,
> > among other things. Note that "off-on-delay-us" is currently only
> > implemented for fixed regulators, but IMO it would be an easy sell to
> > make it generic.
> >
> > b) Assuming your panel is OK with it, consider using
> > "regulator-boot-on" to optimize your boot speed.
>
> This is dangerous, since the panel has power sequencing requirements
> which must be observed, i.e. which supplies get flipped on in specific
> order with various delays between each step. That very much rules out
> any such regulator-boot-on shenanigans.

Right. This is why I said for b) "assuming your panel is OK with it"
and "consider using". :-) Most eDP panels can handle this. If your
panel can't, then the correct solution is a) without b).
Marek Vasut July 18, 2023, 5:37 p.m. UTC | #6
On 7/18/23 18:15, Doug Anderson wrote:
> Hi,

Hi,

> On Tue, Jul 18, 2023 at 8:36 AM Marek Vasut <marex@denx.de> wrote:
>>
>> On 7/18/23 16:17, Doug Anderson wrote:
>>> Hi,
>>
>> Hi,
>>
>>> On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
>>>>
>>>> The unprepared_time has to be initialized during probe to probe time
>>>> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
>>>> wait too short time, or no time at all, which would violate the panel
>>>> timing specification. Initializing the unprepared_time() to probe time
>>>> ktime assures the delay is at least what the panel requires from the
>>>> time kernel started. The unprepared_time is then updated every time
>>>> the panel is suspended in panel_simple_suspend() too.
>>>>
>>>> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
>>>> Signed-off-by: Marek Vasut <marex@denx.de>
>>>
>>> Can you talk in more detail about the problem you're seeing? Your
>>> patch will likely cause boot speed regressions. While correctness
>>> trumps performance, I'd like to make sure this is right before landing
>>> it.
>>
>> With AUO T215HVN01 panel, connected to LT9211 DSI-to-LVDS bridge,
>> connected to MX8M Mini DSIM , the panel just would not come up correctly
>> because this unprepare_time is not observed. The panel would only show
>> blue stripe on the left side, instead of actual image.
>>
>>> Specifically, I think your patch is nearly the opposite as what I did
>>> in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
>>> always-on/boot-on by time since booted"). I think many of the same
>>> arguments I made in that commit message argue against your patch.
>>
>> You cannot guarantee in which state the panel is after boot/reboot,
> 
> Agreed. To the best extent possible, whatever solution we arrive at
> should work regardless of how the bootloader left things.
> 
> 
>> so
>> I believe the kernel has to shut it down, and then bring it up, with the
>> correct timings.
> 
> If that's required for your panel then the driver should do what it
> needs to do to ensure this.

The panel-simple driver used to do it. Now it no longer does, which 
means the kernel is now running this AUO and possibly other panels out 
of specification.

> As indicated by my other comments, I
> actually don't think your patch currently does in all cases. If the
> panel is powered by a PMIC and the bootloader left the power on, your
> patch series _won't_ shut it down and bring it back up, will it?

That depends on the regulator configuration. That itself is a separate 
issue however, one which has been present even before any of this boot 
time optimization attempt.

> In any case, if your panel requires extra delays, it would be ideal if
> this didn't inflict a penalty on all panels. I haven't personally
> worked on any panels currently serviced by panel-simple, but for most
> eDP panels the only strong timing requirement is that once you turn
> off the main power rail that you don't turn it on again for ~500ms.

The extra delay is actually only inflicted on panels which do set delay 
{ .unprepare = ... } constraint in their timing specification, and those 
panels most certainly do need those extra delays to operate correctly.

> For most panels it's OK to turn it on early (like as soon as the
> regulator proves) and also OK if the main power rail stays on between
> the bootloader and the kernel.

I would debate the "most" part, as that is not my experience with DPI 
and LVDS panels, which, if they are not correctly power sequenced, can 
go all kinds of weird and that weirdness is often very subtle. Or worse, 
those panels start failing in deployment.

> For eDP the one exception I've seen was
> the "samsung-atna33xc20" panel and that panel has its own driver
> specifically to deal with quirks like this. I talk about this a little
> bit in commit 23ff866987de ("arm64: dts: qcom: sc7180: Start the
> trogdor eDP/touchscreen regulator on") since homestar uses
> "samsung-atna33xc20"

I seldom work with eDP panels, so I cannot comment on that part.

It is well possible the more complex electronics of the panel hides a 
lot of the power sequencing details, I wouldn't be surprised by that.

>>> ...however, I guess in the case of the panel, things could be
>>> different because regulators aren't directly controlled by the panel
>>> code. Thus, I could imagine that your situation is this:
>>>
>>> 1. Bootloader runs and leaves the panel powered on.
>>
>> Bootloader does not touch the panel at all.
> 
> Huh, then I'm pretty confused. Where is the timing violation then? If
> the panel was off when the device started booting and the bootloader
> didn't touch the panel, then the existing code should work fine. The
> current code will make sure that we delay at least "unprepare" ms
> since the kernel booted and so no specs should be violated.
> 
> Are you sure you aren't running into something like a case of
> -EPROBE_DEFER where panel-simple powers the regulator on, then
> un-probes, and then tries probing again? ...or maybe the default state
> of the regulator at bootup _is_ powered on and that's the problem?

Have a look at panel_simple_resume() panel_simple_wait(), this is where 
the extra delay is needed. You cannot predict how long the bootloader 
took to reach the kernel time t=0 and you cannot know what happened 
before the bootloader started (maybe abrupt sysrq reset), not on all 
platforms anyway, so the best you can do is assume the worst, i.e. full 
unprepare delay.

> In
> either case, it feels like the regulator "off-on-delay" constraint
> might be better here.

Please stop suggesting that we should work around the existing defect of 
this driver, which does not correctly honor the .delay.unprepare time of 
a panel and causes actual failures on existing panels, instead of fixing 
it properly, only because this impedes boot time. Sure, it does, but 
correctly bringing up the panel is more important than reducing boot 
time at all costs, because if I only see blue stripe on the left side of 
the panel, I do not care if the kernel booted 100ms faster, I care about 
the non-working panel .

This could be a good optimization, but it certainly is not a fix for the 
issue at hand.

>>> 2. Linux boots. Time starts at 0.
>>>
>>> 3. Simple fixed regulator (GPIO-based) probes and doesn't know GPIO
>>> state of regulator, so turns it off. We'll call this time "a"
>>>
>>> 4. Panel probes at time "b" and tries to turn the panel on.
>>>
>>> With the existing code, when we try to turn the panel code on for the
>>> first time we'll wait "min(unprepared_time, b)". In other words, if
>>> the panel's probe was called so early at boot that it was shorter than
>>> unprepared_time then we'd delay. Otherwise we wouldn't. In the case
>>> described above, this is obviously a violation.
>>>
>>> The more correct delay would be to wait "min(unprepared_time, b-a)".
>>> In other words, make sure the regulator is off for a certain amount of
>>> time.
>>>
>>> Your patch would make us always wait "unprepared_time", which is still
>>> correct but less performant.
>>>
>>> Did I describe your situation correctly?
>>
>> Partly.
>>
>> I believe the better approach would be to fix this such that we do not
>> operate panels out of specification right now, since panel vendors are
>> very sensitive about that, and any sort of optimization is a topic for
>> separate patch.
>>
>> But please do keep in mind that depending on the state of the system in
>> which bootloader left it is likely a bad idea.
> 
> Right that we want to match the panel spec and right that we should
> work regardless of if the bootloader left the panel off or left it on.
> If you look at my commit message in commit 691c1fcda535 ("regulator:
> core: Shorten off-on-delay-us for always-on/boot-on by time since
> booted") you can see that I made sure to consider both situations.

This all talks about 'off-on-delay-us' DT property, I don't like how 
this is being posed as an alternative, because it does not really fix 
the problem with this driver now failing to respect the .delay.unprepare 
delay in panel description .

>>> If so, then IMO a more
>>> correct fix than this is actually:
>>>
>>> a) Don't rely on the panel code to enforce your regulator constraints.
>>> It's OK for the panel code to have this logic as a failsafe, but it's
>>> actually better to specify "off-on-delay-us" for the regulator itself.
>>> This means that the regulator framework can handle things correctly.
>>> It'll work better for deferred probes and shared regulator rails,
>>> among other things. Note that "off-on-delay-us" is currently only
>>> implemented for fixed regulators, but IMO it would be an easy sell to
>>> make it generic.
>>>
>>> b) Assuming your panel is OK with it, consider using
>>> "regulator-boot-on" to optimize your boot speed.
>>
>> This is dangerous, since the panel has power sequencing requirements
>> which must be observed, i.e. which supplies get flipped on in specific
>> order with various delays between each step. That very much rules out
>> any such regulator-boot-on shenanigans.
> 
> Right. This is why I said for b) "assuming your panel is OK with it"
> and "consider using". :-) Most eDP panels can handle this. If your
> panel can't, then the correct solution is a) without b).

Please see above, I really don't think that 'off-on-delay-us' is 
relevant to fixing this issue. It is a nice optimization, but it is 
separate topic.
Doug Anderson July 18, 2023, 7:33 p.m. UTC | #7
Hi,

On Tue, Jul 18, 2023 at 10:37 AM Marek Vasut <marex@denx.de> wrote:
>
> On 7/18/23 18:15, Doug Anderson wrote:
> > Hi,
>
> Hi,
>
> > On Tue, Jul 18, 2023 at 8:36 AM Marek Vasut <marex@denx.de> wrote:
> >>
> >> On 7/18/23 16:17, Doug Anderson wrote:
> >>> Hi,
> >>
> >> Hi,
> >>
> >>> On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
> >>>>
> >>>> The unprepared_time has to be initialized during probe to probe time
> >>>> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
> >>>> wait too short time, or no time at all, which would violate the panel
> >>>> timing specification. Initializing the unprepared_time() to probe time
> >>>> ktime assures the delay is at least what the panel requires from the
> >>>> time kernel started. The unprepared_time is then updated every time
> >>>> the panel is suspended in panel_simple_suspend() too.
> >>>>
> >>>> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
> >>>> Signed-off-by: Marek Vasut <marex@denx.de>
> >>>
> >>> Can you talk in more detail about the problem you're seeing? Your
> >>> patch will likely cause boot speed regressions. While correctness
> >>> trumps performance, I'd like to make sure this is right before landing
> >>> it.
> >>
> >> With AUO T215HVN01 panel, connected to LT9211 DSI-to-LVDS bridge,
> >> connected to MX8M Mini DSIM , the panel just would not come up correctly
> >> because this unprepare_time is not observed. The panel would only show
> >> blue stripe on the left side, instead of actual image.
> >>
> >>> Specifically, I think your patch is nearly the opposite as what I did
> >>> in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
> >>> always-on/boot-on by time since booted"). I think many of the same
> >>> arguments I made in that commit message argue against your patch.
> >>
> >> You cannot guarantee in which state the panel is after boot/reboot,
> >
> > Agreed. To the best extent possible, whatever solution we arrive at
> > should work regardless of how the bootloader left things.
> >
> >
> >> so
> >> I believe the kernel has to shut it down, and then bring it up, with the
> >> correct timings.
> >
> > If that's required for your panel then the driver should do what it
> > needs to do to ensure this.
>
> The panel-simple driver used to do it. Now it no longer does, which
> means the kernel is now running this AUO and possibly other panels out
> of specification.

OK, I think the more I read this thread the more confused I get. :(
Hopefully we can arrive at some clarity.

1. I guess first off, nothing about the old kernel would have ensured
that the regulator would have been shut off. Looking at the old code
(before e5e30dfcf3db, the commit yous "Fixes") the panel-simple driver
just did:

regulator_get()
regulator_enable()

If the regulator was left on by the bootloader and managed by a
regulator driver that can read back initial regulator states then the
old driver would have done nothing at all to guarantee that a
regulator went off. If you want some proof of this, it's even
documented in `Documentation/power/regulator/consumer.rst`:

NOTE:
  The supply may already be enabled before regulator_enabled() is called.
  This may happen if the consumer shares the regulator or the regulator has been
  previously enabled by bootloader or kernel board initialization code.

If you really need to make sure that your regulator was disabled at
boot, you could probably do something like this psuedocode:

supply = regulator_get(...)
if (regulator_is_enabled(supply)) {
  /* Enable and disable and that should sync it up */
  regulator_enable(supply);
  regulator_disable(supply);
  if (regulator_is_enabled(supply)) {
    pr_err("Crud, we couldn't disable\n");
    return -E_LIFESUCKS;
  }
}


2. Looking more closely at the commit you're fixing, though, I'm even
more confused.

I _think_ your assertion here is that the longer delay is needed on
the first power on of the panel at bootup. Is that correct? This is
why you need to initialize "unprepared_time" in the probe() function.
However, when I go back to the old code (before e5e30dfcf3db, the
commit yours "Fixes") you can actually see that there was no delay at
all before the first power on of the panel. The only delay was if you
turned the panel off and then turned it back on again. ...so the only
thing that the commit should have broken would have been the power-ons
of the panel _after_ the first. ...but your patch only affects the
delay for the first power on.

Huh?


> > As indicated by my other comments, I
> > actually don't think your patch currently does in all cases. If the
> > panel is powered by a PMIC and the bootloader left the power on, your
> > patch series _won't_ shut it down and bring it back up, will it?
>
> That depends on the regulator configuration. That itself is a separate
> issue however, one which has been present even before any of this boot
> time optimization attempt.
>
> > In any case, if your panel requires extra delays, it would be ideal if
> > this didn't inflict a penalty on all panels. I haven't personally
> > worked on any panels currently serviced by panel-simple, but for most
> > eDP panels the only strong timing requirement is that once you turn
> > off the main power rail that you don't turn it on again for ~500ms.
>
> The extra delay is actually only inflicted on panels which do set delay
> { .unprepare = ... } constraint in their timing specification, and those
> panels most certainly do need those extra delays to operate correctly.
>
> > For most panels it's OK to turn it on early (like as soon as the
> > regulator proves) and also OK if the main power rail stays on between
> > the bootloader and the kernel.
>
> I would debate the "most" part, as that is not my experience with DPI
> and LVDS panels, which, if they are not correctly power sequenced, can
> go all kinds of weird and that weirdness is often very subtle. Or worse,
> those panels start failing in deployment.
>
> > For eDP the one exception I've seen was
> > the "samsung-atna33xc20" panel and that panel has its own driver
> > specifically to deal with quirks like this. I talk about this a little
> > bit in commit 23ff866987de ("arm64: dts: qcom: sc7180: Start the
> > trogdor eDP/touchscreen regulator on") since homestar uses
> > "samsung-atna33xc20"
>
> I seldom work with eDP panels, so I cannot comment on that part.
>
> It is well possible the more complex electronics of the panel hides a
> lot of the power sequencing details, I wouldn't be surprised by that.
>
> >>> ...however, I guess in the case of the panel, things could be
> >>> different because regulators aren't directly controlled by the panel
> >>> code. Thus, I could imagine that your situation is this:
> >>>
> >>> 1. Bootloader runs and leaves the panel powered on.
> >>
> >> Bootloader does not touch the panel at all.
> >
> > Huh, then I'm pretty confused. Where is the timing violation then? If
> > the panel was off when the device started booting and the bootloader
> > didn't touch the panel, then the existing code should work fine. The
> > current code will make sure that we delay at least "unprepare" ms
> > since the kernel booted and so no specs should be violated.
> >
> > Are you sure you aren't running into something like a case of
> > -EPROBE_DEFER where panel-simple powers the regulator on, then
> > un-probes, and then tries probing again? ...or maybe the default state
> > of the regulator at bootup _is_ powered on and that's the problem?
>
> Have a look at panel_simple_resume() panel_simple_wait(), this is where
> the extra delay is needed. You cannot predict how long the bootloader
> took to reach the kernel time t=0 and you cannot know what happened
> before the bootloader started (maybe abrupt sysrq reset), not on all
> platforms anyway, so the best you can do is assume the worst, i.e. full
> unprepare delay.

I feel like there is a confusion here. With the old code,
"unprepared_time" was implicitly set to 0 (because the whole structure
was zero initialized). 0 is actually a valid time and represents the
time that the kernel booted (well, more correctly when ktime finished
initting, but that's pretty early).

Let's look at a few concerte cases. In this example I'll go with what
I think you've said is happening in your system: the bootloader
doesn't touch the panel and the panels power rails are off at bootup.


Case 1: everything boots absurdly fast and "unprepared_time" is 1000 ms.

1. CPU resets and starts executing the bootloader. Panel is fully powered off.

2. Let's imagine the bootloader finishes in an absurdly fast 10 ms and
starts Linux.

3. Linux starts and inits its clock. It does this in 10 ms. Kernel
time is 0 now and it's been 20 ms since CPU reset.

4. Linux gets to panel init code after another 10 ms. Kernel time is
10 ms and it's been 20 ms since CPU reset.

5. We try to turn the panel on after another 10 ms. Kernel time is 20
ms and it's been 30 ms since CPU reset.

6. We look at kernel time (30 ms) and the unprepare delay (1000 ms)
and we'll delay 970 ms.

7. After the delay, kernel time will be 1000 ms and it will have been
1010 ms since CPU reset.

...so if the panel was truly untouched by the bootloader and the
panel's power truly initted to off at bootup then we should be fine
since it's been at least 1010 ms since the panel was powered off.


Case 2: everything boots absurdly slowly and "unprepared_time" is 1000 ms.

1. CPU resets and starts executing the bootloader. Panel is fully powered off.

2. Let's imagine the bootloader finishes in an absurdly slow 2000 ms
and starts Linux.

3. Linux starts and inits its clock. It does this in 2000 ms. Kernel
time is 0 now and it's been 4000 ms since CPU reset.

4. Linux gets to panel init code after another 2000 ms. Kernel time is
2000 ms and it's been 6000 ms since CPU reset.

5. We try to turn the panel on after another 2000 ms. Kernel time is
4000 ms and it's been 8000 ms since CPU reset.

6. We look at kernel time (4000 ms) and the unprepare delay (1000 ms)
and we'll delay 0 ms (no delay)

...so if the panel was truly untouched by the bootloader and the
panel's power truly initted to off at bootup then we should be fine
since it's been at least 8000 ms since the panel was powered off.


Since the existing code should be correctly honoring the delay in both
of the two cases, I'd like to find out what assumption is wrong.

> > In
> > either case, it feels like the regulator "off-on-delay" constraint
> > might be better here.
>
> Please stop suggesting that we should work around the existing defect of
> this driver, which does not correctly honor the .delay.unprepare time of
> a panel and causes actual failures on existing panels, instead of fixing
> it properly, only because this impedes boot time. Sure, it does, but
> correctly bringing up the panel is more important than reducing boot
> time at all costs, because if I only see blue stripe on the left side of
> the panel, I do not care if the kernel booted 100ms faster, I care about
> the non-working panel .
>
> This could be a good optimization, but it certainly is not a fix for the
> issue at hand.

There are simply too many things about the problem that don't add up.
The reason I keep pushing the off-on-delay is that it has more to do
with the root cause of the problem here.

-Doug
Marek Vasut July 23, 2023, 10:47 p.m. UTC | #8
On 7/18/23 21:33, Doug Anderson wrote:
> Hi,
> 
> On Tue, Jul 18, 2023 at 10:37 AM Marek Vasut <marex@denx.de> wrote:
>>
>> On 7/18/23 18:15, Doug Anderson wrote:
>>> Hi,
>>
>> Hi,
>>
>>> On Tue, Jul 18, 2023 at 8:36 AM Marek Vasut <marex@denx.de> wrote:
>>>>
>>>> On 7/18/23 16:17, Doug Anderson wrote:
>>>>> Hi,
>>>>
>>>> Hi,
>>>>
>>>>> On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
>>>>>>
>>>>>> The unprepared_time has to be initialized during probe to probe time
>>>>>> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
>>>>>> wait too short time, or no time at all, which would violate the panel
>>>>>> timing specification. Initializing the unprepared_time() to probe time
>>>>>> ktime assures the delay is at least what the panel requires from the
>>>>>> time kernel started. The unprepared_time is then updated every time
>>>>>> the panel is suspended in panel_simple_suspend() too.
>>>>>>
>>>>>> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
>>>>>> Signed-off-by: Marek Vasut <marex@denx.de>
>>>>>
>>>>> Can you talk in more detail about the problem you're seeing? Your
>>>>> patch will likely cause boot speed regressions. While correctness
>>>>> trumps performance, I'd like to make sure this is right before landing
>>>>> it.
>>>>
>>>> With AUO T215HVN01 panel, connected to LT9211 DSI-to-LVDS bridge,
>>>> connected to MX8M Mini DSIM , the panel just would not come up correctly
>>>> because this unprepare_time is not observed. The panel would only show
>>>> blue stripe on the left side, instead of actual image.
>>>>
>>>>> Specifically, I think your patch is nearly the opposite as what I did
>>>>> in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
>>>>> always-on/boot-on by time since booted"). I think many of the same
>>>>> arguments I made in that commit message argue against your patch.
>>>>
>>>> You cannot guarantee in which state the panel is after boot/reboot,
>>>
>>> Agreed. To the best extent possible, whatever solution we arrive at
>>> should work regardless of how the bootloader left things.
>>>
>>>
>>>> so
>>>> I believe the kernel has to shut it down, and then bring it up, with the
>>>> correct timings.
>>>
>>> If that's required for your panel then the driver should do what it
>>> needs to do to ensure this.
>>
>> The panel-simple driver used to do it. Now it no longer does, which
>> means the kernel is now running this AUO and possibly other panels out
>> of specification.
> 
> OK, I think the more I read this thread the more confused I get. :(
> Hopefully we can arrive at some clarity.
> 
> 1. I guess first off, nothing about the old kernel would have ensured
> that the regulator would have been shut off. Looking at the old code
> (before e5e30dfcf3db, the commit yous "Fixes") the panel-simple driver
> just did:
> 
> regulator_get()
> regulator_enable()
> 
> If the regulator was left on by the bootloader and managed by a
> regulator driver that can read back initial regulator states then the
> old driver would have done nothing at all to guarantee that a
> regulator went off. If you want some proof of this, it's even
> documented in `Documentation/power/regulator/consumer.rst`:
> 
> NOTE:
>    The supply may already be enabled before regulator_enabled() is called.
>    This may happen if the consumer shares the regulator or the regulator has been
>    previously enabled by bootloader or kernel board initialization code.
> 
> If you really need to make sure that your regulator was disabled at
> boot, you could probably do something like this psuedocode:
> 
> supply = regulator_get(...)
> if (regulator_is_enabled(supply)) {
>    /* Enable and disable and that should sync it up */
>    regulator_enable(supply);
>    regulator_disable(supply);
>    if (regulator_is_enabled(supply)) {
>      pr_err("Crud, we couldn't disable\n");
>      return -E_LIFESUCKS;
>    }
> }
> 
> 
> 2. Looking more closely at the commit you're fixing, though, I'm even
> more confused.
> 
> I _think_ your assertion here is that the longer delay is needed on
> the first power on of the panel at bootup. Is that correct? This is
> why you need to initialize "unprepared_time" in the probe() function.
> However, when I go back to the old code (before e5e30dfcf3db, the
> commit yours "Fixes") you can actually see that there was no delay at
> all before the first power on of the panel. The only delay was if you
> turned the panel off and then turned it back on again. ...so the only
> thing that the commit should have broken would have been the power-ons
> of the panel _after_ the first. ...but your patch only affects the
> delay for the first power on.
> 
> Huh?
> 
> 
>>> As indicated by my other comments, I
>>> actually don't think your patch currently does in all cases. If the
>>> panel is powered by a PMIC and the bootloader left the power on, your
>>> patch series _won't_ shut it down and bring it back up, will it?
>>
>> That depends on the regulator configuration. That itself is a separate
>> issue however, one which has been present even before any of this boot
>> time optimization attempt.
>>
>>> In any case, if your panel requires extra delays, it would be ideal if
>>> this didn't inflict a penalty on all panels. I haven't personally
>>> worked on any panels currently serviced by panel-simple, but for most
>>> eDP panels the only strong timing requirement is that once you turn
>>> off the main power rail that you don't turn it on again for ~500ms.
>>
>> The extra delay is actually only inflicted on panels which do set delay
>> { .unprepare = ... } constraint in their timing specification, and those
>> panels most certainly do need those extra delays to operate correctly.
>>
>>> For most panels it's OK to turn it on early (like as soon as the
>>> regulator proves) and also OK if the main power rail stays on between
>>> the bootloader and the kernel.
>>
>> I would debate the "most" part, as that is not my experience with DPI
>> and LVDS panels, which, if they are not correctly power sequenced, can
>> go all kinds of weird and that weirdness is often very subtle. Or worse,
>> those panels start failing in deployment.
>>
>>> For eDP the one exception I've seen was
>>> the "samsung-atna33xc20" panel and that panel has its own driver
>>> specifically to deal with quirks like this. I talk about this a little
>>> bit in commit 23ff866987de ("arm64: dts: qcom: sc7180: Start the
>>> trogdor eDP/touchscreen regulator on") since homestar uses
>>> "samsung-atna33xc20"
>>
>> I seldom work with eDP panels, so I cannot comment on that part.
>>
>> It is well possible the more complex electronics of the panel hides a
>> lot of the power sequencing details, I wouldn't be surprised by that.
>>
>>>>> ...however, I guess in the case of the panel, things could be
>>>>> different because regulators aren't directly controlled by the panel
>>>>> code. Thus, I could imagine that your situation is this:
>>>>>
>>>>> 1. Bootloader runs and leaves the panel powered on.
>>>>
>>>> Bootloader does not touch the panel at all.
>>>
>>> Huh, then I'm pretty confused. Where is the timing violation then? If
>>> the panel was off when the device started booting and the bootloader
>>> didn't touch the panel, then the existing code should work fine. The
>>> current code will make sure that we delay at least "unprepare" ms
>>> since the kernel booted and so no specs should be violated.
>>>
>>> Are you sure you aren't running into something like a case of
>>> -EPROBE_DEFER where panel-simple powers the regulator on, then
>>> un-probes, and then tries probing again? ...or maybe the default state
>>> of the regulator at bootup _is_ powered on and that's the problem?
>>
>> Have a look at panel_simple_resume() panel_simple_wait(), this is where
>> the extra delay is needed. You cannot predict how long the bootloader
>> took to reach the kernel time t=0 and you cannot know what happened
>> before the bootloader started (maybe abrupt sysrq reset), not on all
>> platforms anyway, so the best you can do is assume the worst, i.e. full
>> unprepare delay.
> 
> I feel like there is a confusion here. With the old code,
> "unprepared_time" was implicitly set to 0 (because the whole structure
> was zero initialized). 0 is actually a valid time and represents the
> time that the kernel booted (well, more correctly when ktime finished
> initting, but that's pretty early).
> 
> Let's look at a few concerte cases. In this example I'll go with what
> I think you've said is happening in your system: the bootloader
> doesn't touch the panel and the panels power rails are off at bootup.
> 
> 
> Case 1: everything boots absurdly fast and "unprepared_time" is 1000 ms.
> 
> 1. CPU resets and starts executing the bootloader. Panel is fully powered off.
> 
> 2. Let's imagine the bootloader finishes in an absurdly fast 10 ms and
> starts Linux.
> 
> 3. Linux starts and inits its clock. It does this in 10 ms. Kernel
> time is 0 now and it's been 20 ms since CPU reset.
> 
> 4. Linux gets to panel init code after another 10 ms. Kernel time is
> 10 ms and it's been 20 ms since CPU reset.
> 
> 5. We try to turn the panel on after another 10 ms. Kernel time is 20
> ms and it's been 30 ms since CPU reset.
> 
> 6. We look at kernel time (30 ms) and the unprepare delay (1000 ms)
> and we'll delay 970 ms.
> 
> 7. After the delay, kernel time will be 1000 ms and it will have been
> 1010 ms since CPU reset.
> 
> ...so if the panel was truly untouched by the bootloader and the
> panel's power truly initted to off at bootup then we should be fine
> since it's been at least 1010 ms since the panel was powered off.
> 
> 
> Case 2: everything boots absurdly slowly and "unprepared_time" is 1000 ms.
> 
> 1. CPU resets and starts executing the bootloader. Panel is fully powered off.
> 
> 2. Let's imagine the bootloader finishes in an absurdly slow 2000 ms
> and starts Linux.
> 
> 3. Linux starts and inits its clock. It does this in 2000 ms. Kernel
> time is 0 now and it's been 4000 ms since CPU reset.
> 
> 4. Linux gets to panel init code after another 2000 ms. Kernel time is
> 2000 ms and it's been 6000 ms since CPU reset.
> 
> 5. We try to turn the panel on after another 2000 ms. Kernel time is
> 4000 ms and it's been 8000 ms since CPU reset.
> 
> 6. We look at kernel time (4000 ms) and the unprepare delay (1000 ms)
> and we'll delay 0 ms (no delay)
> 
> ...so if the panel was truly untouched by the bootloader and the
> panel's power truly initted to off at bootup then we should be fine
> since it's been at least 8000 ms since the panel was powered off.
> 
> 
> Since the existing code should be correctly honoring the delay in both
> of the two cases, I'd like to find out what assumption is wrong.

Maybe the EPROBE_DEFER actually happens and triggers the failure ?

[...]
Doug Anderson July 24, 2023, 1:49 p.m. UTC | #9
Hi,

On Sun, Jul 23, 2023 at 3:47 PM Marek Vasut <marex@denx.de> wrote:
>
> On 7/18/23 21:33, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Jul 18, 2023 at 10:37 AM Marek Vasut <marex@denx.de> wrote:
> >>
> >> On 7/18/23 18:15, Doug Anderson wrote:
> >>> Hi,
> >>
> >> Hi,
> >>
> >>> On Tue, Jul 18, 2023 at 8:36 AM Marek Vasut <marex@denx.de> wrote:
> >>>>
> >>>> On 7/18/23 16:17, Doug Anderson wrote:
> >>>>> Hi,
> >>>>
> >>>> Hi,
> >>>>
> >>>>> On Sun, Jul 9, 2023 at 6:52 AM Marek Vasut <marex@denx.de> wrote:
> >>>>>>
> >>>>>> The unprepared_time has to be initialized during probe to probe time
> >>>>>> ktime, otherwise panel_simple_resume() panel_simple_wait() call may
> >>>>>> wait too short time, or no time at all, which would violate the panel
> >>>>>> timing specification. Initializing the unprepared_time() to probe time
> >>>>>> ktime assures the delay is at least what the panel requires from the
> >>>>>> time kernel started. The unprepared_time is then updated every time
> >>>>>> the panel is suspended in panel_simple_suspend() too.
> >>>>>>
> >>>>>> Fixes: e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next prepare to shorten it")
> >>>>>> Signed-off-by: Marek Vasut <marex@denx.de>
> >>>>>
> >>>>> Can you talk in more detail about the problem you're seeing? Your
> >>>>> patch will likely cause boot speed regressions. While correctness
> >>>>> trumps performance, I'd like to make sure this is right before landing
> >>>>> it.
> >>>>
> >>>> With AUO T215HVN01 panel, connected to LT9211 DSI-to-LVDS bridge,
> >>>> connected to MX8M Mini DSIM , the panel just would not come up correctly
> >>>> because this unprepare_time is not observed. The panel would only show
> >>>> blue stripe on the left side, instead of actual image.
> >>>>
> >>>>> Specifically, I think your patch is nearly the opposite as what I did
> >>>>> in commit 691c1fcda535 ("regulator: core: Shorten off-on-delay-us for
> >>>>> always-on/boot-on by time since booted"). I think many of the same
> >>>>> arguments I made in that commit message argue against your patch.
> >>>>
> >>>> You cannot guarantee in which state the panel is after boot/reboot,
> >>>
> >>> Agreed. To the best extent possible, whatever solution we arrive at
> >>> should work regardless of how the bootloader left things.
> >>>
> >>>
> >>>> so
> >>>> I believe the kernel has to shut it down, and then bring it up, with the
> >>>> correct timings.
> >>>
> >>> If that's required for your panel then the driver should do what it
> >>> needs to do to ensure this.
> >>
> >> The panel-simple driver used to do it. Now it no longer does, which
> >> means the kernel is now running this AUO and possibly other panels out
> >> of specification.
> >
> > OK, I think the more I read this thread the more confused I get. :(
> > Hopefully we can arrive at some clarity.
> >
> > 1. I guess first off, nothing about the old kernel would have ensured
> > that the regulator would have been shut off. Looking at the old code
> > (before e5e30dfcf3db, the commit yous "Fixes") the panel-simple driver
> > just did:
> >
> > regulator_get()
> > regulator_enable()
> >
> > If the regulator was left on by the bootloader and managed by a
> > regulator driver that can read back initial regulator states then the
> > old driver would have done nothing at all to guarantee that a
> > regulator went off. If you want some proof of this, it's even
> > documented in `Documentation/power/regulator/consumer.rst`:
> >
> > NOTE:
> >    The supply may already be enabled before regulator_enabled() is called.
> >    This may happen if the consumer shares the regulator or the regulator has been
> >    previously enabled by bootloader or kernel board initialization code.
> >
> > If you really need to make sure that your regulator was disabled at
> > boot, you could probably do something like this psuedocode:
> >
> > supply = regulator_get(...)
> > if (regulator_is_enabled(supply)) {
> >    /* Enable and disable and that should sync it up */
> >    regulator_enable(supply);
> >    regulator_disable(supply);
> >    if (regulator_is_enabled(supply)) {
> >      pr_err("Crud, we couldn't disable\n");
> >      return -E_LIFESUCKS;
> >    }
> > }
> >
> >
> > 2. Looking more closely at the commit you're fixing, though, I'm even
> > more confused.
> >
> > I _think_ your assertion here is that the longer delay is needed on
> > the first power on of the panel at bootup. Is that correct? This is
> > why you need to initialize "unprepared_time" in the probe() function.
> > However, when I go back to the old code (before e5e30dfcf3db, the
> > commit yours "Fixes") you can actually see that there was no delay at
> > all before the first power on of the panel. The only delay was if you
> > turned the panel off and then turned it back on again. ...so the only
> > thing that the commit should have broken would have been the power-ons
> > of the panel _after_ the first. ...but your patch only affects the
> > delay for the first power on.
> >
> > Huh?
> >
> >
> >>> As indicated by my other comments, I
> >>> actually don't think your patch currently does in all cases. If the
> >>> panel is powered by a PMIC and the bootloader left the power on, your
> >>> patch series _won't_ shut it down and bring it back up, will it?
> >>
> >> That depends on the regulator configuration. That itself is a separate
> >> issue however, one which has been present even before any of this boot
> >> time optimization attempt.
> >>
> >>> In any case, if your panel requires extra delays, it would be ideal if
> >>> this didn't inflict a penalty on all panels. I haven't personally
> >>> worked on any panels currently serviced by panel-simple, but for most
> >>> eDP panels the only strong timing requirement is that once you turn
> >>> off the main power rail that you don't turn it on again for ~500ms.
> >>
> >> The extra delay is actually only inflicted on panels which do set delay
> >> { .unprepare = ... } constraint in their timing specification, and those
> >> panels most certainly do need those extra delays to operate correctly.
> >>
> >>> For most panels it's OK to turn it on early (like as soon as the
> >>> regulator proves) and also OK if the main power rail stays on between
> >>> the bootloader and the kernel.
> >>
> >> I would debate the "most" part, as that is not my experience with DPI
> >> and LVDS panels, which, if they are not correctly power sequenced, can
> >> go all kinds of weird and that weirdness is often very subtle. Or worse,
> >> those panels start failing in deployment.
> >>
> >>> For eDP the one exception I've seen was
> >>> the "samsung-atna33xc20" panel and that panel has its own driver
> >>> specifically to deal with quirks like this. I talk about this a little
> >>> bit in commit 23ff866987de ("arm64: dts: qcom: sc7180: Start the
> >>> trogdor eDP/touchscreen regulator on") since homestar uses
> >>> "samsung-atna33xc20"
> >>
> >> I seldom work with eDP panels, so I cannot comment on that part.
> >>
> >> It is well possible the more complex electronics of the panel hides a
> >> lot of the power sequencing details, I wouldn't be surprised by that.
> >>
> >>>>> ...however, I guess in the case of the panel, things could be
> >>>>> different because regulators aren't directly controlled by the panel
> >>>>> code. Thus, I could imagine that your situation is this:
> >>>>>
> >>>>> 1. Bootloader runs and leaves the panel powered on.
> >>>>
> >>>> Bootloader does not touch the panel at all.
> >>>
> >>> Huh, then I'm pretty confused. Where is the timing violation then? If
> >>> the panel was off when the device started booting and the bootloader
> >>> didn't touch the panel, then the existing code should work fine. The
> >>> current code will make sure that we delay at least "unprepare" ms
> >>> since the kernel booted and so no specs should be violated.
> >>>
> >>> Are you sure you aren't running into something like a case of
> >>> -EPROBE_DEFER where panel-simple powers the regulator on, then
> >>> un-probes, and then tries probing again? ...or maybe the default state
> >>> of the regulator at bootup _is_ powered on and that's the problem?
> >>
> >> Have a look at panel_simple_resume() panel_simple_wait(), this is where
> >> the extra delay is needed. You cannot predict how long the bootloader
> >> took to reach the kernel time t=0 and you cannot know what happened
> >> before the bootloader started (maybe abrupt sysrq reset), not on all
> >> platforms anyway, so the best you can do is assume the worst, i.e. full
> >> unprepare delay.
> >
> > I feel like there is a confusion here. With the old code,
> > "unprepared_time" was implicitly set to 0 (because the whole structure
> > was zero initialized). 0 is actually a valid time and represents the
> > time that the kernel booted (well, more correctly when ktime finished
> > initting, but that's pretty early).
> >
> > Let's look at a few concerte cases. In this example I'll go with what
> > I think you've said is happening in your system: the bootloader
> > doesn't touch the panel and the panels power rails are off at bootup.
> >
> >
> > Case 1: everything boots absurdly fast and "unprepared_time" is 1000 ms.
> >
> > 1. CPU resets and starts executing the bootloader. Panel is fully powered off.
> >
> > 2. Let's imagine the bootloader finishes in an absurdly fast 10 ms and
> > starts Linux.
> >
> > 3. Linux starts and inits its clock. It does this in 10 ms. Kernel
> > time is 0 now and it's been 20 ms since CPU reset.
> >
> > 4. Linux gets to panel init code after another 10 ms. Kernel time is
> > 10 ms and it's been 20 ms since CPU reset.
> >
> > 5. We try to turn the panel on after another 10 ms. Kernel time is 20
> > ms and it's been 30 ms since CPU reset.
> >
> > 6. We look at kernel time (30 ms) and the unprepare delay (1000 ms)
> > and we'll delay 970 ms.
> >
> > 7. After the delay, kernel time will be 1000 ms and it will have been
> > 1010 ms since CPU reset.
> >
> > ...so if the panel was truly untouched by the bootloader and the
> > panel's power truly initted to off at bootup then we should be fine
> > since it's been at least 1010 ms since the panel was powered off.
> >
> >
> > Case 2: everything boots absurdly slowly and "unprepared_time" is 1000 ms.
> >
> > 1. CPU resets and starts executing the bootloader. Panel is fully powered off.
> >
> > 2. Let's imagine the bootloader finishes in an absurdly slow 2000 ms
> > and starts Linux.
> >
> > 3. Linux starts and inits its clock. It does this in 2000 ms. Kernel
> > time is 0 now and it's been 4000 ms since CPU reset.
> >
> > 4. Linux gets to panel init code after another 2000 ms. Kernel time is
> > 2000 ms and it's been 6000 ms since CPU reset.
> >
> > 5. We try to turn the panel on after another 2000 ms. Kernel time is
> > 4000 ms and it's been 8000 ms since CPU reset.
> >
> > 6. We look at kernel time (4000 ms) and the unprepare delay (1000 ms)
> > and we'll delay 0 ms (no delay)
> >
> > ...so if the panel was truly untouched by the bootloader and the
> > panel's power truly initted to off at bootup then we should be fine
> > since it's been at least 8000 ms since the panel was powered off.
> >
> >
> > Since the existing code should be correctly honoring the delay in both
> > of the two cases, I'd like to find out what assumption is wrong.
>
> Maybe the EPROBE_DEFER actually happens and triggers the failure ?

I could certainly believe that EPROBE_DEFER is involved. In any case,
it sounds as if you need to dig into the failure more and understand
it better before we land a fix. As it is, I don't think it's clear
how/why the delay you're adding is actually helping your situation
and, unless my logic is wrong, I don't think it's just putting the
delay back to how things were before commit e5e30dfcf3db ("drm: panel:
simple: Defer unprepare delay till next prepare to shorten it").

-Doug
Marek Vasut July 31, 2023, 6:03 p.m. UTC | #10
On 7/24/23 15:49, Doug Anderson wrote:

Hi,

[...]

>> Maybe the EPROBE_DEFER actually happens and triggers the failure ?
> 
> I could certainly believe that EPROBE_DEFER is involved.

So no, it is not. It is difficult to set this up and access the signals, 
but so I did.

What happens is this:
panel_simple_probe() calls devm_regulator_get()
   -> If the regulator was ENABLED, then it is now DISABLED
   -> For regulator-fixed, this means the regulator GPIO goes HIGH->LOW

panel_simple_prepare() triggers panel_simple_resume()
   -> If this occurs too soon after devm_regulator_get() turned the
      regulator OFF and thus regulator GPIO low, then unprepare time is
      not respected => FAIL

Since there is no way to find out in which state the regulator was when 
devm_regulator_get() was called, we have to wait the full unprepare time 
before re-enabling that regulator in panel_simple_resume().
Doug Anderson July 31, 2023, 7:50 p.m. UTC | #11
Hi,

On Mon, Jul 31, 2023 at 11:03 AM Marek Vasut <marex@denx.de> wrote:
>
> On 7/24/23 15:49, Doug Anderson wrote:
>
> Hi,
>
> [...]
>
> >> Maybe the EPROBE_DEFER actually happens and triggers the failure ?
> >
> > I could certainly believe that EPROBE_DEFER is involved.
>
> So no, it is not. It is difficult to set this up and access the signals,
> but so I did.
>
> What happens is this:
> panel_simple_probe() calls devm_regulator_get()
>    -> If the regulator was ENABLED, then it is now DISABLED

As per my previous response, I don't think this is true. This was the
part where I referred to `Documentation/power/regulator/consumer.rst`
which said:

NOTE:
  The supply may already be enabled before regulator_enabled() is called.
  This may happen if the consumer shares the regulator or the regulator has been
  previously enabled by bootloader or kernel board initialization code.


My belief is that what's actually happening is that when the regulator
_probes_ that the regulator turns off. In Linux GPIO regulators cannot
read the state of the regulator at bootup. That means that when the
regulator itself probes that Linux has to decide on the default state
of the regulator itself. If the device tree has "regulator-boot-on"
then Linux will turn the regulator on (even without any clients). In
this case the regulator will stay on until some client enables and
then disables the regulator, or until the regulator boot timeout
happens and all unused regulators are powered off. If the device tree
doesn't have "regulator-boot-on" then Linux will turn the regulator
off.


>    -> For regulator-fixed, this means the regulator GPIO goes HIGH->LOW
>
> panel_simple_prepare() triggers panel_simple_resume()
>    -> If this occurs too soon after devm_regulator_get() turned the
>       regulator OFF and thus regulator GPIO low, then unprepare time is
>       not respected => FAIL
>
> Since there is no way to find out in which state the regulator was when
> devm_regulator_get() was called, we have to wait the full unprepare time
> before re-enabling that regulator in panel_simple_resume().

So just as a point of order, do you agree that prior to the commit
that you are "Fixing" that we _weren't_ doing the full delay at probe
time? That means that if things worked before they were working by
some amount of luck, right? The old code used to do a delay after
turning _off_ the regulator (at unprepare time).

I will also continue to assert that trying to fix the problem via a
delay in the probe of the panel code is the wrong place to fix the
code. The problem is that you need a board-level constraint on this
regulator (off-on-delay-us) preventing it from turning on again within
a certain amount of time after it is turned off. This allows the
regulator framework, which is what decided to turn this rail off
during the regulator probe, to enforce this constraint.


-Doug
Marek Vasut July 31, 2023, 9:15 p.m. UTC | #12
On 7/31/23 21:50, Doug Anderson wrote:
> Hi,
> 
> On Mon, Jul 31, 2023 at 11:03 AM Marek Vasut <marex@denx.de> wrote:
>>
>> On 7/24/23 15:49, Doug Anderson wrote:
>>
>> Hi,
>>
>> [...]
>>
>>>> Maybe the EPROBE_DEFER actually happens and triggers the failure ?
>>>
>>> I could certainly believe that EPROBE_DEFER is involved.
>>
>> So no, it is not. It is difficult to set this up and access the signals,
>> but so I did.
>>
>> What happens is this:
>> panel_simple_probe() calls devm_regulator_get()
>>     -> If the regulator was ENABLED, then it is now DISABLED
> 
> As per my previous response, I don't think this is true.

I just measured that with a scope on actual hardware .

reg_fixed_voltage_probe() is the code which turns the regulator OFF, 
specifically in the part gpiod_get_optional(...GPIOD_OUT_LOW);

> This was the
> part where I referred to `Documentation/power/regulator/consumer.rst`
> which said:
> 
> NOTE:
>    The supply may already be enabled before regulator_enabled() is called.
>    This may happen if the consumer shares the regulator or the regulator has been
>    previously enabled by bootloader or kernel board initialization code.
> 
> 
> My belief is that what's actually happening is that when the regulator
> _probes_ that the regulator turns off. In Linux GPIO regulators cannot
> read the state of the regulator at bootup. That means that when the
> regulator itself probes that Linux has to decide on the default state
> of the regulator itself. If the device tree has "regulator-boot-on"
> then Linux will turn the regulator on (even without any clients). In
> this case the regulator will stay on until some client enables and
> then disables the regulator, or until the regulator boot timeout
> happens and all unused regulators are powered off. If the device tree
> doesn't have "regulator-boot-on" then Linux will turn the regulator
> off.
> 
> 
>>     -> For regulator-fixed, this means the regulator GPIO goes HIGH->LOW
>>
>> panel_simple_prepare() triggers panel_simple_resume()
>>     -> If this occurs too soon after devm_regulator_get() turned the
>>        regulator OFF and thus regulator GPIO low, then unprepare time is
>>        not respected => FAIL
>>
>> Since there is no way to find out in which state the regulator was when
>> devm_regulator_get() was called, we have to wait the full unprepare time
>> before re-enabling that regulator in panel_simple_resume().
> 
> So just as a point of order, do you agree that prior to the commit
> that you are "Fixing" that we _weren't_ doing the full delay at probe
> time? That means that if things worked before they were working by
> some amount of luck, right? The old code used to do a delay after
> turning _off_ the regulator (at unprepare time).

Yes, that's well possible.

> I will also continue to assert that trying to fix the problem via a
> delay in the probe of the panel code is the wrong place to fix the
> code. The problem is that you need a board-level constraint on this
> regulator (off-on-delay-us) preventing it from turning on again within
> a certain amount of time after it is turned off. This allows the
> regulator framework, which is what decided to turn this rail off
> during the regulator probe, to enforce this constraint.

No, the driver must respect the unprepare delay, that is what is 
currently not happening. Trying to somehow work around that by adding 
random changes to DT is not going to fix the fact that panel-simple is 
not respecting its own internal built-in constraints.
Doug Anderson July 31, 2023, 9:34 p.m. UTC | #13
Hi,

On Mon, Jul 31, 2023 at 2:15 PM Marek Vasut <marex@denx.de> wrote:
>
> On 7/31/23 21:50, Doug Anderson wrote:
> > Hi,
> >
> > On Mon, Jul 31, 2023 at 11:03 AM Marek Vasut <marex@denx.de> wrote:
> >>
> >> On 7/24/23 15:49, Doug Anderson wrote:
> >>
> >> Hi,
> >>
> >> [...]
> >>
> >>>> Maybe the EPROBE_DEFER actually happens and triggers the failure ?
> >>>
> >>> I could certainly believe that EPROBE_DEFER is involved.
> >>
> >> So no, it is not. It is difficult to set this up and access the signals,
> >> but so I did.
> >>
> >> What happens is this:
> >> panel_simple_probe() calls devm_regulator_get()
> >>     -> If the regulator was ENABLED, then it is now DISABLED
> >
> > As per my previous response, I don't think this is true.
>
> I just measured that with a scope on actual hardware .
>
> reg_fixed_voltage_probe() is the code which turns the regulator OFF,
> specifically in the part gpiod_get_optional(...GPIOD_OUT_LOW);

Great, at least we're on the same page here now.


> > So just as a point of order, do you agree that prior to the commit
> > that you are "Fixing" that we _weren't_ doing the full delay at probe
> > time? That means that if things worked before they were working by
> > some amount of luck, right? The old code used to do a delay after
> > turning _off_ the regulator (at unprepare time).
>
> Yes, that's well possible.

...so assuming we agree that this is _not_ a regression introduced by
e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next
prepare to shorten it"), that means that all other preexisting users
of panel-simple were fine with the old behavior where the unprepare
delay was only enforced when the panel driver itself turned things off
and then back on and no extra delay was needed at probe. The one board
we know about that has a problem with this behavior is the one you're
reproducing on, which we think only worked previously by chance.


> > I will also continue to assert that trying to fix the problem via a
> > delay in the probe of the panel code is the wrong place to fix the
> > code. The problem is that you need a board-level constraint on this
> > regulator (off-on-delay-us) preventing it from turning on again within
> > a certain amount of time after it is turned off. This allows the
> > regulator framework, which is what decided to turn this rail off
> > during the regulator probe, to enforce this constraint.
>
> No, the driver must respect the unprepare delay, that is what is
> currently not happening. Trying to somehow work around that by adding
> random changes to DT is not going to fix the fact that panel-simple is
> not respecting its own internal built-in constraints.

Well, except that the panel _isn't_ actually unpreparing the panel.
The constant you're talking about is a delay between unpreparing the
panel and then preparing it again and we're not doing that here. Here,
you are trying to account for an implicit unprepare that happened
outside the context of the panel driver (in the regulator framework).
The regulator framework is the one disabling the regulator on its own
and without the knowledge of the panel driver. The problem should be
addressed at the regulator framework, which might involve the
off-on-delay.

I would even go further and say that, perhaps, when the regulator
framework turns this regulator off at bootup then you might be
violating the power sequencing requirements in the panel anyway. If
the bootloader left the panel on and _also_ left an enable GPIO on
then it's entirely possible that you've got a power leak through the
enable GPIO where you're backpowering the panel's logic when the
regulator framework turns things off. This is something that would be
impossible for the panel driver to solve because the panel driver
hasn't even probed yet.

In any case, at this point it seems unlikely that I'll convince you or
that you'll convince me. I wonder if it's time for someone else on
this thread to provide an opinion.

-Doug
Marek Vasut July 31, 2023, 9:53 p.m. UTC | #14
On 7/31/23 23:34, Doug Anderson wrote:
> Hi,
> 
> On Mon, Jul 31, 2023 at 2:15 PM Marek Vasut <marex@denx.de> wrote:
>>
>> On 7/31/23 21:50, Doug Anderson wrote:
>>> Hi,
>>>
>>> On Mon, Jul 31, 2023 at 11:03 AM Marek Vasut <marex@denx.de> wrote:
>>>>
>>>> On 7/24/23 15:49, Doug Anderson wrote:
>>>>
>>>> Hi,
>>>>
>>>> [...]
>>>>
>>>>>> Maybe the EPROBE_DEFER actually happens and triggers the failure ?
>>>>>
>>>>> I could certainly believe that EPROBE_DEFER is involved.
>>>>
>>>> So no, it is not. It is difficult to set this up and access the signals,
>>>> but so I did.
>>>>
>>>> What happens is this:
>>>> panel_simple_probe() calls devm_regulator_get()
>>>>      -> If the regulator was ENABLED, then it is now DISABLED
>>>
>>> As per my previous response, I don't think this is true.
>>
>> I just measured that with a scope on actual hardware .
>>
>> reg_fixed_voltage_probe() is the code which turns the regulator OFF,
>> specifically in the part gpiod_get_optional(...GPIOD_OUT_LOW);
> 
> Great, at least we're on the same page here now.
> 
> 
>>> So just as a point of order, do you agree that prior to the commit
>>> that you are "Fixing" that we _weren't_ doing the full delay at probe
>>> time? That means that if things worked before they were working by
>>> some amount of luck, right? The old code used to do a delay after
>>> turning _off_ the regulator (at unprepare time).
>>
>> Yes, that's well possible.
> 
> ...so assuming we agree that this is _not_ a regression introduced by
> e5e30dfcf3db ("drm: panel: simple: Defer unprepare delay till next
> prepare to shorten it"), that means that all other preexisting users
> of panel-simple were fine with the old behavior

No, it does not mean that all the previous users were fine with that 
behavior. It means the driver operates panels out of spec, we cannot 
know which ones, but we do know it does. Whether users noticed that 
defect or not is another question, which we cannot answer.

> where the unprepare
> delay was only enforced when the panel driver itself turned things off
> and then back on and no extra delay was needed at probe. The one board
> we know about that has a problem with this behavior is the one you're
> reproducing on, which we think only worked previously by chance.

There is at least one device now which shows a problem with the current 
state of driver.

>>> I will also continue to assert that trying to fix the problem via a
>>> delay in the probe of the panel code is the wrong place to fix the
>>> code. The problem is that you need a board-level constraint on this
>>> regulator (off-on-delay-us) preventing it from turning on again within
>>> a certain amount of time after it is turned off. This allows the
>>> regulator framework, which is what decided to turn this rail off
>>> during the regulator probe, to enforce this constraint.
>>
>> No, the driver must respect the unprepare delay, that is what is
>> currently not happening. Trying to somehow work around that by adding
>> random changes to DT is not going to fix the fact that panel-simple is
>> not respecting its own internal built-in constraints.
> 
> Well, except that the panel _isn't_ actually unpreparing the panel.
> The constant you're talking about is a delay between unpreparing the
> panel and then preparing it again and we're not doing that here. Here,
> you are trying to account for an implicit unprepare that happened
> outside the context of the panel driver (in the regulator framework).
> The regulator framework is the one disabling the regulator on its own
> and without the knowledge of the panel driver.

I agree until this point.

> The problem should be
> addressed at the regulator framework, which might involve the
> off-on-delay.

I disagree with this point.

> I would even go further and say that, perhaps, when the regulator
> framework turns this regulator off at bootup then you might be
> violating the power sequencing requirements in the panel anyway. If
> the bootloader left the panel on and _also_ left an enable GPIO on
> then it's entirely possible that you've got a power leak through the
> enable GPIO where you're backpowering the panel's logic when the
> regulator framework turns things off. This is something that would be
> impossible for the panel driver to solve because the panel driver
> hasn't even probed yet.

I agree with this part as well.

> In any case, at this point it seems unlikely that I'll convince you or
> that you'll convince me. I wonder if it's time for someone else on
> this thread to provide an opinion.

I agree with this part as well.
diff mbox series

Patch

diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index d3238088b7f80..37afed67fea7e 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -567,6 +567,7 @@  static int panel_simple_probe(struct device *dev, const struct panel_desc *desc)
 
 	panel->enabled = false;
 	panel->prepared_time = 0;
+	panel->unprepared_time = ktime_get_boottime();
 	panel->desc = desc;
 
 	panel->supply = devm_regulator_get(dev, "power");