
regulator: wm8994: Use PROBE_FORCE_SYNCHRONOUS

Message ID 20230323083312.199189-1-m.szyprowski@samsung.com (mailing list archive)
State Accepted
Commit eef644d3802e7d5b899514dc9c3663a692817162
Series regulator: wm8994: Use PROBE_FORCE_SYNCHRONOUS

Commit Message

Marek Szyprowski March 23, 2023, 8:33 a.m. UTC
Restore synchronous probing for wm8994 regulators because otherwise the
sound device is never initialized on Exynos5250-based Arndale board.

Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/regulator/wm8994-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Charles Keepax March 23, 2023, 11:40 a.m. UTC | #1
On Thu, Mar 23, 2023 at 09:33:12AM +0100, Marek Szyprowski wrote:
> Restore synchronous probing for wm8994 regulators because otherwise the
> sound device is never initialized on Exynos5250-based Arndale board.
> 
> Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>  drivers/regulator/wm8994-regulator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c
> index 8921051a00e9..2946db448aec 100644
> --- a/drivers/regulator/wm8994-regulator.c
> +++ b/drivers/regulator/wm8994-regulator.c
> @@ -227,7 +227,7 @@ static struct platform_driver wm8994_ldo_driver = {
>  	.probe = wm8994_ldo_probe,
>  	.driver		= {
>  		.name	= "wm8994-ldo",
> -		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +		.probe_type = PROBE_FORCE_SYNCHRONOUS,
>  	},
>  };

Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>

Yes, this seems to be a wider problem with these complex CODECs
that have an internal LDO. Changing to async probe means the
internal LDO driver doesn't probe before the code in the main MFD
carries on, which means the regulator framework finds no driver
and swaps in the dummy. Which means the CODEC never powers up.

I think these basically have to be forced sync, I will do a patch
to update the other devices working like this.
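
The ordering problem described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: the toy_* names, the fixed-size registry and the sync_probe flag are all invented for illustration, and the real regulator core is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a regulator handle; not the kernel's struct regulator. */
struct toy_regulator {
	const char *name;
	bool is_dummy;
};

#define TOY_MAX_REGULATORS 4

static struct toy_regulator toy_registry[TOY_MAX_REGULATORS];
static size_t toy_registry_len;
static const struct toy_regulator toy_dummy = { "dummy", true };

/* Models what the LDO driver's probe does: register a provider. */
static bool toy_register(const char *name)
{
	if (toy_registry_len == TOY_MAX_REGULATORS)
		return false;
	toy_registry[toy_registry_len].name = name;
	toy_registry[toy_registry_len].is_dummy = false;
	toy_registry_len++;
	return true;
}

/* Models the lookup: fall back to a dummy if nothing is registered yet. */
static const struct toy_regulator *toy_get(const char *name)
{
	for (size_t i = 0; i < toy_registry_len; i++)
		if (strcmp(toy_registry[i].name, name) == 0)
			return &toy_registry[i];
	return &toy_dummy;
}

/*
 * Models the parent's probe. With sync_probe the child has registered
 * before the parent asks for the supply; without it the child only
 * registers afterwards. Returns true if the parent got the real regulator.
 */
static bool toy_parent_probe(bool sync_probe)
{
	toy_registry_len = 0;
	if (sync_probe)
		toy_register("LDO1");

	const struct toy_regulator *reg = toy_get("LDO1");

	if (!sync_probe)
		toy_register("LDO1");	/* async child finishes too late */

	return !reg->is_dummy;
}
```

In this model toy_parent_probe(true) ends up with the real regulator, while toy_parent_probe(false) is left holding the dummy even though the provider registers a moment later.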

Thanks,
Charles

Mark Brown March 23, 2023, 1:49 p.m. UTC | #2
On Thu, 23 Mar 2023 09:33:12 +0100, Marek Szyprowski wrote:
> Restore synchronous probing for wm8994 regulators because otherwise the
> sound device is never initialized on Exynos5250-based Arndale board.
> 
> 

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next

Thanks!

[1/1] regulator: wm8994: Use PROBE_FORCE_SYNCHRONOUS
      commit: eef644d3802e7d5b899514dc9c3663a692817162

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

Doug Anderson March 23, 2023, 4:53 p.m. UTC | #3
Hi,

On Thu, Mar 23, 2023 at 4:40 AM Charles Keepax
<ckeepax@opensource.cirrus.com> wrote:
>
> On Thu, Mar 23, 2023 at 09:33:12AM +0100, Marek Szyprowski wrote:
> > Restore synchronous probing for wm8994 regulators because otherwise the
> > sound device is never initialized on Exynos5250-based Arndale board.
> >
> > Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > ---
> >  drivers/regulator/wm8994-regulator.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c
> > index 8921051a00e9..2946db448aec 100644
> > --- a/drivers/regulator/wm8994-regulator.c
> > +++ b/drivers/regulator/wm8994-regulator.c
> > @@ -227,7 +227,7 @@ static struct platform_driver wm8994_ldo_driver = {
> >       .probe = wm8994_ldo_probe,
> >       .driver         = {
> >               .name   = "wm8994-ldo",
> > -             .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> > +             .probe_type = PROBE_FORCE_SYNCHRONOUS,
> >       },
> >  };
>
> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
>
> Yes, this seems to be a wider problem with these complex CODECs
> that have an internal LDO. Changing to async probe means the
> internal LDO driver doesn't probe before the code in the main MFD
> carries on, which means the regulator framework finds no driver
> and swaps in the dummy. Which means the CODEC never powers up.
>
> I think these basically have to be forced sync, I will do a patch
> to update the other devices working like this.

Sorry for the breakage and thank you for the fix.

No question that a quick switch back to PROBE_FORCE_SYNCHRONOUS is the
right first step here, but I'm wondering if there are any further
steps we want to take.

If my analysis is correct, there's still potential to run into similar
problems even with PROBE_FORCE_SYNCHRONOUS. I don't think that
mfd_add_devices() is _guaranteed_ to finish probing all the
sub-devices by the time it returns. Specifically, imagine that
wm8994_ldo_probe() tries to get a GPIO that the system hasn't made
available yet. Potentially the system could return -EPROBE_DEFER there
and that would put the LDO on the deferred probe list. Doing so won't
cause mfd_add_devices() to fail. Now we'll end up with a dummy
regulator again. Admittedly, in most cases GPIO providers probe really
early and so this argument is a bit of a stretch, but I guess the
point is that there's no official guarantee that mfd_add_devices()
will finish probing all sub-devices so it's not ideal to rely on.
Also, other drivers with a similar pattern might have other reasons to
-EPROBE_DEFER.
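
To make the concern concrete, here is a rough user-space sketch of the behaviour described above: a child that defers does not make the add-devices step fail. The toy_* names and the resource_ready flag are hypothetical; only the numeric value 517 is taken from the kernel's EPROBE_DEFER definition.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_EPROBE_DEFER 517	/* same value as the kernel's EPROBE_DEFER */

struct toy_child {
	const char *name;
	bool resource_ready;	/* e.g. the GPIO the child needs */
	bool bound;
	bool deferred;
};

/* Child probe: defer if the resource it depends on is not there yet. */
static int toy_child_probe(const struct toy_child *c)
{
	return c->resource_ready ? 0 : -TOY_EPROBE_DEFER;
}

/*
 * Registers every child and attempts a synchronous probe. A deferral is
 * recorded on the child but is not reported back to the caller.
 */
static int toy_add_devices(struct toy_child *children, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = toy_child_probe(&children[i]);

		children[i].bound = (ret == 0);
		children[i].deferred = (ret == -TOY_EPROBE_DEFER);
	}
	return 0;
}
```

The parent sees a return value of 0 either way, so it has no direct indication that the LDO child is sitting on the deferred list rather than bound.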

These types of issues are the same ones I faced with DP AUX bus and
the solutions were never super amazing...

One solution we ended up with for the DP AUX bus was to add a
"done_probing" callback argument to devm_of_dp_aux_populate_bus().
This allowed the parent to be called back when all the children were
done probing. This implementation is easier for DP AUX bus than it
would be in the generic MFD case, but conceivably it would be possible
there too?
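
As a rough illustration of the callback idea, the sketch below counts outstanding children and calls the parent back once the last one binds. It is a user-space model under stated assumptions: toy_populate(), toy_child_bound() and the pending counter are made up here and do not reflect how devm_of_dp_aux_populate_bus() or the MFD core are implemented.

```c
#include <stddef.h>

struct toy_parent;
typedef void (*toy_done_fn)(struct toy_parent *p);

struct toy_parent {
	size_t pending;			/* children that have not bound yet */
	toy_done_fn done_probing;	/* called once pending reaches zero */
	int powered_up;
};

/* Register n_children and remember who to call when they are all bound. */
static void toy_populate(struct toy_parent *p, size_t n_children,
			 toy_done_fn done)
{
	p->pending = n_children;
	p->done_probing = done;
	p->powered_up = 0;
	if (n_children == 0 && done)
		done(p);
}

/* Called each time a child binds, whether immediately or after deferral. */
static void toy_child_bound(struct toy_parent *p)
{
	if (p->pending == 0)
		return;
	if (--p->pending == 0 && p->done_probing)
		p->done_probing(p);
}

/* The work the parent must not do until its regulators exist. */
static void toy_power_up(struct toy_parent *p)
{
	p->powered_up = 1;
}
```

With two children, toy_power_up() runs only after the second toy_child_bound() call, regardless of how long either child spent deferred.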

Another possible solution is to somehow break the driver up into more
sub-drivers. Essentially, you have a top-level "wm8994" that's not
much more than a container. Then you create a new sub-device and
relegate anything that needs the regulators to the new sub-device. The
new sub-device can just -EPROBE_DEFER until it detects that the
regulators have finished probing. I ended up doing stuff like this for
"ti-sn65dsi86.c" using the Linux aux bus (not to be confused with the
DP Aux bus) and it was a bit odd but worked OK.
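
The split could look something like the following user-space sketch. The three probe functions and struct toy_chip are hypothetical stand-ins, not the ti-sn65dsi86 or wm8994 code; the only point is that the part needing the regulators defers on its own without tearing down its siblings.

```c
#include <stdbool.h>

#define TOY_EPROBE_DEFER 517	/* same value as the kernel's EPROBE_DEFER */

struct toy_chip {
	bool ldo_bound;
	bool core_bound;
};

/* Container probe: only sets up children, never touches the hardware. */
static int toy_container_probe(struct toy_chip *chip)
{
	chip->ldo_bound = false;
	chip->core_bound = false;
	return 0;
}

/* Regulator sub-device: may bind at any time, sync or async. */
static int toy_ldo_probe(struct toy_chip *chip)
{
	chip->ldo_bound = true;
	return 0;
}

/* New sub-device holding everything that needs the regulators. */
static int toy_core_probe(struct toy_chip *chip)
{
	if (!chip->ldo_bound)
		return -TOY_EPROBE_DEFER;	/* retried later by the driver core */
	chip->core_bound = true;
	return 0;
}
```

Here a first toy_core_probe() before the LDO has bound returns -TOY_EPROBE_DEFER, and a retry after toy_ldo_probe() succeeds, while the container itself never fails.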

-Doug

Mark Brown March 23, 2023, 4:57 p.m. UTC | #4
On Thu, Mar 23, 2023 at 09:53:18AM -0700, Doug Anderson wrote:

> Sorry for the breakage and thank you for the fix.

Mostly my fault, it was me who asked you to do all the drivers.

> No question that a quick switch back to PROBE_FORCE_SYNCHRONOUS is the
> right first step here, but I'm wondering if there are any further
> steps we want to take.

> If my analysis is correct, there's still potential to run into similar
> problems even with PROBE_FORCE_SYNCHRONOUS. I don't think that
> mfd_add_devices() is _guaranteed_ to finish probing all the
> sub-devices by the time it returns. Specifically, imagine that
> wm8994_ldo_probe() tries to get a GPIO that the system hasn't made
> available yet. Potentially the system could return -EPROBE_DEFER there

Yes, the code isn't 100% robust.  The driver was written on the basis
that we know the target systems for practical deployments are very
unlikely to have such issues and we'd deal with the potential issues if
they ever actually cropped up.

Charles Keepax March 23, 2023, 5:45 p.m. UTC | #5
On Thu, Mar 23, 2023 at 09:53:18AM -0700, Doug Anderson wrote:
> On Thu, Mar 23, 2023 at 4:40 AM Charles Keepax
> If my analysis is correct, there's still potential to run into similar
> problems even with PROBE_FORCE_SYNCHRONOUS. I don't think that
> mfd_add_devices() is _guaranteed_ to finish probing all the
> sub-devices by the time it returns. Specifically, imagine that
> wm8994_ldo_probe() tries to get a GPIO that the system hasn't made
> available yet. Potentially the system could return -EPROBE_DEFER there
> and that would put the LDO on the deferred probe list. Doing so won't
> cause mfd_add_devices() to fail. Now we'll end up with a dummy
> regulator again. Admittedly, in most cases GPIO providers probe really
> early and so this argument is a bit of a stretch, but I guess the
> point is that there's no official guarantee that mfd_add_devices()
> will finish probing all sub-devices so it's not ideal to rely on.
> Also, other drivers with a similar pattern might have other reasons to
> -EPROBE_DEFER.
> 
> These types of issues are the same ones I faced with DP AUX bus and
> the solutions were never super amazing...
> 
> One solution we ended up with for the DP AUX bus was to add a
> "done_probing" callback argument to devm_of_dp_aux_populate_bus().
> This allowed the parent to be called back when all the children were
> done probing. This implementation is easier for DP AUX bus than it
> would be in the generic MFD case, but conceivably it would be possible
> there too?
> 
> Another possible solution is to somehow break the driver up into more
> sub-drivers. Essentially, you have a top-level "wm8994" that's not
> much more than a container. Then you create a new sub-device and
> relegate anything that needs the regulators to the new sub-device. The
> new sub-device can just -EPROBE_DEFER until it detects that the
> regulators have finished probing. I ended up doing stuff like this for
> "ti-sn65dsi86.c" using the Linux aux bus (not to be confused with the
> DP Aux bus) and it was a bit odd but worked OK.

Yes I believe you are correct, there is still an issue here,
indeed a quick test suggests I can still cause this by forcing a
probe defer in the regulator driver.

I think really the best place to look at this would be at the
regulator level. It is fine if mfd_add_devices passes, the problem
really is that the regulator core doesn't realise the regulator is
going to be arriving, and thus returns a dummy regulator, rather
than returning EPROBE_DEFER. If it did the MFD driver would probe
defer at the point of requesting the regulator, which would all
make sense.
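
Roughly, the behaviour I have in mind looks like the user-space sketch below. The three-state enum and toy_regulator_get() are purely hypothetical; in particular the TOY_EXPECTED state assumes some mechanism, which does not exist in this form today, for the core to learn that a provider is on its way.

```c
#define TOY_EPROBE_DEFER 517	/* same value as the kernel's EPROBE_DEFER */

/* What the (modelled) regulator core knows about a requested supply. */
enum toy_supply_state {
	TOY_UNKNOWN,	/* nobody has said a provider will ever exist */
	TOY_EXPECTED,	/* a provider is known to be coming, not yet probed */
	TOY_REGISTERED,	/* the provider has probed and registered */
};

/*
 * Returns 1 for the real regulator, 0 for a dummy, or a negative
 * error code when the consumer should try again later.
 */
static int toy_regulator_get(enum toy_supply_state state)
{
	switch (state) {
	case TOY_REGISTERED:
		return 1;
	case TOY_EXPECTED:
		return -TOY_EPROBE_DEFER;	/* defer instead of a dummy */
	case TOY_UNKNOWN:
	default:
		return 0;			/* existing dummy fallback */
	}
}
```

The dummy fallback is kept for supplies nobody has described, so only the "known to be arriving" case changes behaviour.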

I will see if I can find some time to think about that further
but very unlikely to happen this week.

Thanks,
Charles

Mark Brown March 23, 2023, 5:49 p.m. UTC | #6
On Thu, Mar 23, 2023 at 05:45:31PM +0000, Charles Keepax wrote:

> I think really the best place to look at this would be at the
> regulator level. It is fine if mfd_add_devices passes, the problem
> really is that the regulator core doesn't realise the regulator is
> going to be arriving, and thus returns a dummy regulator, rather
> than returning EPROBE_DEFER. If it did the MFD driver would probe
> defer at the point of requesting the regulator, which would all
> make sense.

You need the MFD to tell the regulator subsystem that there's a
regulator bound there, or to force all the users to explicitly do the
mapping of the regulator in their firmwares (which isn't really a
viable approach).

Doug Anderson March 23, 2023, 6 p.m. UTC | #7
Hi,

On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
<ckeepax@opensource.cirrus.com> wrote:
>
> I think really the best place to look at this would be at the
> regulator level. It is fine if mfd_add_devices passes, the problem
> really is that the regulator core doesn't realise the regulator is
> going to be arriving, and thus returns a dummy regulator, rather
> than returning EPROBE_DEFER. If it did the MFD driver would probe
> defer at the point of requesting the regulator, which would all
> make sense.

I think something like your suggestion could be made to work for the
"microphone" supply in the arizona MFD, but not for the others looked
at here.

The problem is that if the MFD driver gets -EPROBE_DEFER then it will
go through its error handling path and call mfd_remove_devices().
That'll remove the sub-device providing the regulator. If you try
again, you'll just do the same. :-)

Specifically in wm8994 after we've populated the regulator sub-devices
then we turn them on and start talking to the device.

I think the two options I have could both work for wm8994's case:
either add some type of "my children have done probing" to MFD and
move the turning on of regulators / talking to devices there, or add
another stub-device and add it there. ;-)

-Doug

Charles Keepax March 24, 2023, 9:21 a.m. UTC | #8
On Thu, Mar 23, 2023 at 05:49:28PM +0000, Mark Brown wrote:
> On Thu, Mar 23, 2023 at 05:45:31PM +0000, Charles Keepax wrote:
> 
> > I think really the best place to look at this would be at the
> > regulator level. It is fine if mfd_add_devices passes, the problem
> > really is that the regulator core doesn't realise the regulator is
> > going to be arriving, and thus returns a dummy regulator, rather
> > than returning EPROBE_DEFER. If it did the MFD driver would probe
> > defer at the point of requesting the regulator, which would all
> > make sense.
> 
> You need the MFD to tell the regulator subsystem that there's a
> regulator bound there, or to force all the users to explicitly do the
> mapping of the regulator in their firmwares (which isn't really a
> viable approach).

Yeah, changing the firmware situation is definitely not a goer. I
just need to clarify in my head exactly what is missing with
respect to knowing the regulator exists.

Thanks,
Charles

Charles Keepax March 24, 2023, 9:23 a.m. UTC | #9
On Thu, Mar 23, 2023 at 11:00:32AM -0700, Doug Anderson wrote:
> Hi,
> 
> On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
> <ckeepax@opensource.cirrus.com> wrote:
> >
> > I think really the best place to look at this would be at the
> > regulator level. It is fine if mfd_add_devices passes, the problem
> > really is that the regulator core doesn't realise the regulator is
> > going to be arriving, and thus returns a dummy regulator, rather
> > than returning EPROBE_DEFER. If it did the MFD driver would probe
> > defer at the point of requesting the regulator, which would all
> > make sense.
> 
> I think something like your suggestion could be made to work for the
> "microphone" supply in the arizona MFD, but not for the others looked
> at here.
> 
> The problem is that if the MFD driver gets -EPROBE_DEFER then it will
> go through its error handling path and call mfd_remove_devices().
> That'll remove the sub-device providing the regulator. If you try
> again, you'll just do the same. :-)
> 
> Specifically in wm8994 after we've populated the regulator sub-devices
> then we turn them on and start talking to the device.
> 
> I think the two options I have could both work for wm8994's case:
> either add some type of "my children have done probing" to MFD and
> move the turning on of regulators / talking to devices there, or add
> another stub-device and add it there. ;-)

Is this true if we keep the regulator as sync though? Yes it will
remove the children but when it re-adds them the reason that the
regulator probe deferred in the first place will hopefully be
removed. So it will now fully probe in path.

Thanks,
Charles

Doug Anderson March 24, 2023, 3:06 p.m. UTC | #10
Hi,

On Fri, Mar 24, 2023 at 2:23 AM Charles Keepax
<ckeepax@opensource.cirrus.com> wrote:
>
> On Thu, Mar 23, 2023 at 11:00:32AM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
> > <ckeepax@opensource.cirrus.com> wrote:
> > >
> > > I think really the best place to look at this would be at the
> > > regulator level. It is fine if mfd_add_devices passes, the problem
> > > really is that the regulator core doesn't realise the regulator is
> > > going to be arriving, and thus returns a dummy regulator, rather
> > > than returning EPROBE_DEFER. If it did the MFD driver would probe
> > > defer at the point of requesting the regulator, which would all
> > > make sense.
> >
> > I think something like your suggestion could be made to work for the
> > "microphone" supply in the arizona MFD, but not for the others looked
> > at here.
> >
> > The problem is that if the MFD driver gets -EPROBE_DEFER then it will
> > go through its error handling path and call mfd_remove_devices().
> > That'll remove the sub-device providing the regulator. If you try
> > again, you'll just do the same. :-)
> >
> > Specifically in wm8994 after we've populated the regulator sub-devices
> > then we turn them on and start talking to the device.
> >
> > I think the two options I have could both work for wm8994's case:
> > either add some type of "my children have done probing" to MFD and
> > move the turning on of regulators / talking to devices there, or add
> > another stub-device and add it there. ;-)
>
> Is this true if we keep the regulator as sync though? Yes it will
> remove the children but when it re-adds them the reason that the
> regulator probe deferred in the first place will hopefully be
> removed. So it will now fully probe in path.

Ah, I see. So you keep it as synchronous _and_ make it so that it
won't get a dummy. Yeah, I was trying to brainstorm ways we could fix
it if we kept the regulator async. If we keep it as sync and fix the
dummy problem then, indeed, it should work as you say.

I've spent a whole lot of time dealing with similar issues, though,
and I think there is actually another related concern with that design
(where the regulator is synchronous). ;-) If the child device ends up
depending on a resource that _never_ shows up then you can get into an
infinite probe deferral loop at bootup. If it works the way it did
last time I analyzed similar code:

1. Your MFD starts to probe and kicks off probing of its children
(including the regulator).

2. Your regulator tries to probe and tries to get a resource that will
never exist, maybe because of a bug in dts or maybe because it won't
show up until userspace loads a module. It returns -EPROBE_DEFER.

3. The MFD realizes that the regulator didn't come up and it also
returns -EPROBE_DEFER after removing all its children.

4. The fact that we added/removed devices in the above means that the
kernel thinks it should retry probes of previously deferred devices
because, maybe, the device showed up that everyone was waiting for.
Thus, we go back to step #1.

...the system can actually loop forever in steps #1 - #4 and we ended
up in that situation several times during development with similarly
architected systems.
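
Steps #1 - #4 can be modelled in user space along these lines. This is a sketch with made-up names, and the max_passes cap exists only so the model terminates; as far as I understand it, nothing equivalent bounds the real loop.

```c
#include <stdbool.h>

#define TOY_EPROBE_DEFER 517	/* same value as the kernel's EPROBE_DEFER */

/* Step 2: the regulator defers if its resource never shows up. */
static int toy_regulator_probe(bool resource_exists)
{
	return resource_exists ? 0 : -TOY_EPROBE_DEFER;
}

/*
 * Steps 1 and 3: the parent adds its children, probes the regulator
 * synchronously and passes a deferral upwards (after which the children
 * are removed again). Adding children always changes the device list.
 */
static int toy_mfd_probe(bool resource_exists, bool *devices_changed)
{
	*devices_changed = true;
	return toy_regulator_probe(resource_exists);
}

/*
 * Step 4: keep retrying deferred probes for as long as the device list
 * changed during the previous pass. Returns the number of passes made.
 */
static unsigned int toy_run_deferred(bool resource_exists,
				     unsigned int max_passes)
{
	unsigned int passes = 0;
	bool changed = true;

	while (changed && passes < max_passes) {
		changed = false;
		passes++;
		if (toy_mfd_probe(resource_exists, &changed) == 0)
			break;
	}
	return passes;
}
```

When the resource exists the model settles after a single pass; when it never appears, the loop runs until the artificial cap is hit, because every failed pass itself counts as a change to the device list.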


-Doug

Charles Keepax March 24, 2023, 3:44 p.m. UTC | #11
On Fri, Mar 24, 2023 at 08:06:15AM -0700, Doug Anderson wrote:
> On Fri, Mar 24, 2023 at 2:23 AM Charles Keepax
> > On Thu, Mar 23, 2023 at 11:00:32AM -0700, Doug Anderson wrote:
> > > On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
> I've spent a whole lot of time dealing with similar issues, though,
> and I think there is actually another related concern with that design
> (where the regulator is synchronous). ;-) If the child device ends up
> depending on a resource that _never_ shows up then you can get into an
> infinite probe deferral loop at bootup. If it works the way it did
> last time I analyzed similar code:
> 
> 1. Your MFD starts to probe and kicks off probing of its children
> (including the regulator).
> 
> 2. Your regulator tries to probe and tries to get a resource that will
> never exist, maybe because of a bug in dts or maybe because it won't
> show up until userspace loads a module. It returns -EPROBE_DEFER.
> 
> 3. The MFD realizes that the regulator didn't come up and it also
> returns -EPROBE_DEFER after removing all its children.
> 
> 4. The fact that we added/removed devices in the above means that the
> kernel thinks it should retry probes of previously deferred devices
> because, maybe, the device showed up that everyone was waiting for.
> Thus, we go back to step #1.
> 
> ...the system can actually loop forever in steps #1 - #4 and we ended
> up in that situation several times during development with similarly
> architected systems.

Hmm... shoot yes you are correct that would indeed happen.

Thanks,
Charles

Patch

diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c
index 8921051a00e9..2946db448aec 100644
--- a/drivers/regulator/wm8994-regulator.c
+++ b/drivers/regulator/wm8994-regulator.c
@@ -227,7 +227,7 @@  static struct platform_driver wm8994_ldo_driver = {
 	.probe = wm8994_ldo_probe,
 	.driver		= {
 		.name	= "wm8994-ldo",
-		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+		.probe_type = PROBE_FORCE_SYNCHRONOUS,
 	},
 };