diff mbox series

[v3,1/2] driver core: Rework logic in __driver_deferred_probe_check_state to allow EPROBE_DEFER to be returned for longer

Message ID 20200218220748.54823-1-john.stultz@linaro.org (mailing list archive)
State Superseded, archived
Headers show
Series [v3,1/2] driver core: Rework logic in __driver_deferred_probe_check_state to allow EPROBE_DEFER to be returned for longer | expand

Commit Message

John Stultz Feb. 18, 2020, 10:07 p.m. UTC
Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
at the end of initcall"), along with commit 25b4e70dcce9
("driver core: allow stopping deferred probe after init") after
late_initcall, drivers will stop getting EPROBE_DEFER, and
instead see an error causing the driver to fail to load.

That change causes trouble when trying to use many clk drivers
as modules, as the clk modules may not load until much later
after init has started. If a dependent driver loads and gets an
error instead of EPROBE_DEFER, it won't try to reload later when
the dependency is met, and will thus fail to load.

This patch reworks some of the logic in
__driver_deferred_probe_check_state() so that if the
deferred_probe_timeout value is set, we will return EPROBE_DEFER
until that timeout expires, which may be after initcalls_done
is set to true.

Specifically, on db845c, this change (when combined with booting
using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
system, where as without it the display will fail to load.

Cc: Rob Herring <robh@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v2:
* Add calls to driver_deferred_probe_trigger() after the two minute timeout,
  as suggested by Bjorn
* Minor whitespace cleanups
* Switch to 30 second timeout to match what the regulator code is doing as
  suggested by Rob.
v3:
* Rework to reuse existing deferred_probe_timeout value, suggested by Rob
* Dropped Fixes: tags as Rob requested (Not my hill to die on :)
---
 drivers/base/dd.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Comments

Rob Herring Feb. 18, 2020, 10:51 p.m. UTC | #1
On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@linaro.org> wrote:
>
> Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> at the end of initcall"), along with commit 25b4e70dcce9
> ("driver core: allow stopping deferred probe after init") after
> late_initcall, drivers will stop getting EPROBE_DEFER, and
> instead see an error causing the driver to fail to load.
>
> That change causes trouble when trying to use many clk drivers
> as modules, as the clk modules may not load until much later
> after init has started. If a dependent driver loads and gets an
> error instead of EPROBE_DEFER, it won't try to reload later when
> the dependency is met, and will thus fail to load.
>
> This patch reworks some of the logic in
> __driver_deferred_probe_check_state() so that if the
> deferred_probe_timeout value is set, we will return EPROBE_DEFER
> until that timeout expires, which may be after initcalls_done
> is set to true.
>
> Specifically, on db845c, this change (when combined with booting
> using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> system, where as without it the display will fail to load.

I would change the default for deferred_probe_timeout to 30 and then
regulator code can rely on that. Curious, why 30 sec is fine now when
you originally had 2 min? I'd just pick what you think is best. I
doubt Mark had any extensive experiments to come up with 30sec.

> Cc: Rob Herring <robh@kernel.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Kevin Hilman <khilman@kernel.org>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Todd Kjos <tkjos@google.com>
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: Liam Girdwood <lgirdwood@gmail.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-pm@vger.kernel.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
> v2:
> * Add calls to driver_deferred_probe_trigger() after the two minute timeout,
>   as suggested by Bjorn
> * Minor whitespace cleanups
> * Switch to 30 second timeout to match what the regulator code is doing as
>   suggested by Rob.
> v3:
> * Rework to reuse existing deferred_probe_timeout value, suggested by Rob
> * Dropped Fixes: tags as Rob requested (Not my hill to die on :)
> ---
>  drivers/base/dd.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index b25bcab2a26b..9d916a7b56a6 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -237,13 +237,12 @@ __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
>
>  static int __driver_deferred_probe_check_state(struct device *dev)
>  {
> -       if (!initcalls_done)
> -               return -EPROBE_DEFER;
> -
> -       if (!deferred_probe_timeout) {
> +       if (initcalls_done && !deferred_probe_timeout) {
>                 dev_WARN(dev, "deferred probe timeout, ignoring dependency");
>                 return -ETIMEDOUT;
>         }
> +       if (!initcalls_done || deferred_probe_timeout > 0)
> +               return -EPROBE_DEFER;
>
>         return 0;
>  }
> --
> 2.17.1
>
John Stultz Feb. 18, 2020, 11:21 p.m. UTC | #2
On Tue, Feb 18, 2020 at 2:51 PM Rob Herring <robh@kernel.org> wrote:
>
> On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@linaro.org> wrote:
> >
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
> >
> > That change causes trouble when trying to use many clk drivers
> > as modules, as the clk modules may not load until much later
> > after init has started. If a dependent driver loads and gets an
> > error instead of EPROBE_DEFER, it won't try to reload later when
> > the dependency is met, and will thus fail to load.
> >
> > This patch reworks some of the logic in
> > __driver_deferred_probe_check_state() so that if the
> > deferred_probe_timeout value is set, we will return EPROBE_DEFER
> > until that timeout expires, which may be after initcalls_done
> > is set to true.
> >
> > Specifically, on db845c, this change (when combined with booting
> > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > system, where as without it the display will fail to load.
>
> I would change the default for deferred_probe_timeout to 30 and then
> regulator code can rely on that.

That is exactly what I do in the following patch! Let me know if you
have further suggestions for it.

> Curious, why 30 sec is fine now when
> you originally had 2 min? I'd just pick what you think is best. I
> doubt Mark had any extensive experiments to come up with 30sec.

I had two minutes initially as, due to other delays I still have to
chase, booting all the way to UI on the db845c can sometimes take
almost a minute. So it was just a rough safety estimate. But in v2, I
tested with the 30 second time used by the regulator code, and that
seemed to work ok.

As long as 30 seconds is working well, I'm ok with it for now (and it
can be overrided via boot arg). Though from past experience with
enterprise distros running on servers with tons of disks (which might
take many minutes to initialize), I'd suspect its likely some
environments may need much longer.

thanks
-john
Rob Herring Feb. 18, 2020, 11:53 p.m. UTC | #3
On Tue, Feb 18, 2020 at 5:21 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Tue, Feb 18, 2020 at 2:51 PM Rob Herring <robh@kernel.org> wrote:
> >
> > On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@linaro.org> wrote:
> > >
> > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > at the end of initcall"), along with commit 25b4e70dcce9
> > > ("driver core: allow stopping deferred probe after init") after
> > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > instead see an error causing the driver to fail to load.
> > >
> > > That change causes trouble when trying to use many clk drivers
> > > as modules, as the clk modules may not load until much later
> > > after init has started. If a dependent driver loads and gets an
> > > error instead of EPROBE_DEFER, it won't try to reload later when
> > > the dependency is met, and will thus fail to load.
> > >
> > > This patch reworks some of the logic in
> > > __driver_deferred_probe_check_state() so that if the
> > > deferred_probe_timeout value is set, we will return EPROBE_DEFER
> > > until that timeout expires, which may be after initcalls_done
> > > is set to true.
> > >
> > > Specifically, on db845c, this change (when combined with booting
> > > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > > system, where as without it the display will fail to load.
> >
> > I would change the default for deferred_probe_timeout to 30 and then
> > regulator code can rely on that.
>
> That is exactly what I do in the following patch! Let me know if you
> have further suggestions for it.

Indeed.

Between the above comment and this comment in patch 2:
/* preserve 30 second interval if deferred_probe_timeout=-1 */

...I was confused.

> > Curious, why 30 sec is fine now when
> > you originally had 2 min? I'd just pick what you think is best. I
> > doubt Mark had any extensive experiments to come up with 30sec.
>
> I had two minutes initially as, due to other delays I still have to
> chase, booting all the way to UI on the db845c can sometimes take
> almost a minute. So it was just a rough safety estimate. But in v2, I
> tested with the 30 second time used by the regulator code, and that
> seemed to work ok.
>
> As long as 30 seconds is working well, I'm ok with it for now (and it
> can be overrided via boot arg). Though from past experience with
> enterprise distros running on servers with tons of disks (which might
> take many minutes to initialize), I'd suspect its likely some
> environments may need much longer.

Those systems aren't going to have a pinctrl or clock driver sitting
in an encrypted RAID partition either. :)

Rob
Mark Brown Feb. 18, 2020, 11:57 p.m. UTC | #4
On Tue, Feb 18, 2020 at 05:53:01PM -0600, Rob Herring wrote:
> On Tue, Feb 18, 2020 at 5:21 PM John Stultz <john.stultz@linaro.org> wrote:

> > As long as 30 seconds is working well, I'm ok with it for now (and it
> > can be overrided via boot arg). Though from past experience with
> > enterprise distros running on servers with tons of disks (which might
> > take many minutes to initialize), I'd suspect its likely some
> > environments may need much longer.

> Those systems aren't going to have a pinctrl or clock driver sitting
> in an encrypted RAID partition either. :)

There speaks the voice of hope and optimism.
John Stultz Feb. 18, 2020, 11:58 p.m. UTC | #5
On Tue, Feb 18, 2020 at 3:53 PM Rob Herring <robh@kernel.org> wrote:
>
> On Tue, Feb 18, 2020 at 5:21 PM John Stultz <john.stultz@linaro.org> wrote:
> >
> > On Tue, Feb 18, 2020 at 2:51 PM Rob Herring <robh@kernel.org> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@linaro.org> wrote:
> > > >
> > > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > > at the end of initcall"), along with commit 25b4e70dcce9
> > > > ("driver core: allow stopping deferred probe after init") after
> > > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > > instead see an error causing the driver to fail to load.
> > > >
> > > > That change causes trouble when trying to use many clk drivers
> > > > as modules, as the clk modules may not load until much later
> > > > after init has started. If a dependent driver loads and gets an
> > > > error instead of EPROBE_DEFER, it won't try to reload later when
> > > > the dependency is met, and will thus fail to load.
> > > >
> > > > This patch reworks some of the logic in
> > > > __driver_deferred_probe_check_state() so that if the
> > > > deferred_probe_timeout value is set, we will return EPROBE_DEFER
> > > > until that timeout expires, which may be after initcalls_done
> > > > is set to true.
> > > >
> > > > Specifically, on db845c, this change (when combined with booting
> > > > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > > > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > > > system, where as without it the display will fail to load.
> > >
> > > I would change the default for deferred_probe_timeout to 30 and then
> > > regulator code can rely on that.
> >
> > That is exactly what I do in the following patch! Let me know if you
> > have further suggestions for it.
>
> Indeed.
>
> Between the above comment and this comment in patch 2:
> /* preserve 30 second interval if deferred_probe_timeout=-1 */
>
> ...I was confused.
>

Sorry. I added that out of an abundance of caution to avoid breaking
things if someone booted specifying deferred_probe_timeout=-1 as a
boot argument, since that would cause the regulator timeout to likely
never expire.

> > > Curious, why 30 sec is fine now when
> > > you originally had 2 min? I'd just pick what you think is best. I
> > > doubt Mark had any extensive experiments to come up with 30sec.
> >
> > I had two minutes initially as, due to other delays I still have to
> > chase, booting all the way to UI on the db845c can sometimes take
> > almost a minute. So it was just a rough safety estimate. But in v2, I
> > tested with the 30 second time used by the regulator code, and that
> > seemed to work ok.
> >
> > As long as 30 seconds is working well, I'm ok with it for now (and it
> > can be overrided via boot arg). Though from past experience with
> > enterprise distros running on servers with tons of disks (which might
> > take many minutes to initialize), I'd suspect its likely some
> > environments may need much longer.
>
> Those systems aren't going to have a pinctrl or clock driver sitting
> in an encrypted RAID partition either. :)

Fair enough. Not today.. but it's always only a matter of time between
"that's ridiculous!" and "oh, we need that!" :)

thanks
-john
Mark Brown Feb. 19, 2020, 12:15 a.m. UTC | #6
On Tue, Feb 18, 2020 at 04:51:39PM -0600, Rob Herring wrote:
> On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@linaro.org> wrote:

> > Specifically, on db845c, this change (when combined with booting
> > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > system, where as without it the display will fail to load.

> I would change the default for deferred_probe_timeout to 30 and then
> regulator code can rely on that. Curious, why 30 sec is fine now when
> you originally had 2 min? I'd just pick what you think is best. I
> doubt Mark had any extensive experiments to come up with 30sec.

Sort of - I've spent a bunch of time looking at the sorts of devices
where this is applicable for regulators and 30s is wildly excessive for
the use case.  I didn't specifically measure anything at the time I did
the change though, even longer should work just as well.

That feature in the regulator framework is targetted quite narrowly at
things we really don't want to glitch out during boot if we can avoid it
like the display, people tend to make efforts to ensure that they come
up quickly during boot anyway so we're not expecting to worry about the
full boot time for bigger systems.  The expectation is that most devices
will cope fine with having the power turned off for a period and if the
user can't see it happening then it doesn't *really* matter.
John Stultz Feb. 19, 2020, 12:19 a.m. UTC | #7
On Tue, Feb 18, 2020 at 4:06 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Wednesday, February 19, 2020, John Stultz <john.stultz@linaro.org> wrote:
>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>> index b25bcab2a26b..9d916a7b56a6 100644
>> --- a/drivers/base/dd.c
>> +++ b/drivers/base/dd.c
>> @@ -237,13 +237,12 @@ __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
>>
>>  static int __driver_deferred_probe_check_state(struct device *dev)
>>  {
>> -       if (!initcalls_done)
>> -               return -EPROBE_DEFER;
>
>
> Why to touch this? Can't you simple add a new condition here 'if (deferred_probe_timeout > 0)'... ?

I think that might work. I'll give it a spin later tonight and double check it.

The main thing I wanted to do is fix the logic hole in the current
code where after initcalls_done=true but before deferred_probe_timeout
has expired we just fall through and return 0, which results in an
ENODEV being returned from the calling function.

thanks
-john
John Stultz Feb. 19, 2020, 1:11 a.m. UTC | #8
On Tue, Feb 18, 2020 at 4:19 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Tue, Feb 18, 2020 at 4:06 PM Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > On Wednesday, February 19, 2020, John Stultz <john.stultz@linaro.org> wrote:
> >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> >> index b25bcab2a26b..9d916a7b56a6 100644
> >> --- a/drivers/base/dd.c
> >> +++ b/drivers/base/dd.c
> >> @@ -237,13 +237,12 @@ __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
> >>
> >>  static int __driver_deferred_probe_check_state(struct device *dev)
> >>  {
> >> -       if (!initcalls_done)
> >> -               return -EPROBE_DEFER;
> >
> >
> > Why to touch this? Can't you simple add a new condition here 'if (deferred_probe_timeout > 0)'... ?
>
> I think that might work. I'll give it a spin later tonight and double check it.
>
> The main thing I wanted to do is fix the logic hole in the current
> code where after initcalls_done=true but before deferred_probe_timeout
> has expired we just fall through and return 0, which results in an
> ENODEV being returned from the calling function.


So on IRC Bjorn sort of clarified a point I think Rob was trying to
make on the earlier iteration of this patch, that it seems like
Thierry's patch here:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=62a6bc3a1e4f4ee9ae0076fa295f9af1c3725ce3
*seems* to be trying to address the exact same issue, and maybe we
should just have the genpd code use that instead?

The main question though, is why do we need both?  As mentioned above,
the existing logic in __driver_deferred_probe_check_state() seems
wrong: Until late_initcall it returns EPROBE_DEFER, then after
initcalls_done==true returns 0 (in which case the caller then
translates to ENODEV), until the timeout expires which it then returns
ETIMEDOUT.

I suspect what is really wanted is EPROBE_DEFER -> (0) ENODEV (when
timeout is not set) or EPROBE_DEFER -> ETIMEOUT (when the timeout is
set), instead of the two state transitions it currently makes.

So I still think my patch is needed, but I also suspect a better fix
would be to kill driver_deferred_probe_check_state() and just replace
its usage with driver_deferred_probe_check_state_continue(). Or am I
still missing something?

thanks
-john
Rob Herring Feb. 19, 2020, 2:07 a.m. UTC | #9
On Tue, Feb 18, 2020 at 7:11 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Tue, Feb 18, 2020 at 4:19 PM John Stultz <john.stultz@linaro.org> wrote:
> >
> > On Tue, Feb 18, 2020 at 4:06 PM Andy Shevchenko
> > <andy.shevchenko@gmail.com> wrote:
> > > On Wednesday, February 19, 2020, John Stultz <john.stultz@linaro.org> wrote:
> > >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > >> index b25bcab2a26b..9d916a7b56a6 100644
> > >> --- a/drivers/base/dd.c
> > >> +++ b/drivers/base/dd.c
> > >> @@ -237,13 +237,12 @@ __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
> > >>
> > >>  static int __driver_deferred_probe_check_state(struct device *dev)
> > >>  {
> > >> -       if (!initcalls_done)
> > >> -               return -EPROBE_DEFER;
> > >
> > >
> > > Why to touch this? Can't you simple add a new condition here 'if (deferred_probe_timeout > 0)'... ?
> >
> > I think that might work. I'll give it a spin later tonight and double check it.
> >
> > The main thing I wanted to do is fix the logic hole in the current
> > code where after initcalls_done=true but before deferred_probe_timeout
> > has expired we just fall through and return 0, which results in an
> > ENODEV being returned from the calling function.
>
>
> So on IRC Bjorn sort of clarified a point I think Rob was trying to
> make on the earlier iteration of this patch, that it seems like
> Thierry's patch here:
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=62a6bc3a1e4f4ee9ae0076fa295f9af1c3725ce3
> *seems* to be trying to address the exact same issue, and maybe we
> should just have the genpd code use that instead?

Looking at it some more, I think the change to the pinctrl code there
breaks the case I care about (pinctrl described in DT and no driver on
a system that previously worked without pinctrl). Maybe if I set the
timeout to 0 it will still work (which would be fine).

> The main question though, is why do we need both?  As mentioned above,
> the existing logic in __driver_deferred_probe_check_state() seems
> wrong: Until late_initcall it returns EPROBE_DEFER, then after
> initcalls_done==true returns 0 (in which case the caller then
> translates to ENODEV), until the timeout expires which it then returns
> ETIMEDOUT.
>
> I suspect what is really wanted is EPROBE_DEFER -> (0) ENODEV (when
> timeout is not set) or EPROBE_DEFER -> ETIMEOUT (when the timeout is
> set), instead of the two state transitions it currently makes.

Yes. There's never any reason to return 0. It should be one of 3
errnos. If we're moving to always having a timeout, then maybe ENODEV
isn't even needed. I guess it's a stronger "we're done with init and
there's never going to be another driver" which maybe we should do for
!CONFIG_MODULES.

> So I still think my patch is needed, but I also suspect a better fix
> would be to kill driver_deferred_probe_check_state() and just replace
> its usage with driver_deferred_probe_check_state_continue(). Or am I
> still missing something?

I think those should be merged. They now do almost the same thing.
Only in the timeout==-1 case do they differ.

The original intent was that driver_deferred_probe_check_state()
simply returned what state we're in and the caller would decide what
to do with that. IOW, each caller could implement their own policy
possibly based on other information. Pinctrl factored in a DT hint.
IOMMU relied on everything was built-in.

The one complication which I mentioned already is with consoles. A
timeout (and dependencies in modules) there doesn't work. You have to
probe and register the console before init is done.

Rob
John Stultz Feb. 19, 2020, 4:23 a.m. UTC | #10
On Tue, Feb 18, 2020 at 6:07 PM Rob Herring <robh@kernel.org> wrote:
>
> On Tue, Feb 18, 2020 at 7:11 PM John Stultz <john.stultz@linaro.org> wrote:
> >
> > On Tue, Feb 18, 2020 at 4:19 PM John Stultz <john.stultz@linaro.org> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 4:06 PM Andy Shevchenko
> > > <andy.shevchenko@gmail.com> wrote:
> > > > On Wednesday, February 19, 2020, John Stultz <john.stultz@linaro.org> wrote:
> > > >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > >> index b25bcab2a26b..9d916a7b56a6 100644
> > > >> --- a/drivers/base/dd.c
> > > >> +++ b/drivers/base/dd.c
> > > >> @@ -237,13 +237,12 @@ __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
> > > >>
> > > >>  static int __driver_deferred_probe_check_state(struct device *dev)
> > > >>  {
> > > >> -       if (!initcalls_done)
> > > >> -               return -EPROBE_DEFER;
> > > >
> > > >
> > > > Why to touch this? Can't you simple add a new condition here 'if (deferred_probe_timeout > 0)'... ?
> > >
> > > I think that might work. I'll give it a spin later tonight and double check it.
> > >
> > > The main thing I wanted to do is fix the logic hole in the current
> > > code where after initcalls_done=true but before deferred_probe_timeout
> > > has expired we just fall through and return 0, which results in an
> > > ENODEV being returned from the calling function.
> >
> >
> > So on IRC Bjorn sort of clarified a point I think Rob was trying to
> > make on the earlier iteration of this patch, that it seems like
> > Thierry's patch here:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=62a6bc3a1e4f4ee9ae0076fa295f9af1c3725ce3
> > *seems* to be trying to address the exact same issue, and maybe we
> > should just have the genpd code use that instead?
>
> Looking at it some more, I think the change to the pinctrl code there
> breaks the case I care about (pinctrl described in DT and no driver on
> a system that previously worked without pinctrl). Maybe if I set the
> timeout to 0 it will still work (which would be fine).
>
> > The main question though, is why do we need both?  As mentioned above,
> > the existing logic in __driver_deferred_probe_check_state() seems
> > wrong: Until late_initcall it returns EPROBE_DEFER, then after
> > initcalls_done==true returns 0 (in which case the caller then
> > translates to ENODEV), until the timeout expires which it then returns
> > ETIMEDOUT.
> >
> > I suspect what is really wanted is EPROBE_DEFER -> (0) ENODEV (when
> > timeout is not set) or EPROBE_DEFER -> ETIMEOUT (when the timeout is
> > set), instead of the two state transitions it currently makes.
>
> Yes. There's never any reason to return 0. It should be one of 3
> errnos. If we're moving to always having a timeout, then maybe ENODEV
> isn't even needed. I guess it's a stronger "we're done with init and
> there's never going to be another driver" which maybe we should do for
> !CONFIG_MODULES.
>
> > So I still think my patch is needed, but I also suspect a better fix
> > would be to kill driver_deferred_probe_check_state() and just replace
> > its usage with driver_deferred_probe_check_state_continue(). Or am I
> > still missing something?
>
> I think those should be merged. They now do almost the same thing.
> Only in the timeout==-1 case do they differ.

Well.. almost..  in addition to that different after late_initcall but
before the timeout driver_deferred_probe_check_state() will return
ENODEV, where as driver_deferred_probe_check_state_continue() returns
EPROBE_DEFER until the timeout happens.

> The original intent was that driver_deferred_probe_check_state()
> simply returned what state we're in and the caller would decide what
> to do with that. IOW, each caller could implement their own policy
> possibly based on other information. Pinctrl factored in a DT hint.
> IOMMU relied on everything was built-in.

I'm not super opinionated here, having the subsystem opt in and decide
what to do with the check_state() return value seems sane.  But I
suspect that we can consolidate the two cases down and simplify some
of the logic here.

I'll take another spin on this.

thanks
-john
Saravana Kannan Feb. 20, 2020, 5:27 a.m. UTC | #11
Apologies in advance for replying to this one email but discussing the
points raised in all the other replies. I'm not cc'ed in this thread and
replying to each email individually is a pain.

On Tue, Feb 18, 2020 at 4:07 PM John Stultz wrote:
> Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> at the end of initcall"), along with commit 25b4e70dcce9
> ("driver core: allow stopping deferred probe after init") after
> late_initcall, drivers will stop getting EPROBE_DEFER, and
> instead see an error causing the driver to fail to load.

Both of those patches were the best solution at that point in time. But
the kernel has changed a lot since then. Power domain and IOMMU drivers
can work as modules now. We have of_devlink and sync_state().

So, while a delay might have been the ideal solution back then, I think
we need to work towards removing arbitrary timeouts instead of making
the timeout mechanism more elaborate.

I think driver_deferred_probe_check_state() logic should boiled down
to something like:

int driver_deferred_probe_check_state(struct device *dev)
{
	/* No modules and init done, deferred probes are pointless from
	 * now. */
	if (!defined(CONFIG_MODULES) && initcall_done)
		return -ENODEV;

	/* If modules and of_devlink then you clean want dependencies to
	 * be enforced.
	 */
	if (defined(CONFIG_MODULES) && of_devlink)
		return -EPROBE_DEFER;

	/* Whatever other complexity (including timeouts) we want to
	 * add. Hopefully none - we can discuss in this thread. */
	if (.....)
		return -Exxxx;
	
	/* When in doubt, allow probe deferral. */
	return -EPROBE_DEFER;
}

Rob, for the original use case those two patches were added for, do they need
to support CONFIG_MODULES?

> That change causes trouble when trying to use many clk drivers
> as modules, as the clk modules may not load until much later
> after init has started. If a dependent driver loads and gets an
> error instead of EPROBE_DEFER, it won't try to reload later when
> the dependency is met, and will thus fail to load.

Once we add of_devlink support for power-domains, you won't even hit the
genpd error path if you have of_devlink enabled. I believe in the case
you are testing DB 845c, of_devlink is enabled?

If of_devlink is enabled, the devices depending on the unprobed power
domains would be deferred without even calling the driver's probe()
function.

Adding power-domain support to of_devlink is a 2 line change. I'll send
it out soon.

Also, regulator_init_complete() can be replaced by a sync_state() based
implementation. I have a downstream implementation that works. But it's
definitely not upstream ready. I plan to rewrite it and send it upstream
at some point, but it's fairly straightforward if anyone else want to
implement it. My main point being, we shouldn't have to make the timeout
logic more complex (or even need one) for regulator clean up either.

On Tue, Feb 18, 2020 at 6:07 PM Rob Herring wrote:
> The one complication which I mentioned already is with consoles. A
> timeout (and dependencies in modules) there doesn't work. You have to
> probe and register the console before init is done.

Rob,

I've seen you say this a couple of times before. But I don't think this
is true any more. With of_devlink enabled I've booted hardware countless
times with the console device probing after userspace comes up. The only
limitation for console drivers is that they need to be built-in if they
need to support earlycon. If you don't care to support earlycon (very
useful for bringup debugging), I think the console driver can even be a
module. I don't think even of_devlink needs to be enabled technically if
you load the modules in the right order.

Thanks,
Saravana
Andy Shevchenko Feb. 20, 2020, 10:09 a.m. UTC | #12
On Thu, Feb 20, 2020 at 7:29 AM Saravana Kannan <saravanak@google.com> wrote:
> On Tue, Feb 18, 2020 at 4:07 PM John Stultz wrote:
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
>
> Both of those patches were the best solution at that point in time. But
> the kernel has changed a lot since then. Power domain and IOMMU drivers
> can work as modules now. We have of_devlink and sync_state().
>
> So, while a delay might have been the ideal solution back then, I think
> we need to work towards removing arbitrary timeouts instead of making
> the timeout mechanism more elaborate.
>
> I think driver_deferred_probe_check_state() logic should boiled down
> to something like:

...

> Once we add of_devlink support for power-domains, you won't even hit the
> genpd error path if you have of_devlink enabled. I believe in the case
> you are testing DB 845c, of_devlink is enabled?
>
> If of_devlink is enabled, the devices depending on the unprobed power
> domains would be deferred without even calling the driver's probe()
> function.
>
> Adding power-domain support to of_devlink is a 2 line change. I'll send
> it out soon.

...

Pardon me for not knowing the OF and devlink stuff in particular, but
have you had a chance to test your changes on some (rather complex)
ACPI enabled platforms?
Would it have any complications there?
Rob Herring Feb. 20, 2020, 3:20 p.m. UTC | #13
On Wed, Feb 19, 2020 at 11:27 PM Saravana Kannan <saravanak@google.com> wrote:
>
> Apologies in advance for replying to this one email but discussing the
> points raised in all the other replies. I'm not cc'ed in this thread and
> replying to each email individually is a pain.
>
> On Tue, Feb 18, 2020 at 4:07 PM John Stultz wrote:
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
>
> Both of those patches were the best solution at that point in time. But
> the kernel has changed a lot since then. Power domain and IOMMU drivers
> can work as modules now. We have of_devlink and sync_state().
>
> So, while a delay might have been the ideal solution back then, I think
> we need to work towards removing arbitrary timeouts instead of making
> the timeout mechanism more elaborate.

If you don't have some way to say all the dependencies that can be
resolved have been resolved already, how do you get away from a
timeout? Nothing has changed in that respect.

If a dtb+kernel works, updating just the dtb with new dependencies
should not break things.

> I think driver_deferred_probe_check_state() logic should boiled down
> to something like:
>
> int driver_deferred_probe_check_state(struct device *dev)
> {
>         /* No modules and init done, deferred probes are pointless from
>          * now. */
>         if (!defined(CONFIG_MODULES) && initcall_done)
>                 return -ENODEV;
>
>         /* If modules and of_devlink then you clean want dependencies to
>          * be enforced.
>          */
>         if (defined(CONFIG_MODULES) && of_devlink)
>                 return -EPROBE_DEFER;
>
>         /* Whatever other complexity (including timeouts) we want to
>          * add. Hopefully none - we can discuss in this thread. */
>         if (.....)
>                 return -Exxxx;
>
>         /* When in doubt, allow probe deferral. */
>         return -EPROBE_DEFER;
> }

Mostly makes sense to me. I think using CONFIG_MODULES is good.

However, there is the case in pinctrl that we have a DT flag
'pinctrl-use-default' and we stop deferring at the end of initcalls if
set. With the above, there's no way to detect that. I'm pretty sure
that's broken now with of_devlink and maybe from Thierry's change too.

> Rob, for the original use case those two patches were added for, do they need
> to support CONFIG_MODULES?

At the time since the subsystems involved were not typically modules
so using CONFIG_MODULES didn't really matter. As you said, that's
changed now.

> > That change causes trouble when trying to use many clk drivers
> > as modules, as the clk modules may not load until much later
> > after init has started. If a dependent driver loads and gets an
> > error instead of EPROBE_DEFER, it won't try to reload later when
> > the dependency is met, and will thus fail to load.
>
> Once we add of_devlink support for power-domains, you won't even hit the
> genpd error path if you have of_devlink enabled. I believe in the case
> you are testing DB 845c, of_devlink is enabled?
>
> If of_devlink is enabled, the devices depending on the unprobed power
> domains would be deferred without even calling the driver's probe()
> function.
>
> Adding power-domain support to of_devlink is a 2 line change. I'll send
> it out soon.
>
> Also, regulator_init_complete() can be replaced by a sync_state() based
> implementation. I have a downstream implementation that works. But it's
> definitely not upstream ready. I plan to rewrite it and send it upstream
> at some point, but it's fairly straightforward if anyone else want to
> implement it. My main point being, we shouldn't have to make the timeout
> logic more complex (or even need one) for regulator clean up either.
>
> On Tue, Feb 18, 2020 at 6:07 PM Rob Herring wrote:
> > The one complication which I mentioned already is with consoles. A
> > timeout (and dependencies in modules) there doesn't work. You have to
> > probe and register the console before init is done.
>
> Rob,
>
> I've seen you say this a couple of times before. But I don't think this
> is true any more. With of_devlink enabled I've booted hardware countless
> times with the console device probing after userspace comes up. The only
> limitation for console drivers is that they need to be built-in if they
> need to support earlycon. If you don't care to support earlycon (very
> useful for bringup debugging), I think the console driver can even be a
> module. I don't think even of_devlink needs to be enabled technically if
> you load the modules in the right order.

Every serial driver has to be built-in to enable console support.
That's not because of earlycon. It's been that way for as long as I've
worked on linux. Now of course, a driver could be built-in and still
probe after userspace starts, but in my testing with the timeout that
didn't work. I don't see how of_devlink changes that.

It could depend on what userspace you have. Certainly, booting with
'console=ttyS0 init=/bin/sh' would not work for example. What I
probably tested with was a busybox based rootfs. What are you testing
with? Android?

Rob
Saravana Kannan Feb. 20, 2020, 9:06 p.m. UTC | #14
On Thu, Feb 20, 2020 at 2:09 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Thu, Feb 20, 2020 at 7:29 AM Saravana Kannan <saravanak@google.com> wrote:
> > On Tue, Feb 18, 2020 at 4:07 PM John Stultz wrote:
> > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > at the end of initcall"), along with commit 25b4e70dcce9
> > > ("driver core: allow stopping deferred probe after init") after
> > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > instead see an error causing the driver to fail to load.
> >
> > Both of those patches were the best solution at that point in time. But
> > the kernel has changed a lot since then. Power domain and IOMMU drivers
> > can work as modules now. We have of_devlink and sync_state().
> >
> > So, while a delay might have been the ideal solution back then, I think
> > we need to work towards removing arbitrary timeouts instead of making
> > the timeout mechanism more elaborate.
> >
> > I think driver_deferred_probe_check_state() logic should boiled down
> > to something like:
>
> ...
>
> > Once we add of_devlink support for power-domains, you won't even hit the
> > genpd error path if you have of_devlink enabled. I believe in the case
> > you are testing DB 845c, of_devlink is enabled?
> >
> > If of_devlink is enabled, the devices depending on the unprobed power
> > domains would be deferred without even calling the driver's probe()
> > function.
> >
> > Adding power-domain support to of_devlink is a 2 line change. I'll send
> > it out soon.
>
> ...
>
> Pardon me for not knowing the OF and devlink stuff in particular, but
> have you had a chance to test your changes on some (rather complex)
> ACPI enabled platforms?
> Would it have any complications there?

Hi Andy,

I'm not sure which changes you are referring to. So I'll try to answer
all possibilities :)

If you are referring to the pseudo code proposal, it's mostly
narrowing down the conditions under which the timeout will matter.
1. It's ignoring the timeout stuff if modules are not supported and
all initcall levels are done.
2. If modules are enabled and of_devlink is enabled, it's making sure
the timeout doesn't apply.

(1) seems like a straightforward assumption. (2) is functionally a NOP
for ACPI based platforms. So I don't think we'd have ACPI specific
complications here.

If you were referring to the "of_devlink support for power-domains"
that also won't have any impact on ACPI platforms because right not it
only runs for OF based systems. So again a NOP for ACPI.

Hope that answers your questions.

Thanks,
Saravana
Saravana Kannan Feb. 20, 2020, 11:07 p.m. UTC | #15
On Thu, Feb 20, 2020 at 7:21 AM Rob Herring <robh@kernel.org> wrote:
>
> On Wed, Feb 19, 2020 at 11:27 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > Apologies in advance for replying to this one email but discussing the
> > points raised in all the other replies. I'm not cc'ed in this thread and
> > replying to each email individually is a pain.
> >
> > On Tue, Feb 18, 2020 at 4:07 PM John Stultz wrote:
> > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > at the end of initcall"), along with commit 25b4e70dcce9
> > > ("driver core: allow stopping deferred probe after init") after
> > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > instead see an error causing the driver to fail to load.
> >
> > Both of those patches were the best solution at that point in time. But
> > the kernel has changed a lot since then. Power domain and IOMMU drivers
> > can work as modules now. We have of_devlink and sync_state().
> >
> > So, while a delay might have been the ideal solution back then, I think
> > we need to work towards removing arbitrary timeouts instead of making
> > the timeout mechanism more elaborate.
>
> If you don't have some way to say all the dependencies that can be
> resolved have been resolved already, how do you get away from a
> timeout? Nothing has changed in that respect.

Right, we can't get away from timeout if we need to support
CONFIG_MODULES AND mix-n-matched dtc+kernel versions.

But hopefully we can simplify the timeout logic by reducing the
configurations it needs to support (because other checks take care of
those configurations).

> If a dtb+kernel works, updating just the dtb with new dependencies
> should not break things.

I'm not sold on that policy (I agree newer kernel + old dtb shouldn't
break), but that's a discussion for another time.

>
> > I think driver_deferred_probe_check_state() logic should boiled down
> > to something like:
> >
> > int driver_deferred_probe_check_state(struct device *dev)
> > {
> >         /* No modules and init done, deferred probes are pointless from
> >          * now. */
> >         if (!defined(CONFIG_MODULES) && initcall_done)
> >                 return -ENODEV;
> >
> >         /* If modules and of_devlink then you clean want dependencies to
> >          * be enforced.
> >          */
> >         if (defined(CONFIG_MODULES) && of_devlink)
> >                 return -EPROBE_DEFER;
> >
> >         /* Whatever other complexity (including timeouts) we want to
> >          * add. Hopefully none - we can discuss in this thread. */
> >         if (.....)
> >                 return -Exxxx;
> >
> >         /* When in doubt, allow probe deferral. */
> >         return -EPROBE_DEFER;
> > }
>
> Mostly makes sense to me. I think using CONFIG_MODULES is good.

Good to know. I'll see if John wants to do that. If not, I'll get around to it.

> However, there is the case in pinctrl that we have a DT flag
> 'pinctrl-use-default' and we stop deferring at the end of initcalls if
> set.

Ugh! That code hurts my head! Mainly because the
driver_deferred_probe_check_state[_continue]() function names are so
similar and confusing. And their implementation is also a bit twisty
(like using triple negatives in a sentence). But I also noticed there
is no user of pinctrl-use-default in the upstream kernel. So,
whatever.

> With the above, there's no way to detect that. I'm pretty sure
> that's broken now with of_devlink and maybe from Thierry's change too.

of_devlink doesn't parse pinctrl yet. I have some simple changes
downstream for that which just parses -0, -1, -2 and -3 -- I'll get
around to upstreaming those sometime. However, adding support for
pinctrl-use-default is not hard. But I'd rather not do it unless
someone actually needs to use that along with of_devlink enabled and
asks for it in LKML.

> > Rob, for the original use case those two patches were added for, do they need
> > to support CONFIG_MODULES?
>
> At the time since the subsystems involved were not typically modules
> so using CONFIG_MODULES didn't really matter. As you said, that's
> changed now.
>
> > > That change causes trouble when trying to use many clk drivers
> > > as modules, as the clk modules may not load until much later
> > > after init has started. If a dependent driver loads and gets an
> > > error instead of EPROBE_DEFER, it won't try to reload later when
> > > the dependency is met, and will thus fail to load.
> >
> > Once we add of_devlink support for power-domains, you won't even hit the
> > genpd error path if you have of_devlink enabled. I believe in the case
> > you are testing DB 845c, of_devlink is enabled?
> >
> > If of_devlink is enabled, the devices depending on the unprobed power
> > domains would be deferred without even calling the driver's probe()
> > function.
> >
> > Adding power-domain support to of_devlink is a 2 line change. I'll send
> > it out soon.
> >
> > Also, regulator_init_complete() can be replaced by a sync_state() based
> > implementation. I have a downstream implementation that works. But it's
> > definitely not upstream ready. I plan to rewrite it and send it upstream
> > at some point, but it's fairly straightforward if anyone else want to
> > implement it. My main point being, we shouldn't have to make the timeout
> > logic more complex (or even need one) for regulator clean up either.
> >
> > On Tue, Feb 18, 2020 at 6:07 PM Rob Herring wrote:
> > > The one complication which I mentioned already is with consoles. A
> > > timeout (and dependencies in modules) there doesn't work. You have to
> > > probe and register the console before init is done.
> >
> > Rob,
> >
> > I've seen you say this a couple of times before. But I don't think this
> > is true any more. With of_devlink enabled I've booted hardware countless
> > times with the console device probing after userspace comes up. The only
> > limitation for console drivers is that they need to be built-in if they
> > need to support earlycon. If you don't care to support earlycon (very
> > useful for bringup debugging), I think the console driver can even be a
> > module. I don't think even of_devlink needs to be enabled technically if
> > you load the modules in the right order.
>
> Every serial driver has to be built-in to enable console support.
> That's not because of earlycon. It's been that way for as long as I've
> worked on linux. Now of course, a driver could be built-in and still
> probe after userspace starts, but in my testing with the timeout that
> didn't work.

>I don't see how of_devlink changes that.

of_devlink sometimes helps because it avoids running probe() functions
with poor error handling (Eg: returning -Esomethingelse when they
should return -EPROBE_DEFER). Also, I did say "I don't think even
of_devlink needs to be enabled..."

> It could depend on what userspace you have.

Right, and because of that I think we should say the following going forward:
"Every serial driver has to be built-in to enable console support, but
it can still probe after userspace starts"

Instead of:
"You have to probe and register the console before init is done".

Because the former statement gives a clearer picture and doesn't
discourage people from trying to modularize their platforms more
thoroughly :)

> Certainly, booting with 'console=ttyS0 init=/bin/sh' would not work for example.
> What I probably tested with was a busybox based rootfs. What are you testing
> with? Android?

Yeah, with Android. Happen to know why busybox (or whatever other
shell) might not work? Is it because they exit immediately if console
device is not available instead of continuing to run the scripts?

-Saravana
diff mbox series

Patch

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index b25bcab2a26b..9d916a7b56a6 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -237,13 +237,12 @@  __setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
 
 static int __driver_deferred_probe_check_state(struct device *dev)
 {
-	if (!initcalls_done)
-		return -EPROBE_DEFER;
-
-	if (!deferred_probe_timeout) {
+	if (initcalls_done && !deferred_probe_timeout) {
 		dev_WARN(dev, "deferred probe timeout, ignoring dependency");
 		return -ETIMEDOUT;
 	}
+	if (!initcalls_done || deferred_probe_timeout > 0)
+		return -EPROBE_DEFER;
 
 	return 0;
 }