diff mbox series

[v1] driver core: Extend deferred probe timeout on driver registration

Message ID 20220429220933.1350374-1-saravanak@google.com (mailing list archive)
State Handled Elsewhere, archived
Headers show
Series [v1] driver core: Extend deferred probe timeout on driver registration | expand

Commit Message

Saravana Kannan April 29, 2022, 10:09 p.m. UTC
The deferred probe timer that's used for this currently starts at
late_initcall and runs for driver_deferred_probe_timeout seconds. The
assumption being that all available drivers would be loaded and
registered before the timer expires. This means, the
driver_deferred_probe_timeout has to be pretty large for it to cover the
worst case. But if we set the default value for it to cover the worst
case, it would significantly slow down the average case. For this
reason, the default value is set to 0.

Also, with CONFIG_MODULES=y and the current default values of
driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
drivers will cause their consumer devices to always defer their probes.
This is because device links created by fw_devlink defer the probe even
before the consumer driver's probe() is called.

Instead of a fixed timeout, if we extend an unexpired deferred probe
timer on every successful driver registration, with the expectation more
modules would be loaded in the near future, then the default value of
driver_deferred_probe_timeout only needs to be as long as the worst case
time difference between two consecutive module loads.

So let's implement that and set the default value to 10 seconds when
CONFIG_MODULES=y.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 .../admin-guide/kernel-parameters.txt         |  6 ++++--
 drivers/base/base.h                           |  1 +
 drivers/base/dd.c                             | 19 +++++++++++++++++++
 drivers/base/driver.c                         |  1 +
 4 files changed, 25 insertions(+), 2 deletions(-)

Comments

Mark Brown May 13, 2022, 12:48 p.m. UTC | #1
On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.

This makes sense to me.

Reviewed-by: Mark Brown <broonie@kernel.org>
Rob Herring May 13, 2022, 1:58 p.m. UTC | #2
On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
>
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.
>
> Also, with CONFIG_MODULES=y and the current default values of
> driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> drivers will cause their consumer devices to always defer their probes.
> This is because device links created by fw_devlink defer the probe even
> before the consumer driver's probe() is called.
>
> Instead of a fixed timeout, if we extend an unexpired deferred probe
> timer on every successful driver registration, with the expectation more
> modules would be loaded in the near future, then the default value of
> driver_deferred_probe_timeout only needs to be as long as the worst case
> time difference between two consecutive module loads.
>
> So let's implement that and set the default value to 10 seconds when
> CONFIG_MODULES=y.

We had to revert a non-zero timeout before (issue with NFS root IIRC).
Does fw_devlink=on somehow fix that?

Rob
Saravana Kannan May 13, 2022, 5:25 p.m. UTC | #3
On Fri, May 13, 2022 at 6:58 AM Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > The deferred probe timer that's used for this currently starts at
> > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > assumption being that all available drivers would be loaded and
> > registered before the timer expires. This means, the
> > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > worst case. But if we set the default value for it to cover the worst
> > case, it would significantly slow down the average case. For this
> > reason, the default value is set to 0.
> >
> > Also, with CONFIG_MODULES=y and the current default values of
> > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > drivers will cause their consumer devices to always defer their probes.
> > This is because device links created by fw_devlink defer the probe even
> > before the consumer driver's probe() is called.
> >
> > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > timer on every successful driver registration, with the expectation more
> > modules would be loaded in the near future, then the default value of
> > driver_deferred_probe_timeout only needs to be as long as the worst case
> > time difference between two consecutive module loads.
> >
> > So let's implement that and set the default value to 10 seconds when
> > CONFIG_MODULES=y.
>
> We had to revert a non-zero timeout before (issue with NFS root IIRC).
> Does fw_devlink=on somehow fix that?

If it's the one where ip autoconfig was timing out, then John Stultz
fixed it by fixing wait_for_device_probe().
https://lore.kernel.org/all/20200422203245.83244-4-john.stultz@linaro.org/

If you are referring to some other issue, then I'd need more details.

-Saravana
Rob Herring May 16, 2022, 1:49 p.m. UTC | #4
On Fri, May 13, 2022 at 12:26 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Fri, May 13, 2022 at 6:58 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> >
> > We had to revert a non-zero timeout before (issue with NFS root IIRC).
> > Does fw_devlink=on somehow fix that?
>
> If it's the one where ip autoconfig was timing out, then John Stultz
> fixed it by fixing wait_for_device_probe().
> https://lore.kernel.org/all/20200422203245.83244-4-john.stultz@linaro.org/

Yeah, that was it.

Acked-by: Rob Herring <robh@kernel.org>
Nathan Chancellor May 20, 2022, 11:30 p.m. UTC | #5
Hi Saravana,

On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.
> 
> Also, with CONFIG_MODULES=y and the current default values of
> driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> drivers will cause their consumer devices to always defer their probes.
> This is because device links created by fw_devlink defer the probe even
> before the consumer driver's probe() is called.
> 
> Instead of a fixed timeout, if we extend an unexpired deferred probe
> timer on every successful driver registration, with the expectation more
> modules would be loaded in the near future, then the default value of
> driver_deferred_probe_timeout only needs to be as long as the worst case
> time difference between two consecutive module loads.
> 
> So let's implement that and set the default value to 10 seconds when
> CONFIG_MODULES=y.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Kevin Hilman <khilman@kernel.org>
> Cc: Thierry Reding <treding@nvidia.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> Cc: linux-gpio@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Signed-off-by: Saravana Kannan <saravanak@google.com>

I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
timeout on driver registration") in next-20220520 (bisect log below).

$ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage

$ timeout --foreground 15m stdbuf -oL -eL \
qemu-system-s390x \
-initrd ... \
-M s390-ccw-virtio \
-display none \
-kernel arch/s390/boot/bzImage \
-m 512m \
-nodefaults \
-serial mon:stdio
...
[    2.077303] In-situ OAM (IOAM) with IPv6
[    2.077639] NET: Registered PF_PACKET protocol family
[    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    2.078795] Key type dns_resolver registered
[    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
[    2.081494] Discipline DIAG cannot be used without z/VM
[  260.626363] random: crng init done
qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)

We have a simple rootfs available if necessary:

https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst

If there is any other information I can provide, please let me know!

Cheers,
Nathan

# bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific files for 20220520
# good: [b015dcd62b86d298829990f8261d5d154b8d7af5] Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect start '18ecd30af1a8402c162cca1bd58771c0e5be7815' 'b015dcd62b86d298829990f8261d5d154b8d7af5'
# good: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect good f9b63740b666dd9887eb0282d21b5f65bb0cadd0
# good: [1f5eb3e76303572f0318e8c50da51c516580aa03] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git bisect good 1f5eb3e76303572f0318e8c50da51c516580aa03
# bad: [4c1d9cc0363691893ef94fa0d798faca013e27d3] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
git bisect bad 4c1d9cc0363691893ef94fa0d798faca013e27d3
# bad: [dcb68304485c0ba5f84f1a54687c751b68263d93] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
git bisect bad dcb68304485c0ba5f84f1a54687c751b68263d93
# good: [61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect good 61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f
# good: [d4db45a71f56032b552e161968bb0e5fdd2767f8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
git bisect good d4db45a71f56032b552e161968bb0e5fdd2767f8
# good: [d090c7a2ab84663185e4abda21d7d83880937c8a] USB / dwc3: Fix a checkpatch warning in core.c
git bisect good d090c7a2ab84663185e4abda21d7d83880937c8a
# bad: [b232b02bf3c205b13a26dcec08e53baddd8e59ed] driver core: fix deadlock in __device_attach
git bisect bad b232b02bf3c205b13a26dcec08e53baddd8e59ed
# good: [4c32174a24759d5ac6dc42b508fcec2afb8b9602] Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
git bisect good 4c32174a24759d5ac6dc42b508fcec2afb8b9602
# good: [38ea74eb8fc1b82b39e13a6527095a0036539117] rpmsg: use local 'dev' variable
git bisect good 38ea74eb8fc1b82b39e13a6527095a0036539117
# good: [1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57] driver core: location: Add "back" as a possible output for panel
git bisect good 1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57
# good: [6ee60e9c9f2f83ad218159af6a175c57a395ae69] MAINTAINERS: add Russ Weight as a firmware loader maintainer
git bisect good 6ee60e9c9f2f83ad218159af6a175c57a395ae69
# bad: [15f214f9bdb7c1f560b4bf863c5a72ff53b442a4] topology: Remove unused cpu_cluster_mask()
git bisect bad 15f214f9bdb7c1f560b4bf863c5a72ff53b442a4
# bad: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
git bisect bad 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
# first bad commit: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
Saravana Kannan May 20, 2022, 11:49 p.m. UTC | #6
On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Saravana,
>
> On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > The deferred probe timer that's used for this currently starts at
> > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > assumption being that all available drivers would be loaded and
> > registered before the timer expires. This means, the
> > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > worst case. But if we set the default value for it to cover the worst
> > case, it would significantly slow down the average case. For this
> > reason, the default value is set to 0.
> >
> > Also, with CONFIG_MODULES=y and the current default values of
> > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > drivers will cause their consumer devices to always defer their probes.
> > This is because device links created by fw_devlink defer the probe even
> > before the consumer driver's probe() is called.
> >
> > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > timer on every successful driver registration, with the expectation more
> > modules would be loaded in the near future, then the default value of
> > driver_deferred_probe_timeout only needs to be as long as the worst case
> > time difference between two consecutive module loads.
> >
> > So let's implement that and set the default value to 10 seconds when
> > CONFIG_MODULES=y.
> >
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Linus Walleij <linus.walleij@linaro.org>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > Cc: Kevin Hilman <khilman@kernel.org>
> > Cc: Thierry Reding <treding@nvidia.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: Pavel Machek <pavel@ucw.cz>
> > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > Cc: linux-gpio@vger.kernel.org
> > Cc: linux-pm@vger.kernel.org
> > Cc: iommu@lists.linux-foundation.org
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> timeout on driver registration") in next-20220520 (bisect log below).
>
> $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
>
> $ timeout --foreground 15m stdbuf -oL -eL \
> qemu-system-s390x \
> -initrd ... \
> -M s390-ccw-virtio \
> -display none \
> -kernel arch/s390/boot/bzImage \
> -m 512m \
> -nodefaults \
> -serial mon:stdio
> ...
> [    2.077303] In-situ OAM (IOAM) with IPv6
> [    2.077639] NET: Registered PF_PACKET protocol family
> [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> [    2.078795] Key type dns_resolver registered
> [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> [    2.081494] Discipline DIAG cannot be used without z/VM
> [  260.626363] random: crng init done
> qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
>
> We have a simple rootfs available if necessary:
>
> https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
>
> If there is any other information I can provide, please let me know!

Hmm... strange. Can you please try the following command line options
and tell me which of these has the issue and which don't?
1) deferred_probe_timeout=0
2) deferred_probe_timeout=1
3) deferred_probe_timeout=300

That should help me narrow down where the error might be.

Thanks,
Saravana

>
> Cheers,
> Nathan
>
> # bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific files for 20220520
> # good: [b015dcd62b86d298829990f8261d5d154b8d7af5] Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> git bisect start '18ecd30af1a8402c162cca1bd58771c0e5be7815' 'b015dcd62b86d298829990f8261d5d154b8d7af5'
> # good: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
> git bisect good f9b63740b666dd9887eb0282d21b5f65bb0cadd0
> # good: [1f5eb3e76303572f0318e8c50da51c516580aa03] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> git bisect good 1f5eb3e76303572f0318e8c50da51c516580aa03
> # bad: [4c1d9cc0363691893ef94fa0d798faca013e27d3] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
> git bisect bad 4c1d9cc0363691893ef94fa0d798faca013e27d3
> # bad: [dcb68304485c0ba5f84f1a54687c751b68263d93] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
> git bisect bad dcb68304485c0ba5f84f1a54687c751b68263d93
> # good: [61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> git bisect good 61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f
> # good: [d4db45a71f56032b552e161968bb0e5fdd2767f8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
> git bisect good d4db45a71f56032b552e161968bb0e5fdd2767f8
> # good: [d090c7a2ab84663185e4abda21d7d83880937c8a] USB / dwc3: Fix a checkpatch warning in core.c
> git bisect good d090c7a2ab84663185e4abda21d7d83880937c8a
> # bad: [b232b02bf3c205b13a26dcec08e53baddd8e59ed] driver core: fix deadlock in __device_attach
> git bisect bad b232b02bf3c205b13a26dcec08e53baddd8e59ed
> # good: [4c32174a24759d5ac6dc42b508fcec2afb8b9602] Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
> git bisect good 4c32174a24759d5ac6dc42b508fcec2afb8b9602
> # good: [38ea74eb8fc1b82b39e13a6527095a0036539117] rpmsg: use local 'dev' variable
> git bisect good 38ea74eb8fc1b82b39e13a6527095a0036539117
> # good: [1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57] driver core: location: Add "back" as a possible output for panel
> git bisect good 1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57
> # good: [6ee60e9c9f2f83ad218159af6a175c57a395ae69] MAINTAINERS: add Russ Weight as a firmware loader maintainer
> git bisect good 6ee60e9c9f2f83ad218159af6a175c57a395ae69
> # bad: [15f214f9bdb7c1f560b4bf863c5a72ff53b442a4] topology: Remove unused cpu_cluster_mask()
> git bisect bad 15f214f9bdb7c1f560b4bf863c5a72ff53b442a4
> # bad: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
> git bisect bad 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
> # first bad commit: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
Nathan Chancellor May 21, 2022, 12:04 a.m. UTC | #7
On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > Hi Saravana,
> >
> > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> > >
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > Cc: Rob Herring <robh@kernel.org>
> > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > Cc: Kevin Hilman <khilman@kernel.org>
> > > Cc: Thierry Reding <treding@nvidia.com>
> > > Cc: Mark Brown <broonie@kernel.org>
> > > Cc: Pavel Machek <pavel@ucw.cz>
> > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > Cc: linux-gpio@vger.kernel.org
> > > Cc: linux-pm@vger.kernel.org
> > > Cc: iommu@lists.linux-foundation.org
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> >
> > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > timeout on driver registration") in next-20220520 (bisect log below).
> >
> > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> >
> > $ timeout --foreground 15m stdbuf -oL -eL \
> > qemu-system-s390x \
> > -initrd ... \
> > -M s390-ccw-virtio \
> > -display none \
> > -kernel arch/s390/boot/bzImage \
> > -m 512m \
> > -nodefaults \
> > -serial mon:stdio
> > ...
> > [    2.077303] In-situ OAM (IOAM) with IPv6
> > [    2.077639] NET: Registered PF_PACKET protocol family
> > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > [    2.078795] Key type dns_resolver registered
> > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > [    2.081494] Discipline DIAG cannot be used without z/VM
> > [  260.626363] random: crng init done
> > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> >
> > We have a simple rootfs available if necessary:
> >
> > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> >
> > If there is any other information I can provide, please let me know!
> 
> Hmm... strange. Can you please try the following command line options
> and tell me which of these has the issue and which don't?

Sure thing!

> 1) deferred_probe_timeout=0

No issue.

> 2) deferred_probe_timeout=1
> 3) deferred_probe_timeout=300

Both of these appear to hang in the same way, I let each sit for five
minutes.

Cheers,
Nathan
Saravana Kannan May 21, 2022, 12:15 a.m. UTC | #8
On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > Hi Saravana,
> > >
> > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > The deferred probe timer that's used for this currently starts at
> > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > assumption being that all available drivers would be loaded and
> > > > registered before the timer expires. This means, the
> > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > worst case. But if we set the default value for it to cover the worst
> > > > case, it would significantly slow down the average case. For this
> > > > reason, the default value is set to 0.
> > > >
> > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > drivers will cause their consumer devices to always defer their probes.
> > > > This is because device links created by fw_devlink defer the probe even
> > > > before the consumer driver's probe() is called.
> > > >
> > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > timer on every successful driver registration, with the expectation more
> > > > modules would be loaded in the near future, then the default value of
> > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > time difference between two consecutive module loads.
> > > >
> > > > So let's implement that and set the default value to 10 seconds when
> > > > CONFIG_MODULES=y.
> > > >
> > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > Cc: Rob Herring <robh@kernel.org>
> > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > Cc: Mark Brown <broonie@kernel.org>
> > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > Cc: linux-gpio@vger.kernel.org
> > > > Cc: linux-pm@vger.kernel.org
> > > > Cc: iommu@lists.linux-foundation.org
> > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > >
> > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > timeout on driver registration") in next-20220520 (bisect log below).
> > >
> > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > >
> > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > qemu-system-s390x \
> > > -initrd ... \
> > > -M s390-ccw-virtio \
> > > -display none \
> > > -kernel arch/s390/boot/bzImage \
> > > -m 512m \
> > > -nodefaults \
> > > -serial mon:stdio
> > > ...
> > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > [    2.078795] Key type dns_resolver registered
> > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > [  260.626363] random: crng init done
> > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > >
> > > We have a simple rootfs available if necessary:
> > >
> > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > >
> > > If there is any other information I can provide, please let me know!
> >
> > Hmm... strange. Can you please try the following command line options
> > and tell me which of these has the issue and which don't?
>
> Sure thing!
>
> > 1) deferred_probe_timeout=0
>
> No issue.
>
> > 2) deferred_probe_timeout=1
> > 3) deferred_probe_timeout=300
>
> Both of these appear to hang in the same way, I let each sit for five
> minutes.

Strange that a sufficiently large timeout isn't helping. Is it trying
to boot off a network mount? I'll continue looking into this next
week.

-Saravana

>
> Cheers,
> Nathan
Nathan Chancellor May 23, 2022, 3:17 p.m. UTC | #9
On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > The deferred probe timer that's used for this currently starts at
> > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > assumption being that all available drivers would be loaded and
> > > > > registered before the timer expires. This means, the
> > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > worst case. But if we set the default value for it to cover the worst
> > > > > case, it would significantly slow down the average case. For this
> > > > > reason, the default value is set to 0.
> > > > >
> > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > This is because device links created by fw_devlink defer the probe even
> > > > > before the consumer driver's probe() is called.
> > > > >
> > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > timer on every successful driver registration, with the expectation more
> > > > > modules would be loaded in the near future, then the default value of
> > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > time difference between two consecutive module loads.
> > > > >
> > > > > So let's implement that and set the default value to 10 seconds when
> > > > > CONFIG_MODULES=y.
> > > > >
> > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > Cc: Will Deacon <will@kernel.org>
> > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > Cc: linux-gpio@vger.kernel.org
> > > > > Cc: linux-pm@vger.kernel.org
> > > > > Cc: iommu@lists.linux-foundation.org
> > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > >
> > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > >
> > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > >
> > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > qemu-system-s390x \
> > > > -initrd ... \
> > > > -M s390-ccw-virtio \
> > > > -display none \
> > > > -kernel arch/s390/boot/bzImage \
> > > > -m 512m \
> > > > -nodefaults \
> > > > -serial mon:stdio
> > > > ...
> > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > [    2.078795] Key type dns_resolver registered
> > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > [  260.626363] random: crng init done
> > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > >
> > > > We have a simple rootfs available if necessary:
> > > >
> > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > >
> > > > If there is any other information I can provide, please let me know!
> > >
> > > Hmm... strange. Can you please try the following command line options
> > > and tell me which of these has the issue and which don't?
> >
> > Sure thing!
> >
> > > 1) deferred_probe_timeout=0
> >
> > No issue.
> >
> > > 2) deferred_probe_timeout=1
> > > 3) deferred_probe_timeout=300
> >
> > Both of these appear to hang in the same way, I let each sit for five
> > minutes.
> 
> Strange that a sufficiently large timeout isn't helping. Is it trying
> to boot off a network mount? I'll continue looking into this next
> week.

I don't think so, it seems like doing that requires some extra flags
that we do not have:

https://wiki.qemu.org/Features/S390xNetworkBoot

If you need any additional information or want something tested, please
let me know!

Cheers,
Nathan
Saravana Kannan May 23, 2022, 8:04 p.m. UTC | #10
On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > >
> > > > > Hi Saravana,
> > > > >
> > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > assumption being that all available drivers would be loaded and
> > > > > > registered before the timer expires. This means, the
> > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > case, it would significantly slow down the average case. For this
> > > > > > reason, the default value is set to 0.
> > > > > >
> > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > before the consumer driver's probe() is called.
> > > > > >
> > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > timer on every successful driver registration, with the expectation more
> > > > > > modules would be loaded in the near future, then the default value of
> > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > time difference between two consecutive module loads.
> > > > > >
> > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > CONFIG_MODULES=y.
> > > > > >
> > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > >
> > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > >
> > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > >
> > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > qemu-system-s390x \
> > > > > -initrd ... \
> > > > > -M s390-ccw-virtio \
> > > > > -display none \
> > > > > -kernel arch/s390/boot/bzImage \
> > > > > -m 512m \
> > > > > -nodefaults \
> > > > > -serial mon:stdio
> > > > > ...
> > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > [    2.078795] Key type dns_resolver registered
> > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > [  260.626363] random: crng init done
> > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > >
> > > > > We have a simple rootfs available if necessary:
> > > > >
> > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > >
> > > > > If there is any other information I can provide, please let me know!
> > > >
> > > > Hmm... strange. Can you please try the following command line options
> > > > and tell me which of these has the issue and which don't?
> > >
> > > Sure thing!
> > >
> > > > 1) deferred_probe_timeout=0
> > >
> > > No issue.
> > >
> > > > 2) deferred_probe_timeout=1
> > > > 3) deferred_probe_timeout=300
> > >
> > > Both of these appear to hang in the same way, I let each sit for five
> > > minutes.
> >
> > Strange that a sufficiently large timeout isn't helping. Is it trying
> > to boot off a network mount? I'll continue looking into this next
> > week.
>
> I don't think so, it seems like doing that requires some extra flags
> that we do not have:
>
> https://wiki.qemu.org/Features/S390xNetworkBoot
>
> If you need any additional information or want something tested, please
> let me know!

I'll try to get qemu going on my end, but I'm not too confident I'll
be able to get to it in a timely fashion. So if you can help figure
out where this boot process is hanging, that'd be very much
appreciated.

Couple of suggestions for debugging:

Can you add a log to "wait_for_device_probe()" and see if that's
getting called right before the boot process hangs? If it does, can
you get a stacktrace (I just add a WARN_ON(1) when I need a stack
trace)? It's unlikely this is the case because
deferred_probe_timeout=1 still causes an issue for you, but I'd be
good to rule out.

Let's try to rule out if deferred_probe_extend_timeout() is causing
some issues. So, without my patch, what happens if you set:
deferred_probe_timeout=1
deferred_probe_timeout=300

If deferred_probe_timeout=1 causes an issue even without my patch,
then in addition, can you try commenting out the call to
fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
and try again?

Thanks a lot for reporting and helping debug this issue.

-Saravana
Nathan Chancellor May 23, 2022, 10:14 p.m. UTC | #11
On Mon, May 23, 2022 at 01:04:03PM -0700, Saravana Kannan wrote:
> On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > >
> > > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > > >
> > > > > > Hi Saravana,
> > > > > >
> > > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > > assumption being that all available drivers would be loaded and
> > > > > > > registered before the timer expires. This means, the
> > > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > > case, it would significantly slow down the average case. For this
> > > > > > > reason, the default value is set to 0.
> > > > > > >
> > > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > > before the consumer driver's probe() is called.
> > > > > > >
> > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > > timer on every successful driver registration, with the expectation more
> > > > > > > modules would be loaded in the near future, then the default value of
> > > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > > time difference between two consecutive module loads.
> > > > > > >
> > > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > > CONFIG_MODULES=y.
> > > > > > >
> > > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > >
> > > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > > >
> > > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > > >
> > > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > > qemu-system-s390x \
> > > > > > -initrd ... \
> > > > > > -M s390-ccw-virtio \
> > > > > > -display none \
> > > > > > -kernel arch/s390/boot/bzImage \
> > > > > > -m 512m \
> > > > > > -nodefaults \
> > > > > > -serial mon:stdio
> > > > > > ...
> > > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > > [    2.078795] Key type dns_resolver registered
> > > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > > [  260.626363] random: crng init done
> > > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > > >
> > > > > > We have a simple rootfs available if necessary:
> > > > > >
> > > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > > >
> > > > > > If there is any other information I can provide, please let me know!
> > > > >
> > > > > Hmm... strange. Can you please try the following command line options
> > > > > and tell me which of these has the issue and which don't?
> > > >
> > > > Sure thing!
> > > >
> > > > > 1) deferred_probe_timeout=0
> > > >
> > > > No issue.
> > > >
> > > > > 2) deferred_probe_timeout=1
> > > > > 3) deferred_probe_timeout=300
> > > >
> > > > Both of these appear to hang in the same way, I let each sit for five
> > > > minutes.
> > >
> > > Strange that a sufficiently large timeout isn't helping. Is it trying
> > > to boot off a network mount? I'll continue looking into this next
> > > week.
> >
> > I don't think so, it seems like doing that requires some extra flags
> > that we do not have:
> >
> > https://wiki.qemu.org/Features/S390xNetworkBoot
> >
> > If you need any additional information or want something tested, please
> > let me know!
> 
> I'll try to get qemu going on my end, but I'm not too confident I'll
> be able to get to it in a timely fashion. So if you can help figure
> out where this boot process is hanging, that'd be very much
> appreciated.

Sure thing! Information included below, I am more than happy to continue
to test and debug as you need.

> Couple of suggestions for debugging:
> 
> Can you add a log to "wait_for_device_probe()" and see if that's
> getting called right before the boot process hangs? If it does, can
> you get a stacktrace (I just add a WARN_ON(1) when I need a stack
> trace)? It's unlikely this is the case because
> deferred_probe_timeout=1 still causes an issue for you, but I'd be
> good to rule out.

If I add a pr_info() call at the top of wait_for_device_probe(), I see
it right before the process hangs. Adding WARN_ON(1) right below that
reveals dasd_eckd_init() in drivers/s390/block/dasd_eckd.c calls
wait_for_device_probe():

[    4.610397] ------------[ cut here ]------------
[    4.610520] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x28/0x110
[    4.611134] Modules linked in:
[    4.611593] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220523-dirty #1
[    4.611830] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    4.612017] Krnl PSW : 0704c00180000000 0000000000ce4b3c (wait_for_device_probe+0x2c/0x110)
[    4.612258]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[    4.612387] Krnl GPRS: 80000000fffff071 0000000000000027 000000000000000c 00000000017f91d8
[    4.612457]            00000000fffff071 00000000017f9218 0000000001a655a0 0000000000000006
[    4.612521]            0000000000000002 0000000001965810 00000000019d51a0 0000000000000000
[    4.612585]            0000000002218000 000000000125bcc8 0000000000ce4b38 000003800000bc80
[    4.614814] Krnl Code: 0000000000ce4b2c: e3e0f0980024        stg     %r14,152(%r15)
[    4.614814]            0000000000ce4b32: c0e5ffff94cb        brasl   %r14,0000000000cd74c8
[    4.614814]           #0000000000ce4b38: af000000            mc      0,0
[    4.614814]           >0000000000ce4b3c: c0100054d1fa        larl    %r1,000000000177ef30
[    4.614814]            0000000000ce4b42: e31010000012        lt      %r1,0(%r1)
[    4.614814]            0000000000ce4b48: a784002d            brc     8,0000000000ce4ba2
[    4.614814]            0000000000ce4b4c: d727f0a0f0a0        xc      160(40,%r15),160(%r15)
[    4.614814]            0000000000ce4b52: 41b0f0a0            la      %r11,160(%r15)
[    4.615698] Call Trace:
[    4.616559]  [<0000000000ce4b3c>] wait_for_device_probe+0x2c/0x110
[    4.616744] ([<0000000000ce4b38>] wait_for_device_probe+0x28/0x110)
[    4.616841]  [<000000000196593e>] dasd_eckd_init+0x12e/0x178
[    4.616913]  [<0000000000100936>] do_one_initcall+0x46/0x1e8
[    4.616983]  [<0000000001920706>] do_initcalls+0x126/0x150
[    4.617046]  [<000000000192095e>] kernel_init_freeable+0x1ae/0x1f0
[    4.617110]  [<0000000000ce85a6>] kernel_init+0x2e/0x168
[    4.617171]  [<0000000000103320>] __ret_from_fork+0x40/0x58
[    4.617233]  [<0000000000cf5eaa>] ret_from_fork+0xa/0x40
[    4.617352] Last Breaking-Event-Address:
[    4.617393]  [<0000000000e0e098>] __s390_indirect_jump_r14+0x0/0xc
[    4.617481] ---[ end trace 0000000000000000 ]---

> Let's try to rule out if deferred_probe_extend_timeout() is causing
> some issues. So, without my patch, what happens if you set:
> deferred_probe_timeout=1
> deferred_probe_timeout=300

At commit 6ee60e9c9f2f ("MAINTAINERS: add Russ Weight as a firmware
loader maintainer"), both deferred_probe_timeout=1 and
deferred_probe_timeout=300 hang the boot.

> If deferred_probe_timeout=1 causes an issue even without my patch,
> then in addition, can you try commenting out the call to
> fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
> and try again?

Sure, that does not appear to make a difference with
deferred_probe_timeout=1.

Cheers,
Nathan
Saravana Kannan May 24, 2022, 3:43 a.m. UTC | #12
On Mon, May 23, 2022 at 3:14 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Mon, May 23, 2022 at 01:04:03PM -0700, Saravana Kannan wrote:
> > On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > > > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > >
> > > > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi Saravana,
> > > > > > >
> > > > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > > > assumption being that all available drivers would be loaded and
> > > > > > > > registered before the timer expires. This means, the
> > > > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > > > case, it would significantly slow down the average case. For this
> > > > > > > > reason, the default value is set to 0.
> > > > > > > >
> > > > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > > > before the consumer driver's probe() is called.
> > > > > > > >
> > > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > > > timer on every successful driver registration, with the expectation more
> > > > > > > > modules would be loaded in the near future, then the default value of
> > > > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > > > time difference between two consecutive module loads.
> > > > > > > >
> > > > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > > > CONFIG_MODULES=y.
> > > > > > > >
> > > > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > > >
> > > > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > > > >
> > > > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > > > >
> > > > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > > > qemu-system-s390x \
> > > > > > > -initrd ... \
> > > > > > > -M s390-ccw-virtio \
> > > > > > > -display none \
> > > > > > > -kernel arch/s390/boot/bzImage \
> > > > > > > -m 512m \
> > > > > > > -nodefaults \
> > > > > > > -serial mon:stdio
> > > > > > > ...
> > > > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > > > [    2.078795] Key type dns_resolver registered
> > > > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > > > [  260.626363] random: crng init done
> > > > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > > > >
> > > > > > > We have a simple rootfs available if necessary:
> > > > > > >
> > > > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > > > >
> > > > > > > If there is any other information I can provide, please let me know!
> > > > > >
> > > > > > Hmm... strange. Can you please try the following command line options
> > > > > > and tell me which of these has the issue and which don't?
> > > > >
> > > > > Sure thing!
> > > > >
> > > > > > 1) deferred_probe_timeout=0
> > > > >
> > > > > No issue.
> > > > >
> > > > > > 2) deferred_probe_timeout=1
> > > > > > 3) deferred_probe_timeout=300
> > > > >
> > > > > Both of these appear to hang in the same way, I let each sit for five
> > > > > minutes.
> > > >
> > > > Strange that a sufficiently large timeout isn't helping. Is it trying
> > > > to boot off a network mount? I'll continue looking into this next
> > > > week.
> > >
> > > I don't think so, it seems like doing that requires some extra flags
> > > that we do not have:
> > >
> > > https://wiki.qemu.org/Features/S390xNetworkBoot
> > >
> > > If you need any additional information or want something tested, please
> > > let me know!
> >
> > I'll try to get qemu going on my end, but I'm not too confident I'll
> > be able to get to it in a timely fashion. So if you can help figure
> > out where this boot process is hanging, that'd be very much
> > appreciated.
>
> Sure thing! Information included below, I am more than happy to continue
> to test and debug as you need.
>
> > Couple of suggestions for debugging:
> >
> > Can you add a log to "wait_for_device_probe()" and see if that's
> > getting called right before the boot process hangs? If it does, can
> > you get a stacktrace (I just add a WARN_ON(1) when I need a stack
> > trace)? It's unlikely this is the case because
> > deferred_probe_timeout=1 still causes an issue for you, but I'd be
> > good to rule out.
>
> If I add a pr_info() call at the top of wait_for_device_probe(), I see
> it right before the process hangs. Adding WARN_ON(1) right below that
> reveals dasd_eckd_init() in drivers/s390/block/dasd_eckd.c calls
> wait_for_device_probe():
>
> [    4.610397] ------------[ cut here ]------------
> [    4.610520] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x28/0x110
> [    4.611134] Modules linked in:
> [    4.611593] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220523-dirty #1
> [    4.611830] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    4.612017] Krnl PSW : 0704c00180000000 0000000000ce4b3c (wait_for_device_probe+0x2c/0x110)
> [    4.612258]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> [    4.612387] Krnl GPRS: 80000000fffff071 0000000000000027 000000000000000c 00000000017f91d8
> [    4.612457]            00000000fffff071 00000000017f9218 0000000001a655a0 0000000000000006
> [    4.612521]            0000000000000002 0000000001965810 00000000019d51a0 0000000000000000
> [    4.612585]            0000000002218000 000000000125bcc8 0000000000ce4b38 000003800000bc80
> [    4.614814] Krnl Code: 0000000000ce4b2c: e3e0f0980024        stg     %r14,152(%r15)
> [    4.614814]            0000000000ce4b32: c0e5ffff94cb        brasl   %r14,0000000000cd74c8
> [    4.614814]           #0000000000ce4b38: af000000            mc      0,0
> [    4.614814]           >0000000000ce4b3c: c0100054d1fa        larl    %r1,000000000177ef30
> [    4.614814]            0000000000ce4b42: e31010000012        lt      %r1,0(%r1)
> [    4.614814]            0000000000ce4b48: a784002d            brc     8,0000000000ce4ba2
> [    4.614814]            0000000000ce4b4c: d727f0a0f0a0        xc      160(40,%r15),160(%r15)
> [    4.614814]            0000000000ce4b52: 41b0f0a0            la      %r11,160(%r15)
> [    4.615698] Call Trace:
> [    4.616559]  [<0000000000ce4b3c>] wait_for_device_probe+0x2c/0x110
> [    4.616744] ([<0000000000ce4b38>] wait_for_device_probe+0x28/0x110)
> [    4.616841]  [<000000000196593e>] dasd_eckd_init+0x12e/0x178
> [    4.616913]  [<0000000000100936>] do_one_initcall+0x46/0x1e8
> [    4.616983]  [<0000000001920706>] do_initcalls+0x126/0x150
> [    4.617046]  [<000000000192095e>] kernel_init_freeable+0x1ae/0x1f0
> [    4.617110]  [<0000000000ce85a6>] kernel_init+0x2e/0x168
> [    4.617171]  [<0000000000103320>] __ret_from_fork+0x40/0x58
> [    4.617233]  [<0000000000cf5eaa>] ret_from_fork+0xa/0x40
> [    4.617352] Last Breaking-Event-Address:
> [    4.617393]  [<0000000000e0e098>] __s390_indirect_jump_r14+0x0/0xc
> [    4.617481] ---[ end trace 0000000000000000 ]---
>
> > Let's try to rule out if deferred_probe_extend_timeout() is causing
> > some issues. So, without my patch, what happens if you set:
> > deferred_probe_timeout=1
> > deferred_probe_timeout=300
>
> At commit 6ee60e9c9f2f ("MAINTAINERS: add Russ Weight as a firmware
> loader maintainer"), both deferred_probe_timeout=1 and
> deferred_probe_timeout=300 hang the boot.

Thanks for all the help. I think I know what's going on.

If you revert the following commit, then you'll see that your device
no longer hangs with my changes.
35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
the deferred_probe_timeout fires

That commit was made to make sure NFS mounts continue to work even
when deferred_probe_timeout is set to a value greater than the IP auto
config timeout (I think that's 12 seconds). While that definitely
helps the NFS mount case, unfortunately wait_for_device_probe() is
used in some initcalls that happen before late_initcall(). This causes
a deadlock -- earlier initcalls wait for
late_initcall(deferred_probe_initcall) to run but we can't get to
late_initcall() if earlier initcalls don't finish.

Even if wait_for_device_probe() is used only in late_initcalls(), it
can still hang the kernel boot based on the link order of the .c
files as that determined the call order between late_initcalls().

It looks like all the existing users of wait_for_device_probe() seem
to try and "wait for currently probing devices to finish probing", but
the commit mentioned above changes the semantics to "wait until every
device that'll eventually probe finishes probing". I think the
solution is to revert the commit and then do this additional waiting
just in ip_auto_config(). I'll need to think about this a bit before
I'm sure the solution makes sense.

Thanks for all the help so far.

-Saravana




-Saravana



>
> > If deferred_probe_timeout=1 causes an issue even without my patch,
> > then in addition, can you try commenting out the call to
> > fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
> > and try again?
>
> Sure, that does not appear to make a difference with
> deferred_probe_timeout=1.
>
> Cheers,
> Nathan
Sebastian Andrzej Siewior May 24, 2022, 4:41 p.m. UTC | #13
On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
…
> Thanks for all the help. I think I know what's going on.

I, too got here because my boot recently was extended by 10 seconds and
bisected to that commit in question.

> If you revert the following commit, then you'll see that your device
> no longer hangs with my changes.
> 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> the deferred_probe_timeout fires

Removing probe_timeout_waitqueue (as suggested) or setting the timeout
to 0 avoids the delay.

> -Saravana

Sebastian
Saravana Kannan May 24, 2022, 5:46 p.m. UTC | #14
On Tue, May 24, 2022 at 9:41 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
> …
> > Thanks for all the help. I think I know what's going on.
>
> I, too got here because my boot recently was extended by 10 seconds and
> bisected to that commit in question.
>
> > If you revert the following commit, then you'll see that your device
> > no longer hangs with my changes.
> > 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> > the deferred_probe_timeout fires
>
> Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> to 0 avoids the delay.

In your case, I think it might be working as intended? Curious, what
was the call stack in your case where it was blocked?

-Saravana
Sebastian Andrzej Siewior May 25, 2022, 7:11 a.m. UTC | #15
On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > to 0 avoids the delay.
> 
> In your case, I think it might be working as intended? Curious, what
> was the call stack in your case where it was blocked?

Why is then there 10sec delay during boot? The backtrace is
|------------[ cut here ]------------
|WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
|Modules linked in:
|CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
|RIP: 0010:wait_for_device_probe+0x30/0x110
|Call Trace:
| <TASK>
| prepare_namespace+0x2b/0x160
| kernel_init_freeable+0x2b3/0x2dd
| kernel_init+0x11/0x110
| ret_from_fork+0x22/0x30
| </TASK>

Looking closer, it can't access init. This in particular box boots
directly the kernel without an initramfs so the kernel later mounts
/dev/sda1 and everything is good.  So that seems to be the reason…
My other machine with an initramfs does not show this problem.

> -Saravana

Sebastian
Saravana Kannan May 25, 2022, 7:49 p.m. UTC | #16
On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > to 0 avoids the delay.
> >
> > In your case, I think it might be working as intended? Curious, what
> > was the call stack in your case where it was blocked?
>
> Why is then there 10sec delay during boot? The backtrace is
> |------------[ cut here ]------------
> |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> |Modules linked in:
> |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> |RIP: 0010:wait_for_device_probe+0x30/0x110
> |Call Trace:
> | <TASK>
> | prepare_namespace+0x2b/0x160
> | kernel_init_freeable+0x2b3/0x2dd
> | kernel_init+0x11/0x110
> | ret_from_fork+0x22/0x30
> | </TASK>
>
> Looking closer, it can't access init. This in particular box boots
> directly the kernel without an initramfs so the kernel later mounts
> /dev/sda1 and everything is good.  So that seems to be the reason…

Hmmm... that part shouldn't matter. As long as you are hitting the
same code path. My guess is one of them has CONFIG_MODULES enabled and
the other doesn't.

In either case, I think the patch needs to be reverted (I'll send out
one soon), but that'll also mean I need to revert part of my patch
(sets the timeout back to 0) or I need to fix this case:
https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/

I'll try to do the latter if I can get something reasonable soon.
Otherwise, I'll just do the revert + partial revert.

-Saravana


> My other machine with an initramfs does not show this problem.
>
> > -Saravana
>
> Sebastian
Niklas Cassel May 29, 2022, 8:34 a.m. UTC | #17
On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote:
> On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > > to 0 avoids the delay.
> > >
> > > In your case, I think it might be working as intended? Curious, what
> > > was the call stack in your case where it was blocked?
> >
> > Why is then there 10sec delay during boot? The backtrace is
> > |------------[ cut here ]------------
> > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> > |Modules linked in:
> > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> > |RIP: 0010:wait_for_device_probe+0x30/0x110
> > |Call Trace:
> > | <TASK>
> > | prepare_namespace+0x2b/0x160
> > | kernel_init_freeable+0x2b3/0x2dd
> > | kernel_init+0x11/0x110
> > | ret_from_fork+0x22/0x30
> > | </TASK>
> >
> > Looking closer, it can't access init. This in particular box boots
> > directly the kernel without an initramfs so the kernel later mounts
> > /dev/sda1 and everything is good.  So that seems to be the reason…
>

Hello there,

My (QEMU) boot times were recently extended by 10 seconds.
Looking at the timestamps, it looks like nothing is being done for 10 whole
seconds.

A git bisect landed me at the patch in $subject:
2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")

Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch
author from the others seeing a regression with this patch, gives two different
stacktraces during boot:

[    0.459633] printk: console [netcon0] enabled
[    0.459636] printk: console [netcon0] printing thread started
[    0.459637] netconsole: network logging started
[    0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left
[    0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    0.461085] cfg80211: failed to load regulatory.db
[    0.461113] ALSA device list:
[    0.461116]   No soundcards found.
[    0.461614] ------------[ cut here ]------------
[    0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
[    0.485809] Modules linked in:
[    0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-00004-g74f936013b08-dirty #20
[    0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[    0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160
[    0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
[    0.489539] RSP: 0000:ffff9c7900013ed8 EFLAGS: 00010246
[    0.489965] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000d02
[    0.490597] RDX: 0000000000000cc2 RSI: 0000000000000000 RDI: 000000000002e990
[    0.491181] RBP: 0000000000000214 R08: 000000000000000f R09: 0000000000000064
[    0.491788] R10: ffff9c7900013c6c R11: 0000000000000000 R12: ffff8964c0343640
[    0.492384] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
[    0.492960] FS:  0000000000000000(0000) GS:ffff896637d00000(0000) knlGS:0000000000000000
[    0.493658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.494501] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
[    0.495621] Call Trace:
[    0.496059]  <TASK>
[    0.496266]  ? init_eaccess+0x3b/0x76
[    0.496657]  prepare_namespace+0x30/0x16a
[    0.497016]  kernel_init_freeable+0x207/0x212
[    0.497407]  ? rest_init+0xc0/0xc0
[    0.497714]  kernel_init+0x16/0x120
[    0.498250]  ret_from_fork+0x1f/0x30
[    0.498898]  </TASK>
[    0.499307] ---[ end trace 0000000000000000 ]---
[    0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.749479] ata2.00: applying bridge limits
[    0.750915] ata4: SATA link down (SStatus 0 SControl 300)
[    0.752110] ata5: SATA link down (SStatus 0 SControl 300)
[    0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.754877] ata3: SATA link down (SStatus 0 SControl 300)
[    0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.755387] ata1.00: applying bridge limits
[    0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.755500] ata6.00: applying bridge limits
[    0.757330] ata1.00: configured for UDMA/100
[    0.757441] ata6.00: configured for UDMA/100
[    0.757505] ata2.00: configured for UDMA/100
[    0.758015] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.760542] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    0.760669] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB)
[    0.760778] sd 0:0:0:0: [sda] Write Protect is off
[    0.760787] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    0.760847] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.760981] sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
[    0.761319] scsi 1:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.762808] sd 1:0:0:0: Attached scsi generic sg1 type 0
[    0.763004] sd 1:0:0:0: [sdb] 732 512-byte logical blocks: (375 kB/366 KiB)
[    0.763649] sd 1:0:0:0: [sdb] Write Protect is off
[    0.763661] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    0.763965] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.764026] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
[    0.765923] scsi 5:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.767987] sd 5:0:0:0: Attached scsi generic sg2 type 0
[    0.768626] sd 5:0:0:0: [sdc] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
[    0.769909] sd 5:0:0:0: [sdc] Write Protect is off
[    0.769920] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    0.770515] sd 1:0:0:0: [sdb] Attached SCSI disk
[    0.770900] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.771782] sd 5:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
[    0.773900]  sda: sda1 sda14 sda15
[    0.774499] sd 5:0:0:0: [sdc] Attached SCSI disk
[    0.774907] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.084011] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
[   10.887350] md: Waiting for all devices to be available before autodetect
[   10.887395] md: If you don't use raid, use raid=noautodetect
[   10.887445] ------------[ cut here ]------------
[   10.887448] WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
[   10.893987] Modules linked in:
[   10.894314] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W         5.18.0-next-20220526-00004-g74f936013b08-dirty #20
[   10.895389] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   10.896590] RIP: 0010:wait_for_device_probe+0x1a/0x160
[   10.897215] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
[   10.899269] RSP: 0000:ffff9c7900013ec8 EFLAGS: 00010246
[   10.899762] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
[   10.900424] RDX: 0000000000000000 RSI: ffffffff9d998089 RDI: 00000000ffffffff
[   10.901060] RBP: 0000000000000214 R08: 80000000ffffe1ca R09: ffff9c7900013ea8
[   10.901839] R10: 3fffffffffffffff R11: 0000000000000000 R12: ffff8964c0343640
[   10.902615] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
[   10.903371] FS:  0000000000000000(0000) GS:ffff896637c80000(0000) knlGS:0000000000000000
[   10.904188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   10.904721] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
[   10.905340] Call Trace:
[   10.905558]  <TASK>
[   10.905739]  md_run_setup+0x3c/0x6a
[   10.906036]  prepare_namespace+0x35/0x16a
[   10.906391]  kernel_init_freeable+0x207/0x212
[   10.906748]  ? rest_init+0xc0/0xc0
[   10.907021]  kernel_init+0x16/0x120
[   10.907312]  ret_from_fork+0x1f/0x30
[   10.907617]  </TASK>
[   10.907796] ---[ end trace 0000000000000000 ]---
[   10.908159] md: Autodetecting RAID arrays.
[   10.908160] md: autorun ...
[   10.908160] md: ... autorun DONE.
[   10.911301] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
[   10.911303] EXT4-fs (sda1): write access will be enabled during recovery
[   11.038001] EXT4-fs (sda1): orphan cleanup on readonly fs
[   11.038466] EXT4-fs (sda1): 2 orphan inodes deleted
[   11.038475] EXT4-fs (sda1): recovery complete
[   11.053033] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
[   11.053074] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
[   11.053175] devtmpfs: mounted
[   11.061921] Freeing unused kernel image (initmem) memory: 1680K
[   11.061930] Write protecting the kernel read-only data: 24576k
[   11.071108] Freeing unused kernel image (text/rodata gap) memory: 2032K
[   11.073596] Freeing unused kernel image (rodata/data gap) memory: 924K
[   11.073609] Run /sbin/init as init process


Kind regards,
Niklas
Saravana Kannan May 30, 2022, 4:59 a.m. UTC | #18
On Sun, May 29, 2022 at 1:34 AM 'Niklas Cassel' via kernel-team
<kernel-team@android.com> wrote:
>
> On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote:
> > On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
> > <bigeasy@linutronix.de> wrote:
> > >
> > > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > > > to 0 avoids the delay.
> > > >
> > > > In your case, I think it might be working as intended? Curious, what
> > > > was the call stack in your case where it was blocked?
> > >
> > > Why is then there 10sec delay during boot? The backtrace is
> > > |------------[ cut here ]------------
> > > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> > > |Modules linked in:
> > > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> > > |RIP: 0010:wait_for_device_probe+0x30/0x110
> > > |Call Trace:
> > > | <TASK>
> > > | prepare_namespace+0x2b/0x160
> > > | kernel_init_freeable+0x2b3/0x2dd
> > > | kernel_init+0x11/0x110
> > > | ret_from_fork+0x22/0x30
> > > | </TASK>
> > >
> > > Looking closer, it can't access init. This in particular box boots
> > > directly the kernel without an initramfs so the kernel later mounts
> > > /dev/sda1 and everything is good.  So that seems to be the reason…
> >
>
> Hello there,
>
> My (QEMU) boot times were recently extended by 10 seconds.
> Looking at the timestamps, it looks like nothing is being done for 10 whole
> seconds.
>
> A git bisect landed me at the patch in $subject:
> 2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")
>
> Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch
> author from the others seeing a regression with this patch, gives two different
> stacktraces during boot:

Thanks for the report. My patch was exposing an existing issue. I
already sent a fix for that a few days ago. Waiting for it to get
picked up.

-Saravana

>
> [    0.459633] printk: console [netcon0] enabled
> [    0.459636] printk: console [netcon0] printing thread started
> [    0.459637] netconsole: network logging started
> [    0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database
> [    0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left
> [    0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> [    0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
> [    0.461085] cfg80211: failed to load regulatory.db
> [    0.461113] ALSA device list:
> [    0.461116]   No soundcards found.
> [    0.461614] ------------[ cut here ]------------
> [    0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
> [    0.485809] Modules linked in:
> [    0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-00004-g74f936013b08-dirty #20
> [    0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [    0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160
> [    0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
> [    0.489539] RSP: 0000:ffff9c7900013ed8 EFLAGS: 00010246
> [    0.489965] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000d02
> [    0.490597] RDX: 0000000000000cc2 RSI: 0000000000000000 RDI: 000000000002e990
> [    0.491181] RBP: 0000000000000214 R08: 000000000000000f R09: 0000000000000064
> [    0.491788] R10: ffff9c7900013c6c R11: 0000000000000000 R12: ffff8964c0343640
> [    0.492384] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
> [    0.492960] FS:  0000000000000000(0000) GS:ffff896637d00000(0000) knlGS:0000000000000000
> [    0.493658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.494501] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
> [    0.495621] Call Trace:
> [    0.496059]  <TASK>
> [    0.496266]  ? init_eaccess+0x3b/0x76
> [    0.496657]  prepare_namespace+0x30/0x16a
> [    0.497016]  kernel_init_freeable+0x207/0x212
> [    0.497407]  ? rest_init+0xc0/0xc0
> [    0.497714]  kernel_init+0x16/0x120
> [    0.498250]  ret_from_fork+0x1f/0x30
> [    0.498898]  </TASK>
> [    0.499307] ---[ end trace 0000000000000000 ]---
> [    0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.749479] ata2.00: applying bridge limits
> [    0.750915] ata4: SATA link down (SStatus 0 SControl 300)
> [    0.752110] ata5: SATA link down (SStatus 0 SControl 300)
> [    0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.754877] ata3: SATA link down (SStatus 0 SControl 300)
> [    0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.755387] ata1.00: applying bridge limits
> [    0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.755500] ata6.00: applying bridge limits
> [    0.757330] ata1.00: configured for UDMA/100
> [    0.757441] ata6.00: configured for UDMA/100
> [    0.757505] ata2.00: configured for UDMA/100
> [    0.758015] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.760542] sd 0:0:0:0: Attached scsi generic sg0 type 0
> [    0.760669] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB)
> [    0.760778] sd 0:0:0:0: [sda] Write Protect is off
> [    0.760787] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    0.760847] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.760981] sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
> [    0.761319] scsi 1:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.762808] sd 1:0:0:0: Attached scsi generic sg1 type 0
> [    0.763004] sd 1:0:0:0: [sdb] 732 512-byte logical blocks: (375 kB/366 KiB)
> [    0.763649] sd 1:0:0:0: [sdb] Write Protect is off
> [    0.763661] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [    0.763965] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.764026] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
> [    0.765923] scsi 5:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.767987] sd 5:0:0:0: Attached scsi generic sg2 type 0
> [    0.768626] sd 5:0:0:0: [sdc] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
> [    0.769909] sd 5:0:0:0: [sdc] Write Protect is off
> [    0.769920] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [    0.770515] sd 1:0:0:0: [sdb] Attached SCSI disk
> [    0.770900] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.771782] sd 5:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
> [    0.773900]  sda: sda1 sda14 sda15
> [    0.774499] sd 5:0:0:0: [sdc] Attached SCSI disk
> [    0.774907] sd 0:0:0:0: [sda] Attached SCSI disk
> [    1.084011] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
> [   10.887350] md: Waiting for all devices to be available before autodetect
> [   10.887395] md: If you don't use raid, use raid=noautodetect
> [   10.887445] ------------[ cut here ]------------
> [   10.887448] WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
> [   10.893987] Modules linked in:
> [   10.894314] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W         5.18.0-next-20220526-00004-g74f936013b08-dirty #20
> [   10.895389] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   10.896590] RIP: 0010:wait_for_device_probe+0x1a/0x160
> [   10.897215] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
> [   10.899269] RSP: 0000:ffff9c7900013ec8 EFLAGS: 00010246
> [   10.899762] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
> [   10.900424] RDX: 0000000000000000 RSI: ffffffff9d998089 RDI: 00000000ffffffff
> [   10.901060] RBP: 0000000000000214 R08: 80000000ffffe1ca R09: ffff9c7900013ea8
> [   10.901839] R10: 3fffffffffffffff R11: 0000000000000000 R12: ffff8964c0343640
> [   10.902615] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
> [   10.903371] FS:  0000000000000000(0000) GS:ffff896637c80000(0000) knlGS:0000000000000000
> [   10.904188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   10.904721] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
> [   10.905340] Call Trace:
> [   10.905558]  <TASK>
> [   10.905739]  md_run_setup+0x3c/0x6a
> [   10.906036]  prepare_namespace+0x35/0x16a
> [   10.906391]  kernel_init_freeable+0x207/0x212
> [   10.906748]  ? rest_init+0xc0/0xc0
> [   10.907021]  kernel_init+0x16/0x120
> [   10.907312]  ret_from_fork+0x1f/0x30
> [   10.907617]  </TASK>
> [   10.907796] ---[ end trace 0000000000000000 ]---
> [   10.908159] md: Autodetecting RAID arrays.
> [   10.908160] md: autorun ...
> [   10.908160] md: ... autorun DONE.
> [   10.911301] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
> [   10.911303] EXT4-fs (sda1): write access will be enabled during recovery
> [   11.038001] EXT4-fs (sda1): orphan cleanup on readonly fs
> [   11.038466] EXT4-fs (sda1): 2 orphan inodes deleted
> [   11.038475] EXT4-fs (sda1): recovery complete
> [   11.053033] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
> [   11.053074] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
> [   11.053175] devtmpfs: mounted
> [   11.061921] Freeing unused kernel image (initmem) memory: 1680K
> [   11.061930] Write protecting the kernel read-only data: 24576k
> [   11.071108] Freeing unused kernel image (text/rodata gap) memory: 2032K
> [   11.073596] Freeing unused kernel image (rodata/data gap) memory: 924K
> [   11.073609] Run /sbin/init as init process
>
>
> Kind regards,
> Niklas
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
diff mbox series

Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7123524a86b8..fcc1dfc877a9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -939,8 +939,10 @@ 
 			[KNL] Debugging option to set a timeout in seconds for
 			deferred probe to give up waiting on dependencies to
 			probe. Only specific dependencies (subsystems or
-			drivers) that have opted in will be ignored. A timeout of 0
-			will timeout at the end of initcalls. This option will also
+			drivers) that have opted in will be ignored. A timeout
+			of 0 will timeout at the end of initcalls. If the time
+			out hasn't expired, it'll be restarted by each
+			successful driver registration. This option will also
 			dump out devices still on the deferred probe list after
 			retrying.
 
diff --git a/drivers/base/base.h b/drivers/base/base.h
index 2882af26392a..ab71403d102f 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -159,6 +159,7 @@  extern char *make_class_name(const char *name, struct kobject *kobj);
 extern int devres_release_all(struct device *dev);
 extern void device_block_probing(void);
 extern void device_unblock_probing(void);
+extern void deferred_probe_extend_timeout(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index f47cab21430f..603379b5f9dd 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -255,7 +255,12 @@  static int deferred_devs_show(struct seq_file *s, void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(deferred_devs);
 
+#ifdef CONFIG_MODULES
+int driver_deferred_probe_timeout = 10;
+#else
 int driver_deferred_probe_timeout;
+#endif
+
 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
 static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
 
@@ -315,6 +320,20 @@  static void deferred_probe_timeout_work_func(struct work_struct *work)
 }
 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
 
+void deferred_probe_extend_timeout(void)
+{
+	/*
+	 * If the work hasn't been queued yet or if the work expired, don't
+	 * start a new one.
+	 */
+	if (cancel_delayed_work(&deferred_probe_timeout_work)) {
+		schedule_delayed_work(&deferred_probe_timeout_work,
+				driver_deferred_probe_timeout * HZ);
+		pr_debug("Extended deferred probe timeout by %d secs\n",
+					driver_deferred_probe_timeout);
+	}
+}
+
 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
diff --git a/drivers/base/driver.c b/drivers/base/driver.c
index 8c0d33e182fd..77a77b2095cd 100644
--- a/drivers/base/driver.c
+++ b/drivers/base/driver.c
@@ -177,6 +177,7 @@  int driver_register(struct device_driver *drv)
 		return ret;
 	}
 	kobject_uevent(&drv->p->kobj, KOBJ_ADD);
+	deferred_probe_extend_timeout();
 
 	return ret;
 }