Message ID | 20180628204344.13973-2-robh@kernel.org (mailing list archive) |
---|---|
State | New, archived |
On Thu, Jun 28, 2018 at 11:43 PM, Rob Herring <robh@kernel.org> wrote:
> Deferred probe will currently wait forever on dependent devices to probe,
> but sometimes a driver will never exist. It's also not always critical for
> a driver to exist. Platforms can rely on default configuration from the
> bootloader or reset defaults for things such as pinctrl and power domains.
> This is often the case with initial platform support until various drivers
> get enabled. There are at least 2 scenarios where deferred probe can render
> a platform broken. Both involve using a DT which has more devices and
> dependencies than the kernel supports. The 1st case is a driver may be
> disabled in the kernel config. The 2nd case is the kernel version may
> simply not have the dependent driver. This can happen if using a newer DT
> (provided by firmware perhaps) with a stable kernel version. Deferred
> probe issues can be difficult to debug especially if the console has
> dependencies or userspace fails to boot to a shell.
>
> There are also cases like IOMMUs where only built-in drivers are
> supported, so deferring probe after initcalls is not needed. The IOMMU
> subsystem implemented its own mechanism to handle this using OF_DECLARE
> linker sections.
>
> This commit makes ending deferred probe conditional on initcalls
> being completed or a debug timeout. Subsystems or drivers may opt in by
> calling driver_deferred_probe_check_state() instead of
> unconditionally returning -EPROBE_DEFER. They may use additional
> information from DT or kernel's config to decide whether to continue to
> defer probe or not.
>
> The timeout mechanism is intended for debug purposes and WARNs loudly.
> The remaining deferred probe pending list will also be dumped after the
> timeout. Note that this timeout won't work for the console which needs
> to be enabled before userspace starts. However, if the console's
> dependencies are resolved, then the kernel log will be printed (as
> opposed to no output).

There is another patch flying around that adds a debugfs node to dump
the deferred probe queue.
I dunno if it makes sense to dump it both here and there and, if yes,
perhaps the output should be unified?

> +	deferred_probe_timeout = simple_strtol(str, NULL, 0);

Hmm... I don't think base-16 or base-8 values are useful to support.
One subtle problem is that people usually think of timeout values as
base 10, so if at some point someone writes 0100 (no matter why),
the result would be much less than expected.
08 wouldn't be parsed at all.

> +	if (deferred_probe_timeout > 0) {

Would it be harmful / useful if we skip this check and run the work immediately?

> +		schedule_delayed_work(&deferred_probe_timeout_work,
> +			deferred_probe_timeout * HZ);
> +	}

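To make the base-0 concern concrete, here is a minimal userspace sketch. It uses the C library's `strtol()` as a stand-in for the kernel's `simple_strtol()`, on the assumption that both treat a leading `0` as an octal prefix when the base is 0. The helper names `parse_timeout_auto_base()` and `parse_timeout_decimal()` are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a timeout string with base 0, as the patch does: a "0x" prefix
 * selects hexadecimal and a leading "0" selects octal.
 */
static long parse_timeout_auto_base(const char *str)
{
	assert(str != NULL);
	return strtol(str, NULL, 0);
}

/* Parse the same string strictly as decimal, as suggested in review. */
static long parse_timeout_decimal(const char *str)
{
	assert(str != NULL);
	return strtol(str, NULL, 10);
}
```

With base 0, "0100" is read as octal and yields 64 rather than 100, and "08" stops at the invalid octal digit and yields 0. With base 10 the same strings yield 100 and 8.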
On Fri, Jun 29, 2018 at 4:25 PM Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
> On Thu, Jun 28, 2018 at 11:43 PM, Rob Herring <robh@kernel.org> wrote:
> > Deferred probe will currently wait forever on dependent devices to probe,
> > but sometimes a driver will never exist. It's also not always critical for
> > a driver to exist. Platforms can rely on default configuration from the
> > bootloader or reset defaults for things such as pinctrl and power domains.
> > This is often the case with initial platform support until various drivers
> > get enabled. There are at least 2 scenarios where deferred probe can render
> > a platform broken. Both involve using a DT which has more devices and
> > dependencies than the kernel supports. The 1st case is a driver may be
> > disabled in the kernel config. The 2nd case is the kernel version may
> > simply not have the dependent driver. This can happen if using a newer DT
> > (provided by firmware perhaps) with a stable kernel version. Deferred
> > probe issues can be difficult to debug especially if the console has
> > dependencies or userspace fails to boot to a shell.
> >
> > There are also cases like IOMMUs where only built-in drivers are
> > supported, so deferring probe after initcalls is not needed. The IOMMU
> > subsystem implemented its own mechanism to handle this using OF_DECLARE
> > linker sections.
> >
> > This commit makes ending deferred probe conditional on initcalls
> > being completed or a debug timeout. Subsystems or drivers may opt in by
> > calling driver_deferred_probe_check_state() instead of
> > unconditionally returning -EPROBE_DEFER. They may use additional
> > information from DT or kernel's config to decide whether to continue to
> > defer probe or not.
> >
> > The timeout mechanism is intended for debug purposes and WARNs loudly.
> > The remaining deferred probe pending list will also be dumped after the
> > timeout. Note that this timeout won't work for the console which needs
> > to be enabled before userspace starts. However, if the console's
> > dependencies are resolved, then the kernel log will be printed (as
> > opposed to no output).
>
> There is another patch flying around that adds a debugfs node to dump
> the deferred probe queue.
> I dunno if it makes sense to dump it both here and there and, if yes,
> perhaps the output should be unified?

Dumping it here makes sense because you may not be able to mount and
read debugfs. That's the reason for the timeout too. Otherwise, debugfs
could be used to trigger the same thing. They are aligned at least in
that both use dev_name(), but I don't think we can align completely, as
we need the context in the log, whereas when reading a file you already
know the context.

> > +	deferred_probe_timeout = simple_strtol(str, NULL, 0);
>
> Hmm... I don't think base-16 or base-8 values are useful to support.
> One subtle problem is that people usually think of timeout values as
> base 10, so if at some point someone writes 0100 (no matter why),
> the result would be much less than expected.
> 08 wouldn't be parsed at all.

Agreed.

> > +	if (deferred_probe_timeout > 0) {
>
> Would it be harmful / useful if we skip this check and run the work immediately?

The default is -1, which means disabled. Scheduling work in that case
would give us a very long timeout instead. If the timeout is 0, then
this never needs to run because driver_deferred_probe_check_state()
will time out immediately. Running it would be harmless, though, as it
would just do an extra run of the deferred probe workqueue.

Rob

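The "very long timeout" that Rob mentions for the default of -1 can be illustrated with a small userspace sketch. It assumes a tick rate of 250 (`MODEL_HZ` is a made-up constant, since the real `HZ` depends on the kernel configuration) and relies on `schedule_delayed_work()` taking its delay as an `unsigned long`. The helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_HZ 250	/* assumed tick rate, for illustration only */

/* The patch schedules the timeout work only for a positive value. */
static bool should_schedule_timeout(int timeout_secs)
{
	return timeout_secs > 0;
}

/*
 * What the delay would become without that guard: the int product is
 * converted to the unsigned long that the delay parameter expects.
 */
static unsigned long unguarded_delay_jiffies(int timeout_secs)
{
	/* Keep the int multiplication itself free of overflow. */
	assert(timeout_secs > INT_MIN / MODEL_HZ &&
	       timeout_secs < INT_MAX / MODEL_HZ);
	return (unsigned long)(timeout_secs * MODEL_HZ);
}
```

Under these assumptions, a timeout of -1 would convert to a delay just below `ULONG_MAX` jiffies, while 30 seconds becomes 7500 jiffies. That is why the guard skips scheduling for the disabled default, and a value of 0 needs no delayed work at all.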
On Thu, Jun 28, 2018 at 02:43:39PM -0600, Rob Herring wrote:
> Deferred probe will currently wait forever on dependent devices to probe,
> but sometimes a driver will never exist. It's also not always critical for
> a driver to exist. Platforms can rely on default configuration from the
> bootloader or reset defaults for things such as pinctrl and power domains.
> This is often the case with initial platform support until various drivers
> get enabled. There are at least 2 scenarios where deferred probe can render
> a platform broken. Both involve using a DT which has more devices and
> dependencies than the kernel supports. The 1st case is a driver may be
> disabled in the kernel config. The 2nd case is the kernel version may
> simply not have the dependent driver. This can happen if using a newer DT
> (provided by firmware perhaps) with a stable kernel version. Deferred
> probe issues can be difficult to debug especially if the console has
> dependencies or userspace fails to boot to a shell.
>
> There are also cases like IOMMUs where only built-in drivers are
> supported, so deferring probe after initcalls is not needed. The IOMMU
> subsystem implemented its own mechanism to handle this using OF_DECLARE
> linker sections.
>
> This commit makes ending deferred probe conditional on initcalls
> being completed or a debug timeout. Subsystems or drivers may opt in by
> calling driver_deferred_probe_check_state() instead of
> unconditionally returning -EPROBE_DEFER. They may use additional
> information from DT or kernel's config to decide whether to continue to
> defer probe or not.
>
> The timeout mechanism is intended for debug purposes and WARNs loudly.
> The remaining deferred probe pending list will also be dumped after the
> timeout. Note that this timeout won't work for the console which needs
> to be enabled before userspace starts. However, if the console's
> dependencies are resolved, then the kernel log will be printed (as
> opposed to no output).
>
> Cc: Alexander Graf <agraf@suse.de>
> Signed-off-by: Rob Herring <robh@kernel.org>

I wanted to apply this series, but this patch failed to apply to my
driver-core tree :(

Can you rebase it and resend?

thanks,

greg k-h

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index efc7aa7a0670..e83ef4648ea4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -804,6 +804,15 @@
 			Defaults to the default architecture's huge page size
 			if not specified.
 
+	deferred_probe_timeout=
+			[KNL] Debugging option to set a timeout in seconds for
+			deferred probe to give up waiting on dependencies to
+			probe. Only specific dependencies (subsystems or
+			drivers) that have opted in will be ignored. A timeout of 0
+			will timeout at the end of initcalls. This option will also
+			dump out devices still on the deferred probe list after
+			retrying.
+
 	dhash_entries=	[KNL]
 			Set number of hash buckets for dentry cache.
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 1435d7281c66..f0bc73a71a25 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -224,6 +224,51 @@ void device_unblock_probing(void)
 	driver_deferred_probe_trigger();
 }
 
+static int deferred_probe_timeout = -1;
+static int __init deferred_probe_timeout_setup(char *str)
+{
+	deferred_probe_timeout = simple_strtol(str, NULL, 0);
+	return 1;
+}
+__setup("deferred_probe_timeout=", deferred_probe_timeout_setup);
+
+/**
+ * driver_deferred_probe_check_state() - Check deferred probe state
+ * @dev: device to check
+ *
+ * Returns -ENODEV if init is done and all built-in drivers have had a chance
+ * to probe (i.e. initcalls are done), -ETIMEDOUT if deferred probe debug
+ * timeout has expired, or -EPROBE_DEFER if none of those conditions are met.
+ *
+ * Drivers or subsystems can opt-in to calling this function instead of directly
+ * returning -EPROBE_DEFER.
+ */
+int driver_deferred_probe_check_state(struct device *dev)
+{
+	if (initcalls_done) {
+		if (!deferred_probe_timeout) {
+			dev_WARN(dev, "deferred probe timeout, ignoring dependency");
+			return -ETIMEDOUT;
+		}
+		dev_warn(dev, "ignoring dependency for device, assuming no driver");
+		return -ENODEV;
+	}
+	return -EPROBE_DEFER;
+}
+
+static void deferred_probe_timeout_work_func(struct work_struct *work)
+{
+	struct device_private *private, *p;
+
+	deferred_probe_timeout = 0;
+	driver_deferred_probe_trigger();
+	flush_work(&deferred_probe_work);
+
+	list_for_each_entry_safe(private, p, &deferred_probe_pending_list, deferred_probe)
+		dev_info(private->device, "deferred probe pending");
+}
+static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
+
 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
@@ -238,6 +283,18 @@ static int deferred_probe_initcall(void)
 	/* Sort as many dependencies as possible before exiting initcalls */
 	flush_work(&deferred_probe_work);
 	initcalls_done = true;
+
+	/*
+	 * Trigger deferred probe again, this time we won't defer anything
+	 * that is optional
+	 */
+	driver_deferred_probe_trigger();
+	flush_work(&deferred_probe_work);
+
+	if (deferred_probe_timeout > 0) {
+		schedule_delayed_work(&deferred_probe_timeout_work,
+			deferred_probe_timeout * HZ);
+	}
 	return 0;
 }
 late_initcall(deferred_probe_initcall);
diff --git a/include/linux/device.h b/include/linux/device.h
index 055a69dbcd18..b6d8e0a09ad9 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -339,6 +339,8 @@ struct device *driver_find_device(struct device_driver *drv,
 				  struct device *start, void *data,
 				  int (*match)(struct device *dev, void *data));
 
+int driver_deferred_probe_check_state(struct device *dev);
+
 /**
  * struct subsys_interface - interfaces to device functions
  * @name: name of the device function

Deferred probe will currently wait forever on dependent devices to probe,
but sometimes a driver will never exist. It's also not always critical for
a driver to exist. Platforms can rely on default configuration from the
bootloader or reset defaults for things such as pinctrl and power domains.
This is often the case with initial platform support until various drivers
get enabled. There are at least 2 scenarios where deferred probe can render
a platform broken. Both involve using a DT which has more devices and
dependencies than the kernel supports. The 1st case is a driver may be
disabled in the kernel config. The 2nd case is the kernel version may
simply not have the dependent driver. This can happen if using a newer DT
(provided by firmware perhaps) with a stable kernel version. Deferred
probe issues can be difficult to debug especially if the console has
dependencies or userspace fails to boot to a shell.

There are also cases like IOMMUs where only built-in drivers are
supported, so deferring probe after initcalls is not needed. The IOMMU
subsystem implemented its own mechanism to handle this using OF_DECLARE
linker sections.

This commit makes ending deferred probe conditional on initcalls
being completed or a debug timeout. Subsystems or drivers may opt in by
calling driver_deferred_probe_check_state() instead of
unconditionally returning -EPROBE_DEFER. They may use additional
information from DT or kernel's config to decide whether to continue to
defer probe or not.

The timeout mechanism is intended for debug purposes and WARNs loudly.
The remaining deferred probe pending list will also be dumped after the
timeout. Note that this timeout won't work for the console which needs
to be enabled before userspace starts. However, if the console's
dependencies are resolved, then the kernel log will be printed (as
opposed to no output).

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
---
v3:
- Merged with timeout patch.
- Clarify that deferred_probe_timeout is a debug option.
- Drop the 'optional' param. The only user was pinctrl, so it has to
  handle that functionality.
- Rename function to driver_deferred_probe_check_state
- Added kerneldoc for driver_deferred_probe_check_state
- Print a 1 line warning if stopping deferred probe after initcalls and a
  WARN on timeout.

 .../admin-guide/kernel-parameters.txt         |  9 +++
 drivers/base/dd.c                             | 57 +++++++++++++++++++
 include/linux/device.h                        |  2 +
 3 files changed, 68 insertions(+)

--
2.17.1