Message ID | 20220427161340.8518-5-lecopzer.chen@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Series | Support hld delayed init based on Pseudo-NMI for arm64 |
On Thu 2022-04-28 00:13:38, Lecopzer Chen wrote:
> When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> yet. E.g. on arm64, PMU is not ready until
> device_initcall(armv8_pmu_driver_init). And it is deeply integrated
> with the driver model and cpuhp. Hence it is hard to push this
> initialization before smp_init().
>
> But it is easy to take an opposite approach and try to initialize
> the watchdog once again later.
> The delayed probe is called using workqueues. It need to allocate
> memory and must be proceed in a normal context.
> The delayed probe is able to use if watchdog_nmi_probe() returns
> non-zero which means the return code returned when PMU is not ready yet.
>
> Provide an API - retry_lockup_detector_init() for anyone who needs
> to delayed init lockup detector if they had ever failed at
> lockup_detector_init().
>
> The original assumption is: nobody should use delayed probe after
> lockup_detector_check() which has __init attribute.
> That is, anyone uses this API must call between lockup_detector_init()
> and lockup_detector_check(), and the caller must have __init attribute
>
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> +/*
> + * retry_lockup_detector_init - retry init lockup detector if possible.
> + *
> + * Retry hardlockup detector init. It is useful when it requires some
> + * functionality that has to be initialized later on a particular
> + * platform.
> + */
> +void __init retry_lockup_detector_init(void)
> +{
> +	/* Must be called before late init calls */
> +	if (!allow_lockup_detector_init_retry)
> +		return;
> +
> +	queue_work_on(__smp_processor_id(), system_wq, &detector_work);

Just a small nit. This can be simplified by calling:

	schedule_work(&detector_work);

It uses "system_wq" that uses CPU-bound workers. It prefers
the current CPU. But the exact CPU is not important. Any CPU-bound
worker is enough.

> +}
> +

With the above change, feel free to use:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

PS: I am sorry for the late review. I had busy weeks.
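For reference, retry_lockup_detector_init() with the simplification suggested above would look roughly like this. This is a sketch only, not part of the posted patch; it assumes the surrounding definitions from the patch below (allow_lockup_detector_init_retry, detector_work) are unchanged.

/*
 * Sketch: same function as in the patch, but queued via schedule_work(),
 * which also uses system_wq and merely prefers (does not require) the
 * current CPU.
 */
void __init retry_lockup_detector_init(void)
{
	/* Must be called before late init calls */
	if (!allow_lockup_detector_init_retry)
		return;

	schedule_work(&detector_work);
}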
> On Thu 2022-04-28 00:13:38, Lecopzer Chen wrote:
> > When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> > yet. E.g. on arm64, PMU is not ready until
> > device_initcall(armv8_pmu_driver_init). And it is deeply integrated
> > with the driver model and cpuhp. Hence it is hard to push this
> > initialization before smp_init().
> >
> > But it is easy to take an opposite approach and try to initialize
> > the watchdog once again later.
> > The delayed probe is called using workqueues. It need to allocate
> > memory and must be proceed in a normal context.
> > The delayed probe is able to use if watchdog_nmi_probe() returns
> > non-zero which means the return code returned when PMU is not ready yet.
> >
> > Provide an API - retry_lockup_detector_init() for anyone who needs
> > to delayed init lockup detector if they had ever failed at
> > lockup_detector_init().
> >
> > The original assumption is: nobody should use delayed probe after
> > lockup_detector_check() which has __init attribute.
> > That is, anyone uses this API must call between lockup_detector_init()
> > and lockup_detector_check(), and the caller must have __init attribute
> >
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > +/*
> > + * retry_lockup_detector_init - retry init lockup detector if possible.
> > + *
> > + * Retry hardlockup detector init. It is useful when it requires some
> > + * functionality that has to be initialized later on a particular
> > + * platform.
> > + */
> > +void __init retry_lockup_detector_init(void)
> > +{
> > +	/* Must be called before late init calls */
> > +	if (!allow_lockup_detector_init_retry)
> > +		return;
> > +
> > +	queue_work_on(__smp_processor_id(), system_wq, &detector_work);
>
> Just a small nit. This can be simplified by calling:
>
> 	schedule_work(&detector_work);
>
> It uses "system_wq" that uses CPU-bound workers. It prefers
> the current CPU. But the exact CPU is not important. Any CPU-bound
> worker is enough.

Thanks!! I'll tweak this on -rc1

> > +}
> > +
>
> With the above change, feel free to use:
>
> Reviewed-by: Petr Mladek <pmladek@suse.com>

Really appreciate your review and idea, thank you very much.

BRs,
Lecopzer
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b7bcd63c36b4..10f2a305fe0d 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -118,6 +118,8 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
+
+void retry_lockup_detector_init(void);
 int watchdog_nmi_probe(void);
 void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index ca85ca5fdeff..1edc23af3b80 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -103,7 +103,13 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 	hardlockup_detector_perf_disable();
 }
 
-/* Return 0, if a NMI watchdog is available. Error code otherwise */
+/*
+ * Arch specific API.
+ *
+ * Return 0 when NMI watchdog is available, negative value otherwise.
+ * Note that the negative value means that a delayed probe might
+ * succeed later.
+ */
 int __weak __init watchdog_nmi_probe(void)
 {
 	return hardlockup_detector_perf_init();
@@ -831,6 +837,62 @@ static struct ctl_table watchdog_sysctls[] = {
 	{}
 };
 
+static void __init lockup_detector_delay_init(struct work_struct *work);
+static bool allow_lockup_detector_init_retry __initdata;
+
+static struct work_struct detector_work __initdata =
+		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
+
+static void __init lockup_detector_delay_init(struct work_struct *work)
+{
+	int ret;
+
+	ret = watchdog_nmi_probe();
+	if (ret) {
+		pr_info("Delayed init of the lockup detector failed: %d\n", ret);
+		pr_info("Perf NMI watchdog permanently disabled\n");
+		return;
+	}
+
+	allow_lockup_detector_init_retry = false;
+
+	nmi_watchdog_available = true;
+	lockup_detector_setup();
+}
+
+/*
+ * retry_lockup_detector_init - retry init lockup detector if possible.
+ *
+ * Retry hardlockup detector init. It is useful when it requires some
+ * functionality that has to be initialized later on a particular
+ * platform.
+ */
+void __init retry_lockup_detector_init(void)
+{
+	/* Must be called before late init calls */
+	if (!allow_lockup_detector_init_retry)
+		return;
+
+	queue_work_on(__smp_processor_id(), system_wq, &detector_work);
+}
+
+/*
+ * Ensure that optional delayed hardlockup init is proceed before
+ * the init code and memory is freed.
+ */
+static int __init lockup_detector_check(void)
+{
+	/* Prevent any later retry. */
+	allow_lockup_detector_init_retry = false;
+
+	/* Make sure no work is pending. */
+	flush_work(&detector_work);
+
+	return 0;
+
+}
+late_initcall_sync(lockup_detector_check);
+
 static void __init watchdog_sysctl_init(void)
 {
 	register_sysctl_init("kernel", watchdog_sysctls);
@@ -849,6 +911,9 @@ void __init lockup_detector_init(void)
 
 	if (!watchdog_nmi_probe())
 		nmi_watchdog_available = true;
+	else
+		allow_lockup_detector_init_retry = true;
+
 	lockup_detector_setup();
 	watchdog_sysctl_init();
 }
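For context, a typical caller of the new API would live in arch or driver init code that runs after lockup_detector_init() but before the late initcalls, and it must itself be __init. Below is a hypothetical, self-contained sketch; example_pmu_probe() and example_pmu_driver_init() are illustrative names only (on arm64 the real hook would sit around device_initcall(armv8_pmu_driver_init), which is added in a separate patch of this series):

#include <linux/init.h>
#include <linux/nmi.h>		/* retry_lockup_detector_init() */

/* Illustrative stand-in for the real arch PMU probe. */
static int __init example_pmu_probe(void)
{
	return 0;
}

static int __init example_pmu_driver_init(void)
{
	int ret;

	ret = example_pmu_probe();
	if (ret)
		return ret;

	/*
	 * The PMU is usable now, so ask the watchdog core to retry the
	 * hardlockup detector probe that failed in lockup_detector_init().
	 * This must happen before lockup_detector_check(), which runs at
	 * late_initcall_sync() and disables any further retry.
	 */
	retry_lockup_detector_init();
	return 0;
}
device_initcall(example_pmu_driver_init);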