Message ID | 20210615123904.2568052-2-grzegorz.jaszczyk@linaro.org (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | introduce watchdog_dev_suspend/resume | expand |
On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote: > The watchdog drivers often disable wdog clock during suspend and then > enable it again during resume. Nevertheless the ping worker is still > running and can issue low-level ping while the wdog clock is disabled > causing the system hang. To prevent such condition introduce > watchdog_dev_suspend/resume which can be used by any wdog driver and > actually cancel ping worker during suspend and restore it back, if > needed, during resume. > I'll have to look into this further, but I don't think this is the correct solution. Most likely the watchdog core needs to have its own independent suspend/resule functions and suspend the high resolution timer on suspend and restore it on resume. This may require an additional flag to be set by drivers to indicate that the timer should be stopped on suspend. > Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org> > --- > drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++ > include/linux/watchdog.h | 2 ++ > 2 files changed, 51 insertions(+) > > diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c > index 2946f3a63110..3feca1567281 100644 > --- a/drivers/watchdog/watchdog_dev.c > +++ b/drivers/watchdog/watchdog_dev.c > @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void) > kthread_destroy_worker(watchdog_kworker); > } > > +int watchdog_dev_suspend(struct watchdog_device *wdd) > +{ > + struct watchdog_core_data *wd_data = wdd->wd_data; > + int ret; > + > + if (!wdd->wd_data) > + return -ENODEV; > + > + /* ping for the last time before suspend */ > + mutex_lock(&wd_data->lock); > + if (watchdog_worker_should_ping(wd_data)) > + ret = __watchdog_ping(wd_data->wdd); > + mutex_unlock(&wd_data->lock); > + > + if (ret) > + return ret; > + > + /* > + * make sure that watchdog worker will not kick in when the wdog is > + * suspended > + */ > + hrtimer_cancel(&wd_data->timer); > + kthread_cancel_work_sync(&wd_data->work); > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(watchdog_dev_suspend); > + > +int watchdog_dev_resume(struct watchdog_device *wdd) > +{ > + struct watchdog_core_data *wd_data = wdd->wd_data; > + int ret; > + > + if (!wdd->wd_data) > + return -ENODEV; > + > + /* > + * __watchdog_ping will also retrigger hrtimer and therefore restore the > + * ping worker if needed. > + */ > + mutex_lock(&wd_data->lock); > + if (watchdog_worker_should_ping(wd_data)) > + ret = __watchdog_ping(wd_data->wdd); > + mutex_unlock(&wd_data->lock); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(watchdog_dev_resume); > + > module_param(handle_boot_enabled, bool, 0444); > MODULE_PARM_DESC(handle_boot_enabled, > "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default=" > diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h > index 9b19e6bb68b5..febfde3b1ff6 100644 > --- a/include/linux/watchdog.h > +++ b/include/linux/watchdog.h > @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd, > unsigned int timeout_parm, struct device *dev); > extern int watchdog_register_device(struct watchdog_device *); > extern void watchdog_unregister_device(struct watchdog_device *); > +int watchdog_dev_suspend(struct watchdog_device *wdd); > +int watchdog_dev_resume(struct watchdog_device *wdd); > > int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int); > > -- > 2.29.0 >
On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote: > > On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote: > > The watchdog drivers often disable wdog clock during suspend and then > > enable it again during resume. Nevertheless the ping worker is still > > running and can issue low-level ping while the wdog clock is disabled > > causing the system hang. To prevent such condition introduce > > watchdog_dev_suspend/resume which can be used by any wdog driver and > > actually cancel ping worker during suspend and restore it back, if > > needed, during resume. > > > > I'll have to look into this further, but I don't think this is the correct > solution. Most likely the watchdog core needs to have its own independent > suspend/resule functions and suspend the high resolution timer on > suspend and restore it on resume. This may require an additional flag > to be set by drivers to indicate that the timer should be stopped on > suspend. That makes sense - thank you for your suggestion. I think I could register a pm notifier in the watchdog core when the new e.g. WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and actually call watchdog_dev_suspend/resume from the notifier callback. Please let me know if you see any other issue with this solution, if not I will post v2. Thank you in advance, Grzegorz > > > Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org> > > --- > > drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++ > > include/linux/watchdog.h | 2 ++ > > 2 files changed, 51 insertions(+) > > > > diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c > > index 2946f3a63110..3feca1567281 100644 > > --- a/drivers/watchdog/watchdog_dev.c > > +++ b/drivers/watchdog/watchdog_dev.c > > @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void) > > kthread_destroy_worker(watchdog_kworker); > > } > > > > +int watchdog_dev_suspend(struct watchdog_device *wdd) > > +{ > > + struct watchdog_core_data *wd_data = wdd->wd_data; > > + int ret; > > + > > + if (!wdd->wd_data) > > + return -ENODEV; > > + > > + /* ping for the last time before suspend */ > > + mutex_lock(&wd_data->lock); > > + if (watchdog_worker_should_ping(wd_data)) > > + ret = __watchdog_ping(wd_data->wdd); > > + mutex_unlock(&wd_data->lock); > > + > > + if (ret) > > + return ret; > > + > > + /* > > + * make sure that watchdog worker will not kick in when the wdog is > > + * suspended > > + */ > > + hrtimer_cancel(&wd_data->timer); > > + kthread_cancel_work_sync(&wd_data->work); > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(watchdog_dev_suspend); > > + > > +int watchdog_dev_resume(struct watchdog_device *wdd) > > +{ > > + struct watchdog_core_data *wd_data = wdd->wd_data; > > + int ret; > > + > > + if (!wdd->wd_data) > > + return -ENODEV; > > + > > + /* > > + * __watchdog_ping will also retrigger hrtimer and therefore restore the > > + * ping worker if needed. > > + */ > > + mutex_lock(&wd_data->lock); > > + if (watchdog_worker_should_ping(wd_data)) > > + ret = __watchdog_ping(wd_data->wdd); > > + mutex_unlock(&wd_data->lock); > > + > > + return ret; > > +} > > +EXPORT_SYMBOL_GPL(watchdog_dev_resume); > > + > > module_param(handle_boot_enabled, bool, 0444); > > MODULE_PARM_DESC(handle_boot_enabled, > > "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default=" > > diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h > > index 9b19e6bb68b5..febfde3b1ff6 100644 > > --- a/include/linux/watchdog.h > > +++ b/include/linux/watchdog.h > > @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd, > > unsigned int timeout_parm, struct device *dev); > > extern int watchdog_register_device(struct watchdog_device *); > > extern void watchdog_unregister_device(struct watchdog_device *); > > +int watchdog_dev_suspend(struct watchdog_device *wdd); > > +int watchdog_dev_resume(struct watchdog_device *wdd); > > > > int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int); > > > > -- > > 2.29.0 > >
On Wed, Jun 16, 2021 at 03:59:23PM +0200, Grzegorz Jaszczyk wrote: > On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote: > > > > On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote: > > > The watchdog drivers often disable wdog clock during suspend and then > > > enable it again during resume. Nevertheless the ping worker is still > > > running and can issue low-level ping while the wdog clock is disabled > > > causing the system hang. To prevent such condition introduce > > > watchdog_dev_suspend/resume which can be used by any wdog driver and > > > actually cancel ping worker during suspend and restore it back, if > > > needed, during resume. > > > > > > > I'll have to look into this further, but I don't think this is the correct > > solution. Most likely the watchdog core needs to have its own independent > > suspend/resule functions and suspend the high resolution timer on > > suspend and restore it on resume. This may require an additional flag > > to be set by drivers to indicate that the timer should be stopped on > > suspend. > > That makes sense - thank you for your suggestion. I think I could > register a pm notifier in the watchdog core when the new e.g. > WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and > actually call watchdog_dev_suspend/resume from the notifier callback. > Please let me know if you see any other issue with this solution, if > not I will post v2. > Go for it. Thanks, Guenter
Hi Grzegorz, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on hwmon/hwmon-next] [also build test WARNING on soc/for-next linus/master v5.13-rc6 next-20210616] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447 base: https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next config: arm64-randconfig-r022-20210615 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 64720f57bea6a6bf033feef4a5751ab9c0c3b401) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # https://github.com/0day-ci/linux/commit/71dadcad8da1862c1205b19ddf4279d494e57545 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447 git checkout 71dadcad8da1862c1205b19ddf4279d494e57545 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): >> drivers/watchdog/watchdog_dev.c:1232:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:28: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var' #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/watchdog/watchdog_dev.c:1236:6: note: uninitialized use occurs here if (ret) ^~~ include/linux/compiler.h:56:47: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^~~~ include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var' #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond)) ^~~~ drivers/watchdog/watchdog_dev.c:1232:2: note: remove the 'if' if its condition is always true if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:23: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^ drivers/watchdog/watchdog_dev.c:1225:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 drivers/watchdog/watchdog_dev.c:1263:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:28: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var' #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/watchdog/watchdog_dev.c:1267:9: note: uninitialized use occurs here return ret; ^~~ drivers/watchdog/watchdog_dev.c:1263:2: note: remove the 'if' if its condition is always true if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:23: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^ drivers/watchdog/watchdog_dev.c:1253:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 2 warnings generated. vim +1232 drivers/watchdog/watchdog_dev.c 1221 1222 int watchdog_dev_suspend(struct watchdog_device *wdd) 1223 { 1224 struct watchdog_core_data *wd_data = wdd->wd_data; 1225 int ret; 1226 1227 if (!wdd->wd_data) 1228 return -ENODEV; 1229 1230 /* ping for the last time before suspend */ 1231 mutex_lock(&wd_data->lock); > 1232 if (watchdog_worker_should_ping(wd_data)) 1233 ret = __watchdog_ping(wd_data->wdd); 1234 mutex_unlock(&wd_data->lock); 1235 1236 if (ret) 1237 return ret; 1238 1239 /* 1240 * make sure that watchdog worker will not kick in when the wdog is 1241 * suspended 1242 */ 1243 hrtimer_cancel(&wd_data->timer); 1244 kthread_cancel_work_sync(&wd_data->work); 1245 1246 return 0; 1247 } 1248 EXPORT_SYMBOL_GPL(watchdog_dev_suspend); 1249 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 2946f3a63110..3feca1567281 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void) kthread_destroy_worker(watchdog_kworker); } +int watchdog_dev_suspend(struct watchdog_device *wdd) +{ + struct watchdog_core_data *wd_data = wdd->wd_data; + int ret; + + if (!wdd->wd_data) + return -ENODEV; + + /* ping for the last time before suspend */ + mutex_lock(&wd_data->lock); + if (watchdog_worker_should_ping(wd_data)) + ret = __watchdog_ping(wd_data->wdd); + mutex_unlock(&wd_data->lock); + + if (ret) + return ret; + + /* + * make sure that watchdog worker will not kick in when the wdog is + * suspended + */ + hrtimer_cancel(&wd_data->timer); + kthread_cancel_work_sync(&wd_data->work); + + return 0; +} +EXPORT_SYMBOL_GPL(watchdog_dev_suspend); + +int watchdog_dev_resume(struct watchdog_device *wdd) +{ + struct watchdog_core_data *wd_data = wdd->wd_data; + int ret; + + if (!wdd->wd_data) + return -ENODEV; + + /* + * __watchdog_ping will also retrigger hrtimer and therefore restore the + * ping worker if needed. + */ + mutex_lock(&wd_data->lock); + if (watchdog_worker_should_ping(wd_data)) + ret = __watchdog_ping(wd_data->wdd); + mutex_unlock(&wd_data->lock); + + return ret; +} +EXPORT_SYMBOL_GPL(watchdog_dev_resume); + module_param(handle_boot_enabled, bool, 0444); MODULE_PARM_DESC(handle_boot_enabled, "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default=" diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h index 9b19e6bb68b5..febfde3b1ff6 100644 --- a/include/linux/watchdog.h +++ b/include/linux/watchdog.h @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd, unsigned int timeout_parm, struct device *dev); extern int watchdog_register_device(struct watchdog_device *); extern void watchdog_unregister_device(struct watchdog_device *); +int watchdog_dev_suspend(struct watchdog_device *wdd); +int watchdog_dev_resume(struct watchdog_device *wdd); int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);
The watchdog drivers often disable wdog clock during suspend and then enable it again during resume. Nevertheless the ping worker is still running and can issue low-level ping while the wdog clock is disabled causing the system hang. To prevent such condition introduce watchdog_dev_suspend/resume which can be used by any wdog driver and actually cancel ping worker during suspend and restore it back, if needed, during resume. Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org> --- drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++ include/linux/watchdog.h | 2 ++ 2 files changed, 51 insertions(+)