diff mbox series

[1/2] watchdog: introduce watchdog_dev_suspend/resume

Message ID 20210615123904.2568052-2-grzegorz.jaszczyk@linaro.org (mailing list archive)
State New, archived
Headers show
Series introduce watchdog_dev_suspend/resume | expand

Commit Message

Grzegorz Jaszczyk June 15, 2021, 12:39 p.m. UTC
The watchdog drivers often disable wdog clock during suspend and then
enable it again during resume. Nevertheless the ping worker is still
running and can issue low-level ping while the wdog clock is disabled
causing the system hang. To prevent such condition introduce
watchdog_dev_suspend/resume which can be used by any wdog driver and
actually cancel ping worker during suspend and restore it back, if
needed, during resume.

Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org>
---
 drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++
 include/linux/watchdog.h        |  2 ++
 2 files changed, 51 insertions(+)

Comments

Grzegorz Jaszczyk June 16, 2021, 1:59 p.m. UTC | #1
On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote:
> > The watchdog drivers often disable wdog clock during suspend and then
> > enable it again during resume. Nevertheless the ping worker is still
> > running and can issue low-level ping while the wdog clock is disabled
> > causing the system hang. To prevent such condition introduce
> > watchdog_dev_suspend/resume which can be used by any wdog driver and
> > actually cancel ping worker during suspend and restore it back, if
> > needed, during resume.
> >
>
> I'll have to look into this further, but I don't think this is the correct
> solution. Most likely the watchdog core needs to have its own independent
> suspend/resule functions and suspend the high resolution timer on
> suspend and restore it on resume. This may require an additional flag
> to be set by drivers to indicate that the timer should be stopped on
> suspend.

That makes sense - thank you for your suggestion. I think I could
register a pm notifier in the watchdog core when the new e.g.
WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and
actually call watchdog_dev_suspend/resume from the notifier callback.
Please let me know if you see any other issue with this solution, if
not I will post v2.

Thank you in advance,
Grzegorz




>
> > Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org>
> > ---
> >  drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++
> >  include/linux/watchdog.h        |  2 ++
> >  2 files changed, 51 insertions(+)
> >
> > diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> > index 2946f3a63110..3feca1567281 100644
> > --- a/drivers/watchdog/watchdog_dev.c
> > +++ b/drivers/watchdog/watchdog_dev.c
> > @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void)
> >       kthread_destroy_worker(watchdog_kworker);
> >  }
> >
> > +int watchdog_dev_suspend(struct watchdog_device *wdd)
> > +{
> > +     struct watchdog_core_data *wd_data = wdd->wd_data;
> > +     int ret;
> > +
> > +     if (!wdd->wd_data)
> > +             return -ENODEV;
> > +
> > +     /* ping for the last time before suspend */
> > +     mutex_lock(&wd_data->lock);
> > +     if (watchdog_worker_should_ping(wd_data))
> > +             ret = __watchdog_ping(wd_data->wdd);
> > +     mutex_unlock(&wd_data->lock);
> > +
> > +     if (ret)
> > +             return ret;
> > +
> > +     /*
> > +      * make sure that watchdog worker will not kick in when the wdog is
> > +      * suspended
> > +      */
> > +     hrtimer_cancel(&wd_data->timer);
> > +     kthread_cancel_work_sync(&wd_data->work);
> > +
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(watchdog_dev_suspend);
> > +
> > +int watchdog_dev_resume(struct watchdog_device *wdd)
> > +{
> > +     struct watchdog_core_data *wd_data = wdd->wd_data;
> > +     int ret;
> > +
> > +     if (!wdd->wd_data)
> > +             return -ENODEV;
> > +
> > +     /*
> > +      * __watchdog_ping will also retrigger hrtimer and therefore restore the
> > +      * ping worker if needed.
> > +      */
> > +     mutex_lock(&wd_data->lock);
> > +     if (watchdog_worker_should_ping(wd_data))
> > +             ret = __watchdog_ping(wd_data->wdd);
> > +     mutex_unlock(&wd_data->lock);
> > +
> > +     return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(watchdog_dev_resume);
> > +
> >  module_param(handle_boot_enabled, bool, 0444);
> >  MODULE_PARM_DESC(handle_boot_enabled,
> >       "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="
> > diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> > index 9b19e6bb68b5..febfde3b1ff6 100644
> > --- a/include/linux/watchdog.h
> > +++ b/include/linux/watchdog.h
> > @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,
> >                                 unsigned int timeout_parm, struct device *dev);
> >  extern int watchdog_register_device(struct watchdog_device *);
> >  extern void watchdog_unregister_device(struct watchdog_device *);
> > +int watchdog_dev_suspend(struct watchdog_device *wdd);
> > +int watchdog_dev_resume(struct watchdog_device *wdd);
> >
> >  int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);
> >
> > --
> > 2.29.0
> >
Guenter Roeck June 16, 2021, 5:57 p.m. UTC | #2
On Wed, Jun 16, 2021 at 03:59:23PM +0200, Grzegorz Jaszczyk wrote:
> On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote:
> > > The watchdog drivers often disable wdog clock during suspend and then
> > > enable it again during resume. Nevertheless the ping worker is still
> > > running and can issue low-level ping while the wdog clock is disabled
> > > causing the system hang. To prevent such condition introduce
> > > watchdog_dev_suspend/resume which can be used by any wdog driver and
> > > actually cancel ping worker during suspend and restore it back, if
> > > needed, during resume.
> > >
> >
> > I'll have to look into this further, but I don't think this is the correct
> > solution. Most likely the watchdog core needs to have its own independent
> > suspend/resule functions and suspend the high resolution timer on
> > suspend and restore it on resume. This may require an additional flag
> > to be set by drivers to indicate that the timer should be stopped on
> > suspend.
> 
> That makes sense - thank you for your suggestion. I think I could
> register a pm notifier in the watchdog core when the new e.g.
> WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and
> actually call watchdog_dev_suspend/resume from the notifier callback.
> Please let me know if you see any other issue with this solution, if
> not I will post v2.
> 
Go for it.

Thanks,
Guenter
kernel test robot June 16, 2021, 11:21 p.m. UTC | #3
Hi Grzegorz,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on hwmon/hwmon-next]
[also build test WARNING on soc/for-next linus/master v5.13-rc6 next-20210616]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447
base:   https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next
config: arm64-randconfig-r022-20210615 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 64720f57bea6a6bf033feef4a5751ab9c0c3b401)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/71dadcad8da1862c1205b19ddf4279d494e57545
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447
        git checkout 71dadcad8da1862c1205b19ddf4279d494e57545
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/watchdog/watchdog_dev.c:1232:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/watchdog/watchdog_dev.c:1236:6: note: uninitialized use occurs here
           if (ret)
               ^~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/watchdog/watchdog_dev.c:1232:2: note: remove the 'if' if its condition is always true
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:23: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                         ^
   drivers/watchdog/watchdog_dev.c:1225:9: note: initialize the variable 'ret' to silence this warning
           int ret;
                  ^
                   = 0
   drivers/watchdog/watchdog_dev.c:1263:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/watchdog/watchdog_dev.c:1267:9: note: uninitialized use occurs here
           return ret;
                  ^~~
   drivers/watchdog/watchdog_dev.c:1263:2: note: remove the 'if' if its condition is always true
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:23: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                         ^
   drivers/watchdog/watchdog_dev.c:1253:9: note: initialize the variable 'ret' to silence this warning
           int ret;
                  ^
                   = 0
   2 warnings generated.


vim +1232 drivers/watchdog/watchdog_dev.c

  1221	
  1222	int watchdog_dev_suspend(struct watchdog_device *wdd)
  1223	{
  1224		struct watchdog_core_data *wd_data = wdd->wd_data;
  1225		int ret;
  1226	
  1227		if (!wdd->wd_data)
  1228			return -ENODEV;
  1229	
  1230		/* ping for the last time before suspend */
  1231		mutex_lock(&wd_data->lock);
> 1232		if (watchdog_worker_should_ping(wd_data))
  1233			ret = __watchdog_ping(wd_data->wdd);
  1234		mutex_unlock(&wd_data->lock);
  1235	
  1236		if (ret)
  1237			return ret;
  1238	
  1239		/*
  1240		 * make sure that watchdog worker will not kick in when the wdog is
  1241		 * suspended
  1242		 */
  1243		hrtimer_cancel(&wd_data->timer);
  1244		kthread_cancel_work_sync(&wd_data->work);
  1245	
  1246		return 0;
  1247	}
  1248	EXPORT_SYMBOL_GPL(watchdog_dev_suspend);
  1249	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff mbox series

Patch

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 2946f3a63110..3feca1567281 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -1219,6 +1219,55 @@  void __exit watchdog_dev_exit(void)
 	kthread_destroy_worker(watchdog_kworker);
 }
 
+int watchdog_dev_suspend(struct watchdog_device *wdd)
+{
+	struct watchdog_core_data *wd_data = wdd->wd_data;
+	int ret;
+
+	if (!wdd->wd_data)
+		return -ENODEV;
+
+	/* ping for the last time before suspend */
+	mutex_lock(&wd_data->lock);
+	if (watchdog_worker_should_ping(wd_data))
+		ret = __watchdog_ping(wd_data->wdd);
+	mutex_unlock(&wd_data->lock);
+
+	if (ret)
+		return ret;
+
+	/*
+	 * make sure that watchdog worker will not kick in when the wdog is
+	 * suspended
+	 */
+	hrtimer_cancel(&wd_data->timer);
+	kthread_cancel_work_sync(&wd_data->work);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(watchdog_dev_suspend);
+
+int watchdog_dev_resume(struct watchdog_device *wdd)
+{
+	struct watchdog_core_data *wd_data = wdd->wd_data;
+	int ret;
+
+	if (!wdd->wd_data)
+		return -ENODEV;
+
+	/*
+	 * __watchdog_ping will also retrigger hrtimer and therefore restore the
+	 * ping worker if needed.
+	 */
+	mutex_lock(&wd_data->lock);
+	if (watchdog_worker_should_ping(wd_data))
+		ret = __watchdog_ping(wd_data->wdd);
+	mutex_unlock(&wd_data->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(watchdog_dev_resume);
+
 module_param(handle_boot_enabled, bool, 0444);
 MODULE_PARM_DESC(handle_boot_enabled,
 	"Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 9b19e6bb68b5..febfde3b1ff6 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -209,6 +209,8 @@  extern int watchdog_init_timeout(struct watchdog_device *wdd,
 				  unsigned int timeout_parm, struct device *dev);
 extern int watchdog_register_device(struct watchdog_device *);
 extern void watchdog_unregister_device(struct watchdog_device *);
+int watchdog_dev_suspend(struct watchdog_device *wdd);
+int watchdog_dev_resume(struct watchdog_device *wdd);
 
 int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);