[2/3] watchdog: control hard lockup detection default

Message ID 20140730134342.GA7959@redhat.com (mailing list archive)
State New, archived

Commit Message

Don Zickus July 30, 2014, 1:43 p.m. UTC
On Fri, Jul 25, 2014 at 01:25:11PM +0200, Andrew Jones wrote:
> > to enable hard lockup detection explicitly.
> > 
> > I think changing the 'watchdog_thresh' while 'watchdog_running' is true should
> > _not_ enable hard lockup detection as a side-effect, because a user may have a
> > 'sysctl.conf' entry such as
> > 
> >    kernel.watchdog_thresh = ...
> > 
> > or may only want to change the 'watchdog_thresh' on the fly.
> > 
> > I think the following flow of execution could cause such an undesired side effect.
> > 
> >    proc_dowatchdog
> >      if (watchdog_user_enabled && watchdog_thresh) {
> > 
> >          watchdog_enable_hardlockup_detector
> >            hardlockup_detector_enabled = true
> > 
> >          watchdog_enable_all_cpus
> >            if (!watchdog_running) {
> >                ...
> >            } else if (sample_period_changed)
> >                       update_timers_all_cpus
> >                         for_each_online_cpu
> >                             update_timers
> >                               watchdog_nmi_disable
> >                               ...
> >                               watchdog_nmi_enable
> > 
> >                                 watchdog_hardlockup_detector_is_enabled
> >                                   return true
> > 
> >                                 enable perf counter for hard lockup detection
> > 
> > Regards,
> > 
> > Uli
> 
> Nice catch. Looks like this will need a v2. Paolo, do we have a
> consensus on the proc echoing? Or should that be revisited in the v2 as
> well?

As discussed privately, how about something like this to handle that case:
(applied on top of these patches)

Cheers,
Don


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paolo Bonzini July 30, 2014, 2:16 p.m. UTC | #1
Il 30/07/2014 15:43, Don Zickus ha scritto:
>> > Nice catch. Looks like this will need a v2. Paolo, do we have a
>> > consensus on the proc echoing? Or should that be revisited in the v2 as
>> > well?
> As discussed privately, how about something like this to handle that case:
> (applied on top of these patches)

Don, what do you think about proc?

My opinion is still what I mentioned earlier in the thread, i.e. that if
the file says "1", writing "0" and then "1" should not constitute a
change WRT the initial state.

Paolo
Don Zickus July 30, 2014, 5:07 p.m. UTC | #2
On Wed, Jul 30, 2014 at 04:16:38PM +0200, Paolo Bonzini wrote:
> Il 30/07/2014 15:43, Don Zickus ha scritto:
> >> > Nice catch. Looks like this will need a v2. Paolo, do we have a
> >> > consensus on the proc echoing? Or should that be revisited in the v2 as
> >> > well?
> > As discussed privately, how about something like this to handle that case:
> > (applied on top of these patches)
> 
> Don, what do you think about proc?
> 
> My opinion is still what I mentioned earlier in the thread, i.e. that if
> the file says "1", writing "0" and then "1" should not constitute a
> change WRT the initial state.
> 

I can agree.  The problem is that this proc value controls two things,
softlockup and hardlockup.  I have always tried to keep them both
disabled or enabled together.

This patchset tries to separate them for an edge case.  Hence the proc
value becomes slightly confusing.

I don't know the right way to solve this without introducing more proc
values.

We have /proc/sys/kernel/nmi_watchdog and /proc/sys/kernel/watchdog which
point to the same internal variable.  Do I separate them and have
'nmi_watchdog' just mean hardlockup and 'watchdog' mean softlockup?  Then
we can be clear on what the output is.  Or does 'watchdog' represent a
superset of 'nmi_watchdog' && softlockup?

That is where the confusion lies.

Cheers,
Don
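
The two readings of /proc/sys/kernel/watchdog raised in this message can be written down as a small sketch. Nothing here is existing kernel code: `struct wd_knobs` and both helper functions are hypothetical names, and "superset" is read as "both detectors enabled", which is only one possible interpretation of the question.

```c
#include <stdbool.h>

/* Hypothetical split of the single shared variable into two flags. */
struct wd_knobs {
	bool nmi_watchdog;   /* assumed meaning: hard lockup detector only */
	bool soft_watchdog;  /* assumed meaning: soft lockup detector only */
};

/* Reading 1: 'watchdog' reports the soft lockup detector alone. */
bool watchdog_reads_as_softlockup(const struct wd_knobs *k)
{
	return k->soft_watchdog;
}

/* Reading 2: 'watchdog' reports 1 only when both detectors are on. */
bool watchdog_reads_as_superset(const struct wd_knobs *k)
{
	return k->nmi_watchdog && k->soft_watchdog;
}
```

With hard lockup detection off and soft lockup detection on, the first reading makes the file show 1 and the second makes it show 0. That difference is the ambiguity in question.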

Patch

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 34eca29..027fb6c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -666,7 +666,12 @@  int proc_dowatchdog(struct ctl_table *table, int write,
 	 * watchdog_*_all_cpus() function takes care of this.
 	 */
 	if (watchdog_user_enabled && watchdog_thresh) {
-		watchdog_enable_hardlockup_detector(true);
+		/*
+		 * Prevent a change in watchdog_thresh accidentally overriding
+		 * the enablement of the hardlockup detector.
+		 */
+		if (watchdog_user_enabled != old_enabled)
+			watchdog_enable_hardlockup_detector(true);
 		err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
 	} else
 		watchdog_disable_all_cpus();
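
The effect of the guard can be checked in isolation with a small user-space model. This is a sketch, not kernel code. The names `struct wd_state`, `wd_apply_write()`, `wd_write_thresh()` and `wd_write_enabled()` are hypothetical. The model assumes the sysctl handler has already stored the new value before the branch above runs. It also assumes that watchdog_enable_all_cpus() and watchdog_disable_all_cpus() do not themselves change the hard lockup flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the globals in kernel/watchdog.c. */
struct wd_state {
	int  user_enabled;        /* models watchdog_user_enabled */
	int  thresh;              /* models watchdog_thresh */
	bool hardlockup_enabled;  /* models hardlockup_detector_enabled */
};

/*
 * Model of the branch touched by the patch. 'guarded' selects the
 * patched behaviour; false reproduces the unconditional enable.
 */
void wd_apply_write(struct wd_state *s, int old_enabled, bool guarded)
{
	if (s->user_enabled && s->thresh) {
		if (!guarded || s->user_enabled != old_enabled)
			s->hardlockup_enabled = true;
		/* the real code calls watchdog_enable_all_cpus() here */
	}
	/* else: the real code calls watchdog_disable_all_cpus() */
}

/* Model of a write that changes only the threshold. */
void wd_write_thresh(struct wd_state *s, int new_thresh, bool guarded)
{
	int old_enabled = s->user_enabled;

	assert(new_thresh >= 0);
	s->thresh = new_thresh;
	wd_apply_write(s, old_enabled, guarded);
}

/* Model of a write that changes only the enable value. */
void wd_write_enabled(struct wd_state *s, int new_enabled, bool guarded)
{
	int old_enabled = s->user_enabled;

	s->user_enabled = new_enabled;
	wd_apply_write(s, old_enabled, guarded);
}
```

In this model, starting from user_enabled = 1 with the hard lockup flag clear, a threshold-only write sets the flag when `guarded` is false and leaves it clear when `guarded` is true. Writing 0 and then 1 to the enable value still sets the flag in the guarded case, which is the situation the proc discussion in the comments is about.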