[1/2] watchdog/hpwdt: Disable NMI in Crash Kernel

Message ID 1606097320-56762-2-git-send-email-jerry.hoemann@hpe.com
State Accepted
Series watchdog/hpwdt: Disable Pretimeout/NMI in Crash Path

Commit Message

Jerry Hoemann Nov. 23, 2020, 2:08 a.m. UTC
NMIs received during the crash path are problematic because
hpwdt_pretimeout's handling of the NMI would cause a reentry into kdump.

The situation is complicated in that I/O errors can be signaled as NMIs,
circumventing hpwdt_pretimeout's attempt to avoid claiming NMIs that are
not associated with either the WDT or the iLO NMI switch.  These NMIs can
additionally cause a secondary NMI, which causes the system to hang.

By disabling pretimeout and kdumptimeout in the crash path we both reduce
the risk of receiving an NMI and simultaneously leave the WDT running
(if it was already in use), so that a WDT reset can break the system out
of hangs.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
---
 drivers/watchdog/hpwdt.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Guenter Roeck Nov. 23, 2020, 2:20 a.m. UTC | #1
On Sun, Nov 22, 2020 at 07:08:39PM -0700, Jerry Hoemann wrote:
> NMIs received during the crash path are problematic because
> hpwdt_pretimeout's handling of the NMI would cause a reentry into kdump.
> 
> The situation is complicated in that I/O errors can be signaled as NMIs,
> circumventing hpwdt_pretimeout's attempt to avoid claiming NMIs that are
> not associated with either the WDT or the iLO NMI switch.  These NMIs can
> additionally cause a secondary NMI, which causes the system to hang.
> 
> By disabling pretimeout and kdumptimeout in the crash path we both reduce
> the risk of receiving an NMI and simultaneously leave the WDT running
> (if it was already in use), so that a WDT reset can break the system out
> of hangs.
> 
> Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>

Reviewed-by: Guenter Roeck <linux@roeck-us.net>

> ---
>  drivers/watchdog/hpwdt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> index 7d34bcf..eeb4df2 100644
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -21,6 +21,7 @@
>  #include <linux/types.h>
>  #include <linux/watchdog.h>
>  #include <asm/nmi.h>
> +#include <linux/crash_dump.h>
>  
>  #define HPWDT_VERSION			"2.0.3"
>  #define SECS_TO_TICKS(secs)		((secs) * 1000 / 128)
> @@ -334,6 +335,11 @@ static int hpwdt_init_one(struct pci_dev *dev,
>  	watchdog_set_nowayout(&hpwdt_dev, nowayout);
>  	watchdog_init_timeout(&hpwdt_dev, soft_margin, NULL);
>  
> +	if (is_kdump_kernel()) {
> +		pretimeout = 0;
> +		kdumptimeout = 0;
> +	}
> +
>  	if (pretimeout && hpwdt_dev.timeout <= PRETIMEOUT_SEC) {
>  		dev_warn(&dev->dev, "timeout <= pretimeout. Setting pretimeout to zero\n");
>  		pretimeout = 0;
> -- 
> 1.8.3.1
>

Patch

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 7d34bcf..eeb4df2 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -21,6 +21,7 @@ 
 #include <linux/types.h>
 #include <linux/watchdog.h>
 #include <asm/nmi.h>
+#include <linux/crash_dump.h>
 
 #define HPWDT_VERSION			"2.0.3"
 #define SECS_TO_TICKS(secs)		((secs) * 1000 / 128)
@@ -334,6 +335,11 @@  static int hpwdt_init_one(struct pci_dev *dev,
 	watchdog_set_nowayout(&hpwdt_dev, nowayout);
 	watchdog_init_timeout(&hpwdt_dev, soft_margin, NULL);
 
+	if (is_kdump_kernel()) {
+		pretimeout = 0;
+		kdumptimeout = 0;
+	}
+
 	if (pretimeout && hpwdt_dev.timeout <= PRETIMEOUT_SEC) {
 		dev_warn(&dev->dev, "timeout <= pretimeout. Setting pretimeout to zero\n");
 		pretimeout = 0;
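
The effect of the added check can be illustrated with a small user-space
model. This is only a sketch, not driver code: struct hpwdt_model,
hpwdt_model_init(), hpwdt_model_selfcheck() and MODEL_PRETIMEOUT_SEC are
hypothetical names introduced here, the value 9 is an assumed stand-in for
the driver's PRETIMEOUT_SEC, and the in_kdump_kernel flag stands in for
is_kdump_kernel(). It mirrors only the order of the two checks visible in
the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-in for the driver's PRETIMEOUT_SEC; the value is a guess. */
#define MODEL_PRETIMEOUT_SEC 9

/* Hypothetical user-space stand-in for the driver's module parameters. */
struct hpwdt_model {
	int pretimeout;		/* nonzero: pretimeout NMI handling requested */
	int kdumptimeout;	/* forced to 0 by the patch in the kdump kernel */
	unsigned int timeout;	/* watchdog timeout in seconds */
};

/*
 * Apply the checks in the same order as the patched hpwdt_init_one():
 * first the kdump override, then the existing timeout/pretimeout check.
 */
struct hpwdt_model hpwdt_model_init(struct hpwdt_model m, bool in_kdump_kernel)
{
	if (in_kdump_kernel) {
		m.pretimeout = 0;
		m.kdumptimeout = 0;
	}

	if (m.pretimeout && m.timeout <= MODEL_PRETIMEOUT_SEC)
		m.pretimeout = 0;

	return m;
}

/* Small self-check comparing a normal boot with a kdump boot. */
void hpwdt_model_selfcheck(void)
{
	struct hpwdt_model boot = {
		.pretimeout = 1, .kdumptimeout = -1, .timeout = 30,
	};
	struct hpwdt_model normal = hpwdt_model_init(boot, false);
	struct hpwdt_model crash = hpwdt_model_init(boot, true);

	/* Normal kernel: parameters pass through unchanged. */
	assert(normal.pretimeout == 1 && normal.kdumptimeout == -1);
	/* Crash kernel: both NMI-related parameters are forced to zero. */
	assert(crash.pretimeout == 0 && crash.kdumptimeout == 0);
	/* The watchdog timeout itself is left untouched. */
	assert(crash.timeout == 30);
}
```

In this model the kdump override runs before the timeout comparison, so in
the crash kernel the "timeout <= pretimeout" warning path is never reached.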