Message ID | 87b1c97177c43eeec640483cc2f83f5f4d7b1060.camel@wefi.net (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [RFC] improve it87_wdt (IT8784/IT8786) / keeping WDTCTRL unchanged / deactivate watchdog by setting WDTVALLSB/WDTVALMSB 0 | expand |
Hi Guenter, hi Wim, hi Hanspeter, On Mon, 2023-10-16 at 15:16 +0200, Werner Fischer wrote: > Hi Guenter, hi Hanspeter, > > I currently testing two devices with IT8784 and IT8786 watchdog > timers. > Although the chips are supported by it87_wdt.c after Hanspeter's > patches back in 2020, the watchdog functionality does not work in my > following test: > - Debian 12 using Kernel 6.1.58 or current 6.6-rc > - loading module it87_wdt > - starting wd_keepalive Deamon > - killing wd_keepalive using signal 9 > -> system keeps on running even after the configured watchdog timeout > > For debugging purposes, I have used the patch below to report the > content of the watchdog registers 0x71 (WDTCTRL), 0x72 (WDTCFG), 0x73 > (WDTVALLSB), and 0x74 (WDTVALMSB) to the system log. > > It turned out, that 0x71 (WDTCTRL) has initially the following value > set (before the module changes it to 0x00): > - 8 decimal (IT8784 / IT8786) > - 4 decimal (IT8613) > > I figured out, that the following code line makes the watchdog of > IT8784 and IT8786 non-functional for me: > superio_outb(0x00, WDTCTRL); > I have removed this code in my patch below, then the watchdog works > for IT8784 and IT8786. > > I'm not sure, why the WDTCTRL register is set to 0x00 in the code. As > it seems, the register can have different meanings for differnt > IT8xxx chips. Accoring to [1] it seems sufficient to set both > WDTVALLSB and WDTVALMSB to 0x00 to deactivate the watchdog timer: > "When the WDT Time-out Value register is set to a non-zero value, the > WDT loads the value and begin counting down from the value." > This happens e.g. also when wd_keepalive is stopped cleanly. > > I am open to support to improve the it87_wdt code. > > But before I'm writing and sending a patch, I have the following > question: > * What is the reason, why WDTCTRL is set to 0x00 in the code? and > * Could we think about removing this (at least for IT8784/8786)? It seems to me that setting WDTCTRL to 0x00 has been in the code from the beginning. For my test systems with IT8784 and IT8786 I got the following information from the system vendor: "71H bit 3 is the mode choice for the clock input of the IT8784/IT8786 chip. This bit is set to 1 (= PCICLK mode) and can not be set to 0." Setting it to 0 breaks the watchdog functionality. Unfortunately, ITE does not provide the specifications PDFs publicly anymore. But the documentation at [2] provides details regarding the Watchdog Timer Control Register (71h) of an ITE chip, which has the description "External CLK_IN Select: 1: PCICLK" for bit 3, too. As it seems system-dependent, removing superio_outb(0x00, WDTCTRL); from the code may lead to problems with other ITE chips, which maybe could need WDTCTRL set to 0x00. So my idea to be on the safe side for exiting users of it87_wdt, too: * What do you think about an optional module parameter to let the user choose to leave WDTCTRL untouched? (this would make the watchdog work e.g. with my test systems with IT8784 and IT8786, too) In case you would support such a code change, I would be happy to write a patch (including some other minor/cosmetic code fixes/addition of further ITE chips). Best regards, Werner [2] http://files.nexcom.com/Driver/NDiSB425/User_Manual_NDiSB425-SI3_170111.pdf
On 11/7/23 05:18, Werner Fischer wrote: > Hi Guenter, hi Wim, hi Hanspeter, > > On Mon, 2023-10-16 at 15:16 +0200, Werner Fischer wrote: >> Hi Guenter, hi Hanspeter, >> >> I currently testing two devices with IT8784 and IT8786 watchdog >> timers. >> Although the chips are supported by it87_wdt.c after Hanspeter's >> patches back in 2020, the watchdog functionality does not work in my >> following test: >> - Debian 12 using Kernel 6.1.58 or current 6.6-rc >> - loading module it87_wdt >> - starting wd_keepalive Deamon >> - killing wd_keepalive using signal 9 >> -> system keeps on running even after the configured watchdog timeout >> >> For debugging purposes, I have used the patch below to report the >> content of the watchdog registers 0x71 (WDTCTRL), 0x72 (WDTCFG), 0x73 >> (WDTVALLSB), and 0x74 (WDTVALMSB) to the system log. >> >> It turned out, that 0x71 (WDTCTRL) has initially the following value >> set (before the module changes it to 0x00): >> - 8 decimal (IT8784 / IT8786) >> - 4 decimal (IT8613) >> >> I figured out, that the following code line makes the watchdog of >> IT8784 and IT8786 non-functional for me: >> superio_outb(0x00, WDTCTRL); >> I have removed this code in my patch below, then the watchdog works >> for IT8784 and IT8786. >> >> I'm not sure, why the WDTCTRL register is set to 0x00 in the code. As >> it seems, the register can have different meanings for differnt >> IT8xxx chips. Accoring to [1] it seems sufficient to set both >> WDTVALLSB and WDTVALMSB to 0x00 to deactivate the watchdog timer: >> "When the WDT Time-out Value register is set to a non-zero value, the >> WDT loads the value and begin counting down from the value." >> This happens e.g. also when wd_keepalive is stopped cleanly. >> >> I am open to support to improve the it87_wdt code. >> >> But before I'm writing and sending a patch, I have the following >> question: >> * What is the reason, why WDTCTRL is set to 0x00 in the code? and >> * Could we think about removing this (at least for IT8784/8786)? > It seems to me that setting WDTCTRL to 0x00 has been in the code from > the beginning. > > For my test systems with IT8784 and IT8786 I got the following > information from the system vendor: > "71H bit 3 is the mode choice for the clock input of the IT8784/IT8786 > chip. This bit is set to 1 (= PCICLK mode) and can not be set to 0." > Setting it to 0 breaks the watchdog functionality. > > Unfortunately, ITE does not provide the specifications PDFs publicly > anymore. But the documentation at [2] provides details regarding the They really never did, or at least not for a long time. Some board vendors used to be Linux-friendly and provided datasheets on request, but that is no longer the case. My recommendation used to be, and still is, not to use boards with Super-IO chips from ITE to run Linux. This is not only due to lack of datasheets, but also due to the lack of support from both chip and board vendors if there are any issues when trying to support the chips in Linux. > Watchdog Timer Control Register (71h) of an ITE chip, which has the > description "External CLK_IN Select: 1: PCICLK" for bit 3, too. > > As it seems system-dependent, removing > superio_outb(0x00, WDTCTRL); > from the code may lead to problems with other ITE chips, which maybe > could need WDTCTRL set to 0x00. > Bit 4..7 of the register are used to control watchdog timer resets (pings). Skipping the write entirely is therefore unacceptable even for IT8784/IT8786 because we _don't_ want activity on a (legacy) keyboard, mouse, game, or infrared port to reset the watchdog timer. > So my idea to be on the safe side for exiting users of it87_wdt, too: > * What do you think about an optional module parameter to let the user > choose to leave WDTCTRL untouched? (this would make the watchdog work > e.g. with my test systems with IT8784 and IT8786, too) > Make it conditional for IT8784/IT8786: On those chips, read the value from the chip and clear all but bit 3. This is the safest we can do. Future chips can be added as needed. Guenter
On Tue, 2023-11-07 at 06:34 -0800, Guenter Roeck wrote: > On 11/7/23 05:18, Werner Fischer wrote: > > > * What is the reason, why WDTCTRL is set to 0x00 in the code? and > > > * Could we think about removing this (at least for IT8784/8786)? > > It seems to me that setting WDTCTRL to 0x00 has been in the code > > from the beginning. > > > > For my test systems with IT8784 and IT8786 I got the following > > information from the system vendor: > > "71H bit 3 is the mode choice for the clock input of the > > IT8784/IT8786 chip. This bit is set to 1 (= PCICLK mode) and can > > not be set to 0." > > Setting it to 0 breaks the watchdog functionality. > > > > Unfortunately, ITE does not provide the specifications PDFs > > publicly anymore. But the documentation at [2] provides details > > regarding the > They really never did, or at least not for a long time. Some board > vendors used to be Linux-friendly and provided datasheets on request, > but that is no longer the case. My recommendation used to be, and > still is, not to use boards with Super-IO chips from ITE to run > Linux. This is not only due to lack of datasheets, but also due to > the lack of support from both chip and board vendors if there are any > issues when trying to support the chips in Linux. Fortunately, the manufacturer of my test systems supports me very well here. I will simply try to improve the watchdog support for the chips in the test systems. Currently I have four different boards, containing one of IT8613, IT8659, IT8784 and IT8786. > > Bit 4..7 of the register are used to control watchdog timer resets > (pings). Skipping the write entirely is therefore unacceptable even > for IT8784/IT8786 because we _don't_ want activity on a (legacy) > keyboard, mouse, game, or infrared port to reset the watchdog timer. Thanks for the explanation. Understood. > > So my idea to be on the safe side for exiting users of it87_wdt, > > too: > > * What do you think about an optional module parameter to let the > > user choose to leave WDTCTRL untouched? (this would make the > > watchdog work e.g. with my test systems with IT8784 and IT8786, > > too) > Make it conditional for IT8784/IT8786: On those chips, read the value > from the chip and clear all but bit 3. This is the safest we can do. > Future chips can be added as needed. This is a great idea, thank you very much for that. I am trying to write a corresponding patch. Werner
diff --git a/drivers/watchdog/it87_wdt.c b/drivers/watchdog/it87_wdt.c index bb1122909396..4ea5505b1771 100644 --- a/drivers/watchdog/it87_wdt.c +++ b/drivers/watchdog/it87_wdt.c @@ -50,6 +50,7 @@ /* Chip Id numbers */ #define NO_DEV_ID 0xffff #define IT8607_ID 0x8607 +#define IT8613_ID 0x8613 #define IT8620_ID 0x8620 #define IT8622_ID 0x8622 #define IT8625_ID 0x8625 @@ -152,9 +153,26 @@ static inline int superio_inw(int reg) return val; } +static void wdt_debug_registers(unsigned int debugstage) +{ + unsigned int wdt_ctrl; + unsigned int wdt_cfg; + unsigned int wdt_vallsb; + unsigned int wdt_valmsb; + wdt_ctrl = superio_inb(WDTCTRL); + wdt_cfg = superio_inb(WDTCFG); + wdt_vallsb = superio_inb(WDTVALLSB); + wdt_valmsb = superio_inb(WDTVALMSB); + pr_info("DEBUG at stage %d: wdt_ctrl=%d; wdt_cfg=%d; wdt_vallsb=%d, wdt_valmsb=%d\n", debugstage, wdt_ctrl, wdt_cfg, wdt_vallsb, wdt_valmsb); +} + /* Internal function, should be called after superio_select(GPIO) */ static void _wdt_update_timeout(unsigned int t) { + wdt_debug_registers(3); + /* test werner reset */ + /* superio_outb(0x00, WDTCTRL); */ + wdt_debug_registers(9); unsigned char cfg = WDT_KRST; if (testmode) @@ -172,6 +190,7 @@ static void _wdt_update_timeout(unsigned int t) superio_outb(t, WDTVALLSB); if (max_units > 255) superio_outb(t >> 8, WDTVALMSB); + wdt_debug_registers(4); } static int wdt_update_timeout(unsigned int t) @@ -258,11 +277,13 @@ static int __init it87_wdt_init(void) int rc; rc = superio_enter(); + wdt_debug_registers(0); if (rc) return rc; chip_type = superio_inw(CHIPID); chip_rev = superio_inb(CHIPREV) & 0x0f; + wdt_debug_registers(1); superio_exit(); switch (chip_type) { @@ -274,9 +295,8 @@ static int __init it87_wdt_init(void) break; case IT8716_ID: case IT8726_ID: - max_units = 65535; - break; case IT8607_ID: + case IT8613_ID: case IT8620_ID: case IT8622_ID: case IT8625_ID: @@ -310,10 +330,12 @@ static int __init it87_wdt_init(void) rc = superio_enter(); if (rc) return rc; - + wdt_debug_registers(10); superio_select(GPIO); + wdt_debug_registers(11); superio_outb(WDT_TOV1, WDTCFG); - superio_outb(0x00, WDTCTRL); + /* superio_outb(0x00, WDTCTRL); */ + wdt_debug_registers(2); superio_exit(); if (timeout < 1 || timeout > max_units * 60) {