Message ID | 20210616190708.1220-1-digetx@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | [v1] hwmon: (lm90) Use edge-triggered interrupt |
On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> The LM90 driver uses level-based interrupt triggering. The interrupt
> handler prints a warning message about the breached temperature and
> quits. There is no way to stop interrupt from re-triggering since it's
> level-based, thus thousands of warning messages are printed per second
> once interrupt is triggered. Use edge-triggered interrupt in order to
> fix this trouble.
>
> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/hwmon/lm90.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> index ebbfd5f352c0..ce8ebe60fcdc 100644
> --- a/drivers/hwmon/lm90.c
> +++ b/drivers/hwmon/lm90.c
> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>  		err = devm_request_threaded_irq(dev, client->irq,
>  						NULL, lm90_irq_thread,
> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>  						"lm90", client);

We can't do that. Problem is that many of the devices supported by this driver
behave differently when it comes to interrupts. Specifically, the interrupt
handler is supposed to reset the interrupt condition (ie reading the status
register should reset it). If that is not the case for a specific chip,
we'll have to update the code to address the problem for that specific chip.
The above code would probably just generate a single interrupt while never
resetting the interrupt condition, which is obviously not what we want to
happen.

Guenter

>  		if (err < 0) {
>  			dev_err(dev, "cannot request IRQ %d\n", client->irq);
> --
> 2.30.2
>
17.06.2021 03:12, Guenter Roeck wrote:
> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>> The LM90 driver uses level-based interrupt triggering. The interrupt
>> handler prints a warning message about the breached temperature and
>> quits. There is no way to stop interrupt from re-triggering since it's
>> level-based, thus thousands of warning messages are printed per second
>> once interrupt is triggered. Use edge-triggered interrupt in order to
>> fix this trouble.
>>
>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/hwmon/lm90.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>> --- a/drivers/hwmon/lm90.c
>> +++ b/drivers/hwmon/lm90.c
>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>  		err = devm_request_threaded_irq(dev, client->irq,
>>  						NULL, lm90_irq_thread,
>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>  						"lm90", client);
>
> We can't do that. Problem is that many of the devices supported by this driver
> behave differently when it comes to interrupts. Specifically, the interrupt
> handler is supposed to reset the interrupt condition (ie reading the status
> register should reset it). If that is not the case for a specific chip,
> we'll have to update the code to address the problem for that specific chip.
> The above code would probably just generate a single interrupt while never
> resetting the interrupt condition, which is obviously not what we want to
> happen.

The nct1008/72 datasheet [1] says that reading the status register
doesn't reset interrupt until temperature is returned back into normal
state, which is what I'm witnessing.

[1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf

Page 10 "Status Register":

"Reading the status register clears the five flags, Bit 6 to Bit 2,
provided the error conditions causing the flags to be set have gone
away. A flag bit can be reset only if the corresponding
value register contains an in-limit measurement or if the
sensor is good."

So the interrupt handler doesn't actually stop interrupt from
reoccurring and the whole KMSG is instantly spammed with:

...
[ 217.484034] lm90 0-004c: temp2 out of range, please check!
[ 217.484569] lm90 0-004c: temp2 out of range, please check!
[ 217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
[ 217.485109] lm90 0-004c: temp2 out of range, please check!
[ 217.485699] lm90 0-004c: temp2 out of range, please check!
[ 217.486235] lm90 0-004c: temp2 out of range, please check!
[ 217.486776] lm90 0-004c: temp2 out of range, please check!
[ 217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...

It's interesting that the very first version of the nct1008-support
patch used edge-triggered interrupt flags [2].

[2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html

Limiting the interrupt rate could be an alternative solution.

What do you think about something like this:

diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ce8ebe60fcdc..74886b8066ab 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -79,6 +79,7 @@
  * concern all supported chipsets, unless mentioned otherwise.
  */
 
+#include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/slab.h>
@@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657, max6659, adt7461, max6680,
 #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit tripped */
 #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */
 
+/* Prevent instant interrupt re-triggering */
+#define LM90_IRQ_DELAY	(15 * MSEC_PER_SEC)
+
 /*
  * Driver data (common to all clients)
  */
@@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void *dev_id)
 	struct i2c_client *client = dev_id;
 	u16 status;
 
-	if (lm90_is_tripped(client, &status))
-		return IRQ_HANDLED;
-	else
+	if (!lm90_is_tripped(client, &status))
 		return IRQ_NONE;
+
+	msleep(LM90_IRQ_DELAY);
+
+	return IRQ_HANDLED;
 }
 
 static void lm90_remove_pec(void *dev)
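The difference between the two trigger types debated above can be sketched in plain userspace C. This is only an illustrative model, not driver code: the `trigger` enum and `count_handler_runs()` are hypothetical names, and the model assumes the handler cannot deassert the alert line while the temperature is out of range (as reported above for the nct1008), so each array element stands for the line state seen when the handler would be re-armed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trigger modes for this sketch only. */
enum trigger { TRIGGER_LEVEL_LOW, TRIGGER_EDGE_FALLING };

/*
 * Count how often the handler would run for a sampled alert line.
 * asserted[i] is true while the (active-low) line is pulled low.
 * Level mode re-fires for every sample where the line is still asserted;
 * edge mode fires only on a deasserted -> asserted transition.
 */
static unsigned int count_handler_runs(const bool *asserted, size_t n,
				       enum trigger mode)
{
	unsigned int runs = 0;
	bool prev = false;	/* line assumed deasserted before the trace */

	for (size_t i = 0; i < n; i++) {
		bool fire = (mode == TRIGGER_LEVEL_LOW)
			? asserted[i]
			: (asserted[i] && !prev);

		if (fire)
			runs++;
		prev = asserted[i];
	}
	return runs;
}
```

For a trace in which the line is asserted for four samples, released once, and asserted again for two samples, the level-triggered count grows with the length of the alarm while the edge-triggered count grows only with the number of alarms. Whether the line really deasserts on its own is chip-dependent, which is the concern raised in the review.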
On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> 17.06.2021 03:12, Guenter Roeck wrote:
> > On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >> The LM90 driver uses level-based interrupt triggering. The interrupt
> >> handler prints a warning message about the breached temperature and
> >> quits. There is no way to stop interrupt from re-triggering since it's
> >> level-based, thus thousands of warning messages are printed per second
> >> once interrupt is triggered. Use edge-triggered interrupt in order to
> >> fix this trouble.
> >>
> >> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/hwmon/lm90.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >> --- a/drivers/hwmon/lm90.c
> >> +++ b/drivers/hwmon/lm90.c
> >> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>  		err = devm_request_threaded_irq(dev, client->irq,
> >>  						NULL, lm90_irq_thread,
> >> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>  						"lm90", client);
> >
> > We can't do that. Problem is that many of the devices supported by this driver
> > behave differently when it comes to interrupts. Specifically, the interrupt
> > handler is supposed to reset the interrupt condition (ie reading the status
> > register should reset it). If that is not the case for a specific chip,
> > we'll have to update the code to address the problem for that specific chip.
> > The above code would probably just generate a single interrupt while never
> > resetting the interrupt condition, which is obviously not what we want to
> > happen.
>
> The nct1008/72 datasheet [1] says that reading the status register
> doesn't reset interrupt until temperature is returned back into normal
> state, which is what I'm witnessing.
>
> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
>
> Page 10 "Status Register":
>
> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> provided the error conditions causing the flags to be set have gone
> away. A flag bit can be reset only if the corresponding
> value register contains an in-limit measurement or if the
> sensor is good."
>
> So the interrupt handler doesn't actually stop interrupt from
> reoccurring and the whole KMSG is instantly spammed with:
>
> ...
> [ 217.484034] lm90 0-004c: temp2 out of range, please check!
> [ 217.484569] lm90 0-004c: temp2 out of range, please check!
> [ 217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some
> messages lost.
> [ 217.485109] lm90 0-004c: temp2 out of range, please check!
> [ 217.485699] lm90 0-004c: temp2 out of range, please check!
> [ 217.486235] lm90 0-004c: temp2 out of range, please check!
> [ 217.486776] lm90 0-004c: temp2 out of range, please check!
> [ 217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
>
> It's interesting that the very first version of the nct1008-support
> patch used edge-triggered interrupt flags [2].
>
> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
>

A lot of this depends on the chip and its wiring, as well as on chip
configuration. Even for a specific chip there may be configuration
dependencies. The interrupt configuration in situations like this
should really be determined by devicetree configuration, and not
be hardcoded. Is this a devicetree based system? If so, there should
be an entry for this chip pointing to the interrupt, and that entry
should include a trigger mask. That mask should be set to edge
triggered.

> Limiting the interrupt rate could be an alternative solution.
>
> What do you think about something like this:
>

A sleep in an interrupt handler to "prevent" an interrupt storm is never
acceptable.

Guenter

> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> index ce8ebe60fcdc..74886b8066ab 100644
> --- a/drivers/hwmon/lm90.c
> +++ b/drivers/hwmon/lm90.c
> @@ -79,6 +79,7 @@
>   * concern all supported chipsets, unless mentioned otherwise.
>   */
>
> +#include <linux/delay.h>
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
> @@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657,
> max6659, adt7461, max6680,
>  #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit
> tripped */
>  #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */
>
> +/* Prevent instant interrupt re-triggering */
> +#define LM90_IRQ_DELAY	(15 * MSEC_PER_SEC)
> +
>  /*
>   * Driver data (common to all clients)
>   */
> @@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void
> *dev_id)
>  	struct i2c_client *client = dev_id;
>  	u16 status;
>
> -	if (lm90_is_tripped(client, &status))
> -		return IRQ_HANDLED;
> -	else
> +	if (!lm90_is_tripped(client, &status))
>  		return IRQ_NONE;
> +
> +	msleep(LM90_IRQ_DELAY);
> +
> +	return IRQ_HANDLED;
>  }
>
>  static void lm90_remove_pec(void *dev)
17.06.2021 16:12, Guenter Roeck wrote:
> On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
>> 17.06.2021 03:12, Guenter Roeck wrote:
>>> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>>>> The LM90 driver uses level-based interrupt triggering. The interrupt
>>>> handler prints a warning message about the breached temperature and
>>>> quits. There is no way to stop interrupt from re-triggering since it's
>>>> level-based, thus thousands of warning messages are printed per second
>>>> once interrupt is triggered. Use edge-triggered interrupt in order to
>>>> fix this trouble.
>>>>
>>>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/hwmon/lm90.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>>>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>>>> --- a/drivers/hwmon/lm90.c
>>>> +++ b/drivers/hwmon/lm90.c
>>>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>>>  		err = devm_request_threaded_irq(dev, client->irq,
>>>>  						NULL, lm90_irq_thread,
>>>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>>>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>>>  						"lm90", client);
>>>
>>> We can't do that. Problem is that many of the devices supported by this driver
>>> behave differently when it comes to interrupts. Specifically, the interrupt
>>> handler is supposed to reset the interrupt condition (ie reading the status
>>> register should reset it). If that is not the case for a specific chip,
>>> we'll have to update the code to address the problem for that specific chip.
>>> The above code would probably just generate a single interrupt while never
>>> resetting the interrupt condition, which is obviously not what we want to
>>> happen.
>>
>> The nct1008/72 datasheet [1] says that reading the status register
>> doesn't reset interrupt until temperature is returned back into normal
>> state, which is what I'm witnessing.
>>
>> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
>>
>> Page 10 "Status Register":
>>
>> "Reading the status register clears the five flags, Bit 6 to Bit 2,
>> provided the error conditions causing the flags to be set have gone
>> away. A flag bit can be reset only if the corresponding
>> value register contains an in-limit measurement or if the
>> sensor is good."
>>
>> So the interrupt handler doesn't actually stop interrupt from
>> reoccurring and the whole KMSG is instantly spammed with:
>>
>> ...
>> [ 217.484034] lm90 0-004c: temp2 out of range, please check!
>> [ 217.484569] lm90 0-004c: temp2 out of range, please check!
>> [ 217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some
>> messages lost.
>> [ 217.485109] lm90 0-004c: temp2 out of range, please check!
>> [ 217.485699] lm90 0-004c: temp2 out of range, please check!
>> [ 217.486235] lm90 0-004c: temp2 out of range, please check!
>> [ 217.486776] lm90 0-004c: temp2 out of range, please check!
>> [ 217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
>>
>> It's interesting that the very first version of the nct1008-support
>> patch used edge-triggered interrupt flags [2].
>>
>> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
>>
> A lot of this depends on the chip and its wiring, as well as on chip
> configuration. Even for a specific chip there may be configuration
> dependencies. The interrupt configuration in situations like this
> should really be determined by devicetree configuration, and not
> be hardcoded. Is this a devicetree based system? If so, there should
> be an entry for this chip pointing to the interrupt, and that entry
> should include a trigger mask. That mask should be set to edge
> triggered.

This is a device-tree based system, in particular it's NVIDIA Tegra30
Nexus 7. The interrupt support was originally added to the lm90 driver
by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
device-trees are specifying the trigger mask and apparently they all are
cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
should be IRQ_TYPE_EDGE_FALLING.

The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
specified in a device-tree. IIUC, the interrupt is used only by OF-based
devices, hence I think we could simply remove the IRQF flag from the
code and fix the device-trees. Does it sound good to you?
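The override described in this message can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions, not kernel code: `effective_trigger()` is a hypothetical helper, and the flag values are copied here as assumptions mirroring the kernel's `include/linux/interrupt.h`. It only illustrates the rule that a trigger type passed by the driver wins over the one configured from the device-tree, and that passing no trigger flag leaves the device-tree value in effect.

```c
/* Flag values assumed to mirror include/linux/interrupt.h. */
#define IRQF_TRIGGER_RISING	0x00000001UL
#define IRQF_TRIGGER_FALLING	0x00000002UL
#define IRQF_TRIGGER_HIGH	0x00000004UL
#define IRQF_TRIGGER_LOW	0x00000008UL
#define IRQF_TRIGGER_MASK	(IRQF_TRIGGER_HIGH | IRQF_TRIGGER_LOW | \
				 IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING)
#define IRQF_ONESHOT		0x00002000UL

/*
 * Hypothetical helper: pick the trigger type that ends up in effect.
 * request_flags is what the driver passes when requesting the IRQ,
 * dt_trigger is the type already configured from the device-tree.
 */
static unsigned long effective_trigger(unsigned long request_flags,
					unsigned long dt_trigger)
{
	/* A trigger type given by the driver overrides the configured one. */
	if (request_flags & IRQF_TRIGGER_MASK)
		return request_flags & IRQF_TRIGGER_MASK;

	/* Otherwise the device-tree setting is kept. */
	return dt_trigger & IRQF_TRIGGER_MASK;
}
```

With the current driver flags (`IRQF_TRIGGER_LOW | IRQF_ONESHOT`) the model returns the level-low trigger even when the device-tree asks for a falling edge; with `IRQF_ONESHOT` alone it returns whatever the device-tree specifies, which is the behaviour the proposal relies on.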
On Thu, Jun 17, 2021 at 04:48:08PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 16:12, Guenter Roeck wrote:
> > On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> >> 17.06.2021 03:12, Guenter Roeck wrote:
> >>> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >>>> The LM90 driver uses level-based interrupt triggering. The interrupt
> >>>> handler prints a warning message about the breached temperature and
> >>>> quits. There is no way to stop interrupt from re-triggering since it's
> >>>> level-based, thus thousands of warning messages are printed per second
> >>>> once interrupt is triggered. Use edge-triggered interrupt in order to
> >>>> fix this trouble.
> >>>>
> >>>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >>>> ---
> >>>>  drivers/hwmon/lm90.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >>>> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >>>> --- a/drivers/hwmon/lm90.c
> >>>> +++ b/drivers/hwmon/lm90.c
> >>>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>>>  		err = devm_request_threaded_irq(dev, client->irq,
> >>>>  						NULL, lm90_irq_thread,
> >>>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >>>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>>>  						"lm90", client);
> >>>
> >>> We can't do that. Problem is that many of the devices supported by this driver
> >>> behave differently when it comes to interrupts. Specifically, the interrupt
> >>> handler is supposed to reset the interrupt condition (ie reading the status
> >>> register should reset it). If that is not the case for a specific chip,
> >>> we'll have to update the code to address the problem for that specific chip.
> >>> The above code would probably just generate a single interrupt while never
> >>> resetting the interrupt condition, which is obviously not what we want to
> >>> happen.
> >>
> >> The nct1008/72 datasheet [1] says that reading the status register
> >> doesn't reset interrupt until temperature is returned back into normal
> >> state, which is what I'm witnessing.
> >>
> >> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
> >>
> >> Page 10 "Status Register":
> >>
> >> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> >> provided the error conditions causing the flags to be set have gone
> >> away. A flag bit can be reset only if the corresponding
> >> value register contains an in-limit measurement or if the
> >> sensor is good."
> >>
> >> So the interrupt handler doesn't actually stop interrupt from
> >> reoccurring and the whole KMSG is instantly spammed with:
> >>
> >> ...
> >> [ 217.484034] lm90 0-004c: temp2 out of range, please check!
> >> [ 217.484569] lm90 0-004c: temp2 out of range, please check!
> >> [ 217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some
> >> messages lost.
> >> [ 217.485109] lm90 0-004c: temp2 out of range, please check!
> >> [ 217.485699] lm90 0-004c: temp2 out of range, please check!
> >> [ 217.486235] lm90 0-004c: temp2 out of range, please check!
> >> [ 217.486776] lm90 0-004c: temp2 out of range, please check!
> >> [ 217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
> >>
> >> It's interesting that the very first version of the nct1008-support
> >> patch used edge-triggered interrupt flags [2].
> >>
> >> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
> >>
> > A lot of this depends on the chip and its wiring, as well as on chip
> > configuration. Even for a specific chip there may be configuration
> > dependencies. The interrupt configuration in situations like this
> > should really be determined by devicetree configuration, and not
> > be hardcoded. Is this a devicetree based system? If so, there should
> > be an entry for this chip pointing to the interrupt, and that entry
> > should include a trigger mask. That mask should be set to edge
> > triggered.
>
> This is a device-tree based system, in particular it's NVIDIA Tegra30
> Nexus 7. The interrupt support was originally added to the lm90 driver
> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
> device-trees are specifying the trigger mask and apparently they all are
> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it

Be fair, no one is perfect.

> should be IRQ_TYPE_EDGE_FALLING.

It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,
and the interrupt handler should call hwmon_notify_event() instead of
clogging the kernel log, but that should be done in a separate patch.

Anyway, the tegra30 dts files in the upstream kernel either use
IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
in the upstream kernel has no interrupt configured (and coincidentally
it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH?

>
> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
> specified in a device-tree. IIUC, the interrupt is used only by OF-based
> devices, hence I think we could simply remove the IRQF flag from the
> code and fix the device-trees. Does it sound good to you?

Yes, that is a better approach.

Thanks,
Guenter
17.06.2021 17:13, Guenter Roeck wrote:
...
>> This is a device-tree based system, in particular it's NVIDIA Tegra30
>> Nexus 7. The interrupt support was originally added to the lm90 driver
>> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
>> device-trees are specifying the trigger mask and apparently they all are
>> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
>
> Be fair, no one is perfect.

This is a very minor problem, so no wonder that nobody noticed or
bothered to fix it yet. I'm just clarifying the status here.

>> should be IRQ_TYPE_EDGE_FALLING.
>
> It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,

For now I see that the rising edge isn't needed, the TEMP_ALERT goes
HIGH by itself when temperature goes back to normal. But I will try to
double check.

> and the interrupt handler should call hwmon_notify_event() instead of
> clogging the kernel log, but that should be done in a separate patch.

Thank you for the suggestion, I will take a look.

> Anyway, the tegra30 dts files in the upstream kernel either use
> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> in the upstream kernel has no interrupt configured (and coincidentally
> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH?

I have a patch that will add the interrupt property, it's stashed
locally for the next kernel release.

IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
use IRQ_TYPE_LEVEL_LOW are in the same position.

>> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
>> specified in a device-tree. IIUC, the interrupt is used only by OF-based
>> devices, hence I think we could simply remove the IRQF flag from the
>> code and fix the device-trees. Does it sound good to you?
>
> Yes, that is a better approach.

Thank you for reviewing this patch. I'll prepare v2.
On Thu, Jun 17, 2021 at 05:46:33PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 17:13, Guenter Roeck wrote:
> ...
> >> This is a device-tree based system, in particular it's NVIDIA Tegra30
> >> Nexus 7. The interrupt support was originally added to the lm90 driver
> >> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
> >> device-trees are specifying the trigger mask and apparently they all are
> >> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
> >
> > Be fair, no one is perfect.
>
> This is a very minor problem, so no wonder that nobody noticed or
> bothered to fix it yet. I'm just clarifying the status here.
>
> >> should be IRQ_TYPE_EDGE_FALLING.
> >
> > It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,
>
> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
> HIGH by itself when temperature goes back to normal. But I will try to
> double check.
>

The point is that a sysfs event should be sent to userspace on both
edges, not only when an alarm is raised. But, you are correct,
IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
are not generated.

> > and the interrupt handler should call hwmon_notify_event() instead of
> > clogging the kernel log, but that should be done in a separate patch.
>
> Thank you for the suggestion, I will take a look.
>
> > Anyway, the tegra30 dts files in the upstream kernel either use
> > IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> > in the upstream kernel has no interrupt configured (and coincidentally
> > it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH?
>
> I have a patch that will add the interrupt property, it's stashed
> locally for the next kernel release.
>
> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
> use IRQ_TYPE_LEVEL_LOW are in the same position.

I still don't see an IRQ_TYPE_LEVEL_HIGH, though.

Thanks,
Guenter

>
> >> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
> >> specified in a device-tree. IIUC, the interrupt is used only by OF-based
> >> devices, hence I think we could simply remove the IRQF flag from the
> >> code and fix the device-trees. Does it sound good to you?
> >
> > Yes, that is a better approach.
>
> Thank you for reviewing this patch. I'll prepare v2.
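The idea of notifying userspace on both edges, rather than logging on every interrupt, can be illustrated with a small userspace C sketch. Everything here is hypothetical: `struct alarm_tracker` and `alarm_update()` are made-up names, and the event counter merely stands in for the place where a real driver would call `hwmon_notify_event()`. The sketch only shows the state-change logic, assuming the handler can read the current alarm state each time it runs.

```c
#include <stdbool.h>

/* Hypothetical per-channel state kept between handler invocations. */
struct alarm_tracker {
	bool alarm;		/* last alarm state reported to userspace */
	unsigned int events;	/* number of notifications sent so far */
};

/*
 * Report a notification only when the alarm state changes, so that both
 * the alarm being raised and the alarm being cleared produce exactly one
 * event each. Returns true if an event was generated.
 */
static bool alarm_update(struct alarm_tracker *t, bool alarm_now)
{
	if (alarm_now == t->alarm)
		return false;	/* no change: stay silent, no log spam */

	t->alarm = alarm_now;
	t->events++;		/* stand-in for the real notification call */
	return true;
}
```

Feeding this tracker a raised alarm twice and then a cleared alarm yields two events in total, one per edge, regardless of how many times the handler ran in between.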
17.06.2021 18:12, Guenter Roeck wrote:
>> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
>> HIGH by itself when temperature goes back to normal. But I will try to
>> double check.
>>
> The point is that a sysfs event should be sent to userspace on both
> edges, not only when an alarm is raised. But, you are correct,
> IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
> are not generated.

Ok, thank you for the clarification.

>>> Anyway, the tegra30 dts files in the upstream kernel either use
>>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
>>> in the upstream kernel has no interrupt configured (and coincidentally
>>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH?
>> I have a patch that will add the interrupt property, it's stashed
>> locally for the next kernel release.
>>
>> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
>> use IRQ_TYPE_LEVEL_LOW are in the same position.
> I still don't see an IRQ_TYPE_LEVEL_HIGH, though.

Could you please clarify why you're looking for HIGH and not for LOW?
The TEMP_ALERT is active-low.
On Thu, Jun 17, 2021 at 06:27:50PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 18:12, Guenter Roeck wrote:
> >> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
> >> HIGH by itself when temperature goes back to normal. But I will try to
> >> double check.
> >>
> > The point is that a sysfs event should be sent to userspace on both
> > edges, not only when an alarm is raised. But, you are correct,
> > IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
> > are not generated.
>
> Ok, thank you for the clarification.
>
> >>> Anyway, the tegra30 dts files in the upstream kernel either use
> >>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> >>> in the upstream kernel has no interrupt configured (and coincidentally
> >>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH?
> >> I have a patch that will add the interrupt property, it's stashed
> >> locally for the next kernel release.
> >>
> >> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
> >> use IRQ_TYPE_LEVEL_LOW are in the same position.
> > I still don't see an IRQ_TYPE_LEVEL_HIGH, though.
>
> Could you please clarify why you're looking for HIGH and not for LOW?
> The TEMP_ALERT is active-low.

Because you stated earlier:

"... cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH ..."
                                             ^^^^^^^^^^^^^^^^^^^

Guenter
18.06.2021 00:42, Guenter Roeck wrote:
> On Thu, Jun 17, 2021 at 06:27:50PM +0300, Dmitry Osipenko wrote:
>> 17.06.2021 18:12, Guenter Roeck wrote:
>>>> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
>>>> HIGH by itself when temperature goes back to normal. But I will try to
>>>> double check.
>>>>
>>> The point is that a sysfs event should be sent to userspace on both
>>> edges, not only when an alarm is raised. But, you are correct,
>>> IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
>>> are not generated.
>>
>> Ok, thank you for the clarification.
>>
>>>>> Anyway, the tegra30 dts files in the upstream kernel either use
>>>>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
>>>>> in the upstream kernel has no interrupt configured (and coincidentally
>>>>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH?
>>>> I have a patch that will add the interrupt property, it's stashed
>>>> locally for the next kernel release.
>>>>
>>>> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
>>>> use IRQ_TYPE_LEVEL_LOW are in the same position.
>>> I still don't see an IRQ_TYPE_LEVEL_HIGH, though.
>>
>> Could you please clarify why you're looking for HIGH and not for LOW?
>> The TEMP_ALERT is active-low.
>
> Because you stated earlier:
>
> "... cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH ..."
>                                              ^^^^^^^^^^^^^^^^^^^

That was a typo, my bad.
diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ebbfd5f352c0..ce8ebe60fcdc 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
 		dev_dbg(dev, "IRQ: %d\n", client->irq);
 		err = devm_request_threaded_irq(dev, client->irq,
 						NULL, lm90_irq_thread,
-						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
 						"lm90", client);
 		if (err < 0) {
 			dev_err(dev, "cannot request IRQ %d\n", client->irq);
The LM90 driver uses level-based interrupt triggering. The interrupt
handler prints a warning message about the breached temperature and
quits. There is no way to stop interrupt from re-triggering since it's
level-based, thus thousands of warning messages are printed per second
once interrupt is triggered. Use edge-triggered interrupt in order to
fix this trouble.

Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/hwmon/lm90.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)