[v1] hwmon: (lm90) Use edge-triggered interrupt

Message ID 20210616190708.1220-1-digetx@gmail.com (mailing list archive)
State Changes Requested

Commit Message

Dmitry Osipenko June 16, 2021, 7:07 p.m. UTC
The LM90 driver uses level-based interrupt triggering. The interrupt
handler prints a warning message about the breached temperature and
quits. There is no way to stop interrupt from re-triggering since it's
level-based, thus thousands of warning messages are printed per second
once interrupt is triggered. Use edge-triggered interrupt in order to
fix this trouble.

Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/hwmon/lm90.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Guenter Roeck June 17, 2021, 12:12 a.m. UTC | #1
On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> The LM90 driver uses level-based interrupt triggering. The interrupt
> handler prints a warning message about the breached temperature and
> quits. There is no way to stop interrupt from re-triggering since it's
> level-based, thus thousands of warning messages are printed per second
> once interrupt is triggered. Use edge-triggered interrupt in order to
> fix this trouble.
> 
> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/hwmon/lm90.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> index ebbfd5f352c0..ce8ebe60fcdc 100644
> --- a/drivers/hwmon/lm90.c
> +++ b/drivers/hwmon/lm90.c
> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>  		err = devm_request_threaded_irq(dev, client->irq,
>  						NULL, lm90_irq_thread,
> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>  						"lm90", client);

We can't do that. Problem is that many of the devices supported by this driver
behave differently when it comes to interrupts. Specifically, the interrupt
handler is supposed to reset the interrupt condition (i.e. reading the status
register should reset it). If that is not the case for a specific chip,
we'll have to update the code to address the problem for that specific chip.
The above code would probably just generate a single interrupt while never
resetting the interrupt condition, which is obviously not what we want to
happen.

Guenter

>  		if (err < 0) {
>  			dev_err(dev, "cannot request IRQ %d\n", client->irq);
> -- 
> 2.30.2
>
Dmitry Osipenko June 17, 2021, 7:11 a.m. UTC | #2
17.06.2021 03:12, Guenter Roeck wrote:
> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>> The LM90 driver uses level-based interrupt triggering. The interrupt
>> handler prints a warning message about the breached temperature and
>> quits. There is no way to stop interrupt from re-triggering since it's
>> level-based, thus thousands of warning messages are printed per second
>> once interrupt is triggered. Use edge-triggered interrupt in order to
>> fix this trouble.
>>
>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/hwmon/lm90.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>> --- a/drivers/hwmon/lm90.c
>> +++ b/drivers/hwmon/lm90.c
>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>  		err = devm_request_threaded_irq(dev, client->irq,
>>  						NULL, lm90_irq_thread,
>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>  						"lm90", client);
> 
> We can't do that. Problem is that many of the devices supported by this driver
> behave differently when it comes to interrupts. Specifically, the interrupt
> handler is supposed to reset the interrupt condition (i.e. reading the status
> register should reset it). If that is not the case for a specific chip,
> we'll have to update the code to address the problem for that specific chip.
> The above code would probably just generate a single interrupt while never
> resetting the interrupt condition, which is obviously not what we want to
> happen.

The nct1008/72 datasheet [1] says that reading the status register
doesn't reset the interrupt until the temperature returns to the normal
range, which is what I'm witnessing.

[1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf

Page 10 "Status Register":

"Reading the status register clears the five flags, Bit 6 to Bit 2,
provided the error conditions causing the flags to be set have gone
away. A flag bit can be reset only if the corresponding value register
contains an in-limit measurement or if the sensor is good."

So the interrupt handler doesn't actually stop the interrupt from
recurring, and the whole KMSG is instantly spammed with:

...
[  217.484034] lm90 0-004c: temp2 out of range, please check!
[  217.484569] lm90 0-004c: temp2 out of range, please check!
[  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
[  217.485109] lm90 0-004c: temp2 out of range, please check!
[  217.485699] lm90 0-004c: temp2 out of range, please check!
[  217.486235] lm90 0-004c: temp2 out of range, please check!
[  217.486776] lm90 0-004c: temp2 out of range, please check!
[  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
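
The difference between the two trigger modes can be illustrated with a
small userspace model. This is only a sketch under stated assumptions,
not driver code: struct sensor_model, count_handler_runs() and the
self_clearing switch are hypothetical names, and the "polls" stand in
for the chances the interrupt controller has to deliver the interrupt
between two temperature measurements. The status-read rule follows the
datasheet text quoted above.

```c
#include <stdbool.h>

/*
 * Hypothetical model of a temperature sensor with a latched alert flag.
 * All names here are illustrative; none of them come from lm90.c.
 */
struct sensor_model {
	int temp;		/* last measured temperature */
	int limit;		/* high limit */
	bool alert_flag;	/* latched flag that drives the ALERT line */
	bool self_clearing;	/* assumption: flag drops on an in-limit measurement */
};

/* New measurement: an out-of-limit value latches the flag. */
static void sensor_measure(struct sensor_model *s, int temp)
{
	s->temp = temp;
	if (temp > s->limit)
		s->alert_flag = true;
	else if (s->self_clearing)
		s->alert_flag = false;
}

/* Status read: clears the flag only if the error condition has gone away. */
static void sensor_read_status(struct sensor_model *s)
{
	if (s->temp <= s->limit)
		s->alert_flag = false;
}

/*
 * Count how often the handler runs. Between two measurements the interrupt
 * controller gets polls_per_sample chances to deliver the interrupt.
 */
unsigned int count_handler_runs(const int *temps, unsigned int n, int limit,
				bool edge_triggered, bool self_clearing,
				unsigned int polls_per_sample)
{
	struct sensor_model s = {
		.temp = 0, .limit = limit,
		.alert_flag = false, .self_clearing = self_clearing,
	};
	bool prev_line = false;
	unsigned int runs = 0;
	unsigned int i, p;

	for (i = 0; i < n; i++) {
		sensor_measure(&s, temps[i]);
		for (p = 0; p < polls_per_sample; p++) {
			bool line = s.alert_flag;
			/* edge: fire on inactive->active only; level: fire while active */
			bool fire = edge_triggered ? (line && !prev_line) : line;

			prev_line = line;
			if (fire) {
				sensor_read_status(&s);	/* what the handler does */
				runs++;
			}
		}
	}
	return runs;
}
```

With temperatures {50, 90, 90, 50, 90}, a limit of 80 and 3 polls per
sample, the level-triggered model runs the handler 10 times, the
edge-triggered model with a non-self-clearing flag runs it once and then
never again, and the edge-triggered model with a self-clearing flag runs
it twice (once per excursion). Which of the last two cases applies to a
given chip is exactly the chip-dependent part.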

It's interesting that the very first version of the nct1008-support
patch used edge-triggered interrupt flags [2].

[2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html

Limiting the interrupt rate could be an alternative solution.

What do you think about something like this:

diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ce8ebe60fcdc..74886b8066ab 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -79,6 +79,7 @@
  * concern all supported chipsets, unless mentioned otherwise.
  */

+#include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/slab.h>
@@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657, max6659, adt7461, max6680,
 #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit tripped */
 #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */

+/* Prevent instant interrupt re-triggering */
+#define LM90_IRQ_DELAY		(15 * MSEC_PER_SEC)
+
 /*
  * Driver data (common to all clients)
  */
@@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void *dev_id)
 	struct i2c_client *client = dev_id;
 	u16 status;

-	if (lm90_is_tripped(client, &status))
-		return IRQ_HANDLED;
-	else
+	if (!lm90_is_tripped(client, &status))
 		return IRQ_NONE;
+
+	msleep(LM90_IRQ_DELAY);
+
+	return IRQ_HANDLED;
 }

 static void lm90_remove_pec(void *dev)
Guenter Roeck June 17, 2021, 1:12 p.m. UTC | #3
On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> 17.06.2021 03:12, Guenter Roeck wrote:
> > On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >> The LM90 driver uses level-based interrupt triggering. The interrupt
> >> handler prints a warning message about the breached temperature and
> >> quits. There is no way to stop interrupt from re-triggering since it's
> >> level-based, thus thousands of warning messages are printed per second
> >> once interrupt is triggered. Use edge-triggered interrupt in order to
> >> fix this trouble.
> >>
> >> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/hwmon/lm90.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >> --- a/drivers/hwmon/lm90.c
> >> +++ b/drivers/hwmon/lm90.c
> >> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>  		err = devm_request_threaded_irq(dev, client->irq,
> >>  						NULL, lm90_irq_thread,
> >> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>  						"lm90", client);
> > 
> > We can't do that. Problem is that many of the devices supported by this driver
> > behave differently when it comes to interrupts. Specifically, the interrupt
> > handler is supposed to reset the interrupt condition (i.e. reading the status
> > register should reset it). If that is not the case for a specific chip,
> > we'll have to update the code to address the problem for that specific chip.
> > The above code would probably just generate a single interrupt while never
> > resetting the interrupt condition, which is obviously not what we want to
> > happen.
> 
> The nct1008/72 datasheet [1] says that reading the status register
> doesn't reset interrupt until temperature is returned back into normal
> state, which is what I'm witnessing.
> 
> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
> 
> Page 10 "Status Register":
> 
> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> provided the error conditions causing the flags to be set have gone
> away. A flag bit can be reset only if the corresponding value register
> contains an in-limit measurement or if the sensor is good."
> 
> So the interrupt handler doesn't actually stop interrupt from
> reoccurring and the whole KMSG is instantly spammed with:
> 
> ...
> [  217.484034] lm90 0-004c: temp2 out of range, please check!
> [  217.484569] lm90 0-004c: temp2 out of range, please check!
> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
> [  217.485109] lm90 0-004c: temp2 out of range, please check!
> [  217.485699] lm90 0-004c: temp2 out of range, please check!
> [  217.486235] lm90 0-004c: temp2 out of range, please check!
> [  217.486776] lm90 0-004c: temp2 out of range, please check!
> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
> 
> It's interesting that the very first version of the nct1008-support
> patch used edge-triggered interrupt flags [2].
> 
> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
> 
A lot of this depends on the chip and its wiring, as well as on chip
configuration. Even for a specific chip there may be configuration
dependencies. The interrupt configuration in situations like this
should really be determined by devicetree configuration, and not
be hardcoded. Is this a devicetree based system ? If so, there should
be an entry for this chip pointing to the interrupt, and that entry
should include a trigger mask. That mask should be set to edge
triggered.

> Limiting the interrupt rate could be an alternative solution.
> 
> What do you think about something like this:
> 
A sleep in an interrupt handler to "prevent" an interrupt storm
is never acceptable.

Guenter

Dmitry Osipenko June 17, 2021, 1:48 p.m. UTC | #4
17.06.2021 16:12, Guenter Roeck wrote:
> On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
>> 17.06.2021 03:12, Guenter Roeck wrote:
>>> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>>>> The LM90 driver uses level-based interrupt triggering. The interrupt
>>>> handler prints a warning message about the breached temperature and
>>>> quits. There is no way to stop interrupt from re-triggering since it's
>>>> level-based, thus thousands of warning messages are printed per second
>>>> once interrupt is triggered. Use edge-triggered interrupt in order to
>>>> fix this trouble.
>>>>
>>>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/hwmon/lm90.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>>>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>>>> --- a/drivers/hwmon/lm90.c
>>>> +++ b/drivers/hwmon/lm90.c
>>>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>>>  		err = devm_request_threaded_irq(dev, client->irq,
>>>>  						NULL, lm90_irq_thread,
>>>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>>>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>>>  						"lm90", client);
>>>
>>> We can't do that. Problem is that many of the devices supported by this driver
>>> behave differently when it comes to interrupts. Specifically, the interrupt
>>> handler is supposed to reset the interrupt condition (i.e. reading the status
>>> register should reset it). If that is not the case for a specific chip,
>>> we'll have to update the code to address the problem for that specific chip.
>>> The above code would probably just generate a single interrupt while never
>>> resetting the interrupt condition, which is obviously not what we want to
>>> happen.
>>
>> The nct1008/72 datasheet [1] says that reading the status register
>> doesn't reset interrupt until temperature is returned back into normal
>> state, which is what I'm witnessing.
>>
>> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
>>
>> Page 10 "Status Register":
>>
>> "Reading the status register clears the five flags, Bit 6 to Bit 2,
>> provided the error conditions causing the flags to be set have gone
>> away. A flag bit can be reset only if the corresponding value register
>> contains an in-limit measurement or if the sensor is good."
>>
>> So the interrupt handler doesn't actually stop interrupt from
>> reoccurring and the whole KMSG is instantly spammed with:
>>
>> ...
>> [  217.484034] lm90 0-004c: temp2 out of range, please check!
>> [  217.484569] lm90 0-004c: temp2 out of range, please check!
>> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
>> [  217.485109] lm90 0-004c: temp2 out of range, please check!
>> [  217.485699] lm90 0-004c: temp2 out of range, please check!
>> [  217.486235] lm90 0-004c: temp2 out of range, please check!
>> [  217.486776] lm90 0-004c: temp2 out of range, please check!
>> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
>>
>> It's interesting that the very first version of the nct1008-support
>> patch used edge-triggered interrupt flags [2].
>>
>> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
>>
> A lot of this depends on the chip and its wiring, as well as on chip
> configuration. Even for a specific chip there may be configuration
> dependencies. The interrupt configuration in situations like this
> should really be determined by devicetree configuration, and not
> be hardcoded. Is this a devicetree based system ? If so, there should
> be an entry for this chip pointing to the interrupt, and that entry
> should include a trigger mask. That mask should be set to edge
> triggered.

This is a device-tree based system, in particular it's NVIDIA Tegra30
Nexus 7. The interrupt support was originally added to the lm90 driver
by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
device-trees are specifying the trigger mask and apparently they all are
cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
should be IRQ_TYPE_EDGE_FALLING.

The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
specified in a device-tree. IIUC, the interrupt is used only by OF-based
devices, hence I think we could simply remove the IRQF flag from the
code and fix the device-trees. Does it sound good to you?
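
The override described above can be pictured with a small userspace
model. This is a sketch, not kernel code: effective_trigger() is a
hypothetical helper, and the flag values are redefined locally on the
assumption that they mirror include/linux/interrupt.h. The rule it
encodes is the one stated in this thread: trigger bits passed to the
request call win, and the devicetree setting is used only when the
driver passes none.

```c
/*
 * Userspace model only. The constants are assumed to mirror
 * include/linux/interrupt.h and are redefined so this builds standalone.
 */
#define IRQF_TRIGGER_NONE	0x00000000UL
#define IRQF_TRIGGER_RISING	0x00000001UL
#define IRQF_TRIGGER_FALLING	0x00000002UL
#define IRQF_TRIGGER_HIGH	0x00000004UL
#define IRQF_TRIGGER_LOW	0x00000008UL
#define IRQF_TRIGGER_MASK	(IRQF_TRIGGER_HIGH | IRQF_TRIGGER_LOW | \
				 IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING)
#define IRQF_ONESHOT		0x00002000UL

/*
 * Hypothetical helper: which trigger type ends up in effect, given the
 * flags passed by the driver and the type configured via devicetree.
 */
unsigned long effective_trigger(unsigned long request_flags,
				unsigned long dt_trigger)
{
	unsigned long requested = request_flags & IRQF_TRIGGER_MASK;

	/* Driver-supplied trigger bits take precedence over devicetree. */
	if (requested != IRQF_TRIGGER_NONE)
		return requested;

	/* No trigger bits from the driver: the devicetree setting applies. */
	return dt_trigger & IRQF_TRIGGER_MASK;
}
```

In this model, requesting with IRQF_TRIGGER_LOW | IRQF_ONESHOT yields a
level-low trigger no matter what the devicetree says, while requesting
with IRQF_ONESHOT alone lets a devicetree entry of the falling-edge type
take effect.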
Guenter Roeck June 17, 2021, 2:13 p.m. UTC | #5
On Thu, Jun 17, 2021 at 04:48:08PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 16:12, Guenter Roeck wrote:
> > On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> >> 17.06.2021 03:12, Guenter Roeck wrote:
> >>> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >>>> The LM90 driver uses level-based interrupt triggering. The interrupt
> >>>> handler prints a warning message about the breached temperature and
> >>>> quits. There is no way to stop interrupt from re-triggering since it's
> >>>> level-based, thus thousands of warning messages are printed per second
> >>>> once interrupt is triggered. Use edge-triggered interrupt in order to
> >>>> fix this trouble.
> >>>>
> >>>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >>>> ---
> >>>>  drivers/hwmon/lm90.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >>>> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >>>> --- a/drivers/hwmon/lm90.c
> >>>> +++ b/drivers/hwmon/lm90.c
> >>>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>>>  		err = devm_request_threaded_irq(dev, client->irq,
> >>>>  						NULL, lm90_irq_thread,
> >>>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >>>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>>>  						"lm90", client);
> >>>
> >>> We can't do that. Problem is that many of the devices supported by this driver
> >>> behave differently when it comes to interrupts. Specifically, the interrupt
> >>> handler is supposed to reset the interrupt condition (i.e. reading the status
> >>> register should reset it). If that is not the case for a specific chip,
> >>> we'll have to update the code to address the problem for that specific chip.
> >>> The above code would probably just generate a single interrupt while never
> >>> resetting the interrupt condition, which is obviously not what we want to
> >>> happen.
> >>
> >> The nct1008/72 datasheet [1] says that reading the status register
> >> doesn't reset interrupt until temperature is returned back into normal
> >> state, which is what I'm witnessing.
> >>
> >> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
> >>
> >> Page 10 "Status Register":
> >>
> >> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> >> provided the error conditions causing the flags to be set have gone
> >> away. A flag bit can be reset only if the corresponding value register
> >> contains an in-limit measurement or if the sensor is good."
> >>
> >> So the interrupt handler doesn't actually stop interrupt from
> >> reoccurring and the whole KMSG is instantly spammed with:
> >>
> >> ...
> >> [  217.484034] lm90 0-004c: temp2 out of range, please check!
> >> [  217.484569] lm90 0-004c: temp2 out of range, please check!
> >> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
> >> [  217.485109] lm90 0-004c: temp2 out of range, please check!
> >> [  217.485699] lm90 0-004c: temp2 out of range, please check!
> >> [  217.486235] lm90 0-004c: temp2 out of range, please check!
> >> [  217.486776] lm90 0-004c: temp2 out of range, please check!
> >> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
> >>
> >> It's interesting that the very first version of the nct1008-support
> >> patch used edge-triggered interrupt flags [2].
> >>
> >> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
> >>
> > A lot of this depends on the chip and its wiring, as well as on chip
> > configuration. Even for a specific chip there may be configuration
> > dependencies. The interrupt configuration in situations like this
> > should really be determined by devicetree configuration, and not
> > be hardcoded. Is this a devicetree based system ? If so, there should
> > be an entry for this chip pointing to the interrupt, and that entry
> > should include a trigger mask. That mask should be set to edge
> > triggered.
> 
> This is a device-tree based system, in particular it's NVIDIA Tegra30
> Nexus 7. The interrupt support was originally added to the lm90 driver
> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
> device-trees are specifying the trigger mask and apparently they all are
> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it

Be fair, no one is perfect.

> should be IRQ_TYPE_EDGE_FALLING.

It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,
and the interrupt handler should call hwmon_notify_event() instead of
clogging the kernel log, but that should be done in a separate patch.

Anyway, the tegra30 dts files in the upstream kernel either use
IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
in the upstream kernel has no interrupt configured (and coincidentally
it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?

> 
> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
> specified in a device-tree. IIUC, the interrupt is used only by OF-based
> devices, hence I think we could simply remove the IRQF flag from the
> code and fix the device-trees. Does it sound good to you?

Yes, that is a better approach.

Thanks,
Guenter
Dmitry Osipenko June 17, 2021, 2:46 p.m. UTC | #6
17.06.2021 17:13, Guenter Roeck wrote:
...
>> This is a device-tree based system, in particular it's NVIDIA Tegra30
>> Nexus 7. The interrupt support was originally added to the lm90 driver
>> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
>> device-trees are specifying the trigger mask and apparently they all are
>> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
> 
> Be fair, no one is perfect.

This is a very minor problem, so no wonder that nobody noticed or
bothered to fix it yet. I'm just clarifying the status here.

>> should be IRQ_TYPE_EDGE_FALLING.
> 
> It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,

For now I see that the rising edge isn't needed: TEMP_ALERT goes
HIGH by itself when the temperature returns to normal. But I will try to
double check.

> and the interrupt handler should call hwmon_notify_event() instead of
> clogging the kernel log, but that should be done in a separate patch.

Thank you for suggestion, I will take a look.

> Anyway, the tegra30 dts files in the upstream kernel either use
> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> in the upstream kernel has no interrupt configured (and coincidentally
> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?

I have a patch that will add the interrupt property, it's stashed
locally for the next kernel release.

IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
use IRQ_TYPE_LEVEL_LOW are in the same position.

>> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
>> specified in a device-tree. IIUC, the interrupt is used only by OF-based
>> devices, hence I think we could simply remove the IRQF flag from the
>> code and fix the device-trees. Does it sound good to you?
> 
> Yes, that is a better approach.

Thank you for reviewing this patch. I'll prepare v2.
Guenter Roeck June 17, 2021, 3:12 p.m. UTC | #7
On Thu, Jun 17, 2021 at 05:46:33PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 17:13, Guenter Roeck wrote:
> ...
> >> This is a device-tree based system, in particular it's NVIDIA Tegra30
> >> Nexus 7. The interrupt support was originally added to the lm90 driver
> >> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
> >> device-trees are specifying the trigger mask and apparently they all are
> >> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
> > 
> > Be fair, no one is perfect.
> 
> This is a very minor problem, so no wonder that nobody noticed or
> bothered to fix it yet. I'm just clarifying the status here.
> 
> >> should be IRQ_TYPE_EDGE_FALLING.
> > 
> > It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,
> 
> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
> HIGH by itself when temperature backs to normal. But I will try to
> double check.
> 
The point is that a sysfs event should be sent to userspace on both
edges, not only when an alarm is raised. But, you are correct,
IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
are not generated.
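
The "both edges" idea can be sketched in userspace as a tracker that
reports only when the alarm state changes. This is an illustration
under stated assumptions, not the driver's code: struct alarm_tracker
and alarm_update() are hypothetical names, and the place where a real
driver would send its notification is only marked by a comment.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one alarm; not taken from lm90.c. */
struct alarm_tracker {
	bool last_alarm;	/* alarm state seen on the previous update */
	unsigned int raised;	/* number of inactive->active transitions */
	unsigned int cleared;	/* number of active->inactive transitions */
};

/*
 * Feed in the current alarm state. Returns true when the state changed,
 * i.e. when userspace should be told (on both edges, not only on raise).
 */
bool alarm_update(struct alarm_tracker *t, bool alarm_now)
{
	if (alarm_now == t->last_alarm)
		return false;	/* no transition, nothing to report */

	t->last_alarm = alarm_now;
	if (alarm_now)
		t->raised++;
	else
		t->cleared++;

	/* A real driver would send its notification at this point. */
	return true;
}
```

Feeding the sequence inactive, active, active, active, inactive, active
into a zero-initialized tracker produces three reports: two for the
alarm being raised and one for it being cleared. Repeated samples in the
same state produce none, which is what keeps the log quiet.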

> > and the interrupt handler should call hwmon_notify_event() instead of
> > clogging the kernel log, but that should be done in a separate patch.
> 
> Thank you for suggestion, I will take a look.
> 
> > Anyway, the tegra30 dts files in the upstream kernel either use
> > IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> > in the upstream kernel has no interrupt configured (and coincidentally
> > it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
> 
> I have a patch that will add the interrupt property, it's stashed
> locally for the next kernel release.
> 
> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
> use IRQ_TYPE_LEVEL_LOW are in the same position.

I still don't see an IRQ_TYPE_LEVEL_HIGH, though.

Thanks,
Guenter

> 
> >> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
> >> specified in a device-tree. IIUC, the interrupt is used only by OF-based
> >> devices, hence I think we could simply remove the IRQF flag from the
> >> code and fix the device-trees. Does it sound good to you?
> > 
> > Yes, that is a better approach.
> 
> Thank you for reviewing this patch. I'll prepare v2.
Dmitry Osipenko June 17, 2021, 3:27 p.m. UTC | #8
17.06.2021 18:12, Guenter Roeck wrote:
>> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
>> HIGH by itself when temperature backs to normal. But I will try to
>> double check.
>>
> The point is that a sysfs event should be sent to userspace on both
> edges, not only when an alarm is raised. But, you are correct,
> IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
> are not generated.

Ok, thank you for the clarification.

>>> Anyway, the tegra30 dts files in the upstream kernel either use
>>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
>>> in the upstream kernel has no interrupt configured (and coincidentally
>>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
>> I have a patch that will add the interrupt property, it's stashed
>> locally for the next kernel release.
>>
>> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
>> use IRQ_TYPE_LEVEL_LOW are in the same position.
> I still don't see a IRQ_TYPE_LEVEL_HIGH, though.

Could you please clarify why you're looking for HIGH and not for LOW?
The TEMP_ALERT is active-low.
Guenter Roeck June 17, 2021, 9:42 p.m. UTC | #9
On Thu, Jun 17, 2021 at 06:27:50PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 18:12, Guenter Roeck wrote:
> >> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
> >> HIGH by itself when temperature backs to normal. But I will try to
> >> double check.
> >>
> > The point is that a sysfs event should be sent to userspace on both
> > edges, not only when an alarm is raised. But, you are correct,
> > IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
> > are not generated.
> 
> Ok, thank you for the clarification.
> 
> >>> Anyway, the tegra30 dts files in the upstream kernel either use
> >>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> >>> in the upstream kernel has no interrupt configured (and coincidentally
> >>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
> >> I have a patch that will add the interrupt property, it's stashed
> >> locally for the next kernel release.
> >>
> >> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
> >> use IRQ_TYPE_LEVEL_LOW are in the same position.
> > I still don't see a IRQ_TYPE_LEVEL_HIGH, though.
> 
> Could you please clarify why you're looking for HIGH and not for LOW?
> The TEMP_ALERT is active-low.

Because you stated earlier:

"... cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH ..."
                                             ^^^^^^^^^^^^^^^^^^^

Guenter
Dmitry Osipenko June 18, 2021, 8:55 a.m. UTC | #10
18.06.2021 00:42, Guenter Roeck wrote:
> On Thu, Jun 17, 2021 at 06:27:50PM +0300, Dmitry Osipenko wrote:
>> 17.06.2021 18:12, Guenter Roeck wrote:
>>>> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
>>>> HIGH by itself when temperature backs to normal. But I will try to
>>>> double check.
>>>>
>>> The point is that a sysfs event should be sent to userspace on both
>>> edges, not only when an alarm is raised. But, you are correct,
>>> IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
>>> are not generated.
>>
>> Ok, thank you for the clarification.
>>
>>>>> Anyway, the tegra30 dts files in the upstream kernel either use
>>>>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
>>>>> in the upstream kernel has no interrupt configured (and coincidentally
>>>>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
>>>> I have a patch that will add the interrupt property, it's stashed
>>>> locally for the next kernel release.
>>>>
>>>> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
>>>> use IRQ_TYPE_LEVEL_LOW are in the same position.
>>> I still don't see a IRQ_TYPE_LEVEL_HIGH, though.
>>
>> Could you please clarify why you're looking for HIGH and not for LOW?
>> The TEMP_ALERT is active-low.
> 
> Because you stated earlier:
> 
> "... cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH ..."
>                                              ^^^^^^^^^^^^^^^^^^^

That was a typo, my bad.

Patch

diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ebbfd5f352c0..ce8ebe60fcdc 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -1908,7 +1908,7 @@  static int lm90_probe(struct i2c_client *client)
 		dev_dbg(dev, "IRQ: %d\n", client->irq);
 		err = devm_request_threaded_irq(dev, client->irq,
 						NULL, lm90_irq_thread,
-						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
 						"lm90", client);
 		if (err < 0) {
 			dev_err(dev, "cannot request IRQ %d\n", client->irq);