mbox series

[v4,0/2] thermal: rcar_gen3_thermal: fix IRQ issues

Message ID 20190424051145.23072-1-jiada_wang@mentor.com (mailing list archive)
Headers show
Series thermal: rcar_gen3_thermal: fix IRQ issues | expand

Message

Wang, Jiada April 24, 2019, 5:11 a.m. UTC
There are issues with interrupt handling in rcar_gen3_thermal driver.

Currently IRQ is remain enabled after .remove, later if device is probed,
IRQ is requested before .thermal_init, this may cause IRQ function be
triggered but not able to clear IRQ status, thus cause system to hang.

Since the irq line isn't shared between different devices,
so the proper interrupt type flag should be IRQF_ONESHOT.

This patch-set fix these interrupt handling retated issues.

---
v4: remove 'spinlock_t lock'
    add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
    fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")

v3: fix to use correct code base
    remove unused "flag" variable in rcar_gen3_thermal_irq

v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
    disable interrupt in .remove

v1: initial version

Jiada Wang (2):
  thermal: rcar_gen3_thermal: fix interrupt type
  thermal: rcar_gen3_thermal: disable interrupt in .remove

 drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
 1 file changed, 9 insertions(+), 32 deletions(-)

Comments

Niklas Söderlund April 24, 2019, 6:51 a.m. UTC | #1
Hi Jiada,

I think this series looks good. Unfortunately I'm out of office for the 
next week and are thus unable to test it. Please don't let this block 
this series but if it's still on the ML when I'm back I will do a proper 
review and test of it.

On 2019-04-24 14:11:43 +0900, Jiada Wang wrote:
> There are issues with interrupt handling in rcar_gen3_thermal driver.
> 
> Currently IRQ is remain enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> triggered but not able to clear IRQ status, thus cause system to hang.
> 
> Since the irq line isn't shared between different devices,
> so the proper interrupt type flag should be IRQF_ONESHOT.
> 
> This patch-set fix these interrupt handling retated issues.
> 
> ---
> v4: remove 'spinlock_t lock'
>     add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
>     fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")
> 
> v3: fix to use correct code base
>     remove unused "flag" variable in rcar_gen3_thermal_irq
> 
> v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
>     disable interrupt in .remove
> 
> v1: initial version
> 
> Jiada Wang (2):
>   thermal: rcar_gen3_thermal: fix interrupt type
>   thermal: rcar_gen3_thermal: disable interrupt in .remove
> 
>  drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
>  1 file changed, 9 insertions(+), 32 deletions(-)
> 
> -- 
> 2.19.2
>
Eugeniu Rosca April 24, 2019, 12:09 p.m. UTC | #2
On Wed, Apr 24, 2019 at 02:11:43PM +0900, Jiada Wang wrote:
> There are issues with interrupt handling in rcar_gen3_thermal driver.
> 
> Currently IRQ is remain enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> triggered but not able to clear IRQ status, thus cause system to hang.
> 
> Since the irq line isn't shared between different devices,
> so the proper interrupt type flag should be IRQF_ONESHOT.
> 
> This patch-set fix these interrupt handling retated issues.
> 
> ---
> v4: remove 'spinlock_t lock'
>     add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
>     fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")
> 
> v3: fix to use correct code base
>     remove unused "flag" variable in rcar_gen3_thermal_irq
> 
> v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
>     disable interrupt in .remove
> 
> v1: initial version
> 
> Jiada Wang (2):
>   thermal: rcar_gen3_thermal: fix interrupt type
>   thermal: rcar_gen3_thermal: disable interrupt in .remove
> 
>  drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
>  1 file changed, 9 insertions(+), 32 deletions(-)
> 
> -- 
> 2.19.2

For the record, below is the diff between v3 and v4 [1].
It accurately reflects my review comments in
https://patchwork.kernel.org/patch/10913165/#22601305 .

In addition, I made sure there are either false positives or
no new issues reported by:
 - sparse v0.5.2-1-ga3c4716a703a
 - smatch v0.5.0-4785-g4968bcad1c08
 - cppcheck 1.88 dev
 - make W=123
 - make coccicheck

I repeated the same test steps as described in
https://patchwork.kernel.org/cover/10913163/#22602335
and the results were the same:

Reviewed-and-Tested-by: Eugeniu Rosca <erosca@de.adit-jv.com>

[1] diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
index c63a86d3dac6..280230951dfe 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -14,7 +14,6 @@
 #include <linux/of_device.h>
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
-#include <linux/spinlock.h>
 #include <linux/sys_soc.h>
 #include <linux/thermal.h>
 
@@ -82,7 +81,6 @@ struct rcar_gen3_thermal_tsc {
 struct rcar_gen3_thermal_priv {
 	struct rcar_gen3_thermal_tsc *tscs[TSC_MAX_NUM];
 	unsigned int num_tscs;
-	spinlock_t lock; /* Protect interrupts on and off */
 	void (*thermal_init)(struct rcar_gen3_thermal_tsc *tsc);
 };
 
@@ -231,8 +229,8 @@ static void rcar_thermal_irq_set(struct rcar_gen3_thermal_priv *priv, bool on)
 static irqreturn_t rcar_gen3_thermal_irq(int irq, void *data)
 {
 	struct rcar_gen3_thermal_priv *priv = data;
-	int i;
 	u32 status;
+	int i;
 
 	for (i = 0; i < priv->num_tscs; i++) {
 		status = rcar_gen3_thermal_read(priv->tscs[i], REG_GEN3_IRQSTR);
@@ -352,8 +350,6 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
 	if (soc_device_match(r8a7795es1))
 		priv->thermal_init = rcar_gen3_thermal_init_r8a7795es1;
 
-	spin_lock_init(&priv->lock);
-
 	platform_set_drvdata(pdev, priv);
 
 	/*
Niklas Söderlund May 7, 2019, 11:54 p.m. UTC | #3
Hi Jiada,

Thanks for your patches.

On 2019-04-24 14:11:43 +0900, Jiada Wang wrote:
> There are issues with interrupt handling in rcar_gen3_thermal driver.
> 
> Currently IRQ is remain enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> triggered but not able to clear IRQ status, thus cause system to hang.
> 
> Since the irq line isn't shared between different devices,
> so the proper interrupt type flag should be IRQF_ONESHOT.
> 
> This patch-set fix these interrupt handling retated issues.

I really like this series, nice work.

Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

> 
> ---
> v4: remove 'spinlock_t lock'
>     add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
>     fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")
> 
> v3: fix to use correct code base
>     remove unused "flag" variable in rcar_gen3_thermal_irq
> 
> v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
>     disable interrupt in .remove
> 
> v1: initial version
> 
> Jiada Wang (2):
>   thermal: rcar_gen3_thermal: fix interrupt type
>   thermal: rcar_gen3_thermal: disable interrupt in .remove
> 
>  drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
>  1 file changed, 9 insertions(+), 32 deletions(-)
> 
> -- 
> 2.19.2
>
Eugeniu Rosca May 10, 2019, 10:42 a.m. UTC | #4
Hi Niklas,

On Wed, May 08, 2019 at 01:54:03AM +0200, Niklas Söderlund wrote:
> Hi Jiada,
[..]
> I really like this series, nice work.
> 
> Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

Is there anything off-the-shelf available for testing the rcar3
thermal driver, to avoid reinventing the wheel via
https://patchwork.kernel.org/cover/10913163/#22602335

Thank you.
Niklas Söderlund May 10, 2019, 11:36 a.m. UTC | #5
Hi Eugeniu,

On 2019-05-10 12:42:31 +0200, Eugeniu Rosca wrote:
> Hi Niklas,
> 
> On Wed, May 08, 2019 at 01:54:03AM +0200, Niklas Söderlund wrote:
> > Hi Jiada,
> [..]
> > I really like this series, nice work.
> > 
> > Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> > Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> 
> Is there anything off-the-shelf available for testing the rcar3
> thermal driver, to avoid reinventing the wheel via
> https://patchwork.kernel.org/cover/10913163/#22602335

Not that I know of, unfortunately :-(

I have a private home hacked testing framework (don't we all?) based on 
tcl+expect where I have two basic tests for rcar_gen3_thermal. I'm 
willing to share the tests if you by chance want them, but be warned 
that they are highly specialised for my needs and I'm reluctant to 
publish my whole hack tool as it just a ugly hack ;-)

On a high level the tests I have are

1. thermal-load
    Generates load on target and observes the temperature is increased 
    using the /sys/class/thermal/thermal_zone*/temp" interface. This 
    seems similar to the test case your reference using stress-ng.

2. thermal-cooling
    Emulate the passive trip point temperatures using the 
    /sys/class/thermal/*/emul_temp interface and observe that the 
    specified cooling state is achieved.

I should add a third test to make sure IRQ fires but this is just a pet 
project for me so maybe I will get around to it sometime...

If you know of anything around to test thermal drivers or if you create 
something please let me know so I can add it to my tests. And let me 
know if you want my hacks for inspiration for your own testing.
Eugeniu Rosca May 10, 2019, 3:50 p.m. UTC | #6
Hi Niklas,

On Fri, May 10, 2019 at 01:36:08PM +0200, Niklas Söderlund wrote:
> Hi Eugeniu,
> 
> On 2019-05-10 12:42:31 +0200, Eugeniu Rosca wrote:
> > Hi Niklas,
> > 
> > On Wed, May 08, 2019 at 01:54:03AM +0200, Niklas Söderlund wrote:
> > > Hi Jiada,
> > [..]
> > > I really like this series, nice work.
> > > 
> > > Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> > > Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> > 
> > Is there anything off-the-shelf available for testing the rcar3
> > thermal driver, to avoid reinventing the wheel via
> > https://patchwork.kernel.org/cover/10913163/#22602335
> 
> Not that I know of, unfortunately :-(
> 
> I have a private home hacked testing framework (don't we all?) based on 
> tcl+expect where I have two basic tests for rcar_gen3_thermal. I'm 
> willing to share the tests if you by chance want them, but be warned 
> that they are highly specialised for my needs and I'm reluctant to 
> publish my whole hack tool as it just a ugly hack ;-)
> 
> On a high level the tests I have are
> 
> 1. thermal-load
>     Generates load on target and observes the temperature is increased 
>     using the /sys/class/thermal/thermal_zone*/temp" interface. This 
>     seems similar to the test case your reference using stress-ng.
> 
> 2. thermal-cooling
>     Emulate the passive trip point temperatures using the 
>     /sys/class/thermal/*/emul_temp interface and observe that the 
>     specified cooling state is achieved.
> 
> I should add a third test to make sure IRQ fires but this is just a pet 
> project for me so maybe I will get around to it sometime...
> 
> If you know of anything around to test thermal drivers or if you create 
> something please let me know so I can add it to my tests. And let me 
> know if you want my hacks for inspiration for your own testing.

Thanks for this summary. It would be definitely convenient to have
a set of tests covering the most important features of the driver.

I was particularly thinking of the test procedure in light of below:
 - I still can reproduce a few UBSAN (signed integer overflow) and
   KASAN (use-after-free) reports with the most recent vanilla driver.
 - There are a couple of thermal commits in rcar-3.9.x pending for
   mainline submission:

https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=fe7d0d1c77f9 ("thermal: rcar_gen3_thermal: Use FUSE values if they are available")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=2776ccd63649 ("thermal: rcar_gen3_thermal: Fix interrupt count issue")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=9146af785f41 ("thermal: rcar_gen3_thermal: Enable selection between polling/interrupt mode")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=55b262766ec2 ("thermal: rcar_gen3_thermal: PIO-INT can be selected for each TSC separately")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=d323d9de0683 ("thermal: rcar_gen3_thermal: Add support for r8a77990")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=fb8efb8bac29 ("thermal: rcar_gen3_thermal: Fix interrupts are not raised issue on E3")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=5627c42a1bd5 ("thermal: rcar_gen3_thermal: Use DIV_ROUND_CLOSEST correctly as its description")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=d4e41702e53b ("thermal: rcar_gen3_thermal: [H3/M3N] Update calculation formula due to HW evaluation")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=958bd36e03b7 ("thermal: rcar_gen3_thermal: [E3] Update calculation formula due to HW evaluation")

Long story short, I think we will review more thermal commits in
hopefully not so distant future and it would be helpful to reach some
common understanding what kind of testing the new patches should pass.

Your summary already gives some insight in that direction. Thanks.