Message ID | 20200521074946.21799-1-dinghao.liu@zju.edu.cn (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | spi: tegra20-slink: Fix runtime PM imbalance on error | expand |
On Thu, May 21, 2020 at 10:50 AM Dinghao Liu <dinghao.liu@zju.edu.cn> wrote: > > pm_runtime_get_sync() increments the runtime PM usage counter even > when it returns an error code. Thus a pairing decrement is needed on > the error handling path to keep the counter balanced. ... > ret = pm_runtime_get_sync(&pdev->dev); > if (ret < 0) { > dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret); > + pm_runtime_put(&pdev->dev); For all your patches, please, double check what you are proposing. Here, I believe, the correct one will be _put_noidle(). AFAIU you are not supposed to actually suspend the device in case of error. But I might be mistaken, thus see above. > goto exit_pm_disable; > }
On Thu, May 21, 2020 at 11:04 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Thu, May 21, 2020 at 10:50 AM Dinghao Liu <dinghao.liu@zju.edu.cn> wrote:
Any I have coccinelle script for this, I can share with you.
Hi Andy, Thank you for your advice! I will fix the problem in the next edition of patch. The coccinelle script will be very helpful and I'm looking forward to it. Regards, Dinghao "Andy Shevchenko" <andy.shevchenko@gmail.com>写道: > On Thu, May 21, 2020 at 11:04 AM Andy Shevchenko > <andy.shevchenko@gmail.com> wrote: > > On Thu, May 21, 2020 at 10:50 AM Dinghao Liu <dinghao.liu@zju.edu.cn> wrote: > > Any I have coccinelle script for this, I can share with you. > > -- > With Best Regards, > Andy Shevchenko
On 21/05/2020 09:04, Andy Shevchenko wrote: > On Thu, May 21, 2020 at 10:50 AM Dinghao Liu <dinghao.liu@zju.edu.cn> wrote: >> >> pm_runtime_get_sync() increments the runtime PM usage counter even >> when it returns an error code. Thus a pairing decrement is needed on >> the error handling path to keep the counter balanced. > > ... > >> ret = pm_runtime_get_sync(&pdev->dev); >> if (ret < 0) { >> dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret); > >> + pm_runtime_put(&pdev->dev); > > For all your patches, please, double check what you are proposing. > > Here, I believe, the correct one will be _put_noidle(). > > AFAIU you are not supposed to actually suspend the device in case of error. > But I might be mistaken, thus see above. > >> goto exit_pm_disable; >> } Is there any reason why this is not handled in pm_runtime_get itself? Jon
On 21/05/2020 09:38, Jon Hunter wrote: > > On 21/05/2020 09:04, Andy Shevchenko wrote: >> On Thu, May 21, 2020 at 10:50 AM Dinghao Liu <dinghao.liu@zju.edu.cn> wrote: >>> >>> pm_runtime_get_sync() increments the runtime PM usage counter even >>> when it returns an error code. Thus a pairing decrement is needed on >>> the error handling path to keep the counter balanced. >> >> ... >> >>> ret = pm_runtime_get_sync(&pdev->dev); >>> if (ret < 0) { >>> dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret); >> >>> + pm_runtime_put(&pdev->dev); >> >> For all your patches, please, double check what you are proposing. >> >> Here, I believe, the correct one will be _put_noidle(). >> >> AFAIU you are not supposed to actually suspend the device in case of error. >> But I might be mistaken, thus see above. >> >>> goto exit_pm_disable; >>> } > > > Is there any reason why this is not handled in pm_runtime_get itself? Ah I see a response from Rafael here: https://lkml.org/lkml/2020/5/20/1100 OK so this is intentional and needs to be fixed. Jon
Hi Andy, Thank you for your advice! Your suggestion is to use pm_runtime_put_noidle(), right? The only difference between pm_runtime_put() and this function is that pm_runtime_put() will run an extra pm_request_idle(). I checked this patched function again and found there is a pm_runtime_put() in the normal branch of pm_runtime_get_sync(). Does this mean the original program logic need to execute idle callback? According to runtime PM's doc, the pm_runtime_get_sync() call paired with a pm_runtime_put() call will be appropriate to ensure that the device is not put back to sleep during the probe. Therefore I think pm_runtime_put() is more appropriate here. Do you have more detailed suggestion for why we should use _put_noidle()? Regards, Dinghao > On Thu, May 21, 2020 at 10:50 AM Dinghao Liu <dinghao.liu@zju.edu.cn> wrote: > > > > pm_runtime_get_sync() increments the runtime PM usage counter even > > when it returns an error code. Thus a pairing decrement is needed on > > the error handling path to keep the counter balanced. > > ... > > > ret = pm_runtime_get_sync(&pdev->dev); > > if (ret < 0) { > > dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret); > > > + pm_runtime_put(&pdev->dev); > > For all your patches, please, double check what you are proposing. > > Here, I believe, the correct one will be _put_noidle(). > > AFAIU you are not supposed to actually suspend the device in case of error. > But I might be mistaken, thus see above. > > > goto exit_pm_disable; > > } > > -- > With Best Regards, > Andy Shevchenko
On Fri, May 22, 2020 at 10:46 AM <dinghao.liu@zju.edu.cn> wrote: > > Hi Andy, > > Thank you for your advice! You are welcome, but please, stop top-posting. > Your suggestion is to use pm_runtime_put_noidle(), right? > The only difference between pm_runtime_put() and this function > is that pm_runtime_put() will run an extra pm_request_idle(). > > I checked this patched function again and found there is a > pm_runtime_put() in the normal branch of pm_runtime_get_sync(). > Does this mean the original program logic need to execute idle > callback? > > According to runtime PM's doc, the pm_runtime_get_sync() call > paired with a pm_runtime_put() call will be appropriate to ensure > that the device is not put back to sleep during the probe. Correct. > Therefore > I think pm_runtime_put() is more appropriate here. How come to wrong conclusion? We are considering error path. What does documentation say about this? > Do you have > more detailed suggestion for why we should use _put_noidle()? Because in error case there is no need to go through all code patch to be sure that the device is idling. Moreover, consider below case CPU1: ...somewhere in the code... pm_runtime_get() // with success! ...see below... pm_runtime_put() CPU2: ...on parallel thread... ret = pm_runtime_get_sync() // failed! if (ret) pm_runtime_put() // oi vei, we put device into sleep So, there is a potential issue. > > > pm_runtime_get_sync() increments the runtime PM usage counter even > > > when it returns an error code. Thus a pairing decrement is needed on > > > the error handling path to keep the counter balanced. > > > > ... > > > > > ret = pm_runtime_get_sync(&pdev->dev); > > > if (ret < 0) { > > > dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret); > > > > > + pm_runtime_put(&pdev->dev); > > > > For all your patches, please, double check what you are proposing. > > > > Here, I believe, the correct one will be _put_noidle(). > > > > AFAIU you are not supposed to actually suspend the device in case of error. > > But I might be mistaken, thus see above. > > > > > goto exit_pm_disable; > > > }
On Fri, May 22, 2020 at 6:20 PM Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > On Fri, May 22, 2020 at 10:46 AM <dinghao.liu@zju.edu.cn> wrote: ... > Moreover, consider below case > > CPU1: ...somewhere in the code... > pm_runtime_get() // with success! > ...see below... > pm_runtime_put() > > CPU2: ...on parallel thread... > ret = pm_runtime_get_sync() // failed! > if (ret) > pm_runtime_put() // oi vei, we put device into sleep > > So, there is a potential issue. ...and even if it's impossible (no bugs in runtime PM core, etc) the code with put() looks suspicious.
> On Fri, May 22, 2020 at 6:20 PM Andy Shevchenko > <andy.shevchenko@gmail.com> wrote: > > On Fri, May 22, 2020 at 10:46 AM <dinghao.liu@zju.edu.cn> wrote: > > ... > > > Moreover, consider below case > > > > CPU1: ...somewhere in the code... > > pm_runtime_get() // with success! > > ...see below... > > pm_runtime_put() > > > > CPU2: ...on parallel thread... > > ret = pm_runtime_get_sync() // failed! > > if (ret) > > pm_runtime_put() // oi vei, we put device into sleep > > > > So, there is a potential issue. > > ...and even if it's impossible (no bugs in runtime PM core, etc) the > code with put() looks suspicious. > I may understand what you are worried about. Do you mean that executing pm_runtime_put() will influence other threads (e.g., one parallel thread can put the device into sleep while other threads are using this device)? I think this will never happen. Because in this case the PM usage counter cannot be decreased to zero if there are still some threads using this device. Otherwise, pm_runtime_put() should never be used in the case of multithreading, which is strange since this API is used widely. I also checked many other implementation of probe in drivers. It seems that using pm_runtime_put() is ok. If I misunderstood your opinion, please point it out, thanks. Regards, Dinghao > -- > With Best Regards, > Andy Shevchenko
On Sat, May 23, 2020 at 2:32 PM <dinghao.liu@zju.edu.cn> wrote: > > > On Fri, May 22, 2020 at 6:20 PM Andy Shevchenko > > <andy.shevchenko@gmail.com> wrote: > > > On Fri, May 22, 2020 at 10:46 AM <dinghao.liu@zju.edu.cn> wrote: ... > I also checked many other implementation of probe in drivers. > It seems that using pm_runtime_put() is ok. In *error path* or normal path? > If I misunderstood > your opinion, please point it out, thanks. Bottom line is (for the *error path* case): pm_runtime_put_noidle() has no side effects pm_runtime_put() (potentially) might have side effects. You should choose one which is clearer about what it does.
> On Sat, May 23, 2020 at 2:32 PM <dinghao.liu@zju.edu.cn> wrote: > > > > > On Fri, May 22, 2020 at 6:20 PM Andy Shevchenko > > > <andy.shevchenko@gmail.com> wrote: > > > > On Fri, May 22, 2020 at 10:46 AM <dinghao.liu@zju.edu.cn> wrote: > > ... > > > I also checked many other implementation of probe in drivers. > > It seems that using pm_runtime_put() is ok. > > In *error path* or normal path? > Error path (e.g., sysc_probe, exynos_trng_probe, map_rng_probe, ti_eqep_probe). > > If I misunderstood > > your opinion, please point it out, thanks. > > Bottom line is (for the *error path* case): > pm_runtime_put_noidle() has no side effects > pm_runtime_put() (potentially) might have side effects. > > You should choose one which is clearer about what it does. > > -- > With Best Regards, > Andy Shevchenko Agree, for this bug using _noidle() is clearer. I will send a new path to fix this. Regards, Dinghao
diff --git a/drivers/spi/spi-tegra20-slink.c b/drivers/spi/spi-tegra20-slink.c index 7f4d932dade7..15361db00982 100644 --- a/drivers/spi/spi-tegra20-slink.c +++ b/drivers/spi/spi-tegra20-slink.c @@ -1118,6 +1118,7 @@ static int tegra_slink_probe(struct platform_device *pdev) ret = pm_runtime_get_sync(&pdev->dev); if (ret < 0) { dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret); + pm_runtime_put(&pdev->dev); goto exit_pm_disable; } tspi->def_command_reg = SLINK_M_S;
pm_runtime_get_sync() increments the runtime PM usage counter even when it returns an error code. Thus a pairing decrement is needed on the error handling path to keep the counter balanced. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> --- drivers/spi/spi-tegra20-slink.c | 1 + 1 file changed, 1 insertion(+)