diff mbox

[1/2] drm/msm/dsi: Use correct pm_runtime_put variant during host_init

Message ID 20171006105707.20416-2-architt@codeaurora.org (mailing list archive)
State New, archived
Headers show

Commit Message

Archit Taneja Oct. 6, 2017, 10:57 a.m. UTC
The DSI runtime PM suspend/resume callbacks check whether
msm_host->cfg_hnd is non-NULL before trying to enable the bus clocks.
This is done to accommodate early calls to these functions that may
happen before the bus clocks are even initialized.

Calling pm_runtime_put_autosuspend() in dsi_host_init() can result in
racy behaviour since msm_host->cfg_hnd is set very soon after. If the
suspend callback happens too late, we end up trying to disable clocks
that were never enabled, resulting in a bunch of WARN_ON splats.

Use pm_runtime_put_sync() so that the suspend callback is called
immediately.

Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
---
 drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Daniel Vetter Oct. 6, 2017, 2:02 p.m. UTC | #1
On Fri, Oct 06, 2017 at 04:27:06PM +0530, Archit Taneja wrote:
> The DSI runtime PM suspend/resume callbacks check whether
> msm_host->cfg_hnd is non-NULL before trying to enable the bus clocks.
> This is done to accommodate early calls to these functions that may
> happen before the bus clocks are even initialized.
> 
> Calling pm_runtime_put_autosuspend() in dsi_host_init() can result in
> racy behaviour since msm_host->cfg_hnd is set very soon after. If the
> suspend callback happens too late, we end up trying to disable clocks
> that were never enabled, resulting in a bunch of WARN_ON splats.

Sounds like the correct fix here is to block autosuspend until after
everything is set up, including bus clocks. This patch just makes the race
harder to hit in practice ...
-Daniel

> 
> Use pm_runtime_put_sync() so that the suspend callback is called
> immediately.
> 
> Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
> Signed-off-by: Archit Taneja <architt@codeaurora.org>
> ---
>  drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
> index dbb31a014419..deaf869374ea 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
> @@ -248,7 +248,7 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>  	clk_disable_unprepare(ahb_clk);
>  disable_gdsc:
>  	regulator_disable(gdsc_reg);
> -	pm_runtime_put_autosuspend(dev);
> +	pm_runtime_put_sync(dev);
>  put_clk:
>  	clk_put(ahb_clk);
>  put_gdsc:
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Archit Taneja Oct. 10, 2017, 3:28 a.m. UTC | #2
On 10/06/2017 07:32 PM, Daniel Vetter wrote:
> On Fri, Oct 06, 2017 at 04:27:06PM +0530, Archit Taneja wrote:
>> The DSI runtime PM suspend/resume callbacks check whether
>> msm_host->cfg_hnd is non-NULL before trying to enable the bus clocks.
>> This is done to accommodate early calls to these functions that may
>> happen before the bus clocks are even initialized.
>>
>> Calling pm_runtime_put_autosuspend() in dsi_host_init() can result in
>> racy behaviour since msm_host->cfg_hnd is set very soon after. If the
>> suspend callback happens too late, we end up trying to disable clocks
>> that were never enabled, resulting in a bunch of WARN_ON splats.
> 
> Sounds like the correct fix here is to block autosuspend until after
> everything is set up, including bus clocks. This patch just makes the race
> harder to hit in practice ...

Thanks for the review. pm_runtime_put_sync() ensures that the suspend handler
is called in the same context as that of the caller, right? msm_host->cfg_hnd
is set to a non-NULL value only when we return from dsi_get_config(). The race
would never happen in this case.

This call is a one time thing during DSI probe, we do a pm_runtime_get_sync()
just so that we can read the block revision number. Once we have the revision
number, we look at an internal table which maintains IP version specific
resources, like what bus clocks to get, etc. Having pm_runtime_put_autosuspend()
here didn't help much anyway.

Archit


> -Daniel
> 
>>
>> Use pm_runtime_put_sync() so that the suspend callback is called
>> immediately.
>>
>> Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
>> Signed-off-by: Archit Taneja <architt@codeaurora.org>
>> ---
>>   drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
>> index dbb31a014419..deaf869374ea 100644
>> --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
>> +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
>> @@ -248,7 +248,7 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>>   	clk_disable_unprepare(ahb_clk);
>>   disable_gdsc:
>>   	regulator_disable(gdsc_reg);
>> -	pm_runtime_put_autosuspend(dev);
>> +	pm_runtime_put_sync(dev);
>>   put_clk:
>>   	clk_put(ahb_clk);
>>   put_gdsc:
>> -- 
>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
>> hosted by The Linux Foundation
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
Daniel Vetter Oct. 10, 2017, 8:59 a.m. UTC | #3
On Tue, Oct 10, 2017 at 08:58:03AM +0530, Archit Taneja wrote:
> 
> 
> On 10/06/2017 07:32 PM, Daniel Vetter wrote:
> > On Fri, Oct 06, 2017 at 04:27:06PM +0530, Archit Taneja wrote:
> > > The DSI runtime PM suspend/resume callbacks check whether
> > > msm_host->cfg_hnd is non-NULL before trying to enable the bus clocks.
> > > This is done to accommodate early calls to these functions that may
> > > happen before the bus clocks are even initialized.
> > > 
> > > Calling pm_runtime_put_autosuspend() in dsi_host_init() can result in
> > > racy behaviour since msm_host->cfg_hnd is set very soon after. If the
> > > suspend callback happens too late, we end up trying to disable clocks
> > > that were never enabled, resulting in a bunch of WARN_ON splats.
> > 
> > Sounds like the correct fix here is to block autosuspend until after
> > everything is set up, including bus clocks. This patch just makes the race
> > harder to hit in practice ...
> 
> Thanks for the review. pm_runtime_put_sync() ensures that the suspend handler
> is called in the same context as that of the caller, right? msm_host->cfg_hnd
> is set to a non-NULL value only when we return from dsi_get_config(). The race
> would never happen in this case.
> 
> This call is a one time thing during DSI probe, we do a pm_runtime_get_sync()
> just so that we can read the block revision number. Once we have the revision
> number, we look at an internal table which maintains IP version specific
> resources, like what bus clocks to get, etc. Having pm_runtime_put_autosuspend()
> here didn't help much anyway.

Maybe this stuff is different on arm than on pci, but on x86 you have to
explicitly enable autosuspend. Before that, the device stays on.

What I mean with properly fixing this, is to delay enabling of autosuspend
until you're fully set up. Which then allows you to drop the check for
clocks and other stuff.

For i915, what we do is hold an artificial runtime pm reference that we
grab first thing in ->probe and drop only once everything is fully set up
(including asynchronous workers that load firmware). That way we make sure
that our runtime pm code never sees a partially initiliazed driver.

Doing something similar sounds best too, i.e. instead of dropping the
runtime pm reference here, only drop it once dsi_get_config has been
called. Pronto, no race, no need for special functions.
-Daniel

> 
> Archit
> 
> 
> > -Daniel
> > 
> > > 
> > > Use pm_runtime_put_sync() so that the suspend callback is called
> > > immediately.
> > > 
> > > Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
> > > Signed-off-by: Archit Taneja <architt@codeaurora.org>
> > > ---
> > >   drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
> > > index dbb31a014419..deaf869374ea 100644
> > > --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
> > > +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
> > > @@ -248,7 +248,7 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
> > >   	clk_disable_unprepare(ahb_clk);
> > >   disable_gdsc:
> > >   	regulator_disable(gdsc_reg);
> > > -	pm_runtime_put_autosuspend(dev);
> > > +	pm_runtime_put_sync(dev);
> > >   put_clk:
> > >   	clk_put(ahb_clk);
> > >   put_gdsc:
> > > -- 
> > > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> > > hosted by The Linux Foundation
> > > 
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
Rob Clark Oct. 10, 2017, 12:07 p.m. UTC | #4
On Tue, Oct 10, 2017 at 4:59 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Oct 10, 2017 at 08:58:03AM +0530, Archit Taneja wrote:
>>
>>
>> On 10/06/2017 07:32 PM, Daniel Vetter wrote:
>> > On Fri, Oct 06, 2017 at 04:27:06PM +0530, Archit Taneja wrote:
>> > > The DSI runtime PM suspend/resume callbacks check whether
>> > > msm_host->cfg_hnd is non-NULL before trying to enable the bus clocks.
>> > > This is done to accommodate early calls to these functions that may
>> > > happen before the bus clocks are even initialized.
>> > >
>> > > Calling pm_runtime_put_autosuspend() in dsi_host_init() can result in
>> > > racy behaviour since msm_host->cfg_hnd is set very soon after. If the
>> > > suspend callback happens too late, we end up trying to disable clocks
>> > > that were never enabled, resulting in a bunch of WARN_ON splats.
>> >
>> > Sounds like the correct fix here is to block autosuspend until after
>> > everything is set up, including bus clocks. This patch just makes the race
>> > harder to hit in practice ...
>>
>> Thanks for the review. pm_runtime_put_sync() ensures that the suspend handler
>> is called in the same context as that of the caller, right? msm_host->cfg_hnd
>> is set to a non-NULL value only when we return from dsi_get_config(). The race
>> would never happen in this case.
>>
>> This call is a one time thing during DSI probe, we do a pm_runtime_get_sync()
>> just so that we can read the block revision number. Once we have the revision
>> number, we look at an internal table which maintains IP version specific
>> resources, like what bus clocks to get, etc. Having pm_runtime_put_autosuspend()
>> here didn't help much anyway.
>
> Maybe this stuff is different on arm than on pci, but on x86 you have to
> explicitly enable autosuspend. Before that, the device stays on.

On arm, we cannot assume that clks are enabled ever unless we've
enabled them.  And unclocked register access is insta-reboot.
Although we probably shouldn't be using _autosuspend() on the display
side of things.  It makes sense for the gpu, where "booting up" the
gpu is a heavier process and we might want to keep things alive for a
bit longer incase another submit comes in.

BR,
-R

> What I mean with properly fixing this, is to delay enabling of autosuspend
> until you're fully set up. Which then allows you to drop the check for
> clocks and other stuff.
>
> For i915, what we do is hold an artificial runtime pm reference that we
> grab first thing in ->probe and drop only once everything is fully set up
> (including asynchronous workers that load firmware). That way we make sure
> that our runtime pm code never sees a partially initiliazed driver.
>
> Doing something similar sounds best too, i.e. instead of dropping the
> runtime pm reference here, only drop it once dsi_get_config has been
> called. Pronto, no race, no need for special functions.
> -Daniel
>
>>
>> Archit
>>
>>
>> > -Daniel
>> >
>> > >
>> > > Use pm_runtime_put_sync() so that the suspend callback is called
>> > > immediately.
>> > >
>> > > Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
>> > > Signed-off-by: Archit Taneja <architt@codeaurora.org>
>> > > ---
>> > >   drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +-
>> > >   1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
>> > > index dbb31a014419..deaf869374ea 100644
>> > > --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
>> > > +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
>> > > @@ -248,7 +248,7 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>> > >           clk_disable_unprepare(ahb_clk);
>> > >   disable_gdsc:
>> > >           regulator_disable(gdsc_reg);
>> > > - pm_runtime_put_autosuspend(dev);
>> > > + pm_runtime_put_sync(dev);
>> > >   put_clk:
>> > >           clk_put(ahb_clk);
>> > >   put_gdsc:
>> > > --
>> > > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
>> > > hosted by The Linux Foundation
>> > >
>> > > _______________________________________________
>> > > dri-devel mailing list
>> > > dri-devel@lists.freedesktop.org
>> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>> >
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> a Linux Foundation Collaborative Project
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
diff mbox

Patch

diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
index dbb31a014419..deaf869374ea 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -248,7 +248,7 @@  static const struct msm_dsi_cfg_handler *dsi_get_config(
 	clk_disable_unprepare(ahb_clk);
 disable_gdsc:
 	regulator_disable(gdsc_reg);
-	pm_runtime_put_autosuspend(dev);
+	pm_runtime_put_sync(dev);
 put_clk:
 	clk_put(ahb_clk);
 put_gdsc: