diff mbox series

drm/stm: Enable RPM during fbdev registration

Message ID 20201104125200.259639-1-marex@denx.de (mailing list archive)
State New, archived
Headers show
Series drm/stm: Enable RPM during fbdev registration | expand

Commit Message

Marek Vasut Nov. 4, 2020, 12:52 p.m. UTC
Enable runtime PM before registering the fbdev emulation and disable it
afterward, otherwise register access to the LTDC IP during the fbdev
emulation registration might hang the system.

The problem happens because RPM is activated at the end of ltdc_load(),
but the fbdev emulation registration happens only after that, and ends
up calling ltdc_crtc_mode_set_nofb(), which checks whether RPM is active
and only if it is not active, calls pm_runtime_get_sync() to enable the
clock and so on. If the clock are not enabled, any register access in
ltdc_crtc_mode_set_nofb() could hang the platform completely.

This patch makes sure that ltdc_crtc_mode_set_nofb() is called within
pm_runtime_get_sync(), so with clock enabled.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Benjamin Gaignard <benjamin.gaignard@st.com>
Cc: Philippe Cornu <philippe.cornu@st.com>
Cc: Yannick Fertré <yannick.fertre@st.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
---
 drivers/gpu/drm/stm/drv.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Daniel Vetter Nov. 5, 2020, 9:39 a.m. UTC | #1
On Wed, Nov 04, 2020 at 01:52:00PM +0100, Marek Vasut wrote:
> Enable runtime PM before registering the fbdev emulation and disable it
> afterward, otherwise register access to the LTDC IP during the fbdev
> emulation registration might hang the system.
> 
> The problem happens because RPM is activated at the end of ltdc_load(),
> but the fbdev emulation registration happens only after that, and ends
> up calling ltdc_crtc_mode_set_nofb(), which checks whether RPM is active
> and only if it is not active, calls pm_runtime_get_sync() to enable the
> clock and so on. If the clock are not enabled, any register access in
> ltdc_crtc_mode_set_nofb() could hang the platform completely.
> 
> This patch makes sure that ltdc_crtc_mode_set_nofb() is called within
> pm_runtime_get_sync(), so with clock enabled.
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Alexandre Torgue <alexandre.torgue@st.com>
> Cc: Benjamin Gaignard <benjamin.gaignard@st.com>
> Cc: Philippe Cornu <philippe.cornu@st.com>
> Cc: Yannick Fertré <yannick.fertre@st.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-stm32@st-md-mailman.stormreply.com

This looks like you're papering over a bug in your modeset code. If
userspace later on does a setpar on the fbdev chardev, the exact same
thing could happen. You need to fix your modeset code to avoid this, not
sprinkle temporary rpm_get/put all over some top level entry points,
because you can't even patch those all.
-Daniel


> ---
>  drivers/gpu/drm/stm/drv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/stm/drv.c b/drivers/gpu/drm/stm/drv.c
> index 411103f013e2..d8921edc83db 100644
> --- a/drivers/gpu/drm/stm/drv.c
> +++ b/drivers/gpu/drm/stm/drv.c
> @@ -197,7 +197,9 @@ static int stm_drm_platform_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto err_put;
>  
> +	pm_runtime_get_sync(ddev->dev);
>  	drm_fbdev_generic_setup(ddev, 16);
> +	pm_runtime_put_sync(ddev->dev);
>  
>  	return 0;
>  
> -- 
> 2.28.0
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Marek Vasut Nov. 5, 2020, 9:45 a.m. UTC | #2
On 11/5/20 10:39 AM, Daniel Vetter wrote:
> On Wed, Nov 04, 2020 at 01:52:00PM +0100, Marek Vasut wrote:
>> Enable runtime PM before registering the fbdev emulation and disable it
>> afterward, otherwise register access to the LTDC IP during the fbdev
>> emulation registration might hang the system.
>>
>> The problem happens because RPM is activated at the end of ltdc_load(),
>> but the fbdev emulation registration happens only after that, and ends
>> up calling ltdc_crtc_mode_set_nofb(), which checks whether RPM is active
>> and only if it is not active, calls pm_runtime_get_sync() to enable the
>> clock and so on. If the clock are not enabled, any register access in
>> ltdc_crtc_mode_set_nofb() could hang the platform completely.
>>
>> This patch makes sure that ltdc_crtc_mode_set_nofb() is called within
>> pm_runtime_get_sync(), so with clock enabled.

[...]

> This looks like you're papering over a bug in your modeset code. If
> userspace later on does a setpar on the fbdev chardev, the exact same
> thing could happen. You need to fix your modeset code to avoid this, not
> sprinkle temporary rpm_get/put all over some top level entry points,
> because you can't even patch those all.

I have a feeling all those pm_runtime_active() checks in the driver 
might be the root cause of this ? I wonder why the code doesn't use 
pm_runtime_{get,put}_sync() only when accessing registers. Thoughts?
Yannick FERTRE Nov. 6, 2020, 4:13 p.m. UTC | #3
Hi Marek,

On 11/5/20 10:45 AM, Marek Vasut wrote:
> On 11/5/20 10:39 AM, Daniel Vetter wrote:
>> On Wed, Nov 04, 2020 at 01:52:00PM +0100, Marek Vasut wrote:
>>> Enable runtime PM before registering the fbdev emulation and disable it
>>> afterward, otherwise register access to the LTDC IP during the fbdev
>>> emulation registration might hang the system.
>>>
>>> The problem happens because RPM is activated at the end of ltdc_load(),
>>> but the fbdev emulation registration happens only after that, and ends
>>> up calling ltdc_crtc_mode_set_nofb(), which checks whether RPM is active
>>> and only if it is not active, calls pm_runtime_get_sync() to enable the
>>> clock and so on. If the clock are not enabled, any register access in
>>> ltdc_crtc_mode_set_nofb() could hang the platform completely.
>>>
>>> This patch makes sure that ltdc_crtc_mode_set_nofb() is called within
>>> pm_runtime_get_sync(), so with clock enabled.
> 
> [...]
> 
>> This looks like you're papering over a bug in your modeset code. If
>> userspace later on does a setpar on the fbdev chardev, the exact same
>> thing could happen. You need to fix your modeset code to avoid this, not
>> sprinkle temporary rpm_get/put all over some top level entry points,
>> because you can't even patch those all.
> 
> I have a feeling all those pm_runtime_active() checks in the driver 
> might be the root cause of this ? I wonder why the code doesn't use 
> pm_runtime_{get,put}_sync() only when accessing registers. Thoughts?

First line of function ltdc_crtc_mode_set_nofb check the pm_runtime to 
avoid to access registers without clock enabled.



static void ltdc_crtc_mode_set_nofb(struct drm_crtc *crtc)
{
...
	if (!pm_runtime_active(ddev->dev)) {
		ret = pm_runtime_get_sync(ddev->dev);

I test the fb with framebuffer console, & it works fine on my side.
Do you test fb on a old kernel?
How can I reproduce your issue?

Best regards

Yannick
Marek Vasut Nov. 6, 2020, 4:23 p.m. UTC | #4
On 11/6/20 5:13 PM, Yannick FERTRE wrote:
> Hi Marek,

Hi,

> On 11/5/20 10:45 AM, Marek Vasut wrote:
>> On 11/5/20 10:39 AM, Daniel Vetter wrote:
>>> On Wed, Nov 04, 2020 at 01:52:00PM +0100, Marek Vasut wrote:
>>>> Enable runtime PM before registering the fbdev emulation and disable it
>>>> afterward, otherwise register access to the LTDC IP during the fbdev
>>>> emulation registration might hang the system.
>>>>
>>>> The problem happens because RPM is activated at the end of ltdc_load(),
>>>> but the fbdev emulation registration happens only after that, and ends
>>>> up calling ltdc_crtc_mode_set_nofb(), which checks whether RPM is active
>>>> and only if it is not active, calls pm_runtime_get_sync() to enable the
>>>> clock and so on. If the clock are not enabled, any register access in
>>>> ltdc_crtc_mode_set_nofb() could hang the platform completely.
>>>>
>>>> This patch makes sure that ltdc_crtc_mode_set_nofb() is called within
>>>> pm_runtime_get_sync(), so with clock enabled.
>>
>> [...]
>>
>>> This looks like you're papering over a bug in your modeset code. If
>>> userspace later on does a setpar on the fbdev chardev, the exact same
>>> thing could happen. You need to fix your modeset code to avoid this, not
>>> sprinkle temporary rpm_get/put all over some top level entry points,
>>> because you can't even patch those all.
>>
>> I have a feeling all those pm_runtime_active() checks in the driver
>> might be the root cause of this ? I wonder why the code doesn't use
>> pm_runtime_{get,put}_sync() only when accessing registers. Thoughts?
> 
> First line of function ltdc_crtc_mode_set_nofb check the pm_runtime to
> avoid to access registers without clock enabled.
> 
> 
> 
> static void ltdc_crtc_mode_set_nofb(struct drm_crtc *crtc)
> {
> ...
> 	if (!pm_runtime_active(ddev->dev)) {
> 		ret = pm_runtime_get_sync(ddev->dev);
> 
> I test the fb with framebuffer console, & it works fine on my side.
> Do you test fb on a old kernel?
> How can I reproduce your issue?

I observed sporadic hangs and tracked it down to the fbdev registration, 
which calls this code. Note that pm_runtime_active() here will return 0, 
because it was already activated in ltdc_load().

My question in reply to Daniel was more geared toward why do we even 
have all these checks in the driver, wouldn't it be better to just 
always call pm_runtime_get_sync()/pm_runtime_put_sync() in the code 
which requires the hardware to be active ?
diff mbox series

Patch

diff --git a/drivers/gpu/drm/stm/drv.c b/drivers/gpu/drm/stm/drv.c
index 411103f013e2..d8921edc83db 100644
--- a/drivers/gpu/drm/stm/drv.c
+++ b/drivers/gpu/drm/stm/drv.c
@@ -197,7 +197,9 @@  static int stm_drm_platform_probe(struct platform_device *pdev)
 	if (ret)
 		goto err_put;
 
+	pm_runtime_get_sync(ddev->dev);
 	drm_fbdev_generic_setup(ddev, 16);
+	pm_runtime_put_sync(ddev->dev);
 
 	return 0;