[0/3] omapdrm: Fix runtime PM issues at module load and unload time

Message ID 20181101102525.6582-1-laurent.pinchart@ideasonboard.com (mailing list archive)

Message

Laurent Pinchart Nov. 1, 2018, 10:25 a.m. UTC
Hello,

This series fixes crashes in the omapdss driver at both load and unload
time, due to runtime PM problems related to probe deferral. The bugs got
introduced in v4.20-rc, so this series should be considered v4.20 fixes.

At the core of the problem comes commit 27d624527d99 ("drm/omap: dss:
Acquire next dssdev at probe time") which can cause probe deferral for
the DSS when the encoder and panel modules are not loaded yet. As the
omapdss module contains drivers for the DSS and all its children, this
results in the internal encoders being probed before the DSS. Missing
runtime PM handling around register access then caused imprecise
external aborts.
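
As an illustration of the failure mode described above, here is a minimal userspace sketch of register access guarded by a runtime PM style usage counter. It is not taken from the omapdss driver: every `model_*` name is hypothetical, and the get/put helpers only loosely mimic what `pm_runtime_get_sync()` and `pm_runtime_put_sync()` do for a real device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of a device; not the omapdss structures. */
struct model_dev {
	int usage_count;	/* runtime PM style usage counter */
	int powered;		/* 1 while the block is powered and clocked */
	uint32_t reg;		/* a single fake hardware register */
};

/* Loosely models pm_runtime_get_sync(): power up on the first reference. */
static int model_pm_get(struct model_dev *dev)
{
	if (dev->usage_count++ == 0)
		dev->powered = 1;
	return 0;
}

/* Loosely models pm_runtime_put_sync(): power down on the last reference. */
static void model_pm_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);	/* catch unbalanced put calls */
	if (--dev->usage_count == 0)
		dev->powered = 0;
}

/* Register read; returns -1 where real hardware would raise an abort. */
static int model_reg_read(const struct model_dev *dev, uint32_t *val)
{
	if (!dev->powered)
		return -1;
	*val = dev->reg;
	return 0;
}

/* Probe-time register access wrapped in a get/put pair. */
static int model_probe_read_revision(struct model_dev *dev, uint32_t *rev)
{
	int ret;

	ret = model_pm_get(dev);
	if (ret)
		return ret;

	ret = model_reg_read(dev, rev);
	model_pm_put(dev);

	return ret;
}
```

In this model a bare `model_reg_read()` on an unpowered device fails, while the wrapped variant succeeds and leaves the usage counter balanced.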

Patch 1/3 moves population of the DSS children from arch code to the
omapdss driver. This prevents the DSS children from being probed before
the DSS. The change can be considered as a workaround in the sense that
runtime PM of the DSS children should operate correctly even when the
DSS probe fails. However, given that reducing the amount of arch code is
an improvement in itself, I believe the solution to be acceptable.

Patches 2/3 and 3/3 then ensure that the HDMI4 and DSI devices are
active at bind and probe time respectively, in order to access hardware
registers there.

Tony, patch 1/3 touches both the mach-omap2 and omapdss. Would you be
fine merging it through the DRM tree ? I don't think there's a risk of
conflict during the v4.20-rc cycle.

Laurent Pinchart (3):
  drm/omap: Populate DSS children in omapdss driver
  drm/omap: hdmi4: Ensure the device is active during bind
  drm/omap: dsi: Ensure the device is active during probe

 arch/arm/mach-omap2/display.c       | 111 +++++++++++++---------------
 drivers/gpu/drm/omapdrm/dss/dsi.c   |  25 +++++--
 drivers/gpu/drm/omapdrm/dss/dss.c   |  11 ++-
 drivers/gpu/drm/omapdrm/dss/hdmi4.c |  10 ++-
 4 files changed, 91 insertions(+), 66 deletions(-)

Comments

Tomi Valkeinen Nov. 1, 2018, 11:47 a.m. UTC | #1
Hi Laurent,

On 01/11/18 12:25, Laurent Pinchart wrote:
> Hello,
> 
> This series fixes crashes in the omapdss driver at both load and unload
> time, due to runtime PM problems related to probe deferral. The bugs got
> introduced in v4.20-rc, so this series should be considered v4.20 fixes.
> 
> At the core of the problem comes commit 27d624527d99 ("drm/omap: dss:
> Acquire next dssdev at probe time") which can cause probe deferral for
> the DSS when the encoder and panel modules are not loaded yet. As the
> omapdss module contains drivers for the DSS and all its children, this
> results in the internal encoders being probed before the DSS. Missing
> runtime PM handling around register access then caused imprecise
> external aborts.
> 
> Patch 1/3 moves population of the DSS children from arch code to the
> omapdss driver. This prevents the DSS children from being probed before
> the DSS. The change can be considered as a workaround in the sense that
> runtime PM of the DSS children should operate correctly even when the
> DSS probe fails. However, given that reducing the amount of arch code is
> an improvement in itself, I believe the solution to be acceptable.
> 
> Patches 2/3 and 3/3 then ensure that the HDMI4 and DSI devices are
> active at bind and probe time respectively, in order to access hardware
> registers there.
> 
> Tony, patch 1/3 touches both the mach-omap2 and omapdss. Would you be
> fine merging it through the DRM tree ? I don't think there's a risk of
> conflict during the v4.20-rc cycle.

Thanks for debugging this! I have to say I really don't like these
(well, 2 is fine), as they feel like hacks.

We do dispc_runtime_get/put in the HDMI driver's suspend/resume too, so
don't we need similar hack (as you add in dsi.c) there also?

So we have two problems (aside from the missing runtime_get/put calls in those
few places):

- All DSS submodules depend on DSS (dss core)

- All DSS encoders depend on DISPC. The dependency is there only when
we're starting the video stream.

Can we not defer the probe of the submodules until the dependencies have
been probed? Are there any other ways to manage such device
dependencies? Possibly we could handle the driver registration so that
we only register encoders when dss and dispc have been probed, but that
feels a bit hacky too.

And considering these issues, is the assumption that "There's no reason
to delay initialization of most of the driver (such as mapping memory
I/O or enabling runtime PM) to the component bind handler." still valid?
 edb715dffdee71bb8216ee4d71c0714d932e9acf doesn't really state any
benefit for this move either. Clearly there's a reason to do this in
bind, although it doesn't solve the dispc dependency.

 Tomi

Laurent Pinchart Nov. 1, 2018, 12:13 p.m. UTC | #2
Hi Tomi,

On Thursday, 1 November 2018 13:47:40 EET Tomi Valkeinen wrote:
> On 01/11/18 12:25, Laurent Pinchart wrote:
> > Hello,
> > 
> > This series fixes crashes in the omapdss driver at both load and unload
> > time, due to runtime PM problems related to probe deferral. The bugs got
> > introduced in v4.20-rc, so this series should be considered v4.20 fixes.
> > 
> > At the core of the problem comes commit 27d624527d99 ("drm/omap: dss:
> > Acquire next dssdev at probe time") which can cause probe deferral for
> > the DSS when the encoder and panel modules are not loaded yet. As the
> > omapdss module contains drivers for the DSS and all its children, this
> > results in the internal encoders being probed before the DSS. Missing
> > runtime PM handling around register access then caused imprecise
> > external aborts.
> > 
> > Patch 1/3 moves population of the DSS children from arch code to the
> > omapdss driver. This prevents the DSS children from being probed before
> > the DSS. The change can be considered as a workaround in the sense that
> > runtime PM of the DSS children should operate correctly even when the
> > DSS probe fails. However, given that reducing the amount of arch code is
> > an improvement in itself, I believe the solution to be acceptable.
> > 
> > Patches 2/3 and 3/3 then ensure that the HDMI4 and DSI devices are
> > active at bind and probe time respectively, in order to access hardware
> > registers there.
> > 
> > Tony, patch 1/3 touches both the mach-omap2 and omapdss. Would you be
> > fine merging it through the DRM tree ? I don't think there's a risk of
> > conflict during the v4.20-rc cycle.
> 
> Thanks for debugging this! I have to say I really don't like these
> (well, 2 is fine), as they feel like hacks.

I assume you also have nothing against the first hunk of patch 3/3.

> We do dispc_runtime_get/put in the HDMI driver's suspend/resume too, so
> don't we need similar hack (as you add in dsi.c) there also?

We would if we had to access HDMI registers at probe time.

> So we have two problems (aside from the missing runtime_get/put calls in those
> few places):
> 
> - All DSS submodules depend on DSS (dss core)
> 
> - All DSS encoders depend on DISPC. The dependency is there only when
> we're starting the video stream.
> 
> Can we not defer the probe of the submodules until the dependencies have
> been probed?

That would be difficult, as the submodules don't know about the DSS. It goes 
the other way round, it's the DSS that gathers a list of submodules.

Furthermore the DSS probe is deferred due to the devices connected to the DPI 
and SDI outputs not being probed yet. See dpi_init_output_port(), the call to 
omapdss_of_find_connected_device() returns -EPROBE_DEFER. This is why the DSS 
probe is deferred if you load the omapdss module before all the other modules 
for the external components. Getting hold of the next device in the chain was 
needed to reverse the chain control direction.
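
The deferral sequence can be sketched in userspace as follows. This is a simplified model, not the driver code: the `model_*` names are hypothetical, `model_find_connected()` merely stands in for the lookup of the connected device, and `EPROBE_DEFER` is redefined locally with the value the kernel uses. The sketch also assumes that the children are only created once the DSS probe has succeeded.

```c
#include <assert.h>
#include <stddef.h>

/* Same value as the kernel's EPROBE_DEFER, redefined here for userspace. */
#define EPROBE_DEFER 517

/* Hypothetical external component, e.g. a panel behind the DPI output. */
struct model_panel {
	int probed;
};

/* Hypothetical DSS state. */
struct model_dss {
	int probed;
	int children_populated;
};

/* Stand-in for looking up the device connected to an output port. */
static int model_find_connected(const struct model_panel *panel)
{
	return panel->probed ? 0 : -EPROBE_DEFER;
}

/*
 * DSS probe: defer while the connected device is missing, and create the
 * children only after everything else has succeeded.
 */
static int model_dss_probe(struct model_dss *dss,
			   const struct model_panel *panel)
{
	int ret;

	assert(dss != NULL && panel != NULL);

	ret = model_find_connected(panel);
	if (ret)
		return ret;	/* nothing initialized, no children created */

	dss->probed = 1;
	dss->children_populated = 1;

	return 0;
}
```

A first probe attempt made before the panel driver is loaded returns `-EPROBE_DEFER` and creates no children; a second attempt after the panel has probed succeeds.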

I would like to also point out that regardless of the underlying issue, I 
think creating the DSS children in the DSS driver instead of mach-omap2 code 
is the right thing to do, as the DSS really handles the bus through which the 
children are accessed. This is similar to how an I2C adapter creates the I2C 
slaves.

> Are there any other ways to manage such device dependencies? Possibly we
> could handle the driver registration so that we only register encoders when
> dss and dispc have been probed, but that feels a bit hacky too.

I also think that's an even bigger hack.

> And considering these issues, is the assumption that "There's no reason
> to delay initialization of most of the driver (such as mapping memory
> I/O or enabling runtime PM) to the component bind handler." still valid?
> edb715dffdee71bb8216ee4d71c0714d932e9acf doesn't really state any
> benefit for this move either. Clearly there's a reason to do this in
> bind, although it doesn't solve the dispc dependency.

We could try moving code back to the bind handler, but that would get in the 
way of probe deferral until bridges are available. In order to solve that we 
would need to convert all bridges to the component framework, which we can't 
do as they're used by display drivers that don't use components. Lovely, isn't 
it ?

There's a nearly infinite amount of problems to fix and I wish I could tackle 
them all at the same time, but that's unfortunately not possible. I would like 
at some point in the future to add an asynchronous bind mechanism to DRM 
bridge itself, whose usage would be optional, unlike the component framework. 
This would help with probe handling, but it's way down the road.

Tomi Valkeinen Nov. 1, 2018, 12:56 p.m. UTC | #3
On 01/11/18 14:13, Laurent Pinchart wrote:

>> Thanks for debugging this! I have to say I really don't like these
>> (well, 2 is fine), as they feel like hacks.
> 
> I assume you also have nothing against the first hunk of patch 3/3.

Yes. And after our discussion, I think 1 is fine too, so the only
problem is the latter part of patch 3.

>> Can we not defer the probe of the submodules until the dependencies have
>> been probed?
> 
> That would be difficult, as the submodules don't know about the DSS. It goes 
> the other way round, it's the DSS that gathers a list of submodules.
> 
> Furthermore the DSS probe is deferred due to the devices connected to the DPI 
> and SDI outputs not being probed yet. See dpi_init_output_port(), the call to 
> omapdss_of_find_connected_device() returns -EPROBE_DEFER. This is why the DSS 
> probe is deferred if you load the omapdss module before all the other modules 
> for the external components. Getting hold of the next device in the chain was 
> needed to reverse the chain control direction.

Ok.

> I would like to also point out that regardless of the underlying issue, I 
> think creating the DSS children in the DSS driver instead of mach-omap2 code 
> is the right thing to do, as the DSS really handles the bus through which the 
> children are accessed. This is similar to how an I2C adapter creates the I2C 
> slaves.

Agreed.

>> And considering these issues, is the assumption that "There's no reason
>> to delay initialization of most of the driver (such as mapping memory
>> I/O or enabling runtime PM) to the component bind handler." still valid?
>> edb715dffdee71bb8216ee4d71c0714d932e9acf doesn't really state any
>> benefit for this move either. Clearly there's a reason to do this in
>> bind, although it doesn't solve the dispc dependency.
> 
> We could try moving code back to the bind handler, but that would get in the 
> way of probe deferral until bridges are available. In order to solve that we 
> would need to convert all bridges to the component framework, which we can't 
> do as they're used by display drivers that don't use components. Lovely, isn't 
> it ?
> 
> There's a nearly infinite amount of problems to fix and I wish I could tackle 
> them all at the same time, but that's unfortunately not possible. I would like 
> at some point in the future to add an asynchronous bind mechanism to DRM 
> bridge itself, whose usage would be optional, unlike the component framework. 
> This would help with probe handling, but it's way down the road.

Yep, this seems to be a rather complex issue. So, maybe we can go with the
third patch too, but perhaps split it in two, as the first hunk is fine.
And perhaps mark the rest clearly as a HACK, including in the suspend path.

 Tomi

Tony Lindgren Nov. 1, 2018, 3:58 p.m. UTC | #4
Hi,

* Laurent Pinchart <laurent.pinchart@ideasonboard.com> [181101 12:13]:
> On Thursday, 1 November 2018 13:47:40 EET Tomi Valkeinen wrote:
> > We do dispc_runtime_get/put in the HDMI driver's suspend/resume too, so
> > don't we need similar hack (as you add in dsi.c) there also?
> 
> We would if we had to access HDMI registers at probe time.

With these I'm still seeing the following issue with hdmi on rmmod
of omapdrm related modules as hdmi->dss is NULL in hdmi_runtime_resume.

Regards,

Tony

8< ------
Unable to handle kernel NULL pointer dereference at virtual address 00000278
...
PC is at hdmi_runtime_resume+0xc/0x1c [omapdss]
LR is at __rpm_callback+0x144/0x1d8
...
(hdmi_runtime_resume [omapdss]) from [<c06079b4>] (__rpm_callback+0x144/0x1d8)
(__rpm_callback) from [<c0607a68>] (rpm_callback+0x20/0x80)
(rpm_callback) from [<c06075f0>] (rpm_resume+0x60c/0x828)
(rpm_resume) from [<c0607858>] (__pm_runtime_resume+0x4c/0x64)
(__pm_runtime_resume) from [<c05fc7ec>] (device_release_driver_internal+0x130/0x234)
(device_release_driver_internal) from [<c05fc934>] (driver_detach+0x38/0x6c)
(driver_detach) from [<c05fb698>] (bus_remove_driver+0x4c/0xa4)
(bus_remove_driver) from [<c05fe23c>] (platform_unregister_drivers+0x20/0x2c)
(platform_unregister_drivers) from [<c01f0fe0>] (sys_delete_module+0x1c0/0x230)
(sys_delete_module) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
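
The trace is consistent with a runtime resume callback dereferencing a pointer that is only set at bind time, invoked when the driver core resumes the device before releasing the driver. The userspace sketch below illustrates that pattern with a defensive check. It is an illustration only, with hypothetical `model_*` names, and makes no claim about how the driver was or should be fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DSS core and the HDMI encoder. */
struct model_dss {
	int enabled;
};

struct model_hdmi {
	struct model_dss *dss;	/* only valid between bind and unbind */
};

static void model_hdmi_bind(struct model_hdmi *hdmi, struct model_dss *dss)
{
	hdmi->dss = dss;
}

static void model_hdmi_unbind(struct model_hdmi *hdmi)
{
	hdmi->dss = NULL;
}

/*
 * Runtime resume callback. Without the NULL check, calling this before
 * bind or after unbind would dereference a NULL pointer.
 */
static int model_hdmi_runtime_resume(struct model_hdmi *hdmi)
{
	assert(hdmi != NULL);

	if (!hdmi->dss)
		return -1;	/* not bound: nothing to resume */

	hdmi->dss->enabled = 1;

	return 0;
}
```

In this model a resume attempted before bind or after unbind returns an error instead of crashing.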

Laurent Pinchart Nov. 1, 2018, 4:17 p.m. UTC | #5
Hi Tony,

On Thursday, 1 November 2018 17:58:56 EET Tony Lindgren wrote:
> * Laurent Pinchart <laurent.pinchart@ideasonboard.com> [181101 12:13]:
> > On Thursday, 1 November 2018 13:47:40 EET Tomi Valkeinen wrote:
> > > We do dispc_runtime_get/put in the HDMI driver's suspend/resume too, so
> > > don't we need similar hack (as you add in dsi.c) there also?
> > 
> > We would if we had to access HDMI registers at probe time.
> 
> With these I'm still seeing the following issue with hdmi on rmmod
> of omapdrm related modules as hdmi->dss is NULL in hdmi_runtime_resume.

This is actually what I expected, but to my surprise the problem didn't occur 
on my system, I don't know why. I'll try to investigate.

> Regards,
> 
> Tony
> 
> 8< ------
> Unable to handle kernel NULL pointer dereference at virtual address 00000278
> ...
> PC is at hdmi_runtime_resume+0xc/0x1c [omapdss]
> LR is at __rpm_callback+0x144/0x1d8
> ...
> (hdmi_runtime_resume [omapdss]) from [<c06079b4>] (__rpm_callback+0x144/0x1d8)
> (__rpm_callback) from [<c0607a68>] (rpm_callback+0x20/0x80)
> (rpm_callback) from [<c06075f0>] (rpm_resume+0x60c/0x828)
> (rpm_resume) from [<c0607858>] (__pm_runtime_resume+0x4c/0x64)
> (__pm_runtime_resume) from [<c05fc7ec>] (device_release_driver_internal+0x130/0x234)
> (device_release_driver_internal) from [<c05fc934>] (driver_detach+0x38/0x6c)
> (driver_detach) from [<c05fb698>] (bus_remove_driver+0x4c/0xa4)
> (bus_remove_driver) from [<c05fe23c>] (platform_unregister_drivers+0x20/0x2c)
> (platform_unregister_drivers) from [<c01f0fe0>] (sys_delete_module+0x1c0/0x230)
> (sys_delete_module) from [<c0101000>] (ret_fast_syscall+0x0/0x28)

Laurent Pinchart Nov. 5, 2018, 3:14 p.m. UTC | #6
Hi Tony,

On Thursday, 1 November 2018 18:17:43 EET Laurent Pinchart wrote:
> On Thursday, 1 November 2018 17:58:56 EET Tony Lindgren wrote:
> > * Laurent Pinchart <laurent.pinchart@ideasonboard.com> [181101 12:13]:
> >> On Thursday, 1 November 2018 13:47:40 EET Tomi Valkeinen wrote:
> >>> We do dispc_runtime_get/put in the HDMI driver's suspend/resume too,
> >>> so don't we need similar hack (as you add in dsi.c) there also?
> >> 
> >> We would if we had to access HDMI registers at probe time.
> > 
> > With these I'm still seeing the following issue with hdmi on rmmod
> > of omapdrm related modules as hdmi->dss is NULL in hdmi_runtime_resume.
> 
> This is actually what I expected, but to my surprise the problem didn't
> occur on my system, I don't know why. I'll try to investigate.

I've submitted a v2 which should hopefully fix this.

Tony Lindgren Nov. 5, 2018, 8:15 p.m. UTC | #7
* Laurent Pinchart <laurent.pinchart@ideasonboard.com> [181105 15:14]:
> Hi Tony,
> 
> On Thursday, 1 November 2018 18:17:43 EET Laurent Pinchart wrote:
> > On Thursday, 1 November 2018 17:58:56 EET Tony Lindgren wrote:
> > > * Laurent Pinchart <laurent.pinchart@ideasonboard.com> [181101 12:13]:
> > >> On Thursday, 1 November 2018 13:47:40 EET Tomi Valkeinen wrote:
> > >>> We do dispc_runtime_get/put in the HDMI driver's suspend/resume too,
> > >>> so don't we need similar hack (as you add in dsi.c) there also?
> > >> 
> > >> We would if we had to access HDMI registers at probe time.
> > > 
> > > With these I'm still seeing the following issue with hdmi on rmmod
> > > of omapdrm related modules as hdmi->dss is NULL in hdmi_runtime_resume.
> > 
> > This is actually what I expected, but to my surprise the problem didn't
> > occur on my system, I don't know why. I'll try to investigate.
> 
> I've submitted a v2 which should hopefully fix this.

Yes, thanks, that seems to fix it. I noticed one issue there for HDMI,
though.

Regards,

Tony