diff mbox series

[RFC,04/10] drm/panel_helper: Introduce drm_panel_helper

Message ID 20230804140605.RFC.4.I930069a32baab6faf46d6b234f89613b5cec0f14@changeid (mailing list archive)
State New, archived
Headers show
Series drm/panel: Remove most store/double-check of prepared/enabled state | expand

Commit Message

Doug Anderson Aug. 4, 2023, 9:06 p.m. UTC
The goal of this file is to contain helper functions for panel drivers
to use. To start off with, let's add drm_panel_helper_shutdown() for
use by panels that want to make sure they're powered off at
shutdown/remove time if they happen to be powered on.

The main goal of introducting this function is so that panel drivers
don't need to track the enabled/prepared state themselves.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
If I've misunderstood and the drm_panel_helper_shutdown() should
belong in some other file and we don't need to introduce a "helper"
for this then please le me know.

 drivers/gpu/drm/Makefile           |  1 +
 drivers/gpu/drm/drm_panel_helper.c | 37 ++++++++++++++++++++++++++++++
 include/drm/drm_panel_helper.h     | 13 +++++++++++
 3 files changed, 51 insertions(+)
 create mode 100644 drivers/gpu/drm/drm_panel_helper.c
 create mode 100644 include/drm/drm_panel_helper.h

Comments

Maxime Ripard Aug. 7, 2023, 6:41 a.m. UTC | #1
Hi Doug,

Thanks for working on this :)

On Fri, Aug 04, 2023 at 02:06:07PM -0700, Douglas Anderson wrote:
> The goal of this file is to contain helper functions for panel drivers
> to use. To start off with, let's add drm_panel_helper_shutdown() for
> use by panels that want to make sure they're powered off at
> shutdown/remove time if they happen to be powered on.
> 
> The main goal of introducting this function is so that panel drivers
> don't need to track the enabled/prepared state themselves.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

It shouldn't be necessary at all: drivers should call
drm_atomic_helper_shutdown at removal time which will disable the
connector (which in turn should unprepare/disable its panel).

If either the driver is missing drm_atomic_helper_shutdown, or if the
connector doesn't properly disable the panel, then I would consider that
a bug.

Maxime
Doug Anderson Aug. 25, 2023, 9:58 p.m. UTC | #2
Maxime,

On Sun, Aug 6, 2023 at 11:41 PM Maxime Ripard <mripard@kernel.org> wrote:
>
> Hi Doug,
>
> Thanks for working on this :)
>
> On Fri, Aug 04, 2023 at 02:06:07PM -0700, Douglas Anderson wrote:
> > The goal of this file is to contain helper functions for panel drivers
> > to use. To start off with, let's add drm_panel_helper_shutdown() for
> > use by panels that want to make sure they're powered off at
> > shutdown/remove time if they happen to be powered on.
> >
> > The main goal of introducting this function is so that panel drivers
> > don't need to track the enabled/prepared state themselves.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>
> It shouldn't be necessary at all: drivers should call
> drm_atomic_helper_shutdown at removal time which will disable the
> connector (which in turn should unprepare/disable its panel).
>
> If either the driver is missing drm_atomic_helper_shutdown, or if the
> connector doesn't properly disable the panel, then I would consider that
> a bug.

Hmmm. I'm a bit hesitant here. I guess I'm less worried about the
removal time and more worried about the shutdown time.

For removal I'd be fine with just dropping the call and saying it's
the responsibility of the driver to call drm_atomic_helper_shutdown(),
as you suggest. I'd tend to believe that removal of DRM drivers is not
used anywhere in "production" code (or at least not common) and I
think it's super hard to get it right, to unregister and unbind all
the DRM components in the right order. Presumably anyone trying to
remove a DRM panel in a generic case supporting lots of different
hardware is used to it being a bit broken...  Not that it's a super
great situation to be in for remove() not to work reliably, but that's
how I think it is right now.

For shutdown, however, I'm not really OK with just blindly removing
the code that tries to power off the panel. Shutdown is called any
time you reboot a device. That means that if a DRM driver is _not_
calling drm_atomic_helper_shutdown() on the panel's behalf at shutdown
time then the panel won't be powered off properly. This feels to me
like something that might actually matter. Panels tend to be one of
those things that really care about their power sequencing and can
even get damaged (or not turn on properly next time) if sequencing is
not done properly, so just removing this code and putting the blame on
the DRM driver seems scary to me. Sure enough, a quick survey of DRM
drivers shows that many don't call drm_atomic_helper_shutdown() at
.shutdown time. A _very_ quick skim of callers to
drm_atomic_helper_shutdown():

* omapdrm/omap_drv.c - calls at remove, not shutdown
* arm/hdlcd_drv.c - calls at unbind, not shutdown
* arm/malidp_drv.c - calls at unbind, not shutdown
* armada/armada_drv.c - calls at unbind, not shutdown

...huh, actually, there are probably too many to list that don't call
it at shutdown. There are some that do, but also quite a few that
don't. I'm not sure I really want to blindly add
drm_atomic_helper_shutdown() to all those DRM driver's shutdown
callbacks... That feels like asking for someone to flame me...

...but then, what's the way forward? I think normally the panel's
shutdown() callback would happen _before_ the DRM driver's shutdown()
callback, so we can't easily write logic in the panel's shutdown like
"if the DRM panel didn't shut the panel down then print a warning and
shut down the panel". We'd have to somehow invent and register for a
"late shutdown" callback and have the panel use that to shut itself
down if the DRM driver didn't. That seems like a bad idea...

Do you have any brilliant ideas here? I could keep the function as-is
but only have panels only call it at shutdown time if you want. I
could add to the comments and/or the commit message some summary of
the above and that the call is important for panels that absolutely
need to be powered off at shutdown time even if the DRM driver doesn't
do anything special at shutdown... Any other ideas?


-Doug
Maxime Ripard Aug. 28, 2023, 7:45 a.m. UTC | #3
On Fri, Aug 25, 2023 at 02:58:02PM -0700, Doug Anderson wrote:
> Maxime,
> 
> On Sun, Aug 6, 2023 at 11:41 PM Maxime Ripard <mripard@kernel.org> wrote:
> >
> > Hi Doug,
> >
> > Thanks for working on this :)
> >
> > On Fri, Aug 04, 2023 at 02:06:07PM -0700, Douglas Anderson wrote:
> > > The goal of this file is to contain helper functions for panel drivers
> > > to use. To start off with, let's add drm_panel_helper_shutdown() for
> > > use by panels that want to make sure they're powered off at
> > > shutdown/remove time if they happen to be powered on.
> > >
> > > The main goal of introducting this function is so that panel drivers
> > > don't need to track the enabled/prepared state themselves.
> > >
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> >
> > It shouldn't be necessary at all: drivers should call
> > drm_atomic_helper_shutdown at removal time which will disable the
> > connector (which in turn should unprepare/disable its panel).
> >
> > If either the driver is missing drm_atomic_helper_shutdown, or if the
> > connector doesn't properly disable the panel, then I would consider that
> > a bug.
> 
> Hmmm. I'm a bit hesitant here. I guess I'm less worried about the
> removal time and more worried about the shutdown time.
> 
> For removal I'd be fine with just dropping the call and saying it's
> the responsibility of the driver to call drm_atomic_helper_shutdown(),
> as you suggest. I'd tend to believe that removal of DRM drivers is not
> used anywhere in "production" code (or at least not common) and I
> think it's super hard to get it right, to unregister and unbind all
> the DRM components in the right order.

It depends on the kind of devices. USB devices are very likely to be
removed, platform devices very unlikely, and PCIe cards somewhere in the
middle :)

I'm not sure the likelyhood of the device getting removed has much to do
with it though, and likely or not, it's definitely something we should
address and fix if an issue is to be found.

> Presumably anyone trying to remove a DRM panel in a generic case
> supporting lots of different hardware is used to it being a bit
> broken...

It's not. Most drivers might be broken, but it's totally something we
support and should strive for.

> Not that it's a super great situation to be in for remove() not to
> work reliably, but that's how I think it is right now.
> 
> For shutdown, however, I'm not really OK with just blindly removing
> the code that tries to power off the panel.

I disagree with that statement. It's not "blindly removing the code",
that code is still there, in the disable hook.

> Shutdown is called any time you reboot a device. That means that if a
> DRM driver is _not_ calling drm_atomic_helper_shutdown() on the
> panel's behalf at shutdown time then the panel won't be powered off
> properly. This feels to me like something that might actually matter.

It does matter. What I disagree on is that you suggest working around
that brokenness in the core framework. What I'm saying is driver is
broken, we should keep the core framework sane and fix it in the driver.

It should be fairly easy with a coccinelle script to figure out which
panels are affected, and to add that call in remove.

> Panels tend to be one of those things that really care about their
> power sequencing and can even get damaged (or not turn on properly
> next time) if sequencing is not done properly, so just removing this
> code and putting the blame on the DRM driver seems scary to me.

Sure, it's bad. But there's no difference compared to the approach you
suggest in that patch: you created a helper, yes, but every driver will
still have to call that helper and if they don't, the panel will still
be called and it's a bug. And we would have to convert everything to
that new helper.

It's fundamentally the same discussion than what you were opposed to
above.

> Sure enough, a quick survey of DRM drivers shows that many don't call
> drm_atomic_helper_shutdown() at .shutdown time. A _very_ quick skim of
> callers to drm_atomic_helper_shutdown():
> 
> * omapdrm/omap_drv.c - calls at remove, not shutdown
> * arm/hdlcd_drv.c - calls at unbind, not shutdown
> * arm/malidp_drv.c - calls at unbind, not shutdown
> * armada/armada_drv.c - calls at unbind, not shutdown
> 
> ...huh, actually, there are probably too many to list that don't call
> it at shutdown. There are some that do, but also quite a few that
> don't. I'm not sure I really want to blindly add
> drm_atomic_helper_shutdown() to all those DRM driver's shutdown
> callbacks... That feels like asking for someone to flame me...

No one will flame you, and if they do, we'll take care of it. And yes,
those are bugs, so let's fix them instead of working around them?

> ...but then, what's the way forward? I think normally the panel's
> shutdown() callback would happen _before_ the DRM driver's shutdown()
> callback,

Is there such a guarantee?

> so we can't easily write logic in the panel's shutdown like "if the
> DRM panel didn't shut the panel down then print a warning and shut
> down the panel". We'd have to somehow invent and register for a "late
> shutdown" callback and have the panel use that to shut itself down if
> the DRM driver didn't. That seems like a bad idea...
> 
> Do you have any brilliant ideas here? I could keep the function as-is
> but only have panels only call it at shutdown time if you want. I
> could add to the comments and/or the commit message some summary of
> the above and that the call is important for panels that absolutely
> need to be powered off at shutdown time even if the DRM driver doesn't
> do anything special at shutdown... Any other ideas?

Brilliant ideas to do what exactly?

I disagree that the solution to our problem is to disable the panel
resources at shutdown time and we should only do it at disable time.
Your helper does exactly that though, so I don't think the helper is a
good idea.

Maxime
Doug Anderson Aug. 28, 2023, 4:06 p.m. UTC | #4
Hi,

On Mon, Aug 28, 2023 at 12:45 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> > For removal I'd be fine with just dropping the call and saying it's
> > the responsibility of the driver to call drm_atomic_helper_shutdown(),
> > as you suggest. I'd tend to believe that removal of DRM drivers is not
> > used anywhere in "production" code (or at least not common) and I
> > think it's super hard to get it right, to unregister and unbind all
> > the DRM components in the right order.
>
> It depends on the kind of devices. USB devices are very likely to be
> removed, platform devices very unlikely, and PCIe cards somewhere in the
> middle :)

Good point. Obviously those cases need to work. I guess I've just
spent lots of time on the SoC case where all the pieces coming
together are very intertwined with each other...


> > Shutdown is called any time you reboot a device. That means that if a
> > DRM driver is _not_ calling drm_atomic_helper_shutdown() on the
> > panel's behalf at shutdown time then the panel won't be powered off
> > properly. This feels to me like something that might actually matter.
>
> It does matter. What I disagree on is that you suggest working around
> that brokenness in the core framework. What I'm saying is driver is
> broken, we should keep the core framework sane and fix it in the driver.
>
> It should be fairly easy with a coccinelle script to figure out which
> panels are affected, and to add that call in remove.

I think I'm confused here. I've already figured out which panels are
affected in my patch series, right? It's the set of panels that today
try to power the panel off in their shutdown call, right? ...but I
think we can't add the call you're suggesting,
drm_atomic_helper_shutdown(), to the _panel_'s shutdown callback, can
we? We need to add it to the shutdown callback of the top-level DRM
driver, right?


> > Panels tend to be one of those things that really care about their
> > power sequencing and can even get damaged (or not turn on properly
> > next time) if sequencing is not done properly, so just removing this
> > code and putting the blame on the DRM driver seems scary to me.
>
> Sure, it's bad. But there's no difference compared to the approach you
> suggest in that patch: you created a helper, yes, but every driver will
> still have to call that helper and if they don't, the panel will still
> be called and it's a bug. And we would have to convert everything to
> that new helper.
>
> It's fundamentally the same discussion than what you were opposed to
> above.

I think the key difference here is that, if I understand correctly,
drm_atomic_helper_shutdown() needs to be added to the top-level DRM
driver, not to the panel itself. I guess I'm worried that I'll either
miss a case or that simply adding a call to
drm_atomic_helper_shutdown() in the top-level DRM driver will break
something. Well, I suppose I can try it and see what happens...


I'll try to cook up a v2 and we'll see what people say. I might keep
the RFC tag for v2 just because I expect it to still be touching a lot
of stuff.

-Doug
Maxime Ripard Aug. 29, 2023, 8:38 a.m. UTC | #5
On Mon, Aug 28, 2023 at 09:06:29AM -0700, Doug Anderson wrote:
> > > Shutdown is called any time you reboot a device. That means that if a
> > > DRM driver is _not_ calling drm_atomic_helper_shutdown() on the
> > > panel's behalf at shutdown time then the panel won't be powered off
> > > properly. This feels to me like something that might actually matter.
> >
> > It does matter. What I disagree on is that you suggest working around
> > that brokenness in the core framework. What I'm saying is driver is
> > broken, we should keep the core framework sane and fix it in the driver.
> >
> > It should be fairly easy with a coccinelle script to figure out which
> > panels are affected, and to add that call in remove.
> 
> I think I'm confused here. I've already figured out which panels are
> affected in my patch series, right? It's the set of panels that today
> try to power the panel off in their shutdown call, right? ...but I
> think we can't add the call you're suggesting,
> drm_atomic_helper_shutdown(), to the _panel_'s shutdown callback, can
> we? We need to add it to the shutdown callback of the top-level DRM
> driver, right?

I have no idea what happens if we just unbind the panel device from its
driver.

If we can't, then it's all fine. If we can, then we need figure out how
to unregister the DRM device (or block the unbinding from happening).

> > > Panels tend to be one of those things that really care about their
> > > power sequencing and can even get damaged (or not turn on properly
> > > next time) if sequencing is not done properly, so just removing this
> > > code and putting the blame on the DRM driver seems scary to me.
> >
> > Sure, it's bad. But there's no difference compared to the approach you
> > suggest in that patch: you created a helper, yes, but every driver will
> > still have to call that helper and if they don't, the panel will still
> > be called and it's a bug. And we would have to convert everything to
> > that new helper.
> >
> > It's fundamentally the same discussion than what you were opposed to
> > above.
> 
> I think the key difference here is that, if I understand correctly,
> drm_atomic_helper_shutdown() needs to be added to the top-level DRM
> driver, not to the panel itself. I guess I'm worried that I'll either
> miss a case or that simply adding a call to
> drm_atomic_helper_shutdown() in the top-level DRM driver will break
> something. Well, I suppose I can try it and see what happens...

The more I think about this discussion, the more I think that the
original intent of the prepared/enabled flags were precisely there to
prevent a double-disable for drivers with drm_atomic_helper_shutdown(),
while still shutting down the panel resources when the panel is used
with a driver that doesn't call it.

Honestly, I think the right thing to do here is to make every driver
call shutdown, and then you don't need the reference counting anymore.

I had a shot at a (possibly very suboptimal) coccinelle script to look
for drivers that are KMS drivers but don't call
drm_atomic_helper_shutdown() at shutdown.

https://paste.ack.tf/bb42e6@raw

The result is:

$ make coccicheck COCCI=./test-drm-shutdown.cocci MODE=report

...

./drivers/gpu/drm/sti/sti_drv.c:262:30-49: ERROR: KMS driver sti_platform_driver is missing shutdown implementation
./drivers/gpu/drm/armada/armada_drv.c:245:30-56: ERROR: KMS driver armada_drm_platform_driver is missing shutdown implementation
./drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:1637:25-39: ERROR: KMS driver vmw_pci_driver is missing shutdown implementation
./drivers/gpu/drm/vboxvideo/vbox_drv.c:163:25-40: ERROR: KMS driver vbox_pci_driver is missing shutdown implementation
./drivers/gpu/drm/tiny/arcpgu.c:424:30-52: ERROR: KMS driver arcpgu_platform_driver is missing shutdown implementation
./drivers/gpu/drm/omapdrm/omap_drv.c:856:30-34: ERROR: KMS driver pdev is missing shutdown implementation
./drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c:361:30-57: ERROR: KMS driver fsl_dcu_drm_platform_driver is missing shutdown implementation
./drivers/gpu/drm/tve200/tve200_drv.c:261:30-43: ERROR: KMS driver tve200_driver is missing shutdown implementation
./drivers/gpu/drm/stm/drv.c:235:30-53: ERROR: KMS driver stm_drm_platform_driver is missing shutdown implementation
./drivers/gpu/drm/qxl/qxl_drv.c:267:25-39: ERROR: KMS driver qxl_pci_driver is missing shutdown implementation
./drivers/gpu/drm/logicvc/logicvc_drm.c:494:30-57: ERROR: KMS driver logicvc_drm_platform_driver is missing shutdown implementation
./drivers/gpu/drm/sun4i/sun4i_drv.c:439:30-55: ERROR: KMS driver sun4i_drv_platform_driver is missing shutdown implementation
./drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c:827:30-60: ERROR: KMS driver atmel_hlcdc_dc_platform_driver is missing shutdown implementation
./drivers/gpu/drm/mcde/mcde_drv.c:471:30-41: ERROR: KMS driver mcde_driver is missing shutdown implementation
./drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c:366:25-41: ERROR: KMS driver hibmc_pci_driver is missing shutdown implementation
./drivers/gpu/drm/aspeed/aspeed_gfx_drv.c:365:30-56: ERROR: KMS driver aspeed_gfx_platform_driver is missing shutdown implementation
./drivers/gpu/drm/ingenic/ingenic-drm-drv.c:1607:30-48: ERROR: KMS driver ingenic_drm_driver is missing shutdown implementation
./drivers/gpu/drm/arm/malidp_drv.c:982:30-52: ERROR: KMS driver malidp_platform_driver is missing shutdown implementation
./drivers/gpu/drm/arm/hdlcd_drv.c:400:30-51: ERROR: KMS driver hdlcd_platform_driver is missing shutdown implementation
./drivers/gpu/drm/kmb/kmb_drv.c:622:30-49: ERROR: KMS driver kmb_platform_driver is missing shutdown implementation
./drivers/gpu/drm/exynos/exynos_drm_drv.c:356:30-56: ERROR: KMS driver exynos_drm_platform_driver is missing shutdown implementation
./drivers/gpu/drm/tiny/bochs.c:718:25-41: ERROR: KMS driver bochs_pci_driver is missing shutdown implementation
./drivers/gpu/drm/tiny/cirrus.c:746:25-42: ERROR: KMS driver cirrus_pci_driver is missing shutdown implementation
./drivers/gpu/drm/mediatek/mtk_drm_drv.c:954:30-53: ERROR: KMS driver mtk_drm_platform_driver is missing shutdown implementation

It's a significant number of drivers, but it's not the end of the world,
really.

Then, once the expectation is that all drivers are calling shutdown, we
don't have to worry about the refcounting at all in the panels, or
resources not being free'd anymore. And we have a single path to test
(disable) instead of two including one that might be difficult to test
properly.

> I'll try to cook up a v2 and we'll see what people say. I might keep
> the RFC tag for v2 just because I expect it to still be touching a lot
> of stuff.

Awesome, thanks

Maxime
Doug Anderson Aug. 30, 2023, 11:10 p.m. UTC | #6
Hi,

On Tue, Aug 29, 2023 at 1:38 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> On Mon, Aug 28, 2023 at 09:06:29AM -0700, Doug Anderson wrote:
> > > > Shutdown is called any time you reboot a device. That means that if a
> > > > DRM driver is _not_ calling drm_atomic_helper_shutdown() on the
> > > > panel's behalf at shutdown time then the panel won't be powered off
> > > > properly. This feels to me like something that might actually matter.
> > >
> > > It does matter. What I disagree on is that you suggest working around
> > > that brokenness in the core framework. What I'm saying is driver is
> > > broken, we should keep the core framework sane and fix it in the driver.
> > >
> > > It should be fairly easy with a coccinelle script to figure out which
> > > panels are affected, and to add that call in remove.
> >
> > I think I'm confused here. I've already figured out which panels are
> > affected in my patch series, right? It's the set of panels that today
> > try to power the panel off in their shutdown call, right? ...but I
> > think we can't add the call you're suggesting,
> > drm_atomic_helper_shutdown(), to the _panel_'s shutdown callback, can
> > we? We need to add it to the shutdown callback of the top-level DRM
> > driver, right?
>
> I have no idea what happens if we just unbind the panel device from its
> driver.
>
> If we can't, then it's all fine. If we can, then we need figure out how
> to unregister the DRM device (or block the unbinding from happening).

Nothing prevents unbinding the panel driver directly. I just confirmed
on my sc7180-lazor:

cd /sys/bus/dp-aux/drivers/panel-simple-dp-aux
echo aux-2-0008 > unbind

...no errors happened and the panel unbound. I think this is as
expected since I'm not aware of a way to prevent unbinding a driver. I
think the relevant function is unbind_store() in bus.c, right? There
is no failure code returned by device_driver_detach(). Presumably this
is by design?

FWIW, also trying to re-bind didn't work, also as expected (I think).
I think the whole bridge chain would need to be resolved again and
nothing I'm aware of makes that happen. Maybe I'm simply missing
something in my understanding, of course.

If you want to attempt to tackle some of these issues then I'd be more
than happy. I'm already waist deep in yak hair and I think this is too
big of a task for me to add on.

Should this issue block my work, then? Today, panels will at least
cleanly power themselves off if someone tries to unbind them like
that. If I remove the clean power off at driver unbind time and rely
on the DRM driver to power panels off cleanly then if someone directly
unbinds a panel like I've done above then it won't be power sequenced
properly, right?


> > > > Panels tend to be one of those things that really care about their
> > > > power sequencing and can even get damaged (or not turn on properly
> > > > next time) if sequencing is not done properly, so just removing this
> > > > code and putting the blame on the DRM driver seems scary to me.
> > >
> > > Sure, it's bad. But there's no difference compared to the approach you
> > > suggest in that patch: you created a helper, yes, but every driver will
> > > still have to call that helper and if they don't, the panel will still
> > > be called and it's a bug. And we would have to convert everything to
> > > that new helper.
> > >
> > > It's fundamentally the same discussion than what you were opposed to
> > > above.
> >
> > I think the key difference here is that, if I understand correctly,
> > drm_atomic_helper_shutdown() needs to be added to the top-level DRM
> > driver, not to the panel itself. I guess I'm worried that I'll either
> > miss a case or that simply adding a call to
> > drm_atomic_helper_shutdown() in the top-level DRM driver will break
> > something. Well, I suppose I can try it and see what happens...
>
> The more I think about this discussion, the more I think that the
> original intent of the prepared/enabled flags were precisely there to
> prevent a double-disable for drivers with drm_atomic_helper_shutdown(),
> while still shutting down the panel resources when the panel is used
> with a driver that doesn't call it.

Sure, that makes sense.


> Honestly, I think the right thing to do here is to make every driver
> call shutdown, and then you don't need the reference counting anymore.
>
> I had a shot at a (possibly very suboptimal) coccinelle script to look
> for drivers that are KMS drivers but don't call
> drm_atomic_helper_shutdown() at shutdown.
>
> https://paste.ack.tf/bb42e6@raw

Your paste seems to have expired. Maybe you used the default and had
it expire in 1 day? Maybe you could just paste the script in email so
it'll be archived for posterity on lore?


> The result is:
>
> $ make coccicheck COCCI=./test-drm-shutdown.cocci MODE=report
>
> ...
>
> ./drivers/gpu/drm/sti/sti_drv.c:262:30-49: ERROR: KMS driver sti_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/armada/armada_drv.c:245:30-56: ERROR: KMS driver armada_drm_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:1637:25-39: ERROR: KMS driver vmw_pci_driver is missing shutdown implementation
> ./drivers/gpu/drm/vboxvideo/vbox_drv.c:163:25-40: ERROR: KMS driver vbox_pci_driver is missing shutdown implementation
> ./drivers/gpu/drm/tiny/arcpgu.c:424:30-52: ERROR: KMS driver arcpgu_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/omapdrm/omap_drv.c:856:30-34: ERROR: KMS driver pdev is missing shutdown implementation
> ./drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c:361:30-57: ERROR: KMS driver fsl_dcu_drm_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/tve200/tve200_drv.c:261:30-43: ERROR: KMS driver tve200_driver is missing shutdown implementation
> ./drivers/gpu/drm/stm/drv.c:235:30-53: ERROR: KMS driver stm_drm_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/qxl/qxl_drv.c:267:25-39: ERROR: KMS driver qxl_pci_driver is missing shutdown implementation
> ./drivers/gpu/drm/logicvc/logicvc_drm.c:494:30-57: ERROR: KMS driver logicvc_drm_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/sun4i/sun4i_drv.c:439:30-55: ERROR: KMS driver sun4i_drv_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c:827:30-60: ERROR: KMS driver atmel_hlcdc_dc_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/mcde/mcde_drv.c:471:30-41: ERROR: KMS driver mcde_driver is missing shutdown implementation
> ./drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c:366:25-41: ERROR: KMS driver hibmc_pci_driver is missing shutdown implementation
> ./drivers/gpu/drm/aspeed/aspeed_gfx_drv.c:365:30-56: ERROR: KMS driver aspeed_gfx_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/ingenic/ingenic-drm-drv.c:1607:30-48: ERROR: KMS driver ingenic_drm_driver is missing shutdown implementation
> ./drivers/gpu/drm/arm/malidp_drv.c:982:30-52: ERROR: KMS driver malidp_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/arm/hdlcd_drv.c:400:30-51: ERROR: KMS driver hdlcd_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/kmb/kmb_drv.c:622:30-49: ERROR: KMS driver kmb_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/exynos/exynos_drm_drv.c:356:30-56: ERROR: KMS driver exynos_drm_platform_driver is missing shutdown implementation
> ./drivers/gpu/drm/tiny/bochs.c:718:25-41: ERROR: KMS driver bochs_pci_driver is missing shutdown implementation
> ./drivers/gpu/drm/tiny/cirrus.c:746:25-42: ERROR: KMS driver cirrus_pci_driver is missing shutdown implementation
> ./drivers/gpu/drm/mediatek/mtk_drm_drv.c:954:30-53: ERROR: KMS driver mtk_drm_platform_driver is missing shutdown implementation
>
> It's a significant number of drivers, but it's not the end of the world,
> really.

My own analysis shows more than that, actually. I can't look at your
script since it's expired, but as one example your list seems to have
neither imx/ipuv3 nor imx/dcss. imx/ipuv3 seems to be missing
drm_atomic_helper_shutdown() at both unbind and system shutdown time.
imx/dcss seems to be missing drm_atomic_helper_shutdown() at system
shutdown time, though it does have it at driver remove time. Did I
mess up in identifying those two drivers as ones that need fixing?

I can't give you an exact list because I don't have a great search
that identifies the problem. I'm mostly looking for all instances of
drm_dev_register() where the driver is "DRIVER_MODESET" and then
manually checking their use of drm_atomic_helper_shutdown(). Until I
get through them all I won't know the count, and it's a manual
process.


-Doug
Maxime Ripard Aug. 31, 2023, 7:38 a.m. UTC | #7
On Wed, Aug 30, 2023 at 04:10:20PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Aug 29, 2023 at 1:38 AM Maxime Ripard <mripard@kernel.org> wrote:
> >
> > On Mon, Aug 28, 2023 at 09:06:29AM -0700, Doug Anderson wrote:
> > > > > Shutdown is called any time you reboot a device. That means that if a
> > > > > DRM driver is _not_ calling drm_atomic_helper_shutdown() on the
> > > > > panel's behalf at shutdown time then the panel won't be powered off
> > > > > properly. This feels to me like something that might actually matter.
> > > >
> > > > It does matter. What I disagree on is that you suggest working around
> > > > that brokenness in the core framework. What I'm saying is driver is
> > > > broken, we should keep the core framework sane and fix it in the driver.
> > > >
> > > > It should be fairly easy with a coccinelle script to figure out which
> > > > panels are affected, and to add that call in remove.
> > >
> > > I think I'm confused here. I've already figured out which panels are
> > > affected in my patch series, right? It's the set of panels that today
> > > try to power the panel off in their shutdown call, right? ...but I
> > > think we can't add the call you're suggesting,
> > > drm_atomic_helper_shutdown(), to the _panel_'s shutdown callback, can
> > > we? We need to add it to the shutdown callback of the top-level DRM
> > > driver, right?
> >
> > I have no idea what happens if we just unbind the panel device from its
> > driver.
> >
> > If we can't, then it's all fine. If we can, then we need figure out how
> > to unregister the DRM device (or block the unbinding from happening).
> 
> Nothing prevents unbinding the panel driver directly. I just confirmed
> on my sc7180-lazor:
> 
> cd /sys/bus/dp-aux/drivers/panel-simple-dp-aux
> echo aux-2-0008 > unbind
> 
> ...no errors happened and the panel unbound. I think this is as
> expected since I'm not aware of a way to prevent unbinding a driver. I
> think the relevant function is unbind_store() in bus.c, right? There
> is no failure code returned by device_driver_detach(). Presumably this
> is by design?

Well, that sucks :(

If I'm reading the docs right, then we could add a device link from the
panel (provider) to the KMS device (consumer), and unbinding the panel
would unbind the KMS driver.

https://www.kernel.org/doc/html/latest/driver-api/device_link.html#state-machine

"""
Before a supplier's driver is removed, links to consumers that are
not bound to a driver are updated to DL_STATE_SUPPLIER_UNBIND. (Call to
device_links_busy() from __device_release_driver().) This prevents the
consumers from binding. (Call to device_links_check_suppliers() from
really_probe().) Consumers that are bound are freed from their driver;
consumers that are probing are waited for until they are done. (Call to
device_links_unbind_consumers() from __device_release_driver().) Once
all links to consumers are in DL_STATE_SUPPLIER_UNBIND state, the
supplier driver is released and the links revert to DL_STATE_DORMANT.
(Call to device_links_driver_cleanup() from __device_release_driver().)
"""

It would be painful to rebind the whole thing, but at least we would be
safe.

> FWIW, also trying to re-bind didn't work, also as expected (I think).
> I think the whole bridge chain would need to be resolved again and
> nothing I'm aware of makes that happen. Maybe I'm simply missing
> something in my understanding, of course.

Yeah, I would expect rebinding the entire thing would solve this too. We
never really supported partial devices (which is also why the panel or
bridges going away sucks), so switching to something more dynamic is
probably going to be a lot of work for not much gains.

> If you want to attempt to tackle some of these issues then I'd be more
> than happy. I'm already waist deep in yak hair and I think this is too
> big of a task for me to add on.

I don't think we need to fix this in this series, or even now. Maybe we
should add a TODO note so that we don't forget about it, but I'd rather
get the original problem fixed :)

> Should this issue block my work, then?

No, definitely not. It's orthogonal to me. The only thing we could learn
from that experiment for this series is that calling
drm_atomic_helper_shutdown() is probably just wishful thinking at this
point.

> Today, panels will at least cleanly power themselves off if someone
> tries to unbind them like that. If I remove the clean power off at
> driver unbind time and rely on the DRM driver to power panels off
> cleanly then if someone directly unbinds a panel like I've done above
> then it won't be power sequenced properly, right?

Yeah... The kernel also probably just blew up because it followed a
dangling pointer to what used to be our panel but got freed when it was
unbound.

So, if we go back to the original intent of this series and sums things
up.

Our goals are:

  1) We want to have the panel disabled at KMS shutdown and remove
     (Doug, Maxime);

  2) but we want them to be disabled once to avoid any double-free or
     related issue (Doug, Maxime);

  3) doing so should work out of the box so that we close this once and
     for all, so no boilerplate or function to call in the panel driver
     (Maxime);

For 3) to happen, it means that all drivers should call
drm_atomic_helper_shutdown() both in remove/unbind and at shutdown. It's
something that drivers should do anymay, but a very significant number
of them don't.

However, panels and bridges, when they are unbound/removed, don't notify
the KMS driver that they are connected to and thus we end up with
dangling pointers and the panel still powered on if 3 is implemented as
is.

We shouldn't try to fix panel unbinding at the moment, so we have to
take it into account.

Is that a good summary?

If so, then I think we can implement everything by doing something like:

  - Implement enable and disable refcounting in panels.
    drm_panel_prepare and drm_panel_enable would increase it,
    drm_panel_unprepare and drm_panel_disable would decrease it.

  - Only actually call the disable and unprepare functions when that
    refcount goes to 0 in the normal case.

  - In drm_panel_remove, if we still have a refcount > 0, then we WARN
    and force unprepare/disable the panel.

And in parallel, we convert all drivers to use
drm_atomic_helper_shutdown() so that, in the most common case, we don't
rely on the drm_panel_remove safety net. And we'll want a bunch of TODO
items to eventually remove that safety net and the refcounting entirely
once we fix the unbinding issue.

Does that make sense?

> > > > > Panels tend to be one of those things that really care about their
> > > > > power sequencing and can even get damaged (or not turn on properly
> > > > > next time) if sequencing is not done properly, so just removing this
> > > > > code and putting the blame on the DRM driver seems scary to me.
> > > >
> > > > Sure, it's bad. But there's no difference compared to the approach you
> > > > suggest in that patch: you created a helper, yes, but every driver will
> > > > still have to call that helper and if they don't, the panel will still
> > > > be called and it's a bug. And we would have to convert everything to
> > > > that new helper.
> > > >
> > > > It's fundamentally the same discussion than what you were opposed to
> > > > above.
> > >
> > > I think the key difference here is that, if I understand correctly,
> > > drm_atomic_helper_shutdown() needs to be added to the top-level DRM
> > > driver, not to the panel itself. I guess I'm worried that I'll either
> > > miss a case or that simply adding a call to
> > > drm_atomic_helper_shutdown() in the top-level DRM driver will break
> > > something. Well, I suppose I can try it and see what happens...
> >
> > The more I think about this discussion, the more I think that the
> > original intent of the prepared/enabled flags were precisely there to
> > prevent a double-disable for drivers with drm_atomic_helper_shutdown(),
> > while still shutting down the panel resources when the panel is used
> > with a driver that doesn't call it.
> 
> Sure, that makes sense.

And we would also catch that and complain loudly so that if we ever miss
drivers in the conversion, it's clear that it needs to be fixed.

> > Honestly, I think the right thing to do here is to make every driver
> > call shutdown, and then you don't need the reference counting anymore.
> >
> > I had a shot at a (possibly very suboptimal) coccinelle script to look
> > for drivers that are KMS drivers but don't call
> > drm_atomic_helper_shutdown() at shutdown.
> >
> > https://paste.ack.tf/bb42e6@raw
> 
> Your paste seems to have expired. Maybe you used the default and had
> it expire in 1 day? Maybe you could just paste the script in email so
> it'll be archived for posterity on lore?

Argh, sorry. The coccinelle script was massive so I didn't want to
inline it, but here you go:

----8<----
virtual report

@ has_driver @
identifier driver;
position p;
@@

(
 struct pci_driver driver@p = {
   ...
 };
|
 struct platform_driver driver@p = {
   ...
 };
|
 struct spi_driver driver@p = {
   ...
 };
)

@ has_probe @
identifier has_driver.driver;
identifier probe_f;
@@

(
  struct pci_driver driver = {
	.probe = probe_f,
  };
|
  struct platform_driver driver = {
	.probe = probe_f,
  };
|
  struct spi_driver driver = {
	.probe = probe_f,
  };
)

@ has_shutdown @
identifier has_driver.driver;
identifier shutdown_f;
@@

(
  struct pci_driver driver = {
	.shutdown = shutdown_f,
  };
|
  struct platform_driver driver = {
	.shutdown = shutdown_f,
  };
|
  struct spi_driver driver = {
	.shutdown = shutdown_f,
  };
)

@ has_kms_atomic @
identifier kms_driver;
@@

  struct drm_driver kms_driver = {
        .driver_features = DRIVER_ATOMIC | ...,
  };

@ is_kms_probe depends on has_probe && has_kms_atomic @
identifier has_probe.probe_f;
identifier has_kms_atomic.kms_driver;
@@

 probe_f(...)
 {
	<+...
(
	drm_dev_alloc(&kms_driver, ...)
|
	devm_drm_dev_alloc(..., &kms_driver, ...)
)
	...+>
 }

@ uses_component @
identifier ops;
identifier bind_f;
@@

  struct component_master_ops ops = {
        .bind = bind_f,
  };

@ registers_bind depends on has_probe && uses_component @
identifier uses_component.ops;
identifier has_probe.probe_f;
@@

 probe_f(...)
 {
	<+...
(
	component_master_add_with_match(..., &ops, ...)
|
	drm_of_component_probe(..., &ops)
)
	...+>
 }

@ is_kms_component depends on uses_component && registers_bind && has_kms_atomic @
identifier uses_component.bind_f;
identifier has_kms_atomic.kms_driver;
@@

 bind_f(...)
 {
	<+...
(
	drm_dev_alloc(&kms_driver, ...)
|
	devm_drm_dev_alloc(..., &kms_driver, ...)
)
	...+>
 }

@ has_kms_init_f depends on has_kms_atomic && !(is_kms_probe || is_kms_component) @
identifier has_kms_atomic.kms_driver;
identifier init_f;
@@

 init_f(...)
 {
	<+...
(
	drm_dev_alloc(&kms_driver, ...)
|
	devm_drm_dev_alloc(..., &kms_driver, ...)
)
	...+>
 }

@ is_kms_probe_subf depends on has_probe && has_kms_init_f @
identifier has_kms_init_f.init_f;
identifier has_probe.probe_f;
@@

 probe_f(...)
 {
	<+...
	init_f(...)
	...+>
 }

@ is_kms_bind_subf depends on uses_component && has_kms_init_f @
identifier has_kms_init_f.init_f;
identifier uses_component.bind_f;
@@

 bind_f(...)
 {
	<+...
	init_f(...)
	...+>
 }

@ shuts_down depends on has_shutdown @
identifier has_shutdown.shutdown_f;
@@

 shutdown_f(...)
 {
	<+...
	drm_atomic_helper_shutdown(...)
	...+>
 }

@ script:python depends on report && (is_kms_probe || is_kms_probe_subf || is_kms_component || is_kms_bind_subf) && !has_shutdown @
ops << has_driver.driver;
p << has_driver.p;
@@

coccilib.report.print_report(p[0], "ERROR: KMS driver %s is missing shutdown implementation" % (ops))

@ script:python depends on report && (is_kms_probe || is_kms_probe_subf || is_kms_component || is_kms_bind_subf) && (has_shutdown && !shuts_down) @
ops << has_driver.driver;
p << has_driver.p;
@@

coccilib.report.print_report(p[0], "ERROR: KMS driver %s shutdown implementation is missing a call to drm_atomic_helper_shutdown()" % (ops))

----8<----

> 
> > The result is:
> >
> > $ make coccicheck COCCI=./test-drm-shutdown.cocci MODE=report
> >
> > ...
> >
> > ./drivers/gpu/drm/sti/sti_drv.c:262:30-49: ERROR: KMS driver sti_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/armada/armada_drv.c:245:30-56: ERROR: KMS driver armada_drm_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:1637:25-39: ERROR: KMS driver vmw_pci_driver is missing shutdown implementation
> > ./drivers/gpu/drm/vboxvideo/vbox_drv.c:163:25-40: ERROR: KMS driver vbox_pci_driver is missing shutdown implementation
> > ./drivers/gpu/drm/tiny/arcpgu.c:424:30-52: ERROR: KMS driver arcpgu_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/omapdrm/omap_drv.c:856:30-34: ERROR: KMS driver pdev is missing shutdown implementation
> > ./drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c:361:30-57: ERROR: KMS driver fsl_dcu_drm_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/tve200/tve200_drv.c:261:30-43: ERROR: KMS driver tve200_driver is missing shutdown implementation
> > ./drivers/gpu/drm/stm/drv.c:235:30-53: ERROR: KMS driver stm_drm_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/qxl/qxl_drv.c:267:25-39: ERROR: KMS driver qxl_pci_driver is missing shutdown implementation
> > ./drivers/gpu/drm/logicvc/logicvc_drm.c:494:30-57: ERROR: KMS driver logicvc_drm_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/sun4i/sun4i_drv.c:439:30-55: ERROR: KMS driver sun4i_drv_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c:827:30-60: ERROR: KMS driver atmel_hlcdc_dc_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/mcde/mcde_drv.c:471:30-41: ERROR: KMS driver mcde_driver is missing shutdown implementation
> > ./drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c:366:25-41: ERROR: KMS driver hibmc_pci_driver is missing shutdown implementation
> > ./drivers/gpu/drm/aspeed/aspeed_gfx_drv.c:365:30-56: ERROR: KMS driver aspeed_gfx_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/ingenic/ingenic-drm-drv.c:1607:30-48: ERROR: KMS driver ingenic_drm_driver is missing shutdown implementation
> > ./drivers/gpu/drm/arm/malidp_drv.c:982:30-52: ERROR: KMS driver malidp_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/arm/hdlcd_drv.c:400:30-51: ERROR: KMS driver hdlcd_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/kmb/kmb_drv.c:622:30-49: ERROR: KMS driver kmb_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/exynos/exynos_drm_drv.c:356:30-56: ERROR: KMS driver exynos_drm_platform_driver is missing shutdown implementation
> > ./drivers/gpu/drm/tiny/bochs.c:718:25-41: ERROR: KMS driver bochs_pci_driver is missing shutdown implementation
> > ./drivers/gpu/drm/tiny/cirrus.c:746:25-42: ERROR: KMS driver cirrus_pci_driver is missing shutdown implementation
> > ./drivers/gpu/drm/mediatek/mtk_drm_drv.c:954:30-53: ERROR: KMS driver mtk_drm_platform_driver is missing shutdown implementation
> >
> > It's a significant number of drivers, but it's not the end of the world,
> > really.
> 
> My own analysis shows more than that, actually. I can't look at your
> script since it's expired, but as one example your list seems to have
> neither imx/ipuv3 nor imx/dcss. imx/ipuv3 seems to be missing
> drm_atomic_helper_shutdown() at both unbind and system shutdown time.

Yeah, this was due to the script ignoring drm_of_component_probe that
ipuv3 uses. It's fixed now.

> imx/dcss seems to be missing drm_atomic_helper_shutdown() at system
> shutdown time, though it does have it at driver remove time.

And I'm wondering if this one is due to the fact that probe only calls a
function that isn't in the same file, which confuses my script. I'm not
entirely sure how to address that though.

> Did I mess up in identifying those two drivers as ones that need
> fixing?

No, both look like legit issues indeed.

> I can't give you an exact list because I don't have a great search
> that identifies the problem. I'm mostly looking for all instances of
> drm_dev_register() where the driver is "DRIVER_MODESET" and then
> manually checking their use of drm_atomic_helper_shutdown(). Until I
> get through them all I won't know the count, and it's a manual
> process.

I think my coccinnelle script will match with most of them, but we might
still miss a few indeed.

Maxime
Doug Anderson Aug. 31, 2023, 6:18 p.m. UTC | #8
Hi,

On Thu, Aug 31, 2023 at 12:38 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> If so, then I think we can implement everything by doing something like:
>
>   - Implement enable and disable refcounting in panels.
>     drm_panel_prepare and drm_panel_enable would increase it,
>     drm_panel_unprepare and drm_panel_disable would decrease it.
>
>   - Only actually call the disable and unprepare functions when that
>     refcount goes to 0 in the normal case.

Just to be clear: by refcounting do you mean switching this to actual
refcount? Today this is explicitly _not_ refcounting, right? It is
simply treating double-enables as no-ops and double-disables as
no-ops. With our current understanding, the only thing we actually
need to guard against is double-disable but at the moment we do guard
against both. Specifically we believe the cases that are issues:

a) At shutdown/remove time we want to disable the panel, but only if
it was enabled (we wouldn't want to call disable if the panel was
already off because userspace turned it off).

b) At shutdown time we want to disable the panel but then we don't
want to double-disable if the main DRM driver also causes us to get
disabled.

I'd rather keep it the way it is (prevent double-disable) and not
switch it to a refcount.


I'll also note that drm_panel currently already keeps track of the
enabled/prepared state, so there's no "implement" step here, right?
That's what landed in commit d2aacaf07395 ("drm/panel: Check for
already prepared/enabled in drm_panel"). Just to remind ourselves of
the history:

1. I needed to keep track of the "prepared" state anyway to make the
"panel follower" API work properly. See drm_panel_add_follower() where
we immediately power on a follower if the panel they're following was
already prepared.

2. Since I was keeping track of the "prepared" state in the core
anyway, it seemed like a good idea to prevent
double-prepare/unprepare/enable/disable in the drm_panel core so that
we could remove it from individual panels since that was always a
point of contention in individual panels. It was asserted that, even
in the core, we shouldn't need code to prevent
double-prepare/unprepare/enable/disable but that as a first step
moving this to the core and out of drivers made sense. Anyone relying
on the core would get a warning printed out indicating that they were
doing something wrong and this would eventually move to a WARN_ON.

3. While trying to remove this from the drivers we ended up realizing
some of the issues with panels wanting to power them off at shutdown /
remove time.


>   - In drm_panel_remove, if we still have a refcount > 0, then we WARN
>     and force unprepare/disable the panel.

I think the net-net of this is that you're saying I should fold the
code from this patch straight into drm_panel_remove() and add a WARN
if it ever triggers, right? In this patch series I'd remove the calls
to drm_panel_helper_shutdown() and rely on drm_panel_remove() to do
the power off at remove time. This might give a warning but, as
discussed, removing a panel like this isn't likely to work well and at
least we'd power sequence it properly. In some cases, I might have to
move the call to drm_panel_remove() around a little bit but I think
it'll work.

The above solves the problem with panels wanting to power sequence
themselves at remove() time, but not at shutdown() time. Thus we'd
still have a dependency on having all drivers use
drm_atomic_helper_shutdown() so that work becomes a dependency.


> > I can't give you an exact list because I don't have a great search
> > that identifies the problem. I'm mostly looking for all instances of
> > drm_dev_register() where the driver is "DRIVER_MODESET" and then
> > manually checking their use of drm_atomic_helper_shutdown(). Until I
> > get through them all I won't know the count, and it's a manual
> > process.
>
> I think my coccinnelle script will match with most of them, but we might
> still miss a few indeed.

I'll just plug through with my original list for now, then I can try
your script when I'm done and see if it catches anything I missed.

-Doug
Maxime Ripard Sept. 1, 2023, 8:15 a.m. UTC | #9
On Thu, Aug 31, 2023 at 11:18:49AM -0700, Doug Anderson wrote:
> Hi,
> 
> On Thu, Aug 31, 2023 at 12:38 AM Maxime Ripard <mripard@kernel.org> wrote:
> >
> > If so, then I think we can implement everything by doing something like:
> >
> >   - Implement enable and disable refcounting in panels.
> >     drm_panel_prepare and drm_panel_enable would increase it,
> >     drm_panel_unprepare and drm_panel_disable would decrease it.
> >
> >   - Only actually call the disable and unprepare functions when that
> >     refcount goes to 0 in the normal case.
> 
> Just to be clear: by refcounting do you mean switching this to actual
> refcount?

Yes

> Today this is explicitly _not_ refcounting, right? It is simply
> treating double-enables as no-ops and double-disables as no-ops. With
> our current understanding, the only thing we actually need to guard
> against is double-disable but at the moment we do guard against both.
> Specifically we believe the cases that are issues:
> 
> a) At shutdown/remove time we want to disable the panel, but only if
> it was enabled (we wouldn't want to call disable if the panel was
> already off because userspace turned it off).

Yeah, and that's doable with refcounting too.

> b) At shutdown time we want to disable the panel but then we don't
> want to double-disable if the main DRM driver also causes us to get
> disabled.

That's what I want to prevent though. Eventually, I don't want to see
panels call drm_panel_unprepare/disable themselves. There's no need to
if all drivers implement the shutdown sequence properly.

> I'd rather keep it the way it is (prevent double-disable) and not
> switch it to a refcount.
>
> I'll also note that drm_panel currently already keeps track of the
> enabled/prepared state, so there's no "implement" step here, right?
> That's what landed in commit d2aacaf07395 ("drm/panel: Check for
> already prepared/enabled in drm_panel"). Just to remind ourselves of
> the history:
> 
> 1. I needed to keep track of the "prepared" state anyway to make the
> "panel follower" API work properly. See drm_panel_add_follower() where
> we immediately power on a follower if the panel they're following was
> already prepared.
> 
> 2. Since I was keeping track of the "prepared" state in the core
> anyway, it seemed like a good idea to prevent
> double-prepare/unprepare/enable/disable in the drm_panel core so that
> we could remove it from individual panels since that was always a
> point of contention in individual panels. It was asserted that, even
> in the core, we shouldn't need code to prevent
> double-prepare/unprepare/enable/disable but that as a first step
> moving this to the core and out of drivers made sense. Anyone relying
> on the core would get a warning printed out indicating that they were
> doing something wrong and this would eventually move to a WARN_ON.
> 
> 3. While trying to remove this from the drivers we ended up realizing
> some of the issues with panels wanting to power them off at shutdown /
> remove time.

Yes, I'm aware. It's not clear to me what you're confused about though.
Is there anything I said that would conflict with that?

> >   - In drm_panel_remove, if we still have a refcount > 0, then we WARN
> >     and force unprepare/disable the panel.
> 
> I think the net-net of this is that you're saying I should fold the
> code from this patch straight into drm_panel_remove() and add a WARN
> if it ever triggers, right?

Aside from the refcounting-or-not discussion, yes.

> In this patch series I'd remove the calls to
> drm_panel_helper_shutdown() and rely on drm_panel_remove() to do the
> power off at remove time.

Yep

> This might give a warning but, as discussed, removing a panel like
> this isn't likely to work well and at least we'd power sequence it
> properly. In some cases, I might have to move the call to
> drm_panel_remove() around a little bit but I think it'll work.

Yep

> The above solves the problem with panels wanting to power sequence
> themselves at remove() time, but not at shutdown() time. Thus we'd
> still have a dependency on having all drivers use
> drm_atomic_helper_shutdown() so that work becomes a dependency.

Does it? I think it can be done in parallel?

Maxime
Doug Anderson Sept. 1, 2023, 1:42 p.m. UTC | #10
Hi,

On Fri, Sep 1, 2023 at 1:15 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> On Thu, Aug 31, 2023 at 11:18:49AM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Aug 31, 2023 at 12:38 AM Maxime Ripard <mripard@kernel.org> wrote:
> > >
> > > If so, then I think we can implement everything by doing something like:
> > >
> > >   - Implement enable and disable refcounting in panels.
> > >     drm_panel_prepare and drm_panel_enable would increase it,
> > >     drm_panel_unprepare and drm_panel_disable would decrease it.
> > >
> > >   - Only actually call the disable and unprepare functions when that
> > >     refcount goes to 0 in the normal case.
> >
> > Just to be clear: by refcounting do you mean switching this to actual
> > refcount?
>
> Yes
>
> > Today this is explicitly _not_ refcounting, right? It is simply
> > treating double-enables as no-ops and double-disables as no-ops. With
> > our current understanding, the only thing we actually need to guard
> > against is double-disable but at the moment we do guard against both.
> > Specifically we believe the cases that are issues:
> >
> > a) At shutdown/remove time we want to disable the panel, but only if
> > it was enabled (we wouldn't want to call disable if the panel was
> > already off because userspace turned it off).
>
> Yeah, and that's doable with refcounting too.

I don't understand the benefit of switching to refcounting, though. We
don't ever expect the "prepare" or "enable" function to be called more
than once and all we're guarding against is a double-unprepare and a
double-enable. Switching this to refcounting would make the reader
think that there was a legitimate case for things to be prepared or
enabled twice. As far as I know, there isn't.

In any case, I don't think there's any need to switch this to
refcounting as part of this effort. Someone could, in theory, do it as
a separate patch series.


> > The above solves the problem with panels wanting to power sequence
> > themselves at remove() time, but not at shutdown() time. Thus we'd
> > still have a dependency on having all drivers use
> > drm_atomic_helper_shutdown() so that work becomes a dependency.
>
> Does it? I think it can be done in parallel?

I don't think it can be in parallel. While it makes sense for panels
to call drm_panel_remove() at remove time, it doesn't make sense for
them to call it at shutdown time. That means that the trick of having
the panel get powered off in drm_panel_remove() won't help for
shutdown. For shutdown, which IMO is the more important case, we need
to wait until all drm drivers call drm_atomic_helper_shutdown()
properly.

-Doug
Doug Anderson Sept. 1, 2023, 11:44 p.m. UTC | #11
Hi,

On Fri, Sep 1, 2023 at 6:42 AM Doug Anderson <dianders@chromium.org> wrote:
>
> > > The above solves the problem with panels wanting to power sequence
> > > themselves at remove() time, but not at shutdown() time. Thus we'd
> > > still have a dependency on having all drivers use
> > > drm_atomic_helper_shutdown() so that work becomes a dependency.
> >
> > Does it? I think it can be done in parallel?
>
> I don't think it can be in parallel. While it makes sense for panels
> to call drm_panel_remove() at remove time, it doesn't make sense for
> them to call it at shutdown time. That means that the trick of having
> the panel get powered off in drm_panel_remove() won't help for
> shutdown. For shutdown, which IMO is the more important case, we need
> to wait until all drm drivers call drm_atomic_helper_shutdown()
> properly.

FWIW, it was a bit of a slog, but I've managed to put together a RFT
series. Good thing it's Friday and beer-o-clock. Please enjoy
reviewing.

Ugh. It's now two series because there are too many recipients. Email
is fun. OK, these should be at:

* https://lore.kernel.org/r/20230901234015.566018-1-dianders@chromium.org
- patches targeting drm-misc

* https://lore.kernel.org/r/20230901234202.566951-1-dianders@chromium.org
- patches targeting something that's not drm-misc

-Doug
Maxime Ripard Sept. 4, 2023, 3:33 p.m. UTC | #12
Hi,

On Fri, Sep 01, 2023 at 06:42:42AM -0700, Doug Anderson wrote:
> On Fri, Sep 1, 2023 at 1:15 AM Maxime Ripard <mripard@kernel.org> wrote:
> > On Thu, Aug 31, 2023 at 11:18:49AM -0700, Doug Anderson wrote:
> > > Today this is explicitly _not_ refcounting, right? It is simply
> > > treating double-enables as no-ops and double-disables as no-ops. With
> > > our current understanding, the only thing we actually need to guard
> > > against is double-disable but at the moment we do guard against both.
> > > Specifically we believe the cases that are issues:
> > >
> > > a) At shutdown/remove time we want to disable the panel, but only if
> > > it was enabled (we wouldn't want to call disable if the panel was
> > > already off because userspace turned it off).
> >
> > Yeah, and that's doable with refcounting too.
> 
> I don't understand the benefit of switching to refcounting, though. We
> don't ever expect the "prepare" or "enable" function to be called more
> than once and all we're guarding against is a double-unprepare and a
> double-enable. Switching this to refcounting would make the reader
> think that there was a legitimate case for things to be prepared or
> enabled twice. As far as I know, there isn't.

Sure, eventually we'll want to remove it.

I even said it as such here:
https://lore.kernel.org/dri-devel/wwzbd7dt5qyimshnd7sbgkf5gxk7tq5dxtrerz76uw5p6s7tzt@cbiezkfeuqqn/

However, we have a number of panels following various anti-patterns
where disable and unprepare would be called multiple times. A boolean
would just ignore the second, refcounting would warn over it, and that's
what we want.

And that's exactly because there isn't a legitimate case for things to
be disabled or unprepared twice, but yet many panel driver do it anyway.

> In any case, I don't think there's any need to switch this to
> refcounting as part of this effort. Someone could, in theory, do it as
> a separate patch series.

I'm sorry, but I'll insist on getting a solution that will warn panels
that call drm_atomic_helper_shutdown or drm_panel_disable/unprepare by
hand. It doesn't have to be refcounting though if you have a better idea
in mind.

> > > The above solves the problem with panels wanting to power sequence
> > > themselves at remove() time, but not at shutdown() time. Thus we'd
> > > still have a dependency on having all drivers use
> > > drm_atomic_helper_shutdown() so that work becomes a dependency.
> >
> > Does it? I think it can be done in parallel?
> 
> I don't think it can be in parallel. While it makes sense for panels
> to call drm_panel_remove() at remove time, it doesn't make sense for
> them to call it at shutdown time. That means that the trick of having
> the panel get powered off in drm_panel_remove() won't help for
> shutdown. For shutdown, which IMO is the more important case, we need
> to wait until all drm drivers call drm_atomic_helper_shutdown()
> properly.

Right, my point was more that drivers that already don't disable the
panel in their shutdown implementation will still not do it. And drivers
that do will still do it, so there's no regression.

We obviously want to tend to having all drivers call
drm_atomic_helper_shutdown(), but not having it will not introduce any
regression.

Maxime
Doug Anderson Sept. 5, 2023, 4:45 p.m. UTC | #13
Hi,

On Mon, Sep 4, 2023 at 8:33 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> > I don't understand the benefit of switching to refcounting, though. We
> > don't ever expect the "prepare" or "enable" function to be called more
> > than once and all we're guarding against is a double-unprepare and a
> > double-enable. Switching this to refcounting would make the reader
> > think that there was a legitimate case for things to be prepared or
> > enabled twice. As far as I know, there isn't.
>
> Sure, eventually we'll want to remove it.
>
> I even said it as such here:
> https://lore.kernel.org/dri-devel/wwzbd7dt5qyimshnd7sbgkf5gxk7tq5dxtrerz76uw5p6s7tzt@cbiezkfeuqqn/
>
> However, we have a number of panels following various anti-patterns
> where disable and unprepare would be called multiple times. A boolean
> would just ignore the second, refcounting would warn over it, and that's
> what we want.

Can you provide a concrete example of a case where refcounting would
give a better error? I'm still having a hard time understanding why
you are saying that refcounting is better and something concrete would
help me. Can you point at a driver you have in mind that follows an
anti-pattern where refcount would be better?

I'll try to be more concrete too and maybe you can point out where I'm
confused. As far as I understand the only difference between the
boolean and the refcount would be if someone _enabled_ or _prepared_
more than once, right? That would cause a refcount to increment to 2
but the boolean would stay at "true". I'm not aware of anyone calling
enable or prepare more than once, but maybe you are? ...or maybe
there's some other difference that I'm not aware of?

Said another way...

With a boolean and _not_ having more than one enable:

1. enable() => set "enabled" to true and enable panel.
2. disable() => set "enabled" to false and disable panel.
3. disable() => WARN, leave "enabled" as false.
4. disable() => WARN, leave "enabled" as false

With a refcount and _not_ having more than one enable:

1. enable() => refcount becomes 1 and enable panel
2. disable() => refcount becomes 0 and disable panel
3. disable() => WARN, refcount stays 0
4. disable() => WARN, refcount stays 0

So with only one enable the behavior is the same.

With a boolean and more than one enable:

1. enable() => set "enabled" to true and enable panel.
2. enable() => WARN, leave "enabled" as true
3. disable() => set "enabled" to false and disable panel.
4. disable() => WARN, leave "enabled" as false.
5. disable() => WARN, leave "enabled" as false

With a refcount and more than one enable:

1. enable() => refcount becomes 1 and enable panel
2. enable() => refcount becomes 2
3. disable() => refcount becomes 1
4. disable() => refcount becomes 0 and disable panel
5. disable() => WARN, refcount stays 0

In the case that there is more than one enable, I think the "boolean"
is better. Specifically:

a) It doesn't change behavior from today. Perhaps you can show me a
counterexample, but lacking that I'd assert that everyone today is
expecting things to work like the "boolean" case. If we change to a
refcount I think it could break someone. Even if nobody is relying on
things working like the "boolean" case today, I would assert that I
don't think anyone is expecting things to work like the "refcount"
case.

b) With a boolean we WARN at more appropriate times. Sure we could add
a warning when the refcount becomes 2, but at that point why aren't we
just using a boolean?

c) The "boolean" case is already implemented today so no patches need
to be sent for it.


> > In any case, I don't think there's any need to switch this to
> > refcounting as part of this effort. Someone could, in theory, do it as
> > a separate patch series.
>
> I'm sorry, but I'll insist on getting a solution that will warn panels
> that call drm_atomic_helper_shutdown or drm_panel_disable/unprepare by
> hand. It doesn't have to be refcounting though if you have a better idea
> in mind.

As per above, I think this already happens with the boolean? Won't you
see an error message like this:

dev_warn(panel->dev, "Skipping unprepare of already unprepared panel\n");

...from drm_panel_unprepare()


> > > > The above solves the problem with panels wanting to power sequence
> > > > themselves at remove() time, but not at shutdown() time. Thus we'd
> > > > still have a dependency on having all drivers use
> > > > drm_atomic_helper_shutdown() so that work becomes a dependency.
> > >
> > > Does it? I think it can be done in parallel?
> >
> > I don't think it can be in parallel. While it makes sense for panels
> > to call drm_panel_remove() at remove time, it doesn't make sense for
> > them to call it at shutdown time. That means that the trick of having
> > the panel get powered off in drm_panel_remove() won't help for
> > shutdown. For shutdown, which IMO is the more important case, we need
> > to wait until all drm drivers call drm_atomic_helper_shutdown()
> > properly.
>
> Right, my point was more that drivers that already don't disable the
> panel in their shutdown implementation will still not do it. And drivers
> that do will still do it, so there's no regression.
>
> We obviously want to tend to having all drivers call
> drm_atomic_helper_shutdown(), but not having it will not introduce any
> regression.

I'm still confused here too. The next patch in this patch series here
that we're talking about is:

https://lore.kernel.org/dri-devel/20230804140605.RFC.5.Icc3238e91bc726d4b04c51a4acf67f001ec453d7@changeid/

Let's look at an arbitrary concrete part of that patch: the
modification to "panel-tdo-tl070wsh30.c". For that panel, my RFC patch
removed the tracking of "prepared" and and then did this:

@@ -220,16 +209,14 @@ static void tdo_tl070wsh30_panel_remove(struct
mipi_dsi_device *dsi)
  dev_err(&dsi->dev, "failed to detach from DSI host: %d\n", err);

  drm_panel_remove(&tdo_tl070wsh30->base);
- drm_panel_disable(&tdo_tl070wsh30->base);
- drm_panel_unprepare(&tdo_tl070wsh30->base);
+ drm_panel_helper_shutdown(&tdo_tl070wsh30->base);
 }

 static void tdo_tl070wsh30_panel_shutdown(struct mipi_dsi_device *dsi)
 {
  struct tdo_tl070wsh30_panel *tdo_tl070wsh30 = mipi_dsi_get_drvdata(dsi);

- drm_panel_disable(&tdo_tl070wsh30->base);
- drm_panel_unprepare(&tdo_tl070wsh30->base);
+ drm_panel_helper_shutdown(&tdo_tl070wsh30->base);
 }

As per our discussion, the plan is to send a V2 of my patch series
where I _don't_ create the call drm_panel_helper_shutdown() and thus I
won't call it. That means that the V2 of the patch for
"panel-tdo-tl070wsh30.c" will remove the calls to drm_panel_disable()
and drm_panel_unprepare() and not replace them with anything.

As per our discussion, in V2 we will make drm_panel_remove() actually
unprepare / disable a panel if it was left enabled. This would
essentially fold in the drm_panel_helper_shutdown() from my RFC patch.
This would make tdo_tl070wsh30_panel_remove() behave the same as it
did before. Ugh, though I may have to think about this more when I get
to implementation since I don't think there's a guarantee of the
ordering of shutdown calls between the DRM driver and the panel.
Anyway, something to discuss later.

However... tdo_tl070wsh30_panel_shutdown() will no longer power the
panel off properly _unless_ the main DRM driver properly calls
drm_atomic_helper_shutdown().


Did I get anything above incorrect? Assuming I got it correct, that
means that our choices are:

a) Block landing the change to "panel-tdo-tl070wsh30.c" until after
all drivers properly call drm_atomic_helper_shutdown().

b) Add a hacky call to drm_panel_remove() in
tdo_tl070wsh30_panel_shutdown() to get the old behavior. This would
work, but IMO it's ugly and I'm pretty strongly against it.

c) Ignore the regression and just say that this panel won't get power
sequenced properly until its DRM driver is updated. I'm strongly
against this.

...unless you are aware of another choice, the only choice I'm willing
to go with is "a)" which is why I say we are blocked on getting all
drivers to properly call drm_atomic_helper_shutdown().

NOTE: I could still land patches #1-3 of this series and I'm actually
included to do so. I'll respond to the cover letter proposing this.


-Doug
Doug Anderson Sept. 5, 2023, 7:12 p.m. UTC | #14
Hi,

On Tue, Sep 5, 2023 at 9:45 AM Doug Anderson <dianders@chromium.org> wrote:
>
> As per our discussion, in V2 we will make drm_panel_remove() actually
> unprepare / disable a panel if it was left enabled. This would
> essentially fold in the drm_panel_helper_shutdown() from my RFC patch.
> This would make tdo_tl070wsh30_panel_remove() behave the same as it
> did before. Ugh, though I may have to think about this more when I get
> to implementation since I don't think there's a guarantee of the
> ordering of shutdown calls between the DRM driver and the panel.
> Anyway, something to discuss later.

Ugh, ignore the above paragraph. I managed to confuse myself and was
thinking about shutdown but talking about remove. Sigh. :( Instead,
pretend the above paragraph said:

As per our discussion, in V2 we will make drm_panel_remove() actually
unprepare / disable a panel (and print a warning) if it was left
enabled. This would essentially fold in the
drm_panel_helper_shutdown() from my RFC patch (but add a warning).
This would make tdo_tl070wsh30_panel_remove() behave the same as it
did before with the addition of a warning if someone tries to remove a
currently powered panel.

-Doug
Maxime Ripard Sept. 7, 2023, 2:14 p.m. UTC | #15
On Tue, Sep 05, 2023 at 09:45:49AM -0700, Doug Anderson wrote:
> > > In any case, I don't think there's any need to switch this to
> > > refcounting as part of this effort. Someone could, in theory, do it as
> > > a separate patch series.
> >
> > I'm sorry, but I'll insist on getting a solution that will warn panels
> > that call drm_atomic_helper_shutdown or drm_panel_disable/unprepare by
> > hand. It doesn't have to be refcounting though if you have a better idea
> > in mind.
> 
> As per above, I think this already happens with the boolean? Won't you
> see an error message like this:
> 
> dev_warn(panel->dev, "Skipping unprepare of already unprepared panel\n");
> 
> ...from drm_panel_unprepare()

Urgh. I missed that part, sorry for dragging you into that refcounting
discussion then.

> > > > > The above solves the problem with panels wanting to power sequence
> > > > > themselves at remove() time, but not at shutdown() time. Thus we'd
> > > > > still have a dependency on having all drivers use
> > > > > drm_atomic_helper_shutdown() so that work becomes a dependency.
> > > >
> > > > Does it? I think it can be done in parallel?
> > >
> > > I don't think it can be in parallel. While it makes sense for panels
> > > to call drm_panel_remove() at remove time, it doesn't make sense for
> > > them to call it at shutdown time. That means that the trick of having
> > > the panel get powered off in drm_panel_remove() won't help for
> > > shutdown. For shutdown, which IMO is the more important case, we need
> > > to wait until all drm drivers call drm_atomic_helper_shutdown()
> > > properly.
> >
> > Right, my point was more that drivers that already don't disable the
> > panel in their shutdown implementation will still not do it. And drivers
> > that do will still do it, so there's no regression.
> >
> > We obviously want to tend to having all drivers call
> > drm_atomic_helper_shutdown(), but not having it will not introduce any
> > regression.
> 
> I'm still confused here too. The next patch in this patch series here
> that we're talking about is:
> 
> https://lore.kernel.org/dri-devel/20230804140605.RFC.5.Icc3238e91bc726d4b04c51a4acf67f001ec453d7@changeid/
> 
> Let's look at an arbitrary concrete part of that patch: the
> modification to "panel-tdo-tl070wsh30.c". For that panel, my RFC patch
> removed the tracking of "prepared" and and then did this:
> 
> @@ -220,16 +209,14 @@ static void tdo_tl070wsh30_panel_remove(struct
> mipi_dsi_device *dsi)
>   dev_err(&dsi->dev, "failed to detach from DSI host: %d\n", err);
> 
>   drm_panel_remove(&tdo_tl070wsh30->base);
> - drm_panel_disable(&tdo_tl070wsh30->base);
> - drm_panel_unprepare(&tdo_tl070wsh30->base);
> + drm_panel_helper_shutdown(&tdo_tl070wsh30->base);
>  }
> 
>  static void tdo_tl070wsh30_panel_shutdown(struct mipi_dsi_device *dsi)
>  {
>   struct tdo_tl070wsh30_panel *tdo_tl070wsh30 = mipi_dsi_get_drvdata(dsi);
> 
> - drm_panel_disable(&tdo_tl070wsh30->base);
> - drm_panel_unprepare(&tdo_tl070wsh30->base);
> + drm_panel_helper_shutdown(&tdo_tl070wsh30->base);
>  }
> 
> As per our discussion, the plan is to send a V2 of my patch series
> where I _don't_ create the call drm_panel_helper_shutdown() and thus I
> won't call it. That means that the V2 of the patch for
> "panel-tdo-tl070wsh30.c" will remove the calls to drm_panel_disable()
> and drm_panel_unprepare() and not replace them with anything.

Right, that's what we should do.

> As per our discussion, in V2 we will make drm_panel_remove() actually
> unprepare / disable a panel if it was left enabled. This would
> essentially fold in the drm_panel_helper_shutdown() from my RFC patch.
> This would make tdo_tl070wsh30_panel_remove() behave the same as it
> did before.

Not really? You would get a warning from drm_panel_remove(), but our
discussion very much implied still disabling it...

> Ugh, though I may have to think about this more when I get to
> implementation since I don't think there's a guarantee of the ordering
> of shutdown calls between the DRM driver and the panel. Anyway,
> something to discuss later.
> 
> However... tdo_tl070wsh30_panel_shutdown() will no longer power the
> panel off properly _unless_ the main DRM driver properly calls
> drm_atomic_helper_shutdown().

... with the expectation that all KMS drivers should call
drm_atomic_helper_shutdown() anyway (thanks to your other series) and
thus we wouldn't trigger that remove warning anyway.

> Did I get anything above incorrect? Assuming I got it correct, that
> means that our choices are:
> 
> a) Block landing the change to "panel-tdo-tl070wsh30.c" until after
> all drivers properly call drm_atomic_helper_shutdown().

I don't think we care about all drivers here. Only the driver it's
enabled with would be a blocker. Like, let's say sun4i hasn't been
converted but that panel is only used with rockchip anyway, we don't
really care that sun4i shutdown patch hasn't been merged (yet).

> b) Add a hacky call to drm_panel_remove() in
> tdo_tl070wsh30_panel_shutdown() to get the old behavior. This would
> work, but IMO it's ugly and I'm pretty strongly against it.

Yeah, it's ugly.

> c) Ignore the regression and just say that this panel won't get power
> sequenced properly until its DRM driver is updated. I'm strongly
> against this.

That would work too, but the first one looks like the best trade-off to
me.

Maxime
Maxime Ripard Sept. 7, 2023, 2:16 p.m. UTC | #16
On Tue, Sep 05, 2023 at 12:12:58PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Sep 5, 2023 at 9:45 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > As per our discussion, in V2 we will make drm_panel_remove() actually
> > unprepare / disable a panel if it was left enabled. This would
> > essentially fold in the drm_panel_helper_shutdown() from my RFC patch.
> > This would make tdo_tl070wsh30_panel_remove() behave the same as it
> > did before. Ugh, though I may have to think about this more when I get
> > to implementation since I don't think there's a guarantee of the
> > ordering of shutdown calls between the DRM driver and the panel.
> > Anyway, something to discuss later.
> 
> Ugh, ignore the above paragraph. I managed to confuse myself and was
> thinking about shutdown but talking about remove. Sigh. :( Instead,
> pretend the above paragraph said:
> 
> As per our discussion, in V2 we will make drm_panel_remove() actually
> unprepare / disable a panel (and print a warning) if it was left
> enabled. This would essentially fold in the
> drm_panel_helper_shutdown() from my RFC patch (but add a warning).
> This would make tdo_tl070wsh30_panel_remove() behave the same as it
> did before with the addition of a warning if someone tries to remove a
> currently powered panel.

Ok, that works for me :)

Maxime
Doug Anderson Sept. 13, 2023, 6:28 p.m. UTC | #17
Hi,

On Thu, Sep 7, 2023 at 7:15 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> > a) Block landing the change to "panel-tdo-tl070wsh30.c" until after
> > all drivers properly call drm_atomic_helper_shutdown().
>
> I don't think we care about all drivers here. Only the driver it's
> enabled with would be a blocker. Like, let's say sun4i hasn't been
> converted but that panel is only used with rockchip anyway, we don't
> really care that sun4i shutdown patch hasn't been merged (yet).

Yeah, I suppose that would work, though it would require a bit of
research. Some other things have popped onto my plate recently, but
for now I'm going to focus on seeing how much of the drivers can get
their shutdown fixed. When that stalls out then we can see if we can
unblock some of the panels by digging into which DRM drivers they're
used with.

Also, as my proposal in the cover letter [1], I'm leaving a breadcrumb
here that I landed the first 3 patches in this series just to get them
out of the way. Those 3 patches didn't depend on the resolution of the
issues discussed here.

[1] https://lore.kernel.org/lkml/CAD=FV=UFuUsrrZmkL8_RL5WLvkJryDwRSAy_PWTa-hX_p0dF+Q@mail.gmail.com/
diff mbox series

Patch

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 215e78e79125..e811f3d68235 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -118,6 +118,7 @@  drm_kms_helper-y := \
 	drm_gem_framebuffer_helper.o \
 	drm_kms_helper_common.o \
 	drm_modeset_helper.o \
+	drm_panel_helper.o \
 	drm_plane_helper.o \
 	drm_probe_helper.o \
 	drm_rect.o \
diff --git a/drivers/gpu/drm/drm_panel_helper.c b/drivers/gpu/drm/drm_panel_helper.c
new file mode 100644
index 000000000000..85a55b5731cf
--- /dev/null
+++ b/drivers/gpu/drm/drm_panel_helper.c
@@ -0,0 +1,37 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2023 Google Inc.
+ */
+
+#include <linux/dev_printk.h>
+
+#include <drm/drm_panel.h>
+#include <drm/drm_panel_helper.h>
+
+/**
+ * drm_panel_helper_shutdown - helper for panels to use at shutdown time
+ * @panel: DRM panel
+ *
+ * Panels may call this function unconditionally at shutdown time to ensure
+ * that they are disabled and unprepared if necessary.
+ *
+ * As part of this function:
+ * - The backlight will be turned off, if it was on.
+ * - Any panel followers will be power sequenced.
+ */
+void drm_panel_helper_shutdown(struct drm_panel *panel)
+{
+	int ret;
+
+	if (panel->enabled) {
+		ret = drm_panel_disable(panel);
+		if (ret)
+			dev_warn(panel->dev, "Error disabling panel %d\n", ret);
+	}
+	if (panel->prepared) {
+		ret = drm_panel_unprepare(panel);
+		if (ret)
+			dev_warn(panel->dev, "Error unpreparing panel %d\n", ret);
+	}
+}
+EXPORT_SYMBOL_GPL(drm_panel_helper_shutdown);
diff --git a/include/drm/drm_panel_helper.h b/include/drm/drm_panel_helper.h
new file mode 100644
index 000000000000..5621482053a9
--- /dev/null
+++ b/include/drm/drm_panel_helper.h
@@ -0,0 +1,13 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2023 Google Inc.
+ */
+
+#ifndef DRM_PANEL_HELPER_H
+#define DRM_PANEL_HELPER_H
+
+struct drm_panel;
+
+void drm_panel_helper_shutdown(struct drm_panel *panel);
+
+#endif /* DRM_PANEL_HELPER_H */