mbox series

[v5,0/7] Fix some races between sysfb device registration and drivers probe

Message ID 20220511112438.1251024-1-javierm@redhat.com (mailing list archive)
Headers show
Series Fix some races between sysfb device registration and drivers probe | expand

Message

Javier Martinez Canillas May 11, 2022, 11:24 a.m. UTC
Hello,

The patches in this series contain mostly changes suggested by Daniel Vetter
Thomas Zimmermann. They aim to fix existing races between the Generic System
Framebuffer (sysfb) infrastructure and the fbdev and DRM device registration.

For example, it is currently possible for sysfb to register a platform
device after a real DRM driver was registered and requested to remove the
conflicting framebuffers. Or is possible for a simple{fb,drm} to match with
a device previously registered by sysfb, even after a real driver is present.

A symptom of this issue, was worked around with the commit fb561bf9abde
("fbdev: Prevent probing generic drivers if a FB is already registered")
but that's really a hack and should be reverted instead.

This series attempt to fix it more correctly and revert the mentioned hack.
That will also allow to make the num_registered_fb variable not visible to
drivers anymore, since that's internal to fbdev core.

Pach 1 is just a simple cleanup in preparation for later patches.

Patch 2 add a sysfb_disable() and sysfb_try_unregister() helpers to allow
disabling sysfb and attempt to unregister registered devices respectively.

Patch 3 changes how is dealt with conflicting framebuffers unregistering,
rather than having a variable to determine if a lock should be take, it
just drops the lock before unregistering the platform device.

Patch 4 changes the fbdev core to not attempt to unregister devices that
were registered by sysfb, let the same code doing the registration to also
handle the unregistration.

Patch 5 fixes the race that exists between sysfb devices registration and
fbdev framebuffer devices registration, by disabling the sysfb when a DRM
or fbdev driver requests to remove conflicting framebuffers.

Patch 6 is the revert patch that was posted by Daniel before but dropped
from his set and finally patch 7 is the one that makes num_registered_fb
private to fbmem.c, to not allow drivers to use it anymore.

The patches were tested on a rpi4 with the vc4, simpledrm and simplefb
drivers, using different combinations of built-in and as a module.

For example, having simpledrm as a module and loading it after the vc4
driver probed would not register a DRM device, which happens now without
the patches from this series.

Best regards,
Javier

Changes in v5:
- Move the sysfb_disable() call at conflicting framebuffers again to
  avoid the need of a DRIVER_FIRMWARE capability flag.
- Add Daniel Vetter's Reviewed-by tag again since reverted to the old
  patch that he already reviewed in v2.
- Drop patches that added a DRM_FIRMWARE capability and use them
  since the case those prevented could be ignored (Daniel Vetter).

Changes in v4:
- Make sysfb_disable() to also attempt to unregister a device.
- Drop call to sysfb_disable() in fbmem since is done in other places now.
- Add patch to make registered_fb[] private.
- Add patches that introduce the DRM_FIRMWARE capability and usage.

Changes in v3:
- Add Thomas Zimmermann's Reviewed-by tag to patch #1.
- Call sysfb_disable() when a DRM dev and a fbdev are registered rather
  than when conflicting framebuffers are removed (Thomas Zimmermann).
- Call sysfb_disable() when a fbdev framebuffer is registered rather
  than when conflicting framebuffers are removed (Thomas Zimmermann).
- Drop Daniel Vetter's Reviewed-by tag since patch changed a lot.
- Rebase on top of latest drm-misc-next branch.

Changes in v2:
- Rebase on top of latest drm-misc-next and fix conflicts (Daniel Vetter).
- Add kernel-doc comments and include in other_interfaces.rst (Daniel Vetter).
- Explain in the commit message that fbmem has to unregister the device
  as fallback if a driver registered the device itself (Daniel Vetter).
- Also explain that fallback in a comment in the code (Daniel Vetter).
- Don't encode in fbmem the assumption that sysfb will always register
  platform devices (Daniel Vetter).
- Add a FIXME comment about drivers registering devices (Daniel Vetter).
- Explain in the commit message that fbmem has to unregister the device
  as fallback if a driver registered the device itself (Daniel Vetter).
- Also explain that fallback in a comment in the code (Daniel Vetter).
- Don't encode in fbmem the assumption that sysfb will always register
  platform devices (Daniel Vetter).
- Add a FIXME comment about drivers registering devices (Daniel Vetter).
- Drop RFC prefix since patches were already reviewed by Daniel Vetter.
- Add Daniel Reviewed-by tags to the patches.

Daniel Vetter (2):
  Revert "fbdev: Prevent probing generic drivers if a FB is already
    registered"
  fbdev: Make registered_fb[] private to fbmem.c

Javier Martinez Canillas (5):
  firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
  firmware: sysfb: Add helpers to unregister a pdev and disable
    registration
  fbdev: Restart conflicting fb removal loop when unregistering devices
  fbdev: Make sysfb to unregister its own registered devices
  fbdev: Disable sysfb device registration when removing conflicting FBs

 .../driver-api/firmware/other_interfaces.rst  |  6 ++
 drivers/firmware/sysfb.c                      | 91 +++++++++++++++++--
 drivers/firmware/sysfb_simplefb.c             | 16 ++--
 drivers/video/fbdev/core/fbmem.c              | 67 +++++++++++---
 drivers/video/fbdev/efifb.c                   | 11 ---
 drivers/video/fbdev/simplefb.c                | 11 ---
 include/linux/fb.h                            |  8 +-
 include/linux/sysfb.h                         | 29 +++++-
 8 files changed, 178 insertions(+), 61 deletions(-)

Comments

Thomas Zimmermann May 13, 2022, 10:48 a.m. UTC | #1
Hi Javier,

thanks again for providing the examples. I think I now better get what 
you're trying to solve.

First of all let's merge patch 3, as it seems unrelated.

For the other patches, I'd like to take a step back and try to solve the 
broader problem. IIRC we talked about this briefly already.

 From my understanding, the problem of the current code is that removal 
of the firmware device is build around drivers (either DRM or fbdev). 
Everything works fine if a driver is bound to the firmware device and 
the native driver can remove it.  In other case, we have to manually 
disable sysfb and force-remove unbound firmware devices.  And the 
patchset doesn't even cover firmware devices that come from DT nodes.

But what we really want is to track resource ownership based on devices. 
When we add a firmware device (via sysfb or DT), we immediately add it 
to a list of firmware devices. When the native driver arrives, it can 
remove the firmware device even if no driver has been bound.

We can also operate in the other way if the native drivers implicitly 
reserves the framebuffer range. If sysfb would try to register a 
firmware device in that range, he can let it fail. No more need to fully 
disable sysfb on the first arriving native device.

We already track the memory ranges in drm aperture helpers. We'd need to 
move the code to a more prominent location (e.g., <linux/aperture.h>) 
and change fbdev to use it. Sysfb and DT code needs to insert platform 
devices upon creation. We can then implement the more fancy stuff, such 
as tracking native-device memory.  (And if we ever get to fix all usage 
of Linux' request-mem-region, we can even move all the functionality there).

Best regards
Thomas


Am 11.05.22 um 13:24 schrieb Javier Martinez Canillas:
> Hello,
> 
> The patches in this series contain mostly changes suggested by Daniel Vetter
> Thomas Zimmermann. They aim to fix existing races between the Generic System
> Framebuffer (sysfb) infrastructure and the fbdev and DRM device registration.
> 
> For example, it is currently possible for sysfb to register a platform
> device after a real DRM driver was registered and requested to remove the
> conflicting framebuffers. Or is possible for a simple{fb,drm} to match with
> a device previously registered by sysfb, even after a real driver is present.
> 
> A symptom of this issue, was worked around with the commit fb561bf9abde
> ("fbdev: Prevent probing generic drivers if a FB is already registered")
> but that's really a hack and should be reverted instead.
> 
> This series attempt to fix it more correctly and revert the mentioned hack.
> That will also allow to make the num_registered_fb variable not visible to
> drivers anymore, since that's internal to fbdev core.
> 
> Pach 1 is just a simple cleanup in preparation for later patches.
> 
> Patch 2 add a sysfb_disable() and sysfb_try_unregister() helpers to allow
> disabling sysfb and attempt to unregister registered devices respectively.
> 
> Patch 3 changes how is dealt with conflicting framebuffers unregistering,
> rather than having a variable to determine if a lock should be take, it
> just drops the lock before unregistering the platform device.
> 
> Patch 4 changes the fbdev core to not attempt to unregister devices that
> were registered by sysfb, let the same code doing the registration to also
> handle the unregistration.
> 
> Patch 5 fixes the race that exists between sysfb devices registration and
> fbdev framebuffer devices registration, by disabling the sysfb when a DRM
> or fbdev driver requests to remove conflicting framebuffers.
> 
> Patch 6 is the revert patch that was posted by Daniel before but dropped
> from his set and finally patch 7 is the one that makes num_registered_fb
> private to fbmem.c, to not allow drivers to use it anymore.
> 
> The patches were tested on a rpi4 with the vc4, simpledrm and simplefb
> drivers, using different combinations of built-in and as a module.
> 
> For example, having simpledrm as a module and loading it after the vc4
> driver probed would not register a DRM device, which happens now without
> the patches from this series.
> 
> Best regards,
> Javier
> 
> Changes in v5:
> - Move the sysfb_disable() call at conflicting framebuffers again to
>    avoid the need of a DRIVER_FIRMWARE capability flag.
> - Add Daniel Vetter's Reviewed-by tag again since reverted to the old
>    patch that he already reviewed in v2.
> - Drop patches that added a DRM_FIRMWARE capability and use them
>    since the case those prevented could be ignored (Daniel Vetter).
> 
> Changes in v4:
> - Make sysfb_disable() to also attempt to unregister a device.
> - Drop call to sysfb_disable() in fbmem since is done in other places now.
> - Add patch to make registered_fb[] private.
> - Add patches that introduce the DRM_FIRMWARE capability and usage.
> 
> Changes in v3:
> - Add Thomas Zimmermann's Reviewed-by tag to patch #1.
> - Call sysfb_disable() when a DRM dev and a fbdev are registered rather
>    than when conflicting framebuffers are removed (Thomas Zimmermann).
> - Call sysfb_disable() when a fbdev framebuffer is registered rather
>    than when conflicting framebuffers are removed (Thomas Zimmermann).
> - Drop Daniel Vetter's Reviewed-by tag since patch changed a lot.
> - Rebase on top of latest drm-misc-next branch.
> 
> Changes in v2:
> - Rebase on top of latest drm-misc-next and fix conflicts (Daniel Vetter).
> - Add kernel-doc comments and include in other_interfaces.rst (Daniel Vetter).
> - Explain in the commit message that fbmem has to unregister the device
>    as fallback if a driver registered the device itself (Daniel Vetter).
> - Also explain that fallback in a comment in the code (Daniel Vetter).
> - Don't encode in fbmem the assumption that sysfb will always register
>    platform devices (Daniel Vetter).
> - Add a FIXME comment about drivers registering devices (Daniel Vetter).
> - Explain in the commit message that fbmem has to unregister the device
>    as fallback if a driver registered the device itself (Daniel Vetter).
> - Also explain that fallback in a comment in the code (Daniel Vetter).
> - Don't encode in fbmem the assumption that sysfb will always register
>    platform devices (Daniel Vetter).
> - Add a FIXME comment about drivers registering devices (Daniel Vetter).
> - Drop RFC prefix since patches were already reviewed by Daniel Vetter.
> - Add Daniel Reviewed-by tags to the patches.
> 
> Daniel Vetter (2):
>    Revert "fbdev: Prevent probing generic drivers if a FB is already
>      registered"
>    fbdev: Make registered_fb[] private to fbmem.c
> 
> Javier Martinez Canillas (5):
>    firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
>    firmware: sysfb: Add helpers to unregister a pdev and disable
>      registration
>    fbdev: Restart conflicting fb removal loop when unregistering devices
>    fbdev: Make sysfb to unregister its own registered devices
>    fbdev: Disable sysfb device registration when removing conflicting FBs
> 
>   .../driver-api/firmware/other_interfaces.rst  |  6 ++
>   drivers/firmware/sysfb.c                      | 91 +++++++++++++++++--
>   drivers/firmware/sysfb_simplefb.c             | 16 ++--
>   drivers/video/fbdev/core/fbmem.c              | 67 +++++++++++---
>   drivers/video/fbdev/efifb.c                   | 11 ---
>   drivers/video/fbdev/simplefb.c                | 11 ---
>   include/linux/fb.h                            |  8 +-
>   include/linux/sysfb.h                         | 29 +++++-
>   8 files changed, 178 insertions(+), 61 deletions(-)
>
Javier Martinez Canillas May 13, 2022, 11:10 a.m. UTC | #2
Hello Thomas,

Thanks for your feedback and comments.

On 5/13/22 12:48, Thomas Zimmermann wrote:
> Hi Javier,
> 
> thanks again for providing the examples. I think I now better get what 
> you're trying to solve.
>

You are welcome.
 
> First of all let's merge patch 3, as it seems unrelated.
>

Agreed. I can push it to drm-misc-next.
 
> For the other patches, I'd like to take a step back and try to solve the 
> broader problem. IIRC we talked about this briefly already.
> 

Yes, that's what we discussed on IRC some time ago.

>  From my understanding, the problem of the current code is that removal 
> of the firmware device is build around drivers (either DRM or fbdev). 
> Everything works fine if a driver is bound to the firmware device and 
> the native driver can remove it.  In other case, we have to manually 

And this is the common case, since most kernels will have some driver
that binds to a platform device registered to provide the firmware FB.

> disable sysfb and force-remove unbound firmware devices.  And the 
> patchset doesn't even cover firmware devices that come from DT nodes.
>

Indeed.
 
> But what we really want is to track resource ownership based on devices. 
> When we add a firmware device (via sysfb or DT), we immediately add it 
> to a list of firmware devices. When the native driver arrives, it can 
> remove the firmware device even if no driver has been bound.
> 
> We can also operate in the other way if the native drivers implicitly 
> reserves the framebuffer range. If sysfb would try to register a 
> firmware device in that range, he can let it fail. No more need to fully 
> disable sysfb on the first arriving native device.
> 
> We already track the memory ranges in drm aperture helpers. We'd need to 
> move the code to a more prominent location (e.g., <linux/aperture.h>) 
> and change fbdev to use it. Sysfb and DT code needs to insert platform 
> devices upon creation. We can then implement the more fancy stuff, such 
> as tracking native-device memory.  (And if we ever get to fix all usage 
> of Linux' request-mem-region, we can even move all the functionality there).
>

Agreed. And as mentioned, the race that these patches attempt to fix are for
the less common case when a native driver probes but either no generic driver
registered a framebuffer yet or the platform device wasn't registered yet.

But this can only happen if for example a native driver is built-in but the
generic driver is build as a module, which is not the common configuration.

What most distros do is the opposite, to have {simple,of,efi,vesa}fb or
simpledrm built-in and the native drivers built as modules.

So there's no rush to fix this by piling more hacks on top of the ones we
already have and instead try to fix it more properly as you suggested.
Thomas Zimmermann May 13, 2022, 11:32 a.m. UTC | #3
Hi Javier

Am 13.05.22 um 13:10 schrieb Javier Martinez Canillas:
...
>> We already track the memory ranges in drm aperture helpers. We'd need to
>> move the code to a more prominent location (e.g., <linux/aperture.h>)
>> and change fbdev to use it. Sysfb and DT code needs to insert platform
>> devices upon creation. We can then implement the more fancy stuff, such
>> as tracking native-device memory.  (And if we ever get to fix all usage
>> of Linux' request-mem-region, we can even move all the functionality there).
>>
> 
> Agreed. And as mentioned, the race that these patches attempt to fix are for
> the less common case when a native driver probes but either no generic driver
> registered a framebuffer yet or the platform device wasn't registered yet.
> 
> But this can only happen if for example a native driver is built-in but the
> generic driver is build as a module, which is not the common configuration.
> 
> What most distros do is the opposite, to have {simple,of,efi,vesa}fb or
> simpledrm built-in and the native drivers built as modules.
> 
> So there's no rush to fix this by piling more hacks on top of the ones we
> already have and instead try to fix it more properly as you suggested.
>   

A first step would be to use DRM's aperture helpers in fbdev. That would 
be a good idea anyway, as it would simplify both. You already mentioned 
some API changes to make aperture helpers DRM-independent.

The affected fbdev drivers use platform devices, so this should be easy.

Aperture helpers have something like the registration_lock. [1] I don't 
know if we need to recreate patch 3 for this as well.

If we absolutely need some special detachment handling for fbdev, we can 
make devm_aperture_aquire() a public interface. The detach helper is 
provided by the caller.

Best regards
Thomas


[1] 
https://elixir.bootlin.com/linux/v5.17.6/source/drivers/gpu/drm/drm_aperture.c#L254
[2] 
https://elixir.bootlin.com/linux/v5.17.6/source/drivers/gpu/drm/drm_aperture.c#L159