mbox series

[0/7] drm/msm: probe deferral fixes

Message ID 20220912154046.12900-1-johan+linaro@kernel.org (mailing list archive)
Headers show
Series drm/msm: probe deferral fixes | expand

Message

Johan Hovold Sept. 12, 2022, 3:40 p.m. UTC
The MSM DRM is currently broken in multiple ways with respect to probe
deferral. Not only does the driver currently fail to probe again after a
late deferral, but due to a related use-after-free bug this also
triggers NULL-pointer dereferences.

These bugs are not new but have become critical with the release of
5.19 where probe is deferred in case the aux-bus EP panel driver has not
yet been loaded.

The underlying problem is lifetime issues due to careless use of
device-managed resources.

Specifically, device-managed resources allocated post component bind
must be tied to the lifetime of the aggregate DRM device or they will
not necessarily be released when binding of the aggregate device is
deferred.

The following call chain and pseudo code serves as an illustration of
the problem:

 - platform_probe(pdev1)
   - dp_display_probe()
     - component_add()

 - platform_probe(pdev2) 				// last component
   - dp_display_probe()					// d0
       - component_add()
         - try_to_bring_up_aggregate_device()
	   - devres_open_group(adev->parent)		// d1

	   - msm_drm_bind()
	     - msm_drm_init()
	       - component_bind_all()
	         - for_each_component()
		   - component_bind()
		     - devres_open_group(&pdev->dev)	// d2
	             - dp_display_bind()
		       - devm_kzalloc(&pdev->dev)	// a1, OK
		     - devres_close_group(&pdev->dev)	// d3

	       - dpu_kms_hw_init()
	         - for_each_panel()
	           - msm_dp_modeset_init()
		     - dp_display_request_irq()
		       - devm_request_irq(&pdev->dev)	// a2, BUG
		     - if (pdev == pdev2 && condition)
		       - return -EPROBE_DEFER;

	      - if (error)
 	        - component_unbind_all()
	          - for_each_component()
		    - component_unbind()
		      - dp_display_unbind()
		      - devres_release_group(&pdev->dev) // d4, only a1 is freed

           - if (error)
	     - devres_release_group(adev->parent)	// d5

The device-managed allocation a2 is buggy as its lifetime is tied to the
component platform device and will not be released when the aggregate
device bind fails (e.g. due to a probe deferral).

When pdev2 is later probed again, the attempt to allocate the IRQ a
second time will fail for pdev1 (which is still bound to its platform
driver).

This series fixes the lifetime issues by tying the lifetime of a2 (and
similar allocations) to the lifetime of the aggregate device so that a2
is released at d5.

In some cases, such has for the DP IRQ, the above situation can also be
avoided by moving the allocation in question to the platform driver
probe (d0) or component bind (between d2 and d3). But as doing so is not
a general fix, this can be done later as a cleanup/optimisation.

Johan


Johan Hovold (7):
  drm/msm: fix use-after-free on probe deferral
  drm/msm: fix memory corruption with too many bridges
  drm/msm/dp: fix IRQ lifetime
  drm/msm/dp: fix aux-bus EP lifetime
  drm/msm/dp: fix bridge lifetime
  drm/msm/hdmi: fix IRQ lifetime
  drm/msm: drop modeset sanity checks

 drivers/gpu/drm/bridge/parade-ps8640.c   |  2 +-
 drivers/gpu/drm/display/drm_dp_aux_bus.c |  5 +++--
 drivers/gpu/drm/msm/dp/dp_display.c      | 16 +++++++++-------
 drivers/gpu/drm/msm/dp/dp_parser.c       |  6 +++---
 drivers/gpu/drm/msm/dp/dp_parser.h       |  5 +++--
 drivers/gpu/drm/msm/dsi/dsi.c            |  9 +++++----
 drivers/gpu/drm/msm/hdmi/hdmi.c          |  7 ++++++-
 drivers/gpu/drm/msm/msm_drv.c            |  1 +
 include/drm/display/drm_dp_aux_bus.h     |  6 +++---
 9 files changed, 34 insertions(+), 23 deletions(-)

Comments

Abhinav Kumar Sept. 12, 2022, 5:11 p.m. UTC | #1
Adding kuogee to this series

Hi Johan

Thanks for posting this.

We will take a look at this, re-validate and give our reviews/tested-bys.

Thanks

Abhinav
On 9/12/2022 8:40 AM, Johan Hovold wrote:
> The MSM DRM is currently broken in multiple ways with respect to probe
> deferral. Not only does the driver currently fail to probe again after a
> late deferral, but due to a related use-after-free bug this also
> triggers NULL-pointer dereferences.
> 
> These bugs are not new but have become critical with the release of
> 5.19 where probe is deferred in case the aux-bus EP panel driver has not
> yet been loaded.
> 
> The underlying problem is lifetime issues due to careless use of
> device-managed resources.
> 
> Specifically, device-managed resources allocated post component bind
> must be tied to the lifetime of the aggregate DRM device or they will
> not necessarily be released when binding of the aggregate device is
> deferred.
> 
> The following call chain and pseudo code serves as an illustration of
> the problem:
> 
>   - platform_probe(pdev1)
>     - dp_display_probe()
>       - component_add()
> 
>   - platform_probe(pdev2) 				// last component
>     - dp_display_probe()					// d0
>         - component_add()
>           - try_to_bring_up_aggregate_device()
> 	   - devres_open_group(adev->parent)		// d1
> 
> 	   - msm_drm_bind()
> 	     - msm_drm_init()
> 	       - component_bind_all()
> 	         - for_each_component()
> 		   - component_bind()
> 		     - devres_open_group(&pdev->dev)	// d2
> 	             - dp_display_bind()
> 		       - devm_kzalloc(&pdev->dev)	// a1, OK
> 		     - devres_close_group(&pdev->dev)	// d3
> 
> 	       - dpu_kms_hw_init()
> 	         - for_each_panel()
> 	           - msm_dp_modeset_init()
> 		     - dp_display_request_irq()
> 		       - devm_request_irq(&pdev->dev)	// a2, BUG
> 		     - if (pdev == pdev2 && condition)
> 		       - return -EPROBE_DEFER;
> 
> 	      - if (error)
>   	        - component_unbind_all()
> 	          - for_each_component()
> 		    - component_unbind()
> 		      - dp_display_unbind()
> 		      - devres_release_group(&pdev->dev) // d4, only a1 is freed
> 
>             - if (error)
> 	     - devres_release_group(adev->parent)	// d5
> 
> The device-managed allocation a2 is buggy as its lifetime is tied to the
> component platform device and will not be released when the aggregate
> device bind fails (e.g. due to a probe deferral).
> 
> When pdev2 is later probed again, the attempt to allocate the IRQ a
> second time will fail for pdev1 (which is still bound to its platform
> driver).
> 
> This series fixes the lifetime issues by tying the lifetime of a2 (and
> similar allocations) to the lifetime of the aggregate device so that a2
> is released at d5.
> 
> In some cases, such has for the DP IRQ, the above situation can also be
> avoided by moving the allocation in question to the platform driver
> probe (d0) or component bind (between d2 and d3). But as doing so is not
> a general fix, this can be done later as a cleanup/optimisation.
> 
> Johan
> 
> 
> Johan Hovold (7):
>    drm/msm: fix use-after-free on probe deferral
>    drm/msm: fix memory corruption with too many bridges
>    drm/msm/dp: fix IRQ lifetime
>    drm/msm/dp: fix aux-bus EP lifetime
>    drm/msm/dp: fix bridge lifetime
>    drm/msm/hdmi: fix IRQ lifetime
>    drm/msm: drop modeset sanity checks
> 
>   drivers/gpu/drm/bridge/parade-ps8640.c   |  2 +-
>   drivers/gpu/drm/display/drm_dp_aux_bus.c |  5 +++--
>   drivers/gpu/drm/msm/dp/dp_display.c      | 16 +++++++++-------
>   drivers/gpu/drm/msm/dp/dp_parser.c       |  6 +++---
>   drivers/gpu/drm/msm/dp/dp_parser.h       |  5 +++--
>   drivers/gpu/drm/msm/dsi/dsi.c            |  9 +++++----
>   drivers/gpu/drm/msm/hdmi/hdmi.c          |  7 ++++++-
>   drivers/gpu/drm/msm/msm_drv.c            |  1 +
>   include/drm/display/drm_dp_aux_bus.h     |  6 +++---
>   9 files changed, 34 insertions(+), 23 deletions(-)
>