[v3.1,3/3] drm/i915: Don't try to remove MST cleanly when force removed.
diff mbox

Message ID 1438861657-14109-3-git-send-email-maarten.lankhorst@linux.intel.com
State New
Headers show

Commit Message

Maarten Lankhorst Aug. 6, 2015, 11:47 a.m. UTC
Physically disconnecting a DP connector with an active MST stream
can lead to a kernel panic in intel_mst_disable_dp when calling
drm_dp_update_payload_part1. Examining the code it seems that the
port is freed while work to remove the connector is scheduled.

This probably means it's fine to skip call all the mst helper calls
and only attempt to disable the real encoder.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_dp_mst.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

Comments

Daniel Vetter Aug. 6, 2015, 1:01 p.m. UTC | #1
On Thu, Aug 06, 2015 at 01:47:37PM +0200, Maarten Lankhorst wrote:
> Physically disconnecting a DP connector with an active MST stream
> can lead to a kernel panic in intel_mst_disable_dp when calling
> drm_dp_update_payload_part1. Examining the code it seems that the
> port is freed while work to remove the connector is scheduled.
> 
> This probably means it's fine to skip call all the mst helper calls
> and only attempt to disable the real encoder.

I think this is a refcounting bug in the dp mst helpers, the port
shouldn't just go poof when we still hold a reference to it. Can you
please try to figure out what's really broken here, and if that's too
tricky add a big FIXME comment?

Thanks, Daniel

> 
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_dp_mst.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
> index 91ad17110c2f..19c9e49d74e0 100644
> --- a/drivers/gpu/drm/i915/intel_dp_mst.c
> +++ b/drivers/gpu/drm/i915/intel_dp_mst.c
> @@ -110,6 +110,9 @@ static void intel_mst_disable_dp(struct intel_encoder *encoder)
>  
>  	DRM_DEBUG_KMS("%d\n", intel_dp->active_mst_links);
>  
> +	if (!intel_mst->port)
> +		return;
> +
>  	drm_dp_mst_reset_vcpi_slots(&intel_dp->mst_mgr, intel_mst->port);
>  
>  	ret = drm_dp_update_payload_part1(&intel_dp->mst_mgr);
> @@ -126,12 +129,14 @@ static void intel_mst_post_disable_dp(struct intel_encoder *encoder)
>  
>  	DRM_DEBUG_KMS("%d\n", intel_dp->active_mst_links);
>  
> -	/* this can fail */
> -	drm_dp_check_act_status(&intel_dp->mst_mgr);
> -	/* and this can also fail */
> -	drm_dp_update_payload_part2(&intel_dp->mst_mgr);
> +	if (intel_mst->port) {
> +		/* this can fail */
> +		drm_dp_check_act_status(&intel_dp->mst_mgr);
> +		/* and this can also fail */
> +		drm_dp_update_payload_part2(&intel_dp->mst_mgr);
>  
> -	drm_dp_mst_deallocate_vcpi(&intel_dp->mst_mgr, intel_mst->port);
> +		drm_dp_mst_deallocate_vcpi(&intel_dp->mst_mgr, intel_mst->port);
> +	}
>  
>  	intel_dp->active_mst_links--;
>  	intel_mst->port = NULL;
> @@ -471,9 +476,13 @@ static void intel_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
>  	/* need to nuke the connector */
>  	drm_modeset_lock_all(dev);
>  	if (connector->state->crtc) {
> +		struct drm_encoder *encoder = connector->state->best_encoder;
> +		struct intel_dp_mst_encoder *intel_mst = enc_to_mst(encoder);
>  		struct drm_mode_set set;
>  		int ret;
>  
> +		intel_mst->port = NULL;
> +
>  		memset(&set, 0, sizeof(set));
>  		set.crtc = connector->state->crtc,
>  
> -- 
> 2.1.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Maarten Lankhorst Aug. 6, 2015, 1:51 p.m. UTC | #2
Hey,

Op 06-08-15 om 15:01 schreef Daniel Vetter:
> On Thu, Aug 06, 2015 at 01:47:37PM +0200, Maarten Lankhorst wrote:
>> Physically disconnecting a DP connector with an active MST stream
>> can lead to a kernel panic in intel_mst_disable_dp when calling
>> drm_dp_update_payload_part1. Examining the code it seems that the
>> port is freed while work to remove the connector is scheduled.
>>
>> This probably means it's fine to skip call all the mst helper calls
>> and only attempt to disable the real encoder.
> I think this is a refcounting bug in the dp mst helpers, the port
> shouldn't just go poof when we still hold a reference to it. Can you
> please try to figure out what's really broken here, and if that's too
> tricky add a big FIXME comment?
Look at drm_dp_destroy_port, it calls schedule_work(&mgr->destroy_connector_work), and also calls kfree(port).

Doesn't look like a refcounting bug to me, more like something intentional.

~Maarten
Daniel Vetter Aug. 6, 2015, 3:45 p.m. UTC | #3
On Thu, Aug 06, 2015 at 03:51:31PM +0200, Maarten Lankhorst wrote:
> Hey,
> 
> Op 06-08-15 om 15:01 schreef Daniel Vetter:
> > On Thu, Aug 06, 2015 at 01:47:37PM +0200, Maarten Lankhorst wrote:
> >> Physically disconnecting a DP connector with an active MST stream
> >> can lead to a kernel panic in intel_mst_disable_dp when calling
> >> drm_dp_update_payload_part1. Examining the code it seems that the
> >> port is freed while work to remove the connector is scheduled.
> >>
> >> This probably means it's fine to skip call all the mst helper calls
> >> and only attempt to disable the real encoder.
> > I think this is a refcounting bug in the dp mst helpers, the port
> > shouldn't just go poof when we still hold a reference to it. Can you
> > please try to figure out what's really broken here, and if that's too
> > tricky add a big FIXME comment?
> Look at drm_dp_destroy_port, it calls schedule_work(&mgr->destroy_connector_work), and also calls kfree(port).
> 
> Doesn't look like a refcounting bug to me, more like something intentional.

Intentionally buggy is still buggy ;-)

And yeah we probably need to separate the port cleanup due to unplug (i.e.
the driver callback) from freeing the port. And the intel_connector should
hold a reference onto the underlying port I think. Of course it would be
best to do the get/put on the port in the mst helper itself (so that we
don't need to change i915/radeon).
-Daniel

Patch
diff mbox

diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index 91ad17110c2f..19c9e49d74e0 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -110,6 +110,9 @@  static void intel_mst_disable_dp(struct intel_encoder *encoder)
 
 	DRM_DEBUG_KMS("%d\n", intel_dp->active_mst_links);
 
+	if (!intel_mst->port)
+		return;
+
 	drm_dp_mst_reset_vcpi_slots(&intel_dp->mst_mgr, intel_mst->port);
 
 	ret = drm_dp_update_payload_part1(&intel_dp->mst_mgr);
@@ -126,12 +129,14 @@  static void intel_mst_post_disable_dp(struct intel_encoder *encoder)
 
 	DRM_DEBUG_KMS("%d\n", intel_dp->active_mst_links);
 
-	/* this can fail */
-	drm_dp_check_act_status(&intel_dp->mst_mgr);
-	/* and this can also fail */
-	drm_dp_update_payload_part2(&intel_dp->mst_mgr);
+	if (intel_mst->port) {
+		/* this can fail */
+		drm_dp_check_act_status(&intel_dp->mst_mgr);
+		/* and this can also fail */
+		drm_dp_update_payload_part2(&intel_dp->mst_mgr);
 
-	drm_dp_mst_deallocate_vcpi(&intel_dp->mst_mgr, intel_mst->port);
+		drm_dp_mst_deallocate_vcpi(&intel_dp->mst_mgr, intel_mst->port);
+	}
 
 	intel_dp->active_mst_links--;
 	intel_mst->port = NULL;
@@ -471,9 +476,13 @@  static void intel_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	/* need to nuke the connector */
 	drm_modeset_lock_all(dev);
 	if (connector->state->crtc) {
+		struct drm_encoder *encoder = connector->state->best_encoder;
+		struct intel_dp_mst_encoder *intel_mst = enc_to_mst(encoder);
 		struct drm_mode_set set;
 		int ret;
 
+		intel_mst->port = NULL;
+
 		memset(&set, 0, sizeof(set));
 		set.crtc = connector->state->crtc,