diff mbox series

[v2,1/2] drm/dp_mst: Fix clearing payload state on topology disable

Message ID 20200122194321.14953-1-lyude@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v2,1/2] drm/dp_mst: Fix clearing payload state on topology disable | expand

Commit Message

Lyude Paul Jan. 22, 2020, 7:43 p.m. UTC
The issues caused by:

64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")

Prompted me to take a closer look at how we clear the payload state in
general when disabling the topology, and it turns out there's actually
two subtle issues here.

The first is that we're not grabbing &mgr.payload_lock when clearing the
payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
lock order is &mgr.payload_lock -> &mgr.lock (because we always want
&mgr.lock to be the inner-most lock so topology validation always
works), this makes perfect sense. It also means that -technically- there
could be racing between someone calling
drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
modeset occurring that's modifying the payload state at the same time.

The second is the more obvious issue that Wayne Lin discovered, that
we're not clearing proposed_payloads when disabling the topology.

I actually can't see any obvious places where the racing caused by the
first issue would break something, and it could be that some of our
higher-level locks already prevent this by happenstance, but better safe
then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
first grabs &mgr.payload_lock followed by &mgr.lock so that we never
race when modifying the payload state. Then, we also clear
proposed_payloads to fix the original issue of enabling a new topology
with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
structures, but those are getting destroyed along with the ports anyway.

Changes since v1:
* Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
  vsyrjala

Cc: Sean Paul <sean@poorly.run>
Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Ville Syrjala Jan. 22, 2020, 8:51 p.m. UTC | #1
On Wed, Jan 22, 2020 at 02:43:20PM -0500, Lyude Paul wrote:
> The issues caused by:
> 
> 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")
> 
> Prompted me to take a closer look at how we clear the payload state in
> general when disabling the topology, and it turns out there's actually
> two subtle issues here.
> 
> The first is that we're not grabbing &mgr.payload_lock when clearing the
> payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
> lock order is &mgr.payload_lock -> &mgr.lock (because we always want
> &mgr.lock to be the inner-most lock so topology validation always
> works), this makes perfect sense. It also means that -technically- there
> could be racing between someone calling
> drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
> modeset occurring that's modifying the payload state at the same time.
> 
> The second is the more obvious issue that Wayne Lin discovered, that
> we're not clearing proposed_payloads when disabling the topology.
> 
> I actually can't see any obvious places where the racing caused by the
> first issue would break something, and it could be that some of our
> higher-level locks already prevent this by happenstance, but better safe
> then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
> first grabs &mgr.payload_lock followed by &mgr.lock so that we never
> race when modifying the payload state. Then, we also clear
> proposed_payloads to fix the original issue of enabling a new topology
> with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
> structures, but those are getting destroyed along with the ports anyway.
> 
> Changes since v1:
> * Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
>   vsyrjala
> 
> Cc: Sean Paul <sean@poorly.run>
> Cc: Wayne Lin <Wayne.Lin@amd.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Lyude Paul <lyude@redhat.com>
> ---
>  drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
> index 3649e82b963d..23cf46bfef74 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -3501,6 +3501,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
>  	int ret = 0;
>  	struct drm_dp_mst_branch *mstb = NULL;
>  
> +	mutex_lock(&mgr->payload_lock);
>  	mutex_lock(&mgr->lock);
>  	if (mst_state == mgr->mst_state)
>  		goto out_unlock;
> @@ -3559,7 +3560,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
>  		/* this can fail if the device is gone */
>  		drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
>  		ret = 0;
> -		memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
> +		memset(mgr->payloads, 0,
> +		       mgr->max_payloads * sizeof(mgr->payloads[0]));
> +		memset(mgr->proposed_vcpis, 0,
> +		       mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
>  		mgr->payload_mask = 0;
>  		set_bit(0, &mgr->payload_mask);
>  		mgr->vcpi_mask = 0;
> @@ -3568,6 +3572,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
>  
>  out_unlock:
>  	mutex_unlock(&mgr->lock);
> +	mutex_unlock(&mgr->payload_lock);

Locking order looks sane. Not entirely sure what the implications of
clearing all that stuff outside of a proper modeset is, but at least
it matches what we already do. So

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>  	if (mstb)
>  		drm_dp_mst_topology_put_mstb(mstb);
>  	return ret;
> -- 
> 2.24.1
Lyude Paul Jan. 22, 2020, 11:43 p.m. UTC | #2
On Wed, 2020-01-22 at 22:51 +0200, Ville Syrjälä wrote:
> On Wed, Jan 22, 2020 at 02:43:20PM -0500, Lyude Paul wrote:
> > The issues caused by:
> > 
> > 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")
> > 
> > Prompted me to take a closer look at how we clear the payload state in
> > general when disabling the topology, and it turns out there's actually
> > two subtle issues here.
> > 
> > The first is that we're not grabbing &mgr.payload_lock when clearing the
> > payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
> > lock order is &mgr.payload_lock -> &mgr.lock (because we always want
> > &mgr.lock to be the inner-most lock so topology validation always
> > works), this makes perfect sense. It also means that -technically- there
> > could be racing between someone calling
> > drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
> > modeset occurring that's modifying the payload state at the same time.
> > 
> > The second is the more obvious issue that Wayne Lin discovered, that
> > we're not clearing proposed_payloads when disabling the topology.
> > 
> > I actually can't see any obvious places where the racing caused by the
> > first issue would break something, and it could be that some of our
> > higher-level locks already prevent this by happenstance, but better safe
> > then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
> > first grabs &mgr.payload_lock followed by &mgr.lock so that we never
> > race when modifying the payload state. Then, we also clear
> > proposed_payloads to fix the original issue of enabling a new topology
> > with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
> > structures, but those are getting destroyed along with the ports anyway.
> > 
> > Changes since v1:
> > * Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
> >   vsyrjala
> > 
> > Cc: Sean Paul <sean@poorly.run>
> > Cc: Wayne Lin <Wayne.Lin@amd.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Lyude Paul <lyude@redhat.com>
> > ---
> >  drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 3649e82b963d..23cf46bfef74 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -3501,6 +3501,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> >  	int ret = 0;
> >  	struct drm_dp_mst_branch *mstb = NULL;
> >  
> > +	mutex_lock(&mgr->payload_lock);
> >  	mutex_lock(&mgr->lock);
> >  	if (mst_state == mgr->mst_state)
> >  		goto out_unlock;
> > @@ -3559,7 +3560,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> >  		/* this can fail if the device is gone */
> >  		drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
> >  		ret = 0;
> > -		memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct
> > drm_dp_payload));
> > +		memset(mgr->payloads, 0,
> > +		       mgr->max_payloads * sizeof(mgr->payloads[0]));
> > +		memset(mgr->proposed_vcpis, 0,
> > +		       mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
> >  		mgr->payload_mask = 0;
> >  		set_bit(0, &mgr->payload_mask);
> >  		mgr->vcpi_mask = 0;
> > @@ -3568,6 +3572,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> >  
> >  out_unlock:
> >  	mutex_unlock(&mgr->lock);
> > +	mutex_unlock(&mgr->payload_lock);
> 
> Locking order looks sane. Not entirely sure what the implications of
> clearing all that stuff outside of a proper modeset is, but at least
> it matches what we already do. So

fwiw - it's not clearing anything currently programmed in hw at that point,
it's just clearing the arrays we use for keeping track of pending/current
payload allocations so there's nothing in them the next time we enable MST on
a display. This is basically what we want anyway - once we disable MST through
the DPCD, we won't end up sending any payload updates anyway so these don't
get used in the modesets that follow.

tbh, part of me wonders if we could handle more of this in atomic state in the
future, but I haven't seen a good enough reason to try that yet.
> 
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> >  	if (mstb)
> >  		drm_dp_mst_topology_put_mstb(mstb);
> >  	return ret;
> > -- 
> > 2.24.1
Wayne Lin Jan. 23, 2020, 12:34 a.m. UTC | #3
[AMD Official Use Only - Internal Distribution Only]

Hi Lyude,

I also notice one thing that I was not aware of before. Since we're clearing payloads here, maybe we should also call drm_dp_mst_put_port_malloc() to put malloc refcounts here?

Appreciate for all your help! :)
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 3649e82b963d..23cf46bfef74 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3501,6 +3501,7 @@  int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
 	int ret = 0;
 	struct drm_dp_mst_branch *mstb = NULL;
 
+	mutex_lock(&mgr->payload_lock);
 	mutex_lock(&mgr->lock);
 	if (mst_state == mgr->mst_state)
 		goto out_unlock;
@@ -3559,7 +3560,10 @@  int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
 		/* this can fail if the device is gone */
 		drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
 		ret = 0;
-		memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
+		memset(mgr->payloads, 0,
+		       mgr->max_payloads * sizeof(mgr->payloads[0]));
+		memset(mgr->proposed_vcpis, 0,
+		       mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
 		mgr->payload_mask = 0;
 		set_bit(0, &mgr->payload_mask);
 		mgr->vcpi_mask = 0;
@@ -3568,6 +3572,7 @@  int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
 
 out_unlock:
 	mutex_unlock(&mgr->lock);
+	mutex_unlock(&mgr->payload_lock);
 	if (mstb)
 		drm_dp_mst_topology_put_mstb(mstb);
 	return ret;