Message ID | 20250307183152.3822170-1-imre.deak@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | drm/dp_mst: Fix locking when skipping CSN before topology probing | expand |
[Public] > -----Original Message----- > From: Imre Deak <imre.deak@intel.com> > Sent: Saturday, March 8, 2025 2:32 AM > To: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; dri- > devel@lists.freedesktop.org > Cc: Lin, Wayne <Wayne.Lin@amd.com>; Lyude Paul <lyude@redhat.com>; > stable@vger.kernel.org > Subject: [PATCH] drm/dp_mst: Fix locking when skipping CSN before topology > probing > > The handling of the MST Connection Status Notify message is skipped if the probing > of the topology is still pending. Acquiring the drm_dp_mst_topology_mgr::probe_lock > for this in > drm_dp_mst_handle_up_req() is problematic: the task/work this function is called > from is also responsible for handling MST down-request replies (in > drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() - holding > already probe_lock - could be blocked waiting for an MST down-request reply while > drm_dp_mst_handle_up_req() is waiting for probe_lock while processing a CSN > message. This leads to the probe work's down-request message timing out. > > A scenario similar to the above leading to a down-request timeout is handling a CSN > message in drm_dp_mst_handle_conn_stat(), holding the probe_lock and sending > down-request messages while a second CSN message sent by the sink > subsequently is handled by drm_dp_mst_handle_up_req(). > > Fix the above by moving the logic to skip the CSN handling to > drm_dp_mst_process_up_req(). This function is called from a work (separate from > the task/work handling new up/down messages), already holding probe_lock. This > solves the above timeout issue, since handling of down-request replies won't be > blocked by probe_lock. > > Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is not done yet") > Cc: Wayne Lin <Wayne.Lin@amd.com> > Cc: Lyude Paul <lyude@redhat.com> > Cc: stable@vger.kernel.org # v6.6+ > Signed-off-by: Imre Deak <imre.deak@intel.com> > --- > drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 +++++++++++-------- > 1 file changed, 24 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c > b/drivers/gpu/drm/display/drm_dp_mst_topology.c > index 8b68bb3fbffb0..3a1f1ffc7b552 100644 > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c > @@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct > drm_dp_mst_topology_mgr *mgr) > return 0; > } > > +static bool primary_mstb_probing_is_done(struct drm_dp_mst_topology_mgr > +*mgr) { > + bool probing_done = false; > + > + mutex_lock(&mgr->lock); Thanks for catching this, Imre! Here I think using mgr->lock is not sufficient for determining probing is done or not by mst_primary->link_address_sent. Since it might still be probing the rest of the topology with mst_primary probed. Use probe_lock instead? Thanks! > + > + if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr- > >mst_primary)) { > + probing_done = mgr->mst_primary->link_address_sent; > + drm_dp_mst_topology_put_mstb(mgr->mst_primary); > + } > + > + mutex_unlock(&mgr->lock); > + > + return probing_done; > +} > + > static inline bool > drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr *mgr, > struct drm_dp_pending_up_req *up_req) @@ -4066,8 > +4082,12 @@ drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr > *mgr, > > /* TODO: Add missing handler for DP_RESOURCE_STATUS_NOTIFY > events */ > if (msg->req_type == DP_CONNECTION_STATUS_NOTIFY) { > - dowork = drm_dp_mst_handle_conn_stat(mstb, &msg->u.conn_stat); > - hotplug = true; > + if (!primary_mstb_probing_is_done(mgr)) { > + drm_dbg_kms(mgr->dev, "Got CSN before finish topology > probing. Skip it.\n"); > + } else { > + dowork = drm_dp_mst_handle_conn_stat(mstb, &msg- > >u.conn_stat); > + hotplug = true; > + } > } > > drm_dp_mst_topology_put_mstb(mstb); > @@ -4149,10 +4169,11 @@ static int drm_dp_mst_handle_up_req(struct > drm_dp_mst_topology_mgr *mgr) > drm_dp_send_up_ack_reply(mgr, mst_primary, up_req->msg.req_type, > false); > > + drm_dp_mst_topology_put_mstb(mst_primary); > + > if (up_req->msg.req_type == DP_CONNECTION_STATUS_NOTIFY) { > const struct drm_dp_connection_status_notify *conn_stat = > &up_req->msg.u.conn_stat; > - bool handle_csn; > > drm_dbg_kms(mgr->dev, "Got CSN: pn: %d ldps:%d ddps: %d mcs: > %d ip: %d pdt: %d\n", > conn_stat->port_number, > @@ -4161,16 +4182,6 @@ static int drm_dp_mst_handle_up_req(struct > drm_dp_mst_topology_mgr *mgr) > conn_stat->message_capability_status, > conn_stat->input_port, > conn_stat->peer_device_type); > - > - mutex_lock(&mgr->probe_lock); > - handle_csn = mst_primary->link_address_sent; > - mutex_unlock(&mgr->probe_lock); > - > - if (!handle_csn) { > - drm_dbg_kms(mgr->dev, "Got CSN before finish topology > probing. Skip it."); > - kfree(up_req); > - goto out_put_primary; > - } > } else if (up_req->msg.req_type == DP_RESOURCE_STATUS_NOTIFY) { > const struct drm_dp_resource_status_notify *res_stat = > &up_req->msg.u.resource_stat; > @@ -4185,9 +4196,6 @@ static int drm_dp_mst_handle_up_req(struct > drm_dp_mst_topology_mgr *mgr) > list_add_tail(&up_req->next, &mgr->up_req_list); > mutex_unlock(&mgr->up_req_lock); > queue_work(system_long_wq, &mgr->up_req_work); > - > -out_put_primary: > - drm_dp_mst_topology_put_mstb(mst_primary); > out_clear_reply: > reset_msg_rx_state(&mgr->up_req_recv); > return ret; > -- > 2.44.2 -- Regards, Wayne Lin
On Mon, Mar 10, 2025 at 08:59:51AM +0000, Lin, Wayne wrote: > > > -----Original Message----- > > From: Imre Deak <imre.deak@intel.com> > > Sent: Saturday, March 8, 2025 2:32 AM > > To: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; dri- > > devel@lists.freedesktop.org > > Cc: Lin, Wayne <Wayne.Lin@amd.com>; Lyude Paul <lyude@redhat.com>; > > stable@vger.kernel.org > > Subject: [PATCH] drm/dp_mst: Fix locking when skipping CSN before topology > > probing > > > > The handling of the MST Connection Status Notify message is skipped if the probing > > of the topology is still pending. Acquiring the drm_dp_mst_topology_mgr::probe_lock > > for this in > > drm_dp_mst_handle_up_req() is problematic: the task/work this function is called > > from is also responsible for handling MST down-request replies (in > > drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() - holding > > already probe_lock - could be blocked waiting for an MST down-request reply while > > drm_dp_mst_handle_up_req() is waiting for probe_lock while processing a CSN > > message. This leads to the probe work's down-request message timing out. > > > > A scenario similar to the above leading to a down-request timeout is handling a CSN > > message in drm_dp_mst_handle_conn_stat(), holding the probe_lock and sending > > down-request messages while a second CSN message sent by the sink > > subsequently is handled by drm_dp_mst_handle_up_req(). > > > > Fix the above by moving the logic to skip the CSN handling to > > drm_dp_mst_process_up_req(). This function is called from a work (separate from > > the task/work handling new up/down messages), already holding probe_lock. This > > solves the above timeout issue, since handling of down-request replies won't be > > blocked by probe_lock. > > > > Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is not done yet") > > Cc: Wayne Lin <Wayne.Lin@amd.com> > > Cc: Lyude Paul <lyude@redhat.com> > > Cc: stable@vger.kernel.org # v6.6+ > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > --- > > drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 +++++++++++-------- > > 1 file changed, 24 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > index 8b68bb3fbffb0..3a1f1ffc7b552 100644 > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > @@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct > > drm_dp_mst_topology_mgr *mgr) > > return 0; > > } > > > > +static bool primary_mstb_probing_is_done(struct drm_dp_mst_topology_mgr > > +*mgr) { > > + bool probing_done = false; > > + > > + mutex_lock(&mgr->lock); > > Thanks for catching this, Imre! > > Here I think using mgr->lock is not sufficient for determining probing > is done or not by mst_primary->link_address_sent. Since it might still > be probing the rest of the topology with mst_primary probed. Use > probe_lock instead? Thanks! mgr->lock is taken here to guard the mgr->mst_primary access. probe_lock is also held, taken already by the caller in drm_dp_mst_up_req_work(). > > + > > + if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr-> >mst_primary)) { > > + probing_done = mgr->mst_primary->link_address_sent; > > + drm_dp_mst_topology_put_mstb(mgr->mst_primary); > > + } > > + > > + mutex_unlock(&mgr->lock); > > + > > + return probing_done; > > +}
[Public] > -----Original Message----- > From: Imre Deak <imre.deak@intel.com> > Sent: Monday, March 10, 2025 7:00 PM > To: Lin, Wayne <Wayne.Lin@amd.com> > Cc: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; dri- > devel@lists.freedesktop.org; Lyude Paul <lyude@redhat.com>; > stable@vger.kernel.org > Subject: Re: [PATCH] drm/dp_mst: Fix locking when skipping CSN before topology > probing > > On Mon, Mar 10, 2025 at 08:59:51AM +0000, Lin, Wayne wrote: > > > > > -----Original Message----- > > > From: Imre Deak <imre.deak@intel.com> > > > Sent: Saturday, March 8, 2025 2:32 AM > > > To: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; > > > dri- devel@lists.freedesktop.org > > > Cc: Lin, Wayne <Wayne.Lin@amd.com>; Lyude Paul <lyude@redhat.com>; > > > stable@vger.kernel.org > > > Subject: [PATCH] drm/dp_mst: Fix locking when skipping CSN before > > > topology probing > > > > > > The handling of the MST Connection Status Notify message is skipped > > > if the probing of the topology is still pending. Acquiring the > > > drm_dp_mst_topology_mgr::probe_lock > > > for this in > > > drm_dp_mst_handle_up_req() is problematic: the task/work this > > > function is called from is also responsible for handling MST > > > down-request replies (in drm_dp_mst_handle_down_rep()). Thus > > > drm_dp_mst_link_probe_work() - holding already probe_lock - could be > > > blocked waiting for an MST down-request reply while > > > drm_dp_mst_handle_up_req() is waiting for probe_lock while > > > processing a CSN message. This leads to the probe work's down-request > message timing out. > > > > > > A scenario similar to the above leading to a down-request timeout is > > > handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the > > > probe_lock and sending down-request messages while a second CSN > > > message sent by the sink subsequently is handled by > drm_dp_mst_handle_up_req(). > > > > > > Fix the above by moving the logic to skip the CSN handling to > > > drm_dp_mst_process_up_req(). This function is called from a work > > > (separate from the task/work handling new up/down messages), already > > > holding probe_lock. This solves the above timeout issue, since > > > handling of down-request replies won't be blocked by probe_lock. > > > > > > Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is > > > not done yet") > > > Cc: Wayne Lin <Wayne.Lin@amd.com> > > > Cc: Lyude Paul <lyude@redhat.com> > > > Cc: stable@vger.kernel.org # v6.6+ > > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > > --- > > > drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 > > > +++++++++++-------- > > > 1 file changed, 24 insertions(+), 16 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > index 8b68bb3fbffb0..3a1f1ffc7b552 100644 > > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > @@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct > > > drm_dp_mst_topology_mgr *mgr) > > > return 0; > > > } > > > > > > +static bool primary_mstb_probing_is_done(struct > > > +drm_dp_mst_topology_mgr > > > +*mgr) { > > > + bool probing_done = false; > > > + > > > + mutex_lock(&mgr->lock); > > > > Thanks for catching this, Imre! > > > > Here I think using mgr->lock is not sufficient for determining probing > > is done or not by mst_primary->link_address_sent. Since it might still > > be probing the rest of the topology with mst_primary probed. Use > > probe_lock instead? Thanks! > > mgr->lock is taken here to guard the mgr->mst_primary access. > > probe_lock is also held, taken already by the caller in drm_dp_mst_up_req_work(). Oh I see. It looks good to me. Feel free to add: Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Thanks! > > > > + > > > + if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr-> > >mst_primary)) { > > > + probing_done = mgr->mst_primary->link_address_sent; > > > + drm_dp_mst_topology_put_mstb(mgr->mst_primary); > > > + } > > > + > > > + mutex_unlock(&mgr->lock); > > > + > > > + return probing_done; > > > +} -- Regards, Wayne Lin
On Mon, Mar 10, 2025 at 01:01:25PM +0000, Lin, Wayne wrote: > [Public] > > > -----Original Message----- > > From: Imre Deak <imre.deak@intel.com> > > Sent: Monday, March 10, 2025 7:00 PM > > To: Lin, Wayne <Wayne.Lin@amd.com> > > Cc: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; dri- > > devel@lists.freedesktop.org; Lyude Paul <lyude@redhat.com>; > > stable@vger.kernel.org > > Subject: Re: [PATCH] drm/dp_mst: Fix locking when skipping CSN before topology > > probing > > > > On Mon, Mar 10, 2025 at 08:59:51AM +0000, Lin, Wayne wrote: > > > > > > > -----Original Message----- > > > > From: Imre Deak <imre.deak@intel.com> > > > > Sent: Saturday, March 8, 2025 2:32 AM > > > > To: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; > > > > dri- devel@lists.freedesktop.org > > > > Cc: Lin, Wayne <Wayne.Lin@amd.com>; Lyude Paul <lyude@redhat.com>; > > > > stable@vger.kernel.org > > > > Subject: [PATCH] drm/dp_mst: Fix locking when skipping CSN before > > > > topology probing > > > > > > > > The handling of the MST Connection Status Notify message is skipped > > > > if the probing of the topology is still pending. Acquiring the > > > > drm_dp_mst_topology_mgr::probe_lock > > > > for this in > > > > drm_dp_mst_handle_up_req() is problematic: the task/work this > > > > function is called from is also responsible for handling MST > > > > down-request replies (in drm_dp_mst_handle_down_rep()). Thus > > > > drm_dp_mst_link_probe_work() - holding already probe_lock - could be > > > > blocked waiting for an MST down-request reply while > > > > drm_dp_mst_handle_up_req() is waiting for probe_lock while > > > > processing a CSN message. This leads to the probe > > > > work's down-request message timing out. > > > > > > > > A scenario similar to the above leading to a down-request timeout is > > > > handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the > > > > probe_lock and sending down-request messages while a second CSN > > > > message sent by the sink subsequently is handled by drm_dp_mst_handle_up_req(). > > > > > > > > Fix the above by moving the logic to skip the CSN handling to > > > > drm_dp_mst_process_up_req(). This function is called from a work > > > > (separate from the task/work handling new up/down messages), already > > > > holding probe_lock. This solves the above timeout issue, since > > > > handling of down-request replies won't be blocked by probe_lock. > > > > > > > > Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is > > > > not done yet") > > > > Cc: Wayne Lin <Wayne.Lin@amd.com> > > > > Cc: Lyude Paul <lyude@redhat.com> > > > > Cc: stable@vger.kernel.org # v6.6+ > > > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > > > --- > > > > drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 > > > > +++++++++++-------- > > > > 1 file changed, 24 insertions(+), 16 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > index 8b68bb3fbffb0..3a1f1ffc7b552 100644 > > > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > @@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct > > > > drm_dp_mst_topology_mgr *mgr) > > > > return 0; > > > > } > > > > > > > > +static bool primary_mstb_probing_is_done(struct drm_dp_mst_topology_mgr *mgr) > > > > +{ > > > > + bool probing_done = false; > > > > + > > > > + mutex_lock(&mgr->lock); > > > > > > Thanks for catching this, Imre! > > > > > > Here I think using mgr->lock is not sufficient for determining probing > > > is done or not by mst_primary->link_address_sent. Since it might still > > > be probing the rest of the topology with mst_primary probed. Use > > > probe_lock instead? Thanks! > > > > mgr->lock is taken here to guard the mgr->mst_primary access. > > > > probe_lock is also held, taken already by the caller in drm_dp_mst_up_req_work(). > > Oh I see. It looks good to me. Feel free to add: > > Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Thanks for the review. Lyude, are you ok with the change and if I push it to drm-misc-fixes? > > Thanks! > > > > > > + > > > > + if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr->mst_primary)) { > > > > + probing_done = mgr->mst_primary->link_address_sent; > > > > + drm_dp_mst_topology_put_mstb(mgr->mst_primary); > > > > + } > > > > + > > > > + mutex_unlock(&mgr->lock); > > > > + > > > > + return probing_done; > > > > +} > -- > Regards, > Wayne Lin
Reviewed-by: Lyude Paul <lyude@redhat.com> And yes - feel free to push this change! On Mon, 2025-03-10 at 19:24 +0200, Imre Deak wrote: > On Mon, Mar 10, 2025 at 01:01:25PM +0000, Lin, Wayne wrote: > > [Public] > > > > > -----Original Message----- > > > From: Imre Deak <imre.deak@intel.com> > > > Sent: Monday, March 10, 2025 7:00 PM > > > To: Lin, Wayne <Wayne.Lin@amd.com> > > > Cc: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; dri- > > > devel@lists.freedesktop.org; Lyude Paul <lyude@redhat.com>; > > > stable@vger.kernel.org > > > Subject: Re: [PATCH] drm/dp_mst: Fix locking when skipping CSN before topology > > > probing > > > > > > On Mon, Mar 10, 2025 at 08:59:51AM +0000, Lin, Wayne wrote: > > > > > > > > > -----Original Message----- > > > > > From: Imre Deak <imre.deak@intel.com> > > > > > Sent: Saturday, March 8, 2025 2:32 AM > > > > > To: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; > > > > > dri- devel@lists.freedesktop.org > > > > > Cc: Lin, Wayne <Wayne.Lin@amd.com>; Lyude Paul <lyude@redhat.com>; > > > > > stable@vger.kernel.org > > > > > Subject: [PATCH] drm/dp_mst: Fix locking when skipping CSN before > > > > > topology probing > > > > > > > > > > The handling of the MST Connection Status Notify message is skipped > > > > > if the probing of the topology is still pending. Acquiring the > > > > > drm_dp_mst_topology_mgr::probe_lock > > > > > for this in > > > > > drm_dp_mst_handle_up_req() is problematic: the task/work this > > > > > function is called from is also responsible for handling MST > > > > > down-request replies (in drm_dp_mst_handle_down_rep()). Thus > > > > > drm_dp_mst_link_probe_work() - holding already probe_lock - could be > > > > > blocked waiting for an MST down-request reply while > > > > > drm_dp_mst_handle_up_req() is waiting for probe_lock while > > > > > processing a CSN message. This leads to the probe > > > > > work's down-request message timing out. > > > > > > > > > > A scenario similar to the above leading to a down-request timeout is > > > > > handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the > > > > > probe_lock and sending down-request messages while a second CSN > > > > > message sent by the sink subsequently is handled by drm_dp_mst_handle_up_req(). > > > > > > > > > > Fix the above by moving the logic to skip the CSN handling to > > > > > drm_dp_mst_process_up_req(). This function is called from a work > > > > > (separate from the task/work handling new up/down messages), already > > > > > holding probe_lock. This solves the above timeout issue, since > > > > > handling of down-request replies won't be blocked by probe_lock. > > > > > > > > > > Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is > > > > > not done yet") > > > > > Cc: Wayne Lin <Wayne.Lin@amd.com> > > > > > Cc: Lyude Paul <lyude@redhat.com> > > > > > Cc: stable@vger.kernel.org # v6.6+ > > > > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > > > > --- > > > > > drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 > > > > > +++++++++++-------- > > > > > 1 file changed, 24 insertions(+), 16 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > > index 8b68bb3fbffb0..3a1f1ffc7b552 100644 > > > > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c > > > > > @@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct > > > > > drm_dp_mst_topology_mgr *mgr) > > > > > return 0; > > > > > } > > > > > > > > > > +static bool primary_mstb_probing_is_done(struct drm_dp_mst_topology_mgr *mgr) > > > > > +{ > > > > > + bool probing_done = false; > > > > > + > > > > > + mutex_lock(&mgr->lock); > > > > > > > > Thanks for catching this, Imre! > > > > > > > > Here I think using mgr->lock is not sufficient for determining probing > > > > is done or not by mst_primary->link_address_sent. Since it might still > > > > be probing the rest of the topology with mst_primary probed. Use > > > > probe_lock instead? Thanks! > > > > > > mgr->lock is taken here to guard the mgr->mst_primary access. > > > > > > probe_lock is also held, taken already by the caller in drm_dp_mst_up_req_work(). > > > > Oh I see. It looks good to me. Feel free to add: > > > > Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> > > Thanks for the review. > > Lyude, are you ok with the change and if I push it to drm-misc-fixes? > > > > > Thanks! > > > > > > > > + > > > > > + if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr->mst_primary)) { > > > > > + probing_done = mgr->mst_primary->link_address_sent; > > > > > + drm_dp_mst_topology_put_mstb(mgr->mst_primary); > > > > > + } > > > > > + > > > > > + mutex_unlock(&mgr->lock); > > > > > + > > > > > + return probing_done; > > > > > +} > > -- > > Regards, > > Wayne Lin >
On Fri, Mar 07, 2025 at 08:18:18PM +0000, Patchwork wrote: > == Series Details == > > Series: drm/dp_mst: Fix locking when skipping CSN before topology probing > URL : https://patchwork.freedesktop.org/series/146019/ > State : success Thanks for the reviews, patch is pushed to drm-misc-fixes. > > == Summary == > > CI Bug Log - changes from CI_DRM_16246 -> Patchwork_146019v1 > ==================================================== > > Summary > ------- > > **SUCCESS** > > No regressions found. > > External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/index.html > > Participating hosts (43 -> 43) > ------------------------------ > > Additional (1): bat-arlh-2 > Missing (1): fi-snb-2520m > > Known issues > ------------ > > Here are the changes found in Patchwork_146019v1 that come from known issues: > > ### IGT changes ### > > #### Issues hit #### > > * igt@debugfs_test@basic-hwmon: > - bat-arlh-2: NOTRUN -> [SKIP][1] ([i915#11346] / [i915#9318]) > [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@debugfs_test@basic-hwmon.html > > * igt@fbdev@eof: > - bat-arlh-2: NOTRUN -> [SKIP][2] ([i915#11345] / [i915#11346]) +3 other tests skip > [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@fbdev@eof.html > > * igt@fbdev@info: > - bat-arlh-2: NOTRUN -> [SKIP][3] ([i915#11346] / [i915#1849]) > [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@fbdev@info.html > - fi-kbl-8809g: NOTRUN -> [SKIP][4] ([i915#1849]) > [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/fi-kbl-8809g/igt@fbdev@info.html > > * igt@gem_huc_copy@huc-copy: > - fi-kbl-8809g: NOTRUN -> [SKIP][5] ([i915#2190]) > [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/fi-kbl-8809g/igt@gem_huc_copy@huc-copy.html > > * igt@gem_lmem_swapping@basic: > - bat-arlh-2: NOTRUN -> [SKIP][6] ([i915#10213] / [i915#11346] / [i915#11671]) +3 other tests skip > [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@gem_lmem_swapping@basic.html > > * igt@gem_lmem_swapping@parallel-random-engines: > - fi-kbl-8809g: NOTRUN -> [SKIP][7] ([i915#4613]) +3 other tests skip > [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/fi-kbl-8809g/igt@gem_lmem_swapping@parallel-random-engines.html > > * igt@gem_mmap@basic: > - bat-arlh-2: NOTRUN -> [SKIP][8] ([i915#11343] / [i915#11346]) > [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@gem_mmap@basic.html > > * igt@gem_render_tiled_blits@basic: > - bat-arlh-2: NOTRUN -> [SKIP][9] ([i915#10197] / [i915#10211] / [i915#11346] / [i915#11725]) > [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@gem_render_tiled_blits@basic.html > > * igt@gem_tiled_blits@basic: > - bat-arlh-2: NOTRUN -> [SKIP][10] ([i915#11346] / [i915#12637]) +4 other tests skip > [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@gem_tiled_blits@basic.html > > * igt@gem_tiled_pread_basic: > - bat-arlh-2: NOTRUN -> [SKIP][11] ([i915#10206] / [i915#11346] / [i915#11724]) > [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@gem_tiled_pread_basic.html > > * igt@i915_pm_rpm@module-reload: > - bat-adls-6: [PASS][12] -> [FAIL][13] ([i915#13633]) > [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-adls-6/igt@i915_pm_rpm@module-reload.html > [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-adls-6/igt@i915_pm_rpm@module-reload.html > - bat-dg1-7: [PASS][14] -> [FAIL][15] ([i915#13633]) > [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-dg1-7/igt@i915_pm_rpm@module-reload.html > [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-dg1-7/igt@i915_pm_rpm@module-reload.html > - bat-rpls-4: [PASS][16] -> [FAIL][17] ([i915#13633]) > [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-rpls-4/igt@i915_pm_rpm@module-reload.html > [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-rpls-4/igt@i915_pm_rpm@module-reload.html > > * igt@i915_pm_rps@basic-api: > - bat-arlh-2: NOTRUN -> [SKIP][18] ([i915#10209] / [i915#11346] / [i915#11681]) > [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@i915_pm_rps@basic-api.html > > * igt@i915_selftest@live: > - bat-mtlp-8: [PASS][19] -> [DMESG-FAIL][20] ([i915#12061]) +1 other test dmesg-fail > [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-mtlp-8/igt@i915_selftest@live.html > [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-mtlp-8/igt@i915_selftest@live.html > > * igt@i915_selftest@live@workarounds: > - bat-arls-6: [PASS][21] -> [DMESG-FAIL][22] ([i915#12061]) +1 other test dmesg-fail > [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-arls-6/igt@i915_selftest@live@workarounds.html > [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arls-6/igt@i915_selftest@live@workarounds.html > > * igt@kms_addfb_basic@addfb25-y-tiled-small-legacy: > - bat-arlh-2: NOTRUN -> [SKIP][23] ([i915#10200] / [i915#11346] / [i915#11666] / [i915#12203]) > [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@kms_addfb_basic@addfb25-y-tiled-small-legacy.html > > * igt@kms_addfb_basic@basic-x-tiled-legacy: > - bat-arlh-2: NOTRUN -> [SKIP][24] ([i915#10200] / [i915#11346] / [i915#11666]) +8 other tests skip > [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@kms_addfb_basic@basic-x-tiled-legacy.html > > * igt@kms_dsc@dsc-basic: > - fi-kbl-8809g: NOTRUN -> [SKIP][25] +34 other tests skip > [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/fi-kbl-8809g/igt@kms_dsc@dsc-basic.html > > * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence: > - bat-dg2-11: [PASS][26] -> [SKIP][27] ([i915#9197]) +3 other tests skip > [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html > [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html > > * igt@kms_psr@psr-primary-page-flip: > - bat-arlh-2: NOTRUN -> [SKIP][28] ([i915#11346]) +32 other tests skip > [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@kms_psr@psr-primary-page-flip.html > > * igt@kms_setmode@basic-clone-single-crtc: > - bat-arlh-2: NOTRUN -> [SKIP][29] ([i915#10208] / [i915#11346] / [i915#8809]) > [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@kms_setmode@basic-clone-single-crtc.html > > * igt@prime_vgem@basic-fence-read: > - bat-arlh-2: NOTRUN -> [SKIP][30] ([i915#10212] / [i915#11346] / [i915#11726]) > [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@prime_vgem@basic-fence-read.html > > * igt@prime_vgem@basic-read: > - bat-arlh-2: NOTRUN -> [SKIP][31] ([i915#10214] / [i915#11346] / [i915#11726]) > [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@prime_vgem@basic-read.html > > * igt@prime_vgem@basic-write: > - bat-arlh-2: NOTRUN -> [SKIP][32] ([i915#10216] / [i915#11346] / [i915#11723]) > [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arlh-2/igt@prime_vgem@basic-write.html > > > #### Possible fixes #### > > * igt@i915_selftest@live@workarounds: > - bat-arls-5: [DMESG-FAIL][33] ([i915#12061]) -> [PASS][34] +1 other test pass > [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-arls-5/igt@i915_selftest@live@workarounds.html > [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-arls-5/igt@i915_selftest@live@workarounds.html > - bat-mtlp-6: [DMESG-FAIL][35] ([i915#12061]) -> [PASS][36] +1 other test pass > [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-mtlp-6/igt@i915_selftest@live@workarounds.html > [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-mtlp-6/igt@i915_selftest@live@workarounds.html > - bat-mtlp-9: [DMESG-FAIL][37] ([i915#12061]) -> [PASS][38] +1 other test pass > [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16246/bat-mtlp-9/igt@i915_selftest@live@workarounds.html > [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/bat-mtlp-9/igt@i915_selftest@live@workarounds.html > > > [i915#10197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10197 > [i915#10200]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10200 > [i915#10206]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10206 > [i915#10208]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10208 > [i915#10209]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10209 > [i915#10211]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10211 > [i915#10212]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10212 > [i915#10213]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10213 > [i915#10214]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10214 > [i915#10216]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10216 > [i915#11343]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11343 > [i915#11345]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11345 > [i915#11346]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11346 > [i915#11666]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11666 > [i915#11671]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11671 > [i915#11681]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11681 > [i915#11723]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11723 > [i915#11724]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11724 > [i915#11725]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11725 > [i915#11726]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11726 > [i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061 > [i915#12203]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12203 > [i915#12637]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12637 > [i915#13633]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13633 > [i915#1849]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/1849 > [i915#2190]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/2190 > [i915#4613]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/4613 > [i915#8809]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/8809 > [i915#9197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9197 > [i915#9318]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9318 > > > Build changes > ------------- > > * Linux: CI_DRM_16246 -> Patchwork_146019v1 > > CI-20190529: 20190529 > CI_DRM_16246: f811577f424491a57b1e8669bde62998227d6907 @ git://anongit.freedesktop.org/gfx-ci/linux > IGT_8264: 8264 > Patchwork_146019v1: f811577f424491a57b1e8669bde62998227d6907 @ git://anongit.freedesktop.org/gfx-ci/linux > > == Logs == > > For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_146019v1/index.html
diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c index 8b68bb3fbffb0..3a1f1ffc7b552 100644 --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c @@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr) return 0; } +static bool primary_mstb_probing_is_done(struct drm_dp_mst_topology_mgr *mgr) +{ + bool probing_done = false; + + mutex_lock(&mgr->lock); + + if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr->mst_primary)) { + probing_done = mgr->mst_primary->link_address_sent; + drm_dp_mst_topology_put_mstb(mgr->mst_primary); + } + + mutex_unlock(&mgr->lock); + + return probing_done; +} + static inline bool drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_pending_up_req *up_req) @@ -4066,8 +4082,12 @@ drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr *mgr, /* TODO: Add missing handler for DP_RESOURCE_STATUS_NOTIFY events */ if (msg->req_type == DP_CONNECTION_STATUS_NOTIFY) { - dowork = drm_dp_mst_handle_conn_stat(mstb, &msg->u.conn_stat); - hotplug = true; + if (!primary_mstb_probing_is_done(mgr)) { + drm_dbg_kms(mgr->dev, "Got CSN before finish topology probing. Skip it.\n"); + } else { + dowork = drm_dp_mst_handle_conn_stat(mstb, &msg->u.conn_stat); + hotplug = true; + } } drm_dp_mst_topology_put_mstb(mstb); @@ -4149,10 +4169,11 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr) drm_dp_send_up_ack_reply(mgr, mst_primary, up_req->msg.req_type, false); + drm_dp_mst_topology_put_mstb(mst_primary); + if (up_req->msg.req_type == DP_CONNECTION_STATUS_NOTIFY) { const struct drm_dp_connection_status_notify *conn_stat = &up_req->msg.u.conn_stat; - bool handle_csn; drm_dbg_kms(mgr->dev, "Got CSN: pn: %d ldps:%d ddps: %d mcs: %d ip: %d pdt: %d\n", conn_stat->port_number, @@ -4161,16 +4182,6 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr) conn_stat->message_capability_status, conn_stat->input_port, conn_stat->peer_device_type); - - mutex_lock(&mgr->probe_lock); - handle_csn = mst_primary->link_address_sent; - mutex_unlock(&mgr->probe_lock); - - if (!handle_csn) { - drm_dbg_kms(mgr->dev, "Got CSN before finish topology probing. Skip it."); - kfree(up_req); - goto out_put_primary; - } } else if (up_req->msg.req_type == DP_RESOURCE_STATUS_NOTIFY) { const struct drm_dp_resource_status_notify *res_stat = &up_req->msg.u.resource_stat; @@ -4185,9 +4196,6 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr) list_add_tail(&up_req->next, &mgr->up_req_list); mutex_unlock(&mgr->up_req_lock); queue_work(system_long_wq, &mgr->up_req_work); - -out_put_primary: - drm_dp_mst_topology_put_mstb(mst_primary); out_clear_reply: reset_msg_rx_state(&mgr->up_req_recv); return ret;
The handling of the MST Connection Status Notify message is skipped if the probing of the topology is still pending. Acquiring the drm_dp_mst_topology_mgr::probe_lock for this in drm_dp_mst_handle_up_req() is problematic: the task/work this function is called from is also responsible for handling MST down-request replies (in drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() - holding already probe_lock - could be blocked waiting for an MST down-request reply while drm_dp_mst_handle_up_req() is waiting for probe_lock while processing a CSN message. This leads to the probe work's down-request message timing out. A scenario similar to the above leading to a down-request timeout is handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the probe_lock and sending down-request messages while a second CSN message sent by the sink subsequently is handled by drm_dp_mst_handle_up_req(). Fix the above by moving the logic to skip the CSN handling to drm_dp_mst_process_up_req(). This function is called from a work (separate from the task/work handling new up/down messages), already holding probe_lock. This solves the above timeout issue, since handling of down-request replies won't be blocked by probe_lock. Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is not done yet") Cc: Wayne Lin <Wayne.Lin@amd.com> Cc: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org # v6.6+ Signed-off-by: Imre Deak <imre.deak@intel.com> --- drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 +++++++++++-------- 1 file changed, 24 insertions(+), 16 deletions(-)