[v2] drm/dp_mst: Remove VCPI while disabling topology mgr

Message ID 20191205090043.7580-1-Wayne.Lin@amd.com (mailing list archive)
State New, archived
Series [v2] drm/dp_mst: Remove VCPI while disabling topology mgr

Commit Message

Lin, Wayne Dec. 5, 2019, 9 a.m. UTC
[Why]

This patch addresses an issue observed when hotplugging DP
daisy-chained monitors.

e.g.
src-mstb-mstb-sst -> src (unplug) mstb-mstb-sst -> src-mstb-mstb-sst
(plug in again)

Once a DP MST capable device is unplugged, the driver calls
drm_dp_mst_topology_mgr_set_mst() to disable MST. This function
cleans up the topology manager's data while disabling mst_state, but
it doesn't clean up the topology manager's proposed_vcpis.
If proposed_vcpis is not reset, then once MST daisy-chained monitors
are plugged in later, the code fails port validation while trying to
allocate payloads.

When an MST capable device is plugged in again and the driver tries to
allocate payloads by calling drm_dp_update_payload_part1(), this
function iterates over all proposed virtual channels to see if
any proposed VCPI's num_slots is greater than 0. If it is, and the
port that the virtual channel is directed to is not in the topology,
port validation fails. Since proposed_vcpis[] still holds stale VCPI
allocations from the previous topology enablement, the code fails
port validation and returns -EINVAL.

[How]

Clear the stale entries in proposed_vcpis[] and reset each
mgr->proposed_vcpis[] pointer to NULL while disabling MST in
drm_dp_mst_topology_mgr_set_mst().

Changes since v1:
* Add more details to the commit message to describe the issue the
  patch is trying to fix

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
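
The failure described in the commit message can be sketched in plain C outside the kernel. This is only a simplified model under stated assumptions: the `model_*` types and functions are hypothetical stand-ins, not the real DRM structures or helpers, and the validity check follows the commit message's description rather than the actual body of drm_dp_update_payload_part1().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAYLOADS 4

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_port {
	bool in_topology;	/* false once the branch device is unplugged */
};

struct model_vcpi {
	int vcpi;
	int num_slots;
	struct model_port *port;
};

struct model_mgr {
	struct model_vcpi *proposed_vcpis[MAX_PAYLOADS];
};

/*
 * Models the check described in the commit message: a proposed VCPI
 * that still has slots must point at a port that is in the topology.
 */
static int model_update_payload_part1(const struct model_mgr *mgr)
{
	assert(mgr != NULL);

	for (size_t i = 0; i < MAX_PAYLOADS; i++) {
		const struct model_vcpi *v = mgr->proposed_vcpis[i];

		if (v && v->num_slots > 0 && !v->port->in_topology)
			return -EINVAL;
	}
	return 0;
}

/* Models the cleanup loop this patch adds to the MST-disable path. */
static void model_reset_proposed_vcpis(struct model_mgr *mgr)
{
	for (size_t i = 0; i < MAX_PAYLOADS; i++) {
		struct model_vcpi *v = mgr->proposed_vcpis[i];

		if (v) {
			v->vcpi = 0;
			v->num_slots = 0;
		}
		mgr->proposed_vcpis[i] = NULL;
	}
}

/* Replays an unplug followed by a payload update; returns the result. */
int model_replug(bool reset_on_disable)
{
	struct model_port old_port = { .in_topology = true };
	struct model_vcpi old_vcpi = { .vcpi = 1, .num_slots = 10,
				       .port = &old_port };
	struct model_mgr mgr = { .proposed_vcpis = { &old_vcpi } };

	old_port.in_topology = false;	/* device unplugged */
	if (reset_on_disable)
		model_reset_proposed_vcpis(&mgr);

	return model_update_payload_part1(&mgr);
}
```

In this model, model_replug(false) leaves the stale entry in place and the update returns -EINVAL, while model_replug(true) clears it first and the update returns 0.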

Comments

Lyude Paul Dec. 6, 2019, 7:24 p.m. UTC | #1
Reviewed-by: Lyude Paul <lyude@redhat.com>

I'll go ahead and push this to drm-misc-next-fixes right now, thanks!

Lyude Paul Dec. 6, 2019, 7:57 p.m. UTC | #2
On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> Reviewed-by: Lyude Paul <lyude@redhat.com>
> 
> I'll go ahead and push this to drm-misc-next-fixes right now, thanks!

Whoops-meant to say drm-misc-next here, anyway, pushed!
Lin, Wayne Dec. 9, 2019, 5:56 a.m. UTC | #3
> -----Original Message-----
> From: Lyude Paul <lyude@redhat.com>
> Sent: Saturday, December 7, 2019 3:57 AM
> To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> amd-gfx@lists.freedesktop.org
> Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>; Wentland, Harry
> <Harry.Wentland@amd.com>; Zuo, Jerry <Jerry.Zuo@amd.com>;
> stable@vger.kernel.org
> Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling topology
> mgr
> 
> On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > Reviewed-by: Lyude Paul <lyude@redhat.com>
> >
> > I'll go ahead and push this to drm-misc-next-fixes right now, thanks!
> 
> Whoops-meant to say drm-misc-next here, anyway, pushed!
Thanks for your time!


--
Regards,
Wayne Lin
Sean Paul Jan. 17, 2020, 4:19 p.m. UTC | #4
On Mon, Dec 9, 2019 at 12:56 AM Lin, Wayne <Wayne.Lin@amd.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Lyude Paul <lyude@redhat.com>
> > Sent: Saturday, December 7, 2019 3:57 AM
> > To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> > amd-gfx@lists.freedesktop.org
> > Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>; Wentland, Harry
> > <Harry.Wentland@amd.com>; Zuo, Jerry <Jerry.Zuo@amd.com>;
> > stable@vger.kernel.org
> > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling topology
> > mgr
> >
> > On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > >
> > > I'll go ahead and push this to drm-misc-next-fixes right now, thanks!
> >
> > Whoops-meant to say drm-misc-next here, anyway, pushed!
> Thanks for your time!
>

I'm getting the following warning on unplug with this patch:

[   54.010099]
[   54.011765] ======================================================
[   54.018670] WARNING: possible circular locking dependency detected
[   54.025577] 5.5.0-rc6-02274-g77381c23ee63 #47 Not tainted
[   54.031610] ------------------------------------------------------
[   54.038516] kworker/1:6/1040 is trying to acquire lock:
[   54.044354] ffff888272af3228 (&mgr->payload_lock){+.+.}, at:
drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.054957]
[   54.054957] but task is already holding lock:
[   54.061473] ffff888272af3060 (&mgr->lock){+.+.}, at:
drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4
[   54.071193]
[   54.071193] which lock already depends on the new lock.
[   54.071193]
[   54.080334]
[   54.080334] the existing dependency chain (in reverse order) is:
[   54.088697]
[   54.088697] -> #1 (&mgr->lock){+.+.}:
[   54.094440]        __mutex_lock+0xc3/0x498
[   54.099015]        drm_dp_mst_topology_get_port_validated+0x25/0x80
[   54.106018]        drm_dp_update_payload_part1+0xa2/0x2e2
[   54.112051]        intel_mst_pre_enable_dp+0x144/0x18f
[   54.117791]        intel_encoders_pre_enable+0x63/0x70
[   54.123532]        hsw_crtc_enable+0xa1/0x722
[   54.128396]        intel_update_crtc+0x50/0x194
[   54.133455]        skl_commit_modeset_enables+0x40c/0x540
[   54.139485]        intel_atomic_commit_tail+0x5f7/0x130d
[   54.145418]        intel_atomic_commit+0x2c8/0x2d8
[   54.150770]        drm_atomic_helper_set_config+0x5a/0x70
[   54.156801]        drm_mode_setcrtc+0x2ab/0x833
[   54.161862]        drm_ioctl+0x2e5/0x424
[   54.166242]        vfs_ioctl+0x21/0x2f
[   54.170426]        do_vfs_ioctl+0x5fb/0x61e
[   54.175096]        ksys_ioctl+0x55/0x75
[   54.179377]        __x64_sys_ioctl+0x1a/0x1e
[   54.184146]        do_syscall_64+0x5c/0x6d
[   54.188721]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   54.194946]
[   54.194946] -> #0 (&mgr->payload_lock){+.+.}:
[   54.201463]
[   54.201463] other info that might help us debug this:
[   54.201463]
[   54.210410]  Possible unsafe locking scenario:
[   54.210410]
[   54.217025]        CPU0                    CPU1
[   54.222082]        ----                    ----
[   54.227138]   lock(&mgr->lock);
[   54.230643]                                lock(&mgr->payload_lock);
[   54.237742]                                lock(&mgr->lock);
[   54.244062]   lock(&mgr->payload_lock);
[   54.248346]
[   54.248346]  *** DEADLOCK ***
[   54.248346]
[   54.254959] 7 locks held by kworker/1:6/1040:
[   54.259822]  #0: ffff888275c4f528 ((wq_completion)events){+.+.},
at: worker_thread+0x455/0x6e2
[   54.269451]  #1: ffffc9000119beb0
((work_completion)(&(&dev_priv->hotplug.hotplug_work)->work)){+.+.},
at: worker_thread+0x455/0x6e2
[   54.282768]  #2: ffff888272a403f0 (&dev->mode_config.mutex){+.+.},
at: i915_hotplug_work_func+0x4b/0x2be
[   54.293368]  #3: ffffffff824fc6c0 (drm_connector_list_iter){.+.+},
at: i915_hotplug_work_func+0x17e/0x2be
[   54.304061]  #4: ffffc9000119bc58 (crtc_ww_class_acquire){+.+.},
at: drm_helper_probe_detect_ctx+0x40/0xfd
[   54.314855]  #5: ffff888272a40470 (crtc_ww_class_mutex){+.+.}, at:
drm_modeset_lock+0x74/0xe2
[   54.324385]  #6: ffff888272af3060 (&mgr->lock){+.+.}, at:
drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4
[   54.334597]
[   54.334597] stack backtrace:
[   54.339464] CPU: 1 PID: 1040 Comm: kworker/1:6 Not tainted
5.5.0-rc6-02274-g77381c23ee63 #47
[   54.348893] Hardware name: Google Fizz/Fizz, BIOS
Google_Fizz.10139.39.0 01/04/2018
[   54.357451] Workqueue: events i915_hotplug_work_func
[   54.362995] Call Trace:
[   54.365724]  dump_stack+0x71/0x9c
[   54.369427]  check_noncircular+0x91/0xbc
[   54.373809]  ? __lock_acquire+0xc9e/0xf66
[   54.378286]  ? __lock_acquire+0xc9e/0xf66
[   54.382763]  ? lock_acquire+0x175/0x1ac
[   54.387048]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.393177]  ? __mutex_lock+0xc3/0x498
[   54.397362]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.403492]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.409620]  ? drm_dp_dpcd_access+0xd9/0x101
[   54.414390]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.420517]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.426645]  ? intel_digital_port_connected+0x34d/0x35c
[   54.432482]  ? intel_dp_detect+0x227/0x44e
[   54.437056]  ? ww_mutex_lock+0x49/0x9a
[   54.441242]  ? drm_helper_probe_detect_ctx+0x75/0xfd
[   54.446789]  ? intel_encoder_hotplug+0x4b/0x97
[   54.451752]  ? intel_ddi_hotplug+0x61/0x2e0
[   54.456423]  ? mark_held_locks+0x53/0x68
[   54.460803]  ? _raw_spin_unlock_irqrestore+0x3a/0x51
[   54.466347]  ? lockdep_hardirqs_on+0x187/0x1a4
[   54.471310]  ? drm_connector_list_iter_next+0x89/0x9a
[   54.476953]  ? i915_hotplug_work_func+0x206/0x2be
[   54.482208]  ? worker_thread+0x4d5/0x6e2
[   54.486587]  ? worker_thread+0x455/0x6e2
[   54.490966]  ? queue_work_on+0x64/0x64
[   54.495151]  ? kthread+0x1e9/0x1f1
[   54.498946]  ? queue_work_on+0x64/0x64
[   54.503130]  ? kthread_unpark+0x5e/0x5e
[   54.507413]  ? ret_from_fork+0x3a/0x50

\snip
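
The inversion lockdep reports above can be illustrated with a toy lock-order tracker. This is a minimal sketch, not lockdep itself: the `order_graph` type and both functions are hypothetical, it only detects a direct two-lock (AB-BA) inversion rather than longer cycles, and the two acquisition orders are taken from the "Possible unsafe locking scenario" table in the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The two locks named in the warning. */
enum model_lock { MGR_LOCK, PAYLOAD_LOCK, NUM_LOCKS };

struct order_graph {
	/* seen[a][b] is true once some path took b while holding a */
	bool seen[NUM_LOCKS][NUM_LOCKS];
};

/*
 * Record "acquire `next` while holding `held`". Returns false if the
 * opposite order was recorded earlier, i.e. an AB-BA inversion.
 */
bool record_nested_acquire(struct order_graph *g, enum model_lock held,
			   enum model_lock next)
{
	assert(held != next);
	g->seen[held][next] = true;
	return !g->seen[next][held];
}

/* Replays the two code paths from the warning; false means inversion. */
bool patched_disable_path_is_safe(void)
{
	struct order_graph g;
	bool update_ok, disable_ok;

	memset(&g, 0, sizeof(g));

	/* Payload update path: payload_lock held, then mgr->lock taken
	 * during port validation (dependency #1 in the report). */
	update_ok = record_nested_acquire(&g, PAYLOAD_LOCK, MGR_LOCK);

	/* Disable path with this patch: mgr->lock held, then
	 * payload_lock taken (dependency #0 in the report). */
	disable_ok = record_nested_acquire(&g, MGR_LOCK, PAYLOAD_LOCK);

	return update_ok && disable_ok;
}
```

The first recorded order is accepted; the second is flagged because it reverses the first, which is the condition the warning describes. Two paths that take the same pair of locks in the same order would not be flagged by this tracker.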
Lyude Paul Jan. 17, 2020, 8:27 p.m. UTC | #5
On Fri, 2020-01-17 at 11:19 -0500, Sean Paul wrote:
> On Mon, Dec 9, 2019 at 12:56 AM Lin, Wayne <Wayne.Lin@amd.com> wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: Lyude Paul <lyude@redhat.com>
> > > Sent: Saturday, December 7, 2019 3:57 AM
> > > To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> > > amd-gfx@lists.freedesktop.org
> > > Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>; Wentland, Harry
> > > <Harry.Wentland@amd.com>; Zuo, Jerry <Jerry.Zuo@amd.com>;
> > > stable@vger.kernel.org
> > > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling topology
> > > mgr
> > > 
> > > On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > > > 
> > > > I'll go ahead and push this to drm-misc-next-fixes right now, thanks!
> > > 
> > > Whoops-meant to say drm-misc-next here, anyway, pushed!
> > Thanks for your time!
> > 
> 
> I'm getting the following warning on unplug with this patch:
> 
> \snip

I think I've got a better fix for this that should avoid that problem, I'll
write up a patch and send it out in a bit
>
Sean Paul Jan. 17, 2020, 8:43 p.m. UTC | #6
On Fri, Jan 17, 2020 at 3:27 PM Lyude Paul <lyude@redhat.com> wrote:
>
> On Fri, 2020-01-17 at 11:19 -0500, Sean Paul wrote:
> > On Mon, Dec 9, 2019 at 12:56 AM Lin, Wayne <Wayne.Lin@amd.com> wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Lyude Paul <lyude@redhat.com>
> > > > Sent: Saturday, December 7, 2019 3:57 AM
> > > > To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> > > > amd-gfx@lists.freedesktop.org
> > > > Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>; Wentland, Harry
> > > > <Harry.Wentland@amd.com>; Zuo, Jerry <Jerry.Zuo@amd.com>;
> > > > stable@vger.kernel.org
> > > > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling topology
> > > > mgr
> > > >
> > > > On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > > > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > > > >
> > > > > I'll go ahead and push this to drm-misc-next-fixes right now, thanks!
> > > >
> > > > Whoops-meant to say drm-misc-next here, anyway, pushed!
> > > Thanks for your time!
> > >
> >
> > I'm getting the following warning on unplug with this patch:
> >

\snip

>
> I think I've got a better fix for this that should avoid that problem, I'll
> write up a patch and send it out in a bit

Thanks Lyude! Should we revert this patch for the time being?

> >
> --
> Cheers,
>         Lyude Paul
>
Lyude Paul Jan. 17, 2020, 8:45 p.m. UTC | #7
Yeah that's fine with me, I'll send out a revert for this in just a moment

On Fri, 2020-01-17 at 15:43 -0500, Sean Paul wrote:
> On Fri, Jan 17, 2020 at 3:27 PM Lyude Paul <lyude@redhat.com> wrote:
> > On Fri, 2020-01-17 at 11:19 -0500, Sean Paul wrote:
> > > On Mon, Dec 9, 2019 at 12:56 AM Lin, Wayne <Wayne.Lin@amd.com> wrote:
> > > > 
> > > > > -----Original Message-----
> > > > > From: Lyude Paul <lyude@redhat.com>
> > > > > Sent: Saturday, December 7, 2019 3:57 AM
> > > > > To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> > > > > amd-gfx@lists.freedesktop.org
> > > > > Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>; Wentland,
> > > > > Harry
> > > > > <Harry.Wentland@amd.com>; Zuo, Jerry <Jerry.Zuo@amd.com>;
> > > > > stable@vger.kernel.org
> > > > > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling
> > > > > topology
> > > > > mgr
> > > > > 
> > > > > On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > > > > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > > > > > 
> > > > > > I'll go ahead and push this to drm-misc-next-fixes right now,
> > > > > > thanks!
> > > > > 
> > > > > Whoops-meant to say drm-misc-next here, anyway, pushed!
> > > > Thanks for your time!
> > > > 
> > > 
> > > I'm getting the following warning on unplug with this patch:
> > > 
> 
> \snip
> 
> > I think I've got a better fix for this that should avoid that problem,
> > I'll
> > write up a patch and send it out in a bit
> 
> Thanks Lyude! Should we revert this patch for the time being?
> 
> > --
> > Cheers,
> >         Lyude Paul
> >
Lin, Wayne Jan. 22, 2020, 4:48 a.m. UTC | #8
[AMD Public Use]

Sorry for any inconvenience I caused.
Thank you so much Lyude, I will have a look at that fix patch later.

> -----Original Message-----
> From: Lyude Paul <lyude@redhat.com>
> Sent: Saturday, January 18, 2020 4:45 AM
> To: Sean Paul <sean@poorly.run>
> Cc: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> amd-gfx@lists.freedesktop.org; Zuo, Jerry <Jerry.Zuo@amd.com>; Kazlauskas,
> Nicholas <Nicholas.Kazlauskas@amd.com>; stable@vger.kernel.org
> Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling topology
> mgr
> 
> Yeah that's fine with me, I'll send out a revert for this in just a moment
> 
> On Fri, 2020-01-17 at 15:43 -0500, Sean Paul wrote:
> > On Fri, Jan 17, 2020 at 3:27 PM Lyude Paul <lyude@redhat.com> wrote:
> > > On Fri, 2020-01-17 at 11:19 -0500, Sean Paul wrote:
> > > > On Mon, Dec 9, 2019 at 12:56 AM Lin, Wayne <Wayne.Lin@amd.com>
> wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Lyude Paul <lyude@redhat.com>
> > > > > > Sent: Saturday, December 7, 2019 3:57 AM
> > > > > > To: Lin, Wayne <Wayne.Lin@amd.com>;
> > > > > > dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> > > > > > Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>;
> > > > > > Wentland, Harry <Harry.Wentland@amd.com>; Zuo, Jerry
> > > > > > <Jerry.Zuo@amd.com>; stable@vger.kernel.org
> > > > > > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while
> > > > > > disabling topology mgr
> > > > > >
> > > > > > On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > > > > > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > > > > > >
> > > > > > > I'll go ahead and push this to drm-misc-next-fixes right
> > > > > > > now, thanks!
> > > > > >
> > > > > > Whoops-meant to say drm-misc-next here, anyway, pushed!
> > > > > Thanks for your time!
> > > > >
> > > >
> > > > I'm getting the following warning on unplug with this patch:
> > > >
> >
> > \snip
> >
> > > I think I've got a better fix for this that should avoid that
> > > problem, I'll write up a patch and send it out in a bit
> >
> > Thanks Lyude! Should we revert this patch for the time being?
> >
> > > --
> > > Cheers,
> > >         Lyude Paul
> > >
> --
> Cheers,
> 	Lyude Paul
--
Best regards,
Wayne
Lyude Paul Jan. 22, 2020, 8:18 p.m. UTC | #9
On Wed, 2020-01-22 at 04:48 +0000, Lin, Wayne wrote:
> [AMD Public Use]
> 
> Sorry for any inconvenience I brought.

Nothing to be sorry about! This happens from time to time, it's my fault for
not noticing it in the first place anyway :P. Ville from Intel is able to
review it, so there's no rush

> Thank you so much Lyude, I will have a look on that fix patch later.
> 
> > -----Original Message-----
> > From: Lyude Paul <lyude@redhat.com>
> > Sent: Saturday, January 18, 2020 4:45 AM
> > To: Sean Paul <sean@poorly.run>
> > Cc: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> > amd-gfx@lists.freedesktop.org; Zuo, Jerry <Jerry.Zuo@amd.com>; Kazlauskas,
> > Nicholas <Nicholas.Kazlauskas@amd.com>; stable@vger.kernel.org
> > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while disabling topology
> > mgr
> > 
> > Yeah that's fine with me, I'll send out a revert for this in just a moment
> > 
> > On Fri, 2020-01-17 at 15:43 -0500, Sean Paul wrote:
> > > On Fri, Jan 17, 2020 at 3:27 PM Lyude Paul <lyude@redhat.com> wrote:
> > > > On Fri, 2020-01-17 at 11:19 -0500, Sean Paul wrote:
> > > > > On Mon, Dec 9, 2019 at 12:56 AM Lin, Wayne <Wayne.Lin@amd.com>
> > wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Lyude Paul <lyude@redhat.com>
> > > > > > > Sent: Saturday, December 7, 2019 3:57 AM
> > > > > > > To: Lin, Wayne <Wayne.Lin@amd.com>;
> > > > > > > dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> > > > > > > Cc: Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>;
> > > > > > > Wentland, Harry <Harry.Wentland@amd.com>; Zuo, Jerry
> > > > > > > <Jerry.Zuo@amd.com>; stable@vger.kernel.org
> > > > > > > Subject: Re: [PATCH v2] drm/dp_mst: Remove VCPI while
> > > > > > > disabling topology mgr
> > > > > > > 
> > > > > > > On Fri, 2019-12-06 at 14:24 -0500, Lyude Paul wrote:
> > > > > > > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > > > > > > > 
> > > > > > > > I'll go ahead and push this to drm-misc-next-fixes right
> > > > > > > > now, thanks!
> > > > > > > 
> > > > > > > Whoops-meant to say drm-misc-next here, anyway, pushed!
> > > > > > Thanks for your time!
> > > > > > 
> > > > > 
> > > > > I'm getting the following warning on unplug with this patch:
> > > > > 
> > > 
> > > \snip
> > > 
> > > > I think I've got a better fix for this that should avoid that
> > > > problem, I'll write up a patch and send it out in a bit
> > > 
> > > Thanks Lyude! Should we revert this patch for the time being?
> > > 
> > > > --
> > > > Cheers,
> > > >         Lyude Paul
> > > > 
> > --
> > Cheers,
> > 	Lyude Paul
> --
> Best regards,
> Wayne

Patch

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index ae5809a1f19a..bf4f745a4edb 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3386,6 +3386,7 @@  static int drm_dp_get_vc_payload_bw(u8 dp_link_bw, u8  dp_link_count)
 int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool mst_state)
 {
 	int ret = 0;
+	int i = 0;
 	struct drm_dp_mst_branch *mstb = NULL;
 
 	mutex_lock(&mgr->lock);
@@ -3446,10 +3447,21 @@  int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
 		/* this can fail if the device is gone */
 		drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
 		ret = 0;
+		mutex_lock(&mgr->payload_lock);
 		memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
 		mgr->payload_mask = 0;
 		set_bit(0, &mgr->payload_mask);
+		for (i = 0; i < mgr->max_payloads; i++) {
+			struct drm_dp_vcpi *vcpi = mgr->proposed_vcpis[i];
+
+			if (vcpi) {
+				vcpi->vcpi = 0;
+				vcpi->num_slots = 0;
+			}
+			mgr->proposed_vcpis[i] = NULL;
+		}
 		mgr->vcpi_mask = 0;
+		mutex_unlock(&mgr->payload_lock);
 	}
 
 out_unlock: