diff mbox

drm/mst: drop cancel work sync in the mstb destroy path

Message ID 1443573582-19537-1-git-send-email-airlied@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Airlie Sept. 30, 2015, 12:39 a.m. UTC
From: Dave Airlie <airlied@redhat.com>

Since 9eb1e57f564d4e6e10991402726cc83fe0b9172f
drm/dp/mst: make sure mst_primary mstb is valid in work function

we validate the mstb structs in the work function, and doing
that takes a reference. So we should never get here with the
work function running using the mstb device, only if the work
function hasn't run yet or is running for another mstb.

So we don't need to sync the work here, this was causing
lockdep spew when the device failed during the initial EDID
read.

Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Daniel Vetter Sept. 30, 2015, 7:20 a.m. UTC | #1
On Wed, Sep 30, 2015 at 10:39:42AM +1000, Dave Airlie wrote:
> From: Dave Airlie <airlied@redhat.com>
> 
> Since 9eb1e57f564d4e6e10991402726cc83fe0b9172f
> drm/dp/mst: make sure mst_primary mstb is valid in work function
> 
> we validate the mstb structs in the work function, and doing
> that takes a reference. So we should never get here with the
> work function running using the mstb device, only if the work
> function hasn't run yet or is running for another mstb.
> 
> So we don't need to sync the work here, this was causing
> lockdep spew when the device failed during the initial EDID
> read.

can you pls paste the lockdep spew here? I think I see the locking
recursion but would like to be sure.

> Signed-off-by: Dave Airlie <airlied@redhat.com>
> ---
>  drivers/gpu/drm/drm_dp_mst_topology.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
> index e23df5f..16f911c 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -804,8 +804,6 @@ static void drm_dp_destroy_mst_branch_device(struct kref *kref)
>  	struct drm_dp_mst_port *port, *tmp;
>  	bool wake_tx = false;
>  
> -	cancel_work_sync(&mstb->mgr->work);
> -

Without this there's no place any more where we stop this worker on
teardown. Generally we need to sync with outstanding work at least in
system suspend and on driver unload. I think we need a flush_work (outside
of the locked region ofc!) in drm_dp_mst_topology_mgr_suspend and
drm_dp_mst_topology_mgr_destroy. Might be better as a follow-up patch, but
should imo be part of this.

With lockdep splat + 2x flush_work this is Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-Daniel

>  	/*
>  	 * destroy all ports - don't need lock
>  	 * as there are no more references to the mst branch
> -- 
> 2.4.3
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Daniel Vetter Sept. 30, 2015, 7:30 a.m. UTC | #2
On Wed, Sep 30, 2015 at 09:20:47AM +0200, Daniel Vetter wrote:
> On Wed, Sep 30, 2015 at 10:39:42AM +1000, Dave Airlie wrote:
> > From: Dave Airlie <airlied@redhat.com>
> > 
> > Since 9eb1e57f564d4e6e10991402726cc83fe0b9172f
> > drm/dp/mst: make sure mst_primary mstb is valid in work function
> > 
> > we validate the mstb structs in the work function, and doing
> > that takes a reference. So we should never get here with the
> > work function running using the mstb device, only if the work
> > function hasn't run yet or is running for another mstb.
> > 
> > So we don't need to sync the work here, this was causing
> > lockdep spew when the device failed during the initial EDID
> > read.
> 
> can you pls paste the lockdep spew here? I think I see the locking
> recursion but would like to be sure.
> 
> > Signed-off-by: Dave Airlie <airlied@redhat.com>
> > ---
> >  drivers/gpu/drm/drm_dp_mst_topology.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index e23df5f..16f911c 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -804,8 +804,6 @@ static void drm_dp_destroy_mst_branch_device(struct kref *kref)
> >  	struct drm_dp_mst_port *port, *tmp;
> >  	bool wake_tx = false;
> >  
> > -	cancel_work_sync(&mstb->mgr->work);
> > -
> 
> Without this there's no place any more where we stop this worker on
> teardown. Generally we need to sync with outstanding work at least in
> system suspend and on driver unload. I think we need a flush_work (outside
> of the locked region ofc!) in drm_dp_mst_topology_mgr_suspend and
> drm_dp_mst_topology_mgr_destroy. Might be better as a follow-up patch, but
> should imo be part of this.
> 
> With lockdep splat + 2x flush_work this is Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

*1x flush_work - the one in drm_dp_mst_topology_mgr_destroy is already
there, I must have been blind ...
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index e23df5f..16f911c 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -804,8 +804,6 @@  static void drm_dp_destroy_mst_branch_device(struct kref *kref)
 	struct drm_dp_mst_port *port, *tmp;
 	bool wake_tx = false;
 
-	cancel_work_sync(&mstb->mgr->work);
-
 	/*
 	 * destroy all ports - don't need lock
 	 * as there are no more references to the mst branch