drm/atomic: Paper over locking WARN in default_state_clear

Message ID 1438167101-20704-1-git-send-email-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Commit Message

Daniel Vetter July 29, 2015, 10:51 a.m. UTC
In

commit 6f75cea66c8dd043ced282016b21a639af176642
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Nov 19 18:38:07 2014 +0100

    drm/atomic: Only destroy connector states with connection mutex held

I tried to fix races of atomic commits against connector
hot-unplugging. The idea is to ensure lifetimes by holding the
connection_mutex long enough. That works for synchronous commits, but
not for async ones.

For async atomic commit we really need to fix up connector lifetimes
for real. But that's a much bigger task, so just add more duct-tape:
For cleaning up connector states we currently don't need the connector
itself. So NULL it out and remove the locking check. Of course that
check was to protect the entire sequence, but the modeset itself
should be safe since currently DP MST hot-removal does a dpms-off. And
that should synchronize with any outstanding async atomic commit.

Or at least that's my hope, this is all a giant mess.

Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/drm_atomic.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Maarten Lankhorst July 31, 2015, 8:34 a.m. UTC | #1
Hey,

On 29-07-15 at 12:51, Daniel Vetter wrote:
> In
>
> commit 6f75cea66c8dd043ced282016b21a639af176642
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date:   Wed Nov 19 18:38:07 2014 +0100
>
>     drm/atomic: Only destroy connector states with connection mutex held
>
> I tried to fix races of atomic commits against connector
> hot-unplugging. The idea is to ensure lifetimes by holding the
> connection_mutex long enough. That works for synchronous commits, but
> not for async ones.
>
> For async atomic commit we really need to fix up connector lifetimes
> for real. But that's a much bigger task, so just add more duct-tape:
> For cleaning up connector states we currently don't need the connector
> itself. So NULL it out and remove the locking check. Of course that
> check was to protect the entire sequence, but the modeset itself
> should be safe since currently DP MST hot-removal does a dpms-off. And
> that should synchronize with any outstanding async atomic commit.
>
> Or at least that's my hope, this is all a giant mess.
>
> Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/drm_atomic.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index 3efd91c0c6cb..434915448ea0 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -153,9 +153,15 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
>  		if (!connector)
>  			continue;
>  
> -		WARN_ON(!drm_modeset_is_locked(&config->connection_mutex));
> -
> -		connector->funcs->atomic_destroy_state(connector,
> +		/*
> +		 * FIXME: Async commits can race with connector unplugging and
> +		 * there's currently nothing that prevents cleanup of state for
> +		 * deleted connectors. As long as the callback doesn't look at
> +		 * the connector we'll be fine though, so make sure that's the
> +		 * case by setting all connector pointers to NULL.
> +		 */
> +		state->connector_states[i]->connector = NULL;
> +		connector->funcs->atomic_destroy_state(NULL,
>  						       state->connector_states[i]);
>
This wouldn't provide any additional guarantee during the async commit itself, so please don't do this. :-)
Daniel Vetter July 31, 2015, 1:41 p.m. UTC | #2
On Fri, Jul 31, 2015 at 10:34:43AM +0200, Maarten Lankhorst wrote:
> This wouldn't provide any additional guarantee during the async commit
> itself, so please don't do this. :-)

Nope, it's really just a big reminder that we have a bug here.
-Daniel
Daniel Vetter Aug. 10, 2015, 12:10 p.m. UTC | #3
On Fri, Jul 31, 2015 at 03:41:15PM +0200, Daniel Vetter wrote:
> On Fri, Jul 31, 2015 at 10:34:43AM +0200, Maarten Lankhorst wrote:
> > This wouldn't provide any additional guarantee during the async commit
> > itself, so please don't do this. :-)
> 
> Nope, it's really just a big reminder that we have a bug here.

Ok, picked up to drm-misc with Maarten's irc r-b.
-Daniel
Patch

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 3efd91c0c6cb..434915448ea0 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -153,9 +153,15 @@  void drm_atomic_state_default_clear(struct drm_atomic_state *state)
 		if (!connector)
 			continue;
 
-		WARN_ON(!drm_modeset_is_locked(&config->connection_mutex));
-
-		connector->funcs->atomic_destroy_state(connector,
+		/*
+		 * FIXME: Async commits can race with connector unplugging and
+		 * there's currently nothing that prevents cleanup of state for
+		 * deleted connectors. As long as the callback doesn't look at
+		 * the connector we'll be fine though, so make sure that's the
+		 * case by setting all connector pointers to NULL.
+		 */
+		state->connector_states[i]->connector = NULL;
+		connector->funcs->atomic_destroy_state(NULL,
 						       state->connector_states[i]);
 		state->connectors[i] = NULL;
 		state->connector_states[i] = NULL;