
drm/i915: reset forcewake count after reset

Message ID 1308870382-1587-1-git-send-email-ben@bwidawsk.net (mailing list archive)
State New, archived

Commit Message

Ben Widawsky June 23, 2011, 11:06 p.m. UTC
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

Comments

Chris Wilson June 23, 2011, 11:45 p.m. UTC | #1
On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 0defd42..9292499 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
>  	} else switch (INTEL_INFO(dev)->gen) {
>  	case 6:
>  		ret = gen6_do_reset(dev, flags);
> +		atomic_set(&dev_priv->forcewake_count, 0);
>  		break;
>  	case 5:
>  		ret = ironlake_do_reset(dev, flags);

Can forcewake be non-zero here? If it has been bumped by a user wakelock,
then what happens when that is subsequently released? I don't think this
is safe...

What scenario are you trying to fix?
-Chris
Ben Widawsky June 24, 2011, 2 a.m. UTC | #2
On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 0defd42..9292499 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> >  	} else switch (INTEL_INFO(dev)->gen) {
> >  	case 6:
> >  		ret = gen6_do_reset(dev, flags);
> > +		atomic_set(&dev_priv->forcewake_count, 0);
> >  		break;
> >  	case 5:
> >  		ret = ironlake_do_reset(dev, flags);
> 
> Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> then what happens when that is subsequently released? I don't think this
> is safe...
> 
> What scenario are you trying to fix?
> -Chris

This is not the cleanest fix, but the problem is the following:

1. User bumps refcount
2. GPU hangs
3. Reset occurs
4. User doesn't close the file (or there is a race between the reset and
   the user closing the file). The driver is now completely screwed in
   this case. Once the user does close the file, things will go back to
   normal.

I was actually just about to respond to my original email to say this
belongs in -fixes (unless I'm confused).

Ben
Ben Widawsky June 24, 2011, 2:02 a.m. UTC | #3
On Thu, Jun 23, 2011 at 07:00:50PM -0700, Ben Widawsky wrote:
> On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> > On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > index 0defd42..9292499 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> > >  	} else switch (INTEL_INFO(dev)->gen) {
> > >  	case 6:
> > >  		ret = gen6_do_reset(dev, flags);
> > > +		atomic_set(&dev_priv->forcewake_count, 0);
> > >  		break;
> > >  	case 5:
> > >  		ret = ironlake_do_reset(dev, flags);
> > 
> > Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> > then what happens when that is subsequently released? I don't think this
> > is safe...
> > 
> > What scenario are you trying to fix?
> > -Chris
> 
> This is not the cleanest fix, but the problem is the following:
> 
> 1. User bumps refcount
> 2. GPU hangs
> 3. Reset occurs
> 4. User doesn't close the file (or even the race before the user closes
>    the file after the reset) the driver is now completely screwed in
>    this case, once the user does close the file, things will go back to
>    normal.
> 
> I was actually just about to respond to my original email to say this
> belongs in -fixes (unless I'm confused).
> 
> Ben

Just realized that you're right. My code is buggy at step 4 when the
user closes the file... I do think we need some fix though. Agree?
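
For illustration, the two failure modes discussed above can be reproduced
in a toy userspace model. This is only a sketch: fw_model, fw_get, fw_put
and gpu_reset are hypothetical stand-ins, not the i915 functions, and the
model assumes that a reset drops the hardware wake state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; these names are illustrative, not the i915 API. */
struct fw_model {
	int count;     /* stands in for dev_priv->forcewake_count */
	bool hw_awake; /* whether the hardware is being held awake */
};

static void fw_get(struct fw_model *m)
{
	if (m->count++ == 0)
		m->hw_awake = true;  /* first holder wakes the hardware */
}

static void fw_put(struct fw_model *m)
{
	if (--m->count == 0)
		m->hw_awake = false; /* last holder lets it sleep again */
}

/* Assumption: a reset drops the hardware wake state. */
static void gpu_reset(struct fw_model *m, bool zero_count)
{
	m->hw_awake = false;
	if (zero_count)
		m->count = 0;        /* what the posted patch adds */
}

static void demo_reset_with_user_holder(void)
{
	struct fw_model unpatched = { 0, false };
	struct fw_model patched = { 0, false };

	/* Without the patch: a later get sees count != 0 and never wakes. */
	fw_get(&unpatched);
	gpu_reset(&unpatched, false);
	fw_get(&unpatched);
	assert(unpatched.count == 2 && !unpatched.hw_awake);

	/* With the patch: the user's eventual put underflows the count. */
	fw_get(&patched);
	gpu_reset(&patched, true);
	fw_put(&patched);
	assert(patched.count == -1);
}
```

In the patched case the count stays off by one after the underflow, so the
next fw_get() brings it back to 0 without waking the hardware.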
Chris Wilson June 24, 2011, 7:54 a.m. UTC | #4
On Thu, 23 Jun 2011 19:02:32 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Thu, Jun 23, 2011 at 07:00:50PM -0700, Ben Widawsky wrote:
> > On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> > > On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> > > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 0defd42..9292499 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> > > >  	} else switch (INTEL_INFO(dev)->gen) {
> > > >  	case 6:
> > > >  		ret = gen6_do_reset(dev, flags);
> > > > +		atomic_set(&dev_priv->forcewake_count, 0);
> > > >  		break;
> > > >  	case 5:
> > > >  		ret = ironlake_do_reset(dev, flags);
> > > 
> > > Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> > > then what happens when that is subsequently released? I don't think this
> > > is safe...
> > > 
> > > What scenario are you trying to fix?
> > > -Chris
> > 
> > This is not the cleanest fix, but the problem is the following:
> > 
> > 1. User bumps refcount
> > 2. GPU hangs
> > 3. Reset occurs
> > 4. User doesn't close the file (or even the race before the user closes
> >    the file after the reset) the driver is now completely screwed in
> >    this case, once the user does close the file, things will go back to
> >    normal.
> > 
> > I was actually just about to respond to my original email to say this
> > belongs in -fixes (unless I'm confused).
> > 
> > Ben
> 
> Just realized that you're right. My code is buggy at step 4 when the
> user closes the file... I do think we need some fix though. Agree?

Are we sure that the GT forcedwake is hammered along with the GPU reset? I
haven't checked but that's the crux of the issue...

Assuming it is, I see the problem you're trying to solve (sleep is good!).
Even if it isn't, we could perform the forcedwake sequence so that our
refcnt was back in sync with the hardware. If we continue to presume that
struct_mutex is the one and only guard for forcedwake, then we should be
race free? Another solution would be to defer the reset until the
forcedwake refcnt drops to zero. But that conflates the notion of a
resetlock with the wakelock (although we could say that the user wakelock
is the combination of forcedwakelock and resetlock).

Something to think about, at least :)
-Chris
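
One of the options above, redoing the wake sequence after the reset so the
refcount and the hardware agree again, might look roughly like this in a
toy userspace model. The names fw_state, fw_state_get, fw_state_put and
reset_then_resync are hypothetical, and the model assumes both that a
reset clears the hardware wake state and that a single lock (struct_mutex
in the discussion) serializes all of these calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; not the i915 API. Callers are assumed serialized. */
struct fw_state {
	int count;     /* software refcount of wake holders */
	bool hw_awake; /* whether the hardware is being held awake */
};

static void fw_state_get(struct fw_state *s)
{
	if (s->count++ == 0)
		s->hw_awake = true;
}

static void fw_state_put(struct fw_state *s)
{
	if (--s->count == 0)
		s->hw_awake = false;
}

static void reset_then_resync(struct fw_state *s)
{
	s->hw_awake = false;         /* assumed side effect of the reset */
	if (s->count > 0)
		s->hw_awake = true;  /* redo the wake for existing holders */
}

static void demo_resync(void)
{
	struct fw_state s = { 0, false };

	fw_state_get(&s);            /* user takes a wakelock */
	reset_then_resync(&s);       /* reset happens while it is held */
	assert(s.count == 1 && s.hw_awake);

	fw_state_put(&s);            /* the later release balances cleanly */
	assert(s.count == 0 && !s.hw_awake);
}
```

Because the count is never overwritten, the user's later release still
pairs with the original acquire.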
Ben Widawsky June 24, 2011, 3:37 p.m. UTC | #5
On Fri, Jun 24, 2011 at 08:54:24AM +0100, Chris Wilson wrote:
> On Thu, 23 Jun 2011 19:02:32 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Thu, Jun 23, 2011 at 07:00:50PM -0700, Ben Widawsky wrote:
> > > On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> > > > On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > > > 
> > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> > > > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index 0defd42..9292499 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> > > > >  	} else switch (INTEL_INFO(dev)->gen) {
> > > > >  	case 6:
> > > > >  		ret = gen6_do_reset(dev, flags);
> > > > > +		atomic_set(&dev_priv->forcewake_count, 0);
> > > > >  		break;
> > > > >  	case 5:
> > > > >  		ret = ironlake_do_reset(dev, flags);
> > > > 
> > > > Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> > > > then what happens when that is subsequently released? I don't think this
> > > > is safe...
> > > > 
> > > > What scenario are you trying to fix?
> > > > -Chris
> > > 
> > > This is not the cleanest fix, but the problem is the following:
> > > 
> > > 1. User bumps refcount
> > > 2. GPU hangs
> > > 3. Reset occurs
> > > 4. User doesn't close the file (or even the race before the user closes
> > >    the file after the reset) the driver is now completely screwed in
> > >    this case, once the user does close the file, things will go back to
> > >    normal.
> > > 
> > > I was actually just about to respond to my original email to say this
> > > belongs in -fixes (unless I'm confused).
> > > 
> > > Ben
> > 
> > Just realized that you're right. My code is buggy at step 4 when the
> > user closes the file... I do think we need some fix though. Agree?
> 
> Are we sure that the GT forcedwake is hammered along with the GPU reset? I
> haven't checked but that's the crux of the issue...

Yes, the test I am performing leads me to believe so. You can try
yourself and tell me what you think:
1. Run forcewaked - you remember that nifty app I posted ;-)
2. Trigger a GPU reset
3. intel_reg_write to any register below 0x40000

> 
> Assuming it is, I see the problem you're trying to solve (sleep is good!).
> Even if it isn't, we could perform the forcedwake sequence so that our
> refcnt was back in sync with the hardware. If we continue to presume that
> struct_mutex is the one and only guard for forcedwake, then we should be
> race free? 

I believe the problem only exists with user initiated forcewake.
Fortunately that debugfs entry is root only, and your average person
won't be using it.

I'll modify this patch to be a WARN_ON instead of atomic_set(). That
will be helpful in proving it.
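
As a rough idea of the diagnostic variant, here is a userspace stand-in
that reports a nonzero count at reset time instead of overwriting it. The
helper warn_if_held_at_reset is hypothetical; in the driver the check
would presumably be a WARN_ON() on the atomic count at the same spot in
i915_reset().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace stand-in for a WARN_ON-style check: report the condition and
 * return it, leaving the count itself untouched.
 */
static bool warn_if_held_at_reset(int forcewake_count)
{
	bool held = forcewake_count != 0;

	if (held)
		fprintf(stderr, "reset with forcewake_count=%d\n",
			forcewake_count);
	return held;
}

static void demo_warn(void)
{
	assert(!warn_if_held_at_reset(0)); /* no holder: silent */
	assert(warn_if_held_at_reset(1));  /* user wakelock held: warns */
}
```

Unlike atomic_set(), this does not change behavior. It only records
whether a reset was ever taken with forcewake held.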

> Something to think about, at least :)
> -Chris

You know, I was really excited to post this because I felt it could be
really helpful, but now that you've convinced me this is only a problem
for users using the debugfs forcewakes (which is probably only me at
this moment), let me go think about it again.

Ben

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0defd42..9292499 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -579,6 +579,7 @@  int i915_reset(struct drm_device *dev, u8 flags)
 	} else switch (INTEL_INFO(dev)->gen) {
 	case 6:
 		ret = gen6_do_reset(dev, flags);
+		atomic_set(&dev_priv->forcewake_count, 0);
 		break;
 	case 5:
 		ret = ironlake_do_reset(dev, flags);