drm/i915: Undo gtt scratch pte unmapping again

Message ID s5hior0nnx6.wl%tiwai@suse.de (mailing list archive)
State New, archived

Commit Message

Takashi Iwai March 27, 2014, 6:41 a.m. UTC
At Wed, 26 Mar 2014 20:10:09 +0100,
Daniel Vetter wrote:
> 
> It apparently blows up on some machines. This functionally reverts
> 
> commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
> Author: Ben Widawsky <benjamin.widawsky@intel.com>
> Date:   Wed Oct 16 09:21:30 2013 -0700
> 
>     drm/i915: Disable GGTT PTEs on GEN6+ suspend
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
> Reported-and-Tested-by: Brad  Jackson <bjackson0971@gmail.com>
> Cc: stable@vger.kernel.org
> Cc: Takashi Iwai <tiwai@suse.de>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Todd Previte <tprevite@gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

The commit was the fix for the memory corruption and lock up at S4 on
some (Haswell) machines.  This revert will re-introduce the issue
again very likely.  I'm going to check with the latest tree, but this
may take some time.

Wouldn't it be safer to revert this conditionally like I suggested in
comment 10 of the bugzilla entry?

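--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -818,7 +818,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
        /* Don't bother messing with faults pre GEN6 as we have little
         * documentation supporting that it's a good idea.
         */
-       if (INTEL_INFO(dev)->gen < 6)
+       if (INTEL_INFO(dev)->gen < 7)
                return;
 
        i915_check_and_clear_faults(dev);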

thanks,

Takashi


> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 5d61de18ae55..4467974eb53b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1281,7 +1281,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
>  	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
>  				       dev_priv->gtt.base.start,
>  				       dev_priv->gtt.base.total,
> -				       false);
> +				       true);
>  }
>  
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> -- 
> 1.8.5.2
>
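
For readers following the one-line change above, here is a minimal sketch of what the last argument to clear_range is understood to control. It is a toy model, not the driver code: struct toy_gtt, toy_clear_range(), toy_count_valid(), PTE_VALID and NUM_PTES are all hypothetical names, and the assumption is that the flag decides whether the cleared PTEs point at the scratch page as valid entries (true) or are left invalid (false).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_VALID 0x1u   /* hypothetical "valid" bit in a PTE */
#define NUM_PTES  8      /* toy table size */

/* Toy stand-in for a global GTT: a flat PTE array plus a scratch page. */
struct toy_gtt {
	uint32_t pte[NUM_PTES];
	uint32_t scratch_addr;   /* page-aligned address of the scratch page */
};

/*
 * Point every PTE in [first, first + count) at the scratch page.
 * use_scratch == true  : entries stay valid, accesses land in scratch.
 * use_scratch == false : entries are written without the valid bit,
 *                        i.e. the range is effectively unmapped.
 */
static void toy_clear_range(struct toy_gtt *gtt, size_t first, size_t count,
			    bool use_scratch)
{
	uint32_t scratch_pte;

	assert(gtt != NULL);
	scratch_pte = gtt->scratch_addr | (use_scratch ? PTE_VALID : 0u);

	for (size_t i = first; i < first + count && i < NUM_PTES; i++)
		gtt->pte[i] = scratch_pte;
}

/* Count how many entries currently carry the valid bit. */
static size_t toy_count_valid(const struct toy_gtt *gtt)
{
	size_t n = 0;

	for (size_t i = 0; i < NUM_PTES; i++)
		if (gtt->pte[i] & PTE_VALID)
			n++;
	return n;
}
```

In this model, the patch above corresponds to calling toy_clear_range(&gtt, 0, NUM_PTES, true) on suspend instead of passing false, so that no entry is left invalid.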

Comments

Daniel Vetter March 27, 2014, 6:55 a.m. UTC | #1
On Thu, Mar 27, 2014 at 7:41 AM, Takashi Iwai <tiwai@suse.de> wrote:
>> It apparently blows up on some machines. This functionally reverts
>>
>> commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
>> Author: Ben Widawsky <benjamin.widawsky@intel.com>
>> Date:   Wed Oct 16 09:21:30 2013 -0700
>>
>>     drm/i915: Disable GGTT PTEs on GEN6+ suspend
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
>> Reported-and-Tested-by: Brad  Jackson <bjackson0971@gmail.com>
>> Cc: stable@vger.kernel.org
>> Cc: Takashi Iwai <tiwai@suse.de>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Cc: Todd Previte <tprevite@gmail.com>
>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> The commit was the fix for the memory corruption and lock up at S4 on
> some (Haswell) machines.  This revert will re-introduce the issue
> again very likely.  I'm going to check with the latest tree, but this
> may take some time.
>
> Wouldn't it be safer to revert this conditionally like I suggested in
> comment 10 of the bugzilla entry?

I know that this will blow up, but apparently no one from our side
really bothered to test stuff or work on this. So regression wins and
I've pushed out the revert - this bug has been lingering a bit too
long.
-Daniel

Takashi Iwai March 27, 2014, 7:13 a.m. UTC | #2
At Thu, 27 Mar 2014 07:55:57 +0100,
Daniel Vetter wrote:
> 
> On Thu, Mar 27, 2014 at 7:41 AM, Takashi Iwai <tiwai@suse.de> wrote:
> >> It apparently blows up on some machines. This functionally reverts
> >>
> >> commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
> >> Author: Ben Widawsky <benjamin.widawsky@intel.com>
> >> Date:   Wed Oct 16 09:21:30 2013 -0700
> >>
> >>     drm/i915: Disable GGTT PTEs on GEN6+ suspend
> >>
> >> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
> >> Reported-and-Tested-by: Brad  Jackson <bjackson0971@gmail.com>
> >> Cc: stable@vger.kernel.org
> >> Cc: Takashi Iwai <tiwai@suse.de>
> >> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >> Cc: Todd Previte <tprevite@gmail.com>
> >> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > The commit was the fix for the memory corruption and lock up at S4 on
> > some (Haswell) machines.  This revert will re-introduce the issue
> > again very likely.  I'm going to check with the latest tree, but this
> > may take some time.
> >
> > Wouldn't it be safer to revert this conditionally like I suggested in
> > comment 10 of the bugzilla entry?
> 
> I know that this will blow up, but apparently no one from our side
> really bothered to test stuff or work on this. So regression wins and
> I've pushed out the revert - this bug has been lingering a bit too
> long.

Well, the problem is rather that no one who worked on the HSW S4 fix
(including me) could reproduce the regression on SNB machines.
Have you seen any similar report with gen7 or newer?


Takashi

Takashi Iwai March 27, 2014, 1:18 p.m. UTC | #3
At Thu, 27 Mar 2014 07:41:41 +0100,
Takashi Iwai wrote:
> 
> At Wed, 26 Mar 2014 20:10:09 +0100,
> Daniel Vetter wrote:
> > 
> > It apparently blows up on some machines. This functionally reverts
> > 
> > commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
> > Date:   Wed Oct 16 09:21:30 2013 -0700
> > 
> >     drm/i915: Disable GGTT PTEs on GEN6+ suspend
> > 
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
> > Reported-and-Tested-by: Brad  Jackson <bjackson0971@gmail.com>
> > Cc: stable@vger.kernel.org
> > Cc: Takashi Iwai <tiwai@suse.de>
> > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Cc: Todd Previte <tprevite@gmail.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> The commit was the fix for the memory corruption and lock up at S4 on
> some (Haswell) machines.  This revert will re-introduce the issue
> again very likely.  I'm going to check with the latest tree, but this
> may take some time.

Luckily I found the machine quickly and tested with your patch.
As expected, this revert indeed breaks S4 again on Haswell machines.
A test machine locks up after 4 S4 cycles, while 3.14-rc8 keeps
working after 100 S4 cycles.  Oh well.


Takashi

> 
> Wouldn't it be safer to revert this conditionally like I suggested in
> comment 10 of the bugzilla entry?
> 
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -818,7 +818,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
>         /* Don't bother messing with faults pre GEN6 as we have little
>          * documentation supporting that it's a good idea.
>          */
> -       if (INTEL_INFO(dev)->gen < 6)
> +       if (INTEL_INFO(dev)->gen < 7)
>                 return;
>  
>         i915_check_and_clear_faults(dev);
> 
> thanks,
> 
> Takashi
> 
> 
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 5d61de18ae55..4467974eb53b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -1281,7 +1281,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
> >  	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> >  				       dev_priv->gtt.base.start,
> >  				       dev_priv->gtt.base.total,
> > -				       false);
> > +				       true);
> >  }
> >  
> >  void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > -- 
> > 1.8.5.2
> > 

Patch

--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -818,7 +818,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
        /* Don't bother messing with faults pre GEN6 as we have little
         * documentation supporting that it's a good idea.
         */
-       if (INTEL_INFO(dev)->gen < 6)
+       if (INTEL_INFO(dev)->gen < 7)
                return;
 
        i915_check_and_clear_faults(dev);
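
To illustrate the intent of the conditional variant, here is a small sketch of the early-return pattern, assuming (as discussed in the thread) that SNB reports gen 6 and Haswell reports gen 7. struct toy_device, toy_suspend_gtt_mappings() and the min_gen parameter are hypothetical and only model the control flow; they are not the i915 API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the device: only what the check needs. */
struct toy_device {
	int  gen;             /* hardware generation, e.g. 6 = SNB, 7 = HSW */
	bool ptes_unmapped;   /* set when the suspend path unmaps the PTEs */
};

/*
 * Model of the suspend path: generations below min_gen return early and
 * keep their mappings; the rest get their PTEs unmapped.  The patch above
 * corresponds to raising the cutoff from 6 to 7.
 */
static void toy_suspend_gtt_mappings(struct toy_device *dev, int min_gen)
{
	assert(dev != NULL);

	if (dev->gen < min_gen)
		return;

	dev->ptes_unmapped = true;
}
```

With a cutoff of 7, a gen 6 device is left untouched on suspend while a gen 7 device still has its PTEs unmapped, which is the split the conditional revert aims for: keep the S4 fix on Haswell without touching the machines that regressed.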