
[regression] Re: 4.11-rc0, thinkpad x220: GPU hang

Message ID 20170306121047.GB15063@amd (mailing list archive)
State New, archived

Commit Message

Pavel Machek March 6, 2017, 12:10 p.m. UTC
On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > Hi!
> > 
> > > > mplayer stopped working after a while. Dmesg says:
> > > > 
> > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > 
> > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > try? Bisect will be slow and nasty :-(.
> 
> I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> and under the presumption that your bug matches (as the symptoms do):
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4ffa35faff49..62e31a7438ac 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
>  {
>         struct drm_i915_private *dev_priv = request->i915;
>  
> -       i915_gem_request_submit(request);
> -
>         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
>         I915_WRITE_TAIL(request->engine, request->tail);
> +
> +       i915_gem_request_submit(request);
>  }
>  
>  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)

I applied it as:


Hmm. But your next mail suggests that it may not be smart to try to
boot it? :-).

										Pavel

Comments

Chris Wilson March 6, 2017, 12:23 p.m. UTC | #1
On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
> On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> > On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > > Hi!
> > > 
> > > > > mplayer stopped working after a while. Dmesg says:
> > > > > 
> > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > 
> > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > try? Bisect will be slow and nasty :-(.
> > 
> > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > and under the presumption that your bug matches (as the symptoms do):
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 4ffa35faff49..62e31a7438ac 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> >  {
> >         struct drm_i915_private *dev_priv = request->i915;
> >  
> > -       i915_gem_request_submit(request);
> > -
> >         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> >         I915_WRITE_TAIL(request->engine, request->tail);
> > +
> > +       i915_gem_request_submit(request);
> >  }
> >  
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
> 
> I applied it as:
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 91bc4ab..9c49c7a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
>  {
>  	struct drm_i915_private *dev_priv = request->i915;
>  
> -	i915_gem_request_submit(request);
> -
>  	I915_WRITE_TAIL(request->engine, request->tail);
> +
> +	i915_gem_request_submit(request);
>  }
>  
>  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> 
> Hmm. But your next mail suggests that it may not be smart to try to
> boot it? :-).

Don't bother, it'll promptly hang.
-Chris

Pavel Machek March 21, 2017, 2:02 p.m. UTC | #2
Hi!

> > > > > > mplayer stopped working after a while. Dmesg says:
> > > > > > 
> > > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > > 
> > > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > > try? Bisect will be slow and nasty :-(.
> > > 
> > > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > > and under the presumption that your bug matches (as the symptoms do):
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 4ffa35faff49..62e31a7438ac 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> > >  {
> > >         struct drm_i915_private *dev_priv = request->i915;
> > >  
> > > -       i915_gem_request_submit(request);
> > > -
> > >         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> > >         I915_WRITE_TAIL(request->engine, request->tail);
> > > +
> > > +       i915_gem_request_submit(request);
> > >  }
> > >  
> > >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
> > 
> > I applied it as:
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 91bc4ab..9c49c7a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> >  {
> >  	struct drm_i915_private *dev_priv = request->i915;
> >  
> > -	i915_gem_request_submit(request);
> > -
> >  	I915_WRITE_TAIL(request->engine, request->tail);
> > +
> > +	i915_gem_request_submit(request);
> >  }
> >  
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> > 
> > Hmm. But your next mail suggests that it may not be smart to try to
> > boot it? :-).
> 
> Don't bother, it'll promptly hang.

Any news here?

Is there something I can revert to get back to a working system?

Thanks,
									Pavel

Pavel Machek March 25, 2017, 9:33 p.m. UTC | #3
On Mon 2017-03-06 12:23:41, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
> > On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> > > On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > > > > mplayer stopped working after a while. Dmesg says:
> > > > > > 
> > > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > > 
> > > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > > try? Bisect will be slow and nasty :-(.
> > > 
> > > > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > > and under the presumption that your bug matches (as the symptoms do):
> > > 
...
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> > 
> > Hmm. But your next mail suggests that it may not be smart to try to
> > boot it? :-).
> 
> Don't bother, it'll promptly hang.

Any news here? Is there a chance this is fixed in -rc4?
									Pavel

Pavel Machek April 9, 2017, 10:33 a.m. UTC | #4
On Mon 2017-03-06 12:23:41, Chris Wilson wrote:
> On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
> > On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
> > > On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > > > > mplayer stopped working after a while. Dmesg says:
> > > > > > 
> > > > > > [ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
> > > > 
> > > > Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to
> > > > try? Bisect will be slow and nasty :-(.
> > > 
> > > > I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL,
> > > and under the presumption that your bug matches (as the symptoms do):
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 4ffa35faff49..62e31a7438ac 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> > >  {
> > >         struct drm_i915_private *dev_priv = request->i915;
> > >  
> > > -       i915_gem_request_submit(request);
> > > -
> > >         GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> > >         I915_WRITE_TAIL(request->engine, request->tail);
> > > +
> > > +       i915_gem_request_submit(request);
> > >  }
> > >  
> > >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
> > 
> > I applied it as:
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 91bc4ab..9c49c7a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
> >  {
> >  	struct drm_i915_private *dev_priv = request->i915;
> >  
> > -	i915_gem_request_submit(request);
> > -
> >  	I915_WRITE_TAIL(request->engine, request->tail);
> > +
> > +	i915_gem_request_submit(request);
> >  }
> >  
> >  static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
> > 
> > Hmm. But your next mail suggests that it may not be smart to try to
> > boot it? :-).
> 
> Don't bother, it'll promptly hang.

Any news here? 4.11-rc5 is actually usable on the hardware (unlike
-rc1), not sure what changed.

Patch

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 91bc4ab..9c49c7a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1338,9 +1338,9 @@  static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
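
For readers unfamiliar with the phrase "ring HEAD overtaking the TAIL" used in
the thread, here is a minimal, generic sketch of the invariant in question. It
is not i915 code: struct ring_model, RING_SIZE and the ring_* helpers are
hypothetical names made up for illustration, and the sketch assumes a simple
model in which the driver advances the tail and the GPU advances the head. It
says nothing about whether the reordering in the patch is correct (Chris
expected the patched kernel to hang).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4096u /* hypothetical size; must be a power of two */

/*
 * Hypothetical model of a command ring: the producer (driver) advances
 * tail, the consumer (GPU) advances head. Offsets wrap at RING_SIZE.
 */
struct ring_model {
	uint32_t head; /* next byte the consumer will read */
	uint32_t tail; /* one past the last byte the producer wrote */
};

/* Bytes the producer has written that the consumer has not yet read. */
static uint32_t ring_pending(const struct ring_model *r)
{
	return (r->tail - r->head) & (RING_SIZE - 1);
}

/*
 * Publish a new tail. The 8-byte alignment assertion mirrors the
 * GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)) check quoted in the thread.
 */
static void ring_publish_tail(struct ring_model *r, uint32_t new_tail)
{
	assert((new_tail & 7u) == 0);
	r->tail = new_tail & (RING_SIZE - 1);
}

/*
 * Advance head by n bytes. Returns false and leaves head untouched if
 * that would move head past tail, i.e. the consumer would start reading
 * bytes the producer never published.
 */
static bool ring_consume(struct ring_model *r, uint32_t n)
{
	if (n > ring_pending(r))
		return false;
	r->head = (r->head + n) & (RING_SIZE - 1);
	return true;
}
```

In this model the consumer may only read up to the published tail. Real
hardware has no such software guard, which is why a head that runs past the
tail shows up as a hang rather than as a refused read.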