Message ID | 1376351584-733-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Aug 12, 2013 at 04:53:03PM -0700, Ben Widawsky wrote: > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: lu hua <huax.lu@intel.com> (both bug reports) Please pimp the commit message as discussion on irc (reply with just the text is ok). Strictly speaking we initialize the 3rd mbox for the 4th ring here (which ofc is the vebox one for all rings != vebox). And also please mention the effects of this bug, i.e. how it blows up. Thanks, Daniel > --- > drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 10c2aaa..665602f 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -1592,6 +1592,8 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno) > if (INTEL_INFO(ring->dev)->gen >= 6) { > I915_WRITE(RING_SYNC_0(ring->mmio_base), 0); > I915_WRITE(RING_SYNC_1(ring->mmio_base), 0); > + if (HAS_VEBOX(ring->dev)) > + I915_WRITE(RING_SYNC_2(ring->mmio_base), 0); > } > > ring->set_seqno(ring, seqno); > -- > 1.8.3.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Mon, Aug 12, 2013 at 04:53:03PM -0700, Ben Widawsky wrote: > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 10c2aaa..665602f 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -1592,6 +1592,8 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno) > if (INTEL_INFO(ring->dev)->gen >= 6) { > I915_WRITE(RING_SYNC_0(ring->mmio_base), 0); > I915_WRITE(RING_SYNC_1(ring->mmio_base), 0); > + if (HAS_VEBOX(ring->dev)) > + I915_WRITE(RING_SYNC_2(ring->mmio_base), 0); > } > > ring->set_seqno(ring, seqno); We require n-1 mailboxes for proper semaphore synchronization. All semaphore synchronization code relies on proper values in these mailboxes. The fact that we failed to touch the vebox ring by itself was unlikely to be an issue since the HW should be initializing the values to 0. However the error framework for testing seqno wrap introduced by Mika, in addition to the hangcheck via seqno, and i915_error_first_batchbuffer() combined caused a nice explosion. The problem is caused by seqno wrap because the wrap condition is not properly setup. The wrap code attempts to set the sync mailboxes all to 0, and then set the current seqno to one less than 0. In all cases, the vebox mailbox wasn't properly being initialized. This caused a wrap to not occur. When hangcheck kicks in with the bogus seqno values, the rest just doesn't work. It makes me wonder if we shouldn't consider a dumber version of hangcheck... How we messed this up: VECS support was written before the aforementioned other features. Upon VECS being rebased, these facts were missed.
On Tue, Aug 13, 2013 at 01:55:36PM -0700, Ben Widawsky wrote: > On Mon, Aug 12, 2013 at 04:53:03PM -0700, Ben Widawsky wrote: > > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 10c2aaa..665602f 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -1592,6 +1592,8 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno) > > if (INTEL_INFO(ring->dev)->gen >= 6) { > > I915_WRITE(RING_SYNC_0(ring->mmio_base), 0); > > I915_WRITE(RING_SYNC_1(ring->mmio_base), 0); > > + if (HAS_VEBOX(ring->dev)) > > + I915_WRITE(RING_SYNC_2(ring->mmio_base), 0); > > } > > > > ring->set_seqno(ring, seqno); > > We require n-1 mailboxes for proper semaphore synchronization. All > semaphore synchronization code relies on proper values in these > mailboxes. The fact that we failed to touch the vebox ring by itself was > unlikely to be an issue since the HW should be initializing the values > to 0. However the error framework for testing seqno wrap introduced by > Mika, in addition to the hangcheck via seqno, and > i915_error_first_batchbuffer() combined caused a nice explosion. > > The problem is caused by seqno wrap because the wrap condition is not > properly setup. The wrap code attempts to set the sync mailboxes all to > 0, and then set the current seqno to one less than 0. In all cases, the > vebox mailbox wasn't properly being initialized. This caused a wrap to > not occur. When hangcheck kicks in with the bogus seqno values, the rest > just doesn't work. It makes me wonder if we shouldn't consider a dumber > version of hangcheck... > > How we messed this up: > VECS support was written before the aforementioned other features. Upon > VECS being rebased, these facts were missed. Both patches applied, thanks. -Daniel
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 10c2aaa..665602f 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1592,6 +1592,8 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno) if (INTEL_INFO(ring->dev)->gen >= 6) { I915_WRITE(RING_SYNC_0(ring->mmio_base), 0); I915_WRITE(RING_SYNC_1(ring->mmio_base), 0); + if (HAS_VEBOX(ring->dev)) + I915_WRITE(RING_SYNC_2(ring->mmio_base), 0); } ring->set_seqno(ring, seqno);
Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++ 1 file changed, 2 insertions(+)