Message ID | 1434741369-28932-1-git-send-email-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
On Fri, Jun 19, 2015 at 09:16:09PM +0200, Daniel Vetter wrote:
> We've never figured out the magic trick to make irq vs. seqno
> updates coherent, only tricks to make it work. And since
>
> commit 094f9a54e35500739da185cdb78f2e92fc379458
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Wed Sep 25 17:34:55 2013 +0100
>
>     drm/i915: Fix __wait_seqno to use true infinite timeouts
>
> we automatically fall back to an irq augmented with polling scheme
> after the first missed interrupt. There's really nothing else we can
> do, hence tune down the message to informational level. It's still
> useful for users in case it reliably precedes a hard system hang.
>
> v2: Use NOTICE since it might be of value for bug reports (Chris).
>
> Cc: Mark Janes <mark.a.janes@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Now all we need to do is save the GPU state to the pstore in the
picoseconds before a hard hang, and we'll be sorted.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
On Fri, 19 Jun 2015, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> We've never figured out the magic trick to make irq vs. seqno
> updates coherent, only tricks to make it work. And since
>
> commit 094f9a54e35500739da185cdb78f2e92fc379458
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Wed Sep 25 17:34:55 2013 +0100
>
>     drm/i915: Fix __wait_seqno to use true infinite timeouts
>
> we automatically fall back to an irq augmented with polling scheme
> after the first missed interrupt. There's really nothing else we can
> do, hence tune down the message to informational level. It's still
> useful for users in case it reliably precedes a hard system hang.
>
> v2: Use NOTICE since it might be of value for bug reports (Chris).
>
> Cc: Mark Janes <mark.a.janes@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index e6bb72dca3ff..5072fb49367e 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2946,8 +2946,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>  					/* Issue a wake-up to catch stuck h/w. */
>  					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
>  						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
> -							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
> -								  ring->name);
> +							DRM_NOTICE("Hangcheck timer elapsed... %s idle\n",

drivers/gpu/drm/i915/i915_irq.c: In function ‘i915_hangcheck_elapsed’:
drivers/gpu/drm/i915/i915_irq.c:2949:8: error: implicit declaration of function ‘DRM_NOTICE’ [-Werror=implicit-function-declaration]
       DRM_NOTICE("Hangcheck timer elapsed... %s idle\n",
       ^

BR,
Jani.

> +								   ring->name);
>  						else
>  							DRM_INFO("Fake missed irq on %s\n",
>  								 ring->name);
> --
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Tue, Jun 23, 2015 at 01:05:41PM +0300, Jani Nikula wrote:
> On Fri, 19 Jun 2015, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > We've never figured out the magic trick to make irq vs. seqno
> > updates coherent, only tricks to make it work. And since
> >
> > commit 094f9a54e35500739da185cdb78f2e92fc379458
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date: Wed Sep 25 17:34:55 2013 +0100
> >
> >     drm/i915: Fix __wait_seqno to use true infinite timeouts
> >
> > we automatically fall back to an irq augmented with polling scheme
> > after the first missed interrupt. There's really nothing else we can
> > do, hence tune down the message to informational level. It's still
> > useful for users in case it reliably precedes a hard system hang.
> >
> > v2: Use NOTICE since it might be of value for bug reports (Chris).
> >
> > Cc: Mark Janes <mark.a.janes@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index e6bb72dca3ff..5072fb49367e 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -2946,8 +2946,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> >  					/* Issue a wake-up to catch stuck h/w. */
> >  					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
> >  						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
> > -							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
> > -								  ring->name);
> > +							DRM_NOTICE("Hangcheck timer elapsed... %s idle\n",
>
> drivers/gpu/drm/i915/i915_irq.c: In function ‘i915_hangcheck_elapsed’:
> drivers/gpu/drm/i915/i915_irq.c:2949:8: error: implicit declaration of function ‘DRM_NOTICE’ [-Werror=implicit-function-declaration]
>        DRM_NOTICE("Hangcheck timer elapsed... %s idle\n",
>        ^

Embarrassing. Can you pick up v1 instead please?
-Daniel
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e6bb72dca3ff..5072fb49367e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2946,8 +2946,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 					/* Issue a wake-up to catch stuck h/w. */
 					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
 						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
-							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
-								  ring->name);
+							DRM_NOTICE("Hangcheck timer elapsed... %s idle\n",
+								   ring->name);
 						else
 							DRM_INFO("Fake missed irq on %s\n",
 								 ring->name);
We've never figured out the magic trick to make irq vs. seqno
updates coherent, only tricks to make it work. And since

commit 094f9a54e35500739da185cdb78f2e92fc379458
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Sep 25 17:34:55 2013 +0100

    drm/i915: Fix __wait_seqno to use true infinite timeouts

we automatically fall back to an irq augmented with polling scheme
after the first missed interrupt. There's really nothing else we can
do, hence tune down the message to informational level. It's still
useful for users in case it reliably precedes a hard system hang.

v2: Use NOTICE since it might be of value for bug reports (Chris).

Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
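
For readers unfamiliar with the fallback the commit message describes, here is a rough user-space sketch of the idea: once a periodic check finds a waiter asleep on work that has already completed, the missed interrupt is recorded once, reported at low severity, and later waits poll in addition to sleeping on the irq. This is not the i915 code. `struct ring_sketch`, `hangcheck_sketch()`, `should_poll()` and `seqno_passed()` are hypothetical names invented for illustration, and `fprintf()` stands in for the kernel log macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for per-ring state; not the i915 structures. */
struct ring_sketch {
	const char *name;
	uint32_t hw_seqno;	/* last seqno the "hardware" reported */
	bool missed_irq;	/* set once, after the first missed interrupt */
	int notices_logged;	/* how many times the user was told */
};

/* Wrap-safe check: has the hardware reached the target seqno? */
static bool seqno_passed(uint32_t hw, uint32_t target)
{
	return (int32_t)(hw - target) >= 0;
}

/*
 * Periodic check run while a waiter is still asleep on waiting_for.
 * Returns true if the waiter should be woken because the work is done
 * but no interrupt arrived. The message is emitted only the first time.
 */
static bool hangcheck_sketch(struct ring_sketch *ring, uint32_t waiting_for)
{
	assert(ring != NULL);

	if (!seqno_passed(ring->hw_seqno, waiting_for))
		return false;	/* still busy, nothing was missed */

	if (!ring->missed_irq) {
		ring->missed_irq = true;
		ring->notices_logged++;
		/* Low severity: the driver recovers by polling from now on. */
		fprintf(stderr, "notice: hangcheck timer elapsed... %s idle\n",
			ring->name);
	}
	return true;
}

/* Later waits on this ring poll as well as sleeping on the irq. */
static bool should_poll(const struct ring_sketch *ring)
{
	return ring->missed_irq;
}
```

The one-shot flag plays the role that `test_and_set_bit()` on `missed_irq_rings` plays in the patch context: the message appears at most once per ring, which is part of why a lower log level is enough.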