Message ID | 20180426021009.178880-1-tarun.vyas@intel.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Apr 25, 2018 at 07:10:09PM -0700, tarun.vyas@intel.com wrote:
> From: Tarun <tarun.vyas@intel.com>
>
> The Display scanline counter freezes on PSR entry. Inside
> intel_pipe_update_start, once Vblank interrupts are enabled, we start
> exiting PSR, but by the time the scanline counter is read, we may not
> have completely exited PSR which leads us to schedule out and check back
> later.
> On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> lags w.r.t core kernel code, hot plugging an external display triggers
> tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> closer analysis reveals that we try to read the scanline 3 times and
> eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL stuck @
> 1599.
> This issue is not seen on upstream kernels, b/c for *some* reason we
> loop inside intel_pipe_update start for ~2+ msec which in this case is
> more than enough to exit PSR fully, hence an *unstuck* PIPEDSL counter,
> hence no error. On the other hand, the ChromeOS kernel spends ~1.1 msec
> looping inside intel_pipe_update_start and hence errors out b/c the
> source is still in PSR.
>
> If PSR is enabled, then we should *wait* for the PSR
> state to move to IDLE before re-reading the PIPEDSL so as to avoid bogus
> and annoying "potential atomic update error" messages.
>
> P.S: This scenario applies to a configuration with an additional pipe,
> as of now.
>
> Signed-off-by: Tarun <tarun.vyas@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_sprite.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index aa1dfaa692b9..77dd3b936131 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -92,11 +92,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
>  	const struct drm_display_mode *adjusted_mode = &new_crtc_state->base.adjusted_mode;
>  	long timeout = msecs_to_jiffies_timeout(1);
> -	int scanline, min, max, vblank_start;
> +	int scanline, min, max, vblank_start, old_scanline, new_scanline;
> +	bool retried = false;
>  	wait_queue_head_t *wq = drm_crtc_vblank_waitqueue(&crtc->base);
>  	bool need_vlv_dsi_wa = (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) &&
>  		intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI);
>  	DEFINE_WAIT(wait);
> +	old_scanline = new_scanline = -1;
>
>  	vblank_start = adjusted_mode->crtc_vblank_start;
>  	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
> @@ -126,15 +128,24 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  		 * read the scanline.
>  		 */
>  		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> -
> +retry:
>  		scanline = intel_get_crtc_scanline(crtc);
> +		old_scanline = new_scanline, new_scanline = scanline;
> +
>  		if (scanline < min || scanline > max)
>  			break;
>
>  		if (timeout <= 0) {
> -			DRM_ERROR("Potential atomic update failure on pipe %c\n",
> +			if(!i915.enable_psr || retried) {
> +				DRM_ERROR("Potential atomic update failure on pipe %c\n",
>  				  pipe_name(crtc->pipe));
> -			break;
> +				break;
> +			}
> +			else if(old_scanline == new_scanline && !retried) {
> +				retried = true;
> +				intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 10);

What's the point of obfuscating the loop with this stuff?
Just wait for the PSR exit before we even enter the loop?

> +				goto retry;
> +			}
>  		}
>
>  		local_irq_enable();
> --
> 2.13.5
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

On Thu, 2018-04-26 at 16:41 +0300, Ville Syrjälä wrote:
> On Wed, Apr 25, 2018 at 07:10:09PM -0700, tarun.vyas@intel.com wrote:
> > From: Tarun <tarun.vyas@intel.com>
> >
> > The Display scanline counter freezes on PSR entry. Inside
> > intel_pipe_update_start, once Vblank interrupts are enabled, we start
> > exiting PSR, but by the time the scanline counter is read, we may not
> > have completely exited PSR which leads us to schedule out and check back
> > later.
> > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > lags w.r.t core kernel code, hot plugging an external display triggers
> > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > closer analysis reveals that we try to read the scanline 3 times and
> > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL stuck @
> > 1599.
> > This issue is not seen on upstream kernels, b/c for *some* reason we
> > loop inside intel_pipe_update start for ~2+ msec which in this case is
> > more than enough to exit PSR fully, hence an *unstuck* PIPEDSL counter,
> > hence no error. On the other hand, the ChromeOS kernel spends ~1.1 msec
> > looping inside intel_pipe_update_start and hence errors out b/c the
> > source is still in PSR.
> >
> > If PSR is enabled, then we should *wait* for the PSR
> > state to move to IDLE before re-reading the PIPEDSL so as to avoid bogus
> > and annoying "potential atomic update error" messages.
> >
> > P.S: This scenario applies to a configuration with an additional pipe,
> > as of now.
> >

Ville,

Any idea what could be the reason the warnings start appearing when an
external display is connected? We couldn't come up with an explanation.

> > Signed-off-by: Tarun <tarun.vyas@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_sprite.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> > index aa1dfaa692b9..77dd3b936131 100644
> > --- a/drivers/gpu/drm/i915/intel_sprite.c
> > +++ b/drivers/gpu/drm/i915/intel_sprite.c
> > @@ -92,11 +92,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> >  	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> >  	const struct drm_display_mode *adjusted_mode = &new_crtc_state->base.adjusted_mode;
> >  	long timeout = msecs_to_jiffies_timeout(1);
> > -	int scanline, min, max, vblank_start;
> > +	int scanline, min, max, vblank_start, old_scanline, new_scanline;
> > +	bool retried = false;
> >  	wait_queue_head_t *wq = drm_crtc_vblank_waitqueue(&crtc->base);
> >  	bool need_vlv_dsi_wa = (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) &&
> >  		intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI);
> >  	DEFINE_WAIT(wait);
> > +	old_scanline = new_scanline = -1;
> >
> >  	vblank_start = adjusted_mode->crtc_vblank_start;
> >  	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
> > @@ -126,15 +128,24 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> >  		 * read the scanline.
> >  		 */
> >  		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > -
> > +retry:
> >  		scanline = intel_get_crtc_scanline(crtc);
> > +		old_scanline = new_scanline, new_scanline = scanline;
> > +
> >  		if (scanline < min || scanline > max)
> >  			break;
> >
> >  		if (timeout <= 0) {
> > -			DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > +			if(!i915.enable_psr || retried) {

You could use the CAN_PSR() macro that checks for source and sink
support.

> > +				DRM_ERROR("Potential atomic update failure on pipe %c\n",
> >  				  pipe_name(crtc->pipe));
> > -			break;
> > +				break;
> > +			}
> > +			else if(old_scanline == new_scanline && !retried) {
> > +				retried = true;
> > +				intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 10);
>
> What's the point of obfuscating the loop with this stuff?
> Just wait for the PSR exit before we even enter the loop?
>
> > +				goto retry;
> > +			}
> >  		}
> >
> >  		local_irq_enable();
> > --
> > 2.13.5
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

On Thu, Apr 26, 2018 at 02:39:04PM -0700, Tarun Vyas wrote:
> On Thu, Apr 26, 2018 at 10:47:40AM -0700, Dhinakaran Pandiyan wrote:
> >
> > On Thu, 2018-04-26 at 16:41 +0300, Ville Syrjälä wrote:
> > > On Wed, Apr 25, 2018 at 07:10:09PM -0700, tarun.vyas@intel.com wrote:
> > > > From: Tarun <tarun.vyas@intel.com>
> > > >
> > > > The Display scanline counter freezes on PSR entry. Inside
> > > > intel_pipe_update_start, once Vblank interrupts are enabled, we start
> > > > exiting PSR, but by the time the scanline counter is read, we may not
> > > > have completely exited PSR which leads us to schedule out and check back
> > > > later.
> > > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > > closer analysis reveals that we try to read the scanline 3 times and
> > > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL stuck @
> > > > 1599.
> > > > This issue is not seen on upstream kernels, b/c for *some* reason we
> > > > loop inside intel_pipe_update start for ~2+ msec which in this case is
> > > > more than enough to exit PSR fully, hence an *unstuck* PIPEDSL counter,
> > > > hence no error. On the other hand, the ChromeOS kernel spends ~1.1 msec
> > > > looping inside intel_pipe_update_start and hence errors out b/c the
> > > > source is still in PSR.
> > > >
> > > > If PSR is enabled, then we should *wait* for the PSR
> > > > state to move to IDLE before re-reading the PIPEDSL so as to avoid bogus
> > > > and annoying "potential atomic update error" messages.
> > > >
> > > > P.S: This scenario applies to a configuration with an additional pipe,
> > > > as of now.
> > > >
> >
> > Ville,
> >
> > Any idea what could be the reason the warnings start appearing when an
> > external display is connected? We couldn't come up with an explanation.
> >
> Another source of confusion for me is that on the upstream kernels, it
> *appears* to take more time for us to get *re-scheduled* after we call
> schedule_timeout(). So with ~2+msec spent in the loop, it seems to be
> not working as intended b/c we end up spending a lot more time in the
> loop, which in turn contributes to this issue not being seen on upstream
> kernels.
> >
> > > > Signed-off-by: Tarun <tarun.vyas@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_sprite.c | 19 +++++++++++++++----
> > > >  1 file changed, 15 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> > > > index aa1dfaa692b9..77dd3b936131 100644
> > > > --- a/drivers/gpu/drm/i915/intel_sprite.c
> > > > +++ b/drivers/gpu/drm/i915/intel_sprite.c
> > > > @@ -92,11 +92,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> > > >  	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> > > >  	const struct drm_display_mode *adjusted_mode = &new_crtc_state->base.adjusted_mode;
> > > >  	long timeout = msecs_to_jiffies_timeout(1);
> > > > -	int scanline, min, max, vblank_start;
> > > > +	int scanline, min, max, vblank_start, old_scanline, new_scanline;
> > > > +	bool retried = false;
> > > >  	wait_queue_head_t *wq = drm_crtc_vblank_waitqueue(&crtc->base);
> > > >  	bool need_vlv_dsi_wa = (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) &&
> > > >  		intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI);
> > > >  	DEFINE_WAIT(wait);
> > > > +	old_scanline = new_scanline = -1;
> > > >
> > > >  	vblank_start = adjusted_mode->crtc_vblank_start;
> > > >  	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
> > > > @@ -126,15 +128,24 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> > > >  		 * read the scanline.
> > > >  		 */
> > > >  		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > > > -
> > > > +retry:
> > > >  		scanline = intel_get_crtc_scanline(crtc);
> > > > +		old_scanline = new_scanline, new_scanline = scanline;
> > > > +
> > > >  		if (scanline < min || scanline > max)
> > > >  			break;
> > > >
> > > >  		if (timeout <= 0) {
> > > > -			DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > > +			if(!i915.enable_psr || retried) {
> >
> > You could use the CAN_PSR() macro that checks for source and sink
> > support.
> >
> Will do.
> > > > +				DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > >  				  pipe_name(crtc->pipe));
> > > > -			break;
> > > > +				break;
> > > > +			}
> > > > +			else if(old_scanline == new_scanline && !retried) {
> > > > +				retried = true;
> > > > +				intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 10);
> > >
> > > What's the point of obfuscating the loop with this stuff?
> > > Just wait for the PSR exit before we even enter the loop?
> > >
> Agreed.

On a second thought, I was doing it wrong in the initial RFC. Can't do a
wait_for_register with irqs disabled by local_irq_disable(). So, will
have to *poll* the PSR_STATE, but will that be desirable ?

> > > > +				goto retry;
> > > > +			}
> > > >  		}
> > > >
> > > >  		local_irq_enable();
> > > > --
> > > > 2.13.5
> > > >
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > >
> >

On Thu, Apr 26, 2018 at 08:09:56PM -0700, Tarun Vyas wrote:
> On Thu, Apr 26, 2018 at 02:39:04PM -0700, Tarun Vyas wrote:
> > On Thu, Apr 26, 2018 at 10:47:40AM -0700, Dhinakaran Pandiyan wrote:
> > >
> > > On Thu, 2018-04-26 at 16:41 +0300, Ville Syrjälä wrote:
> > > > On Wed, Apr 25, 2018 at 07:10:09PM -0700, tarun.vyas@intel.com wrote:
> > > > > From: Tarun <tarun.vyas@intel.com>
> > > > >
> > > > > The Display scanline counter freezes on PSR entry. Inside
> > > > > intel_pipe_update_start, once Vblank interrupts are enabled, we start
> > > > > exiting PSR, but by the time the scanline counter is read, we may not
> > > > > have completely exited PSR which leads us to schedule out and check back
> > > > > later.
> > > > > On ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t drm/i915 but
> > > > > lags w.r.t core kernel code, hot plugging an external display triggers
> > > > > tons of "potential atomic update errors" in the dmesg, on *pipe A*. A
> > > > > closer analysis reveals that we try to read the scanline 3 times and
> > > > > eventually timeout, b/c PSR hasn't exited fully leading to a PIPEDSL stuck @
> > > > > 1599.
> > > > > This issue is not seen on upstream kernels, b/c for *some* reason we
> > > > > loop inside intel_pipe_update start for ~2+ msec which in this case is
> > > > > more than enough to exit PSR fully, hence an *unstuck* PIPEDSL counter,
> > > > > hence no error. On the other hand, the ChromeOS kernel spends ~1.1 msec
> > > > > looping inside intel_pipe_update_start and hence errors out b/c the
> > > > > source is still in PSR.
> > > > >
> > > > > If PSR is enabled, then we should *wait* for the PSR
> > > > > state to move to IDLE before re-reading the PIPEDSL so as to avoid bogus
> > > > > and annoying "potential atomic update error" messages.
> > > > >
> > > > > P.S: This scenario applies to a configuration with an additional pipe,
> > > > > as of now.
> > > > >
> > >
> > > Ville,
> > >
> > > Any idea what could be the reason the warnings start appearing when an
> > > external display is connected? We couldn't come up with an explanation.
> > >
> > Another source of confusion for me is that on the upstream kernels, it
> > *appears* to take more time for us to get *re-scheduled* after we call
> > schedule_timeout(). So with ~2+msec spent in the loop, it seems to be
> > not working as intended b/c we end up spending a lot more time in the
> > loop, which in turn contributes to this issue not being seen on upstream
> > kernels.
> > >
> > > > > Signed-off-by: Tarun <tarun.vyas@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/intel_sprite.c | 19 +++++++++++++++----
> > > > >  1 file changed, 15 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> > > > > index aa1dfaa692b9..77dd3b936131 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_sprite.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_sprite.c
> > > > > @@ -92,11 +92,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> > > > >  	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> > > > >  	const struct drm_display_mode *adjusted_mode = &new_crtc_state->base.adjusted_mode;
> > > > >  	long timeout = msecs_to_jiffies_timeout(1);
> > > > > -	int scanline, min, max, vblank_start;
> > > > > +	int scanline, min, max, vblank_start, old_scanline, new_scanline;
> > > > > +	bool retried = false;
> > > > >  	wait_queue_head_t *wq = drm_crtc_vblank_waitqueue(&crtc->base);
> > > > >  	bool need_vlv_dsi_wa = (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) &&
> > > > >  		intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI);
> > > > >  	DEFINE_WAIT(wait);
> > > > > +	old_scanline = new_scanline = -1;
> > > > >
> > > > >  	vblank_start = adjusted_mode->crtc_vblank_start;
> > > > >  	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
> > > > > @@ -126,15 +128,24 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
> > > > >  		 * read the scanline.
> > > > >  		 */
> > > > >  		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > > > > -
> > > > > +retry:
> > > > >  		scanline = intel_get_crtc_scanline(crtc);
> > > > > +		old_scanline = new_scanline, new_scanline = scanline;
> > > > > +
> > > > >  		if (scanline < min || scanline > max)
> > > > >  			break;
> > > > >
> > > > >  		if (timeout <= 0) {
> > > > > -			DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > > > +			if(!i915.enable_psr || retried) {
> > >
> > > You could use the CAN_PSR() macro that checks for source and sink
> > > support.
> > >
> > Will do.
> > > > > +				DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > > >  				  pipe_name(crtc->pipe));
> > > > > -			break;
> > > > > +				break;
> > > > > +			}
> > > > > +			else if(old_scanline == new_scanline && !retried) {
> > > > > +				retried = true;
> > > > > +				intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 10);
> > > >
> > > > What's the point of obfuscating the loop with this stuff?
> > > > Just wait for the PSR exit before we even enter the loop?
> > > >
> > Agreed.
> On a second thought, I was doing it wrong in the initial RFC. Can't do a
> wait_for_register with irqs disabled by local_irq_disable(). So, will
> have to *poll* the PSR_STATE, but will that be desirable ?

Do it before disabling the irqs? As long as we prevent it from
re-entering PSR after the wait it should be safe. Maybe the vblank irq
is the best way to prevent the re-entry?

> > > > > +				goto retry;
> > > > > +			}
> > > > >  		}
> > > > >
> > > > >  		local_irq_enable();
> > > > > --
> > > > > 2.13.5
> > > > >
> > > > > _______________________________________________
> > > > > Intel-gfx mailing list
> > > > > Intel-gfx@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > >
> > >

Quoting Ville Syrjälä (2018-04-27 13:41:42)
> On Thu, Apr 26, 2018 at 08:09:56PM -0700, Tarun Vyas wrote:
> > On a second thought, I was doing it wrong in the initial RFC. Can't do a
> > wait_for_register with irqs disabled by local_irq_disable(). So, will
> > have to *poll* the PSR_STATE, but will that be desirable ?
>
> Do it before disabling the irqs? As long as we prevent it from
> re-entering PSR after the wait it should be safe. Maybe the vblank irq
> is the best way to prevent the re-entry?

There's also an atomic variant of wait_for_register. But if we don't need
to wait with irqs off, don't.
-Chris

diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index aa1dfaa692b9..77dd3b936131 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -92,11 +92,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	const struct drm_display_mode *adjusted_mode = &new_crtc_state->base.adjusted_mode;
 	long timeout = msecs_to_jiffies_timeout(1);
-	int scanline, min, max, vblank_start;
+	int scanline, min, max, vblank_start, old_scanline, new_scanline;
+	bool retried = false;
 	wait_queue_head_t *wq = drm_crtc_vblank_waitqueue(&crtc->base);
 	bool need_vlv_dsi_wa = (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) &&
 		intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI);
 	DEFINE_WAIT(wait);
+	old_scanline = new_scanline = -1;

 	vblank_start = adjusted_mode->crtc_vblank_start;
 	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
@@ -126,15 +128,24 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 		 * read the scanline.
 		 */
 		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
-
+retry:
 		scanline = intel_get_crtc_scanline(crtc);
+		old_scanline = new_scanline, new_scanline = scanline;
+
 		if (scanline < min || scanline > max)
 			break;

 		if (timeout <= 0) {
-			DRM_ERROR("Potential atomic update failure on pipe %c\n",
+			if(!i915.enable_psr || retried) {
+				DRM_ERROR("Potential atomic update failure on pipe %c\n",
 				  pipe_name(crtc->pipe));
-			break;
+				break;
+			}
+			else if(old_scanline == new_scanline && !retried) {
+				retried = true;
+				intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL, EDP_PSR_STATUS_STATE_MASK, EDP_PSR_STATUS_STATE_IDLE, 10);
+				goto retry;
+			}
 		}

 		local_irq_enable();
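
The ordering suggested in the review replies in this thread (wait for PSR to report idle first, then run the scanline evasion loop) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not i915 code: `struct fake_pipe`, `read_psr_state()`, `read_scanline()`, `wait_for_psr_idle()`, `evade_vblank()` and `pipe_update_start()` are all hypothetical names invented for the illustration, and the model assumes that each register read takes one unit of time and that the scanline counter stays frozen while PSR is active, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are i915 APIs. */
enum psr_state { PSR_IDLE = 0, PSR_ACTIVE = 1 };

struct fake_pipe {
	int polls_until_idle;	/* status reads left before PSR reports idle */
	int scanline;		/* current scanline, frozen while PSR is active */
	int vtotal;		/* lines per frame, the counter wraps here */
};

/* Models one read of the PSR status register. */
static enum psr_state read_psr_state(struct fake_pipe *p)
{
	if (p->polls_until_idle > 0) {
		p->polls_until_idle--;
		return PSR_ACTIVE;
	}
	return PSR_IDLE;
}

/* Models one read of the scanline counter; it advances only once PSR is idle. */
static int read_scanline(struct fake_pipe *p)
{
	if (p->polls_until_idle == 0)
		p->scanline = (p->scanline + 1) % p->vtotal;
	return p->scanline;
}

/* Step 1: bounded wait for PSR idle. Returns true on idle, false on timeout. */
static bool wait_for_psr_idle(struct fake_pipe *p, int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++)
		if (read_psr_state(p) == PSR_IDLE)
			return true;
	return false;
}

/*
 * Step 2: the evasion loop. Succeeds as soon as the scanline is outside
 * [min, max], mirroring the "scanline < min || scanline > max" check in
 * the patch; fails if max_reads reads are used up first.
 */
static bool evade_vblank(struct fake_pipe *p, int min, int max, int max_reads)
{
	for (int i = 0; i < max_reads; i++) {
		int scanline = read_scanline(p);

		if (scanline < min || scanline > max)
			return true;
	}
	return false;
}

/* The PSR wait happens once, up front, so the loop itself stays unchanged. */
static bool pipe_update_start(struct fake_pipe *p, int min, int max,
			      int max_psr_polls, int max_reads)
{
	if (!wait_for_psr_idle(p, max_psr_polls))
		return false;
	return evade_vblank(p, min, max, max_reads);
}
```

With a pipe frozen at scanline 1599 inside a window of 1590 to 1600, calling `evade_vblank()` alone with three reads fails because the counter never moves, which mirrors the reported error. Calling `pipe_update_start()` on the same starting state succeeds, because the counter is running again by the time the loop reads it. The model does not cover the other point raised in the replies, namely that PSR must be kept from re-entering between the wait and the loop.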