Message ID | 1397587299-2476-2-git-send-email-ville.syrjala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
needs a small rebase but everything looks good for me so Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> On Tue, Apr 15, 2014 at 11:41 AM, <ville.syrjala@linux.intel.com> wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Starting from ILK, mmio flips also cause a flip done interrupt to be > signalled. This means if we first do a set_base and follow it > immediately with the CS flip, we might mistake the flip done interrupt > caused by the set_base as the flip done interrupt caused by the CS > flip. > > The hardware has a flip counter which increments every time a mmio or > CS flip is issued. It basically counts the number of DSPSURF register > writes. This means we can sample the counter before we put the CS > flip into the ring, and then when we get a flip done interrupt we can > check whether the CS flip has actually performed the surface address > update, or if the interrupt was caused by a previous but yet > unfinished mmio flip. > > Even with the flip counter we still have a race condition of the CS flip > base address update happens after the mmio flip done interrupt was > raised but not yet processed by the driver. When the interrupt is > eventually processed, the flip counter will already indicate that the > CS flip has been executed, but it would not actually complete until the > next start of vblank. We can use the DSPSURFLIVE register to check > whether the hardware is actually scanning out of the buffer we expect, > or if we managed hit this race window. > > This covers all the cases where the CS flip actually changes the base > address. If the base address remains unchanged, we might still complete > the CS flip before it has actually completed. But since the address > didn't change anyway, the premature flip completion can't result in > userspace overwriting data that's still being scanned out. > > CTG already has the flip counter and DSPSURFLIVE registers, and > although the flip done interrupt is still limited to CS flips alone, > the code now also checks the flip counter on CTG as well. > > v2: s/dspsurf/gtt_offset/ (Chris) > > Testcase: igt/kms_mmio_vs_cs_flip/setcrtc_vs_cs_flip > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73027 > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > --- > drivers/gpu/drm/i915/i915_reg.h | 1 + > drivers/gpu/drm/i915/intel_display.c | 73 > ++++++++++++++++++++++++++++++++---- > drivers/gpu/drm/i915/intel_drv.h | 2 + > 3 files changed, 69 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h > b/drivers/gpu/drm/i915/i915_reg.h > index 8f84555..c28169c 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -3638,6 +3638,7 @@ enum punit_power_well { > #define _PIPEA_FRMCOUNT_GM45 0x70040 > #define _PIPEA_FLIPCOUNT_GM45 0x70044 > #define PIPE_FRMCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FRMCOUNT_GM45) > +#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FLIPCOUNT_GM45) > > /* Cursor A & B regs */ > #define _CURACNTR (dev_priv->info.display_mmio_offset + > 0x70080) > diff --git a/drivers/gpu/drm/i915/intel_display.c > b/drivers/gpu/drm/i915/intel_display.c > index 1390ab5..32c6c16 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -8577,6 +8577,48 @@ void intel_finish_page_flip_plane(struct drm_device > *dev, int plane) > do_intel_finish_page_flip(dev, crtc); > } > > +/* Is 'a' after or equal to 'b'? */ > +static bool flip_count_after_eq(u32 a, u32 b) > +{ > + return !((a - b) & 0x80000000); > +} > + > +static bool page_flip_finished(struct intel_crtc *crtc) > +{ > + struct drm_device *dev = crtc->base.dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > + > + /* > + * The relevant registers doen't exist on pre-ctg. > + * As the flip done interrupt doesn't trigger for mmio > + * flips on gmch platforms, a flip count check isn't > + * really needed there. But since ctg has the registers, > + * include it in the check anyway. > + */ > + if (INTEL_INFO(dev)->gen < 5 && !IS_G4X(dev)) > + return true; > + > + /* > + * A DSPSURFLIVE check isn't enough in case the mmio and CS flips > + * used the same base address. In that case the mmio flip might > + * have completed, but the CS hasn't even executed the flip yet. > + * > + * A flip count check isn't enough as the CS might have updated > + * the base address just after start of vblank, but before we > + * managed to process the interrupt. This means we'd complete the > + * CS flip too soon. > + * > + * Combining both checks should get us a good enough result. It may > + * still happen that the CS flip has been executed, but has not > + * yet actually completed. But in case the base address is the same > + * anyway, we don't really care. > + */ > + return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) == > + crtc->unpin_work->gtt_offset && > + > flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)), > + crtc->unpin_work->flip_count); > +} > + > void intel_prepare_page_flip(struct drm_device *dev, int plane) > { > struct drm_i915_private *dev_priv = dev->dev_private; > @@ -8589,7 +8631,7 @@ void intel_prepare_page_flip(struct drm_device *dev, > int plane) > * is also accompanied by a spurious intel_prepare_page_flip(). > */ > spin_lock_irqsave(&dev->event_lock, flags); > - if (intel_crtc->unpin_work) > + if (intel_crtc->unpin_work && page_flip_finished(intel_crtc)) > atomic_inc_not_zero(&intel_crtc->unpin_work->pending); > spin_unlock_irqrestore(&dev->event_lock, flags); > } > @@ -8619,6 +8661,9 @@ static int intel_gen2_queue_flip(struct drm_device > *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 6); > if (ret) > goto err_unpin; > @@ -8635,7 +8680,7 @@ static int intel_gen2_queue_flip(struct drm_device > *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0]); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + > intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > intel_ring_emit(ring, 0); /* aux display base address, unused */ > > intel_mark_page_flip_active(intel_crtc); > @@ -8664,6 +8709,9 @@ static int intel_gen3_queue_flip(struct drm_device > *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 6); > if (ret) > goto err_unpin; > @@ -8677,7 +8725,7 @@ static int intel_gen3_queue_flip(struct drm_device > *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0]); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + > intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > intel_ring_emit(ring, MI_NOOP); > > intel_mark_page_flip_active(intel_crtc); > @@ -8706,6 +8754,9 @@ static int intel_gen4_queue_flip(struct drm_device > *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 4); > if (ret) > goto err_unpin; > @@ -8717,8 +8768,7 @@ static int intel_gen4_queue_flip(struct drm_device > *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0]); > - intel_ring_emit(ring, > - (i915_gem_obj_ggtt_offset(obj) + > intel_crtc->dspaddr_offset) | > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset | > obj->tiling_mode); > > /* XXX Enabling the panel-fitter across page-flip is so far > @@ -8755,6 +8805,9 @@ static int intel_gen6_queue_flip(struct drm_device > *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 4); > if (ret) > goto err_unpin; > @@ -8762,7 +8815,7 @@ static int intel_gen6_queue_flip(struct drm_device > *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + > intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > > /* Contrary to the suggestions in the documentation, > * "Enable Panel Fitter" does not seem to be required when page > @@ -8804,6 +8857,9 @@ static int intel_gen7_queue_flip(struct drm_device > *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > switch(intel_crtc->plane) { > case PLANE_A: > plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A; > @@ -8881,7 +8937,7 @@ static int intel_gen7_queue_flip(struct drm_device > *dev, > > intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit); > intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode)); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + > intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > intel_ring_emit(ring, (MI_NOOP)); > > intel_mark_page_flip_active(intel_crtc); > @@ -8979,6 +9035,9 @@ static int intel_crtc_page_flip(struct drm_crtc > *crtc, > atomic_inc(&intel_crtc->unpin_work_count); > intel_crtc->reset_counter = > atomic_read(&dev_priv->gpu_error.reset_counter); > > + if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev)) > + work->flip_count = > I915_READ(PIPE_FLIPCOUNT_GM45(intel_crtc->pipe)) + 1; > + > ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, > page_flip_flags); > if (ret) > goto cleanup_pending; > diff --git a/drivers/gpu/drm/i915/intel_drv.h > b/drivers/gpu/drm/i915/intel_drv.h > index c551472..edef34e 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -590,6 +590,8 @@ struct intel_unpin_work { > #define INTEL_FLIP_INACTIVE 0 > #define INTEL_FLIP_PENDING 1 > #define INTEL_FLIP_COMPLETE 2 > + u32 flip_count; > + u32 gtt_offset; > bool enable_stall_check; > }; > > -- > 1.8.3.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx >
On Tue, Apr 15, 2014 at 09:41:34PM +0300, ville.syrjala@linux.intel.com wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Starting from ILK, mmio flips also cause a flip done interrupt to be > signalled. This means if we first do a set_base and follow it > immediately with the CS flip, we might mistake the flip done interrupt > caused by the set_base as the flip done interrupt caused by the CS > flip. > > The hardware has a flip counter which increments every time a mmio or > CS flip is issued. It basically counts the number of DSPSURF register > writes. This means we can sample the counter before we put the CS > flip into the ring, and then when we get a flip done interrupt we can > check whether the CS flip has actually performed the surface address > update, or if the interrupt was caused by a previous but yet > unfinished mmio flip. > > Even with the flip counter we still have a race condition of the CS flip > base address update happens after the mmio flip done interrupt was > raised but not yet processed by the driver. When the interrupt is > eventually processed, the flip counter will already indicate that the > CS flip has been executed, but it would not actually complete until the > next start of vblank. We can use the DSPSURFLIVE register to check > whether the hardware is actually scanning out of the buffer we expect, > or if we managed hit this race window. > > This covers all the cases where the CS flip actually changes the base > address. If the base address remains unchanged, we might still complete > the CS flip before it has actually completed. But since the address > didn't change anyway, the premature flip completion can't result in > userspace overwriting data that's still being scanned out. > > CTG already has the flip counter and DSPSURFLIVE registers, and > although the flip done interrupt is still limited to CS flips alone, > the code now also checks the flip counter on CTG as well. > > v2: s/dspsurf/gtt_offset/ (Chris) > > Testcase: igt/kms_mmio_vs_cs_flip/setcrtc_vs_cs_flip > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73027 > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > --- > drivers/gpu/drm/i915/i915_reg.h | 1 + > drivers/gpu/drm/i915/intel_display.c | 73 ++++++++++++++++++++++++++++++++---- > drivers/gpu/drm/i915/intel_drv.h | 2 + > 3 files changed, 69 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 8f84555..c28169c 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -3638,6 +3638,7 @@ enum punit_power_well { > #define _PIPEA_FRMCOUNT_GM45 0x70040 > #define _PIPEA_FLIPCOUNT_GM45 0x70044 > #define PIPE_FRMCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FRMCOUNT_GM45) > +#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FLIPCOUNT_GM45) > > /* Cursor A & B regs */ > #define _CURACNTR (dev_priv->info.display_mmio_offset + 0x70080) > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 1390ab5..32c6c16 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -8577,6 +8577,48 @@ void intel_finish_page_flip_plane(struct drm_device *dev, int plane) > do_intel_finish_page_flip(dev, crtc); > } > > +/* Is 'a' after or equal to 'b'? */ > +static bool flip_count_after_eq(u32 a, u32 b) I've added a g4x prefix here. The wrap-around semantics on earlier chips (yeah I know doesn't exist, but the vblank counter does) wouldn't work. -Daniel > +{ > + return !((a - b) & 0x80000000); > +} > + > +static bool page_flip_finished(struct intel_crtc *crtc) > +{ > + struct drm_device *dev = crtc->base.dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > + > + /* > + * The relevant registers doen't exist on pre-ctg. > + * As the flip done interrupt doesn't trigger for mmio > + * flips on gmch platforms, a flip count check isn't > + * really needed there. But since ctg has the registers, > + * include it in the check anyway. > + */ > + if (INTEL_INFO(dev)->gen < 5 && !IS_G4X(dev)) > + return true; > + > + /* > + * A DSPSURFLIVE check isn't enough in case the mmio and CS flips > + * used the same base address. In that case the mmio flip might > + * have completed, but the CS hasn't even executed the flip yet. > + * > + * A flip count check isn't enough as the CS might have updated > + * the base address just after start of vblank, but before we > + * managed to process the interrupt. This means we'd complete the > + * CS flip too soon. > + * > + * Combining both checks should get us a good enough result. It may > + * still happen that the CS flip has been executed, but has not > + * yet actually completed. But in case the base address is the same > + * anyway, we don't really care. > + */ > + return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) == > + crtc->unpin_work->gtt_offset && > + flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)), > + crtc->unpin_work->flip_count); > +} > + > void intel_prepare_page_flip(struct drm_device *dev, int plane) > { > struct drm_i915_private *dev_priv = dev->dev_private; > @@ -8589,7 +8631,7 @@ void intel_prepare_page_flip(struct drm_device *dev, int plane) > * is also accompanied by a spurious intel_prepare_page_flip(). > */ > spin_lock_irqsave(&dev->event_lock, flags); > - if (intel_crtc->unpin_work) > + if (intel_crtc->unpin_work && page_flip_finished(intel_crtc)) > atomic_inc_not_zero(&intel_crtc->unpin_work->pending); > spin_unlock_irqrestore(&dev->event_lock, flags); > } > @@ -8619,6 +8661,9 @@ static int intel_gen2_queue_flip(struct drm_device *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 6); > if (ret) > goto err_unpin; > @@ -8635,7 +8680,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0]); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > intel_ring_emit(ring, 0); /* aux display base address, unused */ > > intel_mark_page_flip_active(intel_crtc); > @@ -8664,6 +8709,9 @@ static int intel_gen3_queue_flip(struct drm_device *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 6); > if (ret) > goto err_unpin; > @@ -8677,7 +8725,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0]); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > intel_ring_emit(ring, MI_NOOP); > > intel_mark_page_flip_active(intel_crtc); > @@ -8706,6 +8754,9 @@ static int intel_gen4_queue_flip(struct drm_device *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 4); > if (ret) > goto err_unpin; > @@ -8717,8 +8768,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0]); > - intel_ring_emit(ring, > - (i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset) | > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset | > obj->tiling_mode); > > /* XXX Enabling the panel-fitter across page-flip is so far > @@ -8755,6 +8805,9 @@ static int intel_gen6_queue_flip(struct drm_device *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > ret = intel_ring_begin(ring, 4); > if (ret) > goto err_unpin; > @@ -8762,7 +8815,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev, > intel_ring_emit(ring, MI_DISPLAY_FLIP | > MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); > intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > > /* Contrary to the suggestions in the documentation, > * "Enable Panel Fitter" does not seem to be required when page > @@ -8804,6 +8857,9 @@ static int intel_gen7_queue_flip(struct drm_device *dev, > if (ret) > goto err; > > + intel_crtc->unpin_work->gtt_offset = > + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; > + > switch(intel_crtc->plane) { > case PLANE_A: > plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A; > @@ -8881,7 +8937,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, > > intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit); > intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode)); > - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); > + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); > intel_ring_emit(ring, (MI_NOOP)); > > intel_mark_page_flip_active(intel_crtc); > @@ -8979,6 +9035,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, > atomic_inc(&intel_crtc->unpin_work_count); > intel_crtc->reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter); > > + if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev)) > + work->flip_count = I915_READ(PIPE_FLIPCOUNT_GM45(intel_crtc->pipe)) + 1; > + > ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, page_flip_flags); > if (ret) > goto cleanup_pending; > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h > index c551472..edef34e 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -590,6 +590,8 @@ struct intel_unpin_work { > #define INTEL_FLIP_INACTIVE 0 > #define INTEL_FLIP_PENDING 1 > #define INTEL_FLIP_COMPLETE 2 > + u32 flip_count; > + u32 gtt_offset; > bool enable_stall_check; > }; > > -- > 1.8.3.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Reviewed the code. Everything looks good, so Reviewed-by: Sourab Gupta <sourabgupta@gmail.com> On Wed, May 21, 2014 at 6:09 AM, Rodrigo Vivi <rodrigo.vivi@gmail.com>wrote: > needs a small rebase but everything looks good for me so > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> > > > On Tue, Apr 15, 2014 at 11:41 AM, <ville.syrjala@linux.intel.com> wrote: > >> From: Ville Syrjälä <ville.syrjala@linux.intel.com> >> >> Starting from ILK, mmio flips also cause a flip done interrupt to be >> signalled. This means if we first do a set_base and follow it >> immediately with the CS flip, we might mistake the flip done interrupt >> caused by the set_base as the flip done interrupt caused by the CS >> flip. >> >> The hardware has a flip counter which increments every time a mmio or >> CS flip is issued. It basically counts the number of DSPSURF register >> writes. This means we can sample the counter before we put the CS >> flip into the ring, and then when we get a flip done interrupt we can >> check whether the CS flip has actually performed the surface address >> update, or if the interrupt was caused by a previous but yet >> unfinished mmio flip. >> >> Even with the flip counter we still have a race condition of the CS flip >> base address update happens after the mmio flip done interrupt was >> raised but not yet processed by the driver. When the interrupt is >> eventually processed, the flip counter will already indicate that the >> CS flip has been executed, but it would not actually complete until the >> next start of vblank. We can use the DSPSURFLIVE register to check >> whether the hardware is actually scanning out of the buffer we expect, >> or if we managed hit this race window. >> >> This covers all the cases where the CS flip actually changes the base >> address. If the base address remains unchanged, we might still complete >> the CS flip before it has actually completed. But since the address >> didn't change anyway, the premature flip completion can't result in >> userspace overwriting data that's still being scanned out. >> >> CTG already has the flip counter and DSPSURFLIVE registers, and >> although the flip done interrupt is still limited to CS flips alone, >> the code now also checks the flip counter on CTG as well. >> >> v2: s/dspsurf/gtt_offset/ (Chris) >> >> Testcase: igt/kms_mmio_vs_cs_flip/setcrtc_vs_cs_flip >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73027 >> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> >> --- >> drivers/gpu/drm/i915/i915_reg.h | 1 + >> drivers/gpu/drm/i915/intel_display.c | 73 >> ++++++++++++++++++++++++++++++++---- >> drivers/gpu/drm/i915/intel_drv.h | 2 + >> 3 files changed, 69 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h >> b/drivers/gpu/drm/i915/i915_reg.h >> index 8f84555..c28169c 100644 >> --- a/drivers/gpu/drm/i915/i915_reg.h >> +++ b/drivers/gpu/drm/i915/i915_reg.h >> @@ -3638,6 +3638,7 @@ enum punit_power_well { >> #define _PIPEA_FRMCOUNT_GM45 0x70040 >> #define _PIPEA_FLIPCOUNT_GM45 0x70044 >> #define PIPE_FRMCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FRMCOUNT_GM45) >> +#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FLIPCOUNT_GM45) >> >> /* Cursor A & B regs */ >> #define _CURACNTR (dev_priv->info.display_mmio_offset + >> 0x70080) >> diff --git a/drivers/gpu/drm/i915/intel_display.c >> b/drivers/gpu/drm/i915/intel_display.c >> index 1390ab5..32c6c16 100644 >> --- a/drivers/gpu/drm/i915/intel_display.c >> +++ b/drivers/gpu/drm/i915/intel_display.c >> @@ -8577,6 +8577,48 @@ void intel_finish_page_flip_plane(struct >> drm_device *dev, int plane) >> do_intel_finish_page_flip(dev, crtc); >> } >> >> +/* Is 'a' after or equal to 'b'? */ >> +static bool flip_count_after_eq(u32 a, u32 b) >> +{ >> + return !((a - b) & 0x80000000); >> +} >> + >> +static bool page_flip_finished(struct intel_crtc *crtc) >> +{ >> + struct drm_device *dev = crtc->base.dev; >> + struct drm_i915_private *dev_priv = dev->dev_private; >> + >> + /* >> + * The relevant registers doen't exist on pre-ctg. >> + * As the flip done interrupt doesn't trigger for mmio >> + * flips on gmch platforms, a flip count check isn't >> + * really needed there. But since ctg has the registers, >> + * include it in the check anyway. >> + */ >> + if (INTEL_INFO(dev)->gen < 5 && !IS_G4X(dev)) >> + return true; >> + >> + /* >> + * A DSPSURFLIVE check isn't enough in case the mmio and CS flips >> + * used the same base address. In that case the mmio flip might >> + * have completed, but the CS hasn't even executed the flip yet. >> + * >> + * A flip count check isn't enough as the CS might have updated >> + * the base address just after start of vblank, but before we >> + * managed to process the interrupt. This means we'd complete the >> + * CS flip too soon. >> + * >> + * Combining both checks should get us a good enough result. It >> may >> + * still happen that the CS flip has been executed, but has not >> + * yet actually completed. But in case the base address is the >> same >> + * anyway, we don't really care. >> + */ >> + return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) == >> + crtc->unpin_work->gtt_offset && >> + >> flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)), >> + crtc->unpin_work->flip_count); >> +} >> + >> void intel_prepare_page_flip(struct drm_device *dev, int plane) >> { >> struct drm_i915_private *dev_priv = dev->dev_private; >> @@ -8589,7 +8631,7 @@ void intel_prepare_page_flip(struct drm_device >> *dev, int plane) >> * is also accompanied by a spurious intel_prepare_page_flip(). >> */ >> spin_lock_irqsave(&dev->event_lock, flags); >> - if (intel_crtc->unpin_work) >> + if (intel_crtc->unpin_work && page_flip_finished(intel_crtc)) >> atomic_inc_not_zero(&intel_crtc->unpin_work->pending); >> spin_unlock_irqrestore(&dev->event_lock, flags); >> } >> @@ -8619,6 +8661,9 @@ static int intel_gen2_queue_flip(struct drm_device >> *dev, >> if (ret) >> goto err; >> >> + intel_crtc->unpin_work->gtt_offset = >> + i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset; >> + >> ret = intel_ring_begin(ring, 6); >> if (ret) >> goto err_unpin; >> @@ -8635,7 +8680,7 @@ static int intel_gen2_queue_flip(struct drm_device >> *dev, >> intel_ring_emit(ring, MI_DISPLAY_FLIP | >> MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); >> intel_ring_emit(ring, fb->pitches[0]); >> - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset); >> + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); >> intel_ring_emit(ring, 0); /* aux display base address, unused */ >> >> intel_mark_page_flip_active(intel_crtc); >> @@ -8664,6 +8709,9 @@ static int intel_gen3_queue_flip(struct drm_device >> *dev, >> if (ret) >> goto err; >> >> + intel_crtc->unpin_work->gtt_offset = >> + i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset; >> + >> ret = intel_ring_begin(ring, 6); >> if (ret) >> goto err_unpin; >> @@ -8677,7 +8725,7 @@ static int intel_gen3_queue_flip(struct drm_device >> *dev, >> intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | >> MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); >> intel_ring_emit(ring, fb->pitches[0]); >> - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset); >> + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); >> intel_ring_emit(ring, MI_NOOP); >> >> intel_mark_page_flip_active(intel_crtc); >> @@ -8706,6 +8754,9 @@ static int intel_gen4_queue_flip(struct drm_device >> *dev, >> if (ret) >> goto err; >> >> + intel_crtc->unpin_work->gtt_offset = >> + i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset; >> + >> ret = intel_ring_begin(ring, 4); >> if (ret) >> goto err_unpin; >> @@ -8717,8 +8768,7 @@ static int intel_gen4_queue_flip(struct drm_device >> *dev, >> intel_ring_emit(ring, MI_DISPLAY_FLIP | >> MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); >> intel_ring_emit(ring, fb->pitches[0]); >> - intel_ring_emit(ring, >> - (i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset) | >> + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset | >> obj->tiling_mode); >> >> /* XXX Enabling the panel-fitter across page-flip is so far >> @@ -8755,6 +8805,9 @@ static int intel_gen6_queue_flip(struct drm_device >> *dev, >> if (ret) >> goto err; >> >> + intel_crtc->unpin_work->gtt_offset = >> + i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset; >> + >> ret = intel_ring_begin(ring, 4); >> if (ret) >> goto err_unpin; >> @@ -8762,7 +8815,7 @@ static int intel_gen6_queue_flip(struct drm_device >> *dev, >> intel_ring_emit(ring, MI_DISPLAY_FLIP | >> MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); >> intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode); >> - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset); >> + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); >> >> /* Contrary to the suggestions in the documentation, >> * "Enable Panel Fitter" does not seem to be required when page >> @@ -8804,6 +8857,9 @@ static int intel_gen7_queue_flip(struct drm_device >> *dev, >> if (ret) >> goto err; >> >> + intel_crtc->unpin_work->gtt_offset = >> + i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset; >> + >> switch(intel_crtc->plane) { >> case PLANE_A: >> plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A; >> @@ -8881,7 +8937,7 @@ static int intel_gen7_queue_flip(struct drm_device >> *dev, >> >> intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit); >> intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode)); >> - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + >> intel_crtc->dspaddr_offset); >> + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); >> intel_ring_emit(ring, (MI_NOOP)); >> >> intel_mark_page_flip_active(intel_crtc); >> @@ -8979,6 +9035,9 @@ static int intel_crtc_page_flip(struct drm_crtc >> *crtc, >> atomic_inc(&intel_crtc->unpin_work_count); >> intel_crtc->reset_counter = >> atomic_read(&dev_priv->gpu_error.reset_counter); >> >> + if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev)) >> + work->flip_count = >> I915_READ(PIPE_FLIPCOUNT_GM45(intel_crtc->pipe)) + 1; >> + >> ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, >> page_flip_flags); >> if (ret) >> goto cleanup_pending; >> diff --git a/drivers/gpu/drm/i915/intel_drv.h >> b/drivers/gpu/drm/i915/intel_drv.h >> index c551472..edef34e 100644 >> --- a/drivers/gpu/drm/i915/intel_drv.h >> +++ b/drivers/gpu/drm/i915/intel_drv.h >> @@ -590,6 +590,8 @@ struct intel_unpin_work { >> #define INTEL_FLIP_INACTIVE 0 >> #define INTEL_FLIP_PENDING 1 >> #define INTEL_FLIP_COMPLETE 2 >> + u32 flip_count; >> + u32 gtt_offset; >> bool enable_stall_check; >> }; >> >> -- >> 1.8.3.2 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx >> > > > > -- > Rodrigo Vivi > Blog: http://blog.vivi.eng.br > > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > >
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 8f84555..c28169c 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3638,6 +3638,7 @@ enum punit_power_well { #define _PIPEA_FRMCOUNT_GM45 0x70040 #define _PIPEA_FLIPCOUNT_GM45 0x70044 #define PIPE_FRMCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FRMCOUNT_GM45) +#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FLIPCOUNT_GM45) /* Cursor A & B regs */ #define _CURACNTR (dev_priv->info.display_mmio_offset + 0x70080) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 1390ab5..32c6c16 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -8577,6 +8577,48 @@ void intel_finish_page_flip_plane(struct drm_device *dev, int plane) do_intel_finish_page_flip(dev, crtc); } +/* Is 'a' after or equal to 'b'? */ +static bool flip_count_after_eq(u32 a, u32 b) +{ + return !((a - b) & 0x80000000); +} + +static bool page_flip_finished(struct intel_crtc *crtc) +{ + struct drm_device *dev = crtc->base.dev; + struct drm_i915_private *dev_priv = dev->dev_private; + + /* + * The relevant registers doen't exist on pre-ctg. + * As the flip done interrupt doesn't trigger for mmio + * flips on gmch platforms, a flip count check isn't + * really needed there. But since ctg has the registers, + * include it in the check anyway. + */ + if (INTEL_INFO(dev)->gen < 5 && !IS_G4X(dev)) + return true; + + /* + * A DSPSURFLIVE check isn't enough in case the mmio and CS flips + * used the same base address. In that case the mmio flip might + * have completed, but the CS hasn't even executed the flip yet. + * + * A flip count check isn't enough as the CS might have updated + * the base address just after start of vblank, but before we + * managed to process the interrupt. This means we'd complete the + * CS flip too soon. + * + * Combining both checks should get us a good enough result. It may + * still happen that the CS flip has been executed, but has not + * yet actually completed. But in case the base address is the same + * anyway, we don't really care. + */ + return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) == + crtc->unpin_work->gtt_offset && + flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)), + crtc->unpin_work->flip_count); +} + void intel_prepare_page_flip(struct drm_device *dev, int plane) { struct drm_i915_private *dev_priv = dev->dev_private; @@ -8589,7 +8631,7 @@ void intel_prepare_page_flip(struct drm_device *dev, int plane) * is also accompanied by a spurious intel_prepare_page_flip(). */ spin_lock_irqsave(&dev->event_lock, flags); - if (intel_crtc->unpin_work) + if (intel_crtc->unpin_work && page_flip_finished(intel_crtc)) atomic_inc_not_zero(&intel_crtc->unpin_work->pending); spin_unlock_irqrestore(&dev->event_lock, flags); } @@ -8619,6 +8661,9 @@ static int intel_gen2_queue_flip(struct drm_device *dev, if (ret) goto err; + intel_crtc->unpin_work->gtt_offset = + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; + ret = intel_ring_begin(ring, 6); if (ret) goto err_unpin; @@ -8635,7 +8680,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_DISPLAY_FLIP | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); intel_ring_emit(ring, fb->pitches[0]); - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); intel_ring_emit(ring, 0); /* aux display base address, unused */ intel_mark_page_flip_active(intel_crtc); @@ -8664,6 +8709,9 @@ static int intel_gen3_queue_flip(struct drm_device *dev, if (ret) goto err; + intel_crtc->unpin_work->gtt_offset = + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; + ret = intel_ring_begin(ring, 6); if (ret) goto err_unpin; @@ -8677,7 +8725,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); intel_ring_emit(ring, fb->pitches[0]); - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); intel_ring_emit(ring, MI_NOOP); intel_mark_page_flip_active(intel_crtc); @@ -8706,6 +8754,9 @@ static int intel_gen4_queue_flip(struct drm_device *dev, if (ret) goto err; + intel_crtc->unpin_work->gtt_offset = + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; + ret = intel_ring_begin(ring, 4); if (ret) goto err_unpin; @@ -8717,8 +8768,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_DISPLAY_FLIP | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); intel_ring_emit(ring, fb->pitches[0]); - intel_ring_emit(ring, - (i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset) | + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset | obj->tiling_mode); /* XXX Enabling the panel-fitter across page-flip is so far @@ -8755,6 +8805,9 @@ static int intel_gen6_queue_flip(struct drm_device *dev, if (ret) goto err; + intel_crtc->unpin_work->gtt_offset = + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; + ret = intel_ring_begin(ring, 4); if (ret) goto err_unpin; @@ -8762,7 +8815,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_DISPLAY_FLIP | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane)); intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode); - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); /* Contrary to the suggestions in the documentation, * "Enable Panel Fitter" does not seem to be required when page @@ -8804,6 +8857,9 @@ static int intel_gen7_queue_flip(struct drm_device *dev, if (ret) goto err; + intel_crtc->unpin_work->gtt_offset = + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset; + switch(intel_crtc->plane) { case PLANE_A: plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A; @@ -8881,7 +8937,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit); intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode)); - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset); + intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); intel_ring_emit(ring, (MI_NOOP)); intel_mark_page_flip_active(intel_crtc); @@ -8979,6 +9035,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, atomic_inc(&intel_crtc->unpin_work_count); intel_crtc->reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter); + if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev)) + work->flip_count = I915_READ(PIPE_FLIPCOUNT_GM45(intel_crtc->pipe)) + 1; + ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, page_flip_flags); if (ret) goto cleanup_pending; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index c551472..edef34e 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -590,6 +590,8 @@ struct intel_unpin_work { #define INTEL_FLIP_INACTIVE 0 #define INTEL_FLIP_PENDING 1 #define INTEL_FLIP_COMPLETE 2 + u32 flip_count; + u32 gtt_offset; bool enable_stall_check; };