Message ID | 1359476018-31274-7-git-send-email-ville.syrjala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jan 29, 2013 at 06:13:38PM +0200, ville.syrjala@linux.intel.com wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Since obj->pending_flips was never set, intel_pipe_set_base() never > actually waited for pending page flips to complete. > > We really do want to wait for the pending flips, because otherwise the > mmio surface base address update could overtake the flip, and you > could end up with an old frame on the screen once the flip really > completes. > > Just call intel_crtc_wait_pending_flips_locked() instead of > intel_finish_fb() from intel_pipe_set_base() to achieve the > desired result. > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
On Tue, Jan 29, 2013 at 06:13:38PM +0200, ville.syrjala@linux.intel.com wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Since obj->pending_flips was never set, intel_pipe_set_base() never > actually waited for pending page flips to complete. > > We really do want to wait for the pending flips, because otherwise the > mmio surface base address update could overtake the flip, and you > could end up with an old frame on the screen once the flip really > completes. > > Just call intel_crtc_wait_pending_flips_locked() instead of > intel_finish_fb() from intel_pipe_set_base() to achieve the > desired result. > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > --- > drivers/gpu/drm/i915/intel_display.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index a2e04f7..7e047c1 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -2330,8 +2330,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y, > return ret; > } > > - if (crtc->fb) > - intel_finish_fb(crtc->fb); > + intel_crtc_wait_for_pending_flips_locked(crtc); Ah, I see now why you grab dev->struct_mutex and need to kick waiters from the pending flip queue. I think grabbing the mutex twice isn't a major offense, since both the crtc disable code and set_base are slowpaths used rarely. So what about simply calling wait_for_pending_flips before grabbing the mutex in intel_pipe_set_base? We could then also inline finish_fb into it's only callsite ... -Daniel > > ret = dev_priv->display.update_plane(crtc, fb, x, y); > if (ret) { > -- > 1.7.12.4 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Wed, Feb 13, 2013 at 04:49:35PM +0100, Daniel Vetter wrote: > On Tue, Jan 29, 2013 at 06:13:38PM +0200, ville.syrjala@linux.intel.com wrote: > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > Since obj->pending_flips was never set, intel_pipe_set_base() never > > actually waited for pending page flips to complete. > > > > We really do want to wait for the pending flips, because otherwise the > > mmio surface base address update could overtake the flip, and you > > could end up with an old frame on the screen once the flip really > > completes. > > > > Just call intel_crtc_wait_pending_flips_locked() instead of > > intel_finish_fb() from intel_pipe_set_base() to achieve the > > desired result. > > > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > > --- > > drivers/gpu/drm/i915/intel_display.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > > index a2e04f7..7e047c1 100644 > > --- a/drivers/gpu/drm/i915/intel_display.c > > +++ b/drivers/gpu/drm/i915/intel_display.c > > @@ -2330,8 +2330,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y, > > return ret; > > } > > > > - if (crtc->fb) > > - intel_finish_fb(crtc->fb); > > + intel_crtc_wait_for_pending_flips_locked(crtc); > > Ah, I see now why you grab dev->struct_mutex and need to kick waiters from > the pending flip queue. I think grabbing the mutex twice isn't a major > offense, since both the crtc disable code and set_base are slowpaths used > rarely. So what about simply calling wait_for_pending_flips before > grabbing the mutex in intel_pipe_set_base? We could then also inline > finish_fb into it's only callsite ... I didn't want to slow down intel_pipe_set_base() too much. If we wait for pending flips before pinning the new fb, we can never achieve any parallelism there. But if no-one cares about that, we can reorder.
On Wed, Feb 13, 2013 at 6:06 PM, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: >> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c >> > index a2e04f7..7e047c1 100644 >> > --- a/drivers/gpu/drm/i915/intel_display.c >> > +++ b/drivers/gpu/drm/i915/intel_display.c >> > @@ -2330,8 +2330,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y, >> > return ret; >> > } >> > >> > - if (crtc->fb) >> > - intel_finish_fb(crtc->fb); >> > + intel_crtc_wait_for_pending_flips_locked(crtc); >> >> Ah, I see now why you grab dev->struct_mutex and need to kick waiters from >> the pending flip queue. I think grabbing the mutex twice isn't a major >> offense, since both the crtc disable code and set_base are slowpaths used >> rarely. So what about simply calling wait_for_pending_flips before >> grabbing the mutex in intel_pipe_set_base? We could then also inline >> finish_fb into it's only callsite ... > > I didn't want to slow down intel_pipe_set_base() too much. If we wait > for pending flips before pinning the new fb, we can never achieve any > parallelism there. But if no-one cares about that, we can reorder. I'm confused here - where can we extract parallelism in set_base between waiting for pending flips and the pinning? And imo set_base isn't really critical: It's officially a synchronous thing (we have a vblank wait in there), and if we want to fix that imo the nuclear pageflip should be the answer. -Daniel
On Wed, Feb 13, 2013 at 06:11:00PM +0100, Daniel Vetter wrote: > On Wed, Feb 13, 2013 at 6:06 PM, Ville Syrjälä > <ville.syrjala@linux.intel.com> wrote: > >> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > >> > index a2e04f7..7e047c1 100644 > >> > --- a/drivers/gpu/drm/i915/intel_display.c > >> > +++ b/drivers/gpu/drm/i915/intel_display.c > >> > @@ -2330,8 +2330,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y, > >> > return ret; > >> > } > >> > > >> > - if (crtc->fb) > >> > - intel_finish_fb(crtc->fb); > >> > + intel_crtc_wait_for_pending_flips_locked(crtc); > >> > >> Ah, I see now why you grab dev->struct_mutex and need to kick waiters from > >> the pending flip queue. I think grabbing the mutex twice isn't a major > >> offense, since both the crtc disable code and set_base are slowpaths used > >> rarely. So what about simply calling wait_for_pending_flips before > >> grabbing the mutex in intel_pipe_set_base? We could then also inline > >> finish_fb into it's only callsite ... > > > > I didn't want to slow down intel_pipe_set_base() too much. If we wait > > for pending flips before pinning the new fb, we can never achieve any > > parallelism there. But if no-one cares about that, we can reorder. > > I'm confused here - where can we extract parallelism in set_base > between waiting for pending flips and the pinning? And imo set_base > isn't really critical: It's officially a synchronous thing (we have a > vblank wait in there), and if we want to fix that imo the nuclear > pageflip should be the answer. The flip is making progress on the GPU side, and at the same time the CPU side can make some progress with the pin operation. At least that was my theory.
On Wed, Feb 13, 2013 at 6:26 PM, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > The flip is making progress on the GPU side, and at the same time the > CPU side can make some progress with the pin operation. At least that > was my theory. The pin should be a no-op when moving a given framebuffer around. It's only really expensive when we need to clflush (and lesser so when we need to rewrite ptes), which should only happen on fresh framebuffers. As long as userspace does it's job and caches scanout buffers we should be fine. The other thing is waiting for outstanding rendering (to avoid hanging on an MI_WAIT_SCANLINE cmd), which just blocks. So doesn't really matter imo. -Daniel
On Wed, Feb 13, 2013 at 06:11:00PM +0100, Daniel Vetter wrote: > I'm confused here - where can we extract parallelism in set_base > between waiting for pending flips and the pinning? And imo set_base > isn't really critical: It's officially a synchronous thing (we have a > vblank wait in there), and if we want to fix that imo the nuclear > pageflip should be the answer. Speaking of which I've had a patch to remove that extra synchronous wait for over a year... -Chris
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index a2e04f7..7e047c1 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -2330,8 +2330,7 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y, return ret; } - if (crtc->fb) - intel_finish_fb(crtc->fb); + intel_crtc_wait_for_pending_flips_locked(crtc); ret = dev_priv->display.update_plane(crtc, fb, x, y); if (ret) {