Message ID | 1376057602-15233-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote: > A simple modeset, where we only wish to switch over to a new framebuffer > such as the transition from fbcon to X, takes around 30-60ms. This is > due to three factors: > > 1. We need to make sure the fb->obj is in the display domain, which > incurs a cache flush to ensure no dirt is left on the scanout. > > 2. We need to flush any pending rendering before performing the mmio > so that the frame is complete before it is shown. > > 3. We currently wait for the vblank after the mmio to be sure that the > old fb is no longer being shown before releasing it. > > (1) can only be eliminated by userspace preparing the fb->obj in advance > to already be in the display domain. This can be done through use of the > create2 ioctl, or by reusing an existing fb->obj. > > However, (2) and (3) are already solved by the existing page flip > mechanism, and it is surprisingly trivial to wire them up for use in the > set-base fast path. Though it can be argued that this represents a > subtle ABI break in that the set_config ioctl now returns before the old > framebuffer is unpinned. The danger is that userspace will start to > modify it before it is no longer being shown, however we should be able > to prevent that through proper domain tracking. Hm, right now we don't prevent anyone from rendering into a to-be-flipped out buffer. There was once code in it, using MI_WAIT_EVENT but we've ripped it out. I guess we could just throw in a synchronous stall on the flip queue though, that should work always. Testing would be easy if we have the crtc CRC stuff, but that's atm stuck due to lack of volunteers ... Overall I really like the idea and I think doing most of the plane enabling (including psr, fbc, ips, and all that stuff which potentially blows through a wblank wait) should be done in async work queues. That should then also help resume time a lot. Cheers, Daniel > By combining all of the above, we can achieve an instaneous set_config: > > [ 6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal > [ 6.601] (II) intel(0): Setting screen physical size to 677 x 381 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/intel_display.c | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 809b968..c6eea51 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -9077,10 +9077,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set) > ret = intel_set_mode(set->crtc, set->mode, > set->x, set->y, set->fb); > } else if (config->fb_changed) { > - intel_crtc_wait_for_pending_flips(set->crtc); > - > - ret = intel_pipe_set_base(set->crtc, > - set->x, set->y, set->fb); > + if (to_intel_framebuffer(set->fb)->obj->ring == NULL || > + save_set.x != set->x || save_set.y != set->y || > + intel_crtc_page_flip(set->crtc, set->fb, NULL)) { > + intel_crtc_wait_for_pending_flips(set->crtc); > + ret = intel_pipe_set_base(set->crtc, > + set->x, set->y, set->fb); > + } > } > > if (ret) { > -- > 1.8.1.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Fri, Aug 09, 2013 at 09:17:11PM +0200, Daniel Vetter wrote: > On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote: > > A simple modeset, where we only wish to switch over to a new framebuffer > > such as the transition from fbcon to X, takes around 30-60ms. This is > > due to three factors: > > > > 1. We need to make sure the fb->obj is in the display domain, which > > incurs a cache flush to ensure no dirt is left on the scanout. > > > > 2. We need to flush any pending rendering before performing the mmio > > so that the frame is complete before it is shown. > > > > 3. We currently wait for the vblank after the mmio to be sure that the > > old fb is no longer being shown before releasing it. > > > > (1) can only be eliminated by userspace preparing the fb->obj in advance > > to already be in the display domain. This can be done through use of the > > create2 ioctl, or by reusing an existing fb->obj. > > > > However, (2) and (3) are already solved by the existing page flip > > mechanism, and it is surprisingly trivial to wire them up for use in the > > set-base fast path. Though it can be argued that this represents a > > subtle ABI break in that the set_config ioctl now returns before the old > > framebuffer is unpinned. The danger is that userspace will start to > > modify it before it is no longer being shown, however we should be able > > to prevent that through proper domain tracking. > > Hm, right now we don't prevent anyone from rendering into a to-be-flipped > out buffer. There was once code in it, using MI_WAIT_EVENT but we've > ripped it out. I guess we could just throw in a synchronous stall on the > flip queue though, that should work always. I'm glad we did. I'd rather put that into userspace rather than have the kernel impose that policy on everybody, as for X that is exactly the behaviour we want (i.e. not blocking rendering on the next scanout). > Testing would be easy if we have the crtc CRC stuff, but that's atm stuck > due to lack of volunteers ... > > Overall I really like the idea and I think doing most of the plane > enabling (including psr, fbc, ips, and all that stuff which potentially > blows through a wblank wait) should be done in async work queues. That > should then also help resume time a lot. I'd also like to hear Ville's opinion since with his atomic modesetting I hope we will be able to achieve something very similar. -Chris
On Fri, Aug 09, 2013 at 09:06:36PM +0100, Chris Wilson wrote: > On Fri, Aug 09, 2013 at 09:17:11PM +0200, Daniel Vetter wrote: > > On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote: > > > A simple modeset, where we only wish to switch over to a new framebuffer > > > such as the transition from fbcon to X, takes around 30-60ms. This is > > > due to three factors: > > > > > > 1. We need to make sure the fb->obj is in the display domain, which > > > incurs a cache flush to ensure no dirt is left on the scanout. > > > > > > 2. We need to flush any pending rendering before performing the mmio > > > so that the frame is complete before it is shown. > > > > > > 3. We currently wait for the vblank after the mmio to be sure that the > > > old fb is no longer being shown before releasing it. > > > > > > (1) can only be eliminated by userspace preparing the fb->obj in advance > > > to already be in the display domain. This can be done through use of the > > > create2 ioctl, or by reusing an existing fb->obj. > > > > > > However, (2) and (3) are already solved by the existing page flip > > > mechanism, and it is surprisingly trivial to wire them up for use in the > > > set-base fast path. Though it can be argued that this represents a > > > subtle ABI break in that the set_config ioctl now returns before the old > > > framebuffer is unpinned. The danger is that userspace will start to > > > modify it before it is no longer being shown, however we should be able > > > to prevent that through proper domain tracking. > > > > Hm, right now we don't prevent anyone from rendering into a to-be-flipped > > out buffer. There was once code in it, using MI_WAIT_EVENT but we've > > ripped it out. I guess we could just throw in a synchronous stall on the > > flip queue though, that should work always. > > I'm glad we did. I'd rather put that into userspace rather than have the > kernel impose that policy on everybody, as for X that is exactly the > behaviour we want (i.e. not blocking rendering on the next scanout). > > > Testing would be easy if we have the crtc CRC stuff, but that's atm stuck > > due to lack of volunteers ... > > > > Overall I really like the idea and I think doing most of the plane > > enabling (including psr, fbc, ips, and all that stuff which potentially > > blows through a wblank wait) should be done in async work queues. That > > should then also help resume time a lot. > > I'd also like to hear Ville's opinion since with his atomic modesetting > I hope we will be able to achieve something very similar. Async is nice. Like everyone else I suppose, my only concern has to do with writes to the old scanout buffer. One option would be to add an event also to setcrtc, and maybe a new flag to request the event+nonblocking operation. That's what I have in my atomic page flip code (actually I think I had separate flags for each). Unfortunately we don't seem to have flags in setcrtc, unless we commandeer some bits from mode_valid.
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 809b968..c6eea51 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9077,10 +9077,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set) ret = intel_set_mode(set->crtc, set->mode, set->x, set->y, set->fb); } else if (config->fb_changed) { - intel_crtc_wait_for_pending_flips(set->crtc); - - ret = intel_pipe_set_base(set->crtc, - set->x, set->y, set->fb); + if (to_intel_framebuffer(set->fb)->obj->ring == NULL || + save_set.x != set->x || save_set.y != set->y || + intel_crtc_page_flip(set->crtc, set->fb, NULL)) { + intel_crtc_wait_for_pending_flips(set->crtc); + ret = intel_pipe_set_base(set->crtc, + set->x, set->y, set->fb); + } } if (ret) {
A simple modeset, where we only wish to switch over to a new framebuffer such as the transition from fbcon to X, takes around 30-60ms. This is due to three factors: 1. We need to make sure the fb->obj is in the display domain, which incurs a cache flush to ensure no dirt is left on the scanout. 2. We need to flush any pending rendering before performing the mmio so that the frame is complete before it is shown. 3. We currently wait for the vblank after the mmio to be sure that the old fb is no longer being shown before releasing it. (1) can only be eliminated by userspace preparing the fb->obj in advance to already be in the display domain. This can be done through use of the create2 ioctl, or by reusing an existing fb->obj. However, (2) and (3) are already solved by the existing page flip mechanism, and it is surprisingly trivial to wire them up for use in the set-base fast path. Though it can be argued that this represents a subtle ABI break in that the set_config ioctl now returns before the old framebuffer is unpinned. The danger is that userspace will start to modify it before it is no longer being shown, however we should be able to prevent that through proper domain tracking. By combining all of the above, we can achieve an instaneous set_config: [ 6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal [ 6.601] (II) intel(0): Setting screen physical size to 677 x 381 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/intel_display.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)