| Message ID | 1390213057-19091-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive) |
|---|---|
| State | New, archived |
On Mon, Jan 20, 2014 at 10:17:36AM +0000, Chris Wilson wrote:
> On older generations (gen2, gen3) the GPU requires fences for many
> operations, such as blits. The display hardware also requires fences for
> scanouts and this leads to a situation where an arbitrary number of
> fences may be pinned by old scanouts following a pageflip but before we
> have executed the unpin workqueue. This is unpredictable by userspace
> and leads to random EDEADLK when submitting an otherwise benign
> execbuffer. However, we can detect when we have an outstanding flip and
> so cause userspace to wait upon their completion before finally
> declaring that the system is starved of fences. This is really no worse
> than forcing the GPU to stall waiting for older execbuffer to retire and
> release their fences before we can reallocate them for the next
> execbuffer.
>
> v2: move the test for a pending fb unpin to a common routine for
> later reuse during eviction
>
> Reported-and-tested-by: dimon@gmx.net
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696
Testcase: i-g-t/kms_flip/flip-vs-fences
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The testcase is only for the trivial reproduction scenario, not the more
sporadic situation involving a slightly hungry workqueue, but it is
enough to demonstrate the issue and the effectiveness of the simple
workaround (for typical userspace).
-Chris
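The decision the commit message describes — prefer an available fence, then a stealable one, and only then distinguish "flips will free a fence soon" from true starvation — can be sketched in isolation. This is an illustrative model only, not the kernel code; all names here are hypothetical stand-ins:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the fence-allocation fallback order in v2 of
 * the patch: available -> stealable -> wait for flips -> starved. */
struct fence_state {
	int free_fences;        /* fence registers with no owner */
	int stealable_fences;   /* held but not pinned by a scanout */
	bool pending_fb_unpin;  /* a pageflip has yet to release its fence */
};

/* Returns 0 when a fence can be (re)allocated, -EAGAIN when userspace
 * should wait for pending flips and retry, -EDEADLK when nothing will
 * ever free a fence. */
int find_fence(const struct fence_state *st)
{
	if (st->free_fences > 0)
		return 0;               /* use an available fence */
	if (st->stealable_fences > 0)
		return 0;               /* steal an unpinned fence */
	if (st->pending_fb_unpin)
		return -EAGAIN;         /* old scanouts will unpin soon */
	return -EDEADLK;                /* genuinely starved of fences */
}
```

The key point of the patch is the third branch: before this change, that state fell straight through to the equivalent of -EDEADLK.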
On Mon, Jan 20, 2014 at 9:30 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Mon, Jan 20, 2014 at 10:17:36AM +0000, Chris Wilson wrote:
>> On older generations (gen2, gen3) the GPU requires fences for many
>> operations, such as blits. The display hardware also requires fences for
>> scanouts and this leads to a situation where an arbitrary number of
>> fences may be pinned by old scanouts following a pageflip but before we
>> have executed the unpin workqueue. This is unpredictable by userspace
>> and leads to random EDEADLK when submitting an otherwise benign
>> execbuffer. However, we can detect when we have an outstanding flip and
>> so cause userspace to wait upon their completion before finally
>> declaring that the system is starved of fences. This is really no worse
>> than forcing the GPU to stall waiting for older execbuffer to retire and
>> release their fences before we can reallocate them for the next
>> execbuffer.
>>
>> v2: move the test for a pending fb unpin to a common routine for
>> later reuse during eviction
>>
>> Reported-and-tested-by: dimon@gmx.net
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696
> Testcase: i-g-t/kms_flip/flip-vs-fences
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> The testcase is only for the trivial reproduction scenario, not the more
> sporadic situation involving a slightly hungry workqueue, but it is
> enough to demonstrate the issue and the effectiveness of the simple
> workaround (for typical userspace).

Testcase convinced me, this is better than the status quo ;-) I guess I'll
add a short comment though when merging that busy-looping through EAGAIN
is a bit un-neat.

Jon, since this ties into the evict_something fun we had last year, can
you please review this patch plus the follow-up one?

Thanks, Daniel
On Tue, Jan 21, 2014 at 10:32:50AM +0100, Daniel Vetter wrote:
> On Mon, Jan 20, 2014 at 9:30 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > The testcase is only for the trivial reproduction scenario, not the more
> > sporadic situation involving a slightly hungry workqueue, but it is
> > enough to demonstrate the issue and the effectiveness of the simple
> > workaround (for typical userspace).
>
> Testcase convinced me, this is better than the status quo ;-) I guess I'll
> add a short comment though when merging that busy-looping through
> EAGAIN is a bit un-neat.

On the other hand, it does reduce the RT problem to a typical priority
inversion issue - all the lock dropping and complex error-case handling
is built in and does not need the introduction of more fragile code. So
reducing it to a known problem is itself neat. :)
-Chris
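The "busy-looping through EAGAIN" being discussed happens in userspace: a submission that fails with -EAGAIN is simply retried, since the kernel has already waited for the vblank before returning. A minimal sketch of that retry shape, with `submit_execbuffer()` as a hypothetical stub standing in for the real execbuffer ioctl wrapper (here it fakes two pending flips, then succeeds):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stub for the execbuffer ioctl: fails with -EAGAIN while
 * pageflips still pin fences, then succeeds once they have completed. */
static int pending_flips = 2;
static int submit_execbuffer(void *batch)
{
	(void)batch;
	if (pending_flips > 0) {
		pending_flips--;        /* a flip completes, releasing a fence */
		return -EAGAIN;
	}
	return 0;
}

/* Typical userspace pattern: -EAGAIN means "retry", while a hard error
 * such as -EDEADLK is propagated to the caller. */
int submit_with_retry(void *batch)
{
	int ret;
	do {
		ret = submit_execbuffer(batch);
	} while (ret == -EAGAIN);
	return ret;
}
```

Because the kernel waits for a vblank before returning -EAGAIN, the loop is bounded by the number of outstanding flips rather than being a hot spin.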
> -----Original Message-----
> From: intel-gfx-bounces@lists.freedesktop.org [mailto:intel-gfx-
> bounces@lists.freedesktop.org] On Behalf Of Chris Wilson
> Sent: Monday, January 20, 2014 10:18 AM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 1/2] drm/i915: Wait for completion of pending
> flips when starved of fences
>
> On older generations (gen2, gen3) the GPU requires fences for many
> operations, such as blits. The display hardware also requires fences for
> scanouts and this leads to a situation where an arbitrary number of fences
> may be pinned by old scanouts following a pageflip but before we have
> executed the unpin workqueue. This is unpredictable by userspace and leads
> to random EDEADLK when submitting an otherwise benign execbuffer.
> However, we can detect when we have an outstanding flip and so cause
> userspace to wait upon their completion before finally declaring that the
> system is starved of fences. This is really no worse than forcing the GPU to
> stall waiting for older execbuffer to retire and release their fences before we
> can reallocate them for the next execbuffer.
>
> v2: move the test for a pending fb unpin to a common routine for later reuse
> during eviction
>
> Reported-and-tested-by: dimon@gmx.net
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
```diff
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 41d5a99..ba99b31 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3214,7 +3214,7 @@ i915_find_fence_reg(struct drm_device *dev)
 	}

 	if (avail == NULL)
-		return NULL;
+		goto deadlock;

 	/* None available, try to steal one or wait for a user to finish */
 	list_for_each_entry(reg, &dev_priv->mm.fence_list, lru_list) {
@@ -3224,7 +3224,12 @@ i915_find_fence_reg(struct drm_device *dev)
 		return reg;
 	}

-	return NULL;
+deadlock:
+	/* Wait for completion of pending flips which consume fences */
+	if (intel_has_pending_fb_unpin(dev))
+		return ERR_PTR(-EAGAIN);
+
+	return ERR_PTR(-EDEADLK);
 }

 /**
@@ -3269,8 +3274,8 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj)
 		}
 	} else if (enable) {
 		reg = i915_find_fence_reg(dev);
-		if (reg == NULL)
-			return -EDEADLK;
+		if (IS_ERR(reg))
+			return PTR_ERR(reg);

 		if (reg->obj) {
 			struct drm_i915_gem_object *old = reg->obj;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 25c01c2..03e703d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2982,6 +2982,30 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
 	return pending;
 }

+bool intel_has_pending_fb_unpin(struct drm_device *dev)
+{
+	struct intel_crtc *crtc;
+
+	/* Note that we don't need to be called with mode_config.lock here
+	 * as our list of CRTC objects is static for the lifetime of the
+	 * device and so cannot disappear as we iterate. Similarly, we can
+	 * happily treat the predicates as racy, atomic checks as userspace
+	 * cannot claim and pin a new fb without at least acquiring the
+	 * struct_mutex and so serialising with us.
+	 */
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, base.head) {
+		if (atomic_read(&crtc->unpin_work_count) == 0)
+			continue;
+
+		if (crtc->unpin_work)
+			intel_wait_for_vblank(dev, crtc->pipe);
+
+		return true;
+	}
+
+	return false;
+}
+
 static void intel_crtc_wait_for_pending_flips(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 2b72b1d..713009b 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -632,6 +632,7 @@ void intel_ddi_get_config(struct intel_encoder *encoder,

 /* intel_display.c */
 const char *intel_output_name(int output);
+bool intel_has_pending_fb_unpin(struct drm_device *dev);
 int intel_pch_rawclk(struct drm_device *dev);
 void intel_mark_busy(struct drm_device *dev);
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
```
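The diff's central refactor changes `i915_find_fence_reg()` from returning a bare NULL on failure to returning an error pointer, so the caller can distinguish -EAGAIN from -EDEADLK via `PTR_ERR()`. For readers unfamiliar with the convention, here is a minimal userspace re-implementation of the kernel's `ERR_PTR` helpers (modelled on `include/linux/err.h`), for illustration only — it encodes a small negative errno value in the top bytes of the address space, which no valid pointer occupies:

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace sketch of the kernel's error-pointer helpers: a pointer in
 * the last MAX_ERRNO bytes of the address space encodes -errno. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
```

With this convention, `reg = i915_find_fence_reg(dev); if (IS_ERR(reg)) return PTR_ERR(reg);` propagates either errno to userspace in a single branch, which is exactly what the second hunk of the patch does.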
On older generations (gen2, gen3) the GPU requires fences for many
operations, such as blits. The display hardware also requires fences for
scanouts and this leads to a situation where an arbitrary number of
fences may be pinned by old scanouts following a pageflip but before we
have executed the unpin workqueue. This is unpredictable by userspace
and leads to random EDEADLK when submitting an otherwise benign
execbuffer. However, we can detect when we have an outstanding flip and
so cause userspace to wait upon their completion before finally
declaring that the system is starved of fences. This is really no worse
than forcing the GPU to stall waiting for older execbuffer to retire and
release their fences before we can reallocate them for the next
execbuffer.

v2: move the test for a pending fb unpin to a common routine for later
reuse during eviction

Reported-and-tested-by: dimon@gmx.net
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c      | 13 +++++++++----
 drivers/gpu/drm/i915/intel_display.c | 24 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |  1 +
 3 files changed, 34 insertions(+), 4 deletions(-)