From patchwork Thu Jan 31 12:01:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10790323 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 30EAA746 for ; Thu, 31 Jan 2019 12:01:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2434C30BD3 for ; Thu, 31 Jan 2019 12:01:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2278C30C0D; Thu, 31 Jan 2019 12:01:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B967230C19 for ; Thu, 31 Jan 2019 12:01:58 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CDB246ECA9; Thu, 31 Jan 2019 12:01:57 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id C44B26E690 for ; Thu, 31 Jan 2019 12:01:52 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 15415347-1500050 for multiple; Thu, 31 Jan 2019 12:01:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 31 Jan 2019 12:01:03 +0000 Message-Id: <20190131120105.23144-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190131120105.23144-1-chris@chris-wilson.co.uk> References: <20190131120105.23144-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/5] drm/i915: Uninterruptibly drain the timelines on unwedging X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP On wedging, we mark all executing requests as complete and all pending requests completed as soon as they are ready. Before unwedging though we wish to flush those pending requests prior to restoring default execution, and so we must wait. Do so interruptibly as we do not provide the EINTR gracefully back to userspace in this case but persistent in the permanently wedged start without restarting the syscall. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_reset.c | 28 ++++++++-------------------- 1 file changed, 8 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c index db37fe819504..bbe3141c5c72 100644 --- a/drivers/gpu/drm/i915/i915_reset.c +++ b/drivers/gpu/drm/i915/i915_reset.c @@ -861,7 +861,6 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915) { struct i915_gpu_error *error = &i915->gpu_error; struct i915_timeline *tl; - bool ret = false; if (!test_bit(I915_WEDGED, &error->flags)) return true; @@ -886,30 +885,20 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915) mutex_lock(&i915->gt.timelines.mutex); list_for_each_entry(tl, &i915->gt.timelines.active_list, link) { struct i915_request *rq; - long timeout; rq = i915_gem_active_get_unlocked(&tl->last_request); if (!rq) continue; /* - * We can't use our normal waiter as we want to - * avoid recursively trying to handle the current - * reset. The basic dma_fence_default_wait() installs - * a callback for dma_fence_signal(), which is - * triggered by our nop handler (indirectly, the - * callback enables the signaler thread which is - * woken by the nop_submit_request() advancing the seqno - * and when the seqno passes the fence, the signaler - * then signals the fence waking us up). + * All internal dependencies (i915_requests) will have + * been flushed by the set-wedge, but we may be stuck waiting + * for external fences. These should all be capped to 10s + * (I915_FENCE_TIMEOUT) so this wait should not be unbounded + * in the worst case. */ - timeout = dma_fence_default_wait(&rq->fence, true, - MAX_SCHEDULE_TIMEOUT); + dma_fence_default_wait(&rq->fence, false, MAX_SCHEDULE_TIMEOUT); i915_request_put(rq); - if (timeout < 0) { - mutex_unlock(&i915->gt.timelines.mutex); - goto unlock; - } } mutex_unlock(&i915->gt.timelines.mutex); @@ -930,11 +919,10 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915) smp_mb__before_atomic(); /* complete takeover before enabling execbuf */ clear_bit(I915_WEDGED, &i915->gpu_error.flags); - ret = true; -unlock: + mutex_unlock(&i915->gpu_error.wedge_mutex); - return ret; + return true; } static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)