From patchwork Wed Jul 19 12:54:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Vetter
X-Patchwork-Id: 9851939
From: Daniel Vetter
To: DRI Development
Cc: Daniel Vetter, Intel Graphics Development, Daniel Vetter, Mika Kuoppala
Date: Wed, 19 Jul 2017 14:54:58 +0200
Message-Id: <20170719125502.25696-6-daniel.vetter@ffwll.ch>
In-Reply-To: <20170719125502.25696-1-daniel.vetter@ffwll.ch>
References: <20170719125502.25696-1-daniel.vetter@ffwll.ch>
X-Mailer: git-send-email 2.13.2
Subject: [Intel-gfx] [PATCH 5/9] drm/i915: More surgically unbreak the modeset vs reset deadlock
List-Id: Intel graphics driver community testing & development

There's no reason to entirely wedge the gpu: for the minimal deadlock
bugfix we only need to unbreak/decouple the atomic commit from the gpu
reset. The simplest way to fix that is to replace the unconditional
fence wait at the top of commit_tail with a wait which completes either
when the fences are done (the normal case, which also covers a reset
that doesn't need to touch the display state), or when the gpu reset
needs to force-unblock all pending modeset states.

Note that in both cases TDR itself keeps working, so from a userspace
pov this trickery isn't observable. Users themselves might spot a short
glitch while the rendering is catching up again, but that's still
better than pre-TDR, where we threw away all the rendering, including
innocent batches. This also fixes the regression TDR introduced of
making gpu resets deadlock-prone when we do need to touch the display.

One thing I noticed is that gpu_error.flags seems to use both our own
wait-queue in gpu_error.wait_queue and the generic wait_on_bit
facilities. I'm not entirely sure why this inconsistency exists, so I
just picked one style.

A possible future avenue could be to insert the gpu reset in-between
ongoing modeset changes, which would avoid the momentary glitch. But
that's a lot more work to implement in the atomic commit machinery, and
given that we only need this for pre-g4x hw, it is of questionable
utility just for the sake of polishing gpu reset even more on those old
boxes. It might be useful for other features though.

v2: Rebase onto 4.13 with a s/wait_queue_t/struct wait_queue_entry/.

Cc: Chris Wilson
Cc: Mika Kuoppala
Cc: Joonas Lahtinen
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/i915_drv.h      |  1 +
 drivers/gpu/drm/i915/intel_display.c | 35 ++++++++++++++++++++++++++++++-----
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 07e98b07c5bc..369968539b40 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1564,6 +1564,7 @@ struct i915_gpu_error {
 	unsigned long flags;
 #define I915_RESET_BACKOFF	0
 #define I915_RESET_HANDOFF	1
+#define I915_RESET_MODESET	2
 #define I915_WEDGED		(BITS_PER_LONG - 1)
 #define I915_RESET_ENGINE	(I915_WEDGED - I915_NUM_ENGINES)
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 5aa7ca1ab592..4762f158032d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3471,10 +3471,9 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
 	    !gpu_reset_clobbers_display(dev_priv))
 		return;
 
-	/* We have a modeset vs reset deadlock, defensively unbreak it.
-	 *
-	 * FIXME: We can do a _lot_ better, this is just a first iteration.*/
-	i915_gem_set_wedged(dev_priv);
+	/* We have a modeset vs reset deadlock, defensively unbreak it. */
+	set_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
+	wake_up_all(&dev_priv->gpu_error.wait_queue);
 
 	/*
 	 * Need mode_config.mutex so that we don't
@@ -3569,6 +3568,8 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
 	drm_modeset_drop_locks(ctx);
 	drm_modeset_acquire_fini(ctx);
 	mutex_unlock(&dev->mode_config.mutex);
+
+	clear_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
 }
 
 static bool abort_flip_on_reset(struct intel_crtc *crtc)
@@ -12384,6 +12385,30 @@ static void intel_atomic_helper_free_state_worker(struct work_struct *work)
 	intel_atomic_helper_free_state(dev_priv);
 }
 
+static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state)
+{
+	struct wait_queue_entry wait_fence, wait_reset;
+	struct drm_i915_private *dev_priv = to_i915(intel_state->base.dev);
+
+	init_wait_entry(&wait_fence, 0);
+	init_wait_entry(&wait_reset, 0);
+	for (;;) {
+		prepare_to_wait(&intel_state->commit_ready.wait,
+				&wait_fence, TASK_UNINTERRUPTIBLE);
+		prepare_to_wait(&dev_priv->gpu_error.wait_queue,
+				&wait_reset, TASK_UNINTERRUPTIBLE);
+
+
+		if (i915_sw_fence_done(&intel_state->commit_ready)
+		    || test_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags))
+			break;
+
+		schedule();
+	}
+	finish_wait(&intel_state->commit_ready.wait, &wait_fence);
+	finish_wait(&dev_priv->gpu_error.wait_queue, &wait_reset);
+}
+
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
@@ -12397,7 +12422,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	unsigned crtc_vblank_mask = 0;
 	int i;
 
-	i915_sw_fence_wait(&intel_state->commit_ready);
+	intel_atomic_commit_fence_wait(intel_state);
 
 	drm_atomic_helper_wait_for_dependencies(state);
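
A note on the exit condition of the new wait loop: I915_RESET_MODESET is defined as a bit number (2), in the style used with set_bit() and clear_bit(), so checking it means testing bit 2 of gpu_error.flags rather than AND-ing the flags with the value 2. The following is a minimal userspace sketch of that distinction, not driver code. The sketch_* helpers and the shortened RESET_* names are hypothetical stand-ins for the kernel bitops and defines, and unlike the kernel versions they are not atomic.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers (not masks), mirroring the defines in the patch. */
#define RESET_HANDOFF 1
#define RESET_MODESET 2

/* Hypothetical, non-atomic stand-in for the kernel's set_bit(). */
static void sketch_set_bit(unsigned int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

/* Hypothetical, non-atomic stand-in for the kernel's test_bit(). */
static bool sketch_test_bit(unsigned int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}

/*
 * Models the exit condition of the wait loop: stop waiting once the
 * fence has signalled, or once a reset has asked pending modesets to
 * unblock by setting the MODESET bit.
 */
static bool sketch_wait_done(bool fence_done, const unsigned long *flags)
{
	return fence_done || sketch_test_bit(RESET_MODESET, flags);
}

/* Shows why AND-ing with the bit number is not an equivalent check. */
static void sketch_self_check(void)
{
	unsigned long flags = 0;

	sketch_set_bit(RESET_MODESET, &flags);		/* flags == 0x4 */
	assert(sketch_test_bit(RESET_MODESET, &flags));
	assert((flags & RESET_MODESET) == 0);		/* 0x4 & 0x2 misses bit 2 */

	flags = 0;
	sketch_set_bit(RESET_HANDOFF, &flags);		/* flags == 0x2 */
	assert(!sketch_test_bit(RESET_MODESET, &flags));
	assert((flags & RESET_MODESET) != 0);		/* 0x2 & 0x2 sees bit 1 instead */
}
```

Under these assumptions, a mask-style check would end the wait when the HANDOFF bit is set and ignore the MODESET bit, which is the opposite of what the loop is meant to do.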