From patchwork Sat Jul 15 11:40:06 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Vetter
X-Patchwork-Id: 9842333
From: Daniel Vetter
To: Intel Graphics Development
Date: Sat, 15 Jul 2017 13:40:06 +0200
Message-Id: <20170715114006.6380-8-daniel.vetter@ffwll.ch>
X-Mailer: git-send-email 2.13.2
In-Reply-To: <20170715114006.6380-1-daniel.vetter@ffwll.ch>
References: <20170715114006.6380-1-daniel.vetter@ffwll.ch>
Cc: Daniel Vetter, Mika Kuoppala, Daniel Vetter
Subject: [Intel-gfx] [PATCH 7/7] drm/i915: More surgically unbreak the modeset vs reset deadlock
List-Id: Intel graphics driver community testing & development

There's no reason to entirely wedge the gpu; for the minimal deadlock
bugfix we only need to unbreak/decouple the atomic commit from the gpu
reset. The simplest way to fix that is by replacing the unconditional
fence wait at the top of commit_tail by a wait which completes either
when the fences are done (the normal case, or when a reset doesn't need
to touch the display state), or when the gpu reset needs to
force-unblock all pending modeset states.

Note that in both cases TDR itself keeps working, so from a userspace
pov this trickery isn't observable. Users themselves might spot a short
glitch while the rendering is catching up again, but that's still
better than pre-TDR, where we threw away all the rendering, including
innocent batches. Also, this fixes the regression TDR introduced of
making gpu resets deadlock-prone when we do need to touch the display.

One thing I noticed is that gpu_error.flags seems to use both our own
wait-queue in gpu_error.wait_queue, and the generic wait_on_bit
facilities. Not entirely sure why this inconsistency exists, I just
picked one style.

A possible future avenue could be to insert the gpu reset in-between
ongoing modeset changes, which would avoid the momentary glitch. But
that's a lot more work to implement in the atomic commit machinery,
and given that we only need this for pre-g4x hw, it is of questionable
utility just for the sake of polishing gpu reset even more on those
old boxes. It might be useful for other features though.

Cc: Chris Wilson
Cc: Mika Kuoppala
Cc: Joonas Lahtinen
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/i915_drv.h      |  1 +
 drivers/gpu/drm/i915/intel_display.c | 35 ++++++++++++++++++++++++++++++-----
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdee66ca23af..ac64ac628bfb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1564,6 +1564,7 @@ struct i915_gpu_error {
 	unsigned long flags;
 #define I915_RESET_BACKOFF	0
 #define I915_RESET_HANDOFF	1
+#define I915_RESET_MODESET	2
 #define I915_WEDGED		(BITS_PER_LONG - 1)
 #define I915_RESET_ENGINE	(I915_WEDGED - I915_NUM_ENGINES)
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f2ceb908ee95..9524d6d769e4 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3471,10 +3471,9 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
 	    !gpu_reset_clobbers_display(dev_priv))
 		return;
 
-	/* We have a modeset vs reset deadlock, defensively unbreak it.
-	 *
-	 * FIXME: We can do a _lot_ better, this is just a first iteration.*/
-	i915_gem_set_wedged(dev_priv);
+	/* We have a modeset vs reset deadlock, defensively unbreak it. */
+	set_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
+	wake_up_all(&dev_priv->gpu_error.wait_queue);
 
 	/*
 	 * Need mode_config.mutex so that we don't
@@ -3572,6 +3571,8 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
 	drm_modeset_drop_locks(ctx);
 	drm_modeset_acquire_fini(ctx);
 	mutex_unlock(&dev->mode_config.mutex);
+
+	clear_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
 }
 
 static bool abort_flip_on_reset(struct intel_crtc *crtc)
@@ -12377,6 +12378,30 @@ static void intel_atomic_helper_free_state_worker(struct work_struct *work)
 	intel_atomic_helper_free_state(dev_priv);
 }
 
+static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state)
+{
+	wait_queue_t wait_fence, wait_reset;
+	struct drm_i915_private *dev_priv = to_i915(intel_state->base.dev);
+
+	init_wait_entry(&wait_fence, 0);
+	init_wait_entry(&wait_reset, 0);
+	for (;;) {
+		prepare_to_wait(&intel_state->commit_ready.wait,
+				&wait_fence, TASK_UNINTERRUPTIBLE);
+		prepare_to_wait(&dev_priv->gpu_error.wait_queue,
+				&wait_reset, TASK_UNINTERRUPTIBLE);
+
+
+		if (i915_sw_fence_done(&intel_state->commit_ready)
+		    || test_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags))
+			break;
+
+		schedule();
+	}
+	finish_wait(&intel_state->commit_ready.wait, &wait_fence);
+	finish_wait(&dev_priv->gpu_error.wait_queue, &wait_reset);
+}
+
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
@@ -12390,7 +12415,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	unsigned crtc_vblank_mask = 0;
 	int i;
 
-	i915_sw_fence_wait(&intel_state->commit_ready);
+	intel_atomic_commit_fence_wait(intel_state);
 
 	drm_atomic_helper_wait_for_dependencies(state);
 