From patchwork Wed Jul 19 12:54:58 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 9851949 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D596960393 for ; Wed, 19 Jul 2017 12:55:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C810828641 for ; Wed, 19 Jul 2017 12:55:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BCE642864A; Wed, 19 Jul 2017 12:55:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 698E328647 for ; Wed, 19 Jul 2017 12:55:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 87B696E4E3; Wed, 19 Jul 2017 12:55:22 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1A4266E51F for ; Wed, 19 Jul 2017 12:55:19 +0000 (UTC) Received: by mail-wm0-x242.google.com with SMTP id 143so596175wmu.5 for ; Wed, 19 Jul 2017 05:55:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=AIr86y5IoV64ndqEoZgAXAwJOSlYHB+cAHi7m4xoM3Q=; b=f4dPg01UoJPnlxCc0tbqseJRokoyCu2JN1poDDKwoCPIEPC8RpirfF1roAReeAj6VW Ny7jMKJ7WH0j3AUEYrEgu1wAfybWy/napm4bBgaLgq3DTnQKd2weWJvz8tVN9+sC1veF 9UdocUnXoScdEKIvsl7H21SnxImPIykI/4lTU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AIr86y5IoV64ndqEoZgAXAwJOSlYHB+cAHi7m4xoM3Q=; b=Ik4BWB1OuzlMNfK5P/yedU/lrtKewLS8oPgpJhOHkRobgQY8SuVzrdh+p4mdBmR2Ss epPKVOOhXgfFugryVtdOXM9pMxIBEqNqocEgnhhLEcW03IoYILdIRQ4kxn1ySTsBA1xi 1sZrMUrp2jZ3tQAPyqct8lwWXYHawKywCKiF/OWuPSGlLKmj2awG5Ue4LFpSQPxkX/C/ y/q4pludkL7EfshEZH9DrZBVr0XkZJKlFrl/+mBTBmLMsmMu+2rs+o/DeS+ApLKL+5Rg DaPE7wRIHhlUxoMZMcTVhVhqHo6vtHHNTNc+qqDscv8wSXwucdWQ7Cc6C8JpO4dgq6ya yOUw== X-Gm-Message-State: AIVw110ay7bcPRMhvyxyBC7Gyj8WRzLY1Hb+3Y0XRcwQNMeCeYx/aw+d bbEz+IHe00tEmwYLx0c= X-Received: by 10.80.179.137 with SMTP id s9mr5825497edd.57.1500468917443; Wed, 19 Jul 2017 05:55:17 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:5640:0:960b:2678:e223:c1c6]) by smtp.gmail.com with ESMTPSA id e46sm3069420edd.26.2017.07.19.05.55.16 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 19 Jul 2017 05:55:16 -0700 (PDT) From: Daniel Vetter To: DRI Development Subject: [PATCH 5/9] drm/i915: More surgically unbreak the modeset vs reset deadlock Date: Wed, 19 Jul 2017 14:54:58 +0200 Message-Id: <20170719125502.25696-6-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.13.2 In-Reply-To: <20170719125502.25696-1-daniel.vetter@ffwll.ch> References: <20170719125502.25696-1-daniel.vetter@ffwll.ch> Cc: Daniel Vetter , Intel Graphics Development , Joonas Lahtinen , Daniel Vetter , Mika Kuoppala X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP There's no reason to entirely wedge the gpu, for the minimal deadlock bugfix we only need to unbreak/decouple the atomic commit from the gpu reset. The simplest wait to fix that is by replacing the unconditional fence wait a the top of commit_tail by a wait which completes either when the fences are done (normal case, or when a reset doesn't need to touch the display state). Or when the gpu reset needs to force-unblock all pending modeset states. Note that in both cases TDR itself keeps working, so from a userspace pov this trickery isn't observable. Users themselvs might spot a short glitch while the rendering is catching up again, but that's still better than pre-TDR where we've thrown away all the rendering, including innocent batches. Also, this fixes the regression TDR introduced of making gpu resets deadlock-prone when we do need to touch the display. One thing I noticed is that gpu_error.flags seems to use both our own wait-queue in gpu_error.wait_queue, and the generic wait_on_bit facilities. Not entirely sure why this inconsistency exists, I just picked one style. A possible future avenue could be to insert the gpu reset in-between ongoing modeset changes, which would avoid the momentary glitch. But that's a lot more work to implement in the atomic commit machinery, and given that we only need this for pre-g4x hw, of questionable utility just for the sake of polishing gpu reset even more on those old boxes. It might be useful for other features though. v2: Rebase onto 4.13 with a s/wait_queue_t/struct wait_queue_entry/. Cc: Chris Wilson Cc: Mika Kuoppala Cc: Joonas Lahtinen Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/intel_display.c | 35 ++++++++++++++++++++++++++++++----- 2 files changed, 31 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 07e98b07c5bc..369968539b40 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1564,6 +1564,7 @@ struct i915_gpu_error { unsigned long flags; #define I915_RESET_BACKOFF 0 #define I915_RESET_HANDOFF 1 +#define I915_RESET_MODESET 2 #define I915_WEDGED (BITS_PER_LONG - 1) #define I915_RESET_ENGINE (I915_WEDGED - I915_NUM_ENGINES) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 5aa7ca1ab592..4762f158032d 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -3471,10 +3471,9 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv) !gpu_reset_clobbers_display(dev_priv)) return; - /* We have a modeset vs reset deadlock, defensively unbreak it. - * - * FIXME: We can do a _lot_ better, this is just a first iteration.*/ - i915_gem_set_wedged(dev_priv); + /* We have a modeset vs reset deadlock, defensively unbreak it. */ + set_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags); + wake_up_all(&dev_priv->gpu_error.wait_queue); /* * Need mode_config.mutex so that we don't @@ -3569,6 +3568,8 @@ void intel_finish_reset(struct drm_i915_private *dev_priv) drm_modeset_drop_locks(ctx); drm_modeset_acquire_fini(ctx); mutex_unlock(&dev->mode_config.mutex); + + clear_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags); } static bool abort_flip_on_reset(struct intel_crtc *crtc) @@ -12384,6 +12385,30 @@ static void intel_atomic_helper_free_state_worker(struct work_struct *work) intel_atomic_helper_free_state(dev_priv); } +static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state) +{ + struct wait_queue_entry wait_fence, wait_reset; + struct drm_i915_private *dev_priv = to_i915(intel_state->base.dev); + + init_wait_entry(&wait_fence, 0); + init_wait_entry(&wait_reset, 0); + for (;;) { + prepare_to_wait(&intel_state->commit_ready.wait, + &wait_fence, TASK_UNINTERRUPTIBLE); + prepare_to_wait(&dev_priv->gpu_error.wait_queue, + &wait_reset, TASK_UNINTERRUPTIBLE); + + + if (i915_sw_fence_done(&intel_state->commit_ready) + || (dev_priv->gpu_error.flags & I915_RESET_MODESET)) + break; + + schedule(); + } + finish_wait(&intel_state->commit_ready.wait, &wait_fence); + finish_wait(&dev_priv->gpu_error.wait_queue, &wait_reset); +} + static void intel_atomic_commit_tail(struct drm_atomic_state *state) { struct drm_device *dev = state->dev; @@ -12397,7 +12422,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state) unsigned crtc_vblank_mask = 0; int i; - i915_sw_fence_wait(&intel_state->commit_ready); + intel_atomic_commit_fence_wait(intel_state); drm_atomic_helper_wait_for_dependencies(state);