From patchwork Thu Jul 20 17:57:48 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Vetter
X-Patchwork-Id: 9855537
From: Daniel Vetter
To: Intel Graphics Development
Date: Thu, 20 Jul 2017 19:57:48 +0200
Message-Id: <20170720175754.30751-2-daniel.vetter@ffwll.ch>
In-Reply-To: <20170720175754.30751-1-daniel.vetter@ffwll.ch>
References: <20170720175754.30751-1-daniel.vetter@ffwll.ch>
Cc: Daniel Vetter, Mika Kuoppala, Daniel Vetter
Subject: [Intel-gfx] [PATCH 1/7] drm/i915: Avoid the gpu reset vs. modeset deadlock

... using the biggest hammer we have. This is essentially a weaponized
version of the timeout-based wedging Chris added in

commit 36703e79a982c8ce5a8e43833291f2719e92d0d1
Author: Chris Wilson
Date:   Thu Jun 22 11:56:25 2017 +0100

    drm/i915: Break modeset deadlocks on reset

Because defense-in-depth is good, it's good to still have both. Also
note that with the locking change we can now restrict this a lot (old
gpus and special testing only), so this doesn't kill the TDR benefits
on at least anything remotely modern.

And furthermore, with a few tricks it should be possible to make a much
more educated guess about whether an atomic commit is stuck waiting on
the gpu (an atomic_t counting the pending i915_sw_fence used by the
atomic modeset code should do it), so we can improve this. But for now
just start with something that is guaranteed to recover faster, for
much better CI throughput.

This de facto reverts TDR on these platforms, but there's not really a
single commit to specify as the sole offender.

v2: Add a debug message to explain what's going on. We can't DRM_ERROR
because that spams CI. And the timeout-based fallback still prints a
DRM_ERROR, in case something goes wrong.

Fixes: 4680816be336 ("drm/i915: Wait first for submission, before waiting for request completion")
Fixes: 221fe7994554 ("drm/i915: Perform a direct reset of the GPU from the waiter")
Cc: Chris Wilson
Cc: Mika Kuoppala
Cc: Joonas Lahtinen
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/intel_display.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 02b1f4966049..995522e40ec1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3471,6 +3471,12 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
 	    !gpu_reset_clobbers_display(dev_priv))
 		return;
 
+	/* We have a modeset vs reset deadlock, defensively unbreak it.
+	 *
+	 * FIXME: We can do a _lot_ better, this is just a first iteration.*/
+	i915_gem_set_wedged(dev_priv);
+	DRM_DEBUG_DRIVER("Wedging GPU to avoid deadlocks with pending modeset updates\n");
+
 	/*
 	 * Need mode_config.mutex so that we don't
 	 * trample ongoing ->detect() and whatnot.
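
The commit message suggests that an atomic_t counting the pending
i915_sw_fence used by the atomic modeset code could tell the reset path
whether a commit might be stuck on the gpu. Below is a rough userspace
sketch of that idea only, not driver code: it uses C11 atomics instead
of the kernel's atomic_t, and struct fake_display_state,
commit_fence_wait_begin(), commit_fence_wait_end() and
reset_should_wedge() are all hypothetical names made up for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device state; not the real drm_i915_private. */
struct fake_display_state {
	/* Number of atomic commits currently waiting on a GPU fence. */
	atomic_int pending_commit_fences;
};

/* Would be called when an atomic commit starts waiting on a GPU fence. */
static void commit_fence_wait_begin(struct fake_display_state *s)
{
	atomic_fetch_add(&s->pending_commit_fences, 1);
}

/* Would be called when that fence signals or the wait is abandoned. */
static void commit_fence_wait_end(struct fake_display_state *s)
{
	int old = atomic_fetch_sub(&s->pending_commit_fences, 1);

	/* An unbalanced begin/end pair would be a bookkeeping bug. */
	assert(old > 0);
}

/*
 * Reset path: only fall back to wedging when at least one commit
 * could be blocked on the GPU; otherwise a normal reset can proceed.
 */
static bool reset_should_wedge(struct fake_display_state *s)
{
	return atomic_load(&s->pending_commit_fences) > 0;
}
```

With a counter like this, the unconditional wedging added by the patch
could in principle become conditional on reset_should_wedge(). Whether
such a check can be made race-free against commits that start waiting
during the reset is not addressed by this sketch.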