From patchwork Fri Jan 15 14:35:43 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 8041451
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 15 Jan 2016 14:35:43 +0000
Message-Id: <1452868545-19586-5-git-send-email-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.7.0.rc3
In-Reply-To: <1452868545-19586-1-git-send-email-chris@chris-wilson.co.uk>
References: <1452868545-19586-1-git-send-email-chris@chris-wilson.co.uk>
Cc: Mika Kuoppala
Subject: [Intel-gfx] [PATCH 4/6] drm/i915: Harden detection of missed interrupts
List-Id: Intel graphics driver community testing & development

Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.

Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 11 +++++++----
 drivers/gpu/drm/i915/i915_irq.c         |  7 ++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b421b53ca128..966fc022418c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -730,10 +730,10 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 static void i915_ring_seqno_info(struct seq_file *m,
 				 struct intel_engine_cs *ring)
 {
-	if (ring->get_seqno) {
-		seq_printf(m, "Current sequence (%s): %x\n",
-			   ring->name, ring->get_seqno(ring));
-	}
+	seq_printf(m, "Current sequence (%s): %x\n",
+		   ring->name, ring->get_seqno(ring));
+	seq_printf(m, "Current user interrupts (%s): %x\n",
+		   ring->name, READ_ONCE(ring->user_interrupts));
 }
 
 static int i915_gem_seqno_info(struct seq_file *m, void *data)
@@ -1361,6 +1361,9 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 		seq_printf(m, "%s:\n", ring->name);
 		seq_printf(m, "\tseqno = %x [current %x]\n",
 			   ring->hangcheck.seqno, seqno[i]);
+		seq_printf(m, "\tuser interrupts = %x [current %x]\n",
+			   ring->hangcheck.user_interrupts,
+			   ring->user_interrupts);
 		seq_printf(m, "\tACTHD = 0x%08llx [current 0x%08llx]\n",
 			   (long long)ring->hangcheck.acthd,
 			   (long long)acthd[i]);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 07bc2cdd6252..978eebcf4594 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1000,6 +1000,7 @@ static void notify_ring(struct intel_engine_cs *ring)
 		return;
 
 	trace_i915_gem_request_notify(ring);
+	ring->user_interrupts++;
 	wake_up_all(&ring->irq_queue);
 }
 
@@ -3097,6 +3098,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	for_each_ring(ring, dev_priv, i) {
 		u64 acthd;
 		u32 seqno;
+		unsigned user_interrupts;
 		bool busy = true;
 
 		semaphore_clear_deadlocks(dev_priv);
@@ -3113,6 +3115,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 		acthd = intel_ring_get_active_head(ring);
 		seqno = ring->get_seqno(ring);
+		user_interrupts = READ_ONCE(ring->user_interrupts);
 
 		if (ring->hangcheck.seqno == seqno) {
 			if (ring_idle(ring, seqno)) {
@@ -3120,7 +3123,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 				if (waitqueue_active(&ring->irq_queue)) {
 					/* Issue a wake-up to catch stuck h/w. */
-					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
+					if (ring->hangcheck.user_interrupts == user_interrupts &&
+					    !test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
 						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
 							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
 								  ring->name);
@@ -3187,6 +3191,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 		ring->hangcheck.seqno = seqno;
 		ring->hangcheck.acthd = acthd;
+		ring->hangcheck.user_interrupts = user_interrupts;
 		busy_count += busy;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 8fb02b21e75d..b22573561669 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -90,6 +90,7 @@ struct intel_ring_hangcheck {
 	u64 acthd;
 	u64 max_acthd;
 	u32 seqno;
+	unsigned user_interrupts;
 	int score;
 	enum intel_ring_hangcheck_action action;
 	int deadlock;
@@ -301,6 +302,7 @@ struct intel_engine_cs {
 	 * inspecting request list.
 	 */
 	u32 last_submitted_seqno;
+	unsigned user_interrupts;
 
 	bool gpu_caches_dirty;
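
Note for readers following the logic rather than the diff: the condition
described in the commit message can be modelled in a few lines of
userspace C. This is only an illustrative sketch, not driver code.
struct engine_model, model_notify() and model_hangcheck() are
hypothetical names, and the idle and waiter checks (ring_idle() and
waitqueue_active() in the patch) are reduced to plain booleans here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-engine state in the patch. */
struct engine_model {
	uint32_t seqno;			/* last seqno the "hardware" reported */
	unsigned user_interrupts;	/* bumped on every user interrupt */
	bool idle;			/* assumed: all submitted work completed */
	bool has_waiters;		/* assumed: someone sleeps on the irq queue */
	struct {
		uint32_t seqno;		/* snapshot from the previous hangcheck */
		unsigned user_interrupts;
	} hangcheck;
	bool missed_irq;		/* models the bit in missed_irq_rings */
};

/* Models the counter increment added to notify_ring(). */
static void model_notify(struct engine_model *e)
{
	e->user_interrupts++;
}

/*
 * Models one hangcheck tick. Returns true only when a missed interrupt is
 * newly declared: no seqno progress, engine idle with waiters, and no user
 * interrupt counted since the previous tick.
 */
static bool model_hangcheck(struct engine_model *e)
{
	uint32_t seqno = e->seqno;
	unsigned user_interrupts = e->user_interrupts;
	bool declared = false;

	if (e->hangcheck.seqno == seqno && e->idle && e->has_waiters &&
	    e->hangcheck.user_interrupts == user_interrupts &&
	    !e->missed_irq) {
		e->missed_irq = true;
		declared = true;
	}

	/* Record this tick's snapshot for comparison at the next one. */
	e->hangcheck.seqno = seqno;
	e->hangcheck.user_interrupts = user_interrupts;
	return declared;
}
```

In this model, a tick that follows a model_notify() call never declares a
missed interrupt. Only a full interval with an unchanged counter does,
which is the extra condition this patch adds in front of
test_and_set_bit().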