From patchwork Thu Mar 22 07:35:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10300913 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 31FBB600F6 for ; Thu, 22 Mar 2018 07:35:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2656B29A8A for ; Thu, 22 Mar 2018 07:35:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1A8AE29A8E; Thu, 22 Mar 2018 07:35:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id A689229A8A for ; Thu, 22 Mar 2018 07:35:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0AC8D6EB82; Thu, 22 Mar 2018 07:35:55 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 831A86EB7D for ; Thu, 22 Mar 2018 07:35:53 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 11115053-1500050 for multiple; Thu, 22 Mar 2018 07:35:41 +0000 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Thu, 22 Mar 2018 07:35:41 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 22 Mar 2018 07:35:33 +0000 Message-Id: <20180322073533.5313-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.16.2 In-Reply-To: <20180322073533.5313-1-chris@chris-wilson.co.uk> References: <20180322073533.5313-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH 4/4] drm/i915: Flush pending interrupt following a GPU reset X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP After resetting the GPU (or subset of engines), call synchronize_irq() to flush any pending irq before proceeding with the cleanup. For a device level reset, we disable the interupts around the reset, but when resetting just one engine, we have to avoid such global disabling. This leaves us open to an interrupt arriving for the engine as we try to reset it. We already do try to flush the IIR following the reset, but we have to ensure that the in-flight interrupt does not land after we start cleaning up after the reset; enter synchronize_irq(). As it current stands, we very rarely, but fatally, see sequences such as: 2.... 57964564us : execlists_reset_prepare: rcs0 2.... 57964613us : execlists_reset: rcs0 seqno=424 0d.h1 57964615us : gen8_cs_irq_handler: rcs0 CS active=1 2d..1 57964617us : __i915_request_unsubmit: rcs0 fence 29:1056 <- global_seqno 1060 2.... 57964703us : execlists_reset_finish: rcs0 0..s. 57964705us : execlists_submission_tasklet: rcs0 awake?=1, active=0, irq-posted?=1 v2: Move the sync into the execlists reset handler so that we coordinate the flush with disabling the interrupt handling and canceling the pending interrupt. v3: Just use synchronize_hardirq() to avoid the might_sleep(), we do not yet have threaded-irq to worry about. Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Michel Thierry Cc: MichaƂ Winiarski Cc: Jeff McGee --- drivers/gpu/drm/i915/intel_lrc.c | 7 ++++--- drivers/gpu/drm/i915/intel_uncore.c | 4 +++- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 67b6a0f658d6..ce09c5ad334f 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -805,6 +805,10 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) spin_unlock(&engine->timeline->lock); + /* Mark all CS interrupts as complete */ + smp_store_mb(execlists->active, 0); + synchronize_hardirq(engine->i915->drm.irq); + /* * The port is checked prior to scheduling a tasklet, but * just in case we have suspended the tasklet to do the @@ -813,9 +817,6 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) */ clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted); - /* Mark all CS interrupts as complete */ - execlists->active = 0; - local_irq_restore(flags); } diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 4c616d074a97..f37ecfc69e49 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -2116,8 +2116,10 @@ int intel_gpu_reset(struct drm_i915_private *dev_priv, unsigned engine_mask) i915_stop_engines(dev_priv, engine_mask); ret = -ENODEV; - if (reset) + if (reset) { + GEM_TRACE("engine_mask=%x\n", engine_mask); ret = reset(dev_priv, engine_mask); + } if (ret != -ETIMEDOUT) break;