From patchwork Fri Oct 23 01:32:40 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomas Elf
X-Patchwork-Id: 7469151
From: Tomas Elf
To: Intel-GFX@Lists.FreeDesktop.Org
Date: Fri, 23 Oct 2015 02:32:40 +0100
Message-Id: <1445563962-20753-19-git-send-email-tomas.elf@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1445563962-20753-1-git-send-email-tomas.elf@intel.com>
References: <1445563962-20753-1-git-send-email-tomas.elf@intel.com>
Subject: [Intel-gfx] [PATCH 18/20] drm/i915: TDR / per-engine hang recovery kernel docs

Signed-off-by: Tomas Elf
---
 Documentation/DocBook/gpu.tmpl  | 476 ++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_irq.c |   8 +-
 2 files changed, 483 insertions(+), 1 deletion(-)

diff --git a/Documentation/DocBook/gpu.tmpl b/Documentation/DocBook/gpu.tmpl
index c05d7df..91b75aa 100644
--- a/Documentation/DocBook/gpu.tmpl
+++ b/Documentation/DocBook/gpu.tmpl
@@ -4128,6 +4128,482 @@ int num_ioctls;
+
+ GPU hang management
+
+ There are two sides to handling GPU hangs: detection and recovery. In this
+ section we will discuss how the driver detects hangs and what it can do to
+ recover from them.
+
+ Detection
+
+ There is no theoretically sound definition of what a GPU hang actually is,
+ only assumptions based on empirical observations. One such observation is
+ that if a batch buffer takes more than a certain amount of time to finish
+ then we would assume that it's hung. However, one problem with that
+ assumption is that the execution might be ongoing inside the batch buffer.
+ In fact, it's easy to determine whether or not execution is progressing
+ within a batch buffer. Taking that into account, we could create a more
+ refined hang detection algorithm. Unfortunately, there is then the
+ complication that the execution might be stuck in a never-ending loop which
+ keeps execution busy for an unbounded amount of time.
+ These are all practical problems that we need to deal with when detecting a
+ hang, and whatever hang detection algorithm we come up with will have a
+ certain probability of false positives.
+
+ The i915 driver currently supports two forms of hang detection:
+
+   Periodic Hang Checking
+   Watchdog Timeout
+
+ Periodic Hang Checking
+
+ The periodic hang checker is a work queue that keeps running in the
+ background as long as there is work outstanding that is pending execution.
+ i915_hangcheck_elapsed() implements the work function of the queue and is
+ executed at every hang checker invocation.
+
+ While being scheduled the hang checker keeps track of a hang score for each
+ individual engine. The hang score is an indication of what degree of
+ severity a hang has reached for a certain engine. The higher the score gets,
+ the more radical forms of intervention are employed to force execution to
+ resume.
+
+ The hang checker is scheduled from two places:
+
+   __i915_add_request(), after a new request has been added and is pending
+   submission.
+
+   i915_hangcheck_elapsed() itself: if work is still pending for any GPU
+   engine, the hang checker is rescheduled.
+
+ The periodic hang checker keeps track of the sequence number
+ progression of the currently executing requests on every GPU engine. If
+ they keep progressing in between every hang checker invocation this is
+ interpreted as the engine being active, the hang score is cleared and
+ no intervention is made. If the sequence number has stalled for one or more
+ engines in between two hang checks that is an indication of one of two
+ things:
+
+   There is no more work pending on the given engine. If there are no
+   threads waiting for request completion this is an indication that no more
+   hang checking is necessary and the hang checker is not rescheduled. If
+   there is someone waiting for request completion the hang checker is
+   rescheduled and the hang score is continually incremented.
+
+   The given engine is truly hung. In this case a number of hardware
+   state checks are made to determine what the most suitable course of action
+   is and a corresponding hang score increment is made to reflect the
+   current hang severity.
+
+ If the hang score of any engine reaches the hung threshold, hang recovery is
+ scheduled by calling i915_handle_error() with an engine flag mask containing
+ the bits representing all currently hung engines.
+
+ Context Submission State Consistency Checking
+
+ On top of this there is the context submission status consistency pre-check
+ in the hang checker that keeps track of driver/HW consistency. The
+ underlying problem that this pre-check is trying to solve is the fact that
+ on some occasions the driver does not receive the proper context event
+ interrupt upon context state changes. Specifically, this has been observed
+ following the completion of engine reset and the subsequent resubmission of
+ the fixed-up context. At this point the engine hang is unblocked, the
+ context completes and the hardware marks the context as complete in the
+ context status buffer (CSB) for the given engine. However, the interrupt
+ that would normally signal this to the driver is lost. What this means to
+ the driver is that it gets stuck waiting for context completion on the
+ given engine until reboot, stalling all further submissions to the engine
+ ELSP.
+
+ The way to detect this is to check for inconsistencies between the context
+ submission state in the hardware and in the driver. What this means
+ is that the EXECLIST_STATUS register has to be checked for every engine.
+ From this register the ID of the currently running context can be extracted
+ as well as information about whether or not the engine is idle. This
+ information can then be compared against the current state of the execlist
+ queue for the given engine.
+ If the hardware is idle but the driver has
+ pending contexts in the execlist queue for a prolonged period of time then
+ it's safe to assume that the driver/HW state is inconsistent.
+
+ The way driver/HW state inconsistencies are rectified is by faking the
+ presumably lost context event interrupts simply by calling the execlist
+ interrupt handler manually.
+
+ What this means to the periodic hang checker is the following:
+
+   State consistency checking happens at the start of the hang check
+   procedure. If an inconsistency has been detected enough times (more
+   detections than the threshold level of I915_FAKED_CONTEXT_IRQ_THRESHOLD)
+   the hang checker will fake a context event interrupt. If there are
+   outstanding, unprocessed context events in the CSB these will be
+   acted upon.
+
+   As long as the driver/HW state has been determined to be inconsistent the
+   error handler will not be called. The reason for this is that the engine
+   recovery mode, which is the hang recovery mode that the driver prefers, is
+   not effective if context submission does not work. If the driver/HW state
+   is inconsistent it might mean that the hardware is currently executing (and
+   might be hung in) a completely different context than the driver expects.
+   In that case, resubmitting the context that the driver has identified as
+   hung could cause unexpected pre-emptions and make the situation
+   worse. Therefore, before any recovery is scheduled the driver/HW state must
+   be confirmed as consistent and stable.
+
+   If any inconsistent driver/HW states persist regardless of any attempts to
+   rectify the situation there is a final fall-back: in case the hang score on
+   any engine reaches twice that of the normal hang threshold the error
+   handler is called with no engine mask populated, meaning that a full GPU
+   reset is forced.
+   Going for a full GPU reset in this case makes sense since
+   there are two problems that need fixing: 1) the GPU
+   is hung and 2) the driver/HW state is
+   inconsistent. The full GPU reset solves both of these problems
+   and does not require the driver/HW state to be consistent to begin with, so
+   it's a sensible choice in this situation.
+
+ Watchdog Timeout
+
+ Unlike the periodic hang checker, Watchdog Timeout is a mode of hang
+ detection that relies on the GPU hardware to notify the driver in the event
+ of a hang. Another dissimilarity is that this mode does not target every
+ engine at all times but rather targets individual batch buffers that have
+ been selected by the submitting application. A submitter opts in to
+ Watchdog Timeout for a particular batch buffer by setting the
+ Watchdog Timeout enablement flag for that batch buffer. The driver then
+ emits instructions in the ring buffer before the batch buffer
+ start instruction to enable the Watchdog HW timer and afterwards to cancel
+ the same timer. The purpose of this is to keep track of how long the
+ execution stays inside the batch buffer once the execution reaches that
+ point. If the execution takes too long to clear the batch buffer and the
+ preset Watchdog Timer Threshold elapses the GPU hardware will fire a
+ Watchdog Timeout interrupt to the driver, which is interpreted as the current
+ batch buffer for the given engine being hung. Thus, hang detection in this
+ case is purely interrupt-driven and the driver is free to do other things.
+
+ Once the GT interrupt handler receives the Watchdog Timeout interrupt it
+ makes a direct call to i915_handle_error() with
+ information about which engine is hung and sets the dedicated
+ watchdog priority flag that allows the error handler to circumvent the
+ normal hang promotion logic that applies to hang detections originating
+ from the periodic hang checker.
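The hand-off from the watchdog interrupt to the error handler described above can be pictured with a small standalone C model. This is only a sketch under stated assumptions: `struct hang_report`, `engine_flag()`, `watchdog_irq_report()`, `hangcheck_report()` and `promotion_applies()` are hypothetical names made up for illustration, not the driver's real types or the actual i915_handle_error() signature. It also assumes that the watchdog path requests recovery (wedged set) and that engines are numbered below 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what a hang detector reports; not driver code. */
struct hang_report {
	uint32_t engine_mask; /* one bit per hung engine */
	bool watchdog;        /* true if raised by the hardware watchdog */
	bool wedged;          /* true if hang recovery is requested */
};

/* Bit representing one engine in the (hypothetical) mask; engine_id < 32. */
static inline uint32_t engine_flag(unsigned int engine_id)
{
	return UINT32_C(1) << engine_id;
}

/* Report built on a Watchdog Timeout interrupt: exactly one engine. */
static struct hang_report watchdog_irq_report(unsigned int engine_id)
{
	struct hang_report r = {
		.engine_mask = engine_flag(engine_id),
		.watchdog = true,
		.wedged = true, /* assumption: recovery is always requested */
	};
	return r;
}

/* Report built by the periodic hang checker: all currently hung engines. */
static struct hang_report hangcheck_report(uint32_t hung_engines)
{
	struct hang_report r = {
		.engine_mask = hung_engines,
		.watchdog = false,
		.wedged = true,
	};
	return r;
}

/* Promotion logic is only considered for hang checker detections. */
static bool promotion_applies(const struct hang_report *r)
{
	return r->wedged && !r->watchdog;
}
```

In this model a report from `watchdog_irq_report(2)` carries only the bit for engine 2 and is never a candidate for promotion, whereas a report from `hangcheck_report()` may cover several engines and remains subject to the normal promotion checks.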
+
+ In order to enable Watchdog Timeout for a particular batch buffer
+ userland libDRM has to set the I915_EXEC_ENABLE_WATCHDOG bit
+ in the batch buffer flag bitmask. This feature is
+ disabled by default and therefore it operates purely on an opt-in basis
+ from userland's point of view.
+
+ Recovery
+
+ Once a hang has been detected, either through periodic hang checking or
+ Watchdog Timeout, the error handler (i915_handle_error) takes over and
+ decides what to do from there on. Generally speaking there are two modes of
+ hang recovery that the error handler can choose from:
+
+   Engine reset
+   GPU reset
+
+ Exactly what recovery mode the hang is promoted to depends on a number of
+ factors:
+
+   Did the caller say that a hang had been detected but did not specifically
+   ask for engine reset?
+
+   If the wedged parameter is set in the call to i915_handle_error() but the
+   engine_mask parameter is set to 0 it means that we need to do some kind of
+   hang recovery but no engine is specified. In that case the outcome will
+   always be an attempt to do a GPU reset.
+
+   Did the caller say that a hang had been detected and specify at least one
+   hung engine?
+
+   If one or more engines have been specified as hung the first attempt will
+   always be to do an engine reset of those hung engines. There are two
+   reasons why a GPU reset would be carried out instead of a simple engine
+   reset:
+
+     An engine reset was carried out on the same engine too recently. What
+     constitutes "too recent" is determined by the i915 module parameter
+     gpu_reset_promotion_time. If two engine resets were attempted within the
+     time window defined by this module parameter it is decided that the
+     previous engine reset was ineffective and therefore there is no point in
+     trying another one. Thus, a full GPU reset will be done instead.
+
+     An engine reset was carried out but failed.
+     In this case the hang recovery
+     path (i915_error_work_func) would go straight from the failed engine reset
+     attempt (i915_reset_engine call) to a full GPU reset without delay.
+
+   Did the Watchdog Timeout detect the hang?
+
+   In case of the Watchdog Timeout calling the error handler the dedicated
+   watchdog parameter will be set and this forces the error handler to only
+   consider engine reset and not full GPU reset. We will only promote to full
+   GPU reset if the driver itself, based on its own hang detection mechanism,
+   has detected a persisting hang that will not be resolved by an engine reset.
+   Watchdog Timeout is user-controlled and is therefore not trusted the same
+   way.
+
+ When the error handler reaches a decision on what hang recovery mode to use
+ it sets up the corresponding reset in progress flag. There is one main
+ reset in progress flag for GPU resets as well as one dedicated reset in
+ progress flag in each hangcheck struct for each engine. After that the
+ error handler schedules the actual hang recovery work queue, which ends up
+ in i915_error_work_func, which is the function that grabs all necessary
+ locks and actually calls the main hang recovery functions. For all engines
+ that have their respective reset in progress flags set the engine reset
+ path is taken for each engine in sequence. If the GPU reset in
+ progress flag is set no attempts at carrying out engine resets are made and
+ instead the legacy full GPU reset path is taken.
+
+ Engine Reset
+
+ The engine reset path is implemented in i915_reset_engine and the following
+ is a summary of how that function operates:
+
+   Get currently running context and check context submission status
+   consistency.
+   The hang checker does a consistency check before scheduling hang
+   recovery, so the currently running (hung) context should not be in an
+   inconsistent state at this point unless the state has changed since hang
+   recovery was scheduled, in which case the engine is not truly hung. If the
+   state is inconsistent, exit early.
+
+   Force engine to idle and save the current context image. On gen8+ this is
+   done by setting the reset request bit in the reset control register. On
+   gen7 and earlier gens the MI_MODE register in combination with the ring
+   control register has to be used to disable the engine.
+
+   Save the head MMIO register value and nudge it to the next valid
+   instruction in the ring buffer after the batch buffer start instruction
+   of the currently hung batch buffer.
+
+   Reset engine.
+
+   Call the init() function for the previously hung engine, which should
+   reapply HW workarounds and carry out other essential state
+   reinitialization.
+
+   Write the previously nudged head register value to both MMIO and context
+   registers.
+
+   Submit updated context to ELSP in order to force execution to resume
+   (gen8 only).
+
+   Clear reset in progress engine flag and wake up all threads waiting for
+   requests to complete.
+
+ The intended outcome of an engine reset is that the hung batch buffer is
+ dropped by forcing the execution to resume following the batch buffer start
+ instruction in the ring buffer. This should only affect the hung engine and
+ none other. No reinitialization aside from a subset of the state for the
+ hung engine should happen and pending work should be retained, requiring no
+ further resubmissions.
+
+ GPU reset
+
+ Basically the GPU reset function, i915_reset, does three things:
+
+   Reset GEM.
+
+   Do the actual GPU reset.
+
+   Reinitialize the GEM part of the driver, which includes purging all
+   pending work and reinitializing the engines and ring setup, among other
+   things.
+
+ The intended outcome of a GPU reset is that all work, including the hung
+ batch buffer as well as all batch buffers following it, is dropped and the
+ GEM part of the driver is reinitialized following the GPU reset. This means
+ that the driver goes to an idle state together with the hardware and should
+ start over from a state in which it is ready to accept more work and move
+ forwards from there. All pending work will have to be resubmitted by the
+ submitting application.
+
 Tracing
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 8fe972b..f0e826e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2682,7 +2682,9 @@ static void i915_report_and_clear_eir(struct drm_device *dev)
  * or if one of the current engine resets fails we fall
  * back to legacy full GPU reset.
  * @watchdog: true = Engine hang detected by hardware watchdog.
+ *
  * @wedged: true = Hang detected, invoke hang recovery.
+ *
  * @fmt, ...: Error message describing reason for error.
  *
  * Do some basic checking of register state at error time and
@@ -3134,7 +3136,11 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 	return HANGCHECK_HUNG;
 }
 
-/*
+/**
+ * i915_hangcheck_elapsed - hang checker work function
+ *
+ * @work: Work item containing reference to private DRM struct.
+ *
  * This is called when the chip hasn't reported back with completed
  * batchbuffers in a long time. We keep track per ring seqno progress and
  * if there are no progress, hangcheck score for that ring is increased.