From patchwork Mon Aug 15 09:49:10 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 9280747
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Mon, 15 Aug 2016 10:49:10 +0100
Message-Id: <1471254551-25805-31-git-send-email-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1471254551-25805-1-git-send-email-chris@chris-wilson.co.uk>
References: <1471254551-25805-1-git-send-email-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [CI 31/32] drm/i915: Only record active and pending requests upon a GPU hang
List-Id: Intel graphics driver community
 testing & development

There is no other state pertaining to the completed requests in the
hang, other than what can be gleaned from the ringbuffer, so including
the expired requests in the list of outstanding requests simply adds
noise.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
Reviewed-by: Matthew Auld
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 109 +++++++++++++++++++---------------
 1 file changed, 61 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0f0b65214ef1..776818b86c0c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1060,12 +1060,68 @@ static void error_record_engine_registers(struct drm_i915_error_state *error,
 	}
 }
 
+static void engine_record_requests(struct intel_engine_cs *engine,
+				   struct drm_i915_gem_request *first,
+				   struct drm_i915_error_engine *ee)
+{
+	struct drm_i915_gem_request *request;
+	int count;
+
+	count = 0;
+	request = first;
+	list_for_each_entry_from(request, &engine->request_list, link)
+		count++;
+	if (!count)
+		return;
+
+	ee->requests = kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC);
+	if (!ee->requests)
+		return;
+
+	ee->num_requests = count;
+
+	count = 0;
+	request = first;
+	list_for_each_entry_from(request, &engine->request_list, link) {
+		struct drm_i915_error_request *erq;
+
+		if (count >= ee->num_requests) {
+			/*
+			 * If the ring request list was changed in
+			 * between the point where the error request
+			 * list was created and dimensioned and this
+			 * point then just exit early to avoid crashes.
+			 *
+			 * We don't need to communicate that the
+			 * request list changed state during error
+			 * state capture and that the error state is
+			 * slightly incorrect as a consequence since we
+			 * are typically only interested in the request
+			 * list state at the point of error state
+			 * capture, not in any changes happening during
+			 * the capture.
+			 */
+			break;
+		}
+
+		erq = &ee->requests[count++];
+		erq->seqno = request->fence.seqno;
+		erq->jiffies = request->emitted_jiffies;
+		erq->head = request->head;
+		erq->tail = request->tail;
+
+		rcu_read_lock();
+		erq->pid = request->ctx->pid ? pid_nr(request->ctx->pid) : 0;
+		rcu_read_unlock();
+	}
+	ee->num_requests = count;
+}
+
 static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 				  struct drm_i915_error_state *error)
 {
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	struct drm_i915_gem_request *request;
-	int i, count;
+	int i;
 
 	error->semaphore =
 		i915_error_object_create(dev_priv, dev_priv->semaphore);
@@ -1073,6 +1129,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
 		struct intel_engine_cs *engine = &dev_priv->engine[i];
 		struct drm_i915_error_engine *ee = &error->engine[i];
+		struct drm_i915_gem_request *request;
 
 		ee->pid = -1;
 		ee->engine_id = -1;
@@ -1131,6 +1188,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 			ee->cpu_ring_tail = ring->tail;
 			ee->ringbuffer =
 				i915_error_object_create(dev_priv, ring->vma);
+
+			engine_record_requests(engine, request, ee);
 		}
 
 		ee->hws_page =
@@ -1139,52 +1198,6 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 
 		ee->wa_ctx =
 			i915_error_object_create(dev_priv, engine->wa_ctx.vma);
-
-		count = 0;
-		list_for_each_entry(request, &engine->request_list, link)
-			count++;
-
-		ee->num_requests = count;
-		ee->requests =
-			kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC);
-		if (!ee->requests) {
-			ee->num_requests = 0;
-			continue;
-		}
-
-		count = 0;
-		list_for_each_entry(request, &engine->request_list, link) {
-			struct drm_i915_error_request *erq;
-
-			if (count >= ee->num_requests) {
-				/*
-				 * If the ring request list was changed in
-				 * between the point where the error request
-				 * list was created and dimensioned and this
-				 * point then just exit early to avoid crashes.
-				 *
-				 * We don't need to communicate that the
-				 * request list changed state during error
-				 * state capture and that the error state is
-				 * slightly incorrect as a consequence since we
-				 * are typically only interested in the request
-				 * list state at the point of error state
-				 * capture, not in any changes happening during
-				 * the capture.
-				 */
-				break;
-			}
-
-			erq = &ee->requests[count++];
-			erq->seqno = request->fence.seqno;
-			erq->jiffies = request->emitted_jiffies;
-			erq->head = request->head;
-			erq->tail = request->tail;
-
-			rcu_read_lock();
-			erq->pid = request->ctx->pid ? pid_nr(request->ctx->pid) : 0;
-			rcu_read_unlock();
-		}
 	}
 }
 
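
For readers without the driver source to hand, the count-then-fill capture
performed by engine_record_requests() can be approximated in userspace. This
is only a rough sketch under stated assumptions: a NULL-terminated singly
linked list stands in for the kernel's list_head, calloc() stands in for
kcalloc(..., GFP_ATOMIC), and struct request, struct error_request and
record_requests() are hypothetical names that do not exist in i915.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's request and error-record types. */
struct request {
	unsigned int seqno;
	struct request *next;	/* NULL terminates the list */
};

struct error_request {
	unsigned int seqno;
};

/*
 * Snapshot every request from 'first' to the end of the list, skipping
 * anything queued before 'first'. Returns the number of entries stored
 * in *out; returns 0 and leaves *out NULL if there is nothing to record
 * or the allocation fails.
 */
static size_t record_requests(const struct request *first,
			      struct error_request **out)
{
	const struct request *rq;
	size_t count = 0, n = 0;

	assert(out != NULL);
	*out = NULL;

	/* Pass 1: size the array, starting at the first request of interest. */
	for (rq = first; rq; rq = rq->next)
		count++;
	if (!count)
		return 0;

	*out = calloc(count, sizeof(**out));
	if (!*out)
		return 0;

	/* Pass 2: fill, never writing past what pass 1 allocated. */
	for (rq = first; rq && n < count; rq = rq->next)
		(*out)[n++].seqno = rq->seqno;

	return n;
}
```

Passing the second node of a three-entry list yields two records, which is
the effect the patch is after: requests ahead of the starting point never
reach the error state. The `n < count` bound in the second pass plays the same
role as the `count >= ee->num_requests` check in the patch, and returning the
number actually written corresponds to the final `ee->num_requests = count`.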