From patchwork Tue Jul 12 23:31:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EA1A9CCA47F for ; Tue, 12 Jul 2022 23:31:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9E0D8976B4; Tue, 12 Jul 2022 23:31:39 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5A628976BA; Tue, 12 Jul 2022 23:31:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668697; x=1689204697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qBe309PiRX8yFdfwVnKYrxKQ1Mrayni6ox1W/IA4huM=; b=KZPqkcfAv+t2pxWMaIRqFWO30zRH+wrMPSCHbk7HcMWxd64rhvYBHZ9c JA1zm0Z1bbF6nEpaZxlUTzLKzlhaJGIfxhU1WDLfKNLNDiOL/kkpYsa9K JA+0Hg2uG5/sTvoNaCtr19XtUujQJlcHj9a6PH5NtCo1mwL60CaGS3Jcp sPV+bfyg97nIWbFGLaUl2V2E0buEep9MoqPltQQZaEvTbQYsquAgdDNKa k8Ws7oGF9mEa282VwqUyybRfauKquZr4+RMEVxPk1pyla3puInXKM7oXk c7nYNnUwTKA1NP3nlf7t3RaaA4v7Eh8moAsnS3X3+5hFlJu84fohpgdDR A==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812561" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812561" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137755" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:25 -0700 Message-Id: <20220712233136.1044951-2-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Brost Remove bogus GEM_BUG_ON which compared kernel context timeline seqno to seqno in memory on engine PM unpark. If a GT reset occurred these values might not match as a kernel context could be skipped. This bug was hidden by always switching to a kernel context on park (execlists requirement). Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index b0a4a2dbe3ee9..fb3e1599d04ec 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf) ce->timeline->seqno, READ_ONCE(*ce->timeline->hwsp_seqno), ce->ring->emit); - GEM_BUG_ON(ce->timeline->seqno != - READ_ONCE(*ce->timeline->hwsp_seqno)); } if (engine->unpark) From patchwork Tue Jul 12 23:31:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915813 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B7CBC43334 for ; Tue, 12 Jul 2022 23:31:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A5462976C3; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8DDAA976B9; Tue, 12 Jul 2022 23:31:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668697; x=1689204697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dT+uNgZXsC1HYSIFLrHz/ls3gdspPj2YRZC3shcCVcE=; b=GCxsQ/idXITZGLt7a3BG4atNPZ/MZbTXZOwsR1czHG81yS9Lh3h6i1R9 ZyNh8r/4ejLb8LbhA7+z1Vf6Rr2jkaOpb3jFyMY19wRKeQShrqrTRNCli F1W1fIrGyAm+u0v6giT9AfLtqnVYYt4vErv90kCKcJaHtZlashe6f0EF4 JjSwysd/h323zuFwSzsviiXH91/KcnGinB4s5oMN380kaqWEwLwY89+3x 0g4KMTOlcKCyTgtIbC+2Liq5WPN4TZmdonqdbKvtpolUUuFohqoSiSOWA 17Yh6H5OeTuO/Yk9pFiN6/489pZ/db8PGtWuARQ7g2hIRMcLqh4B6WQXq Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812562" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812562" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137758" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:26 -0700 Message-Id: <20220712233136.1044951-3-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Brost The engine registers really shouldn't be touched during GuC submission as the GuC owns the registers. Don't call ring_is_idle and tie intel_engine_is_idle strictly to the engine pm. Because intel_engine_is_idle tied to the engine pm, retire requests before checking intel_engines_are_idle in gt_drop_caches, and lastly increase the timeout in gt_drop_caches for the intel_engines_are_idle check. Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++ drivers/gpu/drm/i915/i915_debugfs.c | 6 +++--- drivers/gpu/drm/i915/i915_drv.h | 2 +- 3 files changed, 17 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 283870c659911..959a7c92e8f4d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs *engine) { bool idle = true; + /* GuC submission shouldn't access HEAD & TAIL via MMIO */ + GEM_BUG_ON(intel_engine_uses_guc(engine)); + if (I915_SELFTEST_ONLY(!engine->mmio_base)) return true; @@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine) if (!i915_sched_engine_is_empty(engine->sched_engine)) return false; + /* + * We shouldn't touch engine registers with GuC submission as the GuC + * owns the registers. Let's tie the idle to engine pm, at worst this + * function sometimes will falsely report non-idle when idle during the + * delay to retire requests or with virtual engines and a request + * running on another instance within the same class / submit mask. + */ + if (intel_engine_uses_guc(engine)) + return false; + /* Ring stopped? */ return ring_is_idle(engine); } diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 94e5c29d2ee3a..ee5334840e9cb 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val) { int ret; + if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE) + intel_gt_retire_requests(gt); + if (val & DROP_RESET_ACTIVE && wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT)) intel_gt_set_wedged(gt); - if (val & DROP_RETIRE) - intel_gt_retire_requests(gt); - if (val & (DROP_IDLE | DROP_ACTIVE)) { ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); if (ret) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c22f29c3faa0e..53c7474dde495 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -278,7 +278,7 @@ struct i915_gem_mm { u32 shrink_count; }; -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */ +#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */ unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915, u64 context); From patchwork Tue Jul 12 23:31:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915820 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 59B21CCA47F for ; Tue, 12 Jul 2022 23:32:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 34EA1976D6; Tue, 12 Jul 2022 23:31:44 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id BF1C9976BD; Tue, 12 Jul 2022 23:31:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668697; x=1689204697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=koQtewMmMuXnhh3PufImj24iKBtfjhu5HFVVsTn04ao=; b=VnwHwh5en+O9nyJO++ky41837UitDEUUshCxjC8lOt5k+iRkNp90TRsz 0xTdrDCloCgiqJzBev8Km/l1xKX2CIVEC7pKqqb1b2cOXrRt9j7b7JMVr Ai78hFZnmTpVrpmuWzMWq0y2vj5yfH3QUsqZA/ZzXKxzuj0cZ1J7xkBf2 s9Piqktimcyv01qRHCvHckMejt5vV8Bx+Mbpz7ElhQ/SGnXsyMeypy7L+ IawLKVxOxfWii6iFn5zJOaF2bklwD5DK4JRMF+AWUv0OGZalRJSLaEbGf hqXpSYGykkLxT91c09HIy4ghfCJdUQtS1BUa1xXSkI2+8o6uh14YOic8G g==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812563" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812563" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137762" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:27 -0700 Message-Id: <20220712233136.1044951-4-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 03/12] drm/i915/guc: Fix issues with live_preempt_cancel X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Brost Having semaphores results in different behavior when a dependent request is cancelled. In the case of semaphores the request could be on the HW and complete successfully while without the request is held in the driver and the error from the dependent request is propagated. Fix live_preempt_cancel to take this behavior into account. Also update live_preempt_cancel to use new function intel_context_ban rather than intel_context_set_banned. Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/selftest_execlists.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index 02fc97a0ab502..015f8cd3463e2 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -2087,7 +2087,7 @@ static int __cancel_active0(struct live_preempt_cancel *arg) goto out; } - intel_context_set_banned(rq->context); + intel_context_ban(rq->context, rq); err = intel_engine_pulse(arg->engine); if (err) goto out; @@ -2146,7 +2146,7 @@ static int __cancel_active1(struct live_preempt_cancel *arg) if (err) goto out; - intel_context_set_banned(rq[1]->context); + intel_context_ban(rq[1]->context, rq[1]); err = intel_engine_pulse(arg->engine); if (err) goto out; @@ -2229,7 +2229,7 @@ static int __cancel_queued(struct live_preempt_cancel *arg) if (err) goto out; - intel_context_set_banned(rq[2]->context); + intel_context_ban(rq[2]->context, rq[2]); err = intel_engine_pulse(arg->engine); if (err) goto out; @@ -2244,7 +2244,13 @@ static int __cancel_queued(struct live_preempt_cancel *arg) goto out; } - if (rq[1]->fence.error != 0) { + /* + * The behavior between having semaphores and not is different. With + * semaphores the subsequent request is on the hardware and not cancelled + * while without the request is held in the driver and cancelled. + */ + if (intel_engine_has_semaphores(rq[1]->engine) && + rq[1]->fence.error != 0) { pr_err("Normal inflight1 request did not complete\n"); err = -EINVAL; goto out; @@ -2292,7 +2298,7 @@ static int __cancel_hostile(struct live_preempt_cancel *arg) goto out; } - intel_context_set_banned(rq->context); + intel_context_ban(rq->context, rq); err = intel_engine_pulse(arg->engine); /* force reset */ if (err) goto out; From patchwork Tue Jul 12 23:31:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AD1D5CCA47F for ; Tue, 12 Jul 2022 23:32:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A42AB976E1; Tue, 12 Jul 2022 23:31:45 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 03162976B9; Tue, 12 Jul 2022 23:31:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VuMvfjQQB8L/h3Nyp5OLpiuai7an51itwG9GQ9JeVKM=; b=DQxwv4URKe40kechVxKR6CaBDCGs0G2g2HhZ64DFn88ezCOrOh928Axr 3/tQ6MeDyPWwTNfjFou8Upy3WNnIOJOdhQa95WtpPD2I4Z4h1yP9+C33T m4kCkWrpNFaU3/d4kkG/Y7tO5jUqZogTmSj4LwI4GKQv6X9pCzXc+QL+L 5ls+0giLmTiCjCcjYfs+MR1H0A8nFGZlyfWQjQl6dMZiHomrRmD0AOAEU ppU+jGhVP/zxcqSUUhg4IFkHY6OE63Un7KUeks8T631hsqSLrN9Y/ZO6Q 374/UsAyy/J6SBAjnu2hoqcPMuXEOx8M1URuXLii886Ahnp+9ODErBpaH g==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812564" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812564" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137765" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:28 -0700 Message-Id: <20220712233136.1044951-5-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 04/12] drm/i915/guc: Add GuC <-> kernel time stamp translation information X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison It is useful to be able to match GuC events to kernel events when looking at the GuC log. That requires being able to convert GuC timestamps to kernel time. So, when dumping error captures and/or GuC logs, include a stamp in both time zones plus the clock frequency. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 2 ++ drivers/gpu/drm/i915/gt/uc/intel_guc.c | 19 +++++++++++++++++++ drivers/gpu/drm/i915/gt/uc/intel_guc.h | 2 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 2 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 12 ++++++++++++ drivers/gpu/drm/i915/i915_gpu_error.h | 3 +++ 6 files changed, 40 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index e6bb24dc7b998..b1258f7239b06 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -1004,6 +1004,8 @@ #define GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC (1 << 9) #define GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC (1 << 7) +#define GUCPMTIMESTAMP _MMIO(0xc3e8) + #define __GEN9_RCS0_MOCS0 0xc800 #define GEN9_GFX_MOCS(i) _MMIO(__GEN9_RCS0_MOCS0 + (i) * 4) #define __GEN9_VCS0_MOCS0 0xc900 diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 2706a8c650900..ab4aacc516aa4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc) intel_uncore_forcewake_put(uncore, FORCEWAKE_GT); } +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p) +{ + struct intel_gt *gt = guc_to_gt(guc); + intel_wakeref_t wakeref; + u32 stamp = 0; + u64 ktime; + + intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p); + + with_intel_runtime_pm(>->i915->runtime_pm, wakeref) + stamp = intel_uncore_read(gt->uncore, GUCPMTIMESTAMP); + ktime = ktime_get_boottime_ns(); + + drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", ktime, ktime); + drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", stamp, stamp); + drm_printf(p, "CS timestamp frequency: %u Hz, %u ns\n", + gt->clock_frequency, gt->clock_period_ns); +} + int intel_guc_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index d0d99f178f2d4..111484372e6f8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -459,4 +459,6 @@ void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p); void intel_guc_write_barrier(struct intel_guc *guc); +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p); + #endif diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index 02311ad902641..45f62cdabe356 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -763,6 +763,8 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, if (!obj) return 0; + intel_guc_dump_time_info(guc, p); + map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(map)) { DRM_DEBUG("Failed to pin object\n"); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 52ea13fee015e..5bcf36c292ebd 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -690,6 +690,7 @@ static void err_print_uc(struct drm_i915_error_state_buf *m, intel_uc_fw_dump(&error_uc->guc_fw, &p); intel_uc_fw_dump(&error_uc->huc_fw, &p); + err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp); intel_gpu_error_print_vma(m, NULL, error_uc->guc_log); } @@ -732,6 +733,8 @@ static void err_print_gt_global_nonguc(struct drm_i915_error_state_buf *m, int i; err_printf(m, "GT awake: %s\n", str_yes_no(gt->awake)); + err_printf(m, "CS timestamp frequency: %u Hz, %d ns\n", + gt->clock_frequency, gt->clock_period_ns); err_printf(m, "EIR: 0x%08x\n", gt->eir); err_printf(m, "PGTBL_ER: 0x%08x\n", gt->pgtbl_er); @@ -1687,6 +1690,13 @@ gt_record_uc(struct intel_gt_coredump *gt, */ error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); + + /* + * Save the GuC log and include a timestamp reference for converting the + * log times to system times (in conjunction with the error->boottime and + * gt->clock_frequency fields saved elsewhere). + */ + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, "GuC log buffer", compress); @@ -1845,6 +1855,8 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt) static void gt_record_info(struct intel_gt_coredump *gt) { memcpy(>->info, >->_gt->info, sizeof(struct intel_gt_info)); + gt->clock_frequency = gt->_gt->clock_frequency; + gt->clock_period_ns = gt->_gt->clock_period_ns; } /* diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h index 55a143b92d10e..d8a8b3d529e09 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.h +++ b/drivers/gpu/drm/i915/i915_gpu_error.h @@ -150,6 +150,8 @@ struct intel_gt_coredump { u32 gtt_cache; u32 aux_err; /* gen12 */ u32 gam_done; /* gen12 */ + u32 clock_frequency; + u32 clock_period_ns; /* Display related */ u32 derrmr; @@ -164,6 +166,7 @@ struct intel_gt_coredump { struct intel_uc_fw guc_fw; struct intel_uc_fw huc_fw; struct i915_vma_coredump *guc_log; + u32 timestamp; bool is_guc_capture; } *uc; From patchwork Tue Jul 12 23:31:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C4D54C43334 for ; Tue, 12 Jul 2022 23:31:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C6596976B9; Tue, 12 Jul 2022 23:31:39 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 35620976BF; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QAhi7M4Z/IjmQDzgNmMupr2I8G5TxqWPD/tZZMc0WgY=; b=D9OedH/OcmsRk2qc+Q5bsuOLZzJRw6NjbZG0LBfn1G7AodSiulXtJ8jD IlB2J8UUw3IebpFMBsJi7CTwQxcJQLLrpMiY/I/vkccBotN9h4t8xy+Z+ IKisNLo4tBWFwhxiwZFAN7ezQLN37TFmBqTFUtUdD36Oks0R32HLLmSGV NrRSWEgR/rOVboWbRjmUoo3vav6eRnBpN4H4r3StcgCnlXrxv20nxVEUw SivVbp+osimMba91K0Praszf5DoZLwHEgxE3tZMLV2LbcAkZlIAdxymls kk4FzPpvUUtIySy7SdISW5+XCiJC0hBhmXTneMEwc3D+ZxuypDiDxM3fJ Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812565" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812565" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137769" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:29 -0700 Message-Id: <20220712233136.1044951-6-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 05/12] drm/i915/guc: Record CTB info in error logs X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison When debugging GuC communication issues, it is useful to have the CTB info available. So add the state and buffer contents to the error capture log. Also, add a sub-structure for the GuC specific error capture info as it is now becoming numerous. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gpu_error.c | 59 +++++++++++++++++++++++---- drivers/gpu/drm/i915/i915_gpu_error.h | 20 +++++++-- 2 files changed, 67 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 5bcf36c292ebd..0e43b8dd22cf7 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -683,6 +683,18 @@ static void err_print_pciid(struct drm_i915_error_state_buf *m, pdev->subsystem_device); } +static void err_print_guc_ctb(struct drm_i915_error_state_buf *m, + const char *name, + const struct intel_ctb_coredump *ctb) +{ + if (!ctb->size) + return; + + err_printf(m, "GuC %s CTB: raw: 0x%08X, 0x%08X/%08X, cached: 0x%08X/%08X, desc = 0x%08X, buf = 0x%08X x 0x%08X\n", + name, ctb->raw_status, ctb->raw_head, ctb->raw_tail, + ctb->head, ctb->tail, ctb->desc_offset, ctb->cmds_offset, ctb->size); +} + static void err_print_uc(struct drm_i915_error_state_buf *m, const struct intel_uc_coredump *error_uc) { @@ -690,8 +702,12 @@ static void err_print_uc(struct drm_i915_error_state_buf *m, intel_uc_fw_dump(&error_uc->guc_fw, &p); intel_uc_fw_dump(&error_uc->huc_fw, &p); - err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp); - intel_gpu_error_print_vma(m, NULL, error_uc->guc_log); + err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->guc.timestamp); + intel_gpu_error_print_vma(m, NULL, error_uc->guc.vma_log); + err_printf(m, "GuC CTB fence: %d\n", error_uc->guc.last_fence); + err_print_guc_ctb(m, "Send", error_uc->guc.ctb + 0); + err_print_guc_ctb(m, "Recv", error_uc->guc.ctb + 1); + intel_gpu_error_print_vma(m, NULL, error_uc->guc.vma_ctb); } static void err_free_sgl(struct scatterlist *sgl) @@ -866,7 +882,7 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, if (error->gt) { bool print_guc_capture = false; - if (error->gt->uc && error->gt->uc->is_guc_capture) + if (error->gt->uc && error->gt->uc->guc.is_guc_capture) print_guc_capture = true; err_print_gt_display(m, error->gt); @@ -1021,7 +1037,8 @@ static void cleanup_uc(struct intel_uc_coredump *uc) { kfree(uc->guc_fw.path); kfree(uc->huc_fw.path); - i915_vma_coredump_free(uc->guc_log); + i915_vma_coredump_free(uc->guc.vma_log); + i915_vma_coredump_free(uc->guc.vma_ctb); kfree(uc); } @@ -1670,6 +1687,23 @@ gt_record_engines(struct intel_gt_coredump *gt, } } +static void gt_record_guc_ctb(struct intel_ctb_coredump *saved, + const struct intel_guc_ct_buffer *ctb, + const void *blob_ptr, struct intel_guc *guc) +{ + if (!ctb || !ctb->desc) + return; + + saved->raw_status = ctb->desc->status; + saved->raw_head = ctb->desc->head; + saved->raw_tail = ctb->desc->tail; + saved->head = ctb->head; + saved->tail = ctb->tail; + saved->size = ctb->size; + saved->desc_offset = ((void *)ctb->desc) - blob_ptr; + saved->cmds_offset = ((void *)ctb->cmds) - blob_ptr; +} + static struct intel_uc_coredump * gt_record_uc(struct intel_gt_coredump *gt, struct i915_vma_compress *compress) @@ -1696,9 +1730,16 @@ gt_record_uc(struct intel_gt_coredump *gt, * log times to system times (in conjunction with the error->boottime and * gt->clock_frequency fields saved elsewhere). */ - error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); - error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, - "GuC log buffer", compress); + error_uc->guc.timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); + error_uc->guc.vma_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, + "GuC log buffer", compress); + error_uc->guc.vma_ctb = create_vma_coredump(gt->_gt, uc->guc.ct.vma, + "GuC CT buffer", compress); + error_uc->guc.last_fence = uc->guc.ct.requests.last_fence; + gt_record_guc_ctb(error_uc->guc.ctb + 0, &uc->guc.ct.ctbs.send, + uc->guc.ct.ctbs.send.desc, (struct intel_guc *)&uc->guc); + gt_record_guc_ctb(error_uc->guc.ctb + 1, &uc->guc.ct.ctbs.recv, + uc->guc.ct.ctbs.send.desc, (struct intel_guc *)&uc->guc); return error_uc; } @@ -2051,9 +2092,9 @@ __i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask, u32 du error->gt->uc = gt_record_uc(error->gt, compress); if (error->gt->uc) { if (dump_flags & CORE_DUMP_FLAG_IS_GUC_CAPTURE) - error->gt->uc->is_guc_capture = true; + error->gt->uc->guc.is_guc_capture = true; else - GEM_BUG_ON(error->gt->uc->is_guc_capture); + GEM_BUG_ON(error->gt->uc->guc.is_guc_capture); } } diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h index d8a8b3d529e09..efc75cc2ffdb9 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.h +++ b/drivers/gpu/drm/i915/i915_gpu_error.h @@ -125,6 +125,15 @@ struct intel_engine_coredump { struct intel_engine_coredump *next; }; +struct intel_ctb_coredump { + u32 raw_head, head; + u32 raw_tail, tail; + u32 raw_status; + u32 desc_offset; + u32 cmds_offset; + u32 size; +}; + struct intel_gt_coredump { const struct intel_gt *_gt; bool awake; @@ -165,9 +174,14 @@ struct intel_gt_coredump { struct intel_uc_coredump { struct intel_uc_fw guc_fw; struct intel_uc_fw huc_fw; - struct i915_vma_coredump *guc_log; - u32 timestamp; - bool is_guc_capture; + struct guc_info { + struct intel_ctb_coredump ctb[2]; + struct i915_vma_coredump *vma_ctb; + struct i915_vma_coredump *vma_log; + u32 timestamp; + u16 last_fence; + bool is_guc_capture; + } guc; } *uc; struct intel_gt_coredump *next; From patchwork Tue Jul 12 23:31:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C3A09C433EF for ; Tue, 12 Jul 2022 23:32:08 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 54E8F976C4; Tue, 12 Jul 2022 23:31:43 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4B03A976B7; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2ehL0zZkwvXNrmnYBiiPSCHEAsOmycj3lFAbLlsu+t0=; b=AiugKUqi56EXYa6CUMMOergOOf5QzrmulH8wctax2m110xJ1odambXhc q115gsQNZVDaIj3SPI/fKO8sZLth2CXUfix6cBxdb2FaUbJ0IAGN+eX2U qFok5kAKar+YZzfzeELmSy8HPx/Bb/fUTqFDgdwoe+XL0iE4I9j3T/V1d j3X/IBVqxVdeJKB/9//JW5OuNoeSe9IlBKcc0w+dZenLZrwh2yE5O7G1W 3mUxkSqQVH8lw5rutA9A/SGsIBQ7TZL6JWeKzooTf/Mbvv0zzorDWzKFi Wk47GAs6DXBBxWbnvXSfcs+AcnZ+YLAXA6HP1bf/4F9DdOc2BUBUxoO4s w==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812566" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812566" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137773" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:30 -0700 Message-Id: <20220712233136.1044951-7-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 06/12] drm/i915/guc: Use streaming loads to speed up dumping the guc log X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson , DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson Use a temporary page and mempy_from_wc to reduce the time it takes to dump the guc log to debugfs. Signed-off-by: Chris Wilson Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 24 ++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index 45f62cdabe356..ff091adb56096 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -749,8 +749,9 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, struct intel_guc *guc = log_to_guc(log); struct intel_uc *uc = container_of(guc, struct intel_uc, guc); struct drm_i915_gem_object *obj = NULL; - u32 *map; - int i = 0; + void *map; + u32 *page; + int i, j; if (!intel_guc_is_supported(guc)) return -ENODEV; @@ -763,23 +764,34 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, if (!obj) return 0; + page = (u32 *)__get_free_page(GFP_KERNEL); + if (!page) + return -ENOMEM; + intel_guc_dump_time_info(guc, p); map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(map)) { DRM_DEBUG("Failed to pin object\n"); drm_puts(p, "(log data unaccessible)\n"); + free_page((unsigned long)page); return PTR_ERR(map); } - for (i = 0; i < obj->base.size / sizeof(u32); i += 4) - drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n", - *(map + i), *(map + i + 1), - *(map + i + 2), *(map + i + 3)); + for (i = 0; i < obj->base.size; i += PAGE_SIZE) { + if (!i915_memcpy_from_wc(page, map + i, PAGE_SIZE)) + memcpy(page, map + i, PAGE_SIZE); + + for (j = 0; j < PAGE_SIZE / sizeof(u32); j += 4) + drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n", + *(page + j + 0), *(page + j + 1), + *(page + j + 2), *(page + j + 3)); + } drm_puts(p, "\n"); i915_gem_object_unpin_map(obj); + free_page((unsigned long)page); return 0; } From patchwork Tue Jul 12 23:31:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915824 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DF775C43334 for ; Tue, 12 Jul 2022 23:32:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 463B7976D7; Tue, 12 Jul 2022 23:31:44 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 67D97976B9; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VFqMe0j+S44jZw9uqafpjDNUbGUGAOF3nDONWQ0Vxp4=; b=TbttHTuumy8IGUpxqODJ0KqYg3O6B4/8dBhJzPmL13v78Rrk/+Te6qli KKYOBOjYR0h7m93OdU7PpwEqfSF8vTv/YrGlaujV9HDybpDwizegjXwVm 3zI2whq5VICsWJi9Nyq2YQ6gO8mr3vnB+0xmuY+qOV2F2xTjNn8zu3xzI vqcTi77wDAzkrmqqHDVBJRAwucFcXLT8LPjGyAH2Z5jzCuRkFcXUwvsDD 8BctcD8VsZj0z9II8xgkzAl9zMQ1Q6SmppYkB3VtJrI2XW8s9WuaVZxWo Jus2C2Vfa9R2jfGUAJuoK95DyxSz3m7c83Ss0J1je49TuzKb6Dsf/wUxe g==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812567" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812567" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137776" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:31 -0700 Message-Id: <20220712233136.1044951-8-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 07/12] drm/i915/guc: Route semaphores to GuC for Gen12+ X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Micha=C5=82_Winiarski?= , DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Michał Winiarski Since we're going to use semaphores in selftests (and eventually in regular GuC submission), let's route semaphores to GuC. Signed-off-by: Michał Winiarski Reviewed-by: Matthew Brost Reviewed-by: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h | 4 ++++ drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 14 ++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h index 8dc063f087eb1..a7092f711e9cd 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h @@ -102,6 +102,10 @@ #define GUC_SEND_TRIGGER (1<<0) #define GEN11_GUC_HOST_INTERRUPT _MMIO(0x1901f0) +#define GEN12_GUC_SEM_INTR_ENABLES _MMIO(0xc71c) +#define GUC_SEM_INTR_ROUTE_TO_GUC BIT(31) +#define GUC_SEM_INTR_ENABLE_ALL (0xff) + #define GUC_NUM_DOORBELLS 256 /* format of the HW-monitored doorbell cacheline */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 40f726c61e951..7537459080278 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -3953,13 +3953,27 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine) void intel_guc_submission_enable(struct intel_guc *guc) { + struct intel_gt *gt = guc_to_gt(guc); + + /* Enable and route to GuC */ + if (GRAPHICS_VER(gt->i915) >= 12) + intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES, + GUC_SEM_INTR_ROUTE_TO_GUC | + GUC_SEM_INTR_ENABLE_ALL); + guc_init_lrc_mapping(guc); guc_init_engine_stats(guc); } void intel_guc_submission_disable(struct intel_guc *guc) { + struct intel_gt *gt = guc_to_gt(guc); + /* Note: By the time we're here, GuC may have already been reset */ + + /* Disable and route to host */ + if (GRAPHICS_VER(gt->i915) >= 12) + intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES, 0x0); } static bool __guc_submission_supported(struct intel_guc *guc) From patchwork Tue Jul 12 23:31:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915816 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CA598C433EF for ; Tue, 12 Jul 2022 23:31:55 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BC347976B3; Tue, 12 Jul 2022 23:31:40 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 847FF976C1; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=T5bXWjsBZJq1RD2+IzFYmGze2avfRAHa+EQvelqgFzk=; b=eqS0RHoMUxLVuTNAQDm39qI+tSgK98nnhXK97sL343G4TRkd+FF45APb LxrLzX+/eUqTVZj1BXvdoflB5+pRdNaBPPRcsx9Pz87Ze4+1uEhuGT0fI 06FFN/vI2+oM+yFYTk3Mo0FWIEusTYzMfo/c36tIa+YrmWLfshG3H4dK7 GSmBBBcOArnGhBJEkYp/4b6Ks7FiILy8yimLstbsOBxTsB1iYmVuioaPl OKOha0G9SzOAgi3x3rvSBC+E6Hd9pXOM1N0H/4isbodC3fMM0dH1PM+o6 GYjGkNXI79mU1xFVBFGHP8lPW6KxDYFyrSFz9nR+lcvpsKLbScvAXkTP7 Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812568" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812568" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137779" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:32 -0700 Message-Id: <20220712233136.1044951-9-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 08/12] drm/i915/guc: Add selftest for a hung GuC X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org, Rahul Kumar Singh Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Rahul Kumar Singh Add a test to check that the hangcheck will recover from a submission hang in the GuC. Signed-off-by: Rahul Kumar Singh --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1 + .../drm/i915/gt/uc/selftest_guc_hangcheck.c | 159 ++++++++++++++++++ .../drm/i915/selftests/i915_live_selftests.h | 1 + 3 files changed, 161 insertions(+) create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 7537459080278..72832a4f4bac7 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -4937,4 +4937,5 @@ bool intel_guc_virtual_engine_has_heartbeat(const struct intel_engine_cs *ve) #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftest_guc.c" #include "selftest_guc_multi_lrc.c" +#include "selftest_guc_hangcheck.c" #endif diff --git a/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c b/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c new file mode 100644 index 0000000000000..af913c4b09d37 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c @@ -0,0 +1,159 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright �� 2019 Intel Corporation + */ + +#include "selftests/igt_spinner.h" +#include "selftests/igt_reset.h" +#include "selftests/intel_scheduler_helpers.h" +#include "gt/intel_engine_heartbeat.h" +#include "gem/selftests/mock_context.h" + +#define BEAT_INTERVAL 100 + +static struct i915_request *nop_request(struct intel_engine_cs *engine) +{ + struct i915_request *rq; + + rq = intel_engine_create_kernel_request(engine); + if (IS_ERR(rq)) + return rq; + + i915_request_get(rq); + i915_request_add(rq); + + return rq; +} + +static int intel_hang_guc(void *arg) +{ + struct intel_gt *gt = arg; + int ret = 0; + struct i915_gem_context *ctx; + struct intel_context *ce; + struct igt_spinner spin; + struct i915_request *rq; + intel_wakeref_t wakeref; + struct i915_gpu_error *global = >->i915->gpu_error; + struct intel_engine_cs *engine; + unsigned int reset_count; + u32 guc_status; + u32 old_beat; + + ctx = kernel_context(gt->i915, NULL); + if (IS_ERR(ctx)) { + pr_err("Failed get kernel context: %ld\n", PTR_ERR(ctx)); + return PTR_ERR(ctx); + } + + wakeref = intel_runtime_pm_get(gt->uncore->rpm); + + ce = intel_context_create(gt->engine[BCS0]); + if (IS_ERR(ce)) { + ret = PTR_ERR(ce); + pr_err("Failed to create spinner request: %d\n", ret); + goto err; + } + + engine = ce->engine; + reset_count = i915_reset_count(global); + + old_beat = engine->props.heartbeat_interval_ms; + ret = intel_engine_set_heartbeat(engine, BEAT_INTERVAL); + if (ret) { + pr_err("Failed to boost heatbeat interval: %d\n", ret); + goto err; + } + + ret = igt_spinner_init(&spin, engine->gt); + if (ret) { + pr_err("Failed to create spinner: %d\n", ret); + goto err; + } + + rq = igt_spinner_create_request(&spin, ce, MI_NOOP); + intel_context_put(ce); + if (IS_ERR(rq)) { + ret = PTR_ERR(rq); + pr_err("Failed to create spinner request: %d\n", ret); + goto err_spin; + } + + ret = request_add_spin(rq, &spin); + if (ret) { + i915_request_put(rq); + pr_err("Failed to add Spinner request: %d\n", ret); + goto err_spin; + } + + ret = intel_reset_guc(gt); + if (ret) { + i915_request_put(rq); + pr_err("Failed to reset GuC, ret = %d\n", ret); + goto err_spin; + } + + guc_status = intel_uncore_read(gt->uncore, GUC_STATUS); + if (!(guc_status & GS_MIA_IN_RESET)) { + i915_request_put(rq); + pr_err("GuC failed to reset: status = 0x%08X\n", guc_status); + ret = -EIO; + goto err_spin; + } + + /* Wait for the heartbeat to cause a reset */ + ret = intel_selftest_wait_for_rq(rq); + i915_request_put(rq); + if (ret) { + pr_err("Request failed to complete: %d\n", ret); + goto err_spin; + } + + if (i915_reset_count(global) == reset_count) { + pr_err("Failed to record a GPU reset\n"); + ret = -EINVAL; + goto err_spin; + } + +err_spin: + igt_spinner_end(&spin); + igt_spinner_fini(&spin); + intel_engine_set_heartbeat(engine, old_beat); + + if (ret == 0) { + rq = nop_request(engine); + if (IS_ERR(rq)) { + ret = PTR_ERR(rq); + goto err; + } + + ret = intel_selftest_wait_for_rq(rq); + i915_request_put(rq); + if (ret) { + pr_err("No-op failed to complete: %d\n", ret); + goto err; + } + } + +err: + intel_runtime_pm_put(gt->uncore->rpm, wakeref); + kernel_context_close(ctx); + + return ret; +} + +int intel_guc_hang_check(struct drm_i915_private *i915) +{ + static const struct i915_subtest tests[] = { + SUBTEST(intel_hang_guc), + }; + struct intel_gt *gt = to_gt(i915); + + if (intel_gt_is_wedged(gt)) + return 0; + + if (!intel_uc_uses_guc_submission(>->uc)) + return 0; + + return intel_gt_live_subtests(tests, gt); +} diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h index bdd290f2bf3cd..aaf8a380e5c78 100644 --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h @@ -49,5 +49,6 @@ selftest(perf, i915_perf_live_selftests) selftest(slpc, intel_slpc_live_selftests) selftest(guc, intel_guc_live_selftests) selftest(guc_multi_lrc, intel_guc_multi_lrc_live_selftests) +selftest(guc_hang, intel_guc_hang_check) /* Here be dragons: keep last to run last! */ selftest(late_gt_pm, intel_gt_pm_late_selftests) From patchwork Tue Jul 12 23:31:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DA909C43334 for ; Tue, 12 Jul 2022 23:32:19 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DE978976D5; Tue, 12 Jul 2022 23:31:43 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id A3330976C2; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EszMBTrM3tmnOy5hUioPTWVuKtMomlV5dThX/JbF/V0=; b=eiHb64oULWS02c0JMr497glxwOzBAvEwZlCa8lV3wv1PNbMMRhJMcAz4 wtRiNITDTMlXQYa11ge/1wh7qIRKjsVNoLcRqmKdEHk03VyD5M3wWt1s1 4wyF/xJ417vEpOaRkZ5+PMldzIxgJv1R4z5EDpl2Eq95P+oTmyU6+4SIw YenFUYMBJLroVhz3RCdhXsj7+dOrMs7V2rD9PMN2FPIHBu5bLEoRUk0+F 6NrwPzZt3xyquaOeaXNMb8/DjHidXYJ2oUbPlBSPcjJiGJytIVMTkeUpq 7s0Cs86vCbjv29IfHexGOBWgxZCTW7gMSva14TcZpzOH6RQo02SUvr1Qm g==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812569" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812569" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137783" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:36 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:33 -0700 Message-Id: <20220712233136.1044951-10-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 09/12] drm/i915/selftest: Cope with not having an RCS engine X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison It is no longer guaranteed that there will always be an RCS engine. So, use the helper function for finding the first available engine that can be used for general purpose selftets. Signed-off-by: John Harrison Reviewed-by: Matthew Brost --- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 6493265d5f642..7f3bb1d34dfbf 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -1302,13 +1302,15 @@ static int igt_reset_wait(void *arg) { struct intel_gt *gt = arg; struct i915_gpu_error *global = >->i915->gpu_error; - struct intel_engine_cs *engine = gt->engine[RCS0]; + struct intel_engine_cs *engine; struct i915_request *rq; unsigned int reset_count; struct hang h; long timeout; int err; + engine = intel_selftest_find_any_engine(gt); + if (!engine || !intel_engine_can_store_dword(engine)) return 0; @@ -1432,7 +1434,7 @@ static int __igt_reset_evict_vma(struct intel_gt *gt, int (*fn)(void *), unsigned int flags) { - struct intel_engine_cs *engine = gt->engine[RCS0]; + struct intel_engine_cs *engine; struct drm_i915_gem_object *obj; struct task_struct *tsk = NULL; struct i915_request *rq; @@ -1444,6 +1446,8 @@ static int __igt_reset_evict_vma(struct intel_gt *gt, if (!gt->ggtt->num_fences && flags & EXEC_OBJECT_NEEDS_FENCE) return 0; + engine = intel_selftest_find_any_engine(gt); + if (!engine || !intel_engine_can_store_dword(engine)) return 0; @@ -1819,12 +1823,14 @@ static int igt_handle_error(void *arg) { struct intel_gt *gt = arg; struct i915_gpu_error *global = >->i915->gpu_error; - struct intel_engine_cs *engine = gt->engine[RCS0]; + struct intel_engine_cs *engine; struct hang h; struct i915_request *rq; struct i915_gpu_coredump *error; int err; + engine = intel_selftest_find_any_engine(gt); + /* Check that we can issue a global GPU and engine reset */ if (!intel_has_reset_engine(gt)) From patchwork Tue Jul 12 23:31:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 24443C433EF for ; Tue, 12 Jul 2022 23:32:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8EEB1976E0; Tue, 12 Jul 2022 23:31:45 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id B73C5976C4; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668698; x=1689204698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DggPn44YSGRWvUcM1n2qdrDcdW1KhV8vx2xjTdu51VI=; b=j4UylcGLK0DxbdSuB4LcpsCqhwdznHe4bGcirN2vx+RyXprYYVQTpxyt MOQ/gu8ZiP7KA+8zXBkNFD9ygpvz6NnVtiNb9H2KpcOBgMPOyt4mTTfAo Bwz8YFtZ+wmf5ERautBGHtMUrNdQkCXhJwV1h5VjZWtv8Ri2KOfjtn9Jk o5Zx2bW5zsSARPcCANXgIVyKgwgprkr47Cgehmmb/3bcoFtUrXBHQqzXM e43KCzIeCXT0ES4EridXoYj1hNP46n+Ag8OkMlA9j+BNmyjVT6tmGyn0A ivykdKyY1iHuSuft3hRtbuo52lx9u0mzJv/CaQr43w0/PCUdKqXDJF69o Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812570" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812570" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137786" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:37 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:34 -0700 Message-Id: <20220712233136.1044951-11-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Brost The GuC needs a copy of a golden context for implementing watchdog resets (aka media resets). This context is larger on newer platforms. So adjust the size being allocated/copied accordingly. Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index ba7541f3ca610..74cbe8eaf5318 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -464,7 +464,11 @@ static void fill_engine_enable_masks(struct intel_gt *gt, } #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32)) -#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE) +#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32)) +#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50) ? \ + XEHP_LR_HW_CONTEXT_SIZE : \ + LR_HW_CONTEXT_SIZE) +#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SZ(i915)) static int guc_prep_golden_context(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); @@ -525,7 +529,7 @@ static int guc_prep_golden_context(struct intel_guc *guc) * on all engines). */ ads_blob_write(guc, ads.eng_state_size[guc_class], - real_size - LRC_SKIP_SIZE); + real_size - LRC_SKIP_SIZE(gt->i915)); ads_blob_write(guc, ads.golden_context_lrca[guc_class], addr_ggtt); @@ -599,7 +603,7 @@ static void guc_init_golden_context(struct intel_guc *guc) } GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) != - real_size - LRC_SKIP_SIZE); + real_size - LRC_SKIP_SIZE(gt->i915)); GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt); addr_ggtt += alloc_size; From patchwork Tue Jul 12 23:31:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915822 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 60140C433EF for ; Tue, 12 Jul 2022 23:32:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 490E0976C2; Tue, 12 Jul 2022 23:31:43 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id DFC2A976B7; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668699; x=1689204699; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rHhl06ObpqZjyXGZ1VryxfY6f/5XCI/yrin3mQpA+pc=; b=BPEc9Uxz+8rb4YXiC159SyowGT5JpLodSYnMYBAJxqRkmk4TK4jcBGAE JN/e2ow8rCIAHGWrxOAZaOlAiDDwXsI4agRFMiMAHfTd/9RXtcVpqc3+D TdvYdZn/mz+HJqAvgynUZ3o6r/UD2OjPkLCoE+wFDPkenqxLRF0BdUdfU UPbjciF+qwP3plQsGdAVokrUjcNNHyakAZaptk8a5qXdO1IH51f+EK2Mj 2+kgd0Ylx8egglh365sYWu/iWT2vkc5jjFUzQE4CIB+epeRKTbIEz9gBg zwoumHB92t5hLPXJo1ZZ/2svOBfhK/ZGECl7lOvLPWQuMpKKZR+3IqRrl g==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812571" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812571" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137789" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:37 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:35 -0700 Message-Id: <20220712233136.1044951-12-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 11/12] drm/i915/guc: Don't abort on CTB_UNUSED status X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison When the KMD sends a CLIENT_RESET request to GuC (as part of the suspend sequence), GuC will mark the CTB buffer as 'UNUSED'. If the KMD then checked the CTB queue, it would see a non-zero status value and report the buffer as corrupted. Technically, no G2H messages should be received once the CLIENT_RESET has been sent. However, if a context was outstanding on an engine then it would get reset and a reset notification would be sent. So, don't actually treat UNUSED as a catastrophic error. Just flag it up as unexpected and keep going. Signed-off-by: John Harrison --- .../i915/gt/uc/abi/guc_communication_ctb_abi.h | 8 +++++--- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 18 ++++++++++++++++-- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h index df83c1cc7c7a6..28b8387f97b77 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h @@ -37,6 +37,7 @@ * | | | - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too large) | * | | | - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated message) | * | | | - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail modified) | + * | | | - _`GUC_CTB_STATUS_UNUSED` = 8 (CTB is not in use) | * +---+-------+--------------------------------------------------------------+ * |...| | RESERVED = MBZ | * +---+-------+--------------------------------------------------------------+ @@ -49,9 +50,10 @@ struct guc_ct_buffer_desc { u32 tail; u32 status; #define GUC_CTB_STATUS_NO_ERROR 0 -#define GUC_CTB_STATUS_OVERFLOW (1 << 0) -#define GUC_CTB_STATUS_UNDERFLOW (1 << 1) -#define GUC_CTB_STATUS_MISMATCH (1 << 2) +#define GUC_CTB_STATUS_OVERFLOW BIT(0) +#define GUC_CTB_STATUS_UNDERFLOW BIT(1) +#define GUC_CTB_STATUS_MISMATCH BIT(2) +#define GUC_CTB_STATUS_UNUSED BIT(3) u32 reserved[13]; } __packed; static_assert(sizeof(struct guc_ct_buffer_desc) == 64); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index f01325cd1b625..11b5d4ddb19ce 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -816,8 +816,22 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg) if (unlikely(ctb->broken)) return -EPIPE; - if (unlikely(desc->status)) - goto corrupted; + if (unlikely(desc->status)) { + u32 status = desc->status; + + if (status & GUC_CTB_STATUS_UNUSED) { + /* + * Potentially valid if a CLIENT_RESET request resulted in + * contexts/engines being reset. But should never happen as + * no contexts should be active when CLIENT_RESET is sent. + */ + CT_ERROR(ct, "Unexpected G2H after GuC has stopped!\n"); + status &= ~GUC_CTB_STATUS_UNUSED; + } + + if (status) + goto corrupted; + } GEM_BUG_ON(head > size); From patchwork Tue Jul 12 23:31:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12915818 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7884BC433EF for ; Tue, 12 Jul 2022 23:32:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CEAA2976D2; Tue, 12 Jul 2022 23:31:42 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id F21B5976C6; Tue, 12 Jul 2022 23:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657668699; x=1689204699; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ldq8n0PudUb3seogcSKkSq2VMrWwDUfdLFuwepmIGZc=; b=FmOtYFLKHCM66yOOO8x/RxQ+EcSNTThBan2eOeRB7t74juJumhwe8YMx E6YTb67ocde6Ww18AcdhlH4on5CHLGRpBuvII9rV/kZ4HuIAMFWIiyBex QwSWsIdalDgshSUB44EpfU1Dn48nY30SkmMitQtsA8EFVjmx+mgo7+mXI wWRgJvY+VHMyRT7YjIi213E3XjkuTrR1wtfahPfhiIwt64EB1/dq4zbz+ p7Q6OAG5wR4N44ucXJmC7gma6v8os74Fx2OFoQA/9vJbF6fw0t93XHJWo fdp/lrwYicaFbzLsXWalplxZTC4YiBl0+ZRRwKCRM5e1wvS4cUfHmE1S/ A==; X-IronPort-AV: E=McAfee;i="6400,9594,10406"; a="285812573" X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="285812573" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2022 16:31:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,266,1650956400"; d="scan'208";a="722137792" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by orsmga004.jf.intel.com with ESMTP; 12 Jul 2022 16:31:37 -0700 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 12 Jul 2022 16:31:36 -0700 Message-Id: <20220712233136.1044951-13-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220712233136.1044951-1-John.C.Harrison@Intel.com> References: <20220712233136.1044951-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 12/12] drm/i915/guc: Add a helper for log buffer size X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Alan Previn , DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Alan Previn Add a helper to get GuC log buffer size. Signed-off-by: Alan Previn Reviewed-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 49 ++++++++++++---------- 1 file changed, 27 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index ff091adb56096..4bb81d2cf3828 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -15,6 +15,32 @@ static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log); +static u32 intel_guc_log_size(struct intel_guc_log *log) +{ + /* + * GuC Log buffer Layout: + * + * NB: Ordering must follow "enum guc_log_buffer_type". + * + * +===============================+ 00B + * | Debug state header | + * +-------------------------------+ 32B + * | Crash dump state header | + * +-------------------------------+ 64B + * | Capture state header | + * +-------------------------------+ 96B + * | | + * +===============================+ PAGE_SIZE (4KB) + * | Debug logs | + * +===============================+ + DEBUG_SIZE + * | Crash Dump logs | + * +===============================+ + CRASH_SIZE + * | Capture logs | + * +===============================+ + CAPTURE_SIZE + */ + return PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + CAPTURE_BUFFER_SIZE; +} + /** * DOC: GuC firmware log * @@ -461,32 +487,11 @@ int intel_guc_log_create(struct intel_guc_log *log) GEM_BUG_ON(log->vma); - /* - * GuC Log buffer Layout - * (this ordering must follow "enum guc_log_buffer_type" definition) - * - * +===============================+ 00B - * | Debug state header | - * +-------------------------------+ 32B - * | Crash dump state header | - * +-------------------------------+ 64B - * | Capture state header | - * +-------------------------------+ 96B - * | | - * +===============================+ PAGE_SIZE (4KB) - * | Debug logs | - * +===============================+ + DEBUG_SIZE - * | Crash Dump logs | - * +===============================+ + CRASH_SIZE - * | Capture logs | - * +===============================+ + CAPTURE_SIZE - */ if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE) DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n", CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc)); - guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + - CAPTURE_BUFFER_SIZE; + guc_log_size = intel_guc_log_size(log); vma = intel_guc_allocate_vma(guc, guc_log_size); if (IS_ERR(vma)) {