From patchwork Thu Feb 13 19:51:34 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alan Previn X-Patchwork-Id: 13973977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D244BC021A6 for ; Thu, 13 Feb 2025 19:51:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BAC4010EB8E; Thu, 13 Feb 2025 19:51:42 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="hZP7cCZ+"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by gabe.freedesktop.org (Postfix) with ESMTPS id 06C9510EB8A; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739476301; x=1771012301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MMfDOo5DjIX3gfbJX6BC6ZgkPTQ8H6tLdpxDUTRtyCM=; b=hZP7cCZ+i74Vw7gy9snMNa87nQBSOC+7yl0jDKHB9EON4CbZyldcPUvX Ut/RtdUGmEiyBu6yG4inAWgolWRr+b3jgOqQhh07wHA1+inUhlQ9CG4F0 l3gTM3UYQSvs/cKoTSzZamJ7Alnp3ZcDf3Fk/qwOjOqWxtVeDMhhrliM0 XQf/bEwW1ljOnkHkJl0RHF7i2uU+tPPyAHz2ytl8o4g8iJ7I84SXG5YOn brRqVBz2LJ0kMaUxRrOQ8aBi7mzi/SMx58Uqy50cLkxNdNzYhQWyXpUC9 rs9quQYEjwnfVrl6dycFt22W0uiBJNPXmehzPGQliOZcDCtNcmWEyOvS1 Q==; X-CSE-ConnectionGUID: k/o/nD4xSx2ggEmCwIoCDw== X-CSE-MsgGUID: irKRRqzySHu4oA0ZBg9Cwg== X-IronPort-AV: E=McAfee;i="6700,10204,11344"; a="40354746" X-IronPort-AV: E=Sophos;i="6.13,282,1732608000"; d="scan'208";a="40354746" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by 
fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2025 11:51:40 -0800 X-CSE-ConnectionGUID: MdvtPMyqT72eziMDXX8lSg== X-CSE-MsgGUID: cnX0WBD1TfOiUKHNe61CVw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="117372209" Received: from aalteres-desk1.fm.intel.com ([10.1.39.140]) by fmviesa003.fm.intel.com with ESMTP; 13 Feb 2025 11:51:40 -0800 From: Alan Previn To: intel-xe@lists.freedesktop.org Cc: Alan Previn , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , John Harrison , Matthew Brost , Zhanjun Dong , Rodrigo Vivi Subject: [PATCH v8 1/6] drm/xe/guc: Rename __guc_capture_parsed_output Date: Thu, 13 Feb 2025 11:51:34 -0800 Message-Id: <20250213195139.3396082-2-alan.previn.teres.alexis@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> References: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Since '__guc_capture_parsed_output *' is a handle that is retrieved, stored and relinquished by an entity external to GuC (i.e. xe_devcoredump), let's rename it to something formal without the '__' prefix and export it via a header file. 
v7: - Copyright header fix in xe_guc_capture_snapshot_types.h (Rodrigo) Signed-off-by: Alan Previn Reviewed-by: Rodrigo Vivi Reviewed-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_devcoredump_types.h | 2 +- drivers/gpu/drm/xe/xe_guc_capture.c | 83 ++++++------------- drivers/gpu/drm/xe/xe_guc_capture.h | 2 +- .../drm/xe/xe_guc_capture_snapshot_types.h | 53 ++++++++++++ drivers/gpu/drm/xe/xe_hw_engine.c | 2 +- drivers/gpu/drm/xe/xe_hw_engine_types.h | 5 -- 6 files changed, 81 insertions(+), 66 deletions(-) create mode 100644 drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h index 1a1d16a96b2d..c94ce21043a8 100644 --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h @@ -58,7 +58,7 @@ struct xe_devcoredump_snapshot { * this single-node tracker works because devcoredump will always only * produce one hw-engine capture per devcoredump event */ - struct __guc_capture_parsed_output *matched_node; + struct xe_guc_capture_snapshot *matched_node; /** @vm: Snapshot of VM state */ struct xe_vm_snapshot *vm; diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c index f6d523e4c5fe..e04c87739267 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -26,6 +26,7 @@ #include "xe_guc_ads.h" #include "xe_guc_capture.h" #include "xe_guc_capture_types.h" +#include "xe_guc_capture_snapshot_types.h" #include "xe_guc_ct.h" #include "xe_guc_exec_queue_types.h" #include "xe_guc_log.h" @@ -53,40 +54,6 @@ struct __guc_capture_bufstate { u32 wr; }; -/* - * struct __guc_capture_parsed_output - extracted error capture node - * - * A single unit of extracted error-capture output data grouped together - * at an engine-instance level. We keep these nodes in a linked list. - * See cachelist and outlist below. 
- */ -struct __guc_capture_parsed_output { - /* - * A single set of 3 capture lists: a global-list - * an engine-class-list and an engine-instance list. - * outlist in __guc_capture_parsed_output will keep - * a linked list of these nodes that will eventually - * be detached from outlist and attached into to - * xe_codedump in response to a context reset - */ - struct list_head link; - bool is_partial; - u32 eng_class; - u32 eng_inst; - u32 guc_id; - u32 lrca; - u32 type; - bool locked; - enum xe_hw_engine_snapshot_source_id source; - struct gcap_reg_list_info { - u32 vfid; - u32 num_regs; - struct guc_mmio_reg *regs; - } reginfo[GUC_STATE_CAPTURE_TYPE_MAX]; -#define GCAP_PARSED_REGLIST_INDEX_GLOBAL BIT(GUC_STATE_CAPTURE_TYPE_GLOBAL) -#define GCAP_PARSED_REGLIST_INDEX_ENGCLASS BIT(GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS) -}; - /* * Define all device tables of GuC error capture register lists * NOTE: @@ -287,7 +254,7 @@ struct xe_guc_state_capture { static void guc_capture_remove_stale_matches_from_list(struct xe_guc_state_capture *gc, - struct __guc_capture_parsed_output *node); + struct xe_guc_capture_snapshot *node); static const struct __guc_mmio_reg_descr_group * guc_capture_get_device_reglist(struct xe_device *xe) @@ -841,7 +808,7 @@ static void check_guc_capture_size(struct xe_guc *guc) } static void -guc_capture_add_node_to_list(struct __guc_capture_parsed_output *node, +guc_capture_add_node_to_list(struct xe_guc_capture_snapshot *node, struct list_head *list) { list_add(&node->link, list); @@ -849,7 +816,7 @@ guc_capture_add_node_to_list(struct __guc_capture_parsed_output *node, static void guc_capture_add_node_to_outlist(struct xe_guc_state_capture *gc, - struct __guc_capture_parsed_output *node) + struct xe_guc_capture_snapshot *node) { guc_capture_remove_stale_matches_from_list(gc, node); guc_capture_add_node_to_list(node, &gc->outlist); @@ -857,14 +824,14 @@ guc_capture_add_node_to_outlist(struct xe_guc_state_capture *gc, static void 
guc_capture_add_node_to_cachelist(struct xe_guc_state_capture *gc, - struct __guc_capture_parsed_output *node) + struct xe_guc_capture_snapshot *node) { guc_capture_add_node_to_list(node, &gc->cachelist); } static void guc_capture_free_outlist_node(struct xe_guc_state_capture *gc, - struct __guc_capture_parsed_output *n) + struct xe_guc_capture_snapshot *n) { if (n) { n->locked = 0; @@ -876,9 +843,9 @@ guc_capture_free_outlist_node(struct xe_guc_state_capture *gc, static void guc_capture_remove_stale_matches_from_list(struct xe_guc_state_capture *gc, - struct __guc_capture_parsed_output *node) + struct xe_guc_capture_snapshot *node) { - struct __guc_capture_parsed_output *n, *ntmp; + struct xe_guc_capture_snapshot *n, *ntmp; int guc_id = node->guc_id; list_for_each_entry_safe(n, ntmp, &gc->outlist, link) { @@ -888,7 +855,7 @@ guc_capture_remove_stale_matches_from_list(struct xe_guc_state_capture *gc, } static void -guc_capture_init_node(struct xe_guc *guc, struct __guc_capture_parsed_output *node) +guc_capture_init_node(struct xe_guc *guc, struct xe_guc_capture_snapshot *node) { struct guc_mmio_reg *tmp[GUC_STATE_CAPTURE_TYPE_MAX]; int i; @@ -1067,13 +1034,13 @@ guc_capture_log_get_register(struct xe_guc *guc, struct __guc_capture_bufstate * return 0; } -static struct __guc_capture_parsed_output * +static struct xe_guc_capture_snapshot * guc_capture_get_prealloc_node(struct xe_guc *guc) { - struct __guc_capture_parsed_output *found = NULL; + struct xe_guc_capture_snapshot *found = NULL; if (!list_empty(&guc->capture->cachelist)) { - struct __guc_capture_parsed_output *n, *ntmp; + struct xe_guc_capture_snapshot *n, *ntmp; /* get first avail node from the cache list */ list_for_each_entry_safe(n, ntmp, &guc->capture->cachelist, link) { @@ -1081,7 +1048,7 @@ guc_capture_get_prealloc_node(struct xe_guc *guc) break; } } else { - struct __guc_capture_parsed_output *n, *ntmp; + struct xe_guc_capture_snapshot *n, *ntmp; /* * traverse reversed and steal back the oldest node 
already @@ -1100,11 +1067,11 @@ guc_capture_get_prealloc_node(struct xe_guc *guc) return found; } -static struct __guc_capture_parsed_output * -guc_capture_clone_node(struct xe_guc *guc, struct __guc_capture_parsed_output *original, +static struct xe_guc_capture_snapshot * +guc_capture_clone_node(struct xe_guc *guc, struct xe_guc_capture_snapshot *original, u32 keep_reglist_mask) { - struct __guc_capture_parsed_output *new; + struct xe_guc_capture_snapshot *new; int i; new = guc_capture_get_prealloc_node(guc); @@ -1146,7 +1113,7 @@ guc_capture_extract_reglists(struct xe_guc *guc, struct __guc_capture_bufstate * struct xe_gt *gt = guc_to_gt(guc); struct guc_state_capture_group_header_t ghdr = {0}; struct guc_state_capture_header_t hdr = {0}; - struct __guc_capture_parsed_output *node = NULL; + struct xe_guc_capture_snapshot *node = NULL; struct guc_mmio_reg *regs = NULL; int i, numlists, numregs, ret = 0; enum guc_state_capture_type datatype; @@ -1439,11 +1406,11 @@ void xe_guc_capture_process(struct xe_guc *guc) __guc_capture_process_output(guc); } -static struct __guc_capture_parsed_output * +static struct xe_guc_capture_snapshot * guc_capture_alloc_one_node(struct xe_guc *guc) { struct drm_device *drm = guc_to_drm(guc); - struct __guc_capture_parsed_output *new; + struct xe_guc_capture_snapshot *new; int i; new = drmm_kzalloc(drm, sizeof(*new), GFP_KERNEL); @@ -1468,7 +1435,7 @@ guc_capture_alloc_one_node(struct xe_guc *guc) static void __guc_capture_create_prealloc_nodes(struct xe_guc *guc) { - struct __guc_capture_parsed_output *node = NULL; + struct xe_guc_capture_snapshot *node = NULL; int i; for (i = 0; i < PREALLOC_NODES_MAX_COUNT; ++i) { @@ -1583,7 +1550,7 @@ xe_engine_manual_capture(struct xe_hw_engine *hwe, struct xe_hw_engine_snapshot struct xe_devcoredump *devcoredump = &xe->devcoredump; enum guc_capture_list_class_type capture_class; const struct __guc_mmio_reg_descr_group *list; - struct __guc_capture_parsed_output *new; + struct 
xe_guc_capture_snapshot *new; enum guc_state_capture_type type; u16 guc_id = 0; u32 lrca = 0; @@ -1849,7 +1816,7 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm * * Returns: found guc-capture node ptr else NULL */ -struct __guc_capture_parsed_output * +struct xe_guc_capture_snapshot * xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q) { struct xe_hw_engine *hwe; @@ -1878,7 +1845,7 @@ xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q) } if (guc_class <= GUC_LAST_ENGINE_CLASS) { - struct __guc_capture_parsed_output *n, *ntmp; + struct xe_guc_capture_snapshot *n, *ntmp; struct xe_guc *guc = &q->gt->uc.guc; u16 guc_id = q->guc->id; u32 lrca = xe_lrc_ggtt_addr(q->lrc[0]); @@ -1931,7 +1898,7 @@ xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q) coredump->snapshot.hwe[id] = xe_hw_engine_snapshot_capture(hwe, q); } else { - struct __guc_capture_parsed_output *new; + struct xe_guc_capture_snapshot *new; new = xe_guc_capture_get_matching_and_lock(q); if (new) { @@ -1965,7 +1932,7 @@ void xe_guc_capture_put_matched_nodes(struct xe_guc *guc) { struct xe_device *xe = guc_to_xe(guc); struct xe_devcoredump *devcoredump = &xe->devcoredump; - struct __guc_capture_parsed_output *n = devcoredump->snapshot.matched_node; + struct xe_guc_capture_snapshot *n = devcoredump->snapshot.matched_node; if (n) { guc_capture_remove_stale_matches_from_list(guc->capture, n); diff --git a/drivers/gpu/drm/xe/xe_guc_capture.h b/drivers/gpu/drm/xe/xe_guc_capture.h index 20a078dc4b85..046989fba3b1 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.h +++ b/drivers/gpu/drm/xe/xe_guc_capture.h @@ -50,7 +50,7 @@ size_t xe_guc_capture_ads_input_worst_size(struct xe_guc *guc); const struct __guc_mmio_reg_descr_group * xe_guc_capture_get_reg_desc_list(struct xe_gt *gt, u32 owner, u32 type, enum guc_capture_list_class_type capture_class, bool is_ext); -struct __guc_capture_parsed_output *xe_guc_capture_get_matching_and_lock(struct xe_exec_queue 
*q); +struct xe_guc_capture_snapshot *xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q); void xe_engine_manual_capture(struct xe_hw_engine *hwe, struct xe_hw_engine_snapshot *snapshot); void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p); void xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q); diff --git a/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h b/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h new file mode 100644 index 000000000000..a5579e69da2e --- /dev/null +++ b/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2025 Intel Corporation + */ + +#ifndef _XE_GUC_CAPTURE_SNAPSHOT_TYPES_H +#define _XE_GUC_CAPTURE_SNAPSHOT_TYPES_H + +#include +#include + +struct guc_mmio_reg; + +enum xe_guc_capture_snapshot_source { + XE_ENGINE_CAPTURE_SOURCE_MANUAL, + XE_ENGINE_CAPTURE_SOURCE_GUC +}; + +/* + * struct xe_guc_capture_snapshot - extracted error capture node + * + * A single unit of extracted error-capture output data grouped together + * at an engine-instance level. We keep these nodes in a linked list. + * See cachelist and outlist below. + */ +struct xe_guc_capture_snapshot { + /* + * A single set of 3 capture lists: a global-list + * an engine-class-list and an engine-instance list. 
+ * outlist in xe_guc_state_capture will keep + * a linked list of these nodes that will eventually + * be detached from outlist and attached into to + * xe_codedump in response to a context reset + */ + struct list_head link; + bool is_partial; + u32 eng_class; + u32 eng_inst; + u32 guc_id; + u32 lrca; + u32 type; + bool locked; + enum xe_guc_capture_snapshot_source source; + struct gcap_reg_list_info { + u32 vfid; + u32 num_regs; + struct guc_mmio_reg *regs; + } reginfo[GUC_STATE_CAPTURE_TYPE_MAX]; +#define GCAP_PARSED_REGLIST_INDEX_GLOBAL BIT(GUC_STATE_CAPTURE_TYPE_GLOBAL) +#define GCAP_PARSED_REGLIST_INDEX_ENGCLASS BIT(GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS) +}; + +#endif diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index fc447751fe78..a99e3160724b 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -843,7 +843,7 @@ struct xe_hw_engine_snapshot * xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) { struct xe_hw_engine_snapshot *snapshot; - struct __guc_capture_parsed_output *node; + struct xe_guc_capture_snapshot *node; if (!xe_hw_engine_is_valid(hwe)) return NULL; diff --git a/drivers/gpu/drm/xe/xe_hw_engine_types.h b/drivers/gpu/drm/xe/xe_hw_engine_types.h index e4191a7a2c31..de69e2628f2f 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_types.h +++ b/drivers/gpu/drm/xe/xe_hw_engine_types.h @@ -152,11 +152,6 @@ struct xe_hw_engine { struct xe_hw_engine_group *hw_engine_group; }; -enum xe_hw_engine_snapshot_source_id { - XE_ENGINE_CAPTURE_SOURCE_MANUAL, - XE_ENGINE_CAPTURE_SOURCE_GUC -}; - /** * struct xe_hw_engine_snapshot - Hardware engine snapshot * From patchwork Thu Feb 13 19:51:35 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Previn X-Patchwork-Id: 13973975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9BF31C021A6 for ; Thu, 13 Feb 2025 19:51:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B78A510EB8A; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="LrBR9ANG"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2C41D10EB88; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739476301; x=1771012301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=J7voNj+xAJYUyUil0t25q7g3ZuCshmzYphxAtdQVM7Y=; b=LrBR9ANG/xrA6QCvkytfVI6TxUmIfTdn3QT/GCx4FpDpRZ5Y4QFZlSqc hmGOClwe2MPzPJX4R+h8yPsbglb9FgUSqn1OFhIJ69nT9TRV52rjoazTq TRuAJEXvqf+T9AtPSBrQ4bzyOdaGXYRdVVfHajDtXGRBw5ROD4qdmasyd Rv2FAfvJcTh4BG0VdlLDfrzMaqtfUrEk/R0DlLS3XhFGjBbE0SdV5nfPB PiMwSzWcQ9BeLxtlfwbVbVCt49fZh4sSgs/JDC8aQwS3RdIEA7r2v+khU 7vzqn08wIQbjQB372KWe6gpUoh13AG2WtyooN/FLX1KRAsCFbN3M0IsGk g==; X-CSE-ConnectionGUID: ok9bNEZ5SA2TbsZr4XgQUQ== X-CSE-MsgGUID: 02tZ8vIRS7CI6nntJaOZ/g== X-IronPort-AV: E=McAfee;i="6700,10204,11344"; a="40354747" X-IronPort-AV: E=Sophos;i="6.13,282,1732608000"; d="scan'208";a="40354747" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2025 11:51:40 -0800 X-CSE-ConnectionGUID: TOPAjIoyRwiKbjaiD+6Ksw== X-CSE-MsgGUID: oe7cbA5ETTqen8FFscwUgw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="117372212" Received: from 
aalteres-desk1.fm.intel.com ([10.1.39.140]) by fmviesa003.fm.intel.com with ESMTP; 13 Feb 2025 11:51:40 -0800 From: Alan Previn To: intel-xe@lists.freedesktop.org Cc: Alan Previn , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , John Harrison , Matthew Brost , Zhanjun Dong , Rodrigo Vivi Subject: [PATCH v8 2/6] drm/xe/guc: Don't store capture nodes in xe_devcoredump_snapshot Date: Thu, 13 Feb 2025 11:51:35 -0800 Message-Id: <20250213195139.3396082-3-alan.previn.teres.alexis@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> References: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" GuC-Err-Capture should not be storing register snapshot nodes directly inside of the top level xe_devcoredump_snapshot structure that it doesn't control. Furthermore, that is not right from a driver subsystem layering perspective. Instead, when a matching GuC-Err-Capture register snapshot is available, it should be stored in the xe_hw_engine_snapshot structure. To ensure the manual snapshots can be retrieved and released like the firmware-reported snapshot nodes, replace xe_engine_manual_capture with xe_guc_capture_snapshot_store_manual_job (which generates and stores the manual GuC-Err-Capture register snapshot with a job association within its internal outlist). Take note that this replacement function will NOT handle raw jobless register dumps. That will be created as a separate helper in a following patch of this series. v8:- Add back missing SRIOV-VF-bailout check when getting manual register dumps (Zhanjun). 
- Add header-comments on the separation of jobless manual-capture as a subsequent patch. (Zhanjun) - Change some xe_gt_warns to xe_gt_dbgs. (Zhanjun) v7:- Use xe_gt_dbg instead of xe_gt_warn when neither GuC-sourced nor manual-sourced capture node is found during xe_hw_engine printing because this can be valid in some code-paths such as for gt-reset events. (John Harrison) Signed-off-by: Alan Previn Reviewed-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_devcoredump.c | 3 - drivers/gpu/drm/xe/xe_devcoredump_types.h | 6 - drivers/gpu/drm/xe/xe_guc_capture.c | 154 ++++++++++------------ drivers/gpu/drm/xe/xe_guc_capture.h | 9 +- drivers/gpu/drm/xe/xe_guc_submit.c | 12 +- drivers/gpu/drm/xe/xe_hw_engine.c | 34 +++-- drivers/gpu/drm/xe/xe_hw_engine_types.h | 8 ++ 7 files changed, 106 insertions(+), 120 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c index 39fe485d2085..006041997550 100644 --- a/drivers/gpu/drm/xe/xe_devcoredump.c +++ b/drivers/gpu/drm/xe/xe_devcoredump.c @@ -149,9 +149,6 @@ static void xe_devcoredump_snapshot_free(struct xe_devcoredump_snapshot *ss) xe_guc_ct_snapshot_free(ss->guc.ct); ss->guc.ct = NULL; - xe_guc_capture_put_matched_nodes(&ss->gt->uc.guc); - ss->matched_node = NULL; - xe_guc_exec_queue_snapshot_free(ss->ge); ss->ge = NULL; diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h index c94ce21043a8..28486ed93314 100644 --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h @@ -53,12 +53,6 @@ struct xe_devcoredump_snapshot { struct xe_hw_engine_snapshot *hwe[XE_NUM_HW_ENGINES]; /** @job: Snapshot of job state */ struct xe_sched_job_snapshot *job; - /** - * @matched_node: The matched capture node for timedout job - * this single-node tracker works because devcoredump will always only - * produce one hw-engine capture per devcoredump event - */ - struct xe_guc_capture_snapshot *matched_node; /** @vm: Snapshot 
of VM state */ struct xe_vm_snapshot *vm; diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c index e04c87739267..1f9d49f5a805 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -1532,35 +1532,21 @@ read_reg_to_node(struct xe_hw_engine *hwe, const struct __guc_mmio_reg_descr_gro } } -/** - * xe_engine_manual_capture - Take a manual engine snapshot from engine. - * @hwe: Xe HW Engine. - * @snapshot: The engine snapshot - * - * Take engine snapshot from engine read. - * - * Returns: None - */ -void -xe_engine_manual_capture(struct xe_hw_engine *hwe, struct xe_hw_engine_snapshot *snapshot) +static struct xe_guc_capture_snapshot * +guc_capture_get_manual_snapshot(struct xe_guc *guc, struct xe_hw_engine *hwe) { - struct xe_gt *gt = hwe->gt; - struct xe_device *xe = gt_to_xe(gt); - struct xe_guc *guc = &gt->uc.guc; - struct xe_devcoredump *devcoredump = &xe->devcoredump; + struct xe_gt *gt = guc_to_gt(guc); enum guc_capture_list_class_type capture_class; const struct __guc_mmio_reg_descr_group *list; struct xe_guc_capture_snapshot *new; enum guc_state_capture_type type; - u16 guc_id = 0; - u32 lrca = 0; - if (IS_SRIOV_VF(xe)) - return; + if (IS_SRIOV_VF(guc_to_xe(guc))) + return NULL; new = guc_capture_get_prealloc_node(guc); if (!new) - return; + return NULL; capture_class = xe_engine_class_to_guc_capture_class(hwe->class); for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) { @@ -1594,26 +1580,64 @@ xe_engine_manual_capture(struct xe_hw_engine *hwe, struct xe_hw_engine_snapshot } } - if (devcoredump && devcoredump->captured) { - struct xe_guc_submit_exec_queue_snapshot *ge = devcoredump->snapshot.ge; + new->eng_class = xe_engine_class_to_guc_class(hwe->class); + new->eng_inst = hwe->instance; - if (ge) { - guc_id = ge->guc.id; - if (ge->lrc[0]) - lrca = ge->lrc[0]->context_desc; - } + return new; +} + +/** + * xe_guc_capture_snapshot_store_manual_job - Generate 
and store a manual engine register dump + * @guc: Target GuC for manual capture + * @q: Associated xe_exec_queue to simulate a manual capture on its behalf. + * + * Generate a manual GuC-Error-Capture snapshot of engine instance + engine class registers + * for the engine of the given exec queue. Stores this node in internal outlist for future + * retrieval with the ability to match up against the same queue. + * + * Returns: None + */ +void +xe_guc_capture_snapshot_store_manual_job(struct xe_guc *guc, struct xe_exec_queue *q) +{ + struct xe_guc_capture_snapshot *new; + struct xe_gt *gt = guc_to_gt(guc); + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + /* we don't support GuC-Error-Capture, including manual captures on VFs */ + if (IS_SRIOV_VF(guc_to_xe(guc))) + return; + + if (!q) { + xe_gt_dbg(gt, "Manual GuC Error capture requested with invalid job\n"); + return; } - new->eng_class = xe_engine_class_to_guc_class(hwe->class); - new->eng_inst = hwe->instance; - new->guc_id = guc_id; - new->lrca = lrca; + /* Find hwe for the queue */ + for_each_hw_engine(hwe, gt, id) { + if (hwe != q->hwe) + continue; + break; + } + if (hwe != q->hwe) { + xe_gt_dbg(gt, "Manual GuC Error capture failed to find matching engine\n"); + return; + } + + new = guc_capture_get_manual_snapshot(guc, hwe); + if (!new) + return; + + new->guc_id = q->guc->id; + new->lrca = xe_lrc_ggtt_addr(q->lrc[0]); new->is_partial = 0; new->locked = 1; new->source = XE_ENGINE_CAPTURE_SOURCE_MANUAL; guc_capture_add_node_to_outlist(guc->capture, new); - devcoredump->snapshot.matched_node = new; + + return; } static struct guc_mmio_reg * @@ -1638,20 +1662,18 @@ snapshot_print_by_list_order(struct xe_hw_engine_snapshot *snapshot, struct drm_ u32 type, const struct __guc_mmio_reg_descr_group *list) { struct xe_gt *gt = snapshot->hwe->gt; - struct xe_device *xe = gt_to_xe(gt); struct xe_guc *guc = &gt->uc.guc; - struct xe_devcoredump *devcoredump = &xe->devcoredump; - struct xe_devcoredump_snapshot 
*devcore_snapshot = &devcoredump->snapshot; struct gcap_reg_list_info *reginfo = NULL; u32 i, last_value = 0; bool is_ext, low32_ready = false; if (!list || !list->list || list->num_regs == 0) return; - XE_WARN_ON(!devcore_snapshot->matched_node); + + XE_WARN_ON(!snapshot->matched_node); is_ext = list == guc->capture->extlists; - reginfo = &devcore_snapshot->matched_node->reginfo[type]; + reginfo = &snapshot->matched_node->reginfo[type]; /* * loop through descriptor first and find the register in the node @@ -1756,21 +1778,14 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm int type; const struct __guc_mmio_reg_descr_group *list; enum guc_capture_list_class_type capture_class; - struct xe_gt *gt; - struct xe_device *xe; - struct xe_devcoredump *devcoredump; - struct xe_devcoredump_snapshot *devcore_snapshot; if (!snapshot) return; gt = snapshot->hwe->gt; - xe = gt_to_xe(gt); - devcoredump = &xe->devcoredump; - devcore_snapshot = &devcoredump->snapshot; - if (!devcore_snapshot->matched_node) + if (!snapshot->matched_node) return; xe_gt_assert(gt, snapshot->hwe); @@ -1781,9 +1796,9 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm snapshot->name ? snapshot->name : "", snapshot->logical_instance); drm_printf(p, "\tCapture_source: %s\n", - devcore_snapshot->matched_node->source == XE_ENGINE_CAPTURE_SOURCE_GUC ? + snapshot->matched_node->source == XE_ENGINE_CAPTURE_SOURCE_GUC ? "GuC" : "Manual"); - drm_printf(p, "\tCoverage: %s\n", grptype[devcore_snapshot->matched_node->is_partial]); + drm_printf(p, "\tCoverage: %s\n", grptype[snapshot->matched_node->is_partial]); drm_printf(p, "\tForcewake: domain 0x%x, ref %d\n", snapshot->forcewake.domain, snapshot->forcewake.ref); drm_printf(p, "\tReserved: %s\n", @@ -1809,6 +1824,7 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm /** * xe_guc_capture_get_matching_and_lock - Matching GuC capture for the queue. 
* @q: The exec queue object + * @srctype: if the capture-node being searched was manual or from guc * * Search within the capture outlist for the queue, could be used for check if * GuC capture is ready for the queue. @@ -1817,13 +1833,13 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm * Returns: found guc-capture node ptr else NULL */ struct xe_guc_capture_snapshot * -xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q) +xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, + enum xe_guc_capture_snapshot_source srctype) { struct xe_hw_engine *hwe; enum xe_hw_engine_id id; struct xe_device *xe; u16 guc_class = GUC_LAST_ENGINE_CLASS + 1; - struct xe_devcoredump_snapshot *ss; if (!q || !q->gt) return NULL; @@ -1832,10 +1848,6 @@ xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q) if (xe->wedged.mode >= 2 || !xe_device_uc_enabled(xe) || IS_SRIOV_VF(xe)) return NULL; - ss = &xe->devcoredump.snapshot; - if (ss->matched_node && ss->matched_node->source == XE_ENGINE_CAPTURE_SOURCE_GUC) - return ss->matched_node; - /* Find hwe for the queue */ for_each_hw_engine(hwe, q->gt, id) { if (hwe != q->hwe) @@ -1858,7 +1870,7 @@ xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q) list_for_each_entry_safe(n, ntmp, &guc->capture->outlist, link) { if (n->eng_class == guc_class && n->eng_inst == hwe->instance && n->guc_id == guc_id && n->lrca == lrca && - n->source == XE_ENGINE_CAPTURE_SOURCE_GUC) { + n->source == srctype) { n->locked = 1; return n; } @@ -1893,51 +1905,23 @@ xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q) coredump->snapshot.hwe[id] = NULL; continue; } - - if (!coredump->snapshot.hwe[id]) { - coredump->snapshot.hwe[id] = - xe_hw_engine_snapshot_capture(hwe, q); - } else { - struct xe_guc_capture_snapshot *new; - - new = xe_guc_capture_get_matching_and_lock(q); - if (new) { - struct xe_guc *guc = &q->gt->uc.guc; - - /* - * If we are in here, it means we found a fresh - * 
GuC-err-capture node for this engine after - * previously failing to find a match in the - * early part of guc_exec_queue_timedout_job. - * Thus we must free the manually captured node - */ - guc_capture_free_outlist_node(guc->capture, - coredump->snapshot.matched_node); - coredump->snapshot.matched_node = new; - } - } - - break; + coredump->snapshot.hwe[id] = xe_hw_engine_snapshot_capture(hwe, q); } } /* * xe_guc_capture_put_matched_nodes - Cleanup matched nodes * @guc: The GuC object + * @n: the capture node we want to free (along with stale reports from GuC) * * Free matched node and all nodes with the equal guc_id from * GuC captured outlist */ -void xe_guc_capture_put_matched_nodes(struct xe_guc *guc) +void xe_guc_capture_put_matched_nodes(struct xe_guc *guc, struct xe_guc_capture_snapshot *n) { - struct xe_device *xe = guc_to_xe(guc); - struct xe_devcoredump *devcoredump = &xe->devcoredump; - struct xe_guc_capture_snapshot *n = devcoredump->snapshot.matched_node; - if (n) { guc_capture_remove_stale_matches_from_list(guc->capture, n); guc_capture_free_outlist_node(guc->capture, n); - devcoredump->snapshot.matched_node = NULL; } } diff --git a/drivers/gpu/drm/xe/xe_guc_capture.h b/drivers/gpu/drm/xe/xe_guc_capture.h index 046989fba3b1..8ac893c92f19 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.h +++ b/drivers/gpu/drm/xe/xe_guc_capture.h @@ -9,6 +9,7 @@ #include #include "abi/guc_capture_abi.h" #include "xe_guc.h" +#include "xe_guc_capture_snapshot_types.h" #include "xe_guc_fwif.h" struct xe_exec_queue; @@ -50,12 +51,14 @@ size_t xe_guc_capture_ads_input_worst_size(struct xe_guc *guc); const struct __guc_mmio_reg_descr_group * xe_guc_capture_get_reg_desc_list(struct xe_gt *gt, u32 owner, u32 type, enum guc_capture_list_class_type capture_class, bool is_ext); -struct xe_guc_capture_snapshot *xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q); -void xe_engine_manual_capture(struct xe_hw_engine *hwe, struct xe_hw_engine_snapshot *snapshot); +struct 
xe_guc_capture_snapshot * +xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, + enum xe_guc_capture_snapshot_source srctype); +void xe_guc_capture_snapshot_store_manual_job(struct xe_guc *guc, struct xe_exec_queue *q); void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p); void xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q); void xe_guc_capture_steered_list_init(struct xe_guc *guc); -void xe_guc_capture_put_matched_nodes(struct xe_guc *guc); +void xe_guc_capture_put_matched_nodes(struct xe_guc *guc, struct xe_guc_capture_snapshot *n); int xe_guc_capture_init(struct xe_guc *guc); #endif diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 913c74d6e2ae..6e33081dd7b8 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -28,6 +28,7 @@ #include "xe_gt_printk.h" #include "xe_guc.h" #include "xe_guc_capture.h" +#include "xe_guc_capture_snapshot_types.h" #include "xe_guc_ct.h" #include "xe_guc_exec_queue_types.h" #include "xe_guc_id_mgr.h" @@ -1070,14 +1071,17 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job) * do manual capture first and decide later if we need to use it */ if (!exec_queue_killed(q) && !xe->devcoredump.captured && - !xe_guc_capture_get_matching_and_lock(q)) { + !xe_guc_capture_get_matching_and_lock(q, XE_ENGINE_CAPTURE_SOURCE_GUC)) { /* take force wake before engine register manual capture */ fw_ref = xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL); if (!xe_force_wake_ref_has_domain(fw_ref, XE_FORCEWAKE_ALL)) xe_gt_info(q->gt, "failed to get forcewake for coredump capture\n"); - - xe_engine_snapshot_capture_for_queue(q); - + /* + * Generate a manual capture. 
Below function will store it + * in GuC Error Capture's internal link-list as if it came from GuC + * but with a source-type == XE_ENGINE_CAPTURE_SOURCE_MANUAL + */ + xe_guc_capture_snapshot_store_manual_job(guc, q); xe_force_wake_put(gt_to_fw(q->gt), fw_ref); } diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index a99e3160724b..02871d319471 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -25,6 +25,7 @@ #include "xe_gt_mcr.h" #include "xe_gt_topology.h" #include "xe_guc_capture.h" +#include "xe_guc_capture_snapshot_types.h" #include "xe_hw_engine_group.h" #include "xe_hw_fence.h" #include "xe_irq.h" @@ -867,22 +868,22 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) return snapshot; if (q) { - /* If got guc capture, set source to GuC */ - node = xe_guc_capture_get_matching_and_lock(q); - if (node) { - struct xe_device *xe = gt_to_xe(hwe->gt); - struct xe_devcoredump *coredump = &xe->devcoredump; - - coredump->snapshot.matched_node = node; - xe_gt_dbg(hwe->gt, "Found and locked GuC-err-capture node"); - return snapshot; + /* First, retrieve the manual GuC-Error-Capture node if it exists */ + node = xe_guc_capture_get_matching_and_lock(q, XE_ENGINE_CAPTURE_SOURCE_MANUAL); + /* Find preferred node type sourced from firmware if available */ + snapshot->matched_node = xe_guc_capture_get_matching_and_lock(q, XE_ENGINE_CAPTURE_SOURCE_GUC); + if (!snapshot->matched_node) { + xe_gt_dbg(hwe->gt, "No fw sourced GuC-Err-Capture for queue %s", q->name); + snapshot->matched_node = node; + } else if (node) { + xe_gt_dbg(hwe->gt, "Found manual GuC-Err-Capture for queue %s", q->name); + xe_guc_capture_put_matched_nodes(&hwe->gt->uc.guc, node); } + if (!snapshot->matched_node) + xe_gt_dbg(hwe->gt, "Can't retrieve any GuC-Err-Capture node for queue %s", + q->name); } - /* otherwise, do manual capture */ - xe_engine_manual_capture(hwe, snapshot); - xe_gt_dbg(hwe->gt, 
"Proceeding with manual engine snapshot"); - return snapshot; } @@ -900,12 +901,7 @@ void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot) return; gt = snapshot->hwe->gt; - /* - * xe_guc_capture_put_matched_nodes is called here and from - * xe_devcoredump_snapshot_free, to cover the 2 calling paths - * of hw_engines - debugfs and devcoredump free. - */ - xe_guc_capture_put_matched_nodes(&gt->uc.guc); + xe_guc_capture_put_matched_nodes(&gt->uc.guc, snapshot->matched_node); kfree(snapshot->name); kfree(snapshot); diff --git a/drivers/gpu/drm/xe/xe_hw_engine_types.h b/drivers/gpu/drm/xe/xe_hw_engine_types.h index de69e2628f2f..de1f82c11bcf 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_types.h +++ b/drivers/gpu/drm/xe/xe_hw_engine_types.h @@ -152,6 +152,7 @@ struct xe_hw_engine { struct xe_hw_engine_group *hw_engine_group; }; +struct xe_guc_capture_snapshot; /** * struct xe_hw_engine_snapshot - Hardware engine snapshot * @@ -175,6 +176,13 @@ struct xe_hw_engine_snapshot { u32 mmio_base; /** @kernel_reserved: Engine reserved, can't be used by userspace */ bool kernel_reserved; + /** + * @matched_node: GuC Capture snapshot: + * The matched capture node for the timedout job + * this single-node tracker works because devcoredump will always only + * produce one hw-engine capture per devcoredump event + */ + struct xe_guc_capture_snapshot *matched_node; }; #endif From patchwork Thu Feb 13 19:51:36 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Previn X-Patchwork-Id: 13973981 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0EE55C021A6 for ; Thu, 13 Feb 2025 19:51:52 +0000 (UTC) Received: 
from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5058310EB96; Thu, 13 Feb 2025 19:51:45 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="S24iJQD/"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by gabe.freedesktop.org (Postfix) with ESMTPS id 59A4B10EB8A; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739476301; x=1771012301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kh5uGgnHg7tTJNEMZkY2R9WbjLBFy6F1i/R/BcahmOo=; b=S24iJQD/+JHKtiLyyUavndpEZSdhlVs22aEDIV72HZ1sAkMkn5haAtDD HReXeUdIchOVoHdce44t+exI1XTfxFuxfF95Lew0qhbVn42XKd7xNjeZt IW3+c4JiH4qEn9ZT2hID48C+HtVA+fOLEB5wxwkGEmy9mJpiKfGmAxvan T5vK0X76JefgYgfXIpCZ+x4+bUoGxewDUZpHskLcCaYYe8VlllY7yrSiD L787mEl/RnwUMxlpLkNulqd8YbKV5dRP5KPzzeJ7Q1YC4Ko6fCQQsyncg EkDUzSfPKwuV5s/dPpDFApzUGf6hs/mRAsTBfdHMkrsL7hgPkd3Vc7chV A==; X-CSE-ConnectionGUID: MFak7ZqVS5qdQMcc8coPEA== X-CSE-MsgGUID: GEmXxN9aTP+UjkY/vGq7ZA== X-IronPort-AV: E=McAfee;i="6700,10204,11344"; a="40354748" X-IronPort-AV: E=Sophos;i="6.13,282,1732608000"; d="scan'208";a="40354748" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2025 11:51:40 -0800 X-CSE-ConnectionGUID: TNTpwYnUQYuxNQG30uEFhg== X-CSE-MsgGUID: xW8ah/SJTke4tyalGP39VQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="117372217" Received: from aalteres-desk1.fm.intel.com ([10.1.39.140]) by fmviesa003.fm.intel.com with ESMTP; 13 Feb 2025 11:51:40 -0800 From: Alan Previn To: intel-xe@lists.freedesktop.org Cc: Alan Previn , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , John Harrison , Matthew Brost , Zhanjun Dong , Rodrigo Vivi Subject: [PATCH v8 
3/6] drm/xe/guc: Split engine state print between xe_hw_engine vs xe_guc_capture Date: Thu, 13 Feb 2025 11:51:36 -0800 Message-Id: <20250213195139.3396082-4-alan.previn.teres.alexis@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> References: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Relocate the xe_engine_snapshot_print function from xe_guc_capture.c into xe_hw_engine.c, but split the GuC-Err-Capture register printing portion out into a separate helper inside xe_guc_capture.c so that there is a clear separation between printing the general engine info and printing the GuC-Err-Capture node's register list. v7: - Fix function name to respect "xe_hw_engine" namespace. 
(Rodrigo) - Remove additional newline in engine dump (Jose Souza) + ensure changes didn't break mesa's aubinator tool (Rodrigo) Signed-off-by: Alan Previn Reviewed-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_devcoredump.c | 2 +- drivers/gpu/drm/xe/xe_guc_capture.c | 79 +++++++++++++---------------- drivers/gpu/drm/xe/xe_guc_capture.h | 4 +- drivers/gpu/drm/xe/xe_hw_engine.c | 29 ++++++++++- drivers/gpu/drm/xe/xe_hw_engine.h | 1 + 5 files changed, 67 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c index 006041997550..7a4610d2ea4f 100644 --- a/drivers/gpu/drm/xe/xe_devcoredump.c +++ b/drivers/gpu/drm/xe/xe_devcoredump.c @@ -128,7 +128,7 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count, drm_puts(&p, "\n**** HW Engines ****\n"); for (i = 0; i < XE_NUM_HW_ENGINES; i++) if (ss->hwe[i]) - xe_engine_snapshot_print(ss->hwe[i], &p); + xe_hw_engine_snapshot_print(ss->hwe[i], &p); drm_puts(&p, "\n**** VM state ****\n"); xe_vm_snapshot_print(ss->vm, &p); diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c index 1f9d49f5a805..ac3134da3f19 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -917,9 +917,10 @@ guc_capture_init_node(struct xe_guc *guc, struct xe_guc_capture_snapshot *node) * -------------------- * --> xe_devcoredump_read-> * L--> xxx_snapshot_print - * L--> xe_engine_snapshot_print - * Print register lists values saved at - * guc->capture->outlist + * L--> xe_hw_engine_print --> xe_hw_engine_snapshot_print + * L--> xe_guc_capture_snapshot_print + * Print register lists values saved in matching + * node from guc->capture->outlist * */ @@ -1658,22 +1659,16 @@ guc_capture_find_reg(struct gcap_reg_list_info *reginfo, u32 addr, u32 flags) } static void -snapshot_print_by_list_order(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p, - u32 type, const struct __guc_mmio_reg_descr_group *list) 
+print_noderegs_by_list_order(struct xe_guc *guc, struct gcap_reg_list_info *reginfo, + const struct __guc_mmio_reg_descr_group *list, struct drm_printer *p) { - struct xe_gt *gt = snapshot->hwe->gt; - struct xe_guc *guc = &gt->uc.guc; - struct gcap_reg_list_info *reginfo = NULL; - u32 i, last_value = 0; + u32 last_value, i; bool is_ext, low32_ready = false; if (!list || !list->list || list->num_regs == 0) return; - XE_WARN_ON(!snapshot->matched_node); - is_ext = list == guc->capture->extlists; - reginfo = &snapshot->matched_node->reginfo[type]; /* * loop through descriptor first and find the register in the node @@ -1743,8 +1738,8 @@ snapshot_print_by_list_order(struct xe_hw_engine_snapshot *snapshot, struct drm_ group = FIELD_GET(GUC_REGSET_STEERING_GROUP, reg_desc->flags); instance = FIELD_GET(GUC_REGSET_STEERING_INSTANCE, reg_desc->flags); - dss = xe_gt_mcr_steering_info_to_dss_id(gt, group, instance); - + dss = xe_gt_mcr_steering_info_to_dss_id(guc_to_gt(guc), group, + instance); drm_printf(p, "\t%s[%u]: 0x%08x\n", reg_desc->regname, dss, value); } else { drm_printf(p, "\t%s: 0x%08x\n", reg_desc->regname, value); @@ -1763,13 +1758,18 @@ snapshot_print_by_list_order(struct xe_hw_engine_snapshot *snapshot, struct drm_ } /** - * xe_engine_snapshot_print - Print out a given Xe HW Engine snapshot. - * @snapshot: Xe HW Engine snapshot object. + * xe_guc_capture_snapshot_print - Print out the contents of a provided GuC-Err-Capture node + * @guc: Target GuC for operation. + * @node: GuC Error Capture register dump node. * @p: drm_printer where it will be printed out. * - * This function prints out a given Xe HW Engine snapshot object. + * This function prints out a register dump of a GuC-Err-Capture node that was retrieved + * earlier either by GuC-FW reporting or by manual capture depending on how the + * caller (typically xe_hw_engine_snapshot) was invoked and used. 
*/ -void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p) + +void xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_snapshot *node, + struct drm_printer *p) { const char *grptype[GUC_STATE_CAPTURE_GROUP_TYPE_MAX] = { "full-capture", @@ -1777,45 +1777,36 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm }; int type; const struct __guc_mmio_reg_descr_group *list; - enum guc_capture_list_class_type capture_class; struct xe_gt *gt; - if (!snapshot) + if (!guc) return; - - gt = snapshot->hwe->gt; - - if (!snapshot->matched_node) + gt = guc_to_gt(guc); + if (!node) { + xe_gt_warn(gt, "GuC Capture printing without node!\n"); return; + } + if (!p) { + xe_gt_warn(gt, "GuC Capture printing without printer!\n"); + return; + } - xe_gt_assert(gt, snapshot->hwe); - - capture_class = xe_engine_class_to_guc_capture_class(snapshot->hwe->class); - - drm_printf(p, "%s (physical), logical instance=%d\n", - snapshot->name ? snapshot->name : "", - snapshot->logical_instance); drm_printf(p, "\tCapture_source: %s\n", - snapshot->matched_node->source == XE_ENGINE_CAPTURE_SOURCE_GUC ? + node->source == XE_ENGINE_CAPTURE_SOURCE_GUC ? 
"GuC" : "Manual"); - drm_printf(p, "\tCoverage: %s\n", grptype[snapshot->matched_node->is_partial]); - drm_printf(p, "\tForcewake: domain 0x%x, ref %d\n", - snapshot->forcewake.domain, snapshot->forcewake.ref); - drm_printf(p, "\tReserved: %s\n", - str_yes_no(snapshot->kernel_reserved)); + drm_printf(p, "\tCoverage: %s\n", grptype[node->is_partial]); for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) { list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF, type, - capture_class, false); - snapshot_print_by_list_order(snapshot, p, type, list); + node->eng_class, false); + print_noderegs_by_list_order(guc, &node->reginfo[type], list, p); } - if (capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) { + if (node->eng_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) { + type = GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS; list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF, - GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, - capture_class, true); - snapshot_print_by_list_order(snapshot, p, GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, - list); + type, node->eng_class, true); + print_noderegs_by_list_order(guc, &node->reginfo[type], list, p); } drm_puts(p, "\n"); diff --git a/drivers/gpu/drm/xe/xe_guc_capture.h b/drivers/gpu/drm/xe/xe_guc_capture.h index 8ac893c92f19..e67589ab4342 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.h +++ b/drivers/gpu/drm/xe/xe_guc_capture.h @@ -15,7 +15,6 @@ struct xe_exec_queue; struct xe_guc; struct xe_hw_engine; -struct xe_hw_engine_snapshot; static inline enum guc_capture_list_class_type xe_guc_class_to_capture_class(u16 class) { @@ -55,7 +54,8 @@ struct xe_guc_capture_snapshot * xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, enum xe_guc_capture_snapshot_source srctype); void xe_guc_capture_snapshot_store_manual_job(struct xe_guc *guc, struct xe_exec_queue *q); -void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p); +void 
xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_snapshot *node, + struct drm_printer *p); void xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q); void xe_guc_capture_steered_list_init(struct xe_guc *guc); void xe_guc_capture_put_matched_nodes(struct xe_guc *guc, struct xe_guc_capture_snapshot *n); diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index 02871d319471..c980a5c84a8b 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -907,6 +907,33 @@ void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot) kfree(snapshot); } +/** + * xe_hw_engine_snapshot_print - Print out a given Xe HW Engine snapshot. + * @snapshot: Xe HW Engine snapshot object. + * @p: drm_printer where it will be printed out. + * + * This function prints out a given Xe HW Engine snapshot object. + */ +void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p) +{ + struct xe_gt *gt; + + if (!snapshot) + return; + + gt = snapshot->hwe->gt; + + drm_printf(p, "%s (physical), logical instance=%d\n", + snapshot->name ? snapshot->name : "", + snapshot->logical_instance); + drm_printf(p, "\tForcewake: domain 0x%x, ref %d\n", + snapshot->forcewake.domain, snapshot->forcewake.ref); + drm_printf(p, "\tReserved: %s\n", + str_yes_no(snapshot->kernel_reserved)); + + xe_guc_capture_snapshot_print(&gt->uc.guc, snapshot->matched_node, p); +} + /** * xe_hw_engine_print - Xe HW Engine Print. * @hwe: Hardware Engine. 
@@ -919,7 +946,7 @@ void xe_hw_engine_print(struct xe_hw_engine *hwe, struct drm_printer *p) struct xe_hw_engine_snapshot *snapshot; snapshot = xe_hw_engine_snapshot_capture(hwe, NULL); - xe_engine_snapshot_print(snapshot, p); + xe_hw_engine_snapshot_print(snapshot, p); xe_hw_engine_snapshot_free(snapshot); } diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 6b5f9fa2a594..069b32aa7423 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -58,6 +58,7 @@ u32 xe_hw_engine_mask_per_class(struct xe_gt *gt, struct xe_hw_engine_snapshot * xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q); void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot); +void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p); void xe_hw_engine_print(struct xe_hw_engine *hwe, struct drm_printer *p); void xe_hw_engine_setup_default_lrc_state(struct xe_hw_engine *hwe); From patchwork Thu Feb 13 19:51:37 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Previn X-Patchwork-Id: 13973980 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 476FEC021A8 for ; Thu, 13 Feb 2025 19:51:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5460310EB95; Thu, 13 Feb 2025 19:51:43 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="mxUw58c+"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by 
gabe.freedesktop.org (Postfix) with ESMTPS id 618B810EB88; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739476301; x=1771012301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+NO0b0muPhplHkRCwarGBSbooNg18Pk8HMsQIZuiP1c=; b=mxUw58c+/ufDJyqP/Q0KXIv7SUCFe+xLkgGyhnpn1A4durHNRY0dsTQv lzOkj78Jfp/ZHbWdPtJe6eb82uCGCUkTndXqcsSB+gZ2mWls8QESRZX23 LxNKeugSnaCHY75HBeuiUrdmiMlb3RpH1x/njU2jRmvRDjuthgix/UDtQ B95IkSQAnGMfbRRVhVLqqgGGOVjD/uhvsrxsD5VRLOeANdQCwIpuKSsP1 yi79jq9HQHH3af5/BvJaGtYKn5WzGLDyE0m/4znHW56b8L1JJ2hxatJxp TVIjx2QSGt1dtQ8ex9FrIOLqUEd/FLQqbJALM7LJAJKiAX4XGPaQica4L A==; X-CSE-ConnectionGUID: FS1Mm50xR/qSshac/eGGdQ== X-CSE-MsgGUID: 9XEDGbdLRhG1dR58pdYhiQ== X-IronPort-AV: E=McAfee;i="6700,10204,11344"; a="40354749" X-IronPort-AV: E=Sophos;i="6.13,282,1732608000"; d="scan'208";a="40354749" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2025 11:51:40 -0800 X-CSE-ConnectionGUID: zYjI3UZkSb+J0r4+GtMGzQ== X-CSE-MsgGUID: N806ZgfQTxS3p8laBX72aw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="117372220" Received: from aalteres-desk1.fm.intel.com ([10.1.39.140]) by fmviesa003.fm.intel.com with ESMTP; 13 Feb 2025 11:51:40 -0800 From: Alan Previn To: intel-xe@lists.freedesktop.org Cc: Alan Previn , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , John Harrison , Matthew Brost , Zhanjun Dong , Rodrigo Vivi Subject: [PATCH v8 4/6] drm/xe/guc: Move xe_hw_engine_snapshot creation back to xe_hw_engine.c Date: Thu, 13 Feb 2025 11:51:37 -0800 Message-Id: <20250213195139.3396082-5-alan.previn.teres.alexis@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> References: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> MIME-Version: 
1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" xe_devcoredump calls xe_engine_snapshot_capture_for_queue() to allocate and populate the xe_hw_engine_snapshot structure. Move that function back into xe_hw_engine.c since it doesn't make sense for GuC-Err-Capture to allocate a structure it doesn't own. v7: Rename function to respect "xe_hw_engine" namespace (Rodrigo) Signed-off-by: Alan Previn Reviewed-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_devcoredump.c | 2 +- drivers/gpu/drm/xe/xe_guc_capture.c | 30 ----------------------- drivers/gpu/drm/xe/xe_guc_capture.h | 1 - drivers/gpu/drm/xe/xe_hw_engine.c | 38 ++++++++++++++++++++++++++--- drivers/gpu/drm/xe/xe_hw_engine.h | 3 +-- 5 files changed, 36 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c index 7a4610d2ea4f..6cbb4fce8ef2 100644 --- a/drivers/gpu/drm/xe/xe_devcoredump.c +++ b/drivers/gpu/drm/xe/xe_devcoredump.c @@ -311,7 +311,7 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump, ss->job = xe_sched_job_snapshot_capture(job); ss->vm = xe_vm_snapshot_capture(q->vm); - xe_engine_snapshot_capture_for_queue(q); + xe_hw_engine_snapshot_capture_for_queue(q); queue_work(system_unbound_wq, &ss->work); diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c index ac3134da3f19..c57b13afcfd9 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -1870,36 +1870,6 @@ xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, return NULL; } -/** - * xe_engine_snapshot_capture_for_queue - Take snapshot of associated engine - * @q: The exec queue object - * - * Take snapshot of associated HW Engine - * - * Returns: None. 
- */ -void -xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q) -{ - struct xe_device *xe = gt_to_xe(q->gt); - struct xe_devcoredump *coredump = &xe->devcoredump; - struct xe_hw_engine *hwe; - enum xe_hw_engine_id id; - u32 adj_logical_mask = q->logical_mask; - - if (IS_SRIOV_VF(xe)) - return; - - for_each_hw_engine(hwe, q->gt, id) { - if (hwe->class != q->hwe->class || - !(BIT(hwe->logical_instance) & adj_logical_mask)) { - coredump->snapshot.hwe[id] = NULL; - continue; - } - coredump->snapshot.hwe[id] = xe_hw_engine_snapshot_capture(hwe, q); - } -} - /* * xe_guc_capture_put_matched_nodes - Cleanup matched nodes * @guc: The GuC object diff --git a/drivers/gpu/drm/xe/xe_guc_capture.h b/drivers/gpu/drm/xe/xe_guc_capture.h index e67589ab4342..77ee35a3f205 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.h +++ b/drivers/gpu/drm/xe/xe_guc_capture.h @@ -56,7 +56,6 @@ xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, void xe_guc_capture_snapshot_store_manual_job(struct xe_guc *guc, struct xe_exec_queue *q); void xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_snapshot *node, struct drm_printer *p); -void xe_engine_snapshot_capture_for_queue(struct xe_exec_queue *q); void xe_guc_capture_steered_list_init(struct xe_guc *guc); void xe_guc_capture_put_matched_nodes(struct xe_guc *guc, struct xe_guc_capture_snapshot *n); int xe_guc_capture_init(struct xe_guc *guc); diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index c980a5c84a8b..fef01d2086a8 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -830,7 +830,7 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec) } /** - * xe_hw_engine_snapshot_capture - Take a quick snapshot of the HW Engine. + * hw_engine_snapshot_capture - Take a quick snapshot of the HW Engine. * @hwe: Xe HW Engine. * @q: The exec queue object. 
* @@ -840,8 +840,8 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec) * Returns: a Xe HW Engine snapshot object that must be freed by the * caller, using `xe_hw_engine_snapshot_free`. */ -struct xe_hw_engine_snapshot * -xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) +static struct xe_hw_engine_snapshot * +hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) { struct xe_hw_engine_snapshot *snapshot; struct xe_guc_capture_snapshot *node; @@ -887,6 +887,36 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) return snapshot; } +/** + * xe_hw_engine_snapshot_capture_for_queue - Take snapshot of associated engine + * @q: The exec queue object + * + * Take snapshot of associated HW Engine + * + * Returns: None. + */ +void +xe_hw_engine_snapshot_capture_for_queue(struct xe_exec_queue *q) +{ + struct xe_device *xe = gt_to_xe(q->gt); + struct xe_devcoredump *coredump = &xe->devcoredump; + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + u32 adj_logical_mask = q->logical_mask; + + if (IS_SRIOV_VF(xe)) + return; + + for_each_hw_engine(hwe, q->gt, id) { + if (hwe->class != q->hwe->class || + !(BIT(hwe->logical_instance) & adj_logical_mask)) { + coredump->snapshot.hwe[id] = NULL; + continue; + } + coredump->snapshot.hwe[id] = hw_engine_snapshot_capture(hwe, q); + } +} + /** * xe_hw_engine_snapshot_free - Free all allocated objects for a given snapshot. * @snapshot: Xe HW Engine snapshot object. 
@@ -945,7 +975,7 @@ void xe_hw_engine_print(struct xe_hw_engine *hwe, struct drm_printer *p) { struct xe_hw_engine_snapshot *snapshot; - snapshot = xe_hw_engine_snapshot_capture(hwe, NULL); + snapshot = hw_engine_snapshot_capture(hwe, NULL); xe_hw_engine_snapshot_print(snapshot, p); xe_hw_engine_snapshot_free(snapshot); } diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 069b32aa7423..74f6ea0c8d3e 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -55,8 +55,7 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec); void xe_hw_engine_enable_ring(struct xe_hw_engine *hwe); u32 xe_hw_engine_mask_per_class(struct xe_gt *gt, enum xe_engine_class engine_class); -struct xe_hw_engine_snapshot * -xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q); +void xe_hw_engine_snapshot_capture_for_queue(struct xe_exec_queue *q); void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot); void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p); void xe_hw_engine_print(struct xe_hw_engine *hwe, struct drm_printer *p); From patchwork Thu Feb 13 19:51:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Previn X-Patchwork-Id: 13973976 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 013E4C021A0 for ; Thu, 13 Feb 2025 19:51:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6D08C10EB8D; Thu, 13 Feb 2025 19:51:42 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass 
(2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="UhXJPiHG"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by gabe.freedesktop.org (Postfix) with ESMTPS id 86EDF10EB8B; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739476301; x=1771012301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NaNHoDYf+et9DXLM7xLDnK5Agqg5rqH2SPnPKVPC1qo=; b=UhXJPiHGL9W4kdrIyeST+xeKtMwMcAI6naG0eWqcOSEfzIPGiuXvG8vI ZvPS6Es/OIw7o9+zDNEERfQyjKEaC4GJBIL4Cu7AakGWrKXPefNQkWZTY mFIqncNqbbj87t7dw3eY60hRCAmH8LlAev6IiLQAZv1ekdGDUuVe9uiiS v6+k7EkGatr4mBiO5i5y1SXLsGgnNytEdIQS1kDsgkc/OEKz3VHdMMR3h G+BoMvpraKMfLdSGetjyixeROuZUfM18eCl3vu3DgwIeIWEx+r49Dpj5i hoVICTs9l96hr6JXsxseKe3QEaXsjlZXfR2r/38Mzv9VgY8upVvQfy6K/ g==; X-CSE-ConnectionGUID: ym33JccOQSi64Yfg/1ex7A== X-CSE-MsgGUID: HmJgfnaWRQG5gKEuMvZyWQ== X-IronPort-AV: E=McAfee;i="6700,10204,11344"; a="40354750" X-IronPort-AV: E=Sophos;i="6.13,282,1732608000"; d="scan'208";a="40354750" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2025 11:51:41 -0800 X-CSE-ConnectionGUID: LCDlmrxuQ96sUgLN/uNKXw== X-CSE-MsgGUID: k5fTFdJRS4uK4t5yop9YLQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="117372224" Received: from aalteres-desk1.fm.intel.com ([10.1.39.140]) by fmviesa003.fm.intel.com with ESMTP; 13 Feb 2025 11:51:40 -0800 From: Alan Previn To: intel-xe@lists.freedesktop.org Cc: Alan Previn , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , John Harrison , Matthew Brost , Zhanjun Dong , Rodrigo Vivi Subject: [PATCH v8 5/6] drm/xe/xe_hw_engine: Update xe_hw_engine capture for debugfs/gt_reset Date: Thu, 13 Feb 2025 11:51:38 -0800 Message-Id: <20250213195139.3396082-6-alan.previn.teres.alexis@intel.com> X-Mailer: 
git-send-email 2.34.1 In-Reply-To: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> References: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" The xe_hw_engine_print function is called indirectly by debugfs or gt-resets to do an immediate raw dump of the engine registers. That function relies on 'hw_engine_snapshot_capture', which in turn assumes that a prior capture (GuC or manual) with a matching job is ready for printing. However, in the debugfs and gt-reset cases there is no prior job, so ensure hw_engine_snapshot_capture can also invoke GuC-Err-Capture for an immediate jobless snapshot. Additionally, because these jobless events have very different use cases and callstack flows, differentiate manual captures that were attached to a job from late, raw jobless ones. v8:- Rename the enum xe_guc_capture_snapshot_source to xe_engine_capture_source to match the defines (Matthew Brost/John Harrison). - Minor patch header comment improvement. (Alan Previn) v7:- Fix mismatch func name vs comment (kernel robot) - Differentiate between early manual captures that have a job association vs raw manual captures that may not have a job association like in gt-reset events. (John Harrison). 
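For illustration only, the selection order described above can be sketched in plain C outside the driver. Every name below (capture_source, capture_node, capture_source_name, pick_node) is a hypothetical stand-in and not a driver symbol. The enum order is assumed from the order of the srctype[] strings this patch prints, and the jobless branch is an assumption drawn from the description above, not a copy of the driver logic.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the order is assumed from the srctype[] strings. */
enum capture_source {
	CAPTURE_SOURCE_MANUAL_JOB,	/* manual capture tied to a timed-out job */
	CAPTURE_SOURCE_MANUAL_RAW,	/* jobless manual capture (debugfs, gt-reset) */
	CAPTURE_SOURCE_GUC,		/* capture reported by GuC firmware */
};

struct capture_node {
	enum capture_source source;
};

/* Map a source to its printable name, guarding against out-of-range values. */
static const char *capture_source_name(enum capture_source src)
{
	static const char * const names[] = {
		[CAPTURE_SOURCE_MANUAL_JOB] = "Manual-Job",
		[CAPTURE_SOURCE_MANUAL_RAW] = "Manual-Raw",
		[CAPTURE_SOURCE_GUC] = "GuC",
	};

	if ((size_t)src >= sizeof(names) / sizeof(names[0]))
		return "Unknown";
	return names[src];
}

/*
 * Pick the node to print. With a queue, prefer the firmware-sourced node
 * and fall back to the manual node stored for the job. Without a queue,
 * only a raw manual capture can exist.
 */
static const struct capture_node *
pick_node(int has_queue, const struct capture_node *guc,
	  const struct capture_node *manual_job,
	  const struct capture_node *manual_raw)
{
	if (!has_queue)
		return manual_raw;
	return guc ? guc : manual_job;
}
```

Calling pick_node(1, NULL, &job_node, NULL) returns the manual job node, which capture_source_name() reports as "Manual-Job". Designated initializers keep the string table aligned with the enum even if the enum is reordered later, and the range check avoids indexing the table with an unexpected source value.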
Signed-off-by: Alan Previn Reviewed-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_guc_capture.c | 39 ++++++++++++++++--- drivers/gpu/drm/xe/xe_guc_capture.h | 4 +- .../drm/xe/xe_guc_capture_snapshot_types.h | 10 +++-- drivers/gpu/drm/xe/xe_guc_submit.c | 2 +- drivers/gpu/drm/xe/xe_hw_engine.c | 17 ++++++-- 5 files changed, 59 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c index c57b13afcfd9..4ab71dfa7a20 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -1587,6 +1587,32 @@ guc_capture_get_manual_snapshot(struct xe_guc *guc, struct xe_hw_engine *hwe) return new; } +/** + * xe_guc_capture_snapshot_manual_hwe - Generate and get manual engine register dump + * @guc: Target GuC for manual capture + * @hwe: The engine instance to capture from + * + * Generate a manual GuC-Error-Capture snapshot of engine instance + engine class registers + * without any queue association. This capture node is not stored in outlist or cachelist. + * Returns: New capture node; the caller must "put" it. + */ +struct xe_guc_capture_snapshot * +xe_guc_capture_snapshot_manual_hwe(struct xe_guc *guc, struct xe_hw_engine *hwe) +{ + struct xe_guc_capture_snapshot *new; + + new = guc_capture_get_manual_snapshot(guc, hwe); + if (!new) + return NULL; + + new->guc_id = 0; + new->lrca = 0; + new->is_partial = 0; + new->source = XE_ENGINE_CAPTURE_SOURCE_MANUAL_RAW; + + return new; +} + /** * xe_guc_capture_snapshot_store_manual_job - Generate and store a manual engine register dump * @guc: Target GuC for manual capture @@ -1634,7 +1660,7 @@ xe_guc_capture_snapshot_store_manual_job(struct xe_guc *guc, struct xe_exec_queu new->lrca = xe_lrc_ggtt_addr(q->lrc[0]); new->is_partial = 0; new->locked = 1; - new->source = XE_ENGINE_CAPTURE_SOURCE_MANUAL; + new->source = XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB; guc_capture_add_node_to_outlist(guc->capture, new); @@ -1775,6 +1801,11 @@ void
xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_sna "full-capture", "partial-capture" }; + const char *srctype[XE_ENGINE_CAPTURE_SOURCE_GUC + 1] = { + "Manual-Job", + "Manual-Raw", + "GuC" + }; int type; const struct __guc_mmio_reg_descr_group *list; struct xe_gt *gt; @@ -1791,9 +1822,7 @@ void xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_sna return; } - drm_printf(p, "\tCapture_source: %s\n", - node->source == XE_ENGINE_CAPTURE_SOURCE_GUC ? - "GuC" : "Manual"); + drm_printf(p, "\tCapture_source: %s\n", srctype[node->source]); drm_printf(p, "\tCoverage: %s\n", grptype[node->is_partial]); for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) { @@ -1825,7 +1854,7 @@ void xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_sna */ struct xe_guc_capture_snapshot * xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, - enum xe_guc_capture_snapshot_source srctype) + enum xe_engine_capture_source srctype) { struct xe_hw_engine *hwe; enum xe_hw_engine_id id; diff --git a/drivers/gpu/drm/xe/xe_guc_capture.h b/drivers/gpu/drm/xe/xe_guc_capture.h index 77ee35a3f205..ef7b9bace4ac 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.h +++ b/drivers/gpu/drm/xe/xe_guc_capture.h @@ -52,8 +52,10 @@ xe_guc_capture_get_reg_desc_list(struct xe_gt *gt, u32 owner, u32 type, enum guc_capture_list_class_type capture_class, bool is_ext); struct xe_guc_capture_snapshot * xe_guc_capture_get_matching_and_lock(struct xe_exec_queue *q, - enum xe_guc_capture_snapshot_source srctype); + enum xe_engine_capture_source srctype); void xe_guc_capture_snapshot_store_manual_job(struct xe_guc *guc, struct xe_exec_queue *q); +struct xe_guc_capture_snapshot * +xe_guc_capture_snapshot_manual_hwe(struct xe_guc *guc, struct xe_hw_engine *hwe); void xe_guc_capture_snapshot_print(struct xe_guc *guc, struct xe_guc_capture_snapshot *node, struct drm_printer *p); void xe_guc_capture_steered_list_init(struct 
xe_guc *guc); diff --git a/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h b/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h index a5579e69da2e..422470aa25a1 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h +++ b/drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h @@ -11,8 +11,12 @@ struct guc_mmio_reg; -enum xe_guc_capture_snapshot_source { - XE_ENGINE_CAPTURE_SOURCE_MANUAL, +enum xe_engine_capture_source { + /* KMD captured engine registers when job timeout is detected */ + XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB, + /* KMD captured raw engine registers without any job association */ + XE_ENGINE_CAPTURE_SOURCE_MANUAL_RAW, + /* GUC-FW captured engine registers before workload was killed */ XE_ENGINE_CAPTURE_SOURCE_GUC }; @@ -40,7 +44,7 @@ struct xe_guc_capture_snapshot { u32 lrca; u32 type; bool locked; - enum xe_guc_capture_snapshot_source source; + enum xe_engine_capture_source source; struct gcap_reg_list_info { u32 vfid; u32 num_regs; diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 6e33081dd7b8..4d7530e8bf63 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -1079,7 +1079,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job) /* * Generate a manual capture. Below function will store it * in GuC Error Capture's internal link-list as if it came from GuC - * but with a source-type == XE_ENGINE_CAPTURE_SOURCE_MANUAL + * but with a source-type == XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB */ xe_guc_capture_snapshot_store_manual_job(guc, q); xe_force_wake_put(gt_to_fw(q->gt), fw_ref); diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index fef01d2086a8..d0ed0639ae08 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -832,7 +832,7 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec) /** * hw_engine_snapshot_capture - Take a quick snapshot of the HW Engine. * @hwe: Xe HW Engine. 
- * @q: The exec queue object. + * @q: The exec queue object. (can be NULL for debugfs engine-register dump) * * This can be printed out in a later stage like during dev_coredump * analysis. @@ -845,9 +845,11 @@ hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) { struct xe_hw_engine_snapshot *snapshot; struct xe_guc_capture_snapshot *node; + struct xe_guc *guc; if (!xe_hw_engine_is_valid(hwe)) return NULL; + guc = &hwe->gt->uc.guc; snapshot = kzalloc(sizeof(*snapshot), GFP_ATOMIC); @@ -869,7 +871,7 @@ hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) if (q) { /* First, retrieve the manual GuC-Error-Capture node if it exists */ - node = xe_guc_capture_get_matching_and_lock(q, XE_ENGINE_CAPTURE_SOURCE_MANUAL); + node = xe_guc_capture_get_matching_and_lock(q, XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB); /* Find preferred node type sourced from firmware if available */ snapshot->matched_node = xe_guc_capture_get_matching_and_lock(q, XE_ENGINE_CAPTURE_SOURCE_GUC); if (!snapshot->matched_node) { @@ -877,13 +879,22 @@ hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q) snapshot->matched_node = node; } else if (node) { xe_gt_dbg(hwe->gt, "Found manual GuC-Err-Capture for queue %s", q->name); - xe_guc_capture_put_matched_nodes(&hwe->gt->uc.guc, node); + xe_guc_capture_put_matched_nodes(guc, node); } if (!snapshot->matched_node) xe_gt_dbg(hwe->gt, "Can't retrieve any GuC-Err-Capture node for queue %s", q->name); } + if (!snapshot->matched_node) { + /* + * Fallback path - do an immediate jobless manual engine capture. + * This will happen when debugfs is triggered to force an engine dump. 
+ */ + snapshot->matched_node = xe_guc_capture_snapshot_manual_hwe(guc, hwe); + xe_gt_dbg(hwe->gt, "Fallback to jobless-manual-err-capture node"); + } + return snapshot; } From patchwork Thu Feb 13 19:51:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Previn X-Patchwork-Id: 13973979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 50C8BC021A4 for ; Thu, 13 Feb 2025 19:51:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3BFCE10EB94; Thu, 13 Feb 2025 19:51:43 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="VBEBgSHL"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8CE9310EB8C; Thu, 13 Feb 2025 19:51:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739476301; x=1771012301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=j1b48mZQZrOwtZCbIamUNkWKk66t9YWs8MImyttoz7Q=; b=VBEBgSHLTHWnIFhGyhIsYusa5JCftLhQugWKb1L+qLIMvCp20JMw+oCa rtuYVIGeBM2aDloQru2v/OfAptkqXpPmFV9T7xgg12Aknl3MAkL5RKkqb 7m/6FAhqjF3NQ6DvqTmSKhzzt5FsVgOZkcaefzX+0WgeV08zz467vjmgA 0S89xRruLusnGeEgOpU+YoGLiMhp35uN2LO/vp9+h/W47Beq9AyD3SGiz vU7gzi/HCYMmyGZkhmNuyvUsRO9tt/9ggGmWs5qD7tTltq23y1H+SufHJ s3fLEFugWDzlBb14XuvaUpnSfNJKQyCPF5C1s3kWfm8XRsZNc58UezvBI A==; X-CSE-ConnectionGUID: 9s2tiR08RvyaKzcd1vuGCg== X-CSE-MsgGUID: jJpRqJYyQw21hpNNH6AECw== X-IronPort-AV: 
E=McAfee;i="6700,10204,11344"; a="40354751" X-IronPort-AV: E=Sophos;i="6.13,282,1732608000"; d="scan'208";a="40354751" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2025 11:51:41 -0800 X-CSE-ConnectionGUID: NpLOmgLFQ2WfUKscLqEkfw== X-CSE-MsgGUID: BbkqywndR9yDcbfkWmF7cg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="117372227" Received: from aalteres-desk1.fm.intel.com ([10.1.39.140]) by fmviesa003.fm.intel.com with ESMTP; 13 Feb 2025 11:51:41 -0800 From: Alan Previn To: intel-xe@lists.freedesktop.org Cc: Alan Previn , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , John Harrison , Matthew Brost , Zhanjun Dong , Rodrigo Vivi Subject: [PATCH v8 6/6] drm/xe/guc: Update comments on GuC-Err-Capture flows Date: Thu, 13 Feb 2025 11:51:39 -0800 Message-Id: <20250213195139.3396082-7-alan.previn.teres.alexis@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> References: <20250213195139.3396082-1-alan.previn.teres.alexis@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Update the comments on GuC-Err-Capture flows with the updated function names. 
Signed-off-by: Alan Previn Reviewed-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_guc_capture.c | 42 ++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c index 4ab71dfa7a20..0996b32fcee7 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -905,22 +905,38 @@ guc_capture_init_node(struct xe_guc *guc, struct xe_guc_capture_snapshot *node) * list. This list is used for matchup and printout by xe_devcoredump_read * and xe_engine_snapshot_print, (when user invokes the devcoredump sysfs). * - * GUC --> notify context reset: - * ----------------------------- + * DRM Scheduler job-timeout OR GuC-notify guc-id reset: + * ----------------------------------------------------- * --> guc_exec_queue_timedout_job - * L--> xe_devcoredump + * L--> xe_guc_capture_snapshot_store_manual_job (if GuC didn't report an + * error capture node for this job) + * L--> xe_devcoredump * L--> devcoredump_snapshot - * --> xe_hw_engine_snapshot_capture - * --> xe_engine_manual_capture(For manual capture) + * --> xe_engine_snapshot_capture_for_queue * - * User Sysfs / Debugfs - * -------------------- - * --> xe_devcoredump_read-> - * L--> xxx_snapshot_print - * L--> xe_hw_engine_print --> xe_hw_engine_snapshot_print - * L--> xe_guc_capture_snapshot_print - * Print register lists values saved in matching - * node from guc->capture->outlist + * (Printing) User Devcoredump Sysfs + * --------------------------------- + * --> xe_devcoredump_read-> (user cats devcoredump) + * L--> xe_devcoredump_deferred_snap_work -> xe_devcoredump_deferred_snap_work + * L --> __xe_devcoredump_read -> xe_hw_engine_snapshot_print + * L--> xe_hw_engine_print -> xe_guc_capture_snapshot_print: + * Prints register list values saved in the matching node that + * was previously stored in guc->capture->outlist. 
However, if + * devcoredump was triggered in response to a gt_reset, then it's + * possible job queues may be lost or unavailable at the time of + * printing and a jobless capture would be taken. + * + * --> xe_devcoredump_free (when user clears the dump) + * L--> xe_devcoredump_snapshot_free --> xe_hw_engine_snapshot_free -> + * L--> xe_guc_capture_put_matched_nodes -> xe_guc_capture_put_matched_nodes + * + * (Printing) User Engine Dump via Debugfs + * --------------------------------------- + * --> xe_gt_debugfs_simple_show -> hw_engines -> xe_hw_engine_print + * L--> hw_engine_snapshot_capture -> xe_guc_capture_snapshot_manual_hwe + * L--> xe_guc_capture_snapshot_print (no valid queue provided) + * (unlike sysfs path above, fallback to jobless immediate dump) + * L--> xe_hw_engine_snapshot_free -> xe_guc_capture_put_matched_nodes + * */
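For readers following the series, the node-selection order that patch 5/6 gives hw_engine_snapshot_capture (a firmware-sourced node first, then the job-associated manual node, then the jobless raw capture) can be sketched as a standalone userspace program. This is only an illustration under stated assumptions, not driver code: struct demo_node, demo_source_name(), demo_pick_node() and demo_usage() are hypothetical names, the raw capture is passed in rather than taken lazily, and the bounds check before indexing the name table is an addition that the patch itself does not have (it indexes srctype[] directly with node->source).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the enum values introduced by patch 5/6. */
enum xe_engine_capture_source {
	XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB,
	XE_ENGINE_CAPTURE_SOURCE_MANUAL_RAW,
	XE_ENGINE_CAPTURE_SOURCE_GUC
};

/* Hypothetical stand-in for struct xe_guc_capture_snapshot. */
struct demo_node {
	enum xe_engine_capture_source source;
};

/*
 * Hypothetical helper: map a capture source to its printed name.
 * The range check is an assumption added for this sketch.
 */
const char *demo_source_name(enum xe_engine_capture_source src)
{
	static const char * const srctype[XE_ENGINE_CAPTURE_SOURCE_GUC + 1] = {
		[XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB] = "Manual-Job",
		[XE_ENGINE_CAPTURE_SOURCE_MANUAL_RAW] = "Manual-Raw",
		[XE_ENGINE_CAPTURE_SOURCE_GUC] = "GuC",
	};

	if ((unsigned int)src > (unsigned int)XE_ENGINE_CAPTURE_SOURCE_GUC)
		return "Unknown";
	return srctype[src];
}

/*
 * Hypothetical helper: prefer the GuC node, then the manual node tied to
 * a job, and only then the jobless raw capture. With no queue (debugfs),
 * the first two are NULL and the raw node is used.
 */
const struct demo_node *demo_pick_node(const struct demo_node *guc_node,
				       const struct demo_node *manual_job_node,
				       const struct demo_node *raw_node)
{
	if (guc_node)
		return guc_node;
	if (manual_job_node)
		return manual_job_node;
	return raw_node;
}

/* Exercises the three cases described in the commit message. */
void demo_usage(void)
{
	struct demo_node guc = { XE_ENGINE_CAPTURE_SOURCE_GUC };
	struct demo_node job = { XE_ENGINE_CAPTURE_SOURCE_MANUAL_JOB };
	struct demo_node raw = { XE_ENGINE_CAPTURE_SOURCE_MANUAL_RAW };

	assert(demo_pick_node(&guc, &job, &raw) == &guc);
	assert(demo_pick_node(NULL, &job, &raw) == &job);
	assert(demo_pick_node(NULL, NULL, &raw) == &raw);
	assert(strcmp(demo_source_name(raw.source), "Manual-Raw") == 0);
}
```

In the actual patch the raw capture is only generated when no matched node exists, so the third argument here stands in for the result of xe_guc_capture_snapshot_manual_hwe.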