From patchwork Mon Dec 9 13:32:52 2024
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 13899773
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
 Mika Kuoppala, Oleg Nesterov, linux-kernel@vger.kernel.org,
 Dave Airlie, Lucas De Marchi, Matthew Brost, Andi Shyti,
 Joonas Lahtinen, Maciej Patelczyk, Dominik Grzegorzek,
 Jonathan Cavitt, Andi Shyti
Subject: [PATCH 01/26] ptrace: export ptrace_may_access
Date: Mon, 9 Dec 2024 15:32:52 +0200
Message-ID: <20241209133318.1806472-2-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

The xe driver would like to allow fine-grained access control for a GDB
debugger using ptrace.
Without this export, the only option would be to check for
CAP_SYS_ADMIN. The check intended for an ioctl to attach a GPU debugger
is similar to the ptrace use case: allow a calling process to manipulate
a target process if it has the necessary capabilities or the same
permissions, as described in Documentation/process/adding-syscalls.rst.

Export the ptrace_may_access function so that a GPU debugger gets access
control identical to that of a CPU debugger.

v2: proper commit message (Lucas)

Cc: Oleg Nesterov
Cc: linux-kernel@vger.kernel.org
Cc: Dave Airlie
Cc: Lucas De Marchi
Cc: Matthew Brost
Cc: Andi Shyti
Cc: Joonas Lahtinen
Cc: Maciej Patelczyk
Cc: Dominik Grzegorzek
Signed-off-by: Mika Kuoppala
Signed-off-by: Jonathan Cavitt
Reviewed-by: Andi Shyti
---
 kernel/ptrace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d5f89f9ef29f..86be1805ebd8 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -354,6 +354,7 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 	task_unlock(task);
 	return !err;
 }
+EXPORT_SYMBOL_GPL(ptrace_may_access);
 
 static int check_ptrace_options(unsigned long data)
 {
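For illustration only (not part of this patch): with the export in place,
a module can gate its debugger-attach path on the same credential check a
CPU debugger must pass. Patch 02 of this series uses it exactly this way
in xe_eudebug_attach(); the helper below and its name are hypothetical.

#include <linux/ptrace.h>
#include <linux/sched.h>

/* Hypothetical helper: allow GPU debugger attach under ptrace rules */
static int gpu_debugger_may_attach(struct task_struct *target)
{
	/* The same check ptrace applies for read access with real creds */
	if (!ptrace_may_access(target, PTRACE_MODE_READ_REALCREDS))
		return -EACCES;

	return 0;
}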
From patchwork Mon Dec 9 13:32:53 2024
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 13899774
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
 Mika Kuoppala, Maarten Lankhorst, Lucas De Marchi,
 Dominik Grzegorzek, Andi Shyti, Matt Roper, Matthew Brost,
 Zbigniew Kempczyński, Andrzej Hajda, Maciej Patelczyk,
 Jonathan Cavitt, Christoph Manszewski
Subject: [PATCH 02/26] drm/xe/eudebug: Introduce eudebug support
Date: Mon, 9 Dec 2024 15:32:53 +0200
Message-ID: <20241209133318.1806472-3-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

With the eudebug event interface, a user-space debugger process (like
gdb) is able to keep track of resources created by another process (the
debuggee, using drm/xe) and act upon these resources. For example, the
debugger can find the client vm which contains the isa/elf for a
particular shader/eu-kernel and then inspect and modify it (for example,
installing a breakpoint).

The debugger first opens a connection to xe with a drm ioctl specifying
the target pid to connect to. This returns an anon fd handle that can
then be used to listen for events with a dedicated ioctl.

This patch introduces eudebug connection and event queuing, adding
client create/destroy and vm create/destroy events as a baseline. More
events are needed for full debugger operation and those will be
introduced in follow-up patches.

The resource tracking parts are inspired by the work of Maciej Patelczyk
on resource handling for i915. Chris Wilson suggested the improvement of
a two-way mapping, which makes it easy to use the resource map as a
definitive bookkeep of what resources have been replayed to the debugger
in the discovery phase (in a follow-up patch).
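As a sketch (not part of the patch), the connect-and-listen flow
described above looks like this from user space, against the uapi added
below; the drm fd setup, the buffer size and the error handling are
simplifying assumptions.

#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/xe_drm.h>

static int debug_loop(int drm_fd, unsigned long long debuggee_pid)
{
	struct drm_xe_eudebug_connect conn = { .pid = debuggee_pid };
	char buf[4096];
	struct drm_xe_eudebug_event *ev = (struct drm_xe_eudebug_event *)buf;
	int dbg_fd;

	/* Connect to the target pid; on success the ioctl returns an anon fd */
	dbg_fd = ioctl(drm_fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, &conn);
	if (dbg_fd < 0)
		return -1;

	for (;;) {
		/* type must be EVENT_READ and len the space available */
		memset(buf, 0, sizeof(buf));
		ev->type = DRM_XE_EUDEBUG_EVENT_READ;
		ev->len = sizeof(buf);

		if (ioctl(dbg_fd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, ev))
			break; /* disconnected or interrupted */

		printf("event: type=%u flags=0x%x seqno=%llu\n",
		       ev->type, ev->flags, (unsigned long long)ev->seqno);
	}

	return 0;
}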
v2:
- Kconfig support (Matthew)
- ptraced access control (Lucas)
- pass expected event length to user (Zbigniew)
- only track long running VMs
- checkpatch (Tilak)
- include order (Andrzej)
- 32bit fixes (Andrzej)
- cleaner get_task_struct
- remove xa_array and use clients.list for tracking (Mika)
v3:
- adapt to removal of clients.lock (Mika)
- create_event cleanup (Christoph)

Cc: Maarten Lankhorst
Cc: Lucas De Marchi
Cc: Dominik Grzegorzek
Cc: Andi Shyti
Cc: Matt Roper
Cc: Matthew Brost
Cc: Zbigniew Kempczyński
Cc: Andrzej Hajda
Signed-off-by: Mika Kuoppala
Signed-off-by: Maciej Patelczyk
Signed-off-by: Dominik Grzegorzek
Signed-off-by: Jonathan Cavitt
Signed-off-by: Christoph Manszewski
---
 drivers/gpu/drm/xe/Kconfig            |   10 +
 drivers/gpu/drm/xe/Makefile           |    2 +
 drivers/gpu/drm/xe/xe_device.c        |   10 +
 drivers/gpu/drm/xe/xe_device_types.h  |   35 +
 drivers/gpu/drm/xe/xe_eudebug.c       | 1118 +++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_eudebug.h       |   46 +
 drivers/gpu/drm/xe/xe_eudebug_types.h |  169 ++++
 drivers/gpu/drm/xe/xe_vm.c            |    7 +-
 include/uapi/drm/xe_drm.h             |   21 +
 include/uapi/drm/xe_drm_eudebug.h     |   56 ++
 10 files changed, 1473 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.c
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug.h
 create mode 100644 drivers/gpu/drm/xe/xe_eudebug_types.h
 create mode 100644 include/uapi/drm/xe_drm_eudebug.h

diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index b51a2bde73e2..ed97730b1af3 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -87,6 +87,16 @@ config DRM_XE_FORCE_PROBE
 	  Use "!*" to block the probe of the driver for all known devices.
 
+config DRM_XE_EUDEBUG
+	bool "Enable gdb debugger support (eudebug)"
+	depends on DRM_XE
+	default y
+	help
+	  Choose this option if you want to add support for a debugger (gdb)
+	  to attach to a process using Xe and debug gpu/gpgpu programs.
+	  With debugger support, Xe will provide an interface for a debugger
+	  process to track, inspect and modify resources.
+ menu "drm/Xe Debugging" depends on DRM_XE depends on EXPERT diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index 7730e0596299..deabcdd3ea52 100644 --- a/drivers/gpu/drm/xe/Makefile +++ b/drivers/gpu/drm/xe/Makefile @@ -116,6 +116,8 @@ xe-y += xe_bb.o \ xe_wa.o \ xe_wopcm.o +xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o + xe-$(CONFIG_HMM_MIRROR) += xe_hmm.o # graphics hardware monitoring (HWMON) support diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index d6fccea1e083..9ed0de1eba0b 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -28,6 +28,7 @@ #include "xe_dma_buf.h" #include "xe_drm_client.h" #include "xe_drv.h" +#include "xe_eudebug.h" #include "xe_exec.h" #include "xe_exec_queue.h" #include "xe_force_wake.h" @@ -100,6 +101,8 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file) put_task_struct(task); } + xe_eudebug_file_open(xef); + return 0; } @@ -153,6 +156,8 @@ static void xe_file_close(struct drm_device *dev, struct drm_file *file) xe_pm_runtime_get(xe); + xe_eudebug_file_close(xef); + /* * No need for exec_queue.lock here as there is no contention for it * when FD is closing as IOCTLs presumably can't be modifying the @@ -191,6 +196,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = { DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(XE_EUDEBUG_CONNECT, xe_eudebug_connect_ioctl, DRM_RENDER_ALLOW), }; static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) @@ -281,6 +287,8 @@ static void xe_device_destroy(struct drm_device *dev, void *dummy) { struct xe_device *xe = to_xe_device(dev); + xe_eudebug_fini(xe); + if (xe->preempt_fence_wq) destroy_workqueue(xe->preempt_fence_wq); @@ -352,6 +360,8 @@ struct xe_device *xe_device_create(struct pci_dev *pdev, INIT_LIST_HEAD(&xe->pinned.external_vram); INIT_LIST_HEAD(&xe->pinned.evicted); + xe_eudebug_init(xe); + xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq", WQ_MEM_RECLAIM); xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0); diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 1373a222f5a5..9f04e6476195 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -383,6 +383,17 @@ struct xe_device { struct workqueue_struct *wq; } sriov; +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + /** @clients: eudebug clients info */ + struct { + /** @clients.lock: Protects client list */ + spinlock_t lock; + + /** @clients.list: client list for eudebug discovery */ + struct list_head list; + } clients; +#endif + /** @usm: unified memory state */ struct { /** @usm.asid: convert a ASID to VM */ @@ -525,6 +536,23 @@ struct xe_device { u8 vm_inject_error_position; #endif +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + /** @eudebug: debugger connection list and globals for device */ + struct { + /** @lock: protects the list of connections */ + spinlock_t lock; + + /** @list: list of connections, aka debuggers */ + struct list_head list; + + /** @session_count: session counter to track connections */ + u64 session_count; + + /** @available: is the debugging functionality available */ + bool available; + } eudebug; +#endif + /* private: */ #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY) @@ -642,6 +670,13 @@ struct xe_file { /** @refcount: ref count of this xe file */ struct kref refcount; + +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) +
struct { + /** @client_link: list entry in xe_device.clients.list */ + struct list_head client_link; + } eudebug; +#endif }; #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c new file mode 100644 index 000000000000..bbb5f1e81bb8 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -0,0 +1,1118 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#include +#include +#include +#include + +#include + +#include "xe_assert.h" +#include "xe_device.h" +#include "xe_eudebug.h" +#include "xe_eudebug_types.h" +#include "xe_macros.h" +#include "xe_vm.h" + +/* + * If there is no detected event read by userspace, during this period, assume + * userspace problem and disconnect debugger to allow forward progress. + */ +#define XE_EUDEBUG_NO_READ_DETECTED_TIMEOUT_MS (25 * 1000) + +#define for_each_debugger_rcu(debugger, head) \ + list_for_each_entry_rcu((debugger), (head), connection_link) +#define for_each_debugger(debugger, head) \ + list_for_each_entry((debugger), (head), connection_link) + +#define cast_event(T, event) container_of((event), typeof(*(T)), base) + +#define XE_EUDEBUG_DBG_STR "eudbg: %lld:%lu:%s (%d/%d) -> (%d/%d): " +#define XE_EUDEBUG_DBG_ARGS(d) (d)->session, \ + atomic_long_read(&(d)->events.seqno), \ + READ_ONCE(d->connection.status) <= 0 ? "disconnected" : "", \ + current->pid, \ + task_tgid_nr(current), \ + (d)->target_task->pid, \ + task_tgid_nr((d)->target_task) + +#define eu_err(d, fmt, ...) drm_err(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \ + XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__) +#define eu_warn(d, fmt, ...) drm_warn(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \ + XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__) +#define eu_dbg(d, fmt, ...) drm_dbg(&(d)->xe->drm, XE_EUDEBUG_DBG_STR # fmt, \ + XE_EUDEBUG_DBG_ARGS(d), ##__VA_ARGS__) + +#define xe_eudebug_assert(d, ...) xe_assert((d)->xe, ##__VA_ARGS__) + +#define struct_member(T, member) (((T *)0)->member) + +/* Keep 1:1 parity with uapi events */ +#define write_member(T_out, ptr, member, value) { \ + BUILD_BUG_ON(sizeof(*ptr) != sizeof(T_out)); \ + BUILD_BUG_ON(offsetof(typeof(*ptr), member) != \ + offsetof(typeof(T_out), member)); \ + BUILD_BUG_ON(sizeof(ptr->member) != sizeof(value)); \ + BUILD_BUG_ON(sizeof(struct_member(T_out, member)) != sizeof(value)); \ + BUILD_BUG_ON(!typecheck(typeof((ptr)->member), value)); \ + (ptr)->member = (value); \ + } + +static struct xe_eudebug_event * +event_fifo_pending(struct xe_eudebug *d) +{ + struct xe_eudebug_event *event; + + if (kfifo_peek(&d->events.fifo, &event)) + return event; + + return NULL; +} + +/* + * This is racy as we dont take the lock for read but all the + * callsites can handle the race so we can live without lock. + */ +__no_kcsan +static unsigned int +event_fifo_num_events_peek(const struct xe_eudebug * const d) +{ + return kfifo_len(&d->events.fifo); +} + +static bool +xe_eudebug_detached(struct xe_eudebug *d) +{ + int status; + + spin_lock(&d->connection.lock); + status = d->connection.status; + spin_unlock(&d->connection.lock); + + return status <= 0; +} + +static int +xe_eudebug_error(const struct xe_eudebug * const d) +{ + const int status = READ_ONCE(d->connection.status); + + return status <= 0 ? 
status : 0; +} + +static unsigned int +event_fifo_has_events(struct xe_eudebug *d) +{ + if (xe_eudebug_detached(d)) + return 1; + + return event_fifo_num_events_peek(d); +} + +static const struct rhashtable_params rhash_res = { + .head_offset = offsetof(struct xe_eudebug_handle, rh_head), + .key_len = sizeof_field(struct xe_eudebug_handle, key), + .key_offset = offsetof(struct xe_eudebug_handle, key), + .automatic_shrinking = true, +}; + +static struct xe_eudebug_resource * +resource_from_type(struct xe_eudebug_resources * const res, const int t) +{ + return &res->rt[t]; +} + +static struct xe_eudebug_resources * +xe_eudebug_resources_alloc(void) +{ + struct xe_eudebug_resources *res; + int err; + int i; + + res = kzalloc(sizeof(*res), GFP_ATOMIC); + if (!res) + return ERR_PTR(-ENOMEM); + + mutex_init(&res->lock); + + for (i = 0; i < XE_EUDEBUG_RES_TYPE_COUNT; i++) { + xa_init_flags(&res->rt[i].xa, XA_FLAGS_ALLOC1); + err = rhashtable_init(&res->rt[i].rh, &rhash_res); + + if (err) + break; + } + + if (err) { + while (i--) { + xa_destroy(&res->rt[i].xa); + rhashtable_destroy(&res->rt[i].rh); + } + + kfree(res); + return ERR_PTR(err); + } + + return res; +} + +static void res_free_fn(void *ptr, void *arg) +{ + XE_WARN_ON(ptr); + kfree(ptr); +} + +static void +xe_eudebug_destroy_resources(struct xe_eudebug *d) +{ + struct xe_eudebug_resources *res = d->res; + struct xe_eudebug_handle *h; + unsigned long j; + int i; + int err; + + mutex_lock(&res->lock); + for (i = 0; i < XE_EUDEBUG_RES_TYPE_COUNT; i++) { + struct xe_eudebug_resource *r = &res->rt[i]; + + xa_for_each(&r->xa, j, h) { + struct xe_eudebug_handle *t; + + err = rhashtable_remove_fast(&r->rh, + &h->rh_head, + rhash_res); + xe_eudebug_assert(d, !err); + t = xa_erase(&r->xa, h->id); + xe_eudebug_assert(d, t == h); + kfree(t); + } + } + mutex_unlock(&res->lock); + + for (i = 0; i < XE_EUDEBUG_RES_TYPE_COUNT; i++) { + struct xe_eudebug_resource *r = &res->rt[i]; + + rhashtable_free_and_destroy(&r->rh, res_free_fn, NULL); + xe_eudebug_assert(d, xa_empty(&r->xa)); + xa_destroy(&r->xa); + } + + mutex_destroy(&res->lock); + + kfree(res); +} + +static void xe_eudebug_free(struct kref *ref) +{ + struct xe_eudebug *d = container_of(ref, typeof(*d), ref); + struct xe_eudebug_event *event; + + while (kfifo_get(&d->events.fifo, &event)) + kfree(event); + + xe_eudebug_destroy_resources(d); + put_task_struct(d->target_task); + + xe_eudebug_assert(d, !kfifo_len(&d->events.fifo)); + + kfree_rcu(d, rcu); +} + +static void xe_eudebug_put(struct xe_eudebug *d) +{ + kref_put(&d->ref, xe_eudebug_free); +} + +static struct task_struct *find_get_target(const pid_t nr) +{ + struct task_struct *task; + + rcu_read_lock(); + task = pid_task(find_pid_ns(nr, task_active_pid_ns(current)), PIDTYPE_PID); + if (task) + get_task_struct(task); + rcu_read_unlock(); + + return task; +} + +static int +xe_eudebug_attach(struct xe_device *xe, struct xe_eudebug *d, + const pid_t pid_nr) +{ + struct task_struct *target; + struct xe_eudebug *iter; + int ret = 0; + + target = find_get_target(pid_nr); + if (!target) + return -ENOENT; + + if (!ptrace_may_access(target, PTRACE_MODE_READ_REALCREDS)) { + put_task_struct(target); + return -EACCES; + } + + XE_WARN_ON(d->connection.status != 0); + + spin_lock(&xe->eudebug.lock); + for_each_debugger(iter, &xe->eudebug.list) { + if (!same_thread_group(iter->target_task, target)) + continue; + + ret = -EBUSY; + } + + if (!ret && xe->eudebug.session_count + 1 == 0) + ret = -ENOSPC; + + if (!ret) { + d->connection.status = 
XE_EUDEBUG_STATUS_CONNECTED; + d->xe = xe; + d->target_task = get_task_struct(target); + d->session = ++xe->eudebug.session_count; + kref_get(&d->ref); + list_add_tail_rcu(&d->connection_link, &xe->eudebug.list); + } + spin_unlock(&xe->eudebug.lock); + + put_task_struct(target); + + return ret; +} + +static bool xe_eudebug_detach(struct xe_device *xe, + struct xe_eudebug *d, + const int err) +{ + bool detached = false; + + XE_WARN_ON(err > 0); + + spin_lock(&d->connection.lock); + if (d->connection.status == XE_EUDEBUG_STATUS_CONNECTED) { + d->connection.status = err; + detached = true; + } + spin_unlock(&d->connection.lock); + + if (!detached) + return false; + + spin_lock(&xe->eudebug.lock); + list_del_rcu(&d->connection_link); + spin_unlock(&xe->eudebug.lock); + + eu_dbg(d, "session %lld detached with %d", d->session, err); + + /* Our ref with the connection_link */ + xe_eudebug_put(d); + + return true; +} + +static int _xe_eudebug_disconnect(struct xe_eudebug *d, + const int err) +{ + wake_up_all(&d->events.write_done); + wake_up_all(&d->events.read_done); + + return xe_eudebug_detach(d->xe, d, err); +} + +#define xe_eudebug_disconnect(_d, _err) ({ \ + if (_xe_eudebug_disconnect((_d), (_err))) { \ + if ((_err) == 0 || (_err) == -ETIMEDOUT) \ + eu_dbg(d, "Session closed (%d)", (_err)); \ + else \ + eu_err(d, "Session disconnected, err = %d (%s:%d)", \ + (_err), __func__, __LINE__); \ + } \ +}) + +static int xe_eudebug_release(struct inode *inode, struct file *file) +{ + struct xe_eudebug *d = file->private_data; + + xe_eudebug_disconnect(d, 0); + xe_eudebug_put(d); + + return 0; +} + +static __poll_t xe_eudebug_poll(struct file *file, poll_table *wait) +{ + struct xe_eudebug * const d = file->private_data; + __poll_t ret = 0; + + poll_wait(file, &d->events.write_done, wait); + + if (xe_eudebug_detached(d)) { + ret |= EPOLLHUP; + if (xe_eudebug_error(d)) + ret |= EPOLLERR; + } + + if (event_fifo_num_events_peek(d)) + ret |= EPOLLIN; + + return ret; +} + +static ssize_t xe_eudebug_read(struct file *file, + char __user *buf, + size_t count, + loff_t *ppos) +{ + return -EINVAL; +} + +static struct xe_eudebug * +xe_eudebug_for_task_get(struct xe_device *xe, + struct task_struct *task) +{ + struct xe_eudebug *d, *iter; + + d = NULL; + + rcu_read_lock(); + for_each_debugger_rcu(iter, &xe->eudebug.list) { + if (!same_thread_group(iter->target_task, task)) + continue; + + if (kref_get_unless_zero(&iter->ref)) + d = iter; + + break; + } + rcu_read_unlock(); + + return d; +} + +static struct task_struct *find_task_get(struct xe_file *xef) +{ + struct task_struct *task; + struct pid *pid; + + rcu_read_lock(); + pid = rcu_dereference(xef->drm->pid); + task = pid_task(pid, PIDTYPE_PID); + if (task) + get_task_struct(task); + rcu_read_unlock(); + + return task; +} + +static struct xe_eudebug * +xe_eudebug_get(struct xe_file *xef) +{ + struct task_struct *task; + struct xe_eudebug *d; + + d = NULL; + task = find_task_get(xef); + if (task) { + d = xe_eudebug_for_task_get(to_xe_device(xef->drm->minor->dev), + task); + put_task_struct(task); + } + + if (!d) + return NULL; + + if (xe_eudebug_detached(d)) { + xe_eudebug_put(d); + return NULL; + } + + return d; +} + +static int xe_eudebug_queue_event(struct xe_eudebug *d, + struct xe_eudebug_event *event) +{ + const u64 wait_jiffies = msecs_to_jiffies(1000); + u64 last_read_detected_ts, last_head_seqno, start_ts; + + xe_eudebug_assert(d, event->len > sizeof(struct xe_eudebug_event)); + xe_eudebug_assert(d, event->type); + xe_eudebug_assert(d, event->type != 
DRM_XE_EUDEBUG_EVENT_READ); + start_ts = ktime_get(); + last_read_detected_ts = start_ts; + last_head_seqno = 0; + + do { + struct xe_eudebug_event *head; + u64 head_seqno; + bool was_queued; + + if (xe_eudebug_detached(d)) + break; + + spin_lock(&d->events.lock); + head = event_fifo_pending(d); + if (head) + head_seqno = head->seqno; + else + head_seqno = 0; + + was_queued = kfifo_in(&d->events.fifo, &event, 1); + spin_unlock(&d->events.lock); + + wake_up_all(&d->events.write_done); + + if (was_queued) { + event = NULL; + break; + } + + XE_WARN_ON(!head_seqno); + + /* If we detect progress, restart timeout */ + if (last_head_seqno != head_seqno) + last_read_detected_ts = ktime_get(); + + last_head_seqno = head_seqno; + + wait_event_interruptible_timeout(d->events.read_done, + !kfifo_is_full(&d->events.fifo), + wait_jiffies); + + } while (ktime_ms_delta(ktime_get(), last_read_detected_ts) < + XE_EUDEBUG_NO_READ_DETECTED_TIMEOUT_MS); + + if (event) { + eu_dbg(d, + "event %llu queue failed (blocked %lld ms, avail %d)", + event ? event->seqno : 0, + ktime_ms_delta(ktime_get(), start_ts), + kfifo_avail(&d->events.fifo)); + + kfree(event); + + return -ETIMEDOUT; + } + + return 0; +} + +static struct xe_eudebug_handle * +alloc_handle(const int type, const u64 key) +{ + struct xe_eudebug_handle *h; + + h = kzalloc(sizeof(*h), GFP_ATOMIC); + if (!h) + return NULL; + + h->key = key; + + return h; +} + +static struct xe_eudebug_handle * +__find_handle(struct xe_eudebug_resource *r, + const u64 key) +{ + struct xe_eudebug_handle *h; + + h = rhashtable_lookup_fast(&r->rh, + &key, + rhash_res); + return h; +} + +static int find_handle(struct xe_eudebug_resources *res, + const int type, + const void *p) +{ + const u64 key = (uintptr_t)p; + struct xe_eudebug_resource *r; + struct xe_eudebug_handle *h; + int id; + + if (XE_WARN_ON(!key)) + return -EINVAL; + + r = resource_from_type(res, type); + + mutex_lock(&res->lock); + h = __find_handle(r, key); + id = h ? h->id : -ENOENT; + mutex_unlock(&res->lock); + + return id; +} + +static int _xe_eudebug_add_handle(struct xe_eudebug *d, + int type, + void *p, + u64 *seqno, + int *handle) +{ + const u64 key = (uintptr_t)p; + struct xe_eudebug_resource *r; + struct xe_eudebug_handle *h, *o; + int err; + + if (XE_WARN_ON(!p)) + return -EINVAL; + + if (xe_eudebug_detached(d)) + return -ENOTCONN; + + h = alloc_handle(type, key); + if (!h) + return -ENOMEM; + + r = resource_from_type(d->res, type); + + mutex_lock(&d->res->lock); + o = __find_handle(r, key); + if (!o) { + err = xa_alloc(&r->xa, &h->id, h, xa_limit_31b, GFP_KERNEL); + + if (h->id >= INT_MAX) { + xa_erase(&r->xa, h->id); + err = -ENOSPC; + } + + if (!err) + err = rhashtable_insert_fast(&r->rh, + &h->rh_head, + rhash_res); + + if (err) { + xa_erase(&r->xa, h->id); + } else { + if (seqno) + *seqno = atomic_long_inc_return(&d->events.seqno); + } + } else { + xe_eudebug_assert(d, o->id); + err = -EEXIST; + } + mutex_unlock(&d->res->lock); + + if (handle) + *handle = o ?
o->id : h->id; + + if (err) { + kfree(h); + XE_WARN_ON(err > 0); + return err; + } + + xe_eudebug_assert(d, h->id); + + return h->id; +} + +static int xe_eudebug_add_handle(struct xe_eudebug *d, + int type, + void *p, + u64 *seqno) +{ + int ret; + + ret = _xe_eudebug_add_handle(d, type, p, seqno, NULL); + if (ret == -EEXIST || ret == -ENOTCONN) { + eu_dbg(d, "%d on adding %d", ret, type); + return 0; + } + + if (ret < 0) + xe_eudebug_disconnect(d, ret); + + return ret; +} + +static int _xe_eudebug_remove_handle(struct xe_eudebug *d, int type, void *p, + u64 *seqno) +{ + const u64 key = (uintptr_t)p; + struct xe_eudebug_resource *r; + struct xe_eudebug_handle *h, *xa_h; + int ret; + + if (XE_WARN_ON(!key)) + return -EINVAL; + + if (xe_eudebug_detached(d)) + return -ENOTCONN; + + r = resource_from_type(d->res, type); + + mutex_lock(&d->res->lock); + h = __find_handle(r, key); + if (h) { + ret = rhashtable_remove_fast(&r->rh, + &h->rh_head, + rhash_res); + xe_eudebug_assert(d, !ret); + xa_h = xa_erase(&r->xa, h->id); + xe_eudebug_assert(d, xa_h == h); + if (!ret) { + ret = h->id; + if (seqno) + *seqno = atomic_long_inc_return(&d->events.seqno); + } + } else { + ret = -ENOENT; + } + mutex_unlock(&d->res->lock); + + kfree(h); + + xe_eudebug_assert(d, ret); + + return ret; +} + +static int xe_eudebug_remove_handle(struct xe_eudebug *d, int type, void *p, + u64 *seqno) +{ + int ret; + + ret = _xe_eudebug_remove_handle(d, type, p, seqno); + if (ret == -ENOENT || ret == -ENOTCONN) { + eu_dbg(d, "%d on removing %d", ret, type); + return 0; + } + + if (ret < 0) + xe_eudebug_disconnect(d, ret); + + return ret; +} + +static struct xe_eudebug_event * +xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, + u32 len) +{ + const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM; + const u16 known_flags = + DRM_XE_EUDEBUG_EVENT_CREATE | + DRM_XE_EUDEBUG_EVENT_DESTROY | + DRM_XE_EUDEBUG_EVENT_STATE_CHANGE | + DRM_XE_EUDEBUG_EVENT_NEED_ACK; + struct xe_eudebug_event *event; + + BUILD_BUG_ON(type > max_event); + + xe_eudebug_assert(d, type <= max_event); + xe_eudebug_assert(d, !(~known_flags & flags)); + xe_eudebug_assert(d, len > sizeof(*event)); + + event = kzalloc(len, GFP_KERNEL); + if (!event) + return NULL; + + event->len = len; + event->type = type; + event->flags = flags; + event->seqno = seqno; + + return event; +} + +static long xe_eudebug_read_event(struct xe_eudebug *d, + const u64 arg, + const bool wait) +{ + struct xe_device *xe = d->xe; + struct drm_xe_eudebug_event __user * const user_orig = + u64_to_user_ptr(arg); + struct drm_xe_eudebug_event user_event; + struct xe_eudebug_event *event; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM; + long ret = 0; + + if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, !user_event.type)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, user_event.type > max_event)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, user_event.type != DRM_XE_EUDEBUG_EVENT_READ)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, user_event.len < sizeof(*user_orig))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, user_event.flags)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, user_event.reserved)) + return -EINVAL; + + /* XXX: define wait time in connect arguments ? 
*/ + if (wait) { + ret = wait_event_interruptible_timeout(d->events.write_done, + event_fifo_has_events(d), + msecs_to_jiffies(5 * 1000)); + + if (XE_IOCTL_DBG(xe, ret < 0)) + return ret; + } + + ret = 0; + spin_lock(&d->events.lock); + event = event_fifo_pending(d); + if (event) { + if (user_event.len < event->len) { + ret = -EMSGSIZE; + } else if (!kfifo_out(&d->events.fifo, &event, 1)) { + eu_warn(d, "internal fifo corruption"); + ret = -ENOTCONN; + } + } + spin_unlock(&d->events.lock); + + wake_up_all(&d->events.read_done); + + if (ret == -EMSGSIZE && put_user(event->len, &user_orig->len)) + ret = -EFAULT; + + if (XE_IOCTL_DBG(xe, ret)) + return ret; + + if (!event) { + if (xe_eudebug_detached(d)) + return -ENOTCONN; + if (!wait) + return -EAGAIN; + + return -ENOENT; + } + + if (copy_to_user(user_orig, event, event->len)) + ret = -EFAULT; + else + eu_dbg(d, "event read: type=%u, flags=0x%x, seqno=%llu", event->type, + event->flags, event->seqno); + + kfree(event); + + return ret; +} + +static long xe_eudebug_ioctl(struct file *file, + unsigned int cmd, + unsigned long arg) +{ + struct xe_eudebug * const d = file->private_data; + long ret; + + switch (cmd) { + case DRM_XE_EUDEBUG_IOCTL_READ_EVENT: + ret = xe_eudebug_read_event(d, arg, + !(file->f_flags & O_NONBLOCK)); + break; + + default: + ret = -EINVAL; + } + + return ret; +} + +static const struct file_operations fops = { + .owner = THIS_MODULE, + .release = xe_eudebug_release, + .poll = xe_eudebug_poll, + .read = xe_eudebug_read, + .unlocked_ioctl = xe_eudebug_ioctl, +}; + +static int +xe_eudebug_connect(struct xe_device *xe, + struct drm_xe_eudebug_connect *param) +{ + const u64 known_open_flags = 0; + unsigned long f_flags = 0; + struct xe_eudebug *d; + int fd, err; + + if (param->extensions) + return -EINVAL; + + if (!param->pid) + return -EINVAL; + + if (param->flags & ~known_open_flags) + return -EINVAL; + + if (param->version && param->version != DRM_XE_EUDEBUG_VERSION) + return -EINVAL; + + param->version = DRM_XE_EUDEBUG_VERSION; + + if (!xe->eudebug.available) + return -EOPNOTSUPP; + + d = kzalloc(sizeof(*d), GFP_KERNEL); + if (!d) + return -ENOMEM; + + kref_init(&d->ref); + spin_lock_init(&d->connection.lock); + init_waitqueue_head(&d->events.write_done); + init_waitqueue_head(&d->events.read_done); + + spin_lock_init(&d->events.lock); + INIT_KFIFO(d->events.fifo); + + d->res = xe_eudebug_resources_alloc(); + if (IS_ERR(d->res)) { + err = PTR_ERR(d->res); + goto err_free; + } + + err = xe_eudebug_attach(xe, d, param->pid); + if (err) + goto err_free_res; + + fd = anon_inode_getfd("[xe_eudebug]", &fops, d, f_flags); + if (fd < 0) { + err = fd; + goto err_detach; + } + + eu_dbg(d, "connected session %lld", d->session); + + return fd; + +err_detach: + xe_eudebug_detach(xe, d, err); +err_free_res: + xe_eudebug_destroy_resources(d); +err_free: + kfree(d); + + return err; +} + +int xe_eudebug_connect_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file) +{ + struct xe_device *xe = to_xe_device(dev); + struct drm_xe_eudebug_connect * const param = data; + int ret = 0; + + ret = xe_eudebug_connect(xe, param); + + return ret; +} + +void xe_eudebug_init(struct xe_device *xe) +{ + spin_lock_init(&xe->eudebug.lock); + INIT_LIST_HEAD(&xe->eudebug.list); + + spin_lock_init(&xe->clients.lock); + INIT_LIST_HEAD(&xe->clients.list); + + xe->eudebug.available = true; +} + +void xe_eudebug_fini(struct xe_device *xe) +{ + xe_assert(xe, list_empty_careful(&xe->eudebug.list)); +} + +static int send_open_event(struct xe_eudebug 
*d, u32 flags, const u64 handle, + const u64 seqno) +{ + struct xe_eudebug_event *event; + struct xe_eudebug_event_open *eo; + + if (!handle) + return -EINVAL; + + if (XE_WARN_ON((long)handle >= INT_MAX)) + return -EINVAL; + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_OPEN, seqno, + flags, sizeof(*eo)); + if (!event) + return -ENOMEM; + + eo = cast_event(eo, event); + + write_member(struct drm_xe_eudebug_event_client, eo, + client_handle, handle); + + return xe_eudebug_queue_event(d, event); +} + +static int client_create_event(struct xe_eudebug *d, struct xe_file *xef) +{ + u64 seqno; + int ret; + + ret = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_CLIENT, xef, &seqno); + if (ret > 0) + ret = send_open_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, + ret, seqno); + + return ret; +} + +static int client_destroy_event(struct xe_eudebug *d, struct xe_file *xef) +{ + u64 seqno; + int ret; + + ret = xe_eudebug_remove_handle(d, XE_EUDEBUG_RES_TYPE_CLIENT, + xef, &seqno); + if (ret > 0) + ret = send_open_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY, + ret, seqno); + + return ret; +} + +#define xe_eudebug_event_put(_d, _err) ({ \ + if ((_err)) \ + xe_eudebug_disconnect((_d), (_err)); \ + xe_eudebug_put((_d)); \ + }) + +void xe_eudebug_file_open(struct xe_file *xef) +{ + struct xe_eudebug *d; + + INIT_LIST_HEAD(&xef->eudebug.client_link); + spin_lock(&xef->xe->clients.lock); + list_add_tail(&xef->eudebug.client_link, &xef->xe->clients.list); + spin_unlock(&xef->xe->clients.lock); + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, client_create_event(d, xef)); +} + +void xe_eudebug_file_close(struct xe_file *xef) +{ + struct xe_eudebug *d; + + d = xe_eudebug_get(xef); + if (d) + xe_eudebug_event_put(d, client_destroy_event(d, xef)); + + spin_lock(&xef->xe->clients.lock); + list_del_init(&xef->eudebug.client_link); + spin_unlock(&xef->xe->clients.lock); +} + +static int send_vm_event(struct xe_eudebug *d, u32 flags, + const u64 client_handle, + const u64 vm_handle, + const u64 seqno) +{ + struct xe_eudebug_event *event; + struct xe_eudebug_event_vm *e; + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM, + seqno, flags, sizeof(*e)); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_vm, e, client_handle, client_handle); + write_member(struct drm_xe_eudebug_event_vm, e, vm_handle, vm_handle); + + return xe_eudebug_queue_event(d, event); +} + +static int vm_create_event(struct xe_eudebug *d, + struct xe_file *xef, struct xe_vm *vm) +{ + int h_c, h_vm; + u64 seqno; + int ret; + + if (!xe_vm_in_lr_mode(vm)) + return 0; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); + if (h_c < 0) + return h_c; + + xe_eudebug_assert(d, h_c); + + h_vm = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_VM, vm, &seqno); + if (h_vm <= 0) + return h_vm; + + ret = send_vm_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, h_c, h_vm, seqno); + + return ret; +} + +static int vm_destroy_event(struct xe_eudebug *d, + struct xe_file *xef, struct xe_vm *vm) +{ + int h_c, h_vm; + u64 seqno; + + if (!xe_vm_in_lr_mode(vm)) + return 0; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); + if (h_c < 0) { + XE_WARN_ON("no client found for vm"); + eu_warn(d, "no client found for vm"); + return h_c; + } + + xe_eudebug_assert(d, h_c); + + h_vm = xe_eudebug_remove_handle(d, XE_EUDEBUG_RES_TYPE_VM, vm, &seqno); + if (h_vm <= 0) + return h_vm; + + return send_vm_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY, h_c, h_vm, seqno); +} + +void 
xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm) +{ + struct xe_eudebug *d; + + if (!xe_vm_in_lr_mode(vm)) + return; + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, vm_create_event(d, xef, vm)); +} + +void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) +{ + struct xe_eudebug *d; + + if (!xe_vm_in_lr_mode(vm)) + return; + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, vm_destroy_event(d, xef, vm)); +} diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h new file mode 100644 index 000000000000..e3247365f72f --- /dev/null +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef _XE_EUDEBUG_H_ +#define _XE_EUDEBUG_H_ + +struct drm_device; +struct drm_file; +struct xe_device; +struct xe_file; +struct xe_vm; + +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + +int xe_eudebug_connect_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file); + +void xe_eudebug_init(struct xe_device *xe); +void xe_eudebug_fini(struct xe_device *xe); + +void xe_eudebug_file_open(struct xe_file *xef); +void xe_eudebug_file_close(struct xe_file *xef); + +void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm); +void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm); + +#else + +static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file) { return 0; } + +static inline void xe_eudebug_init(struct xe_device *xe) { } +static inline void xe_eudebug_fini(struct xe_device *xe) { } + +static inline void xe_eudebug_file_open(struct xe_file *xef) { } +static inline void xe_eudebug_file_close(struct xe_file *xef) { } + +static inline void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm) { } +static inline void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) { } + +#endif /* CONFIG_DRM_XE_EUDEBUG */ + +#endif diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h new file mode 100644 index 000000000000..a5185f18f640 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -0,0 +1,169 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef __XE_EUDEBUG_TYPES_H_ +#define __XE_EUDEBUG_TYPES_H_ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +struct xe_device; struct task_struct; struct xe_eudebug_event; +#define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64 + +/** + * struct xe_eudebug_handle - eudebug resource handle + */ +struct xe_eudebug_handle { + /** @key: key value in rhashtable */ + u64 key; + + /** @id: opaque handle id for xarray */ + int id; + + /** @rh_head: rhashtable head */ + struct rhash_head rh_head; +}; + +/** + * struct xe_eudebug_resource - Resource map for one resource + */ +struct xe_eudebug_resource { + /** @xa: xarray of handles, by id */ + struct xarray xa; + + /** @rh: rhashtable of handles, by key */ + struct rhashtable rh; +}; + +#define XE_EUDEBUG_RES_TYPE_CLIENT 0 +#define XE_EUDEBUG_RES_TYPE_VM 1 +#define XE_EUDEBUG_RES_TYPE_COUNT (XE_EUDEBUG_RES_TYPE_VM + 1) + +/** + * struct xe_eudebug_resources - eudebug resources for all types + */ +struct xe_eudebug_resources { + /** @lock: guards access into rt */ + struct mutex lock; + + /** @rt: resource maps for all types */ + struct xe_eudebug_resource rt[XE_EUDEBUG_RES_TYPE_COUNT]; +}; + +/** + * struct xe_eudebug - Top level struct for eudebug: the connection + */ +struct xe_eudebug { + /** @ref: kref counter for
this struct */ + struct kref ref; + + /** @rcu: rcu_head for rcu destruction */ + struct rcu_head rcu; + + /** @connection_link: our link into the xe_device:eudebug.list */ + struct list_head connection_link; + + struct { + /** @status: connected = 1, disconnected = error */ +#define XE_EUDEBUG_STATUS_CONNECTED 1 + int status; + + /** @lock: guards access to status */ + spinlock_t lock; + } connection; + + /** @xe: the parent device we are serving */ + struct xe_device *xe; + + /** @target_task: the task that we are debugging */ + struct task_struct *target_task; + + /** @res: the resource maps we track for target_task */ + struct xe_eudebug_resources *res; + + /** @session: session number for this connection (for logs) */ + u64 session; + + /** @events: kfifo queue of to-be-delivered events */ + struct { + /** @lock: guards access to fifo */ + spinlock_t lock; + + /** @fifo: queue of events pending */ + DECLARE_KFIFO(fifo, + struct xe_eudebug_event *, + CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE); + + /** @write_done: waitqueue for signalling write to fifo */ + wait_queue_head_t write_done; + + /** @read_done: waitqueue for signalling read from fifo */ + wait_queue_head_t read_done; + + /** @seqno: seqno counter to stamp events for fifo */ + atomic_long_t seqno; + } events; + +}; + +/** + * struct xe_eudebug_event - Internal base event struct for eudebug + */ +struct xe_eudebug_event { + /** @len: length of this event, including payload */ + u32 len; + + /** @type: message type */ + u16 type; + + /** @flags: message flags */ + u16 flags; + + /** @seqno: sequence number for ordering */ + u64 seqno; + + /** @reserved: reserved field MBZ */ + u64 reserved; + + /** @data: payload bytes */ + u8 data[]; +}; + +/** + * struct xe_eudebug_event_open - Internal event for client open/close + */ +struct xe_eudebug_event_open { + /** @base: base event */ + struct xe_eudebug_event base; + + /** @client_handle: opaque handle for client */ + u64 client_handle; +}; + +/** + * struct xe_eudebug_event_vm - Internal event for vm open/close + */ +struct xe_eudebug_event_vm { + /** @base: base event */ + struct xe_eudebug_event base; + + /** @client_handle: client containing the vm open/close */ + u64 client_handle; + + /** @vm_handle: handle of the vm that was opened/closed */ + u64 vm_handle; +}; + +#endif diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 7788680da4e5..6f16049f4f6e 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -25,6 +25,7 @@ #include "xe_bo.h" #include "xe_device.h" #include "xe_drm_client.h" +#include "xe_eudebug.h" #include "xe_exec_queue.h" #include "xe_gt_pagefault.h" #include "xe_gt_tlb_invalidation.h" @@ -1797,6 +1798,8 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data, args->vm_id = id; + xe_eudebug_vm_create(xef, vm); + return 0; err_close_and_put: @@ -1828,8 +1831,10 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data, xa_erase(&xef->vm.xa, args->vm_id); mutex_unlock(&xef->vm.lock); - if (!err) + if (!err) { + xe_eudebug_vm_destroy(xef, vm); xe_vm_close_and_put(vm); + } return err; } diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 4a8a4a63e99c..78479100a0b6 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -102,6 +102,7 @@ extern "C" { #define DRM_XE_EXEC 0x09 #define DRM_XE_WAIT_USER_FENCE 0x0a #define DRM_XE_OBSERVATION 0x0b +#define DRM_XE_EUDEBUG_CONNECT 0x0c /* Must be kept compact -- no holes */ @@ -117,6 +118,7 @@ extern "C" { #define DRM_IOCTL_XE_EXEC
DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec) #define DRM_IOCTL_XE_WAIT_USER_FENCE DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence) #define DRM_IOCTL_XE_OBSERVATION DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param) +#define DRM_IOCTL_XE_EUDEBUG_CONNECT DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EUDEBUG_CONNECT, struct drm_xe_eudebug_connect) /** * DOC: Xe IOCTL Extensions @@ -1713,6 +1715,25 @@ struct drm_xe_oa_stream_info { __u64 reserved[3]; }; +/* + * Debugger ABI (ioctl and events) Version History: + * 0 - No debugger available + * 1 - Initial version + */ +#define DRM_XE_EUDEBUG_VERSION 1 + +struct drm_xe_eudebug_connect { + /** @extensions: Pointer to the first extension struct, if any */ + __u64 extensions; + + __u64 pid; /* input: Target process ID */ + __u32 flags; /* MBZ */ + + __u32 version; /* output: current ABI (ioctl / events) version */ +}; + +#include "xe_drm_eudebug.h" + #if defined(__cplusplus) } #endif diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h new file mode 100644 index 000000000000..acf6071c82bf --- /dev/null +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef _UAPI_XE_DRM_EUDEBUG_H_ +#define _UAPI_XE_DRM_EUDEBUG_H_ + +#if defined(__cplusplus) extern "C" { +#endif + +/** + * Do a eudebug event read for a debugger connection. + * + * This ioctl is available in debug version 1. + */ +#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0) + +/* XXX: Document events to match their internal counterparts when moved to xe_drm.h */ +struct drm_xe_eudebug_event { + __u32 len; + + __u16 type; +#define DRM_XE_EUDEBUG_EVENT_NONE 0 +#define DRM_XE_EUDEBUG_EVENT_READ 1 +#define DRM_XE_EUDEBUG_EVENT_OPEN 2 +#define DRM_XE_EUDEBUG_EVENT_VM 3 + + __u16 flags; +#define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) +#define DRM_XE_EUDEBUG_EVENT_DESTROY (1 << 1) +#define DRM_XE_EUDEBUG_EVENT_STATE_CHANGE (1 << 2) +#define DRM_XE_EUDEBUG_EVENT_NEED_ACK (1 << 3) + __u64 seqno; + __u64 reserved; +}; + +struct drm_xe_eudebug_event_client { + struct drm_xe_eudebug_event base; + + __u64 client_handle; /* This is unique per debug connection */ +}; + +struct drm_xe_eudebug_event_vm { + struct drm_xe_eudebug_event base; + + __u64 client_handle; + __u64 vm_handle; +}; + +#if defined(__cplusplus) } #endif + +#endif
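A sketch of how a debugger might demultiplex the event stream defined
above; track_client()/track_vm() and friends are hypothetical
bookkeeping on the debugger side.

static void handle_event(const struct drm_xe_eudebug_event *ev)
{
	if (ev->type == DRM_XE_EUDEBUG_EVENT_OPEN) {
		const struct drm_xe_eudebug_event_client *c = (const void *)ev;

		if (ev->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
			track_client(c->client_handle);		/* hypothetical */
		else if (ev->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
			forget_client(c->client_handle);	/* hypothetical */
	} else if (ev->type == DRM_XE_EUDEBUG_EVENT_VM) {
		const struct drm_xe_eudebug_event_vm *v = (const void *)ev;

		if (ev->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
			track_vm(v->client_handle, v->vm_handle);	/* hypothetical */
		else if (ev->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
			forget_vm(v->client_handle, v->vm_handle);	/* hypothetical */
	}
}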
From patchwork Mon Dec 9 13:32:54 2024
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 13899775
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
 Mika Kuoppala, Matthew Brost, Dominik Grzegorzek, Maciej Patelczyk
Subject: [PATCH 03/26] drm/xe/eudebug: Introduce discovery for resources
Date: Mon, 9 Dec 2024 15:32:54 +0200
Message-ID: <20241209133318.1806472-4-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

A debugger connection can happen long after the client has created and
destroyed an arbitrary number of resources. We need to play back all
currently existing resources for the debugger. The client is held until
this so-called discovery process, executed by a workqueue, is complete.

This patch is based on discovery work by Maciej Patelczyk for the i915
driver.
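The locking scheme of the diff below, reduced to a sketch:
resource-mutating ioctls take a rw_semaphore for read, so they run
concurrently with each other but are held off while the discovery worker
owns it for write and replays pre-existing resources.
do_ioctl_work() and replay_existing_clients_and_vms() are hypothetical
stand-ins.

static long some_resource_ioctl(struct xe_device *xe, void *data)
{
	long ret;

	down_read(&xe->eudebug.discovery_lock);	/* many ioctls in parallel */
	ret = do_ioctl_work(xe, data);		/* hypothetical */
	up_read(&xe->eudebug.discovery_lock);

	return ret;
}

static void discovery_worker(struct xe_device *xe, struct xe_eudebug *d)
{
	down_write(&xe->eudebug.discovery_lock);	/* excludes the ioctls above */
	replay_existing_clients_and_vms(d);		/* hypothetical */
	complete_all(&d->discovery);
	up_write(&xe->eudebug.discovery_lock);
}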
v2: - use rw_semaphore to block drm_ioctls during discovery (Matthew) - only lock according to ioctl at play (Dominik) Cc: Matthew Brost Cc: Dominik Grzegorzek Co-developed-by: Maciej Patelczyk Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala Acked-by: Matthew Brost #locking --- drivers/gpu/drm/xe/xe_device.c | 10 +- drivers/gpu/drm/xe/xe_device.h | 34 +++++++ drivers/gpu/drm/xe/xe_device_types.h | 6 ++ drivers/gpu/drm/xe/xe_eudebug.c | 135 +++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug_types.h | 7 ++ 5 files changed, 185 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index 9ed0de1eba0b..f051612908de 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -209,8 +209,11 @@ static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return -ECANCELED; ret = xe_pm_runtime_get_ioctl(xe); - if (ret >= 0) + if (ret >= 0) { + xe_eudebug_discovery_lock(xe, cmd); ret = drm_ioctl(file, cmd, arg); + xe_eudebug_discovery_unlock(xe, cmd); + } xe_pm_runtime_put(xe); return ret; @@ -227,8 +230,11 @@ static long xe_drm_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return -ECANCELED; ret = xe_pm_runtime_get_ioctl(xe); - if (ret >= 0) + if (ret >= 0) { + xe_eudebug_discovery_lock(xe, cmd); ret = drm_compat_ioctl(file, cmd, arg); + xe_eudebug_discovery_unlock(xe, cmd); + } xe_pm_runtime_put(xe); return ret; diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h index f1fbfe916867..088831a6b863 100644 --- a/drivers/gpu/drm/xe/xe_device.h +++ b/drivers/gpu/drm/xe/xe_device.h @@ -7,6 +7,7 @@ #define _XE_DEVICE_H_ #include +#include #include "xe_device_types.h" #include "xe_gt_types.h" @@ -205,4 +206,37 @@ void xe_file_put(struct xe_file *xef); #define LNL_FLUSH_WORK(wrk__) \ flush_work(wrk__) +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) +static inline int xe_eudebug_needs_lock(const unsigned int cmd) +{ + const unsigned int xe_cmd = DRM_IOCTL_NR(cmd) - DRM_COMMAND_BASE; + + switch (xe_cmd) { + case DRM_XE_VM_CREATE: + case DRM_XE_VM_DESTROY: + case DRM_XE_VM_BIND: + case DRM_XE_EXEC_QUEUE_CREATE: + case DRM_XE_EXEC_QUEUE_DESTROY: + case DRM_XE_EUDEBUG_CONNECT: + return 1; + } + + return 0; +} + +static inline void xe_eudebug_discovery_lock(struct xe_device *xe, unsigned int cmd) +{ + if (xe_eudebug_needs_lock(cmd)) + down_read(&xe->eudebug.discovery_lock); +} +static inline void xe_eudebug_discovery_unlock(struct xe_device *xe, unsigned int cmd) +{ + if (xe_eudebug_needs_lock(cmd)) + up_read(&xe->eudebug.discovery_lock); +} +#else +static inline void xe_eudebug_discovery_lock(struct xe_device *xe, unsigned int cmd) { } +static inline void xe_eudebug_discovery_unlock(struct xe_device *xe, unsigned int cmd) { } +#endif /* CONFIG_DRM_XE_EUDEBUG */ + #endif diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 9f04e6476195..9941ea1400c6 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -550,6 +550,12 @@ struct xe_device { /** @available: is the debugging functionality available */ bool available; + + /** @ordered_wq: used for discovery */ + struct workqueue_struct *ordered_wq; + + /** @discovery_lock: used for discovery to block xe ioctls */ + struct rw_semaphore discovery_lock; } eudebug; #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index bbb5f1e81bb8..228bc36342ba 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++
b/drivers/gpu/drm/xe/xe_eudebug.c @@ -299,6 +299,8 @@ static bool xe_eudebug_detach(struct xe_device *xe, } spin_unlock(&d->connection.lock); + flush_work(&d->discovery_work); + if (!detached) return false; @@ -409,7 +411,7 @@ static struct task_struct *find_task_get(struct xe_file *xef) } static struct xe_eudebug * -xe_eudebug_get(struct xe_file *xef) +_xe_eudebug_get(struct xe_file *xef) { struct task_struct *task; struct xe_eudebug *d; @@ -433,6 +435,24 @@ xe_eudebug_get(struct xe_file *xef) return d; } +static struct xe_eudebug * +xe_eudebug_get(struct xe_file *xef) +{ + struct xe_eudebug *d; + + lockdep_assert_held(&xef->xe->eudebug.discovery_lock); + + d = _xe_eudebug_get(xef); + if (d) { + if (!completion_done(&d->discovery)) { + xe_eudebug_put(d); + d = NULL; + } + } + + return d; +} + static int xe_eudebug_queue_event(struct xe_eudebug *d, struct xe_eudebug_event *event) { @@ -813,6 +833,10 @@ static long xe_eudebug_ioctl(struct file *file, struct xe_eudebug * const d = file->private_data; long ret; + if (cmd != DRM_XE_EUDEBUG_IOCTL_READ_EVENT && + !completion_done(&d->discovery)) + return -EBUSY; + switch (cmd) { case DRM_XE_EUDEBUG_IOCTL_READ_EVENT: ret = xe_eudebug_read_event(d, arg, @@ -834,6 +858,8 @@ static const struct file_operations fops = { .unlocked_ioctl = xe_eudebug_ioctl, }; +static void discovery_work_fn(struct work_struct *work); + static int xe_eudebug_connect(struct xe_device *xe, struct drm_xe_eudebug_connect *param) @@ -868,9 +894,11 @@ xe_eudebug_connect(struct xe_device *xe, spin_lock_init(&d->connection.lock); init_waitqueue_head(&d->events.write_done); init_waitqueue_head(&d->events.read_done); + init_completion(&d->discovery); spin_lock_init(&d->events.lock); INIT_KFIFO(d->events.fifo); + INIT_WORK(&d->discovery_work, discovery_work_fn); d->res = xe_eudebug_resources_alloc(); if (IS_ERR(d->res)) { @@ -888,6 +916,9 @@ xe_eudebug_connect(struct xe_device *xe, goto err_detach; } + kref_get(&d->ref); + queue_work(xe->eudebug.ordered_wq, &d->discovery_work); + eu_dbg(d, "connected session %lld", d->session); return fd; @@ -922,13 +953,18 @@ void xe_eudebug_init(struct xe_device *xe) spin_lock_init(&xe->clients.lock); INIT_LIST_HEAD(&xe->clients.list); + init_rwsem(&xe->eudebug.discovery_lock); - xe->eudebug.available = true; + xe->eudebug.ordered_wq = alloc_ordered_workqueue("xe-eudebug-ordered-wq", 0); + xe->eudebug.available = !!xe->eudebug.ordered_wq; } void xe_eudebug_fini(struct xe_device *xe) { xe_assert(xe, list_empty_careful(&xe->eudebug.list)); + + if (xe->eudebug.ordered_wq) + destroy_workqueue(xe->eudebug.ordered_wq); } static int send_open_event(struct xe_eudebug *d, u32 flags, const u64 handle, @@ -994,21 +1030,25 @@ void xe_eudebug_file_open(struct xe_file *xef) struct xe_eudebug *d; INIT_LIST_HEAD(&xef->eudebug.client_link); + + down_read(&xef->xe->eudebug.discovery_lock); + spin_lock(&xef->xe->clients.lock); list_add_tail(&xef->eudebug.client_link, &xef->xe->clients.list); spin_unlock(&xef->xe->clients.lock); d = xe_eudebug_get(xef); - if (!d) - return; + if (d) + xe_eudebug_event_put(d, client_create_event(d, xef)); - xe_eudebug_event_put(d, client_create_event(d, xef)); + up_read(&xef->xe->eudebug.discovery_lock); } void xe_eudebug_file_close(struct xe_file *xef) { struct xe_eudebug *d; + down_read(&xef->xe->eudebug.discovery_lock); d = xe_eudebug_get(xef); if (d) xe_eudebug_event_put(d, client_destroy_event(d, xef)); @@ -1016,6 +1056,8 @@ void xe_eudebug_file_close(struct xe_file *xef) spin_lock(&xef->xe->clients.lock); 
list_del_init(&xef->eudebug.client_link); spin_unlock(&xef->xe->clients.lock); + + up_read(&xef->xe->eudebug.discovery_lock); } static int send_vm_event(struct xe_eudebug *d, u32 flags, @@ -1116,3 +1158,86 @@ void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) xe_eudebug_event_put(d, vm_destroy_event(d, xef, vm)); } + +static int discover_client(struct xe_eudebug *d, struct xe_file *xef) +{ + struct xe_vm *vm; + unsigned long i; + int err; + + err = client_create_event(d, xef); + if (err) + return err; + + xa_for_each(&xef->vm.xa, i, vm) { + err = vm_create_event(d, xef, vm); + if (err) + break; + } + + return err; +} + +static bool xe_eudebug_task_match(struct xe_eudebug *d, struct xe_file *xef) +{ + struct task_struct *task; + bool match; + + task = find_task_get(xef); + if (!task) + return false; + + match = same_thread_group(d->target_task, task); + + put_task_struct(task); + + return match; +} + +static void discover_clients(struct xe_device *xe, struct xe_eudebug *d) +{ + struct xe_file *xef; + int err; + + list_for_each_entry(xef, &xe->clients.list, eudebug.client_link) { + if (xe_eudebug_detached(d)) + break; + + if (xe_eudebug_task_match(d, xef)) + err = discover_client(d, xef); + else + err = 0; + + if (err) { + eu_dbg(d, "discover client %p: %d\n", xef, err); + break; + } + } +} + +static void discovery_work_fn(struct work_struct *work) +{ + struct xe_eudebug *d = container_of(work, typeof(*d), + discovery_work); + struct xe_device *xe = d->xe; + + if (xe_eudebug_detached(d)) { + complete_all(&d->discovery); + xe_eudebug_put(d); + return; + } + + down_write(&xe->eudebug.discovery_lock); + + eu_dbg(d, "Discovery start for %lld\n", d->session); + + discover_clients(xe, d); + + eu_dbg(d, "Discovery end for %lld\n", d->session); + + complete_all(&d->discovery); + + up_write(&xe->eudebug.discovery_lock); + + xe_eudebug_put(d); +} diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index a5185f18f640..080a821db3e4 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -19,6 +19,7 @@ struct xe_device; struct task_struct; struct xe_eudebug_event; +struct workqueue_struct; #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64 @@ -96,6 +97,12 @@ struct xe_eudebug { /** @session: session number for this connection (for logs) */ u64 session; + /** @discovery: completion to wait for discovery */ + struct completion discovery; + + /** @discovery_work: worker to discover resources for target_task */ + struct work_struct discovery_work; + /** @events: kfifo queue of to-be-delivered events */ struct { /** @lock: guards access to fifo */ From patchwork Mon Dec 9 13:32:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899776 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 785A2E7717D for ; Mon, 9 Dec 2024 13:33:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0259A10E73E; Mon, 9 Dec 2024 13:33:11 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com 
header.b="nhUzdzV3"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4572610E740; Mon, 9 Dec 2024 13:33:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751189; x=1765287189; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8fGLfl9iqBBCaAlH0EWdZ/5R+6YLppzz2BVb5pE3iH4=; b=nhUzdzV3cPF5uXJZTa/WKdGDGsxvryHzLqqjnwOSK4vgXRYI0dVsBccC jCuQRM4AKyvKYYgG32BxjXUxCu/eChiUQqOHZVv9JCz4lROodOQWxxAiB IUVy8Ga5LkcO+Sufil+ksIegOxQA81L32meAE34KeCd6telmcxpggDali VSZc6KAz2y160VzD7cLHCuxUoPXQOBDoFyoCmoSeUg1+UW/hKggQEIEtB Uoi/GYKF4+YWS0uGBa5TVLNvOZTdY49z3lANwcjicwtzklo01fg08f05V msMa+Z2oSI7BQ2UN1xqoL8AHj++RoaMoT2zE/6qB+/ZGM08RacQIN1dg9 A==; X-CSE-ConnectionGUID: ybAMuK0lS5e/8JtzNDYzAg== X-CSE-MsgGUID: UZchEljCTneowDfFm/UCXQ== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34191937" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34191937" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:09 -0800 X-CSE-ConnectionGUID: XytLXy5RSOKdC7ntOHPocA== X-CSE-MsgGUID: eGHabHY5TJugERT95LtcqQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531257" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:08 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Maciej Patelczyk , Mika Kuoppala Subject: [PATCH 04/26] drm/xe/eudebug: Introduce exec_queue events Date: Mon, 9 Dec 2024 15:32:55 +0200 Message-ID: <20241209133318.1806472-5-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Dominik Grzegorzek Inform debugger about creation and destruction of exec_queues. 1) Use user engine class types instead of internal xe_engine_class enum in exec_queue event. 2) During discovery do not advertise every execqueue created, only ones with class render or compute. 
v2: - Only track long running queues - Checkpatch (Tilak) v3: __counted_by added Signed-off-by: Dominik Grzegorzek Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 189 +++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug.h | 7 + drivers/gpu/drm/xe/xe_eudebug_types.h | 31 ++++- drivers/gpu/drm/xe/xe_exec_queue.c | 5 + include/uapi/drm/xe_drm_eudebug.h | 12 ++ 5 files changed, 241 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 228bc36342ba..3ca46ec838b9 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -14,6 +14,7 @@ #include "xe_device.h" #include "xe_eudebug.h" #include "xe_eudebug_types.h" +#include "xe_exec_queue.h" #include "xe_macros.h" #include "xe_vm.h" @@ -716,7 +717,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -751,7 +752,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -1159,8 +1160,183 @@ void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) xe_eudebug_event_put(d, vm_destroy_event(d, xef, vm)); } +static bool exec_queue_class_is_tracked(enum xe_engine_class class) +{ + return class == XE_ENGINE_CLASS_COMPUTE || + class == XE_ENGINE_CLASS_RENDER; +} + +static const u16 xe_to_user_engine_class[] = { + [XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER, + [XE_ENGINE_CLASS_COPY] = DRM_XE_ENGINE_CLASS_COPY, + [XE_ENGINE_CLASS_VIDEO_DECODE] = DRM_XE_ENGINE_CLASS_VIDEO_DECODE, + [XE_ENGINE_CLASS_VIDEO_ENHANCE] = DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE, + [XE_ENGINE_CLASS_COMPUTE] = DRM_XE_ENGINE_CLASS_COMPUTE, +}; + +static int send_exec_queue_event(struct xe_eudebug *d, u32 flags, + u64 client_handle, u64 vm_handle, + u64 exec_queue_handle, enum xe_engine_class class, + u32 width, u64 *lrc_handles, u64 seqno) +{ + struct xe_eudebug_event *event; + struct xe_eudebug_event_exec_queue *e; + const u32 sz = struct_size(e, lrc_handle, width); + const u32 xe_engine_class = xe_to_user_engine_class[class]; + + if (!exec_queue_class_is_tracked(class)) + return -EINVAL; + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE, + seqno, flags, sz); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_exec_queue, e, client_handle, client_handle); + write_member(struct drm_xe_eudebug_event_exec_queue, e, vm_handle, vm_handle); + write_member(struct drm_xe_eudebug_event_exec_queue, e, exec_queue_handle, + exec_queue_handle); + write_member(struct drm_xe_eudebug_event_exec_queue, e, engine_class, xe_engine_class); + write_member(struct drm_xe_eudebug_event_exec_queue, e, width, width); + + memcpy(e->lrc_handle, lrc_handles, width * sizeof(*lrc_handles)); + + return xe_eudebug_queue_event(d, event); +} + +static int exec_queue_create_event(struct xe_eudebug *d, + struct xe_file *xef, struct xe_exec_queue *q) +{ + int h_c, h_vm, h_queue; + u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE], seqno; + int i; + + if
(!xe_exec_queue_is_lr(q)) + return 0; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); + if (h_c < 0) + return h_c; + + h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm); + if (h_vm < 0) + return h_vm; + + if (XE_WARN_ON(q->width >= XE_HW_ENGINE_MAX_INSTANCE)) + return -EINVAL; + + for (i = 0; i < q->width; i++) { + int h, ret; + + ret = _xe_eudebug_add_handle(d, + XE_EUDEBUG_RES_TYPE_LRC, + q->lrc[i], + NULL, + &h); + + if (ret < 0 && ret != -EEXIST) + return ret; + + XE_WARN_ON(!h); + + h_lrc[i] = h; + } + + h_queue = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, q, &seqno); + if (h_queue <= 0) + return h_queue; + + /* No need to cleanup for added handles on error as if we fail + * we disconnect + */ + + return send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, + h_c, h_vm, h_queue, q->class, + q->width, h_lrc, seqno); +} + +static int exec_queue_destroy_event(struct xe_eudebug *d, + struct xe_file *xef, + struct xe_exec_queue *q) +{ + int h_c, h_vm, h_queue; + u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE], seqno; + int i; + + if (!xe_exec_queue_is_lr(q)) + return 0; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); + if (h_c < 0) + return h_c; + + h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm); + if (h_vm < 0) + return h_vm; + + if (XE_WARN_ON(q->width >= XE_HW_ENGINE_MAX_INSTANCE)) + return -EINVAL; + + h_queue = xe_eudebug_remove_handle(d, + XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, + q, + &seqno); + if (h_queue <= 0) + return h_queue; + + for (i = 0; i < q->width; i++) { + int ret; + + ret = _xe_eudebug_remove_handle(d, + XE_EUDEBUG_RES_TYPE_LRC, + q->lrc[i], + NULL); + if (ret < 0 && ret != -ENOENT) + return ret; + + XE_WARN_ON(!ret); + + h_lrc[i] = ret; + } + + return send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY, + h_c, h_vm, h_queue, q->class, + q->width, h_lrc, seqno); +} + +void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q) +{ + struct xe_eudebug *d; + + if (!exec_queue_class_is_tracked(q->class)) + return; + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, exec_queue_create_event(d, xef, q)); +} + +void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) +{ + struct xe_eudebug *d; + + if (!exec_queue_class_is_tracked(q->class)) + return; + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, exec_queue_destroy_event(d, xef, q)); +} + static int discover_client(struct xe_eudebug *d, struct xe_file *xef) { + struct xe_exec_queue *q; struct xe_vm *vm; unsigned long i; int err; @@ -1175,6 +1351,15 @@ static int discover_client(struct xe_eudebug *d, struct xe_file *xef) break; } + xa_for_each(&xef->exec_queue.xa, i, q) { + if (!exec_queue_class_is_tracked(q->class)) + continue; + + err = exec_queue_create_event(d, xef, q); + if (err) + break; + } + return err; } diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index e3247365f72f..326ddbd50651 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -10,6 +10,7 @@ struct drm_file; struct xe_device; struct xe_file; struct xe_vm; +struct xe_exec_queue; #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) @@ -26,6 +27,9 @@ void xe_eudebug_file_close(struct xe_file *xef); void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm); void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm); +void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q); +void xe_eudebug_exec_queue_destroy(struct xe_file *xef, 
struct xe_exec_queue *q); + #else static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, @@ -41,6 +45,9 @@ static inline void xe_eudebug_file_close(struct xe_file *xef) { } static inline void xe_eudebug_vm_create(struct xe_file *xef, struct xe_vm *vm) { } static inline void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) { } +static inline void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q) { } +static inline void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) { } + #endif /* CONFIG_DRM_XE_EUDEBUG */ #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index 080a821db3e4..4824c4159036 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -50,7 +50,9 @@ struct xe_eudebug_resource { #define XE_EUDEBUG_RES_TYPE_CLIENT 0 #define XE_EUDEBUG_RES_TYPE_VM 1 -#define XE_EUDEBUG_RES_TYPE_COUNT (XE_EUDEBUG_RES_TYPE_VM + 1) +#define XE_EUDEBUG_RES_TYPE_EXEC_QUEUE 2 +#define XE_EUDEBUG_RES_TYPE_LRC 3 +#define XE_EUDEBUG_RES_TYPE_COUNT (XE_EUDEBUG_RES_TYPE_LRC + 1) /** * struct xe_eudebug_resources - eudebug resources for all types @@ -173,4 +175,31 @@ struct xe_eudebug_event_vm { u64 vm_handle; }; +/** + * struct xe_eudebug_event_exec_queue - Internal event for + * exec_queue create/destroy + */ +struct xe_eudebug_event_exec_queue { + /** @base: base event */ + struct xe_eudebug_event base; + + /** @client_handle: client for the engine create/destroy */ + u64 client_handle; + + /** @vm_handle: vm handle for the engine create/destroy */ + u64 vm_handle; + + /** @exec_queue_handle: engine handle */ + u64 exec_queue_handle; + + /** @engine_handle: engine class */ + u32 engine_class; + + /** @width: submission width (number BB per exec) for this exec queue */ + u32 width; + + /** @lrc_handles: handles for each logical ring context created with this exec queue */ + u64 lrc_handle[] __counted_by(width); +}; + #endif diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index aab9e561153d..7f5d8af778be 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -23,6 +23,7 @@ #include "xe_ring_ops_types.h" #include "xe_trace.h" #include "xe_vm.h" +#include "xe_eudebug.h" enum xe_exec_queue_sched_prop { XE_EXEC_QUEUE_JOB_TIMEOUT = 0, @@ -654,6 +655,8 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data, args->exec_queue_id = id; + xe_eudebug_exec_queue_create(xef, q); + return 0; kill_exec_queue: @@ -840,6 +843,8 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data, if (q->vm && q->hwe->hw_engine_group) xe_hw_engine_group_del_exec_queue(q->hwe->hw_engine_group, q); + xe_eudebug_exec_queue_destroy(xef, q); + xe_exec_queue_kill(q); trace_xe_exec_queue_close(q); diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index acf6071c82bf..ac44e890152a 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -26,6 +26,7 @@ struct drm_xe_eudebug_event { #define DRM_XE_EUDEBUG_EVENT_READ 1 #define DRM_XE_EUDEBUG_EVENT_OPEN 2 #define DRM_XE_EUDEBUG_EVENT_VM 3 +#define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE 4 __u16 flags; #define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) @@ -49,6 +50,17 @@ struct drm_xe_eudebug_event_vm { __u64 vm_handle; }; +struct drm_xe_eudebug_event_exec_queue { + struct drm_xe_eudebug_event base; + + __u64 client_handle; + __u64 vm_handle; + __u64 exec_queue_handle; + __u32 engine_class; + 
__u32 width; + __u64 lrc_handle[]; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:32:56 2024 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Mika Kuoppala Subject: [PATCH 05/26] drm/xe/eudebug: Introduce exec queue placements event Date: Mon, 9 Dec 2024 15:32:56 +0200 Message-ID: <20241209133318.1806472-6-mika.kuoppala@linux.intel.com> In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> From: Dominik Grzegorzek This commit introduces the DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS, which provides dbgUMD with information about the hw engines utilized during execution.
The event is sent for every logical ring context (lrc) in scenarios involving parallel submission. Signed-off-by: Dominik Grzegorzek Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 99 ++++++++++++++++++++++++--- drivers/gpu/drm/xe/xe_eudebug_types.h | 26 +++++++ include/uapi/drm/xe_drm_eudebug.h | 17 +++++ 3 files changed, 133 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 3ca46ec838b9..cbcf7a72fdba 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -717,7 +717,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -752,7 +752,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -1206,12 +1206,88 @@ static int send_exec_queue_event(struct xe_eudebug *d, u32 flags, return xe_eudebug_queue_event(d, event); } -static int exec_queue_create_event(struct xe_eudebug *d, - struct xe_file *xef, struct xe_exec_queue *q) +static int send_exec_queue_placements_event(struct xe_eudebug *d, + u64 client_handle, u64 vm_handle, + u64 exec_queue_handle, u64 lrc_handle, + u32 num_placements, u64 *instances, + u64 seqno) +{ + struct xe_eudebug_event_exec_queue_placements *e; + const u32 sz = struct_size(e, instances, num_placements); + struct xe_eudebug_event *event; + + event = xe_eudebug_create_event(d, + DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS, + seqno, DRM_XE_EUDEBUG_EVENT_CREATE, sz); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_exec_queue_placements, e, client_handle, + client_handle); + write_member(struct drm_xe_eudebug_event_exec_queue_placements, e, vm_handle, vm_handle); + write_member(struct drm_xe_eudebug_event_exec_queue_placements, e, exec_queue_handle, + exec_queue_handle); + write_member(struct drm_xe_eudebug_event_exec_queue_placements, e, lrc_handle, lrc_handle); + write_member(struct drm_xe_eudebug_event_exec_queue_placements, e, num_placements, + num_placements); + + memcpy(e->instances, instances, num_placements * sizeof(*instances)); + + return xe_eudebug_queue_event(d, event); +} + +static int send_exec_queue_placements_events(struct xe_eudebug *d, struct xe_exec_queue *q, + u64 client_handle, u64 vm_handle, + u64 exec_queue_handle, u64 *lrc_handles) +{ + + struct drm_xe_engine_class_instance eci[XE_HW_ENGINE_MAX_INSTANCE] = {}; + unsigned long mask = q->logical_mask; + u32 num_placements = 0; + int ret, i, j; + u64 seqno; + + for_each_set_bit(i, &mask, sizeof(q->logical_mask) * 8) { + if (XE_WARN_ON(num_placements == XE_HW_ENGINE_MAX_INSTANCE)) + break; + + eci[num_placements].engine_class = xe_to_user_engine_class[q->class]; + eci[num_placements].engine_instance = i; + eci[num_placements++].gt_id = q->gt->info.id; + } + + ret = 0; + for (i = 0; i < q->width; i++) { + seqno = atomic_long_inc_return(&d->events.seqno); + + ret = send_exec_queue_placements_event(d, client_handle, 
vm_handle, + exec_queue_handle, lrc_handles[i], + num_placements, (u64 *)eci, seqno); + if (ret) + return ret; + + /* + * Parallel submissions must be logically contiguous, + * so the next placement is just q->logical_mask >> 1 + */ + for (j = 0; j < num_placements; j++) { + eci[j].engine_instance++; + XE_WARN_ON(eci[j].engine_instance >= XE_HW_ENGINE_MAX_INSTANCE); + } + } + + return ret; +} + +static int exec_queue_create_events(struct xe_eudebug *d, + struct xe_file *xef, struct xe_exec_queue *q) { int h_c, h_vm, h_queue; u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE], seqno; int i; + int ret = 0; if (!xe_exec_queue_is_lr(q)) return 0; @@ -1252,9 +1328,14 @@ static int exec_queue_create_event(struct xe_eudebug *d, * we disconnect */ - return send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, - h_c, h_vm, h_queue, q->class, - q->width, h_lrc, seqno); + ret = send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, + h_c, h_vm, h_queue, q->class, + q->width, h_lrc, seqno); + + if (ret) + return ret; + + return send_exec_queue_placements_events(d, q, h_c, h_vm, h_queue, h_lrc); } static int exec_queue_destroy_event(struct xe_eudebug *d, @@ -1317,7 +1398,7 @@ void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q) if (!d) return; - xe_eudebug_event_put(d, exec_queue_create_event(d, xef, q)); + xe_eudebug_event_put(d, exec_queue_create_events(d, xef, q)); } void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) @@ -1355,7 +1436,7 @@ static int discover_client(struct xe_eudebug *d, struct xe_file *xef) if (!exec_queue_class_is_tracked(q->class)) continue; - err = exec_queue_create_event(d, xef, q); + err = exec_queue_create_events(d, xef, q); if (err) break; } diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index 4824c4159036..bdffdfb1abff 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -202,4 +202,30 @@ struct xe_eudebug_event_exec_queue { u64 lrc_handle[] __counted_by(width); }; +struct xe_eudebug_event_exec_queue_placements { + /** @base: base event */ + struct xe_eudebug_event base; + + /** @client_handle: client for the engine create/destroy */ + u64 client_handle; + + /** @vm_handle: vm handle for the engine create/destroy */ + u64 vm_handle; + + /** @exec_queue_handle: engine handle */ + u64 exec_queue_handle; + + /** @lrc_handle: lrc handle */ + u64 lrc_handle; + + /** @num_placements: all possible placements for given lrc */ + u32 num_placements; + + /** @pad: padding */ + u32 pad; + + /** @instances: num_placements sized array containing drm_xe_engine_class_instance */ + u64 instances[] __counted_by(num_placements); +}; + #endif diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index ac44e890152a..21690008a869 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -27,6 +27,7 @@ struct drm_xe_eudebug_event { #define DRM_XE_EUDEBUG_EVENT_OPEN 2 #define DRM_XE_EUDEBUG_EVENT_VM 3 #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE 4 +#define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS 5 __u16 flags; #define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) @@ -61,6 +62,22 @@ struct drm_xe_eudebug_event_exec_queue { __u64 lrc_handle[]; }; +struct drm_xe_eudebug_event_exec_queue_placements { + struct drm_xe_eudebug_event base; + + __u64 client_handle; + __u64 vm_handle; + __u64 exec_queue_handle; + __u64 lrc_handle; + __u32 num_placements; + __u32 pad; + /** + * @instances: num_placements sized array of struct + * drm_xe_engine_class_instance + */ + __u64 instances[]; +}; + #if defined(__cplusplus) } #endif
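For clarity, a hypothetical debugger-side decode of the instances[] array is sketched below. It is not part of the patch; it relies only on the uapi structs above plus the fact that send_exec_queue_placements_event() copies one struct drm_xe_engine_class_instance into each __u64 slot.

#include <stdio.h>
#include <string.h>
#include <drm/xe_drm.h>
#include <drm/xe_drm_eudebug.h>

/* Each __u64 slot in instances[] carries one 8-byte
 * struct drm_xe_engine_class_instance, as written by the kernel side. */
static void decode_placements(const struct drm_xe_eudebug_event_exec_queue_placements *p)
{
	__u32 i;

	for (i = 0; i < p->num_placements; i++) {
		struct drm_xe_engine_class_instance eci;

		memcpy(&eci, &p->instances[i], sizeof(eci));
		printf("lrc %llu placement %u: class=%u instance=%u gt=%u\n",
		       (unsigned long long)p->lrc_handle, i,
		       eci.engine_class, eci.engine_instance, eci.gt_id);
	}
}
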
Implement eudebug mode enabling including eudebug related workarounds. v2: Move workarounds to xe_wa_oob. Use reg_sr directly instead of xe_rtp as it suits better for dynamic manipulation of those register we do later in the series. v3: get rid of undefining XE_MCR_REG (Mika) Signed-off-by: Dominik Grzegorzek Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/regs/xe_engine_regs.h | 4 ++ drivers/gpu/drm/xe/regs/xe_gt_regs.h | 10 +++++ drivers/gpu/drm/xe/xe_eudebug.c | 49 ++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_eudebug.h | 3 ++ drivers/gpu/drm/xe/xe_hw_engine.c | 2 + drivers/gpu/drm/xe/xe_wa_oob.rules | 2 + 6 files changed, 70 insertions(+) diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h index 7c78496e6213..e45c4d5378e5 100644 --- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h @@ -115,6 +115,10 @@ #define INDIRECT_RING_STATE(base) XE_REG((base) + 0x108) +#define CS_DEBUG_MODE2(base) XE_REG((base) + 0xd8, XE_REG_OPTION_MASKED) +#define INST_STATE_CACHE_INVALIDATE REG_BIT(6) +#define GLOBAL_DEBUG_ENABLE REG_BIT(5) + #define RING_BBADDR(base) XE_REG((base) + 0x140) #define RING_BBADDR_UDW(base) XE_REG((base) + 0x168) diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index 162f18e975da..cd8c49a9000f 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -455,6 +455,14 @@ #define DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA REG_BIT(15) #define CLEAR_OPTIMIZATION_DISABLE REG_BIT(6) +#define TD_CTL XE_REG_MCR(0xe400) +#define TD_CTL_FEH_AND_FEE_ENABLE REG_BIT(7) /* forced halt and exception */ +#define TD_CTL_FORCE_EXTERNAL_HALT REG_BIT(6) +#define TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE REG_BIT(4) +#define TD_CTL_FORCE_EXCEPTION REG_BIT(3) +#define TD_CTL_BREAKPOINT_ENABLE REG_BIT(2) +#define TD_CTL_GLOBAL_DEBUG_ENABLE REG_BIT(0) /* XeHP */ + #define CACHE_MODE_SS XE_REG_MCR(0xe420, XE_REG_OPTION_MASKED) #define DISABLE_ECC REG_BIT(5) #define ENABLE_PREFETCH_INTO_IC REG_BIT(3) @@ -481,11 +489,13 @@ #define MDQ_ARBITRATION_MODE REG_BIT(12) #define STALL_DOP_GATING_DISABLE REG_BIT(5) #define EARLY_EOT_DIS REG_BIT(1) +#define STALL_DOP_GATING_DISABLE REG_BIT(5) #define ROW_CHICKEN2 XE_REG_MCR(0xe4f4, XE_REG_OPTION_MASKED) #define DISABLE_READ_SUPPRESSION REG_BIT(15) #define DISABLE_EARLY_READ REG_BIT(14) #define ENABLE_LARGE_GRF_MODE REG_BIT(12) +#define XEHPC_DISABLE_BTB REG_BIT(11) #define PUSH_CONST_DEREF_HOLD_DIS REG_BIT(8) #define DISABLE_TDL_SVHS_GATING REG_BIT(1) #define DISABLE_DOP_GATING REG_BIT(0) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index cbcf7a72fdba..fecb7c8a9779 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -10,13 +10,21 @@ #include +#include + +#include "regs/xe_gt_regs.h" +#include "regs/xe_engine_regs.h" + #include "xe_assert.h" #include "xe_device.h" #include "xe_eudebug.h" #include "xe_eudebug_types.h" #include "xe_exec_queue.h" #include "xe_macros.h" +#include "xe_reg_sr.h" +#include "xe_rtp.h" #include "xe_vm.h" +#include "xe_wa.h" /* * If there is no detected event read by userspace, during this period, assume @@ -947,6 +955,47 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev, return ret; } +static void add_sr_entry(struct xe_hw_engine *hwe, + struct xe_reg_mcr mcr_reg, + u32 mask) +{ + const struct xe_reg_sr_entry sr_entry = { + .reg = mcr_reg.__reg, + .clr_bits = mask, + .set_bits = mask, + .read_mask = mask, + }; + + 
xe_reg_sr_add(&hwe->reg_sr, &sr_entry, hwe->gt); +} + +void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe) +{ + struct xe_gt *gt = hwe->gt; + struct xe_device *xe = gt_to_xe(gt); + + if (!xe->eudebug.available) + return; + + if (!xe_rtp_match_first_render_or_compute(gt, hwe)) + return; + + if (XE_WA(gt, 18022722726)) + add_sr_entry(hwe, ROW_CHICKEN, STALL_DOP_GATING_DISABLE); + + if (XE_WA(gt, 14015474168)) + add_sr_entry(hwe, ROW_CHICKEN2, XEHPC_DISABLE_BTB); + + if (xe->info.graphics_verx100 >= 1200) + add_sr_entry(hwe, TD_CTL, + TD_CTL_BREAKPOINT_ENABLE | + TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE | + TD_CTL_FEH_AND_FEE_ENABLE); + + if (xe->info.graphics_verx100 >= 1250) + add_sr_entry(hwe, TD_CTL, TD_CTL_GLOBAL_DEBUG_ENABLE); +} + void xe_eudebug_init(struct xe_device *xe) { spin_lock_init(&xe->eudebug.lock); diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index 326ddbd50651..3cd6bc7bb682 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -11,6 +11,7 @@ struct xe_device; struct xe_file; struct xe_vm; struct xe_exec_queue; +struct xe_hw_engine; #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) @@ -20,6 +21,7 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev, void xe_eudebug_init(struct xe_device *xe); void xe_eudebug_fini(struct xe_device *xe); +void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe); void xe_eudebug_file_open(struct xe_file *xef); void xe_eudebug_file_close(struct xe_file *xef); @@ -38,6 +40,7 @@ static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, static inline void xe_eudebug_init(struct xe_device *xe) { } static inline void xe_eudebug_fini(struct xe_device *xe) { } +static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe) { } static inline void xe_eudebug_file_open(struct xe_file *xef) { } static inline void xe_eudebug_file_close(struct xe_file *xef) { } diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index c4b0dc3be39c..8a188ddc99f4 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -16,6 +16,7 @@ #include "xe_assert.h" #include "xe_bo.h" #include "xe_device.h" +#include "xe_eudebug.h" #include "xe_execlist.h" #include "xe_force_wake.h" #include "xe_gsc.h" @@ -558,6 +559,7 @@ static void hw_engine_init_early(struct xe_gt *gt, struct xe_hw_engine *hwe, xe_tuning_process_engine(hwe); xe_wa_process_engine(hwe); hw_engine_setup_default_state(hwe); + xe_eudebug_init_hw_engine(hwe); xe_reg_sr_init(&hwe->reg_whitelist, hwe->name, gt_to_xe(gt)); xe_reg_whitelist_process_engine(hwe); diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules index 3ed12a85cc60..cc2f28663072 100644 --- a/drivers/gpu/drm/xe/xe_wa_oob.rules +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules @@ -42,3 +42,5 @@ no_media_l3 MEDIA_VERSION(3000) 14022866841 GRAPHICS_VERSION(3000), GRAPHICS_STEP(A0, B0) MEDIA_VERSION(3000), MEDIA_STEP(A0, B0) +18022722726 GRAPHICS_VERSION_RANGE(1250, 1274) +14015474168 PLATFORM(PVC)
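As a standalone sanity check (not kernel code), the TD_CTL value accumulated by xe_eudebug_init_hw_engine() above for a graphics version 12.50+ part works out as follows. The bit positions mirror the xe_gt_regs.h hunk in this patch; the 0x95 result is my arithmetic, not taken from the patch.

#include <stdio.h>

#define BIT(n)					(1u << (n))
/* Mirrors the TD_CTL field defines added in xe_gt_regs.h above. */
#define TD_CTL_FEH_AND_FEE_ENABLE		BIT(7)
#define TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE	BIT(4)
#define TD_CTL_BREAKPOINT_ENABLE		BIT(2)
#define TD_CTL_GLOBAL_DEBUG_ENABLE		BIT(0)

int main(void)
{
	unsigned int mask = TD_CTL_BREAKPOINT_ENABLE |
			    TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE |
			    TD_CTL_FEH_AND_FEE_ENABLE;

	/* graphics_verx100 >= 1250 additionally sets global debug enable */
	mask |= TD_CTL_GLOBAL_DEBUG_ENABLE;

	printf("TD_CTL mask = 0x%x\n", mask);	/* prints 0x95 */
	return 0;
}

Note that add_sr_entry() sets clr_bits, set_bits and read_mask to the same mask, so the saved/restored entry amounts to a read-modify-write that forces exactly these bits on.
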
From patchwork Mon Dec 9 13:32:58 2024 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Matthew Brost , Mika Kuoppala Subject: [PATCH 07/26] drm/xe: Add EUDEBUG_ENABLE exec queue property Date: Mon, 9 Dec 2024 15:32:58 +0200 Message-ID: <20241209133318.1806472-8-mika.kuoppala@linux.intel.com> In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> From: Dominik Grzegorzek Introduce an immutable eudebug exec queue property, with flags as its value, to enable eudebug specific features. For now the engine lrc uses this flag to set up the runalone hw feature. Runalone is used to ensure that only one hw engine of the group [rcs0, ccs0-3] is active on a tile. Note: unlike i915, xe allows the user to set runalone also on devices with a single render/compute engine. It should not make much difference, but leave the control to the user.
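A minimal userspace sketch of opting in at exec queue creation is shown below. The drm_xe_ext_set_property and drm_xe_exec_queue_create field names follow the existing xe uAPI as I understand it and should be treated as assumptions rather than part of this patch.

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/xe_drm.h>

/* eci describes the engine placement; vm_id is a previously created VM. */
static int create_debuggable_queue(int fd, uint32_t vm_id,
				   struct drm_xe_engine_class_instance *eci)
{
	struct drm_xe_ext_set_property ext = {
		.base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
		.property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
		.value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
	};
	struct drm_xe_exec_queue_create create = {
		.extensions = (uintptr_t)&ext,
		.width = 1,
		.num_placements = 1,
		.vm_id = vm_id,
		.instances = (uintptr_t)eci,
	};

	if (ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, &create))
		return -1;

	return create.exec_queue_id;
}

Per the checks in exec_queue_set_eudebug(), this only succeeds for render/compute class, long-running queues on a kernel built with CONFIG_DRM_XE_EUDEBUG.
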
v2: - check CONFIG_DRM_XE_EUDEBUG and LR mode (Matthew) - disable preempt (Dominik) - lrc_create remove from engine init Cc: Matthew Brost Signed-off-by: Dominik Grzegorzek Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 4 +-- drivers/gpu/drm/xe/xe_exec_queue.c | 46 ++++++++++++++++++++++-- drivers/gpu/drm/xe/xe_exec_queue.h | 2 ++ drivers/gpu/drm/xe/xe_exec_queue_types.h | 7 ++++ drivers/gpu/drm/xe/xe_execlist.c | 2 +- drivers/gpu/drm/xe/xe_lrc.c | 16 +++++++-- drivers/gpu/drm/xe/xe_lrc.h | 4 ++- include/uapi/drm/xe_drm.h | 3 +- 8 files changed, 74 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index fecb7c8a9779..4644d6846aae 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -1338,7 +1338,7 @@ static int exec_queue_create_events(struct xe_eudebug *d, int i; int ret = 0; - if (!xe_exec_queue_is_lr(q)) + if (!xe_exec_queue_is_debuggable(q)) return 0; h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); @@ -1395,7 +1395,7 @@ static int exec_queue_destroy_event(struct xe_eudebug *d, u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE], seqno; int i; - if (!xe_exec_queue_is_lr(q)) + if (!xe_exec_queue_is_debuggable(q)) return 0; h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 7f5d8af778be..cca46a32723e 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -109,6 +109,7 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe, static int __xe_exec_queue_init(struct xe_exec_queue *q) { struct xe_vm *vm = q->vm; + u32 flags = 0; int i, err; if (vm) { @@ -117,8 +118,11 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q) return err; } + if (q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE) + flags |= LRC_CREATE_RUNALONE; + for (i = 0; i < q->width; ++i) { - q->lrc[i] = xe_lrc_create(q->hwe, q->vm, SZ_16K); + q->lrc[i] = xe_lrc_create(q->hwe, q->vm, SZ_16K, flags); if (IS_ERR(q->lrc[i])) { err = PTR_ERR(q->lrc[i]); goto err_unlock; @@ -403,6 +407,42 @@ static int exec_queue_set_timeslice(struct xe_device *xe, struct xe_exec_queue * return 0; } +static int exec_queue_set_eudebug(struct xe_device *xe, struct xe_exec_queue *q, + u64 value) +{ + const u64 known_flags = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE; + + if (XE_IOCTL_DBG(xe, (q->class != XE_ENGINE_CLASS_RENDER && + q->class != XE_ENGINE_CLASS_COMPUTE))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, (value & ~known_flags))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !IS_ENABLED(CONFIG_DRM_XE_EUDEBUG))) + return -EOPNOTSUPP; + + if (XE_IOCTL_DBG(xe, !xe_exec_queue_is_lr(q))) + return -EINVAL; + /* + * We want to explicitly set the global feature if + * property is set. 
+ */ + if (XE_IOCTL_DBG(xe, + !(value & DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE))) + return -EINVAL; + + q->eudebug_flags = EXEC_QUEUE_EUDEBUG_FLAG_ENABLE; + q->sched_props.preempt_timeout_us = 0; + + return 0; +} + +int xe_exec_queue_is_debuggable(struct xe_exec_queue *q) +{ + return q->eudebug_flags & EXEC_QUEUE_EUDEBUG_FLAG_ENABLE; +} + typedef int (*xe_exec_queue_set_property_fn)(struct xe_device *xe, struct xe_exec_queue *q, u64 value); @@ -410,6 +450,7 @@ typedef int (*xe_exec_queue_set_property_fn)(struct xe_device *xe, static const xe_exec_queue_set_property_fn exec_queue_set_property_funcs[] = { [DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY] = exec_queue_set_priority, [DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE] = exec_queue_set_timeslice, + [DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG] = exec_queue_set_eudebug, }; static int exec_queue_user_ext_set_property(struct xe_device *xe, @@ -429,7 +470,8 @@ static int exec_queue_user_ext_set_property(struct xe_device *xe, ARRAY_SIZE(exec_queue_set_property_funcs)) || XE_IOCTL_DBG(xe, ext.pad) || XE_IOCTL_DBG(xe, ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY && - ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE)) + ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE && + ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG)) return -EINVAL; idx = array_index_nospec(ext.property, ARRAY_SIZE(exec_queue_set_property_funcs)); diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h index 90c7f73eab88..421d8dc89814 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.h +++ b/drivers/gpu/drm/xe/xe_exec_queue.h @@ -85,4 +85,6 @@ int xe_exec_queue_last_fence_test_dep(struct xe_exec_queue *q, struct xe_vm *vm); void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q); +int xe_exec_queue_is_debuggable(struct xe_exec_queue *q); + #endif diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h index 1158b6062a6c..03f3ad235e4b 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h @@ -90,6 +90,13 @@ struct xe_exec_queue { */ unsigned long flags; + /** + * @eudebug_flags: immutable eudebug flags for this exec queue. + * Set up with DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG. 
+ */ +#define EXEC_QUEUE_EUDEBUG_FLAG_ENABLE BIT(0) + unsigned long eudebug_flags; + union { /** @multi_gt_list: list head for VM bind engines if multi-GT */ struct list_head multi_gt_list; diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c index a8c416a48812..84b69a5dd361 100644 --- a/drivers/gpu/drm/xe/xe_execlist.c +++ b/drivers/gpu/drm/xe/xe_execlist.c @@ -265,7 +265,7 @@ struct xe_execlist_port *xe_execlist_port_create(struct xe_device *xe, port->hwe = hwe; - port->lrc = xe_lrc_create(hwe, NULL, SZ_16K); + port->lrc = xe_lrc_create(hwe, NULL, SZ_16K, 0); if (IS_ERR(port->lrc)) { err = PTR_ERR(port->lrc); goto err; diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 22e58c6e2a35..4ff217ca5474 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -876,7 +876,7 @@ static void xe_lrc_finish(struct xe_lrc *lrc) #define PVC_CTX_ACC_CTR_THOLD (0x2a + 1) static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, - struct xe_vm *vm, u32 ring_size) + struct xe_vm *vm, u32 ring_size, u32 flags) { struct xe_gt *gt = hwe->gt; struct xe_tile *tile = gt_to_tile(gt); @@ -993,6 +993,16 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, map = __xe_lrc_start_seqno_map(lrc); xe_map_write32(lrc_to_xe(lrc), &map, lrc->fence_ctx.next_seqno - 1); + if (flags & LRC_CREATE_RUNALONE) { + u32 ctx_control = xe_lrc_read_ctx_reg(lrc, CTX_CONTEXT_CONTROL); + + drm_dbg(&xe->drm, "read CTX_CONTEXT_CONTROL: 0x%x\n", ctx_control); + ctx_control |= _MASKED_BIT_ENABLE(CTX_CTRL_RUN_ALONE); + drm_dbg(&xe->drm, "written CTX_CONTEXT_CONTROL: 0x%x\n", ctx_control); + + xe_lrc_write_ctx_reg(lrc, CTX_CONTEXT_CONTROL, ctx_control); + } + return 0; err_lrc_finish: @@ -1012,7 +1022,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, * upon failure. 
*/ struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm, - u32 ring_size) + u32 ring_size, u32 flags) { struct xe_lrc *lrc; int err; @@ -1021,7 +1031,7 @@ struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm, if (!lrc) return ERR_PTR(-ENOMEM); - err = xe_lrc_init(lrc, hwe, vm, ring_size); + err = xe_lrc_init(lrc, hwe, vm, ring_size, flags); if (err) { kfree(lrc); return ERR_PTR(err); } diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h index b459dcab8787..3e5656752831 100644 --- a/drivers/gpu/drm/xe/xe_lrc.h +++ b/drivers/gpu/drm/xe/xe_lrc.h @@ -41,8 +41,10 @@ struct xe_lrc_snapshot { #define LRC_PPHWSP_SCRATCH_ADDR (0x34 * 4) +#define LRC_CREATE_RUNALONE BIT(0) + struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm, - u32 ring_size); + u32 ring_size, u32 flags); void xe_lrc_destroy(struct kref *ref); /** diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 78479100a0b6..d0b9ef0799b2 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -1112,7 +1112,8 @@ struct drm_xe_exec_queue_create { #define DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY 0 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY 0 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE 1 - +#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG 2 +#define DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE (1 << 0) /** @extensions: Pointer to the first extension struct, if any */ __u64 extensions;
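Since CTX_CONTEXT_CONTROL is a masked register (XE_REG_OPTION_MASKED), the runalone enable written by xe_lrc_init() carries the affected bit in both halves of the dword. A small standalone illustration follows, assuming the usual ((a) << 16 | (a)) expansion of _MASKED_BIT_ENABLE(); the 0x00800080 value is derived, not quoted from the patch.

#include <stdio.h>

#define BIT(n)			(1u << (n))
#define CTX_CTRL_RUN_ALONE	BIT(7)	/* mirrors xe_engine_regs.h */

/* For masked registers the upper 16 bits select which bits the write
 * affects; assumed expansion of the kernel's _MASKED_BIT_ENABLE(). */
#define _MASKED_BIT_ENABLE(a)	(((a) << 16) | (a))

int main(void)
{
	printf("runalone enable word: 0x%08x\n",
	       _MASKED_BIT_ENABLE(CTX_CTRL_RUN_ALONE));	/* 0x00800080 */
	return 0;
}
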
From patchwork Mon Dec 9 13:32:59 2024 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Christoph Manszewski , Maciej Patelczyk , Mika Kuoppala Subject: [PATCH 08/26] drm/xe/eudebug: Introduce per device attention scan worker Date: Mon, 9 Dec 2024 15:32:59 +0200 Message-ID: <20241209133318.1806472-9-mika.kuoppala@linux.intel.com> In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> From: Dominik Grzegorzek Scan for EU debugging attention bits periodically to detect if some EU thread has entered the system routine (SIP) due to an EU thread exception. Make the scanning interval 10 times slower when there is no debugger connection open. Send an attention event whenever we see attention while a debugger is present. If there is no active debugger connection, reset. Based on work by the authors and other folks who were part of the attention handling in i915. v2: - use xa_array for files - null ptr deref fix for non-debugged context (Dominik) - checkpatch (Tilak) - use discovery_lock during list traversal v3: - engine status per gen improvements, force_wake ref - __counted_by (Mika) Signed-off-by: Dominik Grzegorzek Signed-off-by: Christoph Manszewski Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/Makefile | 1 + drivers/gpu/drm/xe/regs/xe_engine_regs.h | 3 + drivers/gpu/drm/xe/regs/xe_gt_regs.h | 7 + drivers/gpu/drm/xe/xe_device.c | 2 + drivers/gpu/drm/xe/xe_device_types.h | 3 + drivers/gpu/drm/xe/xe_eudebug.c | 410 ++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug.h | 2 + drivers/gpu/drm/xe/xe_eudebug_types.h | 32 ++ drivers/gpu/drm/xe/xe_gt_debug.c | 148 ++++++++ drivers/gpu/drm/xe/xe_gt_debug.h | 21 ++ include/uapi/drm/xe_drm_eudebug.h | 13 + 11 files changed, 640 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.c create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.h diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index deabcdd3ea52..33f457e4fcd3 100644 --- a/drivers/gpu/drm/xe/Makefile +++ b/drivers/gpu/drm/xe/Makefile @@ -48,6 +48,7 @@ xe-y += xe_bb.o \ xe_gt_clock.o \ xe_gt_freq.o \ xe_gt_idle.o \ + xe_gt_debug.o \ xe_gt_mcr.o \ xe_gt_pagefault.o \ xe_gt_sysfs.o \ diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h index e45c4d5378e5..83b26cb174d6 100644 --- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h @@ -133,6 +133,9 @@ #define RING_EXECLIST_STATUS_LO(base) XE_REG((base) + 0x234) #define RING_EXECLIST_STATUS_HI(base) XE_REG((base) + 0x234 + 4) +#define RING_CURRENT_LRCA(base) XE_REG((base) + 0x240) +#define CURRENT_LRCA_VALID REG_BIT(0) + #define RING_CONTEXT_CONTROL(base) XE_REG((base) + 0x244, XE_REG_OPTION_MASKED) #define
CTX_CTRL_OAC_CONTEXT_ENABLE REG_BIT(8) #define CTX_CTRL_RUN_ALONE REG_BIT(7) diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index cd8c49a9000f..a20331b6c20e 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -467,6 +467,8 @@ #define DISABLE_ECC REG_BIT(5) #define ENABLE_PREFETCH_INTO_IC REG_BIT(3) +#define TD_ATT(x) XE_REG_MCR(0xe470 + (x) * 4) + #define ROW_CHICKEN4 XE_REG_MCR(0xe48c, XE_REG_OPTION_MASKED) #define DISABLE_GRF_CLEAR REG_BIT(13) #define XEHP_DIS_BBL_SYSPIPE REG_BIT(11) @@ -547,6 +549,11 @@ #define CCS_MODE_CSLICE(cslice, ccs) \ ((ccs) << ((cslice) * CCS_MODE_CSLICE_WIDTH)) +#define RCU_DEBUG_1 XE_REG(0x14a00) +#define RCU_DEBUG_1_ENGINE_STATUS REG_GENMASK(2, 0) +#define RCU_DEBUG_1_RUNALONE_ACTIVE REG_BIT(2) +#define RCU_DEBUG_1_CONTEXT_ACTIVE REG_BIT(0) + #define FORCEWAKE_ACK_GT XE_REG(0x130044) /* Applicable for all FORCEWAKE_DOMAIN and FORCEWAKE_ACK_DOMAIN regs */ diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index f051612908de..dc0336215912 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -777,6 +777,8 @@ int xe_device_probe(struct xe_device *xe) xe_debugfs_register(xe); + xe_eudebug_init_late(xe); + xe_hwmon_register(xe); for_each_gt(gt, xe, id) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 9941ea1400c6..7b893a86d83f 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -556,6 +556,9 @@ struct xe_device { /** discovery_lock: used for discovery to block xe ioctls */ struct rw_semaphore discovery_lock; + + /** @attention_scan: attention scan worker */ + struct delayed_work attention_scan; } eudebug; #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 4644d6846aae..39e927100222 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -20,9 +20,17 @@ #include "xe_eudebug.h" #include "xe_eudebug_types.h" #include "xe_exec_queue.h" +#include "xe_force_wake.h" +#include "xe_gt.h" +#include "xe_gt_debug.h" +#include "xe_hw_engine.h" +#include "xe_lrc.h" #include "xe_macros.h" +#include "xe_mmio.h" +#include "xe_pm.h" #include "xe_reg_sr.h" #include "xe_rtp.h" +#include "xe_sched_job.h" #include "xe_vm.h" #include "xe_wa.h" @@ -725,7 +733,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_EU_ATTENTION; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -760,7 +768,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_EU_ATTENTION; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -867,6 +875,392 @@ static const struct file_operations fops = { .unlocked_ioctl = xe_eudebug_ioctl, }; +static int __current_lrca(struct xe_hw_engine *hwe, u32 *lrc_hw) +{ + u32 lrc_reg; + + lrc_reg = xe_hw_engine_mmio_read32(hwe, RING_CURRENT_LRCA(0)); + + if (!(lrc_reg & CURRENT_LRCA_VALID)) + return -ENOENT; + + *lrc_hw = lrc_reg & GENMASK(31, 12); + + return 0; +} + +static int 
current_lrca(struct xe_hw_engine *hwe, u32 *lrc_hw) +{ + unsigned int fw_ref; + int ret; + + fw_ref = xe_force_wake_get(gt_to_fw(hwe->gt), hwe->domain); + if (!fw_ref) + return -ETIMEDOUT; + + ret = __current_lrca(hwe, lrc_hw); + + xe_force_wake_put(gt_to_fw(hwe->gt), fw_ref); + + return ret; +} + +static bool lrca_equals(u32 a, u32 b) +{ + return (a & GENMASK(31, 12)) == (b & GENMASK(31, 12)); +} + +static int match_exec_queue_lrca(struct xe_exec_queue *q, u32 lrc_hw) +{ + int i; + + for (i = 0; i < q->width; i++) + if (lrca_equals(lower_32_bits(xe_lrc_descriptor(q->lrc[i])), lrc_hw)) + return i; + + return -1; +} + +static int rcu_debug1_engine_index(const struct xe_hw_engine * const hwe) +{ + if (hwe->class == XE_ENGINE_CLASS_RENDER) { + XE_WARN_ON(hwe->instance); + return 0; + } + + XE_WARN_ON(hwe->instance > 3); + + return hwe->instance + 1; +} + +static u32 engine_status_xe1(const struct xe_hw_engine * const hwe, + u32 rcu_debug1) +{ + const unsigned int first = 7; + const unsigned int incr = 3; + const unsigned int i = rcu_debug1_engine_index(hwe); + const unsigned int shift = first + (i * incr); + + return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS; +} + +static u32 engine_status_xe2(const struct xe_hw_engine * const hwe, + u32 rcu_debug1) +{ + const unsigned int first = 7; + const unsigned int incr = 4; + const unsigned int i = rcu_debug1_engine_index(hwe); + const unsigned int shift = first + (i * incr); + + return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS; +} + +static u32 engine_status(const struct xe_hw_engine * const hwe, + u32 rcu_debug1) +{ + u32 status = 0; + + if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 20) + status = engine_status_xe1(hwe, rcu_debug1); + else if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 30) + status = engine_status_xe2(hwe, rcu_debug1); + else + XE_WARN_ON(GRAPHICS_VER(gt_to_xe(hwe->gt))); + + return status; +} + +static bool engine_is_runalone_set(const struct xe_hw_engine * const hwe, + u32 rcu_debug1) +{ + return engine_status(hwe, rcu_debug1) & RCU_DEBUG_1_RUNALONE_ACTIVE; +} + +static bool engine_is_context_set(const struct xe_hw_engine * const hwe, + u32 rcu_debug1) +{ + return engine_status(hwe, rcu_debug1) & RCU_DEBUG_1_CONTEXT_ACTIVE; +} + +static bool engine_has_runalone(const struct xe_hw_engine * const hwe) +{ + return hwe->class == XE_ENGINE_CLASS_RENDER || + hwe->class == XE_ENGINE_CLASS_COMPUTE; +} + +static struct xe_hw_engine *get_runalone_active_hw_engine(struct xe_gt *gt) +{ + struct xe_hw_engine *hwe, *first = NULL; + unsigned int num_active, id, fw_ref; + u32 val; + + fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); + if (!fw_ref) { + drm_dbg(>_to_xe(gt)->drm, "eudbg: runalone failed to get force wake\n"); + return NULL; + } + + val = xe_mmio_read32(>->mmio, RCU_DEBUG_1); + xe_force_wake_put(gt_to_fw(gt), fw_ref); + + drm_dbg(>_to_xe(gt)->drm, "eudbg: runalone RCU_DEBUG_1 = 0x%08x\n", val); + + num_active = 0; + for_each_hw_engine(hwe, gt, id) { + bool runalone, ctx; + + if (!engine_has_runalone(hwe)) + continue; + + runalone = engine_is_runalone_set(hwe, val); + ctx = engine_is_context_set(hwe, val); + + drm_dbg(>_to_xe(gt)->drm, "eudbg: engine %s: runalone=%s, context=%s", + hwe->name, runalone ? "active" : "inactive", + ctx ? "active" : "inactive"); + + /* + * On earlier gen12 the context status seems to be idle when + * it has raised attention. We have to omit the active bit. 
+ */ + if (IS_DGFX(gt_to_xe(gt))) + ctx = true; + + if (runalone && ctx) { + num_active++; + + drm_dbg(>_to_xe(gt)->drm, "eudbg: runalone engine %s %s", + hwe->name, first ? "selected" : "found"); + if (!first) + first = hwe; + } + } + + if (num_active > 1) + drm_err(>_to_xe(gt)->drm, "eudbg: %d runalone engines active!", + num_active); + + return first; +} + +static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx) +{ + struct xe_device *xe = gt_to_xe(gt); + struct xe_exec_queue *q, *found = NULL; + struct xe_hw_engine *active; + struct xe_file *xef; + unsigned long i; + int idx, err; + u32 lrc_hw; + + active = get_runalone_active_hw_engine(gt); + if (!active) { + drm_dbg(>_to_xe(gt)->drm, "Runalone engine not found!"); + return ERR_PTR(-ENOENT); + } + + err = current_lrca(active, &lrc_hw); + if (err) + return ERR_PTR(err); + + /* Take write so that we can safely check the lists */ + down_write(&xe->eudebug.discovery_lock); + list_for_each_entry(xef, &xe->clients.list, eudebug.client_link) { + xa_for_each(&xef->exec_queue.xa, i, q) { + if (q->gt != gt) + continue; + + if (q->class != active->class) + continue; + + if (xe_exec_queue_is_idle(q)) + continue; + + idx = match_exec_queue_lrca(q, lrc_hw); + if (idx < 0) + continue; + + found = xe_exec_queue_get(q); + + if (lrc_idx) + *lrc_idx = idx; + + break; + } + + if (found) + break; + } + up_write(&xe->eudebug.discovery_lock); + + if (!found) + return ERR_PTR(-ENOENT); + + if (XE_WARN_ON(current_lrca(active, &lrc_hw)) && + XE_WARN_ON(match_exec_queue_lrca(found, lrc_hw) < 0)) { + xe_exec_queue_put(found); + return ERR_PTR(-ENOENT); + } + + return found; +} + +static int send_attention_event(struct xe_eudebug *d, struct xe_exec_queue *q, int lrc_idx) +{ + struct xe_eudebug_event_eu_attention *ea; + struct xe_eudebug_event *event; + int h_c, h_queue, h_lrc; + u32 size = xe_gt_eu_attention_bitmap_size(q->gt); + u32 sz = struct_size(ea, bitmask, size); + int ret; + + XE_WARN_ON(lrc_idx < 0 || lrc_idx >= q->width); + + XE_WARN_ON(!xe_exec_queue_is_debuggable(q)); + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, q->vm->xef); + if (h_c < 0) + return h_c; + + h_queue = find_handle(d->res, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, q); + if (h_queue < 0) + return h_queue; + + h_lrc = find_handle(d->res, XE_EUDEBUG_RES_TYPE_LRC, q->lrc[lrc_idx]); + if (h_lrc < 0) + return h_lrc; + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION, 0, + DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, sz); + + if (!event) + return -ENOSPC; + + ea = cast_event(ea, event); + write_member(struct drm_xe_eudebug_event_eu_attention, ea, client_handle, (u64)h_c); + write_member(struct drm_xe_eudebug_event_eu_attention, ea, exec_queue_handle, (u64)h_queue); + write_member(struct drm_xe_eudebug_event_eu_attention, ea, lrc_handle, (u64)h_lrc); + write_member(struct drm_xe_eudebug_event_eu_attention, ea, bitmask_size, size); + + mutex_lock(&d->eu_lock); + event->seqno = atomic_long_inc_return(&d->events.seqno); + ret = xe_gt_eu_attention_bitmap(q->gt, &ea->bitmask[0], ea->bitmask_size); + mutex_unlock(&d->eu_lock); + + if (ret) + return ret; + + return xe_eudebug_queue_event(d, event); +} + + +static int xe_send_gt_attention(struct xe_gt *gt) +{ + struct xe_eudebug *d; + struct xe_exec_queue *q; + int ret, lrc_idx; + + if (list_empty_careful(>_to_xe(gt)->eudebug.list)) + return -ENOTCONN; + + q = runalone_active_queue_get(gt, &lrc_idx); + if (IS_ERR(q)) + return PTR_ERR(q); + + if (!xe_exec_queue_is_debuggable(q)) { + ret = -EPERM; + goto 
err_exec_queue_put; + } + + d = _xe_eudebug_get(q->vm->xef); + if (!d) { + ret = -ENOTCONN; + goto err_exec_queue_put; + } + + if (!completion_done(&d->discovery)) { + eu_dbg(d, "discovery not yet done\n"); + ret = -EBUSY; + goto err_eudebug_put; + } + + ret = send_attention_event(d, q, lrc_idx); + if (ret) + xe_eudebug_disconnect(d, ret); + +err_eudebug_put: + xe_eudebug_put(d); +err_exec_queue_put: + xe_exec_queue_put(q); + + return ret; +} + +static int xe_eudebug_handle_gt_attention(struct xe_gt *gt) +{ + int ret; + + ret = xe_gt_eu_threads_needing_attention(gt); + if (ret <= 0) + return ret; + + ret = xe_send_gt_attention(gt); + + /* Discovery in progress, fake it */ + if (ret == -EBUSY) + return 0; + + return ret; +} + +#define XE_EUDEBUG_ATTENTION_INTERVAL 100 +static void attention_scan_fn(struct work_struct *work) +{ + struct xe_device *xe = container_of(work, typeof(*xe), eudebug.attention_scan.work); + long delay = msecs_to_jiffies(XE_EUDEBUG_ATTENTION_INTERVAL); + struct xe_gt *gt; + u8 gt_id; + + if (list_empty_careful(&xe->eudebug.list)) + delay *= 10; + + if (delay >= HZ) + delay = round_jiffies_up_relative(delay); + + if (xe_pm_runtime_get_if_active(xe)) { + for_each_gt(gt, xe, gt_id) { + int ret; + + if (gt->info.type != XE_GT_TYPE_MAIN) + continue; + + ret = xe_eudebug_handle_gt_attention(gt); + if (ret) { + // TODO: error capture + drm_info(>_to_xe(gt)->drm, + "gt:%d unable to handle eu attention ret=%d\n", + gt_id, ret); + + xe_gt_reset_async(gt); + } + } + + xe_pm_runtime_put(xe); + } + + schedule_delayed_work(&xe->eudebug.attention_scan, delay); +} + +static void attention_scan_cancel(struct xe_device *xe) +{ + cancel_delayed_work_sync(&xe->eudebug.attention_scan); +} + +static void attention_scan_flush(struct xe_device *xe) +{ + mod_delayed_work(system_wq, &xe->eudebug.attention_scan, 0); +} + static void discovery_work_fn(struct work_struct *work); static int @@ -901,6 +1295,7 @@ xe_eudebug_connect(struct xe_device *xe, kref_init(&d->ref); spin_lock_init(&d->connection.lock); + mutex_init(&d->eu_lock); init_waitqueue_head(&d->events.write_done); init_waitqueue_head(&d->events.read_done); init_completion(&d->discovery); @@ -927,6 +1322,7 @@ xe_eudebug_connect(struct xe_device *xe, kref_get(&d->ref); queue_work(xe->eudebug.ordered_wq, &d->discovery_work); + attention_scan_flush(xe); eu_dbg(d, "connected session %lld", d->session); @@ -1004,13 +1400,23 @@ void xe_eudebug_init(struct xe_device *xe) spin_lock_init(&xe->clients.lock); INIT_LIST_HEAD(&xe->clients.list); init_rwsem(&xe->eudebug.discovery_lock); + INIT_DELAYED_WORK(&xe->eudebug.attention_scan, attention_scan_fn); xe->eudebug.ordered_wq = alloc_ordered_workqueue("xe-eudebug-ordered-wq", 0); xe->eudebug.available = !!xe->eudebug.ordered_wq; } +void xe_eudebug_init_late(struct xe_device *xe) +{ + if (!xe->eudebug.available) + return; + + attention_scan_flush(xe); +} + void xe_eudebug_fini(struct xe_device *xe) { + attention_scan_cancel(xe); xe_assert(xe, list_empty_careful(&xe->eudebug.list)); if (xe->eudebug.ordered_wq) diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index 3cd6bc7bb682..1fe86bec99e1 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -20,6 +20,7 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev, struct drm_file *file); void xe_eudebug_init(struct xe_device *xe); +void xe_eudebug_init_late(struct xe_device *xe); void xe_eudebug_fini(struct xe_device *xe); void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe); @@ -39,6 
+40,7 @@ static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, struct drm_file *file) { return 0; } static inline void xe_eudebug_init(struct xe_device *xe) { } +static inline void xe_eudebug_init_late(struct xe_device *xe) { } static inline void xe_eudebug_fini(struct xe_device *xe) { } static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe) { } diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index bdffdfb1abff..410b3ecccc12 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -105,6 +105,9 @@ struct xe_eudebug { /** @discovery_work: worker to discover resources for target_task */ struct work_struct discovery_work; + /** eu_lock: guards operations on eus (eu thread control and attention) */ + struct mutex eu_lock; + /** @events: kfifo queue of to-be-delivered events */ struct { /** @lock: guards access to fifo */ @@ -228,4 +231,33 @@ struct xe_eudebug_event_exec_queue_placements { u64 instances[]; __counted_by(num_placements); }; +/** + * struct xe_eudebug_event_eu_attention - Internal event for EU attention + */ +struct xe_eudebug_event_eu_attention { + /** @base: base event */ + struct xe_eudebug_event base; + + /** @client_handle: client for the attention */ + u64 client_handle; + + /** @exec_queue_handle: handle of exec_queue which raised attention */ + u64 exec_queue_handle; + + /** @lrc_handle: lrc handle of the workload which raised attention */ + u64 lrc_handle; + + /** @flags: eu attention event flags, currently MBZ */ + u32 flags; + + /** @bitmask_size: size of the bitmask, specific to device */ + u32 bitmask_size; + + /** + * @bitmask: reflects threads currently signalling attention, + * starting from natural hardware order of DSS=0, eu=0 + */ + u8 bitmask[] __counted_by(bitmask_size); +}; + #endif diff --git a/drivers/gpu/drm/xe/xe_gt_debug.c b/drivers/gpu/drm/xe/xe_gt_debug.c new file mode 100644 index 000000000000..c4f0d11a20a6 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_gt_debug.c @@ -0,0 +1,148 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#include "regs/xe_gt_regs.h" +#include "xe_device.h" +#include "xe_force_wake.h" +#include "xe_gt.h" +#include "xe_gt_topology.h" +#include "xe_gt_debug.h" +#include "xe_gt_mcr.h" +#include "xe_pm.h" +#include "xe_macros.h" + +static int xe_gt_foreach_dss_group_instance(struct xe_gt *gt, + int (*fn)(struct xe_gt *gt, + void *data, + u16 group, + u16 instance), + void *data) +{ + const enum xe_force_wake_domains fw_domains = XE_FW_GT; + unsigned int dss, fw_ref; + u16 group, instance; + int ret = 0; + + fw_ref = xe_force_wake_get(gt_to_fw(gt), fw_domains); + if (!fw_ref) + return -ETIMEDOUT; + + for_each_dss_steering(dss, gt, group, instance) { + ret = fn(gt, data, group, instance); + if (ret) + break; + } + + xe_force_wake_put(gt_to_fw(gt), fw_ref); + + return ret; +} + +static int read_first_attention_mcr(struct xe_gt *gt, void *data, + u16 group, u16 instance) +{ + unsigned int row; + + for (row = 0; row < 2; row++) { + u32 val; + + val = xe_gt_mcr_unicast_read(gt, TD_ATT(row), group, instance); + + if (val) + return 1; + } + + return 0; +} + +#define MAX_EUS_PER_ROW 4u +#define MAX_THREADS 8u + +/** + * xe_gt_eu_attention_bitmap_size - query size of the attention bitmask + * + * @gt: pointer to struct xe_gt + * + * Return: size in bytes. 
+ */ +int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt) +{ + xe_dss_mask_t dss_mask; + + bitmap_or(dss_mask, gt->fuse_topo.c_dss_mask, + gt->fuse_topo.g_dss_mask, XE_MAX_DSS_FUSE_BITS); + + return bitmap_weight(dss_mask, XE_MAX_DSS_FUSE_BITS) * + TD_EU_ATTENTION_MAX_ROWS * MAX_THREADS * + MAX_EUS_PER_ROW / 8; +} + +struct attn_read_iter { + struct xe_gt *gt; + unsigned int i; + unsigned int size; + u8 *bits; +}; + +static int read_eu_attentions_mcr(struct xe_gt *gt, void *data, + u16 group, u16 instance) +{ + struct attn_read_iter * const iter = data; + unsigned int row; + + for (row = 0; row < TD_EU_ATTENTION_MAX_ROWS; row++) { + u32 val; + + if (iter->i >= iter->size) + return 0; + + XE_WARN_ON(iter->i + sizeof(val) > xe_gt_eu_attention_bitmap_size(gt)); + + val = xe_gt_mcr_unicast_read(gt, TD_ATT(row), group, instance); + + memcpy(&iter->bits[iter->i], &val, sizeof(val)); + iter->i += sizeof(val); + } + + return 0; +} + +/** + * xe_gt_eu_attention_bitmap - query host attention + * + * @gt: pointer to struct xe_gt + * + * Return: 0 on success, negative otherwise. + */ +int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits, + unsigned int bitmap_size) +{ + struct attn_read_iter iter = { + .gt = gt, + .i = 0, + .size = bitmap_size, + .bits = bits + }; + + return xe_gt_foreach_dss_group_instance(gt, read_eu_attentions_mcr, &iter); +} + +/** + * xe_gt_eu_threads_needing_attention - Query host attention + * + * @gt: pointer to struct xe_gt + * + * Return: 1 if threads waiting host attention, 0 otherwise. + */ +int xe_gt_eu_threads_needing_attention(struct xe_gt *gt) +{ + int err; + + err = xe_gt_foreach_dss_group_instance(gt, read_first_attention_mcr, NULL); + + XE_WARN_ON(err < 0); + + return err < 0 ? 0 : err; +} diff --git a/drivers/gpu/drm/xe/xe_gt_debug.h b/drivers/gpu/drm/xe/xe_gt_debug.h new file mode 100644 index 000000000000..3f13dbb17a5f --- /dev/null +++ b/drivers/gpu/drm/xe/xe_gt_debug.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef __XE_GT_DEBUG_ +#define __XE_GT_DEBUG_ + +#define TD_EU_ATTENTION_MAX_ROWS 2u + +#include "xe_gt_types.h" + +#define XE_GT_ATTENTION_TIMEOUT_MS 100 + +int xe_gt_eu_threads_needing_attention(struct xe_gt *gt); + +int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt); +int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits, + unsigned int bitmap_size); + +#endif diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index 21690008a869..144c7cf888bb 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -28,12 +28,14 @@ struct drm_xe_eudebug_event { #define DRM_XE_EUDEBUG_EVENT_VM 3 #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE 4 #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS 5 +#define DRM_XE_EUDEBUG_EVENT_EU_ATTENTION 6 __u16 flags; #define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) #define DRM_XE_EUDEBUG_EVENT_DESTROY (1 << 1) #define DRM_XE_EUDEBUG_EVENT_STATE_CHANGE (1 << 2) #define DRM_XE_EUDEBUG_EVENT_NEED_ACK (1 << 3) + __u64 seqno; __u64 reserved; }; @@ -78,6 +80,17 @@ struct drm_xe_eudebug_event_exec_queue_placements { __u64 instances[]; }; +struct drm_xe_eudebug_event_eu_attention { + struct drm_xe_eudebug_event base; + + __u64 client_handle; + __u64 exec_queue_handle; + __u64 lrc_handle; + __u32 flags; + __u32 bitmask_size; + __u8 bitmask[]; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:33:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B4EFFE7717D for ; Mon, 9 Dec 2024 13:33:20 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 33FC210E753; Mon, 9 Dec 2024 13:33:20 +0000 (UTC)
Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="mBrgsc9+"; dkim-atps=neutral
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id A046010E74A; Mon, 9 Dec 2024 13:33:18 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751199; x=1765287199; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SmqrFs1Vni9cP02o6r91CG4m2DN9wLEmgy+pmFkC67E=; b=mBrgsc9+bEHsLfzJR/CP2adAirxOF12iqbKJuMjLfV1jWuhQg0Bc6QyB YZZerUThkAcnhEMTqpXhA+cZnh5LZ5Zkyq+RZ8FUYVGRn21Sqh1G1o/X jbLpj/OWqrGvZMuLMEQdek2BjVWjA4yN7RPGC80KanYcGjmn4Dh3p+4be SGYqIplAHNIinZTDuW2g/SfzjrPXLAPYiQgzjtTysmbvzIEDQOgm0qPsO oXfMNSxHqvPDlctuULF6wRXmSVB6/gpthjyZYazEQjg3RtRlH+QaI3jUa IlZodRLhdUOMJTQ54K3yoiII6ooQ+mGVZIGm3Qgj4ABYJNZNnys18XyQv A==;
X-CSE-ConnectionGUID: 0FLI+p7IREK7anUmW5Lj4Q== X-CSE-MsgGUID: 13HGtVuNSQunSuqcM29cHw==
X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192000"
X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192000"
Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:18 -0800
X-CSE-ConnectionGUID: FqizDAQxRxyDoBfZH9rRsA== X-CSE-MsgGUID: FOp6D4E1TU+RtvxjFwrnCw== X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531289"
Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:17 -0800
From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Maciej Patelczyk , Mika Kuoppala
Subject: [PATCH 09/26] drm/xe/eudebug: Introduce EU control interface
Date: Mon, 9 Dec 2024 15:33:00 +0200
Message-ID: <20241209133318.1806472-10-mika.kuoppala@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
MIME-Version: 1.0
X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel"
From: Dominik Grzegorzek
Introduce EU control functionality, which allows the EU debugger to interrupt and resume EU threads and to query their current state during execution. Provide an abstraction layer so that, in the future, the GuC backend only needs to provide the appropriate callbacks.
Based on an implementation created by the authors and other folks within the i915 driver.
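For reference, a minimal sketch of the calling sequence a debugger process might use against this interface, assuming the uapi header added by this series is installed as <drm/xe_drm_eudebug.h> and that the client, exec queue and LRC handles were obtained from previously read events; the helper name and its parameters are illustrative only and error handling is elided:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/xe_drm_eudebug.h>

static int eu_stop_query_resume(int fd, uint64_t client, uint64_t queue,
				uint64_t lrc, uint8_t *bits, uint32_t size)
{
	struct drm_xe_eudebug_eu_control c;

	memset(&c, 0, sizeof(c));	/* flags and reserved must be zero */
	c.client_handle = client;
	c.exec_queue_handle = queue;
	c.lrc_handle = lrc;

	/* Halt all EU threads of the workload; the bitmask must be
	 * empty for this command. */
	c.cmd = DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL;
	if (ioctl(fd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &c))
		return -1;

	/* Query which threads are stopped; size must be u32 aligned.
	 * The kernel fills the caller's bitmask and writes the seqno of
	 * the state change back into c.seqno. */
	c.cmd = DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED;
	c.bitmask_size = size;
	c.bitmask_ptr = (uint64_t)(uintptr_t)bits;
	if (ioctl(fd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &c))
		return -1;

	/* ... inspect thread state, plant breakpoints, etc. ... */

	/* Resume exactly the threads whose bits are set in the bitmask. */
	c.cmd = DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME;
	return ioctl(fd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &c);
}

Per do_eu_control() in the patch below, INTERRUPT_ALL is only honoured with an empty bitmask, STOPPED copies the attention bitmask back to the caller (updating bitmask_size if it differs from the hardware size), and RESUME clears attention only for the bits that are set.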
v2: - checkpatch (Maciej) - lrc index off by one fix (Mika) - checkpatch (Tilak) - 32bit fixes (Andrzej, Mika) - find_resource_get for client (Mika) v3: - fw ref (Mika) Signed-off-by: Dominik Grzegorzek Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/regs/xe_gt_regs.h | 2 + drivers/gpu/drm/xe/xe_eudebug.c | 515 +++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug_types.h | 24 ++ drivers/gpu/drm/xe/xe_gt_debug.c | 12 +- drivers/gpu/drm/xe/xe_gt_debug.h | 6 + include/uapi/drm/xe_drm_eudebug.h | 21 +- 6 files changed, 560 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index a20331b6c20e..5fcf06835ef0 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -478,6 +478,8 @@ #define THREAD_EX_ARB_MODE REG_GENMASK(3, 2) #define THREAD_EX_ARB_MODE_RR_AFTER_DEP REG_FIELD_PREP(THREAD_EX_ARB_MODE, 0x2) +#define TD_CLR(i) XE_REG_MCR(0xe490 + (i) * 4) + #define ROW_CHICKEN3 XE_REG_MCR(0xe49c, XE_REG_OPTION_MASKED) #define XE2_EUPEND_CHK_FLUSH_DIS REG_BIT(14) #define DIS_FIX_EOT1_FLUSH REG_BIT(9) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 39e927100222..81d03a860b7f 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -23,6 +23,7 @@ #include "xe_force_wake.h" #include "xe_gt.h" #include "xe_gt_debug.h" +#include "xe_gt_mcr.h" #include "xe_hw_engine.h" #include "xe_lrc.h" #include "xe_macros.h" @@ -587,6 +588,64 @@ static int find_handle(struct xe_eudebug_resources *res, return id; } +static void *find_resource__unlocked(struct xe_eudebug_resources *res, + const int type, + const u32 id) +{ + struct xe_eudebug_resource *r; + struct xe_eudebug_handle *h; + + r = resource_from_type(res, type); + h = xa_load(&r->xa, id); + + return h ? 
(void *)(uintptr_t)h->key : NULL; +} + +static void *find_resource(struct xe_eudebug_resources *res, + const int type, + const u32 id) +{ + void *p; + + mutex_lock(&res->lock); + p = find_resource__unlocked(res, type, id); + mutex_unlock(&res->lock); + + return p; +} + +static struct xe_file *find_client_get(struct xe_eudebug *d, const u32 id) +{ + struct xe_file *xef; + + mutex_lock(&d->res->lock); + xef = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, id); + if (xef) + xe_file_get(xef); + mutex_unlock(&d->res->lock); + + return xef; +} + +static struct xe_exec_queue *find_exec_queue_get(struct xe_eudebug *d, + u32 id) +{ + struct xe_exec_queue *q; + + mutex_lock(&d->res->lock); + q = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, id); + if (q) + xe_exec_queue_get(q); + mutex_unlock(&d->res->lock); + + return q; +} + +static struct xe_lrc *find_lrc(struct xe_eudebug *d, const u32 id) +{ + return find_resource(d->res, XE_EUDEBUG_RES_TYPE_LRC, id); +} + static int _xe_eudebug_add_handle(struct xe_eudebug *d, int type, void *p, @@ -843,6 +902,177 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, return ret; } +static int do_eu_control(struct xe_eudebug *d, + const struct drm_xe_eudebug_eu_control * const arg, + struct drm_xe_eudebug_eu_control __user * const user_ptr) +{ + void __user * const bitmask_ptr = u64_to_user_ptr(arg->bitmask_ptr); + struct xe_device *xe = d->xe; + u8 *bits = NULL; + unsigned int hw_attn_size, attn_size; + struct xe_exec_queue *q; + struct xe_file *xef; + struct xe_lrc *lrc; + u64 seqno; + int ret; + + if (xe_eudebug_detached(d)) + return -ENOTCONN; + + /* Accept only hardware reg granularity mask */ + if (XE_IOCTL_DBG(xe, !IS_ALIGNED(arg->bitmask_size, sizeof(u32)))) + return -EINVAL; + + xef = find_client_get(d, arg->client_handle); + if (XE_IOCTL_DBG(xe, !xef)) + return -EINVAL; + + q = find_exec_queue_get(d, arg->exec_queue_handle); + if (XE_IOCTL_DBG(xe, !q)) { + xe_file_put(xef); + return -EINVAL; + } + + if (XE_IOCTL_DBG(xe, !xe_exec_queue_is_debuggable(q))) { + ret = -EINVAL; + goto queue_put; + } + + if (XE_IOCTL_DBG(xe, xef != q->vm->xef)) { + ret = -EINVAL; + goto queue_put; + } + + lrc = find_lrc(d, arg->lrc_handle); + if (XE_IOCTL_DBG(xe, !lrc)) { + ret = -EINVAL; + goto queue_put; + } + + hw_attn_size = xe_gt_eu_attention_bitmap_size(q->gt); + attn_size = arg->bitmask_size; + + if (attn_size > hw_attn_size) + attn_size = hw_attn_size; + + if (attn_size > 0) { + bits = kmalloc(attn_size, GFP_KERNEL); + if (!bits) { + ret = -ENOMEM; + goto queue_put; + } + + if (copy_from_user(bits, bitmask_ptr, attn_size)) { + ret = -EFAULT; + goto out_free; + } + } + + if (!pm_runtime_active(xe->drm.dev)) { + ret = -EIO; + goto out_free; + } + + ret = -EINVAL; + mutex_lock(&d->eu_lock); + + switch (arg->cmd) { + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL: + /* Make sure we don't promise anything but interrupting all */ + if (!attn_size) + ret = d->ops->interrupt_all(d, q, lrc); + break; + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED: + ret = d->ops->stopped(d, q, lrc, bits, attn_size); + break; + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME: + ret = d->ops->resume(d, q, lrc, bits, attn_size); + break; + default: + break; + } + + if (ret == 0) + seqno = atomic_long_inc_return(&d->events.seqno); + + mutex_unlock(&d->eu_lock); + + if (ret) + goto out_free; + + if (put_user(seqno, &user_ptr->seqno)) { + ret = -EFAULT; + goto out_free; + } + + if (copy_to_user(bitmask_ptr, bits, attn_size)) { + ret = -EFAULT; + goto out_free; + } +
if (hw_attn_size != arg->bitmask_size) + if (put_user(hw_attn_size, &user_ptr->bitmask_size)) + ret = -EFAULT; + +out_free: + kfree(bits); +queue_put: + xe_exec_queue_put(q); + xe_file_put(xef); + + return ret; +} + +static long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg) +{ + struct drm_xe_eudebug_eu_control __user * const user_ptr = + u64_to_user_ptr(arg); + struct drm_xe_eudebug_eu_control user_arg; + struct xe_device *xe = d->xe; + struct xe_file *xef; + int ret; + + if (XE_IOCTL_DBG(xe, !(_IOC_DIR(DRM_XE_EUDEBUG_IOCTL_EU_CONTROL) & _IOC_WRITE))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !(_IOC_DIR(DRM_XE_EUDEBUG_IOCTL_EU_CONTROL) & _IOC_READ))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, _IOC_SIZE(DRM_XE_EUDEBUG_IOCTL_EU_CONTROL) != sizeof(user_arg))) + return -EINVAL; + + if (copy_from_user(&user_arg, + user_ptr, + sizeof(user_arg))) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, user_arg.flags)) + return -EINVAL; + + if (!access_ok(u64_to_user_ptr(user_arg.bitmask_ptr), user_arg.bitmask_size)) + return -EFAULT; + + eu_dbg(d, + "eu_control: client_handle=%llu, cmd=%u, flags=0x%x, exec_queue_handle=%llu, bitmask_size=%u\n", + user_arg.client_handle, user_arg.cmd, user_arg.flags, user_arg.exec_queue_handle, + user_arg.bitmask_size); + + xef = find_client_get(d, user_arg.client_handle); + if (XE_IOCTL_DBG(xe, !xef)) + return -EINVAL; /* As this is user input */ + + ret = do_eu_control(d, &user_arg, user_ptr); + + xe_file_put(xef); + + eu_dbg(d, + "eu_control: client_handle=%llu, cmd=%u, flags=0x%x, exec_queue_handle=%llu, bitmask_size=%u ret=%d\n", + user_arg.client_handle, user_arg.cmd, user_arg.flags, user_arg.exec_queue_handle, + user_arg.bitmask_size, ret); + + return ret; +} + static long xe_eudebug_ioctl(struct file *file, unsigned int cmd, unsigned long arg) @@ -859,6 +1089,10 @@ static long xe_eudebug_ioctl(struct file *file, ret = xe_eudebug_read_event(d, arg, !(file->f_flags & O_NONBLOCK)); break; + case DRM_XE_EUDEBUG_IOCTL_EU_CONTROL: + ret = xe_eudebug_eu_control(d, arg); + eu_dbg(d, "ioctl cmd=EU_CONTROL ret=%ld\n", ret); + break; default: ret = -EINVAL; @@ -1043,23 +1277,17 @@ static struct xe_hw_engine *get_runalone_active_hw_engine(struct xe_gt *gt) return first; } -static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx) +static struct xe_exec_queue *active_hwe_to_exec_queue(struct xe_hw_engine *hwe, int *lrc_idx) { - struct xe_device *xe = gt_to_xe(gt); + struct xe_device *xe = gt_to_xe(hwe->gt); + struct xe_gt *gt = hwe->gt; struct xe_exec_queue *q, *found = NULL; - struct xe_hw_engine *active; struct xe_file *xef; unsigned long i; int idx, err; u32 lrc_hw; - active = get_runalone_active_hw_engine(gt); - if (!active) { - drm_dbg(>_to_xe(gt)->drm, "Runalone engine not found!"); - return ERR_PTR(-ENOENT); - } - - err = current_lrca(active, &lrc_hw); + err = current_lrca(hwe, &lrc_hw); if (err) return ERR_PTR(err); @@ -1070,7 +1298,7 @@ static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lr if (q->gt != gt) continue; - if (q->class != active->class) + if (q->class != hwe->class) continue; if (xe_exec_queue_is_idle(q)) @@ -1096,7 +1324,7 @@ static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lr if (!found) return ERR_PTR(-ENOENT); - if (XE_WARN_ON(current_lrca(active, &lrc_hw)) && + if (XE_WARN_ON(current_lrca(hwe, &lrc_hw)) && XE_WARN_ON(match_exec_queue_lrca(found, lrc_hw) < 0)) { xe_exec_queue_put(found); return ERR_PTR(-ENOENT); @@ -1105,6 +1333,19 @@ static struct 
xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lr return found; } +static struct xe_exec_queue *runalone_active_queue_get(struct xe_gt *gt, int *lrc_idx) +{ + struct xe_hw_engine *active; + + active = get_runalone_active_hw_engine(gt); + if (!active) { + drm_dbg(>_to_xe(gt)->drm, "Runalone engine not found!"); + return ERR_PTR(-ENOENT); + } + + return active_hwe_to_exec_queue(active, lrc_idx); +} + static int send_attention_event(struct xe_eudebug *d, struct xe_exec_queue *q, int lrc_idx) { struct xe_eudebug_event_eu_attention *ea; @@ -1153,7 +1394,6 @@ static int send_attention_event(struct xe_eudebug *d, struct xe_exec_queue *q, i return xe_eudebug_queue_event(d, event); } - static int xe_send_gt_attention(struct xe_gt *gt) { struct xe_eudebug *d; @@ -1261,6 +1501,254 @@ static void attention_scan_flush(struct xe_device *xe) mod_delayed_work(system_wq, &xe->eudebug.attention_scan, 0); } +static int xe_eu_control_interrupt_all(struct xe_eudebug *d, + struct xe_exec_queue *q, + struct xe_lrc *lrc) +{ + struct xe_gt *gt = q->hwe->gt; + struct xe_device *xe = d->xe; + struct xe_exec_queue *active; + struct xe_hw_engine *hwe; + unsigned int fw_ref; + int lrc_idx, ret; + u32 lrc_hw; + u32 td_ctl; + + hwe = get_runalone_active_hw_engine(gt); + if (XE_IOCTL_DBG(xe, !hwe)) { + drm_dbg(>_to_xe(gt)->drm, "Runalone engine not found!"); + return -EINVAL; + } + + active = active_hwe_to_exec_queue(hwe, &lrc_idx); + if (XE_IOCTL_DBG(xe, IS_ERR(active))) + return PTR_ERR(active); + + if (XE_IOCTL_DBG(xe, q != active)) { + xe_exec_queue_put(active); + return -EINVAL; + } + xe_exec_queue_put(active); + + if (XE_IOCTL_DBG(xe, lrc_idx >= q->width || q->lrc[lrc_idx] != lrc)) + return -EINVAL; + + fw_ref = xe_force_wake_get(gt_to_fw(gt), hwe->domain); + if (!fw_ref) + return -ETIMEDOUT; + + /* Additional check just before issuing MMIO writes */ + ret = __current_lrca(hwe, &lrc_hw); + if (ret) + goto put_fw; + + if (!lrca_equals(lower_32_bits(xe_lrc_descriptor(lrc)), lrc_hw)) { + ret = -EBUSY; + goto put_fw; + } + + td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL); + + /* Halt on next thread dispatch */ + if (!(td_ctl & TD_CTL_FORCE_EXTERNAL_HALT)) + xe_gt_mcr_multicast_write(gt, TD_CTL, + td_ctl | TD_CTL_FORCE_EXTERNAL_HALT); + else + eu_warn(d, "TD_CTL force external halt bit already set!\n"); + + /* + * The sleep is needed because some interrupts are ignored + * by the HW, hence we allow the HW some time to acknowledge + * that. + */ + usleep_range(100, 110); + + /* Halt regardless of thread dependencies */ + if (!(td_ctl & TD_CTL_FORCE_EXCEPTION)) + xe_gt_mcr_multicast_write(gt, TD_CTL, + td_ctl | TD_CTL_FORCE_EXCEPTION); + else + eu_warn(d, "TD_CTL force exception bit already set!\n"); + + usleep_range(100, 110); + + xe_gt_mcr_multicast_write(gt, TD_CTL, td_ctl & + ~(TD_CTL_FORCE_EXTERNAL_HALT | TD_CTL_FORCE_EXCEPTION)); + + /* + * In case of stopping wrong ctx emit warning. + * Nothing else we can do for now. 
+ */ + ret = __current_lrca(hwe, &lrc_hw); + if (ret || !lrca_equals(lower_32_bits(xe_lrc_descriptor(lrc)), lrc_hw)) + eu_warn(d, "xe_eudebug: interrupted the wrong context."); + +put_fw: + xe_force_wake_put(gt_to_fw(gt), fw_ref); + + return ret; +} + +struct ss_iter { + struct xe_eudebug *debugger; + unsigned int i; + + unsigned int size; + u8 *bits; +}; + +static int check_attn_mcr(struct xe_gt *gt, void *data, + u16 group, u16 instance) +{ + struct ss_iter *iter = data; + struct xe_eudebug *d = iter->debugger; + unsigned int row; + + for (row = 0; row < TD_EU_ATTENTION_MAX_ROWS; row++) { + u32 val, cur = 0; + + if (iter->i >= iter->size) + return 0; + + if (XE_WARN_ON((iter->i + sizeof(val)) > + (xe_gt_eu_attention_bitmap_size(gt)))) + return -EIO; + + memcpy(&val, &iter->bits[iter->i], sizeof(val)); + iter->i += sizeof(val); + + cur = xe_gt_mcr_unicast_read(gt, TD_ATT(row), group, instance); + + if ((val | cur) != cur) { + eu_dbg(d, + "WRONG CLEAR (%u:%u:%u) TD_CLR: 0x%08x; TD_ATT: 0x%08x\n", + group, instance, row, val, cur); + return -EINVAL; + } + } + + return 0; +} + +static int clear_attn_mcr(struct xe_gt *gt, void *data, + u16 group, u16 instance) +{ + struct ss_iter *iter = data; + struct xe_eudebug *d = iter->debugger; + unsigned int row; + + for (row = 0; row < TD_EU_ATTENTION_MAX_ROWS; row++) { + u32 val; + + if (iter->i >= iter->size) + return 0; + + if (XE_WARN_ON((iter->i + sizeof(val)) > + (xe_gt_eu_attention_bitmap_size(gt)))) + return -EIO; + + memcpy(&val, &iter->bits[iter->i], sizeof(val)); + iter->i += sizeof(val); + + if (!val) + continue; + + xe_gt_mcr_unicast_write(gt, TD_CLR(row), val, + group, instance); + + eu_dbg(d, + "TD_CLR: (%u:%u:%u): 0x%08x\n", + group, instance, row, val); + } + + return 0; +} + +static int xe_eu_control_resume(struct xe_eudebug *d, + struct xe_exec_queue *q, + struct xe_lrc *lrc, + u8 *bits, unsigned int bitmask_size) +{ + struct xe_device *xe = d->xe; + struct ss_iter iter = { + .debugger = d, + .i = 0, + .size = bitmask_size, + .bits = bits + }; + int ret = 0; + struct xe_exec_queue *active; + int lrc_idx; + + active = runalone_active_queue_get(q->gt, &lrc_idx); + if (IS_ERR(active)) + return PTR_ERR(active); + + if (XE_IOCTL_DBG(xe, q != active)) { + xe_exec_queue_put(active); + return -EBUSY; + } + xe_exec_queue_put(active); + + if (XE_IOCTL_DBG(xe, lrc_idx >= q->width || q->lrc[lrc_idx] != lrc)) + return -EBUSY; + + /* + * hsdes: 18021122357 + * We need to avoid clearing attention bits that are not set + * in order to avoid the EOT hang on PVC.
+ */ + if (GRAPHICS_VERx100(d->xe) == 1260) { + ret = xe_gt_foreach_dss_group_instance(q->gt, check_attn_mcr, &iter); + if (ret) + return ret; + + iter.i = 0; + } + + xe_gt_foreach_dss_group_instance(q->gt, clear_attn_mcr, &iter); + return 0; +} + +static int xe_eu_control_stopped(struct xe_eudebug *d, + struct xe_exec_queue *q, + struct xe_lrc *lrc, + u8 *bits, unsigned int bitmask_size) +{ + struct xe_device *xe = d->xe; + struct xe_exec_queue *active; + int lrc_idx; + + if (XE_WARN_ON(!q) || XE_WARN_ON(!q->gt)) + return -EINVAL; + + active = runalone_active_queue_get(q->gt, &lrc_idx); + if (IS_ERR(active)) + return PTR_ERR(active); + + if (active) { + if (XE_IOCTL_DBG(xe, q != active)) { + xe_exec_queue_put(active); + return -EBUSY; + } + + if (XE_IOCTL_DBG(xe, lrc_idx >= q->width || q->lrc[lrc_idx] != lrc)) { + xe_exec_queue_put(active); + return -EBUSY; + } + } + + xe_exec_queue_put(active); + + return xe_gt_eu_attention_bitmap(q->gt, bits, bitmask_size); +} + +static struct xe_eudebug_eu_control_ops eu_control = { + .interrupt_all = xe_eu_control_interrupt_all, + .stopped = xe_eu_control_stopped, + .resume = xe_eu_control_resume, +}; + static void discovery_work_fn(struct work_struct *work); static int @@ -1320,6 +1808,7 @@ xe_eudebug_connect(struct xe_device *xe, goto err_detach; } + d->ops = &eu_control; kref_get(&d->ref); queue_work(xe->eudebug.ordered_wq, &d->discovery_work); attention_scan_flush(xe); diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index 410b3ecccc12..e1d4e31b32ec 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -18,8 +18,12 @@ struct xe_device; struct task_struct; +struct xe_eudebug; struct xe_eudebug_event; +struct xe_hw_engine; struct workqueue_struct; +struct xe_exec_queue; +struct xe_lrc; #define CONFIG_DRM_XE_DEBUGGER_EVENT_QUEUE_SIZE 64 @@ -65,6 +69,24 @@ struct xe_eudebug_resources { struct xe_eudebug_resource rt[XE_EUDEBUG_RES_TYPE_COUNT]; }; +/** + * struct xe_eudebug_eu_control_ops - interface for eu thread + * state control backend + */ +struct xe_eudebug_eu_control_ops { + /** @interrupt_all: interrupts workload active on given hwe */ + int (*interrupt_all)(struct xe_eudebug *e, struct xe_exec_queue *q, + struct xe_lrc *lrc); + + /** @resume: resumes threads reflected by bitmask active on given hwe */ + int (*resume)(struct xe_eudebug *e, struct xe_exec_queue *q, + struct xe_lrc *lrc, u8 *bitmap, unsigned int bitmap_size); + + /** @stopped: returns bitmap reflecting threads which signal attention */ + int (*stopped)(struct xe_eudebug *e, struct xe_exec_queue *q, + struct xe_lrc *lrc, u8 *bitmap, unsigned int bitmap_size); +}; + /** * struct xe_eudebug - Top level struct for eudebug: the connection */ @@ -128,6 +150,8 @@ struct xe_eudebug { atomic_long_t seqno; } events; + /** @ops operations for eu_control */ + struct xe_eudebug_eu_control_ops *ops; }; /** diff --git a/drivers/gpu/drm/xe/xe_gt_debug.c b/drivers/gpu/drm/xe/xe_gt_debug.c index c4f0d11a20a6..f35b9df5e41b 100644 --- a/drivers/gpu/drm/xe/xe_gt_debug.c +++ b/drivers/gpu/drm/xe/xe_gt_debug.c @@ -13,12 +13,12 @@ #include "xe_pm.h" #include "xe_macros.h" -static int xe_gt_foreach_dss_group_instance(struct xe_gt *gt, - int (*fn)(struct xe_gt *gt, - void *data, - u16 group, - u16 instance), - void *data) +int xe_gt_foreach_dss_group_instance(struct xe_gt *gt, + int (*fn)(struct xe_gt *gt, + void *data, + u16 group, + u16 instance), + void *data) { const enum xe_force_wake_domains fw_domains = XE_FW_GT; 
unsigned int dss, fw_ref; diff --git a/drivers/gpu/drm/xe/xe_gt_debug.h b/drivers/gpu/drm/xe/xe_gt_debug.h index 3f13dbb17a5f..342082699ff6 100644 --- a/drivers/gpu/drm/xe/xe_gt_debug.h +++ b/drivers/gpu/drm/xe/xe_gt_debug.h @@ -13,6 +13,12 @@ #define XE_GT_ATTENTION_TIMEOUT_MS 100 int xe_gt_eu_threads_needing_attention(struct xe_gt *gt); +int xe_gt_foreach_dss_group_instance(struct xe_gt *gt, + int (*fn)(struct xe_gt *gt, + void *data, + u16 group, + u16 instance), + void *data); int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt); int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits, diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index 144c7cf888bb..ccfbe976c509 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -15,7 +15,8 @@ extern "C" { * * This ioctl is available in debug version 1. */ -#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0) +#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0) +#define DRM_XE_EUDEBUG_IOCTL_EU_CONTROL _IOWR('j', 0x2, struct drm_xe_eudebug_eu_control) /* XXX: Document events to match their internal counterparts when moved to xe_drm.h */ struct drm_xe_eudebug_event { @@ -91,6 +92,24 @@ struct drm_xe_eudebug_event_eu_attention { __u8 bitmask[]; }; +struct drm_xe_eudebug_eu_control { + __u64 client_handle; + +#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL 0 +#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED 1 +#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME 2 + __u32 cmd; + __u32 flags; + + __u64 seqno; + + __u64 exec_queue_handle; + __u64 lrc_handle; + __u32 reserved; + __u32 bitmask_size; + __u64 bitmask_ptr; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:33:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899782 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3A72FE77183 for ; Mon, 9 Dec 2024 13:33:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AE50310E756; Mon, 9 Dec 2024 13:33:22 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="CuJDu6ke"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5225910E754; Mon, 9 Dec 2024 13:33:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751200; x=1765287200; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hFtHYiTz4Wg9mN5ze3MaA4GSDjIEBcOOgID5jVSD85s=; b=CuJDu6ke90faJIsiqkwUf3x2GVPfDIfxd6qhLsKhpjqKSmeKospr+7w4 wCbO3E/cC9I5MeNdy5as9Lif1UTLMi95N3A8Z/mog6FyEV8+l8gbuF91W 7llnmq9lgT+Z+ru2hxx3KZ6ICZIU83CLMOXts7V5wdOP7XHQTaIs+eRGB aFxpmZSBixudZlF7c1a5g3HcFveo043JEBEDrDtun/O206WKXt85K3b8p Rt4P2ZCVhsui7lxizrFlY78EhYhjRJ9X7OnkWnAox1FChV7IBa0/KQ7DT G/ZBOrxZqo7IhxTQj8ulUFou6mFEFsMYbK7qIH06s07+Nr9HdQQGyiBhS w==; X-CSE-ConnectionGUID: tK+F6JlvSYGfOqreo+Qltw== X-CSE-MsgGUID: 7uS/3pc7TUC2qZ+TT36YzQ== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192012" 
X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192012"
Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:20 -0800
X-CSE-ConnectionGUID: A3656k0PQiiMhWY6E/RwBQ== X-CSE-MsgGUID: L+pA8hE/R7WTHyHdebooBQ== X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531298"
Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:19 -0800
From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Christoph Manszewski , Mika Kuoppala
Subject: [PATCH 10/26] drm/xe/eudebug: Add vm bind and vm bind ops
Date: Mon, 9 Dec 2024 15:33:01 +0200
Message-ID: <20241209133318.1806472-11-mika.kuoppala@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
MIME-Version: 1.0
X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel"
From: Christoph Manszewski
Add events dedicated to tracking vma bind and vma unbind operations. The events are generated for MAP and UNMAP operations performed on an xe_vma, for both BOs and userptrs. As one bind can result in multiple operations and can fail in the middle, store the events until the full chain of operations has succeeded and can be relayed to the debugger.
Signed-off-by: Christoph Manszewski Co-developed-by: Mika Kuoppala Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/xe/xe_eudebug.c | 318 +++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_eudebug.h | 13 ++
 drivers/gpu/drm/xe/xe_eudebug_types.h | 29 +++
 drivers/gpu/drm/xe/xe_vm.c | 16 +-
 drivers/gpu/drm/xe/xe_vm_types.h | 13 ++
 include/uapi/drm/xe_drm_eudebug.h | 64 ++++++
 6 files changed, 449 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 81d03a860b7f..f544f60d7d6b 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -792,7 +792,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_EU_ATTENTION; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -827,7 +827,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_EU_ATTENTION; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -2359,6 +2359,320 @@ void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) xe_eudebug_event_put(d, exec_queue_destroy_event(d, xef, q)); } +static int xe_eudebug_queue_bind_event(struct xe_eudebug *d, + struct xe_vm *vm, + struct xe_eudebug_event *event) +{ + struct xe_eudebug_event_envelope *env; + + lockdep_assert_held_write(&vm->lock); + + env = kmalloc(sizeof(*env), GFP_KERNEL); + if (!env) + return -ENOMEM; +
INIT_LIST_HEAD(&env->link); + env->event = event; + + spin_lock(&vm->eudebug.lock); + list_add_tail(&env->link, &vm->eudebug.events); + + if (event->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP) + ++vm->eudebug.ops; + spin_unlock(&vm->eudebug.lock); + + return 0; +} + +static int queue_vm_bind_event(struct xe_eudebug *d, + struct xe_vm *vm, + u64 client_handle, + u64 vm_handle, + u32 bind_flags, + u32 num_ops, u64 *seqno) +{ + struct xe_eudebug_event_vm_bind *e; + struct xe_eudebug_event *event; + const u32 sz = sizeof(*e); + const u32 base_flags = DRM_XE_EUDEBUG_EVENT_STATE_CHANGE; + + *seqno = atomic_long_inc_return(&d->events.seqno); + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND, + *seqno, base_flags, sz); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + write_member(struct drm_xe_eudebug_event_vm_bind, e, client_handle, client_handle); + write_member(struct drm_xe_eudebug_event_vm_bind, e, vm_handle, vm_handle); + write_member(struct drm_xe_eudebug_event_vm_bind, e, flags, bind_flags); + write_member(struct drm_xe_eudebug_event_vm_bind, e, num_binds, num_ops); + + /* If in discovery, no need to collect ops */ + if (!completion_done(&d->discovery)) { + XE_WARN_ON(!num_ops); + return xe_eudebug_queue_event(d, event); + } + + return xe_eudebug_queue_bind_event(d, vm, event); +} + +static int vm_bind_event(struct xe_eudebug *d, + struct xe_vm *vm, + u32 num_ops, + u64 *seqno) +{ + int h_c, h_vm; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, vm->xef); + if (h_c < 0) + return h_c; + + h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, vm); + if (h_vm < 0) + return h_vm; + + return queue_vm_bind_event(d, vm, h_c, h_vm, 0, + num_ops, seqno); +} + +static int vm_bind_op_event(struct xe_eudebug *d, + struct xe_vm *vm, + const u32 flags, + const u64 bind_ref_seqno, + const u64 num_extensions, + u64 addr, u64 range, + u64 *op_seqno) +{ + struct xe_eudebug_event_vm_bind_op *e; + struct xe_eudebug_event *event; + const u32 sz = sizeof(*e); + + *op_seqno = atomic_long_inc_return(&d->events.seqno); + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP, + *op_seqno, flags, sz); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_vm_bind_op, e, vm_bind_ref_seqno, bind_ref_seqno); + write_member(struct drm_xe_eudebug_event_vm_bind_op, e, num_extensions, num_extensions); + write_member(struct drm_xe_eudebug_event_vm_bind_op, e, addr, addr); + write_member(struct drm_xe_eudebug_event_vm_bind_op, e, range, range); + + /* If in discovery, no need to collect ops */ + if (!completion_done(&d->discovery)) + return xe_eudebug_queue_event(d, event); + + return xe_eudebug_queue_bind_event(d, vm, event); +} + +static int vm_bind_op(struct xe_eudebug *d, struct xe_vm *vm, + const u32 flags, const u64 bind_ref_seqno, + u64 addr, u64 range) +{ + u64 op_seqno = 0; + u64 num_extensions = 0; + int ret; + + ret = vm_bind_op_event(d, vm, flags, bind_ref_seqno, num_extensions, + addr, range, &op_seqno); + if (ret) + return ret; + + return 0; +} + +void xe_eudebug_vm_init(struct xe_vm *vm) +{ + INIT_LIST_HEAD(&vm->eudebug.events); + spin_lock_init(&vm->eudebug.lock); + vm->eudebug.ops = 0; + vm->eudebug.ref_seqno = 0; +} + +void xe_eudebug_vm_bind_start(struct xe_vm *vm) +{ + struct xe_eudebug *d; + u64 seqno = 0; + int err; + + if (!xe_vm_in_lr_mode(vm)) + return; + + d = xe_eudebug_get(vm->xef); + if (!d) + return; + + lockdep_assert_held_write(&vm->lock); + + if (XE_WARN_ON(!list_empty(&vm->eudebug.events)) 
|| + XE_WARN_ON(vm->eudebug.ops) || + XE_WARN_ON(vm->eudebug.ref_seqno)) { + eu_err(d, "bind busy on %s", __func__); + xe_eudebug_disconnect(d, -EINVAL); + } + + err = vm_bind_event(d, vm, 0, &seqno); + if (err) { + eu_err(d, "error %d on %s", err, __func__); + xe_eudebug_disconnect(d, err); + } + + spin_lock(&vm->eudebug.lock); + XE_WARN_ON(vm->eudebug.ref_seqno); + vm->eudebug.ref_seqno = seqno; + vm->eudebug.ops = 0; + spin_unlock(&vm->eudebug.lock); + + xe_eudebug_put(d); +} + +void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) +{ + struct xe_eudebug *d; + u32 flags; + + if (!xe_vm_in_lr_mode(vm)) + return; + + switch (op) { + case DRM_XE_VM_BIND_OP_MAP: + case DRM_XE_VM_BIND_OP_MAP_USERPTR: + { + flags = DRM_XE_EUDEBUG_EVENT_CREATE; + break; + } + case DRM_XE_VM_BIND_OP_UNMAP: + case DRM_XE_VM_BIND_OP_UNMAP_ALL: + flags = DRM_XE_EUDEBUG_EVENT_DESTROY; + break; + default: + flags = 0; + break; + } + + if (!flags) + return; + + d = xe_eudebug_get(vm->xef); + if (!d) + return; + + xe_eudebug_event_put(d, vm_bind_op(d, vm, flags, 0, addr, range)); +} + +static struct xe_eudebug_event *fetch_bind_event(struct xe_vm * const vm) +{ + struct xe_eudebug_event_envelope *env; + struct xe_eudebug_event *e = NULL; + + spin_lock(&vm->eudebug.lock); + env = list_first_entry_or_null(&vm->eudebug.events, + struct xe_eudebug_event_envelope, link); + if (env) { + e = env->event; + list_del(&env->link); + } + spin_unlock(&vm->eudebug.lock); + + kfree(env); + + return e; +} + +static void fill_vm_bind_fields(struct xe_vm *vm, + struct xe_eudebug_event *e, + bool ufence, + u32 bind_ops) +{ + struct xe_eudebug_event_vm_bind *eb = cast_event(eb, e); + + eb->flags = ufence ? + DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE : 0; + eb->num_binds = bind_ops; +} + +static void fill_vm_bind_op_fields(struct xe_vm *vm, + struct xe_eudebug_event *e, + u64 ref_seqno) +{ + struct xe_eudebug_event_vm_bind_op *op; + + if (e->type != DRM_XE_EUDEBUG_EVENT_VM_BIND_OP) + return; + + op = cast_event(op, e); + op->vm_bind_ref_seqno = ref_seqno; +} + +void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int bind_err) +{ + struct xe_eudebug_event *e; + struct xe_eudebug *d; + u32 bind_ops; + u64 ref; + + if (!xe_vm_in_lr_mode(vm)) + return; + + spin_lock(&vm->eudebug.lock); + ref = vm->eudebug.ref_seqno; + vm->eudebug.ref_seqno = 0; + bind_ops = vm->eudebug.ops; + vm->eudebug.ops = 0; + spin_unlock(&vm->eudebug.lock); + + e = fetch_bind_event(vm); + if (!e) + return; + + d = NULL; + if (!bind_err && ref) { + d = xe_eudebug_get(vm->xef); + if (d) { + if (bind_ops) { + fill_vm_bind_fields(vm, e, has_ufence, bind_ops); + } else { + /* + * If there was no ops we are interested in, + * we can omit the whole sequence + */ + xe_eudebug_put(d); + d = NULL; + } + } + } + + while (e) { + int err = 0; + + if (d) { + err = xe_eudebug_queue_event(d, e); + if (!err) + e = NULL; + } + + if (err) { + xe_eudebug_disconnect(d, err); + xe_eudebug_put(d); + d = NULL; + } + + kfree(e); + + e = fetch_bind_event(vm); + if (e && ref) + fill_vm_bind_op_fields(vm, e, ref); + } + + if (d) + xe_eudebug_put(d); +} + static int discover_client(struct xe_eudebug *d, struct xe_file *xef) { struct xe_exec_queue *q; diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index 1fe86bec99e1..ccc7202b3308 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -5,11 +5,14 @@ #ifndef _XE_EUDEBUG_H_ +#include + struct drm_device; struct drm_file; struct xe_device; struct xe_file; 
struct xe_vm; +struct xe_vma; struct xe_exec_queue; struct xe_hw_engine; @@ -33,6 +36,11 @@ void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm); void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q); void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q); +void xe_eudebug_vm_init(struct xe_vm *vm); +void xe_eudebug_vm_bind_start(struct xe_vm *vm); +void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range); +void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int err); + #else static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, @@ -53,6 +61,11 @@ static inline void xe_eudebug_vm_destroy(struct xe_file *xef, struct xe_vm *vm) static inline void xe_eudebug_exec_queue_create(struct xe_file *xef, struct xe_exec_queue *q) { } static inline void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) { } +static inline void xe_eudebug_vm_init(struct xe_vm *vm) { } +static inline void xe_eudebug_vm_bind_start(struct xe_vm *vm) { } +static inline void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) { } +static inline void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int err) { } + #endif /* CONFIG_DRM_XE_EUDEBUG */ #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index e1d4e31b32ec..cbc316ec3593 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -177,6 +177,11 @@ struct xe_eudebug_event { u8 data[]; }; +struct xe_eudebug_event_envelope { + struct list_head link; + struct xe_eudebug_event *event; +}; + /** * struct xe_eudebug_event_open - Internal event for client open/close */ @@ -284,4 +289,28 @@ struct xe_eudebug_event_eu_attention { u8 bitmask[] __counted_by(bitmask_size); }; +/** + * struct xe_eudebug_event_vm_bind - Internal event for vm bind/unbind operation + */ +struct xe_eudebug_event_vm_bind { + /** @base: base event */ + struct xe_eudebug_event base; + + u64 client_handle; + u64 vm_handle; + + u32 flags; + u32 num_binds; +}; + +struct xe_eudebug_event_vm_bind_op { + /** @base: base event */ + struct xe_eudebug_event base; + u64 vm_bind_ref_seqno; + u64 num_extensions; + + u64 addr; /* Zero for unmap all ? */ + u64 range; /* Zero for unmap all ? */ +}; + #endif diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 6f16049f4f6e..e83420473763 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -1413,6 +1413,8 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) for_each_tile(tile, xe, id) xe_range_fence_tree_init(&vm->rftree[id]); + xe_eudebug_vm_init(vm); + vm->pt_ops = &xelp_pt_ops; /* @@ -1641,6 +1643,8 @@ static void vm_destroy_work_func(struct work_struct *w) struct xe_tile *tile; u8 id; + xe_eudebug_vm_bind_end(vm, 0, -ENOENT); + /* xe_vm_close_and_put was not called? 
*/ xe_assert(xe, !vm->size); @@ -2651,7 +2655,7 @@ static void vm_bind_ioctl_ops_fini(struct xe_vm *vm, struct xe_vma_ops *vops, struct dma_fence *fence) { struct xe_exec_queue *wait_exec_queue = to_wait_exec_queue(vm, vops->q); - struct xe_user_fence *ufence; + struct xe_user_fence *ufence = NULL; struct xe_vma_op *op; int i; @@ -2666,6 +2670,9 @@ static void vm_bind_ioctl_ops_fini(struct xe_vm *vm, struct xe_vma_ops *vops, xe_vma_destroy(gpuva_to_vma(op->base.remap.unmap->va), fence); } + + xe_eudebug_vm_bind_end(vm, ufence, 0); + if (ufence) xe_sync_ufence_put(ufence); for (i = 0; i < vops->num_syncs; i++) @@ -3078,6 +3085,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file) if (err) goto unwind_ops; + xe_eudebug_vm_bind_op_add(vm, op, addr, range); + #ifdef TEST_VM_OPS_ERROR if (flags & FORCE_OP_ERROR) { vops.inject_error = true; @@ -3101,8 +3110,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file) err = vm_bind_ioctl_ops_execute(vm, &vops); unwind_ops: - if (err && err != -ENODATA) + if (err && err != -ENODATA) { + xe_eudebug_vm_bind_end(vm, num_ufence > 0, err); vm_bind_ioctl_ops_unwind(vm, ops, args->num_binds); + } + xe_vma_ops_fini(&vops); for (i = args->num_binds - 1; i >= 0; --i) if (ops[i]) diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index 7f9a303e51d8..557b047ebdd7 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -282,6 +282,19 @@ struct xe_vm { bool batch_invalidate_tlb; /** @xef: XE file handle for tracking this VM's drm client */ struct xe_file *xef; + +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + struct { + /** @lock: Lock for eudebug_bind members */ + spinlock_t lock; + /** @events: List of vm bind ops gathered */ + struct list_head events; + /** @ops: How many operations we have stored */ + u32 ops; + /** @ref_seqno: Reference to the VM_BIND that the ops relate */ + u64 ref_seqno; + } eudebug; +#endif }; /** struct xe_vma_op_map - VMA map operation */ diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index ccfbe976c509..cc34c522fa4d 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -30,6 +30,8 @@ struct drm_xe_eudebug_event { #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE 4 #define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE_PLACEMENTS 5 #define DRM_XE_EUDEBUG_EVENT_EU_ATTENTION 6 +#define DRM_XE_EUDEBUG_EVENT_VM_BIND 7 +#define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP 8 __u16 flags; #define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) @@ -110,6 +112,68 @@ struct drm_xe_eudebug_eu_control { __u64 bitmask_ptr; }; +/* + * When client (debuggee) does vm_bind_ioctl() following event + * sequence will be created (for the debugger): + * + * ┌───────────────────────┐ + * │ EVENT_VM_BIND ├───────┬─┬─┐ + * └───────────────────────┘ │ │ │ + * ┌───────────────────────┐ │ │ │ + * │ EVENT_VM_BIND_OP #1 ├───┘ │ │ + * └───────────────────────┘ │ │ + * ... │ │ + * ┌───────────────────────┐ │ │ + * │ EVENT_VM_BIND_OP #n ├─────┘ │ + * └───────────────────────┘ │ + * │ + * ┌───────────────────────┐ │ + * │ EVENT_UFENCE ├───────┘ + * └───────────────────────┘ + * + * All the events below VM_BIND will reference the VM_BIND + * they associate with, by field .vm_bind_ref_seqno. + * event_ufence will only be included if the client did + * attach sync of type UFENCE into its vm_bind_ioctl(). 
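+ *
+ * As an illustrative (non-normative) example: for one vm_bind_ioctl()
+ * carrying two map operations and a UFENCE sync, the debugger reads one
+ * VM_BIND event with num_binds=2 and the UFENCE flag set, then two
+ * VM_BIND_OP events whose vm_bind_ref_seqno equals that VM_BIND event's
+ * seqno, and finally one UFENCE event which it must ack before the
+ * client's wait on the ufence can complete.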
+ * + * When EVENT_UFENCE is sent by the driver, all the OPs of + * the original VM_BIND are completed and the [addr,range] + * contained in them are present and modifiable through the + * vm accessors. Accessing [addr, range] before related ufence + * event will lead to undefined results as the actual bind + * operations are async and the backing storage might not + * be there on a moment of receiving the event. + * + * Client's UFENCE sync will be held by the driver: client's + * drm_xe_wait_ufence will not complete and the value of the ufence + * won't appear until ufence is acked by the debugger process calling + * DRM_XE_EUDEBUG_IOCTL_ACK_EVENT with the event_ufence.base.seqno. + * This will signal the fence, .value will update and the wait will + * complete allowing the client to continue. + * + */ + +struct drm_xe_eudebug_event_vm_bind { + struct drm_xe_eudebug_event base; + + __u64 client_handle; + __u64 vm_handle; + + __u32 flags; +#define DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE (1 << 0) + + __u32 num_binds; +}; + +struct drm_xe_eudebug_event_vm_bind_op { + struct drm_xe_eudebug_event base; + __u64 vm_bind_ref_seqno; /* *_event_vm_bind.base.seqno */ + __u64 num_extensions; + + __u64 addr; /* XXX: Zero for unmap all? */ + __u64 range; /* XXX: Zero for unmap all? */ +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:33:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899783 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 32C18E7717D for ; Mon, 9 Dec 2024 13:33:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A6E7910E757; Mon, 9 Dec 2024 13:33:24 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="VkpY764T"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id C712210E740; Mon, 9 Dec 2024 13:33:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751202; x=1765287202; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HZgX9d74DzS1vu/EFKHILvX8BMImEWY8w7YqRUH9pXY=; b=VkpY764THOW7A9Z7YXS3a8rVXm8Fqdke/q6iXic6Sh7eD+07AkqkSUMP uu0OZLybQ3StuNR77fCn0fDFh9JB0jcF1KRlZShlZ6/ZqUTPWts1O9svB 9Q282XKFUaDBpXFDCXUqsOOsf6F7BJKbtMEV2yTsRbsFicQ2H/TV0B7wG PfMSubTsS3lFZgezvnfkIi8+4gTXtBtwKTAuZE0H3txVs6BnkjRStQpFg ej2wnkseiV3ZTqomir3Ih1S75KGNyPEGQ2br8NxstbnJhlHnANFyvY86l bejkbv2dUwfSEsi/4qwk9hWAYBUAsUtDVy6KX6uZ72UqwwpktUiGiFqnK A==; X-CSE-ConnectionGUID: lzSMSFiwQwWNdw11idmFUg== X-CSE-MsgGUID: fjCQm7rlSHyK6pai+hs3UA== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192023" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192023" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:21 -0800 X-CSE-ConnectionGUID: fxymClcxQeCakU2NMj1NnQ== X-CSE-MsgGUID: N6uD/HWNQ7+u5qJHW2k0Rg== X-ExtLoop1: 1 X-IronPort-AV: 
E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531304" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:20 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Mika Kuoppala , Andrzej Hajda Subject: [PATCH 11/26] drm/xe/eudebug: Add UFENCE events with acks Date: Mon, 9 Dec 2024 15:33:02 +0200 Message-ID: <20241209133318.1806472-12-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" When vma is in place, debugger needs to intercept before userspace proceeds with the workload. For example to install a breakpoint in a eu shader. Attach debugger in xe_user_fence, send UFENCE event and stall normal user fence signal path to yield if there is debugger attached to ufence. When ack (ioctl) is received for the corresponding seqno, signal ufence. v2: - return err instead of 0 to guarantee signalling (Dominik) - checkpatch (Tilak) - Kconfig (Mika, Andrzej) - use lock instead of cmpxchg (Mika) Signed-off-by: Mika Kuoppala Signed-off-by: Andrzej Hajda --- drivers/gpu/drm/xe/xe_eudebug.c | 283 +++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug.h | 16 ++ drivers/gpu/drm/xe/xe_eudebug_types.h | 13 ++ drivers/gpu/drm/xe/xe_exec.c | 2 +- drivers/gpu/drm/xe/xe_oa.c | 3 +- drivers/gpu/drm/xe/xe_sync.c | 45 ++-- drivers/gpu/drm/xe/xe_sync.h | 8 +- drivers/gpu/drm/xe/xe_sync_types.h | 28 ++- drivers/gpu/drm/xe/xe_vm.c | 4 +- include/uapi/drm/xe_drm_eudebug.h | 13 ++ 10 files changed, 385 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index f544f60d7d6b..3cf3616e546d 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -32,6 +32,7 @@ #include "xe_reg_sr.h" #include "xe_rtp.h" #include "xe_sched_job.h" +#include "xe_sync.h" #include "xe_vm.h" #include "xe_wa.h" @@ -239,11 +240,119 @@ static void xe_eudebug_free(struct kref *ref) kfree_rcu(d, rcu); } -static void xe_eudebug_put(struct xe_eudebug *d) +void xe_eudebug_put(struct xe_eudebug *d) { kref_put(&d->ref, xe_eudebug_free); } +struct xe_eudebug_ack { + struct rb_node rb_node; + u64 seqno; + u64 ts_insert; + struct xe_user_fence *ufence; +}; + +#define fetch_ack(x) rb_entry(x, struct xe_eudebug_ack, rb_node) + +static int compare_ack(const u64 a, const u64 b) +{ + if (a < b) + return -1; + else if (a > b) + return 1; + + return 0; +} + +static int ack_insert_cmp(struct rb_node * const node, + const struct rb_node * const p) +{ + return compare_ack(fetch_ack(node)->seqno, + fetch_ack(p)->seqno); +} + +static int ack_lookup_cmp(const void * const key, + const struct rb_node * const node) +{ + return compare_ack(*(const u64 *)key, + fetch_ack(node)->seqno); +} + +static struct xe_eudebug_ack *remove_ack(struct xe_eudebug *d, u64 seqno) +{ + struct rb_root * const root = &d->acks.tree; + struct rb_node *node; + + spin_lock(&d->acks.lock); + node = rb_find(&seqno, root, ack_lookup_cmp); + if (node) + rb_erase(node, root); + 
spin_unlock(&d->acks.lock); + + if (!node) + return NULL; + + return rb_entry_safe(node, struct xe_eudebug_ack, rb_node); +} + +static void ufence_signal_worker(struct work_struct *w) +{ + struct xe_user_fence * const ufence = + container_of(w, struct xe_user_fence, eudebug.worker); + + if (READ_ONCE(ufence->signalled)) + xe_sync_ufence_signal(ufence); + + xe_sync_ufence_put(ufence); +} + +static void kick_ufence_worker(struct xe_user_fence *f) +{ + queue_work(f->xe->eudebug.ordered_wq, &f->eudebug.worker); +} + +static void handle_ack(struct xe_eudebug *d, struct xe_eudebug_ack *ack, + bool on_disconnect) +{ + struct xe_user_fence *f = ack->ufence; + u64 signalled_by; + bool signal = false; + + spin_lock(&f->eudebug.lock); + if (!f->eudebug.signalled_seqno) { + f->eudebug.signalled_seqno = ack->seqno; + signal = true; + } + signalled_by = f->eudebug.signalled_seqno; + spin_unlock(&f->eudebug.lock); + + if (signal) + kick_ufence_worker(f); + else + xe_sync_ufence_put(f); + + eu_dbg(d, "ACK: seqno=%llu: signalled by %llu (%s) (held %lluus)", + ack->seqno, signalled_by, + on_disconnect ? "disconnect" : "debugger", + ktime_us_delta(ktime_get(), ack->ts_insert)); + + kfree(ack); +} + +static void release_acks(struct xe_eudebug *d) +{ + struct xe_eudebug_ack *ack, *n; + struct rb_root root; + + spin_lock(&d->acks.lock); + root = d->acks.tree; + d->acks.tree = RB_ROOT; + spin_unlock(&d->acks.lock); + + rbtree_postorder_for_each_entry_safe(ack, n, &root, rb_node) + handle_ack(d, ack, true); +} + static struct task_struct *find_get_target(const pid_t nr) { struct task_struct *task; @@ -328,6 +437,8 @@ static bool xe_eudebug_detach(struct xe_device *xe, eu_dbg(d, "session %lld detached with %d", d->session, err); + release_acks(d); + /* Our ref with the connection_link */ xe_eudebug_put(d); @@ -453,7 +564,7 @@ _xe_eudebug_get(struct xe_file *xef) return d; } -static struct xe_eudebug * +struct xe_eudebug * xe_eudebug_get(struct xe_file *xef) { struct xe_eudebug *d; @@ -792,7 +903,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -827,7 +938,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -902,6 +1013,44 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, return ret; } +static long +xe_eudebug_ack_event_ioctl(struct xe_eudebug *d, + const unsigned int cmd, + const u64 arg) +{ + struct drm_xe_eudebug_ack_event __user * const user_ptr = + u64_to_user_ptr(arg); + struct drm_xe_eudebug_ack_event user_arg; + struct xe_eudebug_ack *ack; + struct xe_device *xe = d->xe; + + if (XE_IOCTL_DBG(xe, _IOC_SIZE(cmd) < sizeof(user_arg))) + return -EINVAL; + + /* Userland write */ + if (XE_IOCTL_DBG(xe, !(_IOC_DIR(cmd) & _IOC_WRITE))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, copy_from_user(&user_arg, + user_ptr, + sizeof(user_arg)))) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, user_arg.flags)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, xe_eudebug_detached(d))) + return -ENOTCONN; + + ack = 
remove_ack(d, user_arg.seqno); + if (XE_IOCTL_DBG(xe, !ack)) + return -EINVAL; + + handle_ack(d, ack, false); + + return 0; +} + static int do_eu_control(struct xe_eudebug *d, const struct drm_xe_eudebug_eu_control * const arg, struct drm_xe_eudebug_eu_control __user * const user_ptr) @@ -1093,7 +1242,10 @@ static long xe_eudebug_ioctl(struct file *file, ret = xe_eudebug_eu_control(d, arg); eu_dbg(d, "ioctl cmd=EU_CONTROL ret=%ld\n", ret); break; - + case DRM_XE_EUDEBUG_IOCTL_ACK_EVENT: + ret = xe_eudebug_ack_event_ioctl(d, cmd, arg); + eu_dbg(d, "ioctl cmd=EVENT_ACK ret=%ld\n", ret); + break; default: ret = -EINVAL; } @@ -1792,6 +1944,9 @@ xe_eudebug_connect(struct xe_device *xe, INIT_KFIFO(d->events.fifo); INIT_WORK(&d->discovery_work, discovery_work_fn); + spin_lock_init(&d->acks.lock); + d->acks.tree = RB_ROOT; + d->res = xe_eudebug_resources_alloc(); if (IS_ERR(d->res)) { err = PTR_ERR(d->res); @@ -2486,6 +2641,70 @@ static int vm_bind_op(struct xe_eudebug *d, struct xe_vm *vm, return 0; } +static int xe_eudebug_track_ufence(struct xe_eudebug *d, + struct xe_user_fence *f, + u64 seqno) +{ + struct xe_eudebug_ack *ack; + struct rb_node *old; + + ack = kzalloc(sizeof(*ack), GFP_KERNEL); + if (!ack) + return -ENOMEM; + + ack->seqno = seqno; + ack->ts_insert = ktime_get(); + + spin_lock(&d->acks.lock); + old = rb_find_add(&ack->rb_node, + &d->acks.tree, ack_insert_cmp); + if (!old) { + kref_get(&f->refcount); + ack->ufence = f; + } + spin_unlock(&d->acks.lock); + + if (old) { + eu_dbg(d, "ACK: seqno=%llu: already exists", seqno); + kfree(ack); + return -EEXIST; + } + + eu_dbg(d, "ACK: seqno=%llu: tracking started", seqno); + + return 0; +} + +static int vm_bind_ufence_event(struct xe_eudebug *d, + struct xe_user_fence *ufence) +{ + struct xe_eudebug_event *event; + struct xe_eudebug_event_vm_bind_ufence *e; + const u32 sz = sizeof(*e); + const u32 flags = DRM_XE_EUDEBUG_EVENT_CREATE | + DRM_XE_EUDEBUG_EVENT_NEED_ACK; + u64 seqno; + int ret; + + seqno = atomic_long_inc_return(&d->events.seqno); + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE, + seqno, flags, sz); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_vm_bind_ufence, + e, vm_bind_ref_seqno, ufence->eudebug.bind_ref_seqno); + + ret = xe_eudebug_track_ufence(d, ufence, seqno); + if (!ret) + ret = xe_eudebug_queue_event(d, event); + + return ret; +} + void xe_eudebug_vm_init(struct xe_vm *vm) { INIT_LIST_HEAD(&vm->eudebug.events); @@ -2673,6 +2892,24 @@ void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int bind_err) xe_eudebug_put(d); } +int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence) +{ + struct xe_eudebug *d; + int err; + + d = ufence->eudebug.debugger; + if (!d || xe_eudebug_detached(d)) + return -ENOTCONN; + + err = vm_bind_ufence_event(d, ufence); + if (err) { + eu_err(d, "error %d on %s", err, __func__); + xe_eudebug_disconnect(d, err); + } + + return err; +} + static int discover_client(struct xe_eudebug *d, struct xe_file *xef) { struct xe_exec_queue *q; @@ -2765,3 +3002,39 @@ static void discovery_work_fn(struct work_struct *work) xe_eudebug_put(d); } + +void xe_eudebug_ufence_init(struct xe_user_fence *ufence, + struct xe_file *xef, + struct xe_vm *vm) +{ + u64 bind_ref; + + /* Drop if OA */ + if (!vm) + return; + + spin_lock(&vm->eudebug.lock); + bind_ref = vm->eudebug.ref_seqno; + spin_unlock(&vm->eudebug.lock); + + spin_lock_init(&ufence->eudebug.lock); + INIT_WORK(&ufence->eudebug.worker, ufence_signal_worker); 
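+	/* Deferred signal path: handle_ack() queues ufence_signal_worker() via kick_ufence_worker() */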
+ + ufence->eudebug.signalled_seqno = 0; + + if (bind_ref) { + ufence->eudebug.debugger = xe_eudebug_get(xef); + + if (ufence->eudebug.debugger) + ufence->eudebug.bind_ref_seqno = bind_ref; + } +} + +void xe_eudebug_ufence_fini(struct xe_user_fence *ufence) +{ + if (!ufence->eudebug.debugger) + return; + + xe_eudebug_put(ufence->eudebug.debugger); + ufence->eudebug.debugger = NULL; +} diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index ccc7202b3308..13ba0167b31b 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -15,6 +15,7 @@ struct xe_vm; struct xe_vma; struct xe_exec_queue; struct xe_hw_engine; +struct xe_user_fence; #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) @@ -41,6 +42,13 @@ void xe_eudebug_vm_bind_start(struct xe_vm *vm); void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range); void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int err); +int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence); +void xe_eudebug_ufence_init(struct xe_user_fence *ufence, struct xe_file *xef, struct xe_vm *vm); +void xe_eudebug_ufence_fini(struct xe_user_fence *ufence); + +struct xe_eudebug *xe_eudebug_get(struct xe_file *xef); +void xe_eudebug_put(struct xe_eudebug *d); + #else static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, @@ -66,6 +74,14 @@ static inline void xe_eudebug_vm_bind_start(struct xe_vm *vm) { } static inline void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) { } static inline void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int err) { } +static inline int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence) { return 0; } +static inline void xe_eudebug_ufence_init(struct xe_user_fence *ufence, + struct xe_file *xef, struct xe_vm *vm) { } +static inline void xe_eudebug_ufence_fini(struct xe_user_fence *ufence) { } + +static inline struct xe_eudebug *xe_eudebug_get(struct xe_file *xef) { return NULL; } +static inline void xe_eudebug_put(struct xe_eudebug *d) { } + #endif /* CONFIG_DRM_XE_EUDEBUG */ #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index cbc316ec3593..ffb0dc71430a 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -150,6 +150,14 @@ struct xe_eudebug { atomic_long_t seqno; } events; + /* user fences tracked by this debugger */ + struct { + /** @lock: guards access to tree */ + spinlock_t lock; + + struct rb_root tree; + } acks; + /** @ops operations for eu_control */ struct xe_eudebug_eu_control_ops *ops; }; @@ -313,4 +321,9 @@ struct xe_eudebug_event_vm_bind_op { u64 range; /* Zero for unmap all ? */ }; +struct xe_eudebug_event_vm_bind_ufence { + struct xe_eudebug_event base; + u64 vm_bind_ref_seqno; +}; + #endif diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c index 31cca938956f..17dd7a3f8354 100644 --- a/drivers/gpu/drm/xe/xe_exec.c +++ b/drivers/gpu/drm/xe/xe_exec.c @@ -159,7 +159,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file) vm = q->vm; for (num_syncs = 0; num_syncs < args->num_syncs; num_syncs++) { - err = xe_sync_entry_parse(xe, xef, &syncs[num_syncs], + err = xe_sync_entry_parse(xe, xef, vm, &syncs[num_syncs], &syncs_user[num_syncs], SYNC_PARSE_FLAG_EXEC | (xe_vm_in_lr_mode(vm) ? 
SYNC_PARSE_FLAG_LR_MODE : 0)); diff --git a/drivers/gpu/drm/xe/xe_oa.c index 8dd55798ab31..a32dc3fdabe7 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -1379,7 +1379,8 @@ static int xe_oa_parse_syncs(struct xe_oa *oa, struct xe_oa_open_param *param) } for (num_syncs = 0; num_syncs < param->num_syncs; num_syncs++) { - ret = xe_sync_entry_parse(oa->xe, param->xef, &param->syncs[num_syncs], + ret = xe_sync_entry_parse(oa->xe, param->xef, NULL, + &param->syncs[num_syncs], &param->syncs_user[num_syncs], 0); if (ret) goto err_syncs; diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c index 42f5bebd09e5..3e7398983b52 100644 --- a/drivers/gpu/drm/xe/xe_sync.c +++ b/drivers/gpu/drm/xe/xe_sync.c @@ -15,27 +15,20 @@ #include #include "xe_device_types.h" +#include "xe_eudebug.h" #include "xe_exec_queue.h" #include "xe_macros.h" #include "xe_sched_job_types.h" -struct xe_user_fence { - struct xe_device *xe; - struct kref refcount; - struct dma_fence_cb cb; - struct work_struct worker; - struct mm_struct *mm; - u64 __user *addr; - u64 value; - int signalled; -}; - static void user_fence_destroy(struct kref *kref) { struct xe_user_fence *ufence = container_of(kref, struct xe_user_fence, refcount); mmdrop(ufence->mm); + + xe_eudebug_ufence_fini(ufence); + kfree(ufence); } @@ -49,7 +42,10 @@ static void user_fence_put(struct xe_user_fence *ufence) { kref_put(&ufence->refcount, user_fence_destroy); } -static struct xe_user_fence *user_fence_create(struct xe_device *xe, u64 addr, +static struct xe_user_fence *user_fence_create(struct xe_device *xe, + struct xe_file *xef, + struct xe_vm *vm, + u64 addr, u64 value) { struct xe_user_fence *ufence; @@ -70,12 +66,14 @@ static struct xe_user_fence *user_fence_create(struct xe_device *xe, u64 addr, ufence->mm = current->mm; mmgrab(ufence->mm); + xe_eudebug_ufence_init(ufence, xef, vm); + return ufence; } -static void user_fence_worker(struct work_struct *w) +void xe_sync_ufence_signal(struct xe_user_fence *ufence) { - struct xe_user_fence *ufence = container_of(w, struct xe_user_fence, worker); + XE_WARN_ON(!ufence->signalled); if (mmget_not_zero(ufence->mm)) { kthread_use_mm(ufence->mm); @@ -87,12 +85,25 @@ static void user_fence_worker(struct work_struct *w) drm_dbg(&ufence->xe->drm, "mmget_not_zero() failed, ufence wasn't signaled\n"); } + wake_up_all(&ufence->xe->ufence_wq); +} + +static void user_fence_worker(struct work_struct *w) +{ + struct xe_user_fence *ufence = container_of(w, struct xe_user_fence, worker); + int ret; + /* * Wake up waiters only after updating the ufence state, allowing the UMD * to safely reuse the same ufence without encountering -EBUSY errors.
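 * With a debugger attached, the wake-up may be deferred further still:
 * xe_eudebug_vm_bind_ufence() below queues a VM_BIND_UFENCE event and
 * the ufence is then signalled from ufence_signal_worker() once the
 * debugger acks that event (see handle_ack()), or immediately if no
 * debugger is tracking this fence.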
*/ WRITE_ONCE(ufence->signalled, 1); - wake_up_all(&ufence->xe->ufence_wq); + + /* Lets see if debugger wants to track this */ + ret = xe_eudebug_vm_bind_ufence(ufence); + if (ret) + xe_sync_ufence_signal(ufence); + user_fence_put(ufence); } @@ -111,6 +122,7 @@ static void user_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb) } int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef, + struct xe_vm *vm, struct xe_sync_entry *sync, struct drm_xe_sync __user *sync_user, unsigned int flags) @@ -192,7 +204,8 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef, if (exec) { sync->addr = sync_in.addr; } else { - sync->ufence = user_fence_create(xe, sync_in.addr, + sync->ufence = user_fence_create(xe, xef, vm, + sync_in.addr, sync_in.timeline_value); if (XE_IOCTL_DBG(xe, IS_ERR(sync->ufence))) return PTR_ERR(sync->ufence); diff --git a/drivers/gpu/drm/xe/xe_sync.h b/drivers/gpu/drm/xe/xe_sync.h index 256ffc1e54dc..f5bec2b1b4f6 100644 --- a/drivers/gpu/drm/xe/xe_sync.h +++ b/drivers/gpu/drm/xe/xe_sync.h @@ -9,8 +9,12 @@ #include "xe_sync_types.h" struct xe_device; -struct xe_exec_queue; struct xe_file; +struct xe_exec_queue; +struct drm_syncobj; +struct dma_fence; +struct dma_fence_chain; +struct drm_xe_sync; struct xe_sched_job; struct xe_vm; @@ -19,6 +23,7 @@ struct xe_vm; #define SYNC_PARSE_FLAG_DISALLOW_USER_FENCE BIT(2) int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef, + struct xe_vm *vm, struct xe_sync_entry *sync, struct drm_xe_sync __user *sync_user, unsigned int flags); @@ -40,5 +45,6 @@ struct xe_user_fence *__xe_sync_ufence_get(struct xe_user_fence *ufence); struct xe_user_fence *xe_sync_ufence_get(struct xe_sync_entry *sync); void xe_sync_ufence_put(struct xe_user_fence *ufence); int xe_sync_ufence_get_status(struct xe_user_fence *ufence); +void xe_sync_ufence_signal(struct xe_user_fence *ufence); #endif diff --git a/drivers/gpu/drm/xe/xe_sync_types.h b/drivers/gpu/drm/xe/xe_sync_types.h index 30ac3f51993b..dcd3165e66a7 100644 --- a/drivers/gpu/drm/xe/xe_sync_types.h +++ b/drivers/gpu/drm/xe/xe_sync_types.h @@ -6,13 +6,31 @@ #ifndef _XE_SYNC_TYPES_H_ #define _XE_SYNC_TYPES_H_ +#include +#include +#include #include -struct drm_syncobj; -struct dma_fence; -struct dma_fence_chain; -struct drm_xe_sync; -struct user_fence; +struct xe_user_fence { + struct xe_device *xe; + struct kref refcount; + struct dma_fence_cb cb; + struct work_struct worker; + struct mm_struct *mm; + u64 __user *addr; + u64 value; + int signalled; + +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + struct { + spinlock_t lock; + struct xe_eudebug *debugger; + u64 bind_ref_seqno; + u64 signalled_seqno; + struct work_struct worker; + } eudebug; +#endif +}; struct xe_sync_entry { struct drm_syncobj *syncobj; diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index e83420473763..0f17bc8b627b 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3037,9 +3037,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file) } } + xe_eudebug_vm_bind_start(vm); + syncs_user = u64_to_user_ptr(args->syncs); for (num_syncs = 0; num_syncs < args->num_syncs; num_syncs++) { - err = xe_sync_entry_parse(xe, xef, &syncs[num_syncs], + err = xe_sync_entry_parse(xe, xef, vm, &syncs[num_syncs], &syncs_user[num_syncs], (xe_vm_in_lr_mode(vm) ? 
SYNC_PARSE_FLAG_LR_MODE : 0) | diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index cc34c522fa4d..1d5f1411c9a8 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -17,6 +17,7 @@ extern "C" { */ #define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0) #define DRM_XE_EUDEBUG_IOCTL_EU_CONTROL _IOWR('j', 0x2, struct drm_xe_eudebug_eu_control) +#define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT _IOW('j', 0x4, struct drm_xe_eudebug_ack_event) /* XXX: Document events to match their internal counterparts when moved to xe_drm.h */ struct drm_xe_eudebug_event { @@ -32,6 +33,7 @@ struct drm_xe_eudebug_event { #define DRM_XE_EUDEBUG_EVENT_EU_ATTENTION 6 #define DRM_XE_EUDEBUG_EVENT_VM_BIND 7 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP 8 +#define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE 9 __u16 flags; #define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) @@ -174,6 +176,17 @@ struct drm_xe_eudebug_event_vm_bind_op { __u64 range; /* XXX: Zero for unmap all? */ }; +struct drm_xe_eudebug_event_vm_bind_ufence { + struct drm_xe_eudebug_event base; + __u64 vm_bind_ref_seqno; /* *_event_vm_bind.base.seqno */ +}; + +struct drm_xe_eudebug_ack_event { + __u32 type; + __u32 flags; /* MBZ */ + __u64 seqno; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:33:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899784 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AB53FE77181 for ; Mon, 9 Dec 2024 13:33:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1F5D010E751; Mon, 9 Dec 2024 13:33:26 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="bA5ExT0T"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7C0EA10E74A; Mon, 9 Dec 2024 13:33:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751203; x=1765287203; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vW2jU+TG8pKh5UTyEwQzL4B/at6q2kPTqElrdUcNAEM=; b=bA5ExT0TgbVMyKdzR60U1+hoOJ99rQRaUAE55dlAaHq4eEEQrglAZIiz PNBldwToeWqbqQlpcHdnfOiR6v9vr4I0jimMW4XpSCl3FZnw24MclqniN ZZPHXZM8L0f49Cid5SxK4yPDhlxfHOqAm90lWnkHR1AAqMK++ZmnWbfv+ Qjl5tU5Fnu4KcOc3b3NVmHpwxROGHUO+qf1EIn3pml3lFcdAjNzSVVjzY YKtg4GKOMQVQ/5gX6FRdOHSZ7uPCvAZj2jMeLWpWd1WXewJEnkQXeYmQW wZ71bGUoRot1gTjAwSbg/3Bs3x9uxWGJCIAtvuuL6x6BKeAXJjSD3GciR Q==; X-CSE-ConnectionGUID: TDH0Zoe7Rzqbo7vEqw2c9Q== X-CSE-MsgGUID: terOpijKTH+IP2nN3xJ5+w== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192041" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192041" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:23 -0800 X-CSE-ConnectionGUID: eu5jPgnfSreiwfqYzWxbgg== X-CSE-MsgGUID: qR98nsrpRi6nbkAorJ69sg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531311" 
Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:22 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Mika Kuoppala , Matthew Brost Subject: [PATCH 12/26] drm/xe/eudebug: vm open/pread/pwrite Date: Mon, 9 Dec 2024 15:33:03 +0200 Message-ID: <20241209133318.1806472-13-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Debugger needs access to the client's vm to read and write. For example inspecting ISA/ELF and setting up breakpoints. Add ioctl to open target vm with debugger client and vm_handle and hook up pread/pwrite possibility. Open will take timeout argument so that standard fsync can be used for explicit flushing between cpu/gpu for the target vm. Implement this for bo backed storage. userptr will be done in following patch. v2: - checkpatch (Maciej) - 32bit fixes (Andrzej) - bo_vmap (Mika) - fix vm leak if can't allocate k_buffer (Mika) - assert vm write held for vma (Matthew) v3: - fw ref, ttm_bo_access - timeout boundary check (Dominik) - dont try to copy to user on zero bytes (Mika) Cc: Matthew Brost Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/regs/xe_gt_regs.h | 24 ++ drivers/gpu/drm/xe/xe_eudebug.c | 442 +++++++++++++++++++++++++++ include/uapi/drm/xe_drm_eudebug.h | 19 ++ 3 files changed, 485 insertions(+) diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index 5fcf06835ef0..4c620f95b466 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -551,6 +551,30 @@ #define CCS_MODE_CSLICE(cslice, ccs) \ ((ccs) << ((cslice) * CCS_MODE_CSLICE_WIDTH)) +#define RCU_ASYNC_FLUSH XE_REG(0x149fc) +#define RCU_ASYNC_FLUSH_IN_PROGRESS REG_BIT(31) +#define RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT 28 +#define RCU_ASYNC_FLUSH_ENGINE_ID_DECODE1 REG_BIT(26) +#define RCU_ASYNC_FLUSH_AMFS REG_BIT(8) +#define RCU_ASYNC_FLUSH_PREFETCH REG_BIT(7) +#define RCU_ASYNC_FLUSH_DATA_PORT REG_BIT(6) +#define RCU_ASYNC_FLUSH_DATA_CACHE REG_BIT(5) +#define RCU_ASYNC_FLUSH_HDC_PIPELINE REG_BIT(4) +#define RCU_ASYNC_INVALIDATE_HDC_PIPELINE REG_BIT(3) +#define RCU_ASYNC_INVALIDATE_CONSTANT_CACHE REG_BIT(2) +#define RCU_ASYNC_INVALIDATE_TEXTURE_CACHE REG_BIT(1) +#define RCU_ASYNC_INVALIDATE_INSTRUCTION_CACHE REG_BIT(0) +#define RCU_ASYNC_FLUSH_AND_INVALIDATE_ALL ( \ + RCU_ASYNC_FLUSH_AMFS | \ + RCU_ASYNC_FLUSH_PREFETCH | \ + RCU_ASYNC_FLUSH_DATA_PORT | \ + RCU_ASYNC_FLUSH_DATA_CACHE | \ + RCU_ASYNC_FLUSH_HDC_PIPELINE | \ + RCU_ASYNC_INVALIDATE_HDC_PIPELINE | \ + RCU_ASYNC_INVALIDATE_CONSTANT_CACHE | \ + RCU_ASYNC_INVALIDATE_TEXTURE_CACHE | \ + RCU_ASYNC_INVALIDATE_INSTRUCTION_CACHE) + #define RCU_DEBUG_1 XE_REG(0x14a00) #define RCU_DEBUG_1_ENGINE_STATUS REG_GENMASK(2, 0) #define RCU_DEBUG_1_RUNALONE_ACTIVE REG_BIT(2) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 3cf3616e546d..9d87df75348b 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ 
-5,9 +5,12 @@ #include #include +#include #include #include +#include +#include #include #include @@ -16,6 +19,7 @@ #include "regs/xe_engine_regs.h" #include "xe_assert.h" +#include "xe_bo.h" #include "xe_device.h" #include "xe_eudebug.h" #include "xe_eudebug_types.h" @@ -1222,6 +1226,8 @@ static long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg) return ret; } +static long xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg); + static long xe_eudebug_ioctl(struct file *file, unsigned int cmd, unsigned long arg) @@ -1246,6 +1252,11 @@ static long xe_eudebug_ioctl(struct file *file, ret = xe_eudebug_ack_event_ioctl(d, cmd, arg); eu_dbg(d, "ioctl cmd=EVENT_ACK ret=%ld\n", ret); break; + case DRM_XE_EUDEBUG_IOCTL_VM_OPEN: + ret = xe_eudebug_vm_open_ioctl(d, arg); + eu_dbg(d, "ioctl cmd=VM_OPEN ret=%ld\n", ret); + break; + default: ret = -EINVAL; } @@ -3038,3 +3049,434 @@ void xe_eudebug_ufence_fini(struct xe_user_fence *ufence) xe_eudebug_put(ufence->eudebug.debugger); ufence->eudebug.debugger = NULL; } + +static int xe_eudebug_vma_access(struct xe_vma *vma, u64 offset_in_vma, + void *buf, u64 len, bool write) +{ + struct xe_bo *bo; + u64 bytes; + + lockdep_assert_held_write(&xe_vma_vm(vma)->lock); + + if (XE_WARN_ON(offset_in_vma >= xe_vma_size(vma))) + return -EINVAL; + + bytes = min_t(u64, len, xe_vma_size(vma) - offset_in_vma); + if (!bytes) + return 0; + + bo = xe_bo_get(xe_vma_bo(vma)); + if (bo) { + int ret; + + ret = ttm_bo_access(&bo->ttm, offset_in_vma, buf, bytes, write); + + xe_bo_put(bo); + + return ret; + } + + return -EINVAL; +} + +static int xe_eudebug_vm_access(struct xe_vm *vm, u64 offset, + void *buf, u64 len, bool write) +{ + struct xe_vma *vma; + int ret; + + down_write(&vm->lock); + + vma = xe_vm_find_overlapping_vma(vm, offset, len); + if (vma) { + /* XXX: why find overlapping returns below start? 
*/ + if (offset < xe_vma_start(vma) || + offset >= (xe_vma_start(vma) + xe_vma_size(vma))) { + ret = -EINVAL; + goto out; + } + + /* Offset into vma */ + offset -= xe_vma_start(vma); + ret = xe_eudebug_vma_access(vma, offset, buf, len, write); + } else { + ret = -EINVAL; + } + +out: + up_write(&vm->lock); + + return ret; +} + +struct vm_file { + struct xe_eudebug *debugger; + struct xe_file *xef; + struct xe_vm *vm; + u64 flags; + u64 client_id; + u64 vm_handle; + unsigned int timeout_us; +}; + +static ssize_t __vm_read_write(struct xe_vm *vm, + void *bb, + char __user *r_buffer, + const char __user *w_buffer, + unsigned long offset, + unsigned long len, + const bool write) +{ + ssize_t ret; + + if (!len) + return 0; + + if (write) { + ret = copy_from_user(bb, w_buffer, len); + if (ret) + return -EFAULT; + + ret = xe_eudebug_vm_access(vm, offset, bb, len, true); + if (ret <= 0) + return ret; + + len = ret; + } else { + ret = xe_eudebug_vm_access(vm, offset, bb, len, false); + if (ret <= 0) + return ret; + + len = ret; + + ret = copy_to_user(r_buffer, bb, len); + if (ret) + return -EFAULT; + } + + return len; +} + +static struct xe_vm *find_vm_get(struct xe_eudebug *d, const u32 id) +{ + struct xe_vm *vm; + + mutex_lock(&d->res->lock); + vm = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_VM, id); + if (vm) + xe_vm_get(vm); + + mutex_unlock(&d->res->lock); + + return vm; +} + +static ssize_t __xe_eudebug_vm_access(struct file *file, + char __user *r_buffer, + const char __user *w_buffer, + size_t count, loff_t *__pos) +{ + struct vm_file *vmf = file->private_data; + struct xe_eudebug * const d = vmf->debugger; + struct xe_device * const xe = d->xe; + const bool write = !!w_buffer; + struct xe_vm *vm; + ssize_t copied = 0; + ssize_t bytes_left = count; + ssize_t ret; + unsigned long alloc_len; + loff_t pos = *__pos; + void *k_buffer; + + if (XE_IOCTL_DBG(xe, write && r_buffer)) + return -EINVAL; + + vm = find_vm_get(d, vmf->vm_handle); + if (XE_IOCTL_DBG(xe, !vm)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, vm != vmf->vm)) { + eu_warn(d, "vm_access(%s): vm handle mismatch client_handle=%llu, vm_handle=%llu, flags=0x%llx, pos=%llu, count=%zu\n", + write ? "write" : "read", + vmf->client_id, vmf->vm_handle, vmf->flags, pos, count); + xe_vm_put(vm); + return -EINVAL; + } + + if (!count) { + xe_vm_put(vm); + return 0; + } + + alloc_len = min_t(unsigned long, ALIGN(count, PAGE_SIZE), 64 * SZ_1M); + do { + k_buffer = vmalloc(alloc_len); + if (k_buffer) + break; + + alloc_len >>= 1; + } while (alloc_len > PAGE_SIZE); + + if (XE_IOCTL_DBG(xe, !k_buffer)) { + xe_vm_put(vm); + return -ENOMEM; + } + + do { + const ssize_t len = min_t(ssize_t, bytes_left, alloc_len); + + ret = __vm_read_write(vm, k_buffer, + write ? NULL : r_buffer + copied, + write ? 
w_buffer + copied : NULL, + pos + copied, + len, + write); + if (ret <= 0) + break; + + bytes_left -= ret; + copied += ret; + } while (bytes_left > 0); + + vfree(k_buffer); + xe_vm_put(vm); + + if (XE_WARN_ON(copied < 0)) + copied = 0; + + *__pos += copied; + + return copied ?: ret; +} + +static ssize_t xe_eudebug_vm_read(struct file *file, + char __user *buffer, + size_t count, loff_t *pos) +{ + return __xe_eudebug_vm_access(file, buffer, NULL, count, pos); +} + +static ssize_t xe_eudebug_vm_write(struct file *file, + const char __user *buffer, + size_t count, loff_t *pos) +{ + return __xe_eudebug_vm_access(file, NULL, buffer, count, pos); +} + +static int engine_rcu_flush(struct xe_eudebug *d, + struct xe_hw_engine *hwe, + unsigned int timeout_us) +{ + const struct xe_reg psmi_addr = RING_PSMI_CTL(hwe->mmio_base); + struct xe_gt *gt = hwe->gt; + unsigned int fw_ref; + u32 mask = RCU_ASYNC_FLUSH_AND_INVALIDATE_ALL; + u32 psmi_ctrl; + u32 id; + int ret; + + if (hwe->class == XE_ENGINE_CLASS_RENDER) + id = 0; + else if (hwe->class == XE_ENGINE_CLASS_COMPUTE) + id = hwe->instance + 1; + else + return -EINVAL; + + if (id < 8) + mask |= id << RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT; + else + mask |= (id - 8) << RCU_ASYNC_FLUSH_ENGINE_ID_SHIFT | + RCU_ASYNC_FLUSH_ENGINE_ID_DECODE1; + + fw_ref = xe_force_wake_get(gt_to_fw(gt), hwe->domain); + if (!fw_ref) + return -ETIMEDOUT; + + /* Prevent concurrent flushes */ + mutex_lock(&d->eu_lock); + psmi_ctrl = xe_mmio_read32(&gt->mmio, psmi_addr); + if (!(psmi_ctrl & IDLE_MSG_DISABLE)) + xe_mmio_write32(&gt->mmio, psmi_addr, _MASKED_BIT_ENABLE(IDLE_MSG_DISABLE)); + + /* XXX: Timeout is per operation but in here we flush previous */ + ret = xe_mmio_wait32(&gt->mmio, RCU_ASYNC_FLUSH, + RCU_ASYNC_FLUSH_IN_PROGRESS, 0, + timeout_us, NULL, false); + if (ret) + goto out; + + xe_mmio_write32(&gt->mmio, RCU_ASYNC_FLUSH, mask); + + ret = xe_mmio_wait32(&gt->mmio, RCU_ASYNC_FLUSH, + RCU_ASYNC_FLUSH_IN_PROGRESS, 0, + timeout_us, NULL, false); +out: + if (!(psmi_ctrl & IDLE_MSG_DISABLE)) + xe_mmio_write32(&gt->mmio, psmi_addr, _MASKED_BIT_DISABLE(IDLE_MSG_DISABLE)); + + mutex_unlock(&d->eu_lock); + xe_force_wake_put(gt_to_fw(gt), fw_ref); + + return ret; +} + +static int xe_eudebug_vm_fsync(struct file *file, loff_t start, loff_t end, int datasync) +{ + struct vm_file *vmf = file->private_data; + struct xe_eudebug *d = vmf->debugger; + struct xe_gt *gt; + int gt_id; + int ret = -EINVAL; + + eu_dbg(d, "vm_fsync: client_handle=%llu, vm_handle=%llu, flags=0x%llx, start=%llu, end=%llu datasync=%d\n", + vmf->client_id, vmf->vm_handle, vmf->flags, start, end, datasync); + + for_each_gt(gt, d->xe, gt_id) { + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + /* XXX: vm open per engine?
*/ + for_each_hw_engine(hwe, gt, id) { + if (hwe->class != XE_ENGINE_CLASS_RENDER && + hwe->class != XE_ENGINE_CLASS_COMPUTE) + continue; + + ret = engine_rcu_flush(d, hwe, vmf->timeout_us); + if (ret) + break; + } + } + + return ret; +} + +static int xe_eudebug_vm_release(struct inode *inode, struct file *file) +{ + struct vm_file *vmf = file->private_data; + struct xe_eudebug *d = vmf->debugger; + + eu_dbg(d, "vm_release: client_handle=%llu, vm_handle=%llu, flags=0x%llx", + vmf->client_id, vmf->vm_handle, vmf->flags); + + xe_vm_put(vmf->vm); + xe_file_put(vmf->xef); + xe_eudebug_put(d); + drm_dev_put(&d->xe->drm); + + kfree(vmf); + + return 0; +} + +static const struct file_operations vm_fops = { + .owner = THIS_MODULE, + .llseek = generic_file_llseek, + .read = xe_eudebug_vm_read, + .write = xe_eudebug_vm_write, + .fsync = xe_eudebug_vm_fsync, + .mmap = NULL, + .release = xe_eudebug_vm_release, +}; + +static long +xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg) +{ + const u64 max_timeout_ns = DRM_XE_EUDEBUG_VM_SYNC_MAX_TIMEOUT_NSECS; + struct drm_xe_eudebug_vm_open param; + struct xe_device * const xe = d->xe; + struct vm_file *vmf = NULL; + struct xe_file *xef; + struct xe_vm *vm; + struct file *file; + long ret = 0; + int fd; + + if (XE_IOCTL_DBG(xe, _IOC_SIZE(DRM_XE_EUDEBUG_IOCTL_VM_OPEN) != sizeof(param))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !(_IOC_DIR(DRM_XE_EUDEBUG_IOCTL_VM_OPEN) & _IOC_WRITE))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, copy_from_user(&param, (void __user *)arg, sizeof(param)))) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, param.flags)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, param.timeout_ns > max_timeout_ns)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, xe_eudebug_detached(d))) + return -ENOTCONN; + + xef = find_client_get(d, param.client_handle); + if (xef) + vm = find_vm_get(d, param.vm_handle); + else + vm = NULL; + + if (XE_IOCTL_DBG(xe, !xef)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !vm)) { + ret = -EINVAL; + goto out_file_put; + } + + vmf = kzalloc(sizeof(*vmf), GFP_KERNEL); + if (XE_IOCTL_DBG(xe, !vmf)) { + ret = -ENOMEM; + goto out_vm_put; + } + + fd = get_unused_fd_flags(O_CLOEXEC); + if (XE_IOCTL_DBG(xe, fd < 0)) { + ret = fd; + goto out_free; + } + + kref_get(&d->ref); + vmf->debugger = d; + vmf->vm = vm; + vmf->xef = xef; + vmf->flags = param.flags; + vmf->client_id = param.client_handle; + vmf->vm_handle = param.vm_handle; + vmf->timeout_us = div64_u64(param.timeout_ns, 1000ull); + + file = anon_inode_getfile("[xe_eudebug.vm]", &vm_fops, vmf, O_RDWR); + if (IS_ERR(file)) { + ret = PTR_ERR(file); + XE_IOCTL_DBG(xe, ret); + file = NULL; + goto out_fd_put; + } + + file->f_mode |= FMODE_PREAD | FMODE_PWRITE | + FMODE_READ | FMODE_WRITE | FMODE_LSEEK; + + fd_install(fd, file); + + eu_dbg(d, "vm_open: client_handle=%llu, handle=%llu, flags=0x%llx, fd=%d", + vmf->client_id, vmf->vm_handle, vmf->flags, fd); + + XE_WARN_ON(ret); + + drm_dev_get(&xe->drm); + + return fd; + +out_fd_put: + put_unused_fd(fd); + xe_eudebug_put(d); +out_free: + kfree(vmf); +out_vm_put: + xe_vm_put(vm); +out_file_put: + xe_file_put(xef); + + XE_WARN_ON(ret >= 0); + + return ret; +} diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index 1d5f1411c9a8..a5f13563b3b9 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -18,6 +18,7 @@ extern "C" { #define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0) #define DRM_XE_EUDEBUG_IOCTL_EU_CONTROL _IOWR('j', 0x2, struct drm_xe_eudebug_eu_control)
#define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT _IOW('j', 0x4, struct drm_xe_eudebug_ack_event) +#define DRM_XE_EUDEBUG_IOCTL_VM_OPEN _IOW('j', 0x1, struct drm_xe_eudebug_vm_open) /* XXX: Document events to match their internal counterparts when moved to xe_drm.h */ struct drm_xe_eudebug_event { @@ -187,6 +188,24 @@ struct drm_xe_eudebug_ack_event { __u64 seqno; }; +struct drm_xe_eudebug_vm_open { + /** @extensions: Pointer to the first extension struct, if any */ + __u64 extensions; + + /** @client_handle: id of client */ + __u64 client_handle; + + /** @vm_handle: id of vm */ + __u64 vm_handle; + + /** @flags: flags */ + __u64 flags; + +#define DRM_XE_EUDEBUG_VM_SYNC_MAX_TIMEOUT_NSECS (10ULL * NSEC_PER_SEC) + /** @timeout_ns: Timeout value in nanoseconds operations (fsync) */ + __u64 timeout_ns; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:33:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899785 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 38EB5E77185 for ; Mon, 9 Dec 2024 13:33:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8FA1410E755; Mon, 9 Dec 2024 13:33:26 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="P9AFHsOu"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4B82E10E74A; Mon, 9 Dec 2024 13:33:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751205; x=1765287205; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=o2DMN7Yk52XsBoy5NuqVTqCP071tTp0yUuznqHjHzUQ=; b=P9AFHsOuNsXfHuKVChvzX282YfvohbrwCOSWxlNFaJWQBI4T6NwbsToQ wDUqpSNYrsUSM0gMPLaGaVpGweHI5ApE1K6sdLVuZUEe5tzQTsowF7yQe 9Nd4lql2LHI34CziSMryBhG9ATWGUqMEWAoPdnzKhJ87J+J7JEcDc/xTH +ahV1qYyNpcNDluyqGYBVjY+Nbu1HgTzThlM/A0kork1Rh24zFNdcPKMM DCPKpWc2KG4roWE1E9ENJZ5qxqxnClvNxaqPACYbQ9HGlGONsWP2LLYp7 RSwUgegHsbcpI7fLNs/+5wHL2ntf12V0dO7qRzCWXNJKuKy0VI3ptlY59 g==; X-CSE-ConnectionGUID: 1FfzNHIOSzy5KxPkvXID2g== X-CSE-MsgGUID: gxMrNVciSiCowbqm1cfrdA== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192052" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192052" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:25 -0800 X-CSE-ConnectionGUID: 3q81wrA9QdWF8s91FJm7Rw== X-CSE-MsgGUID: CwqwUASjRM2TxKY6J89c7w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531318" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:24 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Andrzej Hajda , Mika Kuoppala Subject: [PATCH 13/26] drm/xe: add system memory page iterator support to xe_res_cursor Date: Mon, 9 Dec 2024 15:33:04 +0200 Message-ID: 
<20241209133318.1806472-14-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Andrzej Hajda Currently xe_res_cursor allows iteration only over DMA side of scatter gatter tables. Signed-off-by: Andrzej Hajda Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_res_cursor.h | 51 +++++++++++++++++++++++------- 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h b/drivers/gpu/drm/xe/xe_res_cursor.h index dca374b6521c..c1f39a680ae0 100644 --- a/drivers/gpu/drm/xe/xe_res_cursor.h +++ b/drivers/gpu/drm/xe/xe_res_cursor.h @@ -129,18 +129,35 @@ static inline void __xe_res_sg_next(struct xe_res_cursor *cur) { struct scatterlist *sgl = cur->sgl; u64 start = cur->start; + unsigned int len; - while (start >= sg_dma_len(sgl)) { - start -= sg_dma_len(sgl); + while (true) { + len = (cur->mem_type == XE_PL_SYSTEM) ? sgl->length : sg_dma_len(sgl); + if (start < len) + break; + start -= len; sgl = sg_next(sgl); XE_WARN_ON(!sgl); } - cur->start = start; - cur->size = sg_dma_len(sgl) - start; + cur->size = len - start; cur->sgl = sgl; } +static inline void __xe_res_first_sg(const struct sg_table *sg, + u64 start, u64 size, + struct xe_res_cursor *cur, u32 mem_type) +{ + XE_WARN_ON(!sg); + cur->node = NULL; + cur->start = start; + cur->remaining = size; + cur->size = 0; + cur->sgl = sg->sgl; + cur->mem_type = mem_type; + __xe_res_sg_next(cur); +} + /** * xe_res_first_sg - initialize a xe_res_cursor with a scatter gather table * @@ -155,14 +172,24 @@ static inline void xe_res_first_sg(const struct sg_table *sg, u64 start, u64 size, struct xe_res_cursor *cur) { - XE_WARN_ON(!sg); - cur->node = NULL; - cur->start = start; - cur->remaining = size; - cur->size = 0; - cur->sgl = sg->sgl; - cur->mem_type = XE_PL_TT; - __xe_res_sg_next(cur); + __xe_res_first_sg(sg, start, size, cur, XE_PL_TT); +} + +/** + * xe_res_first_sg_system - initialize a xe_res_cursor for iterate system memory pages + * + * @sg: scatter gather table to walk + * @start: Start of the range + * @size: Size of the range + * @cur: cursor object to initialize + * + * Start walking over the range of allocations between @start and @size + */ +static inline void xe_res_first_sg_system(const struct sg_table *sg, + u64 start, u64 size, + struct xe_res_cursor *cur) +{ + __xe_res_first_sg(sg, start, size, cur, XE_PL_SYSTEM); } /** From patchwork Mon Dec 9 13:33:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899786 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4B835E7717D for ; Mon, 9 Dec 2024 13:33:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C402C10E74E; Mon, 9 Dec 2024 
13:33:29 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="I4yQpSZ+"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5D35610E75A; Mon, 9 Dec 2024 13:33:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751207; x=1765287207; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=de/u2tWBFVAIdQnJDSDQoU0pPV3M20qjM9UvPCWy9Wk=; b=I4yQpSZ+UJ+8cvbFQnDS5IBzHw41RujFl3c+FhH5b5pN9GflFgB7jbF0 zGFWemo1e751abCpGYSeiZ/IybE+6oOungaZduC4iF/cMtSNIiEvYk+GU fHte+IA/naiAhzKNJDCMqhLfq23N6j85FR3UEz/5b0oOeXKd7N6592bPq YXeb/AbTbjJMkhOR+sYoSe6yUGkEVcCaIQVdm6KHvzIH3pqWFGz9/B/kH 3YKEozkjCv5Jl/BQqPxQswu3M/rOIQ+1hOZLSLf7Jh//f1QHBrVreT6xi QnlweeW1Lae8OwZtUpkIGZexuN/8hBojDjHGN44I2Tw0/c0vAvQ6asgL6 w==; X-CSE-ConnectionGUID: Hld7nBTBReylJHTEWgNVzQ== X-CSE-MsgGUID: VMHg4P7PRs6CfrDDy/Pc2Q== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192064" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192064" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:27 -0800 X-CSE-ConnectionGUID: 37JK02koTcSf/c/MbvXsjw== X-CSE-MsgGUID: BH1pn6BQQhKpbVr7WYr/vg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531326" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:26 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Andrzej Hajda , Maciej Patelczyk , Mika Kuoppala , Jonathan Cavitt Subject: [PATCH 14/26] drm/xe/eudebug: implement userptr_vma access Date: Mon, 9 Dec 2024 15:33:05 +0200 Message-ID: <20241209133318.1806472-15-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Andrzej Hajda Debugger needs to read/write program's vmas including userptr_vma. Since hmm_range_fault is used to pin userptr vmas, it is possible to map those vmas from debugger context. 
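For context, a minimal debugger-side sketch of the access path this completes (illustrative only: handle values and the target address are placeholders, debug_fd is the eudebug connection fd, and the vm fd comes from DRM_XE_EUDEBUG_IOCTL_VM_OPEN added earlier in the series):

	struct drm_xe_eudebug_vm_open vo = {
		.client_handle = client_handle,	/* from the client open event */
		.vm_handle = vm_handle,		/* from the VM event */
		.timeout_ns = 1000 * 1000,	/* used by fsync flushes */
	};
	int vm_fd = ioctl(debug_fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);

	/* With this patch, these also work when addr lands in a userptr vma */
	pread(vm_fd, buf, sizeof(buf), addr);
	pwrite(vm_fd, &bp_insn, sizeof(bp_insn), addr);
	fsync(vm_fd);	/* flush GPU caches, see engine_rcu_flush() */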
v2: pin pages vs notifier, move to vm.c (Matthew) v3: - iterate over system pages instead of DMA, fixes iommu enabled - s/xe_uvma_access/xe_vm_uvma_access/ (Matt) Signed-off-by: Andrzej Hajda Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala Reviewed-by: Jonathan Cavitt #v1 --- drivers/gpu/drm/xe/xe_eudebug.c | 3 ++- drivers/gpu/drm/xe/xe_vm.c | 47 +++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_vm.h | 3 +++ 3 files changed, 52 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 9d87df75348b..e5949e4dcad8 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -3076,7 +3076,8 @@ static int xe_eudebug_vma_access(struct xe_vma *vma, u64 offset_in_vma, return ret; } - return -EINVAL; + return xe_vm_userptr_access(to_userptr_vma(vma), offset_in_vma, + buf, bytes, write); } static int xe_eudebug_vm_access(struct xe_vm *vm, u64 offset, diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 0f17bc8b627b..224ff9e16941 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3414,3 +3414,50 @@ void xe_vm_snapshot_free(struct xe_vm_snapshot *snap) } kvfree(snap); } + +int xe_vm_userptr_access(struct xe_userptr_vma *uvma, u64 offset, + void *buf, u64 len, bool write) +{ + struct xe_vm *vm = xe_vma_vm(&uvma->vma); + struct xe_userptr *up = &uvma->userptr; + struct xe_res_cursor cur = {}; + int cur_len, ret = 0; + + while (true) { + down_read(&vm->userptr.notifier_lock); + if (!xe_vma_userptr_check_repin(uvma)) + break; + + spin_lock(&vm->userptr.invalidated_lock); + list_del_init(&uvma->userptr.invalidate_link); + spin_unlock(&vm->userptr.invalidated_lock); + + up_read(&vm->userptr.notifier_lock); + ret = xe_vma_userptr_pin_pages(uvma); + if (ret) + return ret; + } + + if (!up->sg) { + ret = -EINVAL; + goto out_unlock_notifier; + } + + for (xe_res_first_sg_system(up->sg, offset, len, &cur); cur.remaining; + xe_res_next(&cur, cur_len)) { + void *ptr = kmap_local_page(sg_page(cur.sgl)) + cur.start; + + cur_len = min(cur.size, cur.remaining); + if (write) + memcpy(ptr, buf, cur_len); + else + memcpy(buf, ptr, cur_len); + kunmap_local(ptr); + buf += cur_len; + } + ret = len; + +out_unlock_notifier: + up_read(&vm->userptr.notifier_lock); + return ret; +} diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h index 23adb7442881..372ad40ad67f 100644 --- a/drivers/gpu/drm/xe/xe_vm.h +++ b/drivers/gpu/drm/xe/xe_vm.h @@ -280,3 +280,6 @@ struct xe_vm_snapshot *xe_vm_snapshot_capture(struct xe_vm *vm); void xe_vm_snapshot_capture_delayed(struct xe_vm_snapshot *snap); void xe_vm_snapshot_print(struct xe_vm_snapshot *snap, struct drm_printer *p); void xe_vm_snapshot_free(struct xe_vm_snapshot *snap); + +int xe_vm_userptr_access(struct xe_userptr_vma *uvma, u64 offset, + void *buf, u64 len, bool write); From patchwork Mon Dec 9 13:33:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69D9DE77182 for ; Mon, 9 Dec 2024 13:33:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by 
gabe.freedesktop.org (Postfix) with ESMTP id E112210E758; Mon, 9 Dec 2024 13:33:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="QbadJ5/5"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id A929B10E74B; Mon, 9 Dec 2024 13:33:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751210; x=1765287210; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=i0eEWUvjEADbiO2A+EF/OaWk+nouJDCJvT64ZfH2BCQ=; b=QbadJ5/5qAhiEUYJHmaAHyN4sPBKqxmZE0+irN9BUpl/qFOCmksnXxK9 4qFL3HRqsVR7t8MmC0LMj1RSeVxB74BVCpcxX1sRejSeOx5tUsZk7hEHd WvsIkEPn0HrMQ5T303knlWb/RB9iOKBGq6xzaOQaXUdo8BebQgJToZUyo SrQfNJce/zKfItap/1IFAci0T3eFL2tUv76feVh4lZYvP2kH12BAJgvZJ cnqWnB4McNmTqpsfTUtFCvNJQInhsQIJ2oLzDWLBsHMcNezAp9uNJDWa6 HstvAmxFTDMbfGBpYTZnK72z8JBfG0P9O3gvnUFEVH5X2Q1s+ECCZn7GB w==; X-CSE-ConnectionGUID: kOp2C9TpS7OfO7/rufzqCw== X-CSE-MsgGUID: 3fUhB8vaSrenS0fRDRFd4w== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192076" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192076" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:29 -0800 X-CSE-ConnectionGUID: P1e/Bbc0TB6IclHkN+TnNw== X-CSE-MsgGUID: fxvzCazRTjqVaj5saA/AjQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531334" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:28 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Matthew Auld , Mika Kuoppala Subject: [PATCH 15/26] drm/xe: Debug metadata create/destroy ioctls Date: Mon, 9 Dec 2024 15:33:06 +0200 Message-ID: <20241209133318.1806472-16-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Dominik Grzegorzek As a part of the EU debug feature, introduce debug metadata objects. These are to be used to pass metadata between the client and the debugger by attaching them to vm_bind operations. TODO: WORK_IN_PROGRESS_* defines need to be reworded/refined when the real usage and need is established by l0+gdb.
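A rough userspace sketch of the intended flow (illustrative only; the DRM_IOCTL_XE_DEBUG_METADATA_* wrapper names and the destroy payload are assumptions following the usual xe_drm.h conventions, not literal uapi from this diff):

	struct drm_xe_debug_metadata_create create = {
		.type = DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
		.user_addr = (__u64)(uintptr_t)elf_image,	/* blob to expose to the debugger */
		.len = elf_image_len,
	};
	ioctl(drm_fd, DRM_IOCTL_XE_DEBUG_METADATA_CREATE, &create);
	/* create.metadata_id can now be attached to vm_bind operations */

	struct drm_xe_debug_metadata_destroy destroy = {
		.metadata_id = create.metadata_id,
	};
	ioctl(drm_fd, DRM_IOCTL_XE_DEBUG_METADATA_DESTROY, &destroy);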
v2: - include uapi/drm/xe_drm.h - metadata behind kconfig (Mika) - dont leak args->id on error (Matt Auld) Cc: Matthew Auld Signed-off-by: Dominik Grzegorzek Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/Makefile | 3 +- drivers/gpu/drm/xe/xe_debug_metadata.c | 107 +++++++++++++++++++ drivers/gpu/drm/xe/xe_debug_metadata.h | 50 +++++++++ drivers/gpu/drm/xe/xe_debug_metadata_types.h | 25 +++++ drivers/gpu/drm/xe/xe_device.c | 5 + drivers/gpu/drm/xe/xe_device.h | 2 + drivers/gpu/drm/xe/xe_device_types.h | 7 ++ drivers/gpu/drm/xe/xe_eudebug.c | 13 +++ include/uapi/drm/xe_drm.h | 53 ++++++++- 9 files changed, 263 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/xe/xe_debug_metadata.c create mode 100644 drivers/gpu/drm/xe/xe_debug_metadata.h create mode 100644 drivers/gpu/drm/xe/xe_debug_metadata_types.h diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index 33f457e4fcd3..e7dc299ea178 100644 --- a/drivers/gpu/drm/xe/Makefile +++ b/drivers/gpu/drm/xe/Makefile @@ -117,7 +117,8 @@ xe-y += xe_bb.o \ xe_wa.o \ xe_wopcm.o -xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o +xe-$(CONFIG_DRM_XE_EUDEBUG) += xe_eudebug.o \ + xe_debug_metadata.o xe-$(CONFIG_HMM_MIRROR) += xe_hmm.o diff --git a/drivers/gpu/drm/xe/xe_debug_metadata.c b/drivers/gpu/drm/xe/xe_debug_metadata.c new file mode 100644 index 000000000000..1dfed9aed285 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_debug_metadata.c @@ -0,0 +1,107 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ +#include "xe_debug_metadata.h" + +#include +#include +#include + +#include "xe_device.h" +#include "xe_macros.h" + +static void xe_debug_metadata_release(struct kref *ref) +{ + struct xe_debug_metadata *mdata = container_of(ref, struct xe_debug_metadata, refcount); + + kvfree(mdata->ptr); + kfree(mdata); +} + +void xe_debug_metadata_put(struct xe_debug_metadata *mdata) +{ + kref_put(&mdata->refcount, xe_debug_metadata_release); +} + +int xe_debug_metadata_create_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file) +{ + struct xe_device *xe = to_xe_device(dev); + struct xe_file *xef = to_xe_file(file); + struct drm_xe_debug_metadata_create *args = data; + struct xe_debug_metadata *mdata; + int err; + u32 id; + + if (XE_IOCTL_DBG(xe, args->extensions)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, args->type > DRM_XE_DEBUG_METADATA_PROGRAM_MODULE)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !args->user_addr || !args->len)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !access_ok(u64_to_user_ptr(args->user_addr), args->len))) + return -EFAULT; + + mdata = kzalloc(sizeof(*mdata), GFP_KERNEL); + if (!mdata) + return -ENOMEM; + + mdata->len = args->len; + mdata->type = args->type; + + mdata->ptr = kvmalloc(mdata->len, GFP_KERNEL); + if (!mdata->ptr) { + kfree(mdata); + return -ENOMEM; + } + kref_init(&mdata->refcount); + + err = copy_from_user(mdata->ptr, u64_to_user_ptr(args->user_addr), mdata->len); + if (err) { + err = -EFAULT; + goto put_mdata; + } + + mutex_lock(&xef->eudebug.metadata.lock); + err = xa_alloc(&xef->eudebug.metadata.xa, &id, mdata, xa_limit_32b, GFP_KERNEL); + mutex_unlock(&xef->eudebug.metadata.lock); + + if (err) + goto put_mdata; + + args->metadata_id = id; + + return 0; + +put_mdata: + xe_debug_metadata_put(mdata); + return err; +} + +int xe_debug_metadata_destroy_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file) +{ + struct xe_device *xe = to_xe_device(dev); + struct xe_file *xef = to_xe_file(file); + struct drm_xe_debug_metadata_destroy * const 
args = data; + struct xe_debug_metadata *mdata; + + if (XE_IOCTL_DBG(xe, args->extensions)) + return -EINVAL; + + mutex_lock(&xef->eudebug.metadata.lock); + mdata = xa_erase(&xef->eudebug.metadata.xa, args->metadata_id); + mutex_unlock(&xef->eudebug.metadata.lock); + if (XE_IOCTL_DBG(xe, !mdata)) + return -ENOENT; + + xe_debug_metadata_put(mdata); + + return 0; +} diff --git a/drivers/gpu/drm/xe/xe_debug_metadata.h b/drivers/gpu/drm/xe/xe_debug_metadata.h new file mode 100644 index 000000000000..3266c25e657e --- /dev/null +++ b/drivers/gpu/drm/xe/xe_debug_metadata.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef _XE_DEBUG_METADATA_H_ +#define _XE_DEBUG_METADATA_H_ + +struct drm_device; +struct drm_file; + +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + +#include "xe_debug_metadata_types.h" + +void xe_debug_metadata_put(struct xe_debug_metadata *mdata); + +int xe_debug_metadata_create_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file); + +int xe_debug_metadata_destroy_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file); +#else /* CONFIG_DRM_XE_EUDEBUG */ + +#include + +struct xe_debug_metadata; + +static inline void xe_debug_metadata_put(struct xe_debug_metadata *mdata) { } + +static inline int xe_debug_metadata_create_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file) +{ + return -EOPNOTSUPP; +} + +static inline int xe_debug_metadata_destroy_ioctl(struct drm_device *dev, + void *data, + struct drm_file *file) +{ + return -EOPNOTSUPP; +} + +#endif /* CONFIG_DRM_XE_EUDEBUG */ + + +#endif diff --git a/drivers/gpu/drm/xe/xe_debug_metadata_types.h b/drivers/gpu/drm/xe/xe_debug_metadata_types.h new file mode 100644 index 000000000000..624852920f58 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_debug_metadata_types.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef _XE_DEBUG_METADATA_TYPES_H_ +#define _XE_DEBUG_METADATA_TYPES_H_ + +#include + +struct xe_debug_metadata { + /** @type: type of given metadata */ + u64 type; + + /** @ptr: copy of userptr, given as a metadata payload */ + void *ptr; + + /** @len: length, in bytes of the metadata */ + u64 len; + + /** @ref: reference count */ + struct kref refcount; +}; + +#endif diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index dc0336215912..a7a715475184 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -25,6 +25,7 @@ #include "xe_bo.h" #include "xe_debugfs.h" #include "xe_devcoredump.h" +#include "xe_debug_metadata.h" #include "xe_dma_buf.h" #include "xe_drm_client.h" #include "xe_drv.h" @@ -197,6 +198,10 @@ static const struct drm_ioctl_desc xe_ioctls[] = { DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(XE_EUDEBUG_CONNECT, xe_eudebug_connect_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(XE_DEBUG_METADATA_CREATE, xe_debug_metadata_create_ioctl, + DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(XE_DEBUG_METADATA_DESTROY, xe_debug_metadata_destroy_ioctl, + DRM_RENDER_ALLOW), }; static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h index 088831a6b863..f89f1f0fc25e 100644 --- a/drivers/gpu/drm/xe/xe_device.h +++ b/drivers/gpu/drm/xe/xe_device.h @@ -218,6 +218,8 @@ static inline int xe_eudebug_needs_lock(const unsigned int cmd) case DRM_XE_EXEC_QUEUE_CREATE: case 
DRM_XE_EXEC_QUEUE_DESTROY: case DRM_XE_EUDEBUG_CONNECT: + case DRM_XE_DEBUG_METADATA_CREATE: + case DRM_XE_DEBUG_METADATA_DESTROY: return 1; } diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 7b893a86d83f..4ab9f06eba2d 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -684,6 +684,13 @@ struct xe_file { struct { /** @client_link: list entry in xe_device.clients.list */ struct list_head client_link; + + struct { + /** @xa: xarray to store debug metadata */ + struct xarray xa; + /** @lock: protects debug metadata xarray */ + struct mutex lock; + } metadata; } eudebug; #endif }; diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index e5949e4dcad8..e9092ed0b344 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -21,6 +21,7 @@ #include "xe_assert.h" #include "xe_bo.h" #include "xe_device.h" +#include "xe_debug_metadata.h" #include "xe_eudebug.h" #include "xe_eudebug_types.h" #include "xe_exec_queue.h" @@ -2141,6 +2142,8 @@ void xe_eudebug_file_open(struct xe_file *xef) struct xe_eudebug *d; INIT_LIST_HEAD(&xef->eudebug.client_link); + mutex_init(&xef->eudebug.metadata.lock); + xa_init_flags(&xef->eudebug.metadata.xa, XA_FLAGS_ALLOC1); down_read(&xef->xe->eudebug.discovery_lock); @@ -2158,12 +2161,22 @@ void xe_eudebug_file_open(struct xe_file *xef) void xe_eudebug_file_close(struct xe_file *xef) { struct xe_eudebug *d; + unsigned long idx; + struct xe_debug_metadata *mdata; down_read(&xef->xe->eudebug.discovery_lock); d = xe_eudebug_get(xef); if (d) xe_eudebug_event_put(d, client_destroy_event(d, xef)); + mutex_lock(&xef->eudebug.metadata.lock); + xa_for_each(&xef->eudebug.metadata.xa, idx, mdata) + xe_debug_metadata_put(mdata); + mutex_unlock(&xef->eudebug.metadata.lock); + + xa_destroy(&xef->eudebug.metadata.xa); + mutex_destroy(&xef->eudebug.metadata.lock); + spin_lock(&xef->xe->clients.lock); list_del_init(&xef->eudebug.client_link); spin_unlock(&xef->xe->clients.lock); diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index d0b9ef0799b2..1a452a8d2a2a 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -103,7 +103,8 @@ extern "C" { #define DRM_XE_WAIT_USER_FENCE 0x0a #define DRM_XE_OBSERVATION 0x0b #define DRM_XE_EUDEBUG_CONNECT 0x0c - +#define DRM_XE_DEBUG_METADATA_CREATE 0x0d +#define DRM_XE_DEBUG_METADATA_DESTROY 0x0e /* Must be kept compact -- no holes */ #define DRM_IOCTL_XE_DEVICE_QUERY DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEVICE_QUERY, struct drm_xe_device_query) @@ -119,6 +120,8 @@ extern "C" { #define DRM_IOCTL_XE_WAIT_USER_FENCE DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence) #define DRM_IOCTL_XE_OBSERVATION DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param) #define DRM_IOCTL_XE_EUDEBUG_CONNECT DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EUDEBUG_CONNECT, struct drm_xe_eudebug_connect) +#define DRM_IOCTL_XE_DEBUG_METADATA_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEBUG_METADATA_CREATE, struct drm_xe_debug_metadata_create) +#define DRM_IOCTL_XE_DEBUG_METADATA_DESTROY DRM_IOW(DRM_COMMAND_BASE + DRM_XE_DEBUG_METADATA_DESTROY, struct drm_xe_debug_metadata_destroy) /** * DOC: Xe IOCTL Extensions @@ -1733,6 +1736,54 @@ struct drm_xe_eudebug_connect { __u32 version; /* output: current ABI (ioctl / events) version */ }; +/* + * struct drm_xe_debug_metadata_create - Create debug metadata + * + * Add a region of user memory to be marked as debug 
metadata. + * When the debugger attaches, the metadata regions will be delivered + * for debugger. Debugger can then map these regions to help decode + * the program state. + * + * Returns handle to created metadata entry. + */ +struct drm_xe_debug_metadata_create { + /** @extensions: Pointer to the first extension struct, if any */ + __u64 extensions; + +#define DRM_XE_DEBUG_METADATA_ELF_BINARY 0 +#define DRM_XE_DEBUG_METADATA_PROGRAM_MODULE 1 +#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_MODULE_AREA 2 +#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_SBA_AREA 3 +#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_SIP_AREA 4 +#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM (1 + \ + WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_SIP_AREA) + + /** @type: Type of metadata */ + __u64 type; + + /** @user_addr: pointer to start of the metadata */ + __u64 user_addr; + + /** @len: length, in bytes of the medata */ + __u64 len; + + /** @metadata_id: created metadata handle (out) */ + __u32 metadata_id; +}; + +/** + * struct drm_xe_debug_metadata_destroy - Destroy debug metadata + * + * Destroy debug metadata. + */ +struct drm_xe_debug_metadata_destroy { + /** @extensions: Pointer to the first extension struct, if any */ + __u64 extensions; + + /** @metadata_id: metadata handle to destroy */ + __u32 metadata_id; +}; + #include "xe_drm_eudebug.h" #if defined(__cplusplus) From patchwork Mon Dec 9 13:33:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899788 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E294BE77180 for ; Mon, 9 Dec 2024 13:33:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 581E710E759; Mon, 9 Dec 2024 13:33:34 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="de3jv9Fg"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 50EE310E758; Mon, 9 Dec 2024 13:33:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751211; x=1765287211; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2v8M/mohB2JjUcQ4ERt2TysBGXFiMkXizjbxDT7wpVA=; b=de3jv9FgKgzQ5KujPgydCwAl5z1DvF0RdlS9MWqYwaxQJy5h8QSr22ev Xi3MraQAbKX6P1myAOpAxwMU+4vgJjQz9DV/wmEx24UnZskY9CZ+zTeck CZ2zzNa56LCBT6c3YH2OYetkE11K3QtSLJRe1K9JI/7RQtu6nMBZBO/hN 2totfqOmIb2eEDY3bZ2IVSRHCGRFsRf0FoAuyo4r8ep0yafXsiXnk7B4D FH4GN6SFzk5TLjdl8jWgFmbMR2TVcf1a0QPIBpZGtJwNq3rI154wx7Lux ZdrILMQXTTKFliaVWltDhLnAq5kmPmbTJsnGg1nBf3lG3MHIaVdczsE2F w==; X-CSE-ConnectionGUID: Cxqg7IexQSaQo0WohpMuBw== X-CSE-MsgGUID: mcNf0+TSRF+hzLObzTmxKw== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192091" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192091" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:31 -0800 X-CSE-ConnectionGUID: Aqee65FOT9SSvXREuxWrTw== X-CSE-MsgGUID: 1D+40wTLQYSr9i64bJKFtg== 
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek, Maciej Patelczyk, Mika Kuoppala
Subject: [PATCH 16/26] drm/xe: Attach debug metadata to vma
Date: Mon, 9 Dec 2024 15:33:07 +0200
Message-ID: <20241209133318.1806472-17-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

From: Dominik Grzegorzek

Introduce a vm_bind_op extension, enabling users to attach metadata objects to each [OP_MAP|OP_MAP_USERPTR] operation. This interface will be used by the EU debugger to relay information about the contents of specified VMAs from the debuggee to the debugger process.

v2: move vma metadata handling behind Kconfig (Mika)

Signed-off-by: Dominik Grzegorzek
Signed-off-by: Maciej Patelczyk
Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/xe/xe_debug_metadata.c | 120 +++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_debug_metadata.h |  52 +++++++++++
 drivers/gpu/drm/xe/xe_vm.c             |  99 +++++++++++++++++++-
 drivers/gpu/drm/xe/xe_vm_types.h       |  27 ++++++
 include/uapi/drm/xe_drm.h              |  19 ++++
 5 files changed, 313 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_debug_metadata.c b/drivers/gpu/drm/xe/xe_debug_metadata.c
index 1dfed9aed285..b045bdd77235 100644
--- a/drivers/gpu/drm/xe/xe_debug_metadata.c
+++ b/drivers/gpu/drm/xe/xe_debug_metadata.c
@@ -10,6 +10,113 @@
 #include "xe_device.h"
 #include "xe_macros.h"
+#include "xe_vm.h"
+
+void xe_eudebug_free_vma_metadata(struct xe_eudebug_vma_metadata *mdata)
+{
+	struct xe_vma_debug_metadata *vmad, *tmp;
+
+	list_for_each_entry_safe(vmad, tmp, &mdata->list, link) {
+		list_del(&vmad->link);
+		kfree(vmad);
+	}
+}
+
+static struct xe_vma_debug_metadata *
+vma_new_debug_metadata(u32 metadata_id, u64 cookie)
+{
+	struct xe_vma_debug_metadata *vmad;
+
+	vmad = kzalloc(sizeof(*vmad), GFP_KERNEL);
+	if (!vmad)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vmad->link);
+
+	vmad->metadata_id = metadata_id;
+	vmad->cookie = cookie;
+
+	return vmad;
+}
+
+int xe_eudebug_copy_vma_metadata(struct xe_eudebug_vma_metadata *from,
+				 struct xe_eudebug_vma_metadata *to)
+{
+	struct xe_vma_debug_metadata *vmad, *vma;
+
+	list_for_each_entry(vmad, &from->list, link) {
+		vma = vma_new_debug_metadata(vmad->metadata_id, vmad->cookie);
+		if (IS_ERR(vma))
+			return PTR_ERR(vma);
+
+		list_add_tail(&vma->link, &to->list);
+	}
+
+	return 0;
+}
+
+static int vma_new_debug_metadata_op(struct xe_vma_op *op,
+				     u32 metadata_id, u64 cookie,
+				     u64 flags)
+{
+	struct xe_vma_debug_metadata *vmad;
+
+	vmad = vma_new_debug_metadata(metadata_id, cookie);
+	if (IS_ERR(vmad))
+		return PTR_ERR(vmad);
+
+	list_add_tail(&vmad->link, &op->map.eudebug.metadata.list);
+
+	return 0;
+}
+
+int vm_bind_op_ext_attach_debug(struct xe_device *xe,
+				struct xe_file *xef,
+				struct drm_gpuva_ops *ops,
+				u32 operation, u64
extension) +{ + u64 __user *address = u64_to_user_ptr(extension); + struct drm_xe_vm_bind_op_ext_attach_debug ext; + struct xe_debug_metadata *mdata; + struct drm_gpuva_op *__op; + int err; + + err = __copy_from_user(&ext, address, sizeof(ext)); + if (XE_IOCTL_DBG(xe, err)) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, + operation != DRM_XE_VM_BIND_OP_MAP_USERPTR && + operation != DRM_XE_VM_BIND_OP_MAP)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, ext.flags)) + return -EINVAL; + + mdata = xe_debug_metadata_get(xef, (u32)ext.metadata_id); + if (XE_IOCTL_DBG(xe, !mdata)) + return -ENOENT; + + /* care about metadata existence only on the time of attach */ + xe_debug_metadata_put(mdata); + + if (!ops) + return 0; + + drm_gpuva_for_each_op(__op, ops) { + struct xe_vma_op *op = gpuva_op_to_vma_op(__op); + + if (op->base.op == DRM_GPUVA_OP_MAP) { + err = vma_new_debug_metadata_op(op, + ext.metadata_id, + ext.cookie, + ext.flags); + if (err) + return err; + } + } + return 0; +} static void xe_debug_metadata_release(struct kref *ref) { @@ -24,6 +131,19 @@ void xe_debug_metadata_put(struct xe_debug_metadata *mdata) kref_put(&mdata->refcount, xe_debug_metadata_release); } +struct xe_debug_metadata *xe_debug_metadata_get(struct xe_file *xef, u32 id) +{ + struct xe_debug_metadata *mdata; + + mutex_lock(&xef->eudebug.metadata.lock); + mdata = xa_load(&xef->eudebug.metadata.xa, id); + if (mdata) + kref_get(&mdata->refcount); + mutex_unlock(&xef->eudebug.metadata.lock); + + return mdata; +} + int xe_debug_metadata_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file) diff --git a/drivers/gpu/drm/xe/xe_debug_metadata.h b/drivers/gpu/drm/xe/xe_debug_metadata.h index 3266c25e657e..ba913a4d6def 100644 --- a/drivers/gpu/drm/xe/xe_debug_metadata.h +++ b/drivers/gpu/drm/xe/xe_debug_metadata.h @@ -6,13 +6,18 @@ #ifndef _XE_DEBUG_METADATA_H_ #define _XE_DEBUG_METADATA_H_ +#include + struct drm_device; struct drm_file; +struct xe_file; #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) #include "xe_debug_metadata_types.h" +#include "xe_vm_types.h" +struct xe_debug_metadata *xe_debug_metadata_get(struct xe_file *xef, u32 id); void xe_debug_metadata_put(struct xe_debug_metadata *mdata); int xe_debug_metadata_create_ioctl(struct drm_device *dev, @@ -22,11 +27,35 @@ int xe_debug_metadata_create_ioctl(struct drm_device *dev, int xe_debug_metadata_destroy_ioctl(struct drm_device *dev, void *data, struct drm_file *file); + +static inline void xe_eudebug_move_vma_metadata(struct xe_eudebug_vma_metadata *from, + struct xe_eudebug_vma_metadata *to) +{ + list_splice_tail_init(&from->list, &to->list); +} + +int xe_eudebug_copy_vma_metadata(struct xe_eudebug_vma_metadata *from, + struct xe_eudebug_vma_metadata *to); +void xe_eudebug_free_vma_metadata(struct xe_eudebug_vma_metadata *mdata); + +int vm_bind_op_ext_attach_debug(struct xe_device *xe, + struct xe_file *xef, + struct drm_gpuva_ops *ops, + u32 operation, u64 extension); + #else /* CONFIG_DRM_XE_EUDEBUG */ #include struct xe_debug_metadata; +struct xe_device; +struct xe_eudebug_vma_metadata; +struct drm_gpuva_ops; + +static inline struct xe_debug_metadata *xe_debug_metadata_get(struct xe_file *xef, u32 id) +{ + return NULL; +} static inline void xe_debug_metadata_put(struct xe_debug_metadata *mdata) { } @@ -44,6 +73,29 @@ static inline int xe_debug_metadata_destroy_ioctl(struct drm_device *dev, return -EOPNOTSUPP; } +static inline void xe_eudebug_move_vma_metadata(struct xe_eudebug_vma_metadata *from, + struct xe_eudebug_vma_metadata *to) +{ +} + +static inline int 
xe_eudebug_copy_vma_metadata(struct xe_eudebug_vma_metadata *from, + struct xe_eudebug_vma_metadata *to) +{ + return 0; +} + +static inline void xe_eudebug_free_vma_metadata(struct xe_eudebug_vma_metadata *mdata) +{ +} + +static inline int vm_bind_op_ext_attach_debug(struct xe_device *xe, + struct xe_file *xef, + struct drm_gpuva_ops *ops, + u32 operation, u64 extension) +{ + return -EINVAL; +} + #endif /* CONFIG_DRM_XE_EUDEBUG */ diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 224ff9e16941..19c0b36c10b1 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -23,6 +23,7 @@ #include "regs/xe_gtt_defs.h" #include "xe_assert.h" #include "xe_bo.h" +#include "xe_debug_metadata.h" #include "xe_device.h" #include "xe_drm_client.h" #include "xe_eudebug.h" @@ -944,6 +945,9 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm, vma->gpuva.gem.obj = &bo->ttm.base; } +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + INIT_LIST_HEAD(&vma->eudebug.metadata.list); +#endif INIT_LIST_HEAD(&vma->combined_links.rebind); INIT_LIST_HEAD(&vma->gpuva.gem.entry); @@ -1036,6 +1040,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma) xe_bo_put(xe_vma_bo(vma)); } + xe_eudebug_free_vma_metadata(&vma->eudebug.metadata); xe_vma_free(vma); } @@ -1979,6 +1984,9 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo, op->map.is_null = flags & DRM_XE_VM_BIND_FLAG_NULL; op->map.dumpable = flags & DRM_XE_VM_BIND_FLAG_DUMPABLE; op->map.pat_index = pat_index; +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + INIT_LIST_HEAD(&op->map.eudebug.metadata.list); +#endif } else if (__op->op == DRM_GPUVA_OP_PREFETCH) { op->prefetch.region = prefetch_region; } @@ -2170,11 +2178,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops, flags |= op->map.dumpable ? 
VMA_CREATE_FLAG_DUMPABLE : 0; - vma = new_vma(vm, &op->base.map, op->map.pat_index, - flags); + vma = new_vma(vm, &op->base.map, op->map.pat_index, flags); if (IS_ERR(vma)) return PTR_ERR(vma); + xe_eudebug_move_vma_metadata(&op->map.eudebug.metadata, + &vma->eudebug.metadata); + op->map.vma = vma; if (op->map.immediate || !xe_vm_in_fault_mode(vm)) xe_vma_ops_incr_pt_update_ops(vops, @@ -2205,6 +2215,9 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops, if (IS_ERR(vma)) return PTR_ERR(vma); + xe_eudebug_move_vma_metadata(&old->eudebug.metadata, + &vma->eudebug.metadata); + op->remap.prev = vma; /* @@ -2244,6 +2257,16 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops, if (IS_ERR(vma)) return PTR_ERR(vma); + if (op->base.remap.prev) { + err = xe_eudebug_copy_vma_metadata(&op->remap.prev->eudebug.metadata, + &vma->eudebug.metadata); + if (err) + return err; + } else { + xe_eudebug_move_vma_metadata(&old->eudebug.metadata, + &vma->eudebug.metadata); + } + op->remap.next = vma; /* @@ -2294,6 +2317,7 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op, switch (op->base.op) { case DRM_GPUVA_OP_MAP: if (op->map.vma) { + xe_eudebug_free_vma_metadata(&op->map.eudebug.metadata); prep_vma_destroy(vm, op->map.vma, post_commit); xe_vma_destroy_unlocked(op->map.vma); } @@ -2532,6 +2556,58 @@ static int vm_ops_setup_tile_args(struct xe_vm *vm, struct xe_vma_ops *vops) } return number_tiles; +}; + +typedef int (*xe_vm_bind_op_user_extension_fn)(struct xe_device *xe, + struct xe_file *xef, + struct drm_gpuva_ops *ops, + u32 operation, u64 extension); + +static const xe_vm_bind_op_user_extension_fn vm_bind_op_extension_funcs[] = { + [XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG] = vm_bind_op_ext_attach_debug, +}; + +#define MAX_USER_EXTENSIONS 16 +static int vm_bind_op_user_extensions(struct xe_device *xe, + struct xe_file *xef, + struct drm_gpuva_ops *ops, + u32 operation, + u64 extensions, int ext_number) +{ + u64 __user *address = u64_to_user_ptr(extensions); + struct drm_xe_user_extension ext; + int err; + + if (XE_IOCTL_DBG(xe, ext_number >= MAX_USER_EXTENSIONS)) + return -E2BIG; + + err = __copy_from_user(&ext, address, sizeof(ext)); + if (XE_IOCTL_DBG(xe, err)) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, ext.pad) || + XE_IOCTL_DBG(xe, ext.name >= + ARRAY_SIZE(vm_bind_op_extension_funcs))) + return -EINVAL; + + err = vm_bind_op_extension_funcs[ext.name](xe, xef, ops, + operation, extensions); + if (XE_IOCTL_DBG(xe, err)) + return err; + + if (ext.next_extension) + return vm_bind_op_user_extensions(xe, xef, ops, + operation, ext.next_extension, + ++ext_number); + + return 0; +} + +static int vm_bind_op_user_extensions_check(struct xe_device *xe, + struct xe_file *xef, + u32 operation, u64 extensions) +{ + return vm_bind_op_user_extensions(xe, xef, NULL, operation, extensions, 0); } static struct dma_fence *ops_execute(struct xe_vm *vm, @@ -2729,6 +2805,7 @@ ALLOW_ERROR_INJECTION(vm_bind_ioctl_ops_execute, ERRNO); #define ALL_DRM_XE_SYNCS_FLAGS (DRM_XE_SYNCS_FLAG_WAIT_FOR_OP) static int vm_bind_ioctl_check_args(struct xe_device *xe, + struct xe_file *xef, struct drm_xe_vm_bind *args, struct drm_xe_vm_bind_op **bind_ops) { @@ -2773,6 +2850,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, u64 obj_offset = (*bind_ops)[i].obj_offset; u32 prefetch_region = (*bind_ops)[i].prefetch_mem_region_instance; bool is_null = flags & DRM_XE_VM_BIND_FLAG_NULL; + u64 extensions = (*bind_ops)[i].extensions; u16 pat_index = 
(*bind_ops)[i].pat_index; u16 coh_mode; @@ -2833,6 +2911,13 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, err = -EINVAL; goto free_bind_ops; } + + if (extensions) { + err = vm_bind_op_user_extensions_check(xe, xef, op, extensions); + if (err) + goto free_bind_ops; + } + } return 0; @@ -2944,7 +3029,7 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file) int err; int i; - err = vm_bind_ioctl_check_args(xe, args, &bind_ops); + err = vm_bind_ioctl_check_args(xe, xef, args, &bind_ops); if (err) return err; @@ -3073,11 +3158,17 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file) u64 obj_offset = bind_ops[i].obj_offset; u32 prefetch_region = bind_ops[i].prefetch_mem_region_instance; u16 pat_index = bind_ops[i].pat_index; + u64 extensions = bind_ops[i].extensions; ops[i] = vm_bind_ioctl_ops_create(vm, bos[i], obj_offset, addr, range, op, flags, prefetch_region, pat_index); - if (IS_ERR(ops[i])) { + if (!IS_ERR(ops[i]) && extensions) { + err = vm_bind_op_user_extensions(xe, xef, ops[i], + op, extensions, 0); + if (err) + goto unwind_ops; + } else if (IS_ERR(ops[i])) { err = PTR_ERR(ops[i]); ops[i] = NULL; goto unwind_ops; diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index 557b047ebdd7..1c5776194e54 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -70,6 +70,14 @@ struct xe_userptr { #endif }; +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) +struct xe_eudebug_vma_metadata { + struct list_head list; +}; +#else +struct xe_eudebug_vma_metadata { }; +#endif + struct xe_vma { /** @gpuva: Base GPUVA object */ struct drm_gpuva gpuva; @@ -121,6 +129,11 @@ struct xe_vma { * Needs to be signalled before UNMAP can be processed. */ struct xe_user_fence *ufence; + + struct { + /** @metadata: List of vma debug metadata */ + struct xe_eudebug_vma_metadata metadata; + } eudebug; }; /** @@ -311,6 +324,10 @@ struct xe_vma_op_map { bool dumpable; /** @pat_index: The pat index to use for this operation. */ u16 pat_index; + struct { + /** @vma_metadata: List of vma debug metadata */ + struct xe_eudebug_vma_metadata metadata; + } eudebug; }; /** struct xe_vma_op_remap - VMA remap operation */ @@ -388,4 +405,14 @@ struct xe_vma_ops { #endif }; +struct xe_vma_debug_metadata { + /** @debug.metadata: id of attached xe_debug_metadata */ + u32 metadata_id; + /** @debug.cookie: user defined cookie */ + u64 cookie; + + /** @link: list of metadata attached to vma */ + struct list_head link; +}; + #endif diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 1a452a8d2a2a..176c348c3fdd 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -888,6 +888,23 @@ struct drm_xe_vm_destroy { __u64 reserved[2]; }; +struct drm_xe_vm_bind_op_ext_attach_debug { + /** @base: base user extension */ + struct drm_xe_user_extension base; + + /** @id: Debug object id from create metadata */ + __u64 metadata_id; + + /** @flags: Flags */ + __u64 flags; + + /** @cookie: Cookie */ + __u64 cookie; + + /** @reserved: Reserved */ + __u64 reserved; +}; + /** * struct drm_xe_vm_bind_op - run bind operations * @@ -912,7 +929,9 @@ struct drm_xe_vm_destroy { * handle MBZ, and the BO offset MBZ. This flag is intended to * implement VK sparse bindings. 
 */
+
 struct drm_xe_vm_bind_op {
+#define XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG 0
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;

From patchwork Mon Dec 9 13:33:08 2024
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 13899789
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek, Maciej Patelczyk, Mika Kuoppala
Subject: [PATCH 17/26] drm/xe/eudebug: Add debug metadata support for xe_eudebug
Date: Mon, 9 Dec 2024 15:33:08 +0200
Message-ID: <20241209133318.1806472-18-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

From: Dominik Grzegorzek

Reflect debug metadata resource creation/destroy as events passed to the debugger.
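On the debugger side these arrive through the eudebug event stream. A rough sketch of consuming the create event (read_event() is an assumed wrapper around the event-read ioctl from earlier in the series; debug_fd comes from DRM_IOCTL_XE_EUDEBUG_CONNECT):

	#include <stdio.h>
	#include <stddef.h>

	#include <drm/xe_drm_eudebug.h>

	/* Assumed helper from earlier patches: blocking read of one event. */
	extern int read_event(int debug_fd, struct drm_xe_eudebug_event *ev,
			      size_t len);

	static void example_handle_metadata_event(int debug_fd)
	{
		struct drm_xe_eudebug_event_metadata md = {0};

		if (read_event(debug_fd, &md.base, sizeof(md)))
			return;

		if (md.base.type == DRM_XE_EUDEBUG_EVENT_METADATA &&
		    (md.base.flags & DRM_XE_EUDEBUG_EVENT_CREATE))
			printf("client %llu: metadata %llu (type %llu, %llu bytes)\n",
			       (unsigned long long)md.client_handle,
			       (unsigned long long)md.metadata_handle,
			       (unsigned long long)md.type,
			       (unsigned long long)md.len);
	}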
Introduce ioctl allowing to read metadata content on demand. Each VMA can have multiple metadata attached and it is passed from user on BIND or it's copied on internal remap. Xe EU Debugger on VM BIND will inform about VMA metadata attachements during bind IOCTL sending proper OP event. v2: - checkpatch (Maciej, Tilak) - struct alignment (Matthew) - Kconfig (Mika) Signed-off-by: Dominik Grzegorzek Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_debug_metadata.c | 8 +- drivers/gpu/drm/xe/xe_eudebug.c | 330 ++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug.h | 21 +- drivers/gpu/drm/xe/xe_eudebug_types.h | 27 +- drivers/gpu/drm/xe/xe_vm.c | 2 +- include/uapi/drm/xe_drm_eudebug.h | 30 +++ 6 files changed, 406 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_debug_metadata.c b/drivers/gpu/drm/xe/xe_debug_metadata.c index b045bdd77235..172fe2b33557 100644 --- a/drivers/gpu/drm/xe/xe_debug_metadata.c +++ b/drivers/gpu/drm/xe/xe_debug_metadata.c @@ -9,6 +9,7 @@ #include #include "xe_device.h" +#include "xe_eudebug.h" #include "xe_macros.h" #include "xe_vm.h" @@ -158,7 +159,7 @@ int xe_debug_metadata_create_ioctl(struct drm_device *dev, if (XE_IOCTL_DBG(xe, args->extensions)) return -EINVAL; - if (XE_IOCTL_DBG(xe, args->type > DRM_XE_DEBUG_METADATA_PROGRAM_MODULE)) + if (XE_IOCTL_DBG(xe, args->type >= WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM)) return -EINVAL; if (XE_IOCTL_DBG(xe, !args->user_addr || !args->len)) @@ -194,8 +195,11 @@ int xe_debug_metadata_create_ioctl(struct drm_device *dev, if (err) goto put_mdata; + args->metadata_id = id; + xe_eudebug_debug_metadata_create(xef, mdata); + return 0; put_mdata: @@ -221,6 +225,8 @@ int xe_debug_metadata_destroy_ioctl(struct drm_device *dev, if (XE_IOCTL_DBG(xe, !mdata)) return -ENOENT; + xe_eudebug_debug_metadata_destroy(xef, mdata); + xe_debug_metadata_put(mdata); return 0; diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index e9092ed0b344..2514b880d871 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -20,15 +20,18 @@ #include "xe_assert.h" #include "xe_bo.h" +#include "xe_debug_metadata.h" #include "xe_device.h" #include "xe_debug_metadata.h" #include "xe_eudebug.h" #include "xe_eudebug_types.h" #include "xe_exec_queue.h" +#include "xe_exec_queue_types.h" #include "xe_force_wake.h" #include "xe_gt.h" #include "xe_gt_debug.h" #include "xe_gt_mcr.h" +#include "xe_guc_exec_queue_types.h" #include "xe_hw_engine.h" #include "xe_lrc.h" #include "xe_macros.h" @@ -908,7 +911,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -943,7 +946,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -1227,6 +1230,90 @@ static long xe_eudebug_eu_control(struct xe_eudebug *d, const u64 arg) return ret; } +static struct xe_debug_metadata *find_metadata_get(struct xe_eudebug *d, + u32 id) +{ + struct 
xe_debug_metadata *m; + + mutex_lock(&d->res->lock); + m = find_resource__unlocked(d->res, XE_EUDEBUG_RES_TYPE_METADATA, id); + if (m) + kref_get(&m->refcount); + mutex_unlock(&d->res->lock); + + return m; +} + +static long xe_eudebug_read_metadata(struct xe_eudebug *d, + unsigned int cmd, + const u64 arg) +{ + struct drm_xe_eudebug_read_metadata user_arg; + struct xe_debug_metadata *mdata; + struct xe_file *xef; + struct xe_device *xe = d->xe; + long ret = 0; + + if (XE_IOCTL_DBG(xe, !(_IOC_DIR(cmd) & _IOC_WRITE))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, !(_IOC_DIR(cmd) & _IOC_READ))) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, _IOC_SIZE(cmd) < sizeof(user_arg))) + return -EINVAL; + + if (copy_from_user(&user_arg, u64_to_user_ptr(arg), sizeof(user_arg))) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, user_arg.flags)) + return -EINVAL; + + if (!access_ok(u64_to_user_ptr(user_arg.ptr), user_arg.size)) + return -EFAULT; + + if (xe_eudebug_detached(d)) + return -ENOTCONN; + + eu_dbg(d, + "read metadata: client_handle=%llu, metadata_handle=%llu, flags=0x%x", + user_arg.client_handle, user_arg.metadata_handle, user_arg.flags); + + xef = find_client_get(d, user_arg.client_handle); + if (XE_IOCTL_DBG(xe, !xef)) + return -EINVAL; + + mdata = find_metadata_get(d, (u32)user_arg.metadata_handle); + if (XE_IOCTL_DBG(xe, !mdata)) { + xe_file_put(xef); + return -EINVAL; + } + + if (user_arg.size) { + if (user_arg.size < mdata->len) { + ret = -EINVAL; + goto metadata_put; + } + + /* This limits us to a maximum payload size of 2G */ + if (copy_to_user(u64_to_user_ptr(user_arg.ptr), + mdata->ptr, mdata->len)) { + ret = -EFAULT; + goto metadata_put; + } + } + + user_arg.size = mdata->len; + + if (copy_to_user(u64_to_user_ptr(arg), &user_arg, sizeof(user_arg))) + ret = -EFAULT; + +metadata_put: + xe_debug_metadata_put(mdata); + xe_file_put(xef); + return ret; +} + static long xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg); static long xe_eudebug_ioctl(struct file *file, @@ -1257,7 +1344,10 @@ static long xe_eudebug_ioctl(struct file *file, ret = xe_eudebug_vm_open_ioctl(d, arg); eu_dbg(d, "ioctl cmd=VM_OPEN ret=%ld\n", ret); break; - + case DRM_XE_EUDEBUG_IOCTL_READ_METADATA: + ret = xe_eudebug_read_metadata(d, cmd, arg); + eu_dbg(d, "ioctl cmd=READ_METADATA ret=%ld\n", ret); + break; default: ret = -EINVAL; } @@ -2649,19 +2739,145 @@ static int vm_bind_op_event(struct xe_eudebug *d, return xe_eudebug_queue_bind_event(d, vm, event); } +static int vm_bind_op_metadata_event(struct xe_eudebug *d, + struct xe_vm *vm, + u32 flags, + u64 ref_seqno, + u64 metadata_handle, + u64 metadata_cookie) +{ + struct xe_eudebug_event_vm_bind_op_metadata *e; + struct xe_eudebug_event *event; + const u32 sz = sizeof(*e); + u64 seqno; + + seqno = atomic_long_inc_return(&d->events.seqno); + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA, + seqno, flags, sz); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_vm_bind_op_metadata, e, + vm_bind_op_ref_seqno, ref_seqno); + write_member(struct drm_xe_eudebug_event_vm_bind_op_metadata, e, + metadata_handle, metadata_handle); + write_member(struct drm_xe_eudebug_event_vm_bind_op_metadata, e, + metadata_cookie, metadata_cookie); + + /* If in discovery, no need to collect ops */ + if (!completion_done(&d->discovery)) + return xe_eudebug_queue_event(d, event); + + return xe_eudebug_queue_bind_event(d, vm, event); +} + +static int vm_bind_op_metadata_count(struct xe_eudebug *d, + struct 
xe_vm *vm, + struct list_head *debug_metadata) +{ + struct xe_vma_debug_metadata *metadata; + struct xe_debug_metadata *mdata; + int h_m = 0, metadata_count = 0; + + if (!debug_metadata) + return 0; + + list_for_each_entry(metadata, debug_metadata, link) { + mdata = xe_debug_metadata_get(vm->xef, metadata->metadata_id); + if (mdata) { + h_m = find_handle(d->res, XE_EUDEBUG_RES_TYPE_METADATA, mdata); + xe_debug_metadata_put(mdata); + } + + if (!mdata || h_m < 0) { + if (!mdata) { + eu_err(d, "Metadata::%u not found.", + metadata->metadata_id); + } else { + eu_err(d, "Metadata::%u not in the xe debugger", + metadata->metadata_id); + } + xe_eudebug_disconnect(d, -ENOENT); + return -ENOENT; + } + metadata_count++; + } + return metadata_count; +} + +static int vm_bind_op_metadata(struct xe_eudebug *d, struct xe_vm *vm, + const u32 flags, + const u64 op_ref_seqno, + struct list_head *debug_metadata) +{ + struct xe_vma_debug_metadata *metadata; + int h_m = 0; /* handle space range = <1, MAX_INT>, return 0 if metadata not attached */ + int metadata_count = 0; + int ret; + + if (!debug_metadata) + return 0; + + XE_WARN_ON(flags != DRM_XE_EUDEBUG_EVENT_CREATE); + + list_for_each_entry(metadata, debug_metadata, link) { + struct xe_debug_metadata *mdata; + + mdata = xe_debug_metadata_get(vm->xef, metadata->metadata_id); + if (mdata) { + h_m = find_handle(d->res, XE_EUDEBUG_RES_TYPE_METADATA, mdata); + xe_debug_metadata_put(mdata); + } + + if (!mdata || h_m < 0) { + eu_err(d, "Attached debug metadata::%u not found!\n", + metadata->metadata_id); + return -ENOENT; + } + + ret = vm_bind_op_metadata_event(d, vm, flags, op_ref_seqno, + h_m, metadata->cookie); + if (ret < 0) + return ret; + + metadata_count++; + } + + return metadata_count; +} + static int vm_bind_op(struct xe_eudebug *d, struct xe_vm *vm, const u32 flags, const u64 bind_ref_seqno, - u64 addr, u64 range) + u64 addr, u64 range, + struct list_head *debug_metadata) { u64 op_seqno = 0; - u64 num_extensions = 0; + u64 num_extensions; int ret; + ret = vm_bind_op_metadata_count(d, vm, debug_metadata); + if (ret < 0) + return ret; + + num_extensions = ret; + ret = vm_bind_op_event(d, vm, flags, bind_ref_seqno, num_extensions, addr, range, &op_seqno); if (ret) return ret; + ret = vm_bind_op_metadata(d, vm, flags, op_seqno, debug_metadata); + if (ret < 0) + return ret; + + if (ret != num_extensions) { + eu_err(d, "Inconsistency in metadata detected."); + return -EINVAL; + } + return 0; } @@ -2774,9 +2990,11 @@ void xe_eudebug_vm_bind_start(struct xe_vm *vm) xe_eudebug_put(d); } -void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) +void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range, + struct drm_gpuva_ops *ops) { struct xe_eudebug *d; + struct list_head *debug_metadata = NULL; u32 flags; if (!xe_vm_in_lr_mode(vm)) @@ -2786,7 +3004,17 @@ void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) case DRM_XE_VM_BIND_OP_MAP: case DRM_XE_VM_BIND_OP_MAP_USERPTR: { + struct drm_gpuva_op *__op; + flags = DRM_XE_EUDEBUG_EVENT_CREATE; + + /* OP_MAP will be last and singleton */ + drm_gpuva_for_each_op(__op, ops) { + struct xe_vma_op *op = gpuva_op_to_vma_op(__op); + + if (op->base.op == DRM_GPUVA_OP_MAP) + debug_metadata = &op->map.vma->eudebug.metadata.list; + } break; } case DRM_XE_VM_BIND_OP_UNMAP: @@ -2805,7 +3033,8 @@ void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) if (!d) return; - xe_eudebug_event_put(d, vm_bind_op(d, vm, flags, 0, addr, range)); + 
xe_eudebug_event_put(d, vm_bind_op(d, vm, flags, 0, addr, range, + debug_metadata)); } static struct xe_eudebug_event *fetch_bind_event(struct xe_vm * const vm) @@ -2934,8 +3163,89 @@ int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence) return err; } +static int send_debug_metadata_event(struct xe_eudebug *d, u32 flags, + u64 client_handle, u64 metadata_handle, + u64 type, u64 len, u64 seqno) +{ + struct xe_eudebug_event *event; + struct xe_eudebug_event_metadata *e; + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_METADATA, seqno, + flags, sizeof(*e)); + if (!event) + return -ENOMEM; + + e = cast_event(e, event); + + write_member(struct drm_xe_eudebug_event_metadata, e, client_handle, client_handle); + write_member(struct drm_xe_eudebug_event_metadata, e, metadata_handle, metadata_handle); + write_member(struct drm_xe_eudebug_event_metadata, e, type, type); + write_member(struct drm_xe_eudebug_event_metadata, e, len, len); + + return xe_eudebug_queue_event(d, event); +} + +static int debug_metadata_create_event(struct xe_eudebug *d, + struct xe_file *xef, struct xe_debug_metadata *m) +{ + int h_c, h_m; + u64 seqno; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); + if (h_c < 0) + return h_c; + + h_m = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_METADATA, m, &seqno); + if (h_m <= 0) + return h_m; + + return send_debug_metadata_event(d, DRM_XE_EUDEBUG_EVENT_CREATE, + h_c, h_m, m->type, m->len, seqno); +} + +static int debug_metadata_destroy_event(struct xe_eudebug *d, + struct xe_file *xef, struct xe_debug_metadata *m) +{ + int h_c, h_m; + u64 seqno; + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef); + if (h_c < 0) + return h_c; + + h_m = xe_eudebug_remove_handle(d, XE_EUDEBUG_RES_TYPE_METADATA, m, &seqno); + if (h_m < 0) + return h_m; + + return send_debug_metadata_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY, + h_c, h_m, m->type, m->len, seqno); +} + +void xe_eudebug_debug_metadata_create(struct xe_file *xef, struct xe_debug_metadata *m) +{ + struct xe_eudebug *d; + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, debug_metadata_create_event(d, xef, m)); +} + +void xe_eudebug_debug_metadata_destroy(struct xe_file *xef, struct xe_debug_metadata *m) +{ + struct xe_eudebug *d; + + d = xe_eudebug_get(xef); + if (!d) + return; + + xe_eudebug_event_put(d, debug_metadata_destroy_event(d, xef, m)); +} + static int discover_client(struct xe_eudebug *d, struct xe_file *xef) { + struct xe_debug_metadata *m; struct xe_exec_queue *q; struct xe_vm *vm; unsigned long i; @@ -2945,6 +3255,12 @@ static int discover_client(struct xe_eudebug *d, struct xe_file *xef) if (err) return err; + xa_for_each(&xef->eudebug.metadata.xa, i, m) { + err = debug_metadata_create_event(d, xef, m); + if (err) + break; + } + xa_for_each(&xef->vm.xa, i, vm) { err = vm_create_event(d, xef, vm); if (err) diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index 13ba0167b31b..572493d341ff 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -16,6 +16,8 @@ struct xe_vma; struct xe_exec_queue; struct xe_hw_engine; struct xe_user_fence; +struct xe_debug_metadata; +struct drm_gpuva_ops; #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) @@ -39,7 +41,8 @@ void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_exec_queue *q) void xe_eudebug_vm_init(struct xe_vm *vm); void xe_eudebug_vm_bind_start(struct xe_vm *vm); -void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range); +void 
xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range, + struct drm_gpuva_ops *ops); void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int err); int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence); @@ -49,6 +52,9 @@ void xe_eudebug_ufence_fini(struct xe_user_fence *ufence); struct xe_eudebug *xe_eudebug_get(struct xe_file *xef); void xe_eudebug_put(struct xe_eudebug *d); +void xe_eudebug_debug_metadata_create(struct xe_file *xef, struct xe_debug_metadata *m); +void xe_eudebug_debug_metadata_destroy(struct xe_file *xef, struct xe_debug_metadata *m); + #else static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, @@ -71,7 +77,8 @@ static inline void xe_eudebug_exec_queue_destroy(struct xe_file *xef, struct xe_ static inline void xe_eudebug_vm_init(struct xe_vm *vm) { } static inline void xe_eudebug_vm_bind_start(struct xe_vm *vm) { } -static inline void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range) { } +static inline void xe_eudebug_vm_bind_op_add(struct xe_vm *vm, u32 op, u64 addr, u64 range, + struct drm_gpuva_ops *ops) { } static inline void xe_eudebug_vm_bind_end(struct xe_vm *vm, bool has_ufence, int err) { } static inline int xe_eudebug_vm_bind_ufence(struct xe_user_fence *ufence) { return 0; } @@ -82,6 +89,16 @@ static inline void xe_eudebug_ufence_fini(struct xe_user_fence *ufence) { } static inline struct xe_eudebug *xe_eudebug_get(struct xe_file *xef) { return NULL; } static inline void xe_eudebug_put(struct xe_eudebug *d) { } +static inline void xe_eudebug_debug_metadata_create(struct xe_file *xef, + struct xe_debug_metadata *m) +{ +} + +static inline void xe_eudebug_debug_metadata_destroy(struct xe_file *xef, + struct xe_debug_metadata *m) +{ +} + #endif /* CONFIG_DRM_XE_EUDEBUG */ #endif diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h index ffb0dc71430a..a69051b04698 100644 --- a/drivers/gpu/drm/xe/xe_eudebug_types.h +++ b/drivers/gpu/drm/xe/xe_eudebug_types.h @@ -56,7 +56,8 @@ struct xe_eudebug_resource { #define XE_EUDEBUG_RES_TYPE_VM 1 #define XE_EUDEBUG_RES_TYPE_EXEC_QUEUE 2 #define XE_EUDEBUG_RES_TYPE_LRC 3 -#define XE_EUDEBUG_RES_TYPE_COUNT (XE_EUDEBUG_RES_TYPE_LRC + 1) +#define XE_EUDEBUG_RES_TYPE_METADATA 4 +#define XE_EUDEBUG_RES_TYPE_COUNT (XE_EUDEBUG_RES_TYPE_METADATA + 1) /** * struct xe_eudebug_resources - eudebug resources for all types @@ -326,4 +327,28 @@ struct xe_eudebug_event_vm_bind_ufence { u64 vm_bind_ref_seqno; }; +struct xe_eudebug_event_metadata { + struct xe_eudebug_event base; + + /** @client_handle: client for the attention */ + u64 client_handle; + + /** @metadata_handle: debug metadata handle it's created/destroyed */ + u64 metadata_handle; + + /* @type: metadata type, refer to xe_drm.h for options */ + u64 type; + + /* @len: size of metadata paylad */ + u64 len; +}; + +struct xe_eudebug_event_vm_bind_op_metadata { + struct xe_eudebug_event base; + u64 vm_bind_op_ref_seqno; + + u64 metadata_handle; + u64 metadata_cookie; +}; + #endif diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 19c0b36c10b1..474521d0fea9 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3178,7 +3178,7 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file) if (err) goto unwind_ops; - xe_eudebug_vm_bind_op_add(vm, op, addr, range); + xe_eudebug_vm_bind_op_add(vm, op, addr, range, ops[i]); #ifdef TEST_VM_OPS_ERROR if (flags & FORCE_OP_ERROR) { diff --git 
a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h index a5f13563b3b9..3c4d1b511acd 100644 --- a/include/uapi/drm/xe_drm_eudebug.h +++ b/include/uapi/drm/xe_drm_eudebug.h @@ -19,6 +19,7 @@ extern "C" { #define DRM_XE_EUDEBUG_IOCTL_EU_CONTROL _IOWR('j', 0x2, struct drm_xe_eudebug_eu_control) #define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT _IOW('j', 0x4, struct drm_xe_eudebug_ack_event) #define DRM_XE_EUDEBUG_IOCTL_VM_OPEN _IOW('j', 0x1, struct drm_xe_eudebug_vm_open) +#define DRM_XE_EUDEBUG_IOCTL_READ_METADATA _IOWR('j', 0x3, struct drm_xe_eudebug_read_metadata) /* XXX: Document events to match their internal counterparts when moved to xe_drm.h */ struct drm_xe_eudebug_event { @@ -35,6 +36,8 @@ struct drm_xe_eudebug_event { #define DRM_XE_EUDEBUG_EVENT_VM_BIND 7 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP 8 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE 9 +#define DRM_XE_EUDEBUG_EVENT_METADATA 10 +#define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA 11 __u16 flags; #define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0) @@ -206,6 +209,33 @@ struct drm_xe_eudebug_vm_open { __u64 timeout_ns; }; +struct drm_xe_eudebug_read_metadata { + __u64 client_handle; + __u64 metadata_handle; + __u32 flags; + __u32 reserved; + __u64 ptr; + __u64 size; +}; + +struct drm_xe_eudebug_event_metadata { + struct drm_xe_eudebug_event base; + + __u64 client_handle; + __u64 metadata_handle; + /* XXX: Refer to xe_drm.h for fields */ + __u64 type; + __u64 len; +}; + +struct drm_xe_eudebug_event_vm_bind_op_metadata { + struct drm_xe_eudebug_event base; + __u64 vm_bind_op_ref_seqno; /* *_event_vm_bind_op.base.seqno */ + + __u64 metadata_handle; + __u64 metadata_cookie; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Dec 9 13:33:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899790 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 235C4E77184 for ; Mon, 9 Dec 2024 13:33:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9DC7310E75D; Mon, 9 Dec 2024 13:33:36 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="d8cOHGXw"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7624510E75A; Mon, 9 Dec 2024 13:33:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751214; x=1765287214; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yJZ3E1Weo84L0EbHQE2eIfQxqwcRP9nvS7/37YmRuCE=; b=d8cOHGXwO5V7X6YsWB4Sm9jgc/uPM/N9HnrLwQWP432q/XDmNfIvQeJw oAt/JY+nGck/QWWYBL9VoESWHf4ccn9yID8M7Nu4zMRRWiLICiX2T+q6R J5sVY4AOC9kHL2hI4aI09tXrj4+6Sk87v7d29MLGib9Nokph5JSRKyYHi iMcV/wdk+laOD2YBS2Op1H6SBUAXcYdtFCo5+tRUkO2CknafpTxBO5LIi JiE9qbpLlS1AhGmHRgU6f1rIJn7YCko0benU9cEhEmJnlhusVQUyDT10j oRGpPi3DyzB4nujfgqFAGRmJAA9U0W/XY2gkB8+tyzsxUNTNbOkVXcb+F g==; X-CSE-ConnectionGUID: fgAWtvVcSIK31xMwvYupzQ== X-CSE-MsgGUID: NY1qpZbfTC6sWecJnO8e8w== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; 
a="34192122" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192122" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:34 -0800 X-CSE-ConnectionGUID: HCvpsePTT8COXuEpn3WY9A== X-CSE-MsgGUID: JlJIjrBVQ3K6/xcDy8GagA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531349" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:33 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Mika Kuoppala , Matthew Brost Subject: [PATCH 18/26] drm/xe/eudebug: Implement vm_bind_op discovery Date: Mon, 9 Dec 2024 15:33:09 +0200 Message-ID: <20241209133318.1806472-19-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Follow the vm bind, vm_bind op sequence for discovery process of a vm with the vmas it has. Send events for ops and attach metadata if available. v2: - Fix bad op ref seqno (Christoph) - with discovery semaphore, we dont need vm lock (Matthew) Cc: Matthew Brost Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 45 +++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 2514b880d871..e17b8f98c7b6 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -3243,6 +3243,47 @@ void xe_eudebug_debug_metadata_destroy(struct xe_file *xef, struct xe_debug_meta xe_eudebug_event_put(d, debug_metadata_destroy_event(d, xef, m)); } +static int vm_discover_binds(struct xe_eudebug *d, struct xe_vm *vm) +{ + struct drm_gpuva *va; + unsigned int num_ops = 0, send_ops = 0; + u64 ref_seqno = 0; + int err; + + /* + * Currently only vm_bind_ioctl inserts vma's + * and with discovery lock, we have exclusivity. + */ + lockdep_assert_held_write(&d->xe->eudebug.discovery_lock); + + drm_gpuvm_for_each_va(va, &vm->gpuvm) + num_ops++; + + if (!num_ops) + return 0; + + err = vm_bind_event(d, vm, num_ops, &ref_seqno); + if (err) + return err; + + drm_gpuvm_for_each_va(va, &vm->gpuvm) { + struct xe_vma *vma = container_of(va, struct xe_vma, gpuva); + + if (send_ops >= num_ops) + break; + + err = vm_bind_op(d, vm, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno, + xe_vma_start(vma), xe_vma_size(vma), + &vma->eudebug.metadata.list); + if (err) + return err; + + send_ops++; + } + + return num_ops == send_ops ? 
0 : -EINVAL; +} + static int discover_client(struct xe_eudebug *d, struct xe_file *xef) { struct xe_debug_metadata *m; @@ -3265,6 +3306,10 @@ static int discover_client(struct xe_eudebug *d, struct xe_file *xef) err = vm_create_event(d, xef, vm); if (err) break; + + err = vm_discover_binds(d, vm); + if (err) + break; } xa_for_each(&xef->exec_queue.xa, i, q) { From patchwork Mon Dec 9 13:33:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899791 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C914AE77183 for ; Mon, 9 Dec 2024 13:33:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 40D1410E769; Mon, 9 Dec 2024 13:33:40 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="elveZy89"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id E933310E75E; Mon, 9 Dec 2024 13:33:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751217; x=1765287217; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UQMwoajffTUbXgHz9b6yXuc68B+cQ05epTQLAqmoFVM=; b=elveZy89My0Ta/3QBngKaiiuNegQEkK3qoUY9JKaexuW+qJNYrD/MXgl gRggKW4uOkiDsKS9rqDkxByROOAX9bk1peNYmt/c0YekkBLg/aUjqGcnL W/Ce2UjUT/JTAMW8S1A0jCYCudoNeRnbzJ1g4dCEYjooUel5mZwWmWFAa 5q+y4qkGsgoZHAmwR7+V/5k2/MWlipqagTC0QEOR/8uHuFHJ6RcreHu4g n5NmU1oLNMz3Inwb+7sS2cLuTqEcYQFvqmMHuNRw/CmL+aFzoYYOkdjEO 8fNXeq5tznxIZjwnoF91lRAkAuJSqyHnEZTfWCeuAjlwGtyU5W+oS3Exr A==; X-CSE-ConnectionGUID: NXjXnv7SRHWCCc5a8Rip9Q== X-CSE-MsgGUID: hGczOX+yQmKt7pjmDXJdiw== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192145" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192145" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:36 -0800 X-CSE-ConnectionGUID: Aknx1MlfSRKLCAuRcV5cRw== X-CSE-MsgGUID: MT8yhJhdTEWWNhVdxp7aGw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531355" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:35 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Christoph Manszewski , Dominik Grzegorzek , Maciej Patelczyk , Mika Kuoppala Subject: [PATCH 19/26] drm/xe/eudebug: Dynamically toggle debugger functionality Date: Mon, 9 Dec 2024 15:33:10 +0200 Message-ID: <20241209133318.1806472-20-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Christoph Manszewski Make it possible to dynamically enable/disable debugger functionality, including the setting and unsetting of required hw register values via a sysfs entry located at '/sys/class/drm/card/device/enable_eudebug'. This entry uses 'kstrtobool' and as such it accepts inputs as documented by this function, in particular '0' and '1'. v2: use new discovery_lock to gain exclusivity (Mika) v3: remove init_late and init_hw_engine (Dominik) Signed-off-by: Christoph Manszewski Signed-off-by: Dominik Grzegorzek Signed-off-by: Maciej Patelczyk Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_device.c | 2 - drivers/gpu/drm/xe/xe_device_types.h | 3 + drivers/gpu/drm/xe/xe_eudebug.c | 128 +++++++++++++++++++++++---- drivers/gpu/drm/xe/xe_eudebug.h | 4 - drivers/gpu/drm/xe/xe_exec_queue.c | 5 ++ drivers/gpu/drm/xe/xe_hw_engine.c | 1 - drivers/gpu/drm/xe/xe_reg_sr.c | 21 +++-- drivers/gpu/drm/xe/xe_reg_sr.h | 4 +- drivers/gpu/drm/xe/xe_rtp.c | 2 +- 9 files changed, 137 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index a7a715475184..3045f2a2ca1d 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -782,8 +782,6 @@ int xe_device_probe(struct xe_device *xe) xe_debugfs_register(xe); - xe_eudebug_init_late(xe); - xe_hwmon_register(xe); for_each_gt(gt, xe, id) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 4ab9f06eba2d..f081af5e729d 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -557,6 +557,9 @@ struct xe_device { /** discovery_lock: used for discovery to block xe ioctls */ struct rw_semaphore discovery_lock; + /** @enable: is the debugging functionality enabled */ + bool enable; + /** @attention_scan: attention scan worker */ struct delayed_work attention_scan; } eudebug; diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index e17b8f98c7b6..fe947d5350d8 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -2028,9 +2028,6 @@ xe_eudebug_connect(struct xe_device *xe, param->version = DRM_XE_EUDEBUG_VERSION; - if (!xe->eudebug.available) - return -EOPNOTSUPP; - d = kzalloc(sizeof(*d), GFP_KERNEL); if (!d) return -ENOMEM; @@ -2090,28 +2087,30 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev, { struct xe_device *xe = to_xe_device(dev); struct drm_xe_eudebug_connect * const param = data; - int ret = 0; - ret = xe_eudebug_connect(xe, param); + lockdep_assert_held(&xe->eudebug.discovery_lock); - return ret; + if (!xe->eudebug.enable) + return -ENODEV; + + return xe_eudebug_connect(xe, param); } static void add_sr_entry(struct xe_hw_engine *hwe, struct xe_reg_mcr mcr_reg, - u32 mask) + u32 mask, bool enable) { const struct xe_reg_sr_entry sr_entry = { .reg = mcr_reg.__reg, .clr_bits = mask, - .set_bits = mask, + .set_bits = enable ?
mask : 0, .read_mask = mask, }; - xe_reg_sr_add(&hwe->reg_sr, &sr_entry, hwe->gt); + xe_reg_sr_add(&hwe->reg_sr, &sr_entry, hwe->gt, true); } -void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe) +static void xe_eudebug_reinit_hw_engine(struct xe_hw_engine *hwe, bool enable) { struct xe_gt *gt = hwe->gt; struct xe_device *xe = gt_to_xe(gt); @@ -2123,23 +2122,113 @@ void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe) return; if (XE_WA(gt, 18022722726)) - add_sr_entry(hwe, ROW_CHICKEN, STALL_DOP_GATING_DISABLE); + add_sr_entry(hwe, ROW_CHICKEN, + STALL_DOP_GATING_DISABLE, enable); if (XE_WA(gt, 14015474168)) - add_sr_entry(hwe, ROW_CHICKEN2, XEHPC_DISABLE_BTB); + add_sr_entry(hwe, ROW_CHICKEN2, + XEHPC_DISABLE_BTB, + enable); if (xe->info.graphics_verx100 >= 1200) add_sr_entry(hwe, TD_CTL, TD_CTL_BREAKPOINT_ENABLE | TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE | - TD_CTL_FEH_AND_FEE_ENABLE); + TD_CTL_FEH_AND_FEE_ENABLE, + enable); if (xe->info.graphics_verx100 >= 1250) - add_sr_entry(hwe, TD_CTL, TD_CTL_GLOBAL_DEBUG_ENABLE); + add_sr_entry(hwe, TD_CTL, + TD_CTL_GLOBAL_DEBUG_ENABLE, enable); +} + +static int xe_eudebug_enable(struct xe_device *xe, bool enable) +{ + struct xe_gt *gt; + int i; + u8 id; + + if (!xe->eudebug.available) + return -EOPNOTSUPP; + + /* + * The connect ioctl has read lock so we can + * serialize with taking write + */ + down_write(&xe->eudebug.discovery_lock); + + if (!enable && !list_empty(&xe->eudebug.list)) { + up_write(&xe->eudebug.discovery_lock); + return -EBUSY; + } + + if (enable == xe->eudebug.enable) { + up_write(&xe->eudebug.discovery_lock); + return 0; + } + + for_each_gt(gt, xe, id) { + for (i = 0; i < ARRAY_SIZE(gt->hw_engines); i++) { + if (!(gt->info.engine_mask & BIT(i))) + continue; + + xe_eudebug_reinit_hw_engine(&gt->hw_engines[i], enable); + } + + xe_gt_reset_async(gt); + flush_work(&gt->reset.worker); + } + + xe->eudebug.enable = enable; + up_write(&xe->eudebug.discovery_lock); + + if (enable) + attention_scan_flush(xe); + else + attention_scan_cancel(xe); + + return 0; +} + +static ssize_t enable_eudebug_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev)); + + return sysfs_emit(buf, "%u\n", xe->eudebug.enable); +} + +static ssize_t enable_eudebug_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct xe_device *xe = pdev_to_xe_device(to_pci_dev(dev)); + bool enable; + int ret; + + ret = kstrtobool(buf, &enable); + if (ret) + return ret; + + ret = xe_eudebug_enable(xe, enable); + if (ret) + return ret; + + return count; +} + +static DEVICE_ATTR_RW(enable_eudebug); + +static void xe_eudebug_sysfs_fini(void *arg) +{ + struct xe_device *xe = arg; + + sysfs_remove_file(&xe->drm.dev->kobj, &dev_attr_enable_eudebug.attr); } void xe_eudebug_init(struct xe_device *xe) { + struct device *dev = xe->drm.dev; + int ret; + spin_lock_init(&xe->eudebug.lock); INIT_LIST_HEAD(&xe->eudebug.list); @@ -2150,14 +2239,17 @@ void xe_eudebug_init(struct xe_device *xe) xe->eudebug.ordered_wq = alloc_ordered_workqueue("xe-eudebug-ordered-wq", 0); xe->eudebug.available = !!xe->eudebug.ordered_wq; -} -void xe_eudebug_init_late(struct xe_device *xe) -{ if (!xe->eudebug.available) return; - attention_scan_flush(xe); + ret = sysfs_create_file(&xe->drm.dev->kobj, &dev_attr_enable_eudebug.attr); + if (ret) + drm_warn(&xe->drm, "eudebug sysfs init failed: %d, debugger unavailable\n", ret); + else + devm_add_action_or_reset(dev, xe_eudebug_sysfs_fini, xe); + +
xe->eudebug.available = ret == 0; } void xe_eudebug_fini(struct xe_device *xe) diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h index 572493d341ff..a08abf796cc1 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.h +++ b/drivers/gpu/drm/xe/xe_eudebug.h @@ -26,9 +26,7 @@ int xe_eudebug_connect_ioctl(struct drm_device *dev, struct drm_file *file); void xe_eudebug_init(struct xe_device *xe); -void xe_eudebug_init_late(struct xe_device *xe); void xe_eudebug_fini(struct xe_device *xe); -void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe); void xe_eudebug_file_open(struct xe_file *xef); void xe_eudebug_file_close(struct xe_file *xef); @@ -62,9 +60,7 @@ static inline int xe_eudebug_connect_ioctl(struct drm_device *dev, struct drm_file *file) { return 0; } static inline void xe_eudebug_init(struct xe_device *xe) { } -static inline void xe_eudebug_init_late(struct xe_device *xe) { } static inline void xe_eudebug_fini(struct xe_device *xe) { } -static inline void xe_eudebug_init_hw_engine(struct xe_hw_engine *hwe) { } static inline void xe_eudebug_file_open(struct xe_file *xef) { } static inline void xe_eudebug_file_close(struct xe_file *xef) { } diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index cca46a32723e..044a0f2e1873 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -432,6 +432,11 @@ static int exec_queue_set_eudebug(struct xe_device *xe, struct xe_exec_queue *q, !(value & DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE))) return -EINVAL; +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) + if (XE_IOCTL_DBG(xe, !xe->eudebug.enable)) + return -EPERM; +#endif + q->eudebug_flags = EXEC_QUEUE_EUDEBUG_FLAG_ENABLE; q->sched_props.preempt_timeout_us = 0; diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index 8a188ddc99f4..c734aae88a57 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -559,7 +559,6 @@ static void hw_engine_init_early(struct xe_gt *gt, struct xe_hw_engine *hwe, xe_tuning_process_engine(hwe); xe_wa_process_engine(hwe); hw_engine_setup_default_state(hwe); - xe_eudebug_init_hw_engine(hwe); xe_reg_sr_init(&hwe->reg_whitelist, hwe->name, gt_to_xe(gt)); xe_reg_whitelist_process_engine(hwe); diff --git a/drivers/gpu/drm/xe/xe_reg_sr.c b/drivers/gpu/drm/xe/xe_reg_sr.c index e1a0e27cda14..e3a539c1c08e 100644 --- a/drivers/gpu/drm/xe/xe_reg_sr.c +++ b/drivers/gpu/drm/xe/xe_reg_sr.c @@ -93,22 +93,31 @@ static void reg_sr_inc_error(struct xe_reg_sr *sr) int xe_reg_sr_add(struct xe_reg_sr *sr, const struct xe_reg_sr_entry *e, - struct xe_gt *gt) + struct xe_gt *gt, + bool overwrite) { unsigned long idx = e->reg.addr; struct xe_reg_sr_entry *pentry = xa_load(&sr->xa, idx); int ret; if (pentry) { - if (!compatible_entries(pentry, e)) { + if (overwrite && e->set_bits) { + pentry->clr_bits |= e->clr_bits; + pentry->set_bits |= e->set_bits; + pentry->read_mask |= e->read_mask; + } else if (overwrite && !e->set_bits) { + pentry->clr_bits |= e->clr_bits; + pentry->set_bits &= ~e->clr_bits; + pentry->read_mask |= e->read_mask; + } else if (!compatible_entries(pentry, e)) { ret = -EINVAL; goto fail; + } else { + pentry->clr_bits |= e->clr_bits; + pentry->set_bits |= e->set_bits; + pentry->read_mask |= e->read_mask; } - pentry->clr_bits |= e->clr_bits; - pentry->set_bits |= e->set_bits; - pentry->read_mask |= e->read_mask; - return 0; } diff --git a/drivers/gpu/drm/xe/xe_reg_sr.h b/drivers/gpu/drm/xe/xe_reg_sr.h index 51fbba423e27..d67fafdcd847 100644 
--- a/drivers/gpu/drm/xe/xe_reg_sr.h +++ b/drivers/gpu/drm/xe/xe_reg_sr.h @@ -6,6 +6,8 @@ #ifndef _XE_REG_SR_ #define _XE_REG_SR_ +#include + /* * Reg save/restore bookkeeping */ @@ -21,7 +23,7 @@ int xe_reg_sr_init(struct xe_reg_sr *sr, const char *name, struct xe_device *xe) void xe_reg_sr_dump(struct xe_reg_sr *sr, struct drm_printer *p); int xe_reg_sr_add(struct xe_reg_sr *sr, const struct xe_reg_sr_entry *e, - struct xe_gt *gt); + struct xe_gt *gt, bool overwrite); void xe_reg_sr_apply_mmio(struct xe_reg_sr *sr, struct xe_gt *gt); void xe_reg_sr_apply_whitelist(struct xe_hw_engine *hwe); diff --git a/drivers/gpu/drm/xe/xe_rtp.c b/drivers/gpu/drm/xe/xe_rtp.c index b13d4d62f0b1..6006f7c90cac 100644 --- a/drivers/gpu/drm/xe/xe_rtp.c +++ b/drivers/gpu/drm/xe/xe_rtp.c @@ -153,7 +153,7 @@ static void rtp_add_sr_entry(const struct xe_rtp_action *action, }; sr_entry.reg.addr += mmio_base; - xe_reg_sr_add(sr, &sr_entry, gt); + xe_reg_sr_add(sr, &sr_entry, gt, false); } static bool rtp_process_one_sr(const struct xe_rtp_entry_sr *entry, From patchwork Mon Dec 9 13:33:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899792 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18623E77184 for ; Mon, 9 Dec 2024 13:33:41 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6168110E76B; Mon, 9 Dec 2024 13:33:40 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="W8B21MjK"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8647010E740; Mon, 9 Dec 2024 13:33:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751218; x=1765287218; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2mBWGMUlMSlsOjWsvtMpSZ/thOHKnwzkWJWBYWExdDU=; b=W8B21MjKvPasLu27Cl9yxzTetXVQ3YvloIophrTYkbCt8CkzS+fvT/F+ PF/Rmm6ugJz+jIWfIJTuJ2olBeHfODmuHfLujRyFwDiaAqBvuD0LnWp8q dgsRp8Ar9Aes8vzkGQFp7noYGkZSEeRJ+vxo4LpA5JBFZB6039K2M04ku KLtxWY7aGAzAqkBMnvueSA61wG7yXht0dCN9Cnwind9/SJkXQWNCDRPOM sDJ5xfQ9CjOtlmwRj8fCopLuJHEOye5fF+S4bTtuqBbQvfWh1BE/cWM/b 6W5vW2nSf+XwKGXIee7W8IUpcyHghHtKhSbgTJSuVE2LxEDQi8PVoc2c2 w==; X-CSE-ConnectionGUID: 0s9zO6ikRdO84BvR/hn5lQ== X-CSE-MsgGUID: 3paQHqODSTa6w7En0cZY0g== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192155" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192155" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:38 -0800 X-CSE-ConnectionGUID: W6CyEpZARCKliAYMf8w/BQ== X-CSE-MsgGUID: TF0aPbwsQmOTgYac8SzVxQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531361" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:37 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: 
dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Christoph Manszewski , Mika Kuoppala Subject: [PATCH 20/26] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Date: Mon, 9 Dec 2024 15:33:11 +0200 Message-ID: <20241209133318.1806472-21-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Christoph Manszewski Introduce kunit test for eudebug. For now it checks the dynamic application of WAs. v2: adapt to removal of call_for_each_device (Mika) v3: s/FW_RENDER/FORCEWAKE_ALL (Mika) Signed-off-by: Christoph Manszewski Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/tests/xe_eudebug.c | 176 ++++++++++++++++++++ drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 5 + drivers/gpu/drm/xe/xe_eudebug.c | 4 + 3 files changed, 185 insertions(+) create mode 100644 drivers/gpu/drm/xe/tests/xe_eudebug.c diff --git a/drivers/gpu/drm/xe/tests/xe_eudebug.c b/drivers/gpu/drm/xe/tests/xe_eudebug.c new file mode 100644 index 000000000000..d47e4ff259cb --- /dev/null +++ b/drivers/gpu/drm/xe/tests/xe_eudebug.c @@ -0,0 +1,176 @@ +// SPDX-License-Identifier: GPL-2.0 AND MIT +/* + * Copyright © 2024 Intel Corporation + */ + +#include + +#include "tests/xe_kunit_helpers.h" +#include "tests/xe_pci_test.h" +#include "tests/xe_test.h" + +#undef XE_REG_MCR +#define XE_REG_MCR(r_, ...) ((const struct xe_reg_mcr){ \ + .__reg = XE_REG_INITIALIZER(r_, ##__VA_ARGS__, .mcr = 1) \ + }) + +static const char *reg_to_str(struct xe_reg reg) +{ + if (reg.raw == TD_CTL.__reg.raw) + return "TD_CTL"; + else if (reg.raw == CS_DEBUG_MODE2(RENDER_RING_BASE).raw) + return "CS_DEBUG_MODE2"; + else if (reg.raw == ROW_CHICKEN.__reg.raw) + return "ROW_CHICKEN"; + else if (reg.raw == ROW_CHICKEN2.__reg.raw) + return "ROW_CHICKEN2"; + else if (reg.raw == ROW_CHICKEN3.__reg.raw) + return "ROW_CHICKEN3"; + else + return "UNKNOWN REG"; +} + +static u32 get_reg_mask(struct xe_device *xe, struct xe_reg reg) +{ + struct kunit *test = kunit_get_current_test(); + u32 val = 0; + + if (reg.raw == TD_CTL.__reg.raw) { + val = TD_CTL_BREAKPOINT_ENABLE | + TD_CTL_FORCE_THREAD_BREAKPOINT_ENABLE | + TD_CTL_FEH_AND_FEE_ENABLE; + + if (GRAPHICS_VERx100(xe) >= 1250) + val |= TD_CTL_GLOBAL_DEBUG_ENABLE; + + } else if (reg.raw == CS_DEBUG_MODE2(RENDER_RING_BASE).raw) { + val = GLOBAL_DEBUG_ENABLE; + } else if (reg.raw == ROW_CHICKEN.__reg.raw) { + val = STALL_DOP_GATING_DISABLE; + } else if (reg.raw == ROW_CHICKEN2.__reg.raw) { + val = XEHPC_DISABLE_BTB; + } else if (reg.raw == ROW_CHICKEN3.__reg.raw) { + val = XE2_EUPEND_CHK_FLUSH_DIS; + } else { + kunit_warn(test, "Invalid register selection: %u\n", reg.raw); + } + + return val; +} + +static u32 get_reg_expected(struct xe_device *xe, struct xe_reg reg, bool enable_eudebug) +{ + u32 reg_mask = get_reg_mask(xe, reg); + u32 reg_bits = 0; + + if (enable_eudebug || reg.raw == ROW_CHICKEN3.__reg.raw) + reg_bits = reg_mask; + else + reg_bits = 0; + + return reg_bits; +} + +static void check_reg(struct xe_gt *gt, bool enable_eudebug, struct xe_reg reg) +{ + struct kunit *test = kunit_get_current_test(); + struct xe_device *xe = gt_to_xe(gt); + u32 
reg_bits_expected = get_reg_expected(xe, reg, enable_eudebug); + u32 reg_mask = get_reg_mask(xe, reg); + u32 reg_bits = 0; + + if (reg.mcr) + reg_bits = xe_gt_mcr_unicast_read_any(gt, (struct xe_reg_mcr){.__reg = reg}); + else + reg_bits = xe_mmio_read32(&gt->mmio, reg); + + reg_bits &= reg_mask; + + kunit_printk(KERN_DEBUG, test, "%s bits: expected == 0x%x; actual == 0x%x\n", + reg_to_str(reg), reg_bits_expected, reg_bits); + KUNIT_EXPECT_EQ_MSG(test, reg_bits_expected, reg_bits, + "Invalid bits set for %s\n", reg_to_str(reg)); +} + +static void __check_regs(struct xe_gt *gt, bool enable_eudebug) +{ + struct xe_device *xe = gt_to_xe(gt); + + if (GRAPHICS_VERx100(xe) >= 1200) + check_reg(gt, enable_eudebug, TD_CTL.__reg); + + if (GRAPHICS_VERx100(xe) >= 1250 && GRAPHICS_VERx100(xe) <= 1274) + check_reg(gt, enable_eudebug, ROW_CHICKEN.__reg); + + if (xe->info.platform == XE_PVC) + check_reg(gt, enable_eudebug, ROW_CHICKEN2.__reg); + + if (GRAPHICS_VERx100(xe) >= 2000 && GRAPHICS_VERx100(xe) <= 2004) + check_reg(gt, enable_eudebug, ROW_CHICKEN3.__reg); +} + +static void check_regs(struct xe_device *xe, bool enable_eudebug) +{ + struct kunit *test = kunit_get_current_test(); + struct xe_gt *gt; + unsigned int fw_ref; + u8 id; + + kunit_printk(KERN_DEBUG, test, "Check regs for eudebug %s\n", + enable_eudebug ? "enabled" : "disabled"); + + xe_pm_runtime_get(xe); + for_each_gt(gt, xe, id) { + if (xe_gt_is_media_type(gt)) + continue; + + /* XXX: Figure out per platform proper domain */ + fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); + KUNIT_ASSERT_TRUE_MSG(test, fw_ref, "Forcewake failed.\n"); + + __check_regs(gt, enable_eudebug); + + xe_force_wake_put(gt_to_fw(gt), fw_ref); + } + xe_pm_runtime_put(xe); +} + +static int toggle_reg_value(struct xe_device *xe) +{ + struct kunit *test = kunit_get_current_test(); + bool enable_eudebug = xe->eudebug.enable; + + kunit_printk(KERN_DEBUG, test, "Test eudebug WAs for graphics version: %u\n", + GRAPHICS_VERx100(xe)); + + check_regs(xe, enable_eudebug); + + xe_eudebug_enable(xe, !enable_eudebug); + check_regs(xe, !enable_eudebug); + + xe_eudebug_enable(xe, enable_eudebug); + check_regs(xe, enable_eudebug); + + return 0; +} + +static void xe_eudebug_toggle_reg_kunit(struct kunit *test) +{ + struct xe_device *xe = test->priv; + + toggle_reg_value(xe); +} + +static struct kunit_case xe_eudebug_tests[] = { + KUNIT_CASE_PARAM(xe_eudebug_toggle_reg_kunit, + xe_pci_live_device_gen_param), + {} +}; + +VISIBLE_IF_KUNIT +struct kunit_suite xe_eudebug_test_suite = { + .name = "xe_eudebug", + .test_cases = xe_eudebug_tests, + .init = xe_kunit_helper_xe_device_live_test_init, +}; +EXPORT_SYMBOL_IF_KUNIT(xe_eudebug_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c index 5f14737c8210..7dd8a0a4bdfd 100644 --- a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c +++ b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c @@ -15,6 +15,11 @@ kunit_test_suite(xe_dma_buf_test_suite); kunit_test_suite(xe_migrate_test_suite); kunit_test_suite(xe_mocs_test_suite); +#if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG) +extern struct kunit_suite xe_eudebug_test_suite; +kunit_test_suite(xe_eudebug_test_suite); +#endif + MODULE_AUTHOR("Intel Corporation"); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("xe live kunit tests"); diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index fe947d5350d8..f44cc0f8290e 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -3947,3 +3947,7 @@
xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg) return ret; } + +#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) +#include "tests/xe_eudebug.c" +#endif From patchwork Mon Dec 9 13:33:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4BDE9E77185 for ; Mon, 9 Dec 2024 13:33:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7445210E760; Mon, 9 Dec 2024 13:33:41 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="H5hfLQlq"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0BCA510E773; Mon, 9 Dec 2024 13:33:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751220; x=1765287220; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RThmuulfUqOi6traoABdpzCGlKL3v7YFxMezvg+1QfM=; b=H5hfLQlq4vrTzgC9Zkv2LV4IoH6XZBYf1edOqxDbXWqp/XBZQFdXshUi khnsCfLceZslI+Ezaxdh/2Gn9uJrKuHCywrtC3kCaRhblTpcc9W7sj7QQ cBJthN1hUs1EqevuOXBDXE1/5zXTAjybGEI0PMA0B0aujre1O5MKc3oxO KFr0yXyeI9xmiwkqzpdHTRyyzHwv6DB2WrRRr02TCE5xPi6cRMVYxxFTG RKFseytJlfJtTeMnUblMTviwoJSa7cvX8dhYzofg2GW9tDR7K5EC8Edy5 dWa8Na6GbAY7MEV5PzOuRLiNO7KpBT+sVkRHqdbbuvxjmLE7hp+7dW1kM A==; X-CSE-ConnectionGUID: mXnOcPJMR46kNp3CpI5Zuw== X-CSE-MsgGUID: YzOqFb4xRNCxYz9UJdjChg== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192166" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192166" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:40 -0800 X-CSE-ConnectionGUID: EgMafpxUTpmPuV+Jf7pL9A== X-CSE-MsgGUID: P/jMAcn4TM2ke/XRu48FjA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531365" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:39 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Mika Kuoppala Subject: [PATCH 21/26] drm/xe/eudebug/ptl: Add support for extra attention register Date: Mon, 9 Dec 2024 15:33:12 +0200 Message-ID: <20241209133318.1806472-22-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Dominik Grzegorzek xe3 can set bits within an additional attention bit register EU_ATT1. 
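On xe3 this doubles the attention rows to be read per DSS from two to four, so the fixed TD_EU_ATTENTION_MAX_ROWS constant no longer fits every platform.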
Recalculate bitmask and make sure we read all required data. Signed-off-by: Dominik Grzegorzek Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 4 ++-- drivers/gpu/drm/xe/xe_gt_debug.c | 8 ++++---- drivers/gpu/drm/xe/xe_gt_debug.h | 8 ++++++-- 3 files changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index f44cc0f8290e..c259e5804386 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -1858,7 +1858,7 @@ static int check_attn_mcr(struct xe_gt *gt, void *data, struct xe_eudebug *d = iter->debugger; unsigned int row; - for (row = 0; row < TD_EU_ATTENTION_MAX_ROWS; row++) { + for (row = 0; row < xe_gt_debug_eu_att_rows(gt); row++) { u32 val, cur = 0; if (iter->i >= iter->size) @@ -1891,7 +1891,7 @@ static int clear_attn_mcr(struct xe_gt *gt, void *data, struct xe_eudebug *d = iter->debugger; unsigned int row; - for (row = 0; row < TD_EU_ATTENTION_MAX_ROWS; row++) { + for (row = 0; row < xe_gt_debug_eu_att_rows(gt); row++) { u32 val; if (iter->i >= iter->size) diff --git a/drivers/gpu/drm/xe/xe_gt_debug.c b/drivers/gpu/drm/xe/xe_gt_debug.c index f35b9df5e41b..49f24db9da9c 100644 --- a/drivers/gpu/drm/xe/xe_gt_debug.c +++ b/drivers/gpu/drm/xe/xe_gt_debug.c @@ -74,9 +74,9 @@ int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt) bitmap_or(dss_mask, gt->fuse_topo.c_dss_mask, gt->fuse_topo.g_dss_mask, XE_MAX_DSS_FUSE_BITS); - return bitmap_weight(dss_mask, XE_MAX_DSS_FUSE_BITS) * - TD_EU_ATTENTION_MAX_ROWS * MAX_THREADS * - MAX_EUS_PER_ROW / 8; + return bitmap_weight(dss_mask, XE_MAX_DSS_FUSE_BITS) * + xe_gt_debug_eu_att_rows(gt) * MAX_THREADS * + MAX_EUS_PER_ROW / 8; } struct attn_read_iter { @@ -92,7 +92,7 @@ static int read_eu_attentions_mcr(struct xe_gt *gt, void *data, struct attn_read_iter * const iter = data; unsigned int row; - for (row = 0; row < TD_EU_ATTENTION_MAX_ROWS; row++) { + for (row = 0; row < xe_gt_debug_eu_att_rows(gt); row++) { u32 val; if (iter->i >= iter->size) diff --git a/drivers/gpu/drm/xe/xe_gt_debug.h b/drivers/gpu/drm/xe/xe_gt_debug.h index 342082699ff6..1edb667154f1 100644 --- a/drivers/gpu/drm/xe/xe_gt_debug.h +++ b/drivers/gpu/drm/xe/xe_gt_debug.h @@ -6,12 +6,16 @@ #ifndef __XE_GT_DEBUG_ #define __XE_GT_DEBUG_ -#define TD_EU_ATTENTION_MAX_ROWS 2u - +#include "xe_device_types.h" #include "xe_gt_types.h" #define XE_GT_ATTENTION_TIMEOUT_MS 100 +static inline unsigned int xe_gt_debug_eu_att_rows(struct xe_gt *gt) +{ + return (GRAPHICS_VERx100(gt_to_xe(gt)) >= 3000) ? 
4u : 2u; +} + int xe_gt_eu_threads_needing_attention(struct xe_gt *gt); int xe_gt_foreach_dss_group_instance(struct xe_gt *gt, int (*fn)(struct xe_gt *gt, From patchwork Mon Dec 9 13:33:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899794 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 55718E77182 for ; Mon, 9 Dec 2024 13:33:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B882E10E744; Mon, 9 Dec 2024 13:33:42 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="lDeyMtX8"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id D95E310E768; Mon, 9 Dec 2024 13:33:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751222; x=1765287222; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wjKEhuQIZxZQamJ2eGfMVpXxIsfEqV84VEgXGtOiwjw=; b=lDeyMtX8osycrZALAtEPXpDJ/qbxCZDYI41dj/Yd74PSiNtCurBCIIR3 cKSRb6qUl/VP6TAhR4Taq1O7Qb7PP30mMHesOp/Yg1zoGLBKbe52Itjkq QVWga8m+1k0qDTxr+wHUnY98zLq0L6pdXUxCulQ4WAonwj19kXcaeCcfq CEKwrBDmYoULC71iIzvu+pORh8CX7M3Nebg5gTLCpYlItQtCICwT/RO2c uQYfhw/MXRO1qggsvFWjqhSFcMe2lyWmqfS3MkVzdnHXrmaD1tD4sfIxb 6sOP05U5RcFAldSMcFtsE40l5tjIKWnKj/ckVTSkZCEzuAmoJYXTywI+c g==; X-CSE-ConnectionGUID: +Nf5L8oES1azjBE9JDAHuA== X-CSE-MsgGUID: OQFT6C4pQFqUhQOwwwQURQ== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192184" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192184" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:41 -0800 X-CSE-ConnectionGUID: vxNA+sVhSr6fCmdxZxdZtg== X-CSE-MsgGUID: 82la/lOaSG+6qjPjRoky+A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531373" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:40 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Dominik Grzegorzek , Mika Kuoppala Subject: [PATCH 22/26] drm/xe/eudebug/ptl: Add RCU_DEBUG_1 register support for xe3 Date: Mon, 9 Dec 2024 15:33:13 +0200 Message-ID: <20241209133318.1806472-23-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Dominik Grzegorzek Format of Register_RenderControlUnitDebug1 is different from previous gens. Adjust it so it matches PTL/xe3 format. 
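For illustration, the xe3 extraction added below starts the per-engine status fields at bit 6 with a stride of four bits, so the status for the engine at RCU_DEBUG_1 index i is read as status = (rcu_debug1 >> (6 + i * 4)) & RCU_DEBUG_1_ENGINE_STATUS; index 2, for example, reads the field starting at bit 14.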
Acked-by: Mika Kuoppala Signed-off-by: Dominik Grzegorzek Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index c259e5804386..09b455a96571 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -1443,6 +1443,17 @@ static u32 engine_status_xe2(const struct xe_hw_engine * const hwe, return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS; } +static u32 engine_status_xe3(const struct xe_hw_engine * const hwe, + u32 rcu_debug1) +{ + const unsigned int first = 6; + const unsigned int incr = 4; + const unsigned int i = rcu_debug1_engine_index(hwe); + const unsigned int shift = first + (i * incr); + + return (rcu_debug1 >> shift) & RCU_DEBUG_1_ENGINE_STATUS; +} + static u32 engine_status(const struct xe_hw_engine * const hwe, u32 rcu_debug1) { @@ -1452,6 +1463,8 @@ static u32 engine_status(const struct xe_hw_engine * const hwe, status = engine_status_xe1(hwe, rcu_debug1); else if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 30) status = engine_status_xe2(hwe, rcu_debug1); + else if (GRAPHICS_VER(gt_to_xe(hwe->gt)) < 35) + status = engine_status_xe3(hwe, rcu_debug1); else XE_WARN_ON(GRAPHICS_VER(gt_to_xe(hwe->gt))); From patchwork Mon Dec 9 13:33:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899796 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CA72CE77180 for ; Mon, 9 Dec 2024 13:33:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 54D3610E75F; Mon, 9 Dec 2024 13:33:50 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="EhBIQ97z"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 80FCF10E75E; Mon, 9 Dec 2024 13:33:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751224; x=1765287224; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+xCFeV01j1EWvHaalSUnfctLoI4K8h3clbgsY+oeZDA=; b=EhBIQ97zumyZnN1Y1udDSUAnfbr/zMIY1lH+ztxU5gqq48eCgcQdGbiG i6q/o602WBLhjcAa0oIYrhjAuUOk/WmTXMqGSgebwybdjUMDy0zq/lHHY rvH9+gvOxd6GrQrtopUOYqtyOLrj2Cjog4vKGYu3oGL8mQG459uPY6+PQ Pgr1LF36cuF46f18DwJE7mt9iL6sCRQ4/7KZzeA5FqogotYubgI+hZ5jd YTtD3GP2EiAo5/ECX+b5LOUB2SO5EWo0fCXmreTT42Q+8mkEipi/4liF2 SnS8YNlOVDXO78VzxRKKF94BFjUbs48dYhIVn7ut/3q0Bf8J9Z3f9iq2J g==; X-CSE-ConnectionGUID: 81FNxyMaRESiNbkxSkM4Sw== X-CSE-MsgGUID: jYoCNJnRSbCUWWnzFMouLw== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192190" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192190" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:43 -0800 X-CSE-ConnectionGUID: fCL5eKx2R2KlYOqIbCgRMA== X-CSE-MsgGUID: xpaxzA6DSHm/x+8jVMQ93g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; 
d="scan'208";a="99531377" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:42 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Gwan-gyeong Mun , Mika Kuoppala Subject: [PATCH 23/26] drm/xe/eudebug: Add read/count/compare helper for eu attention Date: Mon, 9 Dec 2024 15:33:14 +0200 Message-ID: <20241209133318.1806472-24-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Gwan-gyeong Mun Add xe_eu_attentions structure to capture and store eu attention bits. Add a function to count the number of eu threads that have turned on from eu attentions, and add a function to count the number of eu threads that have changed on a state between eu attentions. Signed-off-by: Gwan-gyeong Mun Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_gt_debug.c | 64 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_gt_debug.h | 15 ++++++++ 2 files changed, 79 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt_debug.c b/drivers/gpu/drm/xe/xe_gt_debug.c index 49f24db9da9c..a20e1e57212c 100644 --- a/drivers/gpu/drm/xe/xe_gt_debug.c +++ b/drivers/gpu/drm/xe/xe_gt_debug.c @@ -3,6 +3,7 @@ * Copyright © 2023 Intel Corporation */ +#include #include "regs/xe_gt_regs.h" #include "xe_device.h" #include "xe_force_wake.h" @@ -146,3 +147,66 @@ int xe_gt_eu_threads_needing_attention(struct xe_gt *gt) return err < 0 ? 0 : err; } + +static inline unsigned int +xe_eu_attentions_count(const struct xe_eu_attentions *a) +{ + return bitmap_weight((void *)a->att, a->size * BITS_PER_BYTE); +} + +void xe_gt_eu_attentions_read(struct xe_gt *gt, + struct xe_eu_attentions *a, + const unsigned int settle_time_ms) +{ + unsigned int prev = 0; + ktime_t end, now; + + now = ktime_get_raw(); + end = ktime_add_ms(now, settle_time_ms); + + a->ts = 0; + a->size = min_t(int, + xe_gt_eu_attention_bitmap_size(gt), + sizeof(a->att)); + + do { + unsigned int attn; + + xe_gt_eu_attention_bitmap(gt, a->att, a->size); + attn = xe_eu_attentions_count(a); + + now = ktime_get_raw(); + + if (a->ts == 0) + a->ts = now; + else if (attn && attn != prev) + a->ts = now; + + prev = attn; + + if (settle_time_ms) + udelay(5); + + /* + * XXX We are gathering data for production SIP to find + * the upper limit of settle time. For now, we wait full + * timeout value regardless. 
+ */ + } while (ktime_before(now, end)); +} + +unsigned int xe_eu_attentions_xor_count(const struct xe_eu_attentions *a, + const struct xe_eu_attentions *b) +{ + unsigned int count = 0; + unsigned int i; + + if (XE_WARN_ON(a->size != b->size)) + return -EINVAL; + + for (i = 0; i < a->size; i++) + if (a->att[i] ^ b->att[i]) + count++; + + return count; +} diff --git a/drivers/gpu/drm/xe/xe_gt_debug.h b/drivers/gpu/drm/xe/xe_gt_debug.h index 1edb667154f1..1d50b93235ae 100644 --- a/drivers/gpu/drm/xe/xe_gt_debug.h +++ b/drivers/gpu/drm/xe/xe_gt_debug.h @@ -11,6 +11,15 @@ #define XE_GT_ATTENTION_TIMEOUT_MS 100 +struct xe_eu_attentions { +#define XE_MAX_EUS 1024 +#define XE_MAX_THREADS 10 + + u8 att[DIV_ROUND_UP(XE_MAX_EUS * XE_MAX_THREADS, BITS_PER_BYTE)]; + unsigned int size; + ktime_t ts; +}; + static inline unsigned int xe_gt_debug_eu_att_rows(struct xe_gt *gt) { return (GRAPHICS_VERx100(gt_to_xe(gt)) >= 3000) ? 4u : 2u; @@ -28,4 +37,10 @@ int xe_gt_eu_attention_bitmap_size(struct xe_gt *gt); int xe_gt_eu_attention_bitmap(struct xe_gt *gt, u8 *bits, unsigned int bitmap_size); +void xe_gt_eu_attentions_read(struct xe_gt *gt, + struct xe_eu_attentions *a, + const unsigned int settle_time_ms); + +unsigned int xe_eu_attentions_xor_count(const struct xe_eu_attentions *a, + const struct xe_eu_attentions *b); #endif From patchwork Mon Dec 9 13:33:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 13899795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 96861E77181 for ; Mon, 9 Dec 2024 13:33:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1DDF710E768; Mon, 9 Dec 2024 13:33:48 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="T1ilK6jF"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2FEC810E767; Mon, 9 Dec 2024 13:33:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1733751225; x=1765287225; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zifbsZAxncZARfsonXt0a6rNDitogPi3iUOI+MM80Wo=; b=T1ilK6jFL8JVLPi4xFGt0xiZnyQgtqLyri9JGa6CSit7GI9GiJCciOL5 mgVildhMXirYSZvOQxCwrKDaoWQ3PfcVO5orKVFOxGeedrYEehet1UDnP Xf5xNpocL9ShphBfYY05jXLizwajLTuz93bgsA8IetKXvIZWIetR42GIv f9HwBsoHODowsBNJhiR2xdprnGxgBuiDyDW+KxJEaTy5C7VO22P/aOxCf zci66VcqclGQlGzYC7TDNQVcj2y7NGjXeNBm5mEOkNU7c95mArW7Ndb9s lAx3LdOlox0rghaHbd1XcOLqVwrWav74jhBtn1EHMwEnfKE+0IJ72slBT g==; X-CSE-ConnectionGUID: b6/j1lGgQMmJDJg9Kg1kiA== X-CSE-MsgGUID: IQpo4xtvTQ2svoT1Q1StUQ== X-IronPort-AV: E=McAfee;i="6700,10204,11281"; a="34192199" X-IronPort-AV: E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="34192199" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:45 -0800 X-CSE-ConnectionGUID: ht+xfNaxSKefATjraaln2g== X-CSE-MsgGUID: XyCGhGolTjCP/U9XVwNY7Q== X-ExtLoop1: 1 X-IronPort-AV: 
E=Sophos;i="6.12,219,1728975600"; d="scan'208";a="99531383" Received: from mkuoppal-desk.fi.intel.com ([10.237.72.193]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2024 05:33:44 -0800 From: Mika Kuoppala To: intel-xe@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com, Gwan-gyeong Mun , Mika Kuoppala Subject: [PATCH 24/26] drm/xe/eudebug: Introduce EU pagefault handling interface Date: Mon, 9 Dec 2024 15:33:15 +0200 Message-ID: <20241209133318.1806472-25-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Gwan-gyeong Mun The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid access will halt the corresponding EUs. To solve this problem, introduce EU pagefault handling functionality, which allows pagefaulted eu threads to be unhalted and informs the EU debugger about the eu attention state of EU threads during execution. If a pagefault occurs, send the DRM_XE_EUDEBUG_EVENT_PAGEFAULT event to the client connected to the xe_eudebug after handling the pagefault. The pagefault eudebug event follows the newly added drm_xe_eudebug_event_pagefault type. While a pagefault is being handled, sending of the DRM_XE_EUDEBUG_EVENT_EU_ATTENTION event to the client is suppressed. Pagefault event delivery follows the policy below. (1) If EU debugger discovery has completed and the pagefaulted eu threads turn on their attention bits, the pagefault handler delivers the pagefault event directly. (2) If a pagefault occurs during the eu debugger discovery process, the pagefault handler queues the pagefault event and sends it once discovery has completed and the pagefaulted eu threads turn on their attention bits. (3) If the pagefaulted eu thread fails to turn on the attention bit within the specified time, the attention scan worker sends the pagefault event once it detects that the attention bit has been turned on. If multiple running eu threads fault on the same invalid address, a single pagefault event (of DRM_XE_EUDEBUG_EVENT_PAGEFAULT type) is sent to the user debugger instead of one pagefault event per eu thread. If eu threads other than the ones that caused the earlier pagefault access new invalid addresses, a new pagefault event is sent. As the attention scan worker sends the eu attention event whenever the attention bit is turned on, the user debugger receives the attention event immediately after the pagefault event; the pagefault event always precedes the attention event. When the user debugger receives an attention event after a pagefault event, it can detect whether additional breakpoints or interrupts occurred besides the existing pagefault by comparing the eu threads where the pagefault occurred with the eu threads where the attention bit is newly enabled.
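To illustrate the resulting uapi, here is a minimal debugger-side sketch (hypothetical helper name, assuming only the xe_drm_eudebug.h additions below plus <stdint.h>) of splitting a received pagefault event's bitmask into the three attention snapshots that send_pagefault_event() packs back to back:

static void split_pf_bitmask(const struct drm_xe_eudebug_event_pagefault *ep,
			     const uint8_t **before, const uint8_t **after,
			     const uint8_t **resolved)
{
	uint32_t n = ep->bitmask_size / 3;	/* three equally sized snapshots */

	*before = ep->bitmask;			/* sampled when the fault arrived */
	*after = *before + n;			/* after the threads were force-halted */
	*resolved = *after + n;			/* after the pagefault was resolved */
}

Comparing these snapshots against the bitmask of the following EU_ATTENTION event lets the debugger separate threads that raised attention for another reason (e.g. a breakpoint) from those stopped by the original fault, as described above.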
Signed-off-by: Gwan-gyeong Mun Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/xe/xe_eudebug.c | 489 +++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_eudebug.h | 28 ++ drivers/gpu/drm/xe/xe_eudebug_types.h | 94 +++++ drivers/gpu/drm/xe/xe_gt_pagefault.c | 4 +- drivers/gpu/drm/xe/xe_gt_pagefault.h | 2 + include/uapi/drm/xe_drm_eudebug.h | 13 + 6 files changed, 626 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c index 09b455a96571..0fd0958c5790 100644 --- a/drivers/gpu/drm/xe/xe_eudebug.c +++ b/drivers/gpu/drm/xe/xe_eudebug.c @@ -31,6 +31,7 @@ #include "xe_gt.h" #include "xe_gt_debug.h" #include "xe_gt_mcr.h" +#include "xe_gt_pagefault.h" #include "xe_guc_exec_queue_types.h" #include "xe_hw_engine.h" #include "xe_lrc.h" @@ -236,10 +237,17 @@ static void xe_eudebug_free(struct kref *ref) { struct xe_eudebug *d = container_of(ref, typeof(*d), ref); struct xe_eudebug_event *event; + struct xe_eudebug_pagefault *pf, *pf_temp; while (kfifo_get(&d->events.fifo, &event)) kfree(event); + /* Since it's the last reference no race here */ + list_for_each_entry_safe(pf, pf_temp, &d->pagefaults, list) { + xe_exec_queue_put(pf->q); + kfree(pf); + } + xe_eudebug_destroy_resources(d); put_task_struct(d->target_task); @@ -911,7 +919,7 @@ static struct xe_eudebug_event * xe_eudebug_create_event(struct xe_eudebug *d, u16 type, u64 seqno, u16 flags, u32 len) { - const u16 max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA; + const u16 max_event = DRM_XE_EUDEBUG_EVENT_PAGEFAULT; const u16 known_flags = DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY | @@ -946,7 +954,7 @@ static long xe_eudebug_read_event(struct xe_eudebug *d, u64_to_user_ptr(arg); struct drm_xe_eudebug_event user_event; struct xe_eudebug_event *event; - const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA; + const unsigned int max_event = DRM_XE_EUDEBUG_EVENT_PAGEFAULT; long ret = 0; if (XE_IOCTL_DBG(xe, copy_from_user(&user_event, user_orig, sizeof(user_event)))) @@ -1067,6 +1075,7 @@ static int do_eu_control(struct xe_eudebug *d, struct xe_device *xe = d->xe; u8 *bits = NULL; unsigned int hw_attn_size, attn_size; + struct dma_fence *pf_fence; struct xe_exec_queue *q; struct xe_file *xef; struct xe_lrc *lrc; @@ -1132,6 +1141,23 @@ static int do_eu_control(struct xe_eudebug *d, ret = -EINVAL; mutex_lock(&d->eu_lock); + rcu_read_lock(); + pf_fence = dma_fence_get_rcu_safe(&d->pf_fence); + rcu_read_unlock(); + + while (pf_fence) { + mutex_unlock(&d->eu_lock); + ret = dma_fence_wait(pf_fence, true); + dma_fence_put(pf_fence); + + if (ret) + goto out_free; + + mutex_lock(&d->eu_lock); + rcu_read_lock(); + pf_fence = dma_fence_get_rcu_safe(&d->pf_fence); + rcu_read_unlock(); + } switch (arg->cmd) { case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL: @@ -1720,6 +1746,182 @@ static int xe_eudebug_handle_gt_attention(struct xe_gt *gt) return ret; } +static int send_pagefault_event(struct xe_eudebug *d, struct xe_eudebug_pagefault *pf) +{ + struct xe_eudebug_event_pagefault *ep; + struct xe_eudebug_event *event; + int h_c, h_queue, h_lrc; + u32 size = xe_gt_eu_attention_bitmap_size(pf->q->gt) * 3; + u32 sz = struct_size(ep, bitmask, size); + + XE_WARN_ON(pf->lrc_idx < 0 || pf->lrc_idx >= pf->q->width); + + XE_WARN_ON(!xe_exec_queue_is_debuggable(pf->q)); + + h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, pf->q->vm->xef); + if (h_c < 0) + return h_c; + + h_queue = find_handle(d->res, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, pf->q); + if (h_queue < 0) + return 
h_queue; + + h_lrc = find_handle(d->res, XE_EUDEBUG_RES_TYPE_LRC, pf->q->lrc[pf->lrc_idx]); + if (h_lrc < 0) + return h_lrc; + + event = xe_eudebug_create_event(d, DRM_XE_EUDEBUG_EVENT_PAGEFAULT, 0, + DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, sz); + + if (!event) + return -ENOSPC; + + ep = cast_event(ep, event); + write_member(struct xe_eudebug_event_pagefault, ep, client_handle, (u64)h_c); + write_member(struct xe_eudebug_event_pagefault, ep, exec_queue_handle, (u64)h_queue); + write_member(struct xe_eudebug_event_pagefault, ep, lrc_handle, (u64)h_lrc); + write_member(struct xe_eudebug_event_pagefault, ep, bitmask_size, size); + write_member(struct xe_eudebug_event_pagefault, ep, pagefault_address, pf->fault.addr); + + memcpy(ep->bitmask, pf->attentions.before.att, pf->attentions.before.size); + memcpy(ep->bitmask + pf->attentions.before.size, + pf->attentions.after.att, pf->attentions.after.size); + memcpy(ep->bitmask + pf->attentions.before.size + pf->attentions.after.size, + pf->attentions.resolved.att, pf->attentions.resolved.size); + + event->seqno = atomic_long_inc_return(&d->events.seqno); + + return xe_eudebug_queue_event(d, event); +} + +static int send_pagefault(struct xe_gt *gt, struct xe_eudebug_pagefault *pf, + bool from_attention_scan) +{ + struct xe_eudebug *d; + struct xe_exec_queue *q; + int ret, lrc_idx; + + if (list_empty_careful(&gt_to_xe(gt)->eudebug.list)) + return -ENOTCONN; + + q = runalone_active_queue_get(gt, &lrc_idx); + if (IS_ERR(q)) + return PTR_ERR(q); + + if (!xe_exec_queue_is_debuggable(q)) { + ret = -EPERM; + goto out_exec_queue_put; + } + + d = _xe_eudebug_get(q->vm->xef); + if (!d) { + ret = -ENOTCONN; + goto out_exec_queue_put; + } + + if (!completion_done(&d->discovery)) { + eu_dbg(d, "discovery not yet done\n"); + ret = -EBUSY; + goto out_eudebug_put; + } + + if (pf->deferred_resolved) { + xe_gt_eu_attentions_read(gt, &pf->attentions.resolved, + XE_GT_ATTENTION_TIMEOUT_MS); + + if (!xe_eu_attentions_xor_count(&pf->attentions.after, + &pf->attentions.resolved) && + !from_attention_scan) { + eu_dbg(d, "xe attentions not yet updated\n"); + ret = -EBUSY; + goto out_eudebug_put; + } + } + + ret = send_pagefault_event(d, pf); + if (ret) + xe_eudebug_disconnect(d, ret); + +out_eudebug_put: + xe_eudebug_put(d); +out_exec_queue_put: + xe_exec_queue_put(q); + + return ret; +} + +static int send_queued_pagefault(struct xe_eudebug *d, bool from_attention_scan) +{ + struct xe_eudebug_pagefault *pf, *pf_temp; + int ret = 0; + + mutex_lock(&d->pf_lock); + list_for_each_entry_safe(pf, pf_temp, &d->pagefaults, list) { + struct xe_gt *gt = pf->q->gt; + + ret = send_pagefault(gt, pf, from_attention_scan); + + /* if resolved attentions are not updated */ + if (ret == -EBUSY) + break; + + /* decrease the reference count of xe_exec_queue obtained from pagefault handler */ + xe_exec_queue_put(pf->q); + list_del(&pf->list); + kfree(pf); + + if (ret) + break; + } + mutex_unlock(&d->pf_lock); + + return ret; +} + +static int handle_gt_queued_pagefault(struct xe_gt *gt) +{ + struct xe_exec_queue *q; + struct xe_eudebug *d; + int ret, lrc_idx; + + ret = xe_gt_eu_threads_needing_attention(gt); + if (ret <= 0) + return ret; + + if (list_empty_careful(&gt_to_xe(gt)->eudebug.list)) + return -ENOTCONN; + + q = runalone_active_queue_get(gt, &lrc_idx); + if (IS_ERR(q)) + return PTR_ERR(q); + + if (!xe_exec_queue_is_debuggable(q)) { + ret = -EPERM; + goto out_exec_queue_put; + } + + d = _xe_eudebug_get(q->vm->xef); + if (!d) { + ret = -ENOTCONN; + goto out_exec_queue_put; + } + + if
(!completion_done(&d->discovery)) { + eu_dbg(d, "discovery not yet done\n"); + ret = -EBUSY; + goto out_eudebug_put; + } + + ret = send_queued_pagefault(d, true); + +out_eudebug_put: + xe_eudebug_put(d); +out_exec_queue_put: + xe_exec_queue_put(q); + + return ret; +} + #define XE_EUDEBUG_ATTENTION_INTERVAL 100 static void attention_scan_fn(struct work_struct *work) { @@ -1741,6 +1943,8 @@ static void attention_scan_fn(struct work_struct *work) if (gt->info.type != XE_GT_TYPE_MAIN) continue; + handle_gt_queued_pagefault(gt); + ret = xe_eudebug_handle_gt_attention(gt); if (ret) { // TODO: error capture @@ -2048,6 +2252,8 @@ xe_eudebug_connect(struct xe_device *xe, kref_init(&d->ref); spin_lock_init(&d->connection.lock); mutex_init(&d->eu_lock); + mutex_init(&d->pf_lock); + INIT_LIST_HEAD(&d->pagefaults); init_waitqueue_head(&d->events.write_done); init_waitqueue_head(&d->events.read_done); init_completion(&d->discovery); @@ -3490,6 +3696,8 @@ static void discovery_work_fn(struct work_struct *work) up_write(&xe->eudebug.discovery_lock); + send_queued_pagefault(d, false); + xe_eudebug_put(d); } @@ -3961,6 +4169,283 @@ xe_eudebug_vm_open_ioctl(struct xe_eudebug *d, unsigned long arg) return ret; } +static int queue_pagefault(struct xe_gt *gt, struct xe_eudebug_pagefault *pf) +{ + struct xe_eudebug *d; + + if (list_empty_careful(&gt_to_xe(gt)->eudebug.list)) + return -ENOTCONN; + + d = _xe_eudebug_get(pf->q->vm->xef); + if (IS_ERR_OR_NULL(d)) + return -EINVAL; + + mutex_lock(&d->pf_lock); + list_add_tail(&pf->list, &d->pagefaults); + mutex_unlock(&d->pf_lock); + + xe_eudebug_put(d); + + return 0; +} + +static int handle_pagefault(struct xe_gt *gt, struct xe_eudebug_pagefault *pf) +{ + int ret; + + ret = send_pagefault(gt, pf, false); + + /* + * if debugger discovery is not completed or resolved attentions are not + * updated, then queue pagefault + */ + if (ret == -EBUSY) { + ret = queue_pagefault(gt, pf); + if (!ret) + goto out; + } + + xe_exec_queue_put(pf->q); + kfree(pf); + +out: + return ret; +} + +static const char * +pagefault_get_driver_name(struct dma_fence *dma_fence) +{ + return "xe"; +} + +static const char * +pagefault_fence_get_timeline_name(struct dma_fence *dma_fence) +{ + return "eudebug_pagefault_fence"; +} + +static const struct dma_fence_ops pagefault_fence_ops = { + .get_driver_name = pagefault_get_driver_name, + .get_timeline_name = pagefault_fence_get_timeline_name, +}; + +struct pagefault_fence { + struct dma_fence base; + spinlock_t lock; +}; + +static struct pagefault_fence *pagefault_fence_create(void) +{ + struct pagefault_fence *fence; + + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (fence == NULL) + return NULL; + + spin_lock_init(&fence->lock); + dma_fence_init(&fence->base, &pagefault_fence_ops, &fence->lock, + dma_fence_context_alloc(1), 1); + + return fence; } + +struct xe_eudebug_pagefault * +xe_eudebug_pagefault_create(struct xe_gt *gt, struct xe_vm *vm, u64 page_addr, + u8 fault_type, u8 fault_level, u8 access_type) +{ + struct pagefault_fence *pf_fence; + struct xe_eudebug_pagefault *pf; + struct xe_vma *vma = NULL; + struct xe_exec_queue *q; + struct dma_fence *fence; + struct xe_eudebug *d; + unsigned int fw_ref; + int lrc_idx; + u32 td_ctl; + + down_read(&vm->lock); + vma = xe_gt_pagefault_lookup_vma(vm, page_addr); + up_read(&vm->lock); + + if (vma) + return NULL; + + d = _xe_eudebug_get(vm->xef); + if (!d) + return NULL; + + q = runalone_active_queue_get(gt, &lrc_idx); + if (IS_ERR(q)) + goto err_put_eudebug; + + if
+		goto err_put_exec_queue;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), q->hwe->domain);
+	if (!fw_ref)
+		goto err_put_exec_queue;
+
+	/*
+	 * If there is no debug functionality (TD_CTL_GLOBAL_DEBUG_ENABLE,
+	 * etc.), don't proceed with the pagefault routine for the EU debugger.
+	 */
+	td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+	if (!td_ctl)
+		goto err_put_fw;
+
+	pf = kzalloc(sizeof(*pf), GFP_KERNEL);
+	if (!pf)
+		goto err_put_fw;
+
+	attention_scan_cancel(gt_to_xe(gt));
+
+	mutex_lock(&d->eu_lock);
+	rcu_read_lock();
+	fence = dma_fence_get_rcu_safe(&d->pf_fence);
+	rcu_read_unlock();
+
+	if (fence) {
+		/*
+		 * TODO: If the new incoming pagefaulted address is different
+		 * from the pagefaulted address it is currently handling on the
+		 * same ASID, it needs a routine to wait here and then do the
+		 * following pagefault.
+		 */
+		dma_fence_put(fence);
+		goto err_unlock_eu_lock;
+	}
+
+	pf_fence = pagefault_fence_create();
+	if (!pf_fence)
+		goto err_unlock_eu_lock;
+
+	d->pf_fence = &pf_fence->base;
+	mutex_unlock(&d->eu_lock);
+
+	INIT_LIST_HEAD(&pf->list);
+
+	xe_gt_eu_attentions_read(gt, &pf->attentions.before, 0);
+
+	/* Halt on next thread dispatch */
+	while (!(td_ctl & TD_CTL_FORCE_EXTERNAL_HALT)) {
+		xe_gt_mcr_multicast_write(gt, TD_CTL,
+					  td_ctl | TD_CTL_FORCE_EXTERNAL_HALT);
+		/*
+		 * The sleep is needed because some interrupts are ignored
+		 * by the HW, hence we allow the HW some time to acknowledge
+		 * that.
+		 */
+		udelay(200);
+		td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+	}
+
+	/* Halt regardless of thread dependencies */
+	while (!(td_ctl & TD_CTL_FORCE_EXCEPTION)) {
+		xe_gt_mcr_multicast_write(gt, TD_CTL,
+					  td_ctl | TD_CTL_FORCE_EXCEPTION);
+		udelay(200);
+		td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+	}
+
+	xe_gt_eu_attentions_read(gt, &pf->attentions.after,
+				 XE_GT_ATTENTION_TIMEOUT_MS);
+
+	/*
+	 * xe_exec_queue_put() will be called from xe_eudebug_pagefault_destroy()
+	 * or handle_pagefault()
+	 */
+	pf->q = q;
+	pf->lrc_idx = lrc_idx;
+	pf->fault.addr = page_addr;
+	pf->fault.type = fault_type;
+	pf->fault.level = fault_level;
+	pf->fault.access = access_type;
+
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+	xe_eudebug_put(d);
+
+	return pf;
+
+err_unlock_eu_lock:
+	mutex_unlock(&d->eu_lock);
+	attention_scan_flush(gt_to_xe(gt));
+	kfree(pf);
+err_put_fw:
+	xe_force_wake_put(gt_to_fw(gt), fw_ref);
+err_put_exec_queue:
+	xe_exec_queue_put(q);
+err_put_eudebug:
+	xe_eudebug_put(d);
+
+	return NULL;
+}
+
+void
+xe_eudebug_pagefault_process(struct xe_gt *gt, struct xe_eudebug_pagefault *pf)
+{
+	xe_gt_eu_attentions_read(gt, &pf->attentions.resolved,
+				 XE_GT_ATTENTION_TIMEOUT_MS);
+
+	if (!xe_eu_attentions_xor_count(&pf->attentions.after,
+					&pf->attentions.resolved))
+		pf->deferred_resolved = true;
+}
+
+void
+xe_eudebug_pagefault_destroy(struct xe_gt *gt, struct xe_vm *vm,
+			     struct xe_eudebug_pagefault *pf, bool send_event)
+{
+	struct xe_eudebug *d;
+	unsigned int fw_ref;
+	u32 td_ctl;
+
+	fw_ref = xe_force_wake_get(gt_to_fw(gt), pf->q->hwe->domain);
+	if (!fw_ref) {
+		struct xe_device *xe = gt_to_xe(gt);
+
+		drm_warn(&xe->drm, "Forcewake fail: cannot recover TD_CTL");
+	} else {
+		td_ctl = xe_gt_mcr_unicast_read_any(gt, TD_CTL);
+		xe_gt_mcr_multicast_write(gt, TD_CTL, td_ctl &
+					  ~(TD_CTL_FORCE_EXTERNAL_HALT | TD_CTL_FORCE_EXCEPTION));
+		xe_force_wake_put(gt_to_fw(gt), fw_ref);
+	}
+
+	if (send_event)
+		handle_pagefault(gt, pf);
+
+	d = _xe_eudebug_get(vm->xef);
+	if (d) {
+		struct dma_fence *fence;
+
+		mutex_lock(&d->eu_lock);
+		rcu_read_lock();
+		fence = dma_fence_get_rcu_safe(&d->pf_fence);
+		rcu_read_unlock();
+
+		if (fence) {
+			if (send_event)
+				dma_fence_signal(fence);
+
+			dma_fence_put(fence); /* deref for dma_fence_get_rcu_safe() */
+			dma_fence_put(fence); /* deref for dma_fence_init() */
+		}
+
+		d->pf_fence = NULL;
+		mutex_unlock(&d->eu_lock);
+
+		xe_eudebug_put(d);
+	}
+
+	if (!send_event) {
+		xe_exec_queue_put(pf->q);
+		kfree(pf);
+	}
+
+	attention_scan_flush(gt_to_xe(gt));
+}
+
 #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
 #include "tests/xe_eudebug.c"
 #endif
diff --git a/drivers/gpu/drm/xe/xe_eudebug.h b/drivers/gpu/drm/xe/xe_eudebug.h
index a08abf796cc1..cf1df4e2c6a6 100644
--- a/drivers/gpu/drm/xe/xe_eudebug.h
+++ b/drivers/gpu/drm/xe/xe_eudebug.h
@@ -11,6 +11,7 @@ struct drm_device;
 struct drm_file;
 struct xe_device;
 struct xe_file;
+struct xe_gt;
 struct xe_vm;
 struct xe_vma;
 struct xe_exec_queue;
@@ -18,6 +19,7 @@ struct xe_hw_engine;
 struct xe_user_fence;
 struct xe_debug_metadata;
 struct drm_gpuva_ops;
+struct xe_eudebug_pagefault;
 
 #if IS_ENABLED(CONFIG_DRM_XE_EUDEBUG)
 
@@ -53,6 +55,13 @@ void xe_eudebug_put(struct xe_eudebug *d);
 void xe_eudebug_debug_metadata_create(struct xe_file *xef, struct xe_debug_metadata *m);
 void xe_eudebug_debug_metadata_destroy(struct xe_file *xef, struct xe_debug_metadata *m);
 
+struct xe_eudebug_pagefault *xe_eudebug_pagefault_create(struct xe_gt *gt, struct xe_vm *vm,
+							 u64 page_addr, u8 fault_type,
+							 u8 fault_level, u8 access_type);
+void xe_eudebug_pagefault_process(struct xe_gt *gt, struct xe_eudebug_pagefault *pf);
+void xe_eudebug_pagefault_destroy(struct xe_gt *gt, struct xe_vm *vm,
+				  struct xe_eudebug_pagefault *pf, bool send_event);
+
 #else
 
 static inline int xe_eudebug_connect_ioctl(struct drm_device *dev,
@@ -95,6 +104,25 @@ static inline void xe_eudebug_debug_metadata_destroy(struct xe_file *xef,
 {
 }
 
+static inline struct xe_eudebug_pagefault *
+xe_eudebug_pagefault_create(struct xe_gt *gt, struct xe_vm *vm, u64 page_addr,
+			    u8 fault_type, u8 fault_level, u8 access_type)
+{
+	return NULL;
+}
+
+static inline void
+xe_eudebug_pagefault_process(struct xe_gt *gt, struct xe_eudebug_pagefault *pf)
+{
+}
+
+static inline void xe_eudebug_pagefault_destroy(struct xe_gt *gt,
+						struct xe_vm *vm,
+						struct xe_eudebug_pagefault *pf,
+						bool send_event)
+{
+}
+
 #endif /* CONFIG_DRM_XE_EUDEBUG */
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_eudebug_types.h b/drivers/gpu/drm/xe/xe_eudebug_types.h
index a69051b04698..00853dacd477 100644
--- a/drivers/gpu/drm/xe/xe_eudebug_types.h
+++ b/drivers/gpu/drm/xe/xe_eudebug_types.h
@@ -16,6 +16,8 @@
 
 #include
 
+#include "xe_gt_debug.h"
+
 struct xe_device;
 struct task_struct;
 struct xe_eudebug;
@@ -161,6 +163,16 @@ struct xe_eudebug {
 
 	/** @ops operations for eu_control */
 	struct xe_eudebug_eu_control_ops *ops;
+
+	/** @pf_lock: guards access to the pagefaults list */
+	struct mutex pf_lock;
+	/** @pagefaults: xe_eudebug_pagefault list for pagefault event queuing */
+	struct list_head pagefaults;
+	/**
+	 * @pf_fence: fence on operations of eus (eu thread control and attention)
+	 * when page faults are being handled, protected by @eu_lock.
+	 */
+	struct dma_fence __rcu *pf_fence;
 };
 
 /**
@@ -351,4 +363,86 @@ struct xe_eudebug_event_vm_bind_op_metadata {
 	u64 metadata_cookie;
 };
 
+/**
+ * struct xe_eudebug_event_pagefault - Internal event for EU Pagefault
+ */
+struct xe_eudebug_event_pagefault {
+	/** @base: base event */
+	struct xe_eudebug_event base;
+
+	/** @client_handle: client for the Pagefault */
+	u64 client_handle;
+
+	/** @exec_queue_handle: handle of exec_queue which raised Pagefault */
+	u64 exec_queue_handle;
+
+	/** @lrc_handle: lrc handle of the workload which raised Pagefault */
+	u64 lrc_handle;
+
+	/** @flags: eu Pagefault event flags, currently MBZ */
+	u32 flags;
+
+	/**
+	 * @bitmask_size: sum of the sizes of the before/after/resolved att bits.
+	 * It is three times the size of xe_eudebug_event_eu_attention.bitmask_size.
+	 */
+	u32 bitmask_size;
+
+	/** @pagefault_address: the ppgtt address where the Pagefault occurred */
+	u64 pagefault_address;
+
+	/**
+	 * @bitmask: Bitmask of thread attentions starting from natural,
+	 * hardware order of DSS=0, eu=0, 8 attention bits per eu.
+	 * The order of the bitmask array is before, after, resolved.
+	 */
+	u8 bitmask[];
+};
+
+/**
+ * struct xe_eudebug_pagefault - eudebug structure for queuing pagefault
+ */
+struct xe_eudebug_pagefault {
+	/** @list: link into the xe_eudebug.pagefaults */
+	struct list_head list;
+	/** @q: exec_queue which raised pagefault */
+	struct xe_exec_queue *q;
+	/** @lrc_idx: lrc index of the workload which raised pagefault */
+	int lrc_idx;
+
+	/** @fault: raw partial pagefault data passed from GuC */
+	struct {
+		/** @addr: ppgtt address where the pagefault occurred */
+		u64 addr;
+		int type;
+		int level;
+		int access;
+	} fault;
+
+	struct {
+		/** @before: state of attention bits before page fault WA processing */
+		struct xe_eu_attentions before;
+		/**
+		 * @after: state of attention bits during page fault WA processing.
+		 * It includes eu threads where attention bits are turned on for
+		 * reasons other than the page fault WA (breakpoint, interrupt, etc.).
+		 */
+		struct xe_eu_attentions after;
+		/**
+		 * @resolved: state of the attention bits after the page fault WA.
+		 * It includes the eu thread that caused the page fault.
+		 * To determine the eu thread that caused the page fault,
+		 * XOR attentions.after and attentions.resolved.
+		 */
+		struct xe_eu_attentions resolved;
+	} attentions;
+
+	/**
+	 * @deferred_resolved: update attentions.resolved again when the attention
+	 * bits are ready, in case the eu thread fails to turn on attention bits
+	 * within a certain time after page fault WA processing.
+	 */
+	bool deferred_resolved;
+};
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 2606cd396df5..5558342b8e07 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -79,7 +79,7 @@ static bool vma_matches(struct xe_vma *vma, u64 page_addr)
 	return true;
 }
 
-static struct xe_vma *lookup_vma(struct xe_vm *vm, u64 page_addr)
+struct xe_vma *xe_gt_pagefault_lookup_vma(struct xe_vm *vm, u64 page_addr)
 {
 	struct xe_vma *vma = NULL;
 
@@ -225,7 +225,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 		goto unlock_vm;
 	}
 
-	vma = lookup_vma(vm, pf->page_addr);
+	vma = xe_gt_pagefault_lookup_vma(vm, pf->page_addr);
 	if (!vma) {
 		err = -EINVAL;
 		goto unlock_vm;
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.h b/drivers/gpu/drm/xe/xe_gt_pagefault.h
index 839c065a5e4c..3c0628b79f33 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.h
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.h
@@ -10,10 +10,12 @@
 
 struct xe_gt;
 struct xe_guc;
+struct xe_vm;
 
 int xe_gt_pagefault_init(struct xe_gt *gt);
 void xe_gt_pagefault_reset(struct xe_gt *gt);
 int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len);
 int xe_guc_access_counter_notify_handler(struct xe_guc *guc, u32 *msg, u32 len);
+struct xe_vma *xe_gt_pagefault_lookup_vma(struct xe_vm *vm, u64 page_addr);
 
 #endif	/* _XE_GT_PAGEFAULT_ */
diff --git a/include/uapi/drm/xe_drm_eudebug.h b/include/uapi/drm/xe_drm_eudebug.h
index 3c4d1b511acd..e43576c7bc5e 100644
--- a/include/uapi/drm/xe_drm_eudebug.h
+++ b/include/uapi/drm/xe_drm_eudebug.h
@@ -38,6 +38,7 @@ struct drm_xe_eudebug_event {
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE	9
 #define DRM_XE_EUDEBUG_EVENT_METADATA		10
 #define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA 11
+#define DRM_XE_EUDEBUG_EVENT_PAGEFAULT		12
 
 	__u16 flags;
 #define DRM_XE_EUDEBUG_EVENT_CREATE		(1 << 0)
@@ -236,6 +237,18 @@ struct drm_xe_eudebug_event_vm_bind_op_metadata {
 	__u64 metadata_cookie;
 };
 
+struct drm_xe_eudebug_event_pagefault {
+	struct drm_xe_eudebug_event base;
+
+	__u64 client_handle;
+	__u64 exec_queue_handle;
+	__u64 lrc_handle;
+	__u32 flags;
+	__u32 bitmask_size;
+	__u64 pagefault_address;
+	__u8 bitmask[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
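On the receiving side, a debugger client consumes this event through the
eudebug event read interface like any other event. A minimal sketch of how
such a client might split the bitmask into its three snapshots follows; the
layout comes from the uapi struct and kernel-doc above, while the helper
name and the surrounding buffer handling are illustrative assumptions, not
part of the series.

/* Illustrative userspace sketch: splitting the pagefault event bitmask
 * into its three attention snapshots. Per the kernel-doc above, the
 * bitmask carries the before/after/resolved snapshots back to back, so
 * each snapshot is bitmask_size / 3 bytes. How the event is read from
 * the debugger fd is elided.
 */
#include <stdint.h>

struct pf_snapshots {
	const uint8_t *before;   /* attentions before the pagefault WA */
	const uint8_t *after;    /* attentions during the WA */
	const uint8_t *resolved; /* attentions after the WA */
	uint32_t snapshot_size;  /* size of each snapshot in bytes */
};

static int split_pagefault_bitmask(const struct drm_xe_eudebug_event_pagefault *ev,
				   struct pf_snapshots *out)
{
	uint32_t third = ev->bitmask_size / 3;

	if (!third || ev->bitmask_size % 3)
		return -1; /* unexpected layout */

	out->before = ev->bitmask;
	out->after = ev->bitmask + third;
	out->resolved = ev->bitmask + 2 * third;
	out->snapshot_size = third;

	return 0;
}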
From patchwork Mon Dec 9 13:33:16 2024
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 13899798
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
 Gwan-gyeong Mun, Oak Zeng, Niranjana Vishwanathapura, Stuart Summers,
 Matthew Brost, Bruce Chang, Mika Kuoppala
Subject: [PATCH 25/26] drm/xe/vm: Support for adding null page VMA to VM on
 request
Date: Mon, 9 Dec 2024 15:33:16 +0200
Message-ID: <20241209133318.1806472-26-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

From: Gwan-gyeong Mun

The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid
access halts the corresponding EUs. So, in order to activate the debugger,
the KMD needs to install a temporary page to unhalt the EUs.

This helper is planned to be used for pagefault handling while the EU
debugger is running. The idea is to install a null page VMA if the
pagefault comes from an invalid access. After the null page PTE is
installed, the debugged program can continue to run and be inspected by
the user debugger without causing a fatal failure or a reset and stop.

Based on Bruce's implementation [1].
[1] https://lore.kernel.org/intel-xe/20230829231648.4438-1-yu.bruce.chang@intel.com/

Cc: Oak Zeng
Cc: Niranjana Vishwanathapura
Cc: Stuart Summers
Cc: Matthew Brost
Co-developed-by: Bruce Chang
Signed-off-by: Bruce Chang
Signed-off-by: Gwan-gyeong Mun
Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/xe/xe_vm.c | 23 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h |  2 ++
 2 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 474521d0fea9..ff45e5264aed 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3552,3 +3552,26 @@ int xe_vm_userptr_access(struct xe_userptr_vma *uvma, u64 offset,
 	up_read(&vm->userptr.notifier_lock);
 	return ret;
 }
+
+struct xe_vma *xe_vm_create_null_vma(struct xe_vm *vm, u64 addr)
+{
+	struct xe_vma *vma;
+	u32 page_size;
+	int err;
+
+	if (xe_vm_is_closed_or_banned(vm))
+		return ERR_PTR(-ENOENT);
+
+	page_size = vm->flags & XE_VM_FLAG_64K ? SZ_64K : SZ_4K;
+	vma = xe_vma_create(vm, NULL, 0, addr, addr + page_size - 1, 0, VMA_CREATE_FLAG_IS_NULL);
+	if (IS_ERR_OR_NULL(vma))
+		return vma;
+
+	err = xe_vm_insert_vma(vm, vma);
+	if (err) {
+		xe_vma_destroy_late(vma);
+		return ERR_PTR(err);
+	}
+
+	return vma;
+}
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 372ad40ad67f..2ae3749cfd82 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -283,3 +283,5 @@ void xe_vm_snapshot_free(struct xe_vm_snapshot *snap);
 
 int xe_vm_userptr_access(struct xe_userptr_vma *uvma, u64 offset,
 			 void *buf, u64 len, bool write);
+
+struct xe_vma *xe_vm_create_null_vma(struct xe_vm *vm, u64 addr);
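The next patch wires this helper into the GPU fault handler. As a preview,
a rough, hypothetical sketch of the intended call pattern is below; the
function name and the debugger_attached flag are stand-ins for the eudebug
pagefault bookkeeping the real caller uses, not part of this patch.

/* Hypothetical sketch (not part of this patch): expected call pattern
 * for xe_vm_create_null_vma() from a fault handler. If no VMA covers
 * the faulting address and an EU debugger is attached, back the address
 * with a null page VMA so the fault can be replied to as successful.
 */
static int fault_to_null_vma_sketch(struct xe_vm *vm, u64 page_addr,
				    bool debugger_attached)
{
	struct xe_vma *vma;

	down_write(&vm->lock);
	vma = xe_gt_pagefault_lookup_vma(vm, page_addr);
	if (!vma && debugger_attached)
		vma = xe_vm_create_null_vma(vm, page_addr);
	up_write(&vm->lock);

	return IS_ERR_OR_NULL(vma) ? -EINVAL : 0;
}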
From patchwork Mon Dec 9 13:33:17 2024
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 13899797
From: Mika Kuoppala
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
 Gwan-gyeong Mun, Mika Kuoppala
Subject: [PATCH 26/26] drm/xe/eudebug: Enable EU pagefault handling
Date: Mon, 9 Dec 2024 15:33:17 +0200
Message-ID: <20241209133318.1806472-27-mika.kuoppala@linux.intel.com>
In-Reply-To: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>

From: Gwan-gyeong Mun

The XE2 (and PVC) HW has a limitation that a pagefault due to an invalid
access halts the corresponding EUs. To solve this problem, enable the EU
pagefault handling functionality, which unhalts pagefaulted EU threads
and lets the EU debugger be informed about the attention state of EU
threads during execution.

If a pagefault occurs, send the DRM_XE_EUDEBUG_EVENT_PAGEFAULT event to
the client connected to the xe_eudebug after handling the pagefault.

The pagefault handling is a mechanism that allows a stalled EU thread to
enter SIP mode by installing a temporary null page in the page table
entry where the pagefault happened. A brief description of the pagefault
handling flow between the KMD and the EU thread follows; a condensed code
sketch of the KMD side is included after the list.

(1) An EU thread accesses an unallocated address.
(2) A pagefault happens and the EU thread stalls.
(3) The XE KMD forces an EU thread exception to allow the running EU
    threads to enter SIP mode (the KMD sets the ForceException /
    ForceExternalHalt bits of the TD_CTL register). Non-stalled
    (non-pagefaulted) EU threads enter SIP mode.
(4) The XE KMD installs a temporary null page in the pagetable entry of
    the address where the pagefault happened.
(5) The XE KMD replies to GuC with a pagefault-successful message.
(6) The stalled EU thread resumes, as the pagefault condition has been
    resolved.
(7) The resumed EU thread enters SIP mode due to the forced exception set
    in (3).

As this feature is designed to only work when eudebug is enabled, it
should have no impact on the regular recoverable pagefault code path.
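The condensed sketch below is illustrative only: it strings together the
helpers introduced in this series in the order of the flow above, with
locking, error handling, and the queued/deferred paths elided; the wrapper
function name is invented for the sketch.

/* Illustrative sketch of the KMD side of the flow above; not the actual
 * handler. Locking, error handling, and deferred resolution are elided.
 */
static void eu_pagefault_flow_sketch(struct xe_gt *gt, struct xe_vm *vm,
				     struct pagefault *pf)
{
	struct xe_eudebug_pagefault *eudebug_pf;
	struct xe_vma *vma;

	/* step (3): force halt/exception via TD_CTL, snapshot attentions */
	eudebug_pf = xe_eudebug_pagefault_create(gt, vm, pf->page_addr,
						 pf->fault_type,
						 pf->fault_level,
						 pf->access_type);
	if (!eudebug_pf)
		return;

	/* step (4): back the faulting address with a null page VMA */
	vma = xe_vm_create_null_vma(vm, pf->page_addr);
	if (IS_ERR_OR_NULL(vma))
		return;

	/* step (5) happens here in the real handler: reply to GuC so the
	 * stalled thread resumes into SIP mode (steps 6-7)
	 */

	/* read the resolved attentions and emit the pagefault event */
	xe_eudebug_pagefault_process(gt, eudebug_pf);
	xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, true);
}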
Signed-off-by: Gwan-gyeong Mun
Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 83 +++++++++++++++++++++++++---
 1 file changed, 75 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 5558342b8e07..4e2883e19018 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -13,6 +13,7 @@
 
 #include "abi/guc_actions_abi.h"
 #include "xe_bo.h"
+#include "xe_eudebug.h"
 #include "xe_gt.h"
 #include "xe_gt_tlb_invalidation.h"
 #include "xe_guc.h"
@@ -199,12 +200,16 @@ static struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid)
 	return vm;
 }
 
-static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
+static int handle_pagefault_start(struct xe_gt *gt, struct pagefault *pf,
+				  struct xe_vm **pf_vm,
+				  struct xe_eudebug_pagefault **eudebug_pf_out)
 {
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_eudebug_pagefault *eudebug_pf;
 	struct xe_tile *tile = gt_to_tile(gt);
-	struct xe_vm *vm;
+	struct xe_device *xe = gt_to_xe(gt);
+	bool destroy_eudebug_pf = false;
 	struct xe_vma *vma = NULL;
+	struct xe_vm *vm;
 	int err;
 
 	/* SW isn't expected to handle TRTT faults */
@@ -215,6 +220,10 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 	if (IS_ERR(vm))
 		return PTR_ERR(vm);
 
+	eudebug_pf = xe_eudebug_pagefault_create(gt, vm, pf->page_addr,
+						 pf->fault_type, pf->fault_level,
+						 pf->access_type);
+
 	/*
 	 * TODO: Change to read lock? Using write lock for simplicity.
 	 */
@@ -227,8 +236,27 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 
 	vma = xe_gt_pagefault_lookup_vma(vm, pf->page_addr);
 	if (!vma) {
-		err = -EINVAL;
-		goto unlock_vm;
+		if (eudebug_pf)
+			vma = xe_vm_create_null_vma(vm, pf->page_addr);
+
+		if (IS_ERR_OR_NULL(vma)) {
+			err = -EINVAL;
+			if (eudebug_pf)
+				destroy_eudebug_pf = true;
+
+			goto unlock_vm;
+		}
+	} else {
+		/*
+		 * When the eudebug_pagefault instance was created, no vma
+		 * contained the ppgtt address where the pagefault occurred,
+		 * but after reacquiring vm->lock there is one: while this
+		 * context was not holding vm->lock, another context allocated
+		 * a vma covering the faulting address.
+		 */
+		if (eudebug_pf)
+			destroy_eudebug_pf = true;
 	}
 
 	err = handle_vma_pagefault(tile, pf, vma);
@@ -237,11 +265,43 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 	if (!err)
 		vm->usm.last_fault_vma = vma;
 	up_write(&vm->lock);
-	xe_vm_put(vm);
+
+	if (destroy_eudebug_pf) {
+		xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, false);
+		*eudebug_pf_out = NULL;
+	} else {
+		*eudebug_pf_out = eudebug_pf;
+	}
+
+	/* keep the VM instance alive for the lifetime of the eudebug pagefault instance */
+	if (!*eudebug_pf_out) {
+		xe_vm_put(vm);
+		*pf_vm = NULL;
+	} else {
+		*pf_vm = vm;
+	}
 
 	return err;
 }
 
+static void handle_pagefault_end(struct xe_gt *gt, struct xe_vm *vm,
+				 struct xe_eudebug_pagefault *eudebug_pf)
+{
+	/* nothing to do if there is no eudebug_pagefault */
+	if (!eudebug_pf)
+		return;
+
+	xe_eudebug_pagefault_process(gt, eudebug_pf);
+
+	/*
+	 * TODO: Remove the VMA added to handle the eudebug pagefault
+	 */
+
+	xe_eudebug_pagefault_destroy(gt, vm, eudebug_pf, true);
+
+	xe_vm_put(vm);
+}
+
 static int send_pagefault_reply(struct xe_guc *guc,
 				struct xe_guc_pagefault_reply *reply)
 {
@@ -367,7 +427,10 @@ static void pf_queue_work_func(struct work_struct *w)
 	threshold = jiffies + msecs_to_jiffies(USM_QUEUE_MAX_RUNTIME_MS);
 
 	while (get_pagefault(pf_queue, &pf)) {
-		ret = handle_pagefault(gt, &pf);
+		struct xe_eudebug_pagefault *eudebug_pf = NULL;
+		struct xe_vm *vm = NULL;
+
+		ret = handle_pagefault_start(gt, &pf, &vm, &eudebug_pf);
 		if (unlikely(ret)) {
 			print_pagefault(xe, &pf);
 			pf.fault_unsuccessful = 1;
@@ -385,7 +448,11 @@ static void pf_queue_work_func(struct work_struct *w)
 			FIELD_PREP(PFR_ENG_CLASS, pf.engine_class) |
 			FIELD_PREP(PFR_PDATA, pf.pdata);
 
-		send_pagefault_reply(&gt->uc.guc, &reply);
+		ret = send_pagefault_reply(&gt->uc.guc, &reply);
+		if (unlikely(ret))
+			drm_dbg(&xe->drm, "GuC Pagefault reply failed: %d\n", ret);
+
+		handle_pagefault_end(gt, vm, eudebug_pf);
 
 		if (time_after(jiffies, threshold) &&
 		    pf_queue->tail != pf_queue->head) {