From patchwork Thu Mar 13 18:34:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Cavitt X-Patchwork-Id: 14015766 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4BF2CC35FF6 for ; Thu, 13 Mar 2025 18:34:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B9F7F10E90B; Thu, 13 Mar 2025 18:34:18 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="Xb2IipNQ"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by gabe.freedesktop.org (Postfix) with ESMTPS id 889F410E90B; Thu, 13 Mar 2025 18:34:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741890858; x=1773426858; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hl8+lM11PRLmxBjMwYzmp398zmvgVeUwbwpOM7K9t/8=; b=Xb2IipNQIrag+pWyVvQ8UnucM8QN5iyDfCwIK0EX2FH3+bu56NVhBM9+ UuRyzq6/esQxcTvkKPId47rUn4cKQaNkHZOiaqcKrxWKIwbn28YcnZYfn DHk+30LWlyzbuVWaTuERHCtw0T8t6/axCqh9OT10Ro+K+pHNjcj3DBAXP WU6JPRyVM9azzg2Fco5wkkaFr4Aw0//I5tnfT8s74PxfjVhV59ZmYI1yu QOihGiTOHRjM5qkEXnUhnGkMbVFSut7lmc20oE54PjCF9yiS1jVmKfOgy IDhjiiybNn98UdgHFPAtjcsvagO+YOwbF0A7wOhoWF4i/7Rp5lAzDiL0D Q==; X-CSE-ConnectionGUID: A1/nnJVIQAm23V6myUDq2Q== X-CSE-MsgGUID: MBHg7NNgReiuZSCtyRo7/Q== X-IronPort-AV: E=McAfee;i="6700,10204,11372"; a="65485457" X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="65485457" Received: from orviesa002.jf.intel.com ([10.64.159.142]) 
by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:17 -0700 X-CSE-ConnectionGUID: 8U47uYS4Td+rN0ax8s9C+A== X-CSE-MsgGUID: LXrRZbVrSsm6oU2XC0xykQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="151900769" Received: from dut4440lnl.fm.intel.com ([10.105.10.40]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:17 -0700 From: Jonathan Cavitt To: intel-xe@lists.freedesktop.org Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com, jonathan.cavitt@intel.com, joonas.lahtinen@linux.intel.com, matthew.brost@intel.com, jianxun.zhang@intel.com, shuicheng.lin@intel.com, dri-devel@lists.freedesktop.org, Michal.Wajdeczko@intel.com, michal.mzorek@intel.com Subject: [PATCH v8 1/6] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs Date: Thu, 13 Mar 2025 18:34:03 +0000 Message-ID: <20250313183415.133863-2-jonathan.cavitt@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250313183415.133863-1-jonathan.cavitt@intel.com> References: <20250313183415.133863-1-jonathan.cavitt@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" The page fault handler should reject write/atomic access to read only VMAs. Add code to handle this in handle_pagefault after the VMA lookup. 
Fixes: 3d420e9fa848 ("drm/xe: Rework GPU page fault handling") Signed-off-by: Jonathan Cavitt Suggested-by: Matthew Brost --- drivers/gpu/drm/xe/xe_gt_pagefault.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c index 9fa11e837dd1..3240890aac07 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -237,6 +237,11 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf) goto unlock_vm; } + if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) { + err = -EPERM; + goto unlock_vm; + } + atomic = access_is_atomic(pf->access_type); if (xe_vma_is_cpu_addr_mirror(vma)) From patchwork Thu Mar 13 18:34:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jonathan Cavitt X-Patchwork-Id: 14015765 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 160E2C35FF3 for ; Thu, 13 Mar 2025 18:34:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5052310E910; Thu, 13 Mar 2025 18:34:18 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="LQmfwW2t"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by gabe.freedesktop.org (Postfix) with ESMTPS id 998FE10E90F; Thu, 13 Mar 2025 18:34:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741890858; x=1773426858; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=c10iHRPfUSzVWdJ19XvLqKIooqItU5iV2rOyI2mR3ls=; b=LQmfwW2t82wjmATRf39R5AtrdL3u1QsRTQi6E0L9WWYLHY3Dmbx4UdHp oOmNURBpZGqesl8h/igV0V+rKs/PraMTS+yugYD1H/Z4vsv/kh8BLzQU4 KIOTVIq1FWptloy/nBXxrw1ZIL7VH4rVflPB6qZaHtGy6kyXlb7r3yhwz 8a7B5O7tJa00mPDbzCIFZzNNdWz+2hl/MlPQ45ZITI6n8WgC75dwdrVWC KwE+eYL2Si70wRqhxwuwtWDIitPKcKUMIj+H0Q3ckIC3fn7eLnM5M/yI9 ODeBQWuw/d92fwl+k2AfVyI/WtrUlu61kCCMv9/HTSGoRn2RJVSZ3KaKB w==; X-CSE-ConnectionGUID: w9lzwtWxRqufm1LF2zPtbQ== X-CSE-MsgGUID: fVJ2Ub48Rw6xUaSj2m3Pkw== X-IronPort-AV: E=McAfee;i="6700,10204,11372"; a="65485459" X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="65485459" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:17 -0700 X-CSE-ConnectionGUID: uibQ0TL/SEGEuLtWidMKLA== X-CSE-MsgGUID: hl23cZMxR6eehM7m+PeXpA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="151900772" Received: from dut4440lnl.fm.intel.com ([10.105.10.40]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:17 -0700 From: Jonathan Cavitt To: intel-xe@lists.freedesktop.org Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com, jonathan.cavitt@intel.com, joonas.lahtinen@linux.intel.com, matthew.brost@intel.com, jianxun.zhang@intel.com, shuicheng.lin@intel.com, dri-devel@lists.freedesktop.org, Michal.Wajdeczko@intel.com, michal.mzorek@intel.com Subject: [PATCH v8 2/6] drm/xe/xe_gt_pagefault: Move pagefault struct to header Date: Thu, 13 Mar 2025 18:34:04 +0000 Message-ID: <20250313183415.133863-3-jonathan.cavitt@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250313183415.133863-1-jonathan.cavitt@intel.com> References: <20250313183415.133863-1-jonathan.cavitt@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Move the pagefault struct from xe_gt_pagefault.c to the xe_gt_pagefault_types.h header file, along with the associated enum values. v2: - Normalize names for common header (Matt Brost) v3: - s/Migrate/Move (Michal W) - s/xe_pagefault/xe_gt_pagefault (Michal W) - Create new header file, xe_gt_pagefault_types.h (Michal W) - Add kernel docs (Michal W) Signed-off-by: Jonathan Cavitt Cc: Michal Wajdeczko --- drivers/gpu/drm/xe/xe_gt_pagefault.c | 41 +++---------- drivers/gpu/drm/xe/xe_gt_pagefault.h | 2 + drivers/gpu/drm/xe/xe_gt_pagefault_types.h | 67 ++++++++++++++++++++++ 3 files changed, 76 insertions(+), 34 deletions(-) create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c index 3240890aac07..06a4e3fdc11d 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -23,33 +23,6 @@ #include "xe_trace_bo.h" #include "xe_vm.h" -struct pagefault { - u64 page_addr; - u32 asid; - u16 pdata; - u8 vfid; - u8 access_type; - u8 fault_type; - u8 fault_level; - u8 engine_class; - u8 engine_instance; - u8 fault_unsuccessful; - bool trva_fault; -}; - -enum access_type { - ACCESS_TYPE_READ = 0, - ACCESS_TYPE_WRITE = 1, - ACCESS_TYPE_ATOMIC = 2, - ACCESS_TYPE_RESERVED = 3, -}; - -enum fault_type { - NOT_PRESENT = 0, - WRITE_ACCESS_VIOLATION = 1, - ATOMIC_ACCESS_VIOLATION = 2, -}; - struct acc { u64 va_range_base; u32 asid; @@ -61,9 +34,9 @@ struct acc { u8 engine_instance; }; -static bool access_is_atomic(enum access_type access_type) +static bool access_is_atomic(enum xe_gt_pagefault_access_type access_type) { - return access_type == ACCESS_TYPE_ATOMIC; + return access_type == XE_GT_PAGEFAULT_ACCESS_TYPE_ATOMIC; } static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma) @@ -205,7 +178,7 @@ static 
struct xe_vm *asid_to_vm(struct xe_device *xe, u32 asid) return vm; } -static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf) +static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf) { struct xe_device *xe = gt_to_xe(gt); struct xe_vm *vm; @@ -237,7 +210,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf) goto unlock_vm; } - if (xe_vma_read_only(vma) && pf->access_type != ACCESS_TYPE_READ) { + if (xe_vma_read_only(vma) && pf->access_type != XE_GT_PAGEFAULT_ACCESS_TYPE_READ) { err = -EPERM; goto unlock_vm; } @@ -271,7 +244,7 @@ static int send_pagefault_reply(struct xe_guc *guc, return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0); } -static void print_pagefault(struct xe_device *xe, struct pagefault *pf) +static void print_pagefault(struct xe_device *xe, struct xe_gt_pagefault *pf) { drm_dbg(&xe->drm, "\n\tASID: %d\n" "\tVFID: %d\n" @@ -291,7 +264,7 @@ static void print_pagefault(struct xe_device *xe, struct pagefault *pf) #define PF_MSG_LEN_DW 4 -static bool get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf) +static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf) { const struct xe_guc_pagefault_desc *desc; bool ret = false; @@ -378,7 +351,7 @@ static void pf_queue_work_func(struct work_struct *w) struct xe_gt *gt = pf_queue->gt; struct xe_device *xe = gt_to_xe(gt); struct xe_guc_pagefault_reply reply = {}; - struct pagefault pf = {}; + struct xe_gt_pagefault pf = {}; unsigned long threshold; int ret; diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.h b/drivers/gpu/drm/xe/xe_gt_pagefault.h index 839c065a5e4c..69b700c4915a 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.h +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.h @@ -8,6 +8,8 @@ #include +#include "xe_gt_pagefault_types.h" + struct xe_gt; struct xe_guc; diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h new file mode 100644 index 000000000000..90b7085d4b8e --- 
/dev/null +++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h @@ -0,0 +1,67 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2022-2025 Intel Corporation + */ + +#ifndef _XE_GT_PAGEFAULT_TYPES_H_ +#define _XE_GT_PAGEFAULT_TYPES_H_ + +/** + * struct xe_gt_pagefault - Structure of pagefaults returned by the + * pagefault handler + */ +struct xe_gt_pagefault { + /** @page_addr: faulted address of this pagefault */ + u64 page_addr; + /** @asid: ASID of this pagefault */ + u32 asid; + /** @pdata: PDATA of this pagefault */ + u16 pdata; + /** @vfid: VFID of this pagefault */ + u8 vfid; + /** + * @access_type: access type of this pagefault, as a value + * from xe_gt_pagefault_access_type + */ + u8 access_type; + /** + * @fault_type: fault type of this pagefault, as a value + * from xe_gt_pagefault_fault_type + */ + u8 fault_type; + /** @fault_level: fault level of this pagefault */ + u8 fault_level; + /** @engine_class: engine class this pagefault was reported on */ + u8 engine_class; + /** @engine_instance: engine instance this pagefault was reported on */ + u8 engine_instance; + /** @fault_unsuccessful: flag for if the pagefault recovered or not */ + u8 fault_unsuccessful; + /** @prefetch: unused */ + bool prefetch; + /** @trva_fault: is set if this is a TRTT fault */ + bool trva_fault; +}; + +/** + * enum xe_gt_pagefault_access_type - Access type reported to the xe_gt_pagefault + * struct. Saved to xe_gt_pagefault@access_type + */ +enum xe_gt_pagefault_access_type { + XE_GT_PAGEFAULT_ACCESS_TYPE_READ = 0, + XE_GT_PAGEFAULT_ACCESS_TYPE_WRITE = 1, + XE_GT_PAGEFAULT_ACCESS_TYPE_ATOMIC = 2, + XE_GT_PAGEFAULT_ACCESS_TYPE_RESERVED = 3, +}; + +/** + * enum xe_gt_pagefault_fault_type - Fault type reported to the xe_gt_pagefault + * struct. 
Saved to xe_gt_pagefault@fault_type + */ +enum xe_gt_pagefault_fault_type { + XE_GT_PAGEFAULT_TYPE_NOT_PRESENT = 0, + XE_GT_PAGEFAULT_TYPE_WRITE_ACCESS_VIOLATION = 1, + XE_GT_PAGEFAULT_TYPE_ATOMIC_ACCESS_VIOLATION = 2, +}; + +#endif From patchwork Thu Mar 13 18:34:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Cavitt X-Patchwork-Id: 14015770 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6B896C35FF1 for ; Thu, 13 Mar 2025 18:34:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D2E9510E91F; Thu, 13 Mar 2025 18:34:24 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="SxAX1xJz"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by gabe.freedesktop.org (Postfix) with ESMTPS id B68C410E90B; Thu, 13 Mar 2025 18:34:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741890858; x=1773426858; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UWfhJkQnqaJIf9saeCotVB+nzxmsxukD7wnRqFdpJvw=; b=SxAX1xJzRalbT+qplafKBCveqLvhTorXBBHYPSz5KG+DsZGjM8zWz53A G77xVjUl4oygXa5+wgozO7ykqUXJg+2lSOeq+4gWqZ57PV1feOhIfIbyK /TcP+sIUGsqfteaA7NNf7VUNLdo2SwMIPL1xe8Rpmt2l/m44PviecDHcq yOIGv7aoolnzmvWVr6XGYkhPEbe3QKwroe4gGuLEzwIQbnXUxW0sMBgjV jaO3nJfPY7AJNDBepKtXkKXeBVgOKtCMEh6JW63UwUjILbsfM4vcVcZdz ShHRKixpMbaFVvkqDPKVVp0stck19LdwbPNXDLBHmRBZ1QM2eol7F889D A==; X-CSE-ConnectionGUID: 
BqlTeROUQamfNzeK517goA== X-CSE-MsgGUID: dRdr9BiFR7S/i/STRNI1WA== X-IronPort-AV: E=McAfee;i="6700,10204,11372"; a="65485460" X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="65485460" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:18 -0700 X-CSE-ConnectionGUID: Jsc8vmKIRz+Yz2GmWqI00Q== X-CSE-MsgGUID: CyDlY9loQga4YiVkTOYu/w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="151900775" Received: from dut4440lnl.fm.intel.com ([10.105.10.40]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:17 -0700 From: Jonathan Cavitt To: intel-xe@lists.freedesktop.org Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com, jonathan.cavitt@intel.com, joonas.lahtinen@linux.intel.com, matthew.brost@intel.com, jianxun.zhang@intel.com, shuicheng.lin@intel.com, dri-devel@lists.freedesktop.org, Michal.Wajdeczko@intel.com, michal.mzorek@intel.com Subject: [PATCH v8 3/6] drm/xe/uapi: Define drm_xe_vm_get_property Date: Thu, 13 Mar 2025 18:34:05 +0000 Message-ID: <20250313183415.133863-4-jonathan.cavitt@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250313183415.133863-1-jonathan.cavitt@intel.com> References: <20250313183415.133863-1-jonathan.cavitt@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add initial declarations for the drm_xe_vm_get_property ioctl. 
Signed-off-by: Jonathan Cavitt --- include/uapi/drm/xe_drm.h | 69 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 616916985e3f..0ed52666b4e9 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -81,6 +81,7 @@ extern "C" { * - &DRM_IOCTL_XE_EXEC * - &DRM_IOCTL_XE_WAIT_USER_FENCE * - &DRM_IOCTL_XE_OBSERVATION + * - &DRM_IOCTL_XE_VM_GET_PROPERTY */ /* @@ -102,6 +103,7 @@ extern "C" { #define DRM_XE_EXEC 0x09 #define DRM_XE_WAIT_USER_FENCE 0x0a #define DRM_XE_OBSERVATION 0x0b +#define DRM_XE_VM_GET_PROPERTY 0x0c /* Must be kept compact -- no holes */ @@ -117,6 +119,7 @@ extern "C" { #define DRM_IOCTL_XE_EXEC DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec) #define DRM_IOCTL_XE_WAIT_USER_FENCE DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence) #define DRM_IOCTL_XE_OBSERVATION DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param) +#define DRM_IOCTL_XE_VM_GET_PROPERTY DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_GET_PROPERTY, struct drm_xe_vm_get_property) /** * DOC: Xe IOCTL Extensions @@ -1189,6 +1192,72 @@ struct drm_xe_vm_bind { __u64 reserved[2]; }; +/** struct xe_vm_fault - Describes faults for %DRM_XE_VM_GET_PROPERTY_FAULTS */ +struct xe_vm_fault { + /** @address: Address of the fault */ + __u64 address; +#define DRM_XE_FAULT_ADDRESS_TYPE_NONE_EXT 0 +#define DRM_XE_FAULT_ADDRESS_TYPE_READ_INVALID_EXT 1 +#define DRM_XE_FAULT_ADDRESS_TYPE_WRITE_INVALID_EXT 2 + /** @address_type: Type of address access that resulted in fault */ + __u32 address_type; + /** @address_precision: Precision of faulted address */ + __u32 address_precision; + /** @fault_level: fault level of the fault */ + __u8 fault_level; + /** @engine_class: class of engine fault was reported on */ + __u8 engine_class; + /** @engine_instance: instance of engine fault was reported on */ + __u8 engine_instance; + /** @pad: MBZ */ 
+ __u8 pad[5]; + /** @reserved: MBZ */ + __u64 reserved[3]; +}; + +/** + * struct drm_xe_vm_get_property - Input of &DRM_IOCTL_XE_VM_GET_PROPERTY + * + * The user provides a VM ID and a property to query for. The ioctl will return + * the size of the data expected to be returned in @size. Performing the query + * again with memory allocated to @data of size @size will return the requested + * data to the allocated memory. + * + * Some get property requests may be scalar values and require no memory allocation. + * In such cases, the first call to this ioctl will not set @size and will return + * the requested value to @value instead. + * + * The @property can be: + * - %DRM_XE_VM_GET_PROPERTY_FAULTS + */ +struct drm_xe_vm_get_property { + /** @extensions: Pointer to the first extension struct, if any */ + __u64 extensions; + + /** @vm_id: The ID of the VM to query the properties of */ + __u32 vm_id; + +#define DRM_XE_VM_GET_PROPERTY_FAULTS 0 + /** @property: property to get */ + __u32 property; + + /** @size: Size to allocate for @data */ + __u32 size; + + /** @pad: MBZ */ + __u32 pad; + + union { + /** @data: Pointer to user-defined array of flexible size and type */ + __u64 data; + /** @value: Return value for scalar queries */ + __u64 value; + }; + + /** @reserved: MBZ */ + __u64 reserved[3]; +}; + /** * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE * From patchwork Thu Mar 13 18:34:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Cavitt X-Patchwork-Id: 14015769 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6BF26C35FF6 for ; Thu, 13 Mar 2025 18:34:25 +0000 
(UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 39AB910E914; Thu, 13 Mar 2025 18:34:19 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="cIJrK95C"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by gabe.freedesktop.org (Postfix) with ESMTPS id EA87A10E90B; Thu, 13 Mar 2025 18:34:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741890858; x=1773426858; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EX1g/4ZvEY7GSy7ARqUXvg37KKeV+p0TymOsr5HLoSQ=; b=cIJrK95CUA7Di695UGor8Tp09PkfQ9L7uR3N92KdoaS8HZG8ST5oGrNX whnaso7wEqAevShE4D7pssW7okMmrfAX6bFutKfgb6SC90Gt/rc9mOEDX yiJScoJIUoUsrY3cmJq2mphhonmey67PYi+UGgQjEIzpQuFPz+OWC5VG5 RPXE4RwlJfoJubjaKZYT1qi2ZdZg2ZX5k4McfUovc+46dEan62zohDQ3D yOqJcImBqNTmZy6xlVKltrjg/umXA8BL94us0N1aEDSGeaSRWgkt1VuwF EyT+QXprwptjtJYZF5VmcYu1HoFTi8xltNkIMtiZQ/vMHJuAqaNg3Mjmb w==; X-CSE-ConnectionGUID: OR7DFRSKQz+7oXEEC7VsMQ== X-CSE-MsgGUID: CIJbIa9ISseDRc4+/IUNvA== X-IronPort-AV: E=McAfee;i="6700,10204,11372"; a="65485461" X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="65485461" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:18 -0700 X-CSE-ConnectionGUID: i/VwLW7aSGK8VVycHrGjIw== X-CSE-MsgGUID: 5Y/NUDv9RKm9MmfwXp3LXg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="151900778" Received: from dut4440lnl.fm.intel.com ([10.105.10.40]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:17 -0700 From: Jonathan Cavitt To: intel-xe@lists.freedesktop.org Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com, jonathan.cavitt@intel.com, 
joonas.lahtinen@linux.intel.com, matthew.brost@intel.com, jianxun.zhang@intel.com, shuicheng.lin@intel.com, dri-devel@lists.freedesktop.org, Michal.Wajdeczko@intel.com, michal.mzorek@intel.com Subject: [PATCH v8 4/6] drm/xe/xe_gt_pagefault: Add address_type field to pagefaults Date: Thu, 13 Mar 2025 18:34:06 +0000 Message-ID: <20250313183415.133863-5-jonathan.cavitt@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250313183415.133863-1-jonathan.cavitt@intel.com> References: <20250313183415.133863-1-jonathan.cavitt@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add a new field to the xe_gt_pagefault struct, address_type, that tracks the type of fault the pagefault incurred. Signed-off-by: Jonathan Cavitt --- drivers/gpu/drm/xe/xe_gt_pagefault.c | 3 +++ drivers/gpu/drm/xe/xe_gt_pagefault_types.h | 2 ++ 2 files changed, 5 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c index 06a4e3fdc11d..e67ee7ac3df7 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -206,11 +206,13 @@ static int handle_pagefault(struct xe_gt *gt, struct xe_gt_pagefault *pf) vma = lookup_vma(vm, pf->page_addr); if (!vma) { + pf->address_type = DRM_XE_FAULT_ADDRESS_TYPE_NONE_EXT; err = -EINVAL; goto unlock_vm; } if (xe_vma_read_only(vma) && pf->access_type != XE_GT_PAGEFAULT_ACCESS_TYPE_READ) { + pf->address_type = DRM_XE_FAULT_ADDRESS_TYPE_WRITE_INVALID_EXT; err = -EPERM; goto unlock_vm; } @@ -284,6 +286,7 @@ static bool get_pagefault(struct pf_queue *pf_queue, struct xe_gt_pagefault *pf) pf->asid = FIELD_GET(PFD_ASID, desc->dw1); pf->vfid = FIELD_GET(PFD_VFID, desc->dw2); pf->access_type = FIELD_GET(PFD_ACCESS_TYPE, desc->dw2); + 
pf->address_type = 0; pf->fault_type = FIELD_GET(PFD_FAULT_TYPE, desc->dw2); pf->page_addr = (u64)(FIELD_GET(PFD_VIRTUAL_ADDR_HI, desc->dw3)) << PFD_VIRTUAL_ADDR_HI_SHIFT; diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h index 90b7085d4b8e..68721973debb 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h +++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h @@ -24,6 +24,8 @@ struct xe_gt_pagefault { * from xe_gt_pagefault_access_type */ u8 access_type; + /** @address_type: Type of address access that resulted in fault */ + u8 address_type; /** * @fault_type: fault type of this pagefault, as a value * from xe_gt_pagefault_fault_type From patchwork Thu Mar 13 18:34:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Cavitt X-Patchwork-Id: 14015767 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ECEFFC35FF1 for ; Thu, 13 Mar 2025 18:34:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CE22410E911; Thu, 13 Mar 2025 18:34:18 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="ETxOkPzk"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3556B10E90B; Thu, 13 Mar 2025 18:34:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741890858; x=1773426858; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=oOFeBTpk3EErhmr27oaHR3HsjNEksIbd7NWrIX1Uszk=; b=ETxOkPzkQ8jbDY6RsCjLyT7DPHS+xMOZeg8RB3C8JSw9dGOkL2jr3+rC PM2AqhRuvmQ4haEDegJjQiAiferjHDGU96syVIGYZkz64lbpsnrYihOm/ B4GNPP51/KYIjuu95tzIJUAnCJm+fG09V14iWgWGzJpzZNsmKCbuTKwwi wOzO2SCamyZODl1r0OgcOrukazHeBnglV6ZYEksd1gSPxP/p9Si7CUtcJ bKD+xKVmMmTLE8+Erko0lW4LOsh1QqY5g0Y0b96K+4/m9dPfwlHRMQQBp x0ht+05mr/i505B9FQR04EaOiR9/CbUb+5qSN7fIM/R3/toT9EFt1+dHp g==; X-CSE-ConnectionGUID: NEPV0cZ/SquRhfo26j/O5Q== X-CSE-MsgGUID: 1GLKl0tsSk2QGeuEE/JWlg== X-IronPort-AV: E=McAfee;i="6700,10204,11372"; a="65485462" X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="65485462" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:18 -0700 X-CSE-ConnectionGUID: E2Db/AffSSOms+pl21uvhg== X-CSE-MsgGUID: Qh7+YtX/QjeeunOZYfjlIw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,245,1736841600"; d="scan'208";a="151900781" Received: from dut4440lnl.fm.intel.com ([10.105.10.40]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2025 11:34:18 -0700 From: Jonathan Cavitt To: intel-xe@lists.freedesktop.org Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com, jonathan.cavitt@intel.com, joonas.lahtinen@linux.intel.com, matthew.brost@intel.com, jianxun.zhang@intel.com, shuicheng.lin@intel.com, dri-devel@lists.freedesktop.org, Michal.Wajdeczko@intel.com, michal.mzorek@intel.com Subject: [PATCH v8 5/6] drm/xe/xe_vm: Add per VM fault info Date: Thu, 13 Mar 2025 18:34:07 +0000 Message-ID: <20250313183415.133863-6-jonathan.cavitt@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250313183415.133863-1-jonathan.cavitt@intel.com> References: <20250313183415.133863-1-jonathan.cavitt@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add additional information to each VM so they can report up to the first 50 seen faults. Only pagefaults are saved this way currently, though in the future, all faults should be tracked by the VM for future reporting. Additionally, of the pagefaults reported, only failed pagefaults are saved this way, as successful pagefaults should recover silently and not need to be reported to userspace. v2: - Free vm after use (Shuicheng) - Compress pf copy logic (Shuicheng) - Update fault_unsuccessful before storing (Shuicheng) - Fix old struct name in comments (Shuicheng) - Keep first 50 pagefaults instead of last 50 (Jianxun) v3: - Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt) - Fix double-locking error (jcavitt) - Assert kmemdump is successful (Shuicheng) v4: - Rename xe_vm.pfs to xe_vm.faults (jcavitt) - Store fault data and not pagefault in xe_vm faults list (jcavitt) - Store address, address type, and address precision per fault (jcavitt) - Store engine class and instance data per fault (Jianxun) - Add and fix kernel docs (Michal W) - Properly handle kzalloc error (Michal W) - s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W) - Store fault level per fault (Michal M) Signed-off-by: Jonathan Cavitt Suggested-by: Matthew Brost Cc: Shuicheng Lin Cc: Jianxun Zhang Cc: Michal Wajdeczko Cc: Michal Mzorek --- drivers/gpu/drm/xe/xe_gt_pagefault.c | 21 ++++++++++ drivers/gpu/drm/xe/xe_vm.c | 58 ++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_vm.h | 9 +++++ drivers/gpu/drm/xe/xe_vm_types.h | 33 ++++++++++++++++ 4 files changed, 121 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c index e67ee7ac3df7..927b83f7a384 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -346,6 +346,26 @@ int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 
len) return full ? -ENOSPC : 0; } +static void save_pagefault_to_vm(struct xe_device *xe, struct xe_gt_pagefault *pf) +{ + struct xe_vm *vm; + + vm = asid_to_vm(xe, pf->asid); + if (IS_ERR(vm)) + return; + + spin_lock(&vm->faults.lock); + + /** + * Limit the number of faults in the fault list to prevent memory overuse. + */ + if (vm->faults.len < MAX_FAULTS_SAVED_PER_VM) + xe_vm_add_fault_entry_pf(vm, pf); + + spin_unlock(&vm->faults.lock); + xe_vm_put(vm); +} + #define USM_QUEUE_MAX_RUNTIME_MS 20 static void pf_queue_work_func(struct work_struct *w) @@ -365,6 +385,7 @@ static void pf_queue_work_func(struct work_struct *w) if (unlikely(ret)) { print_pagefault(xe, &pf); pf.fault_unsuccessful = 1; + save_pagefault_to_vm(xe, &pf); drm_dbg(&xe->drm, "Fault response: Unsuccessful %d\n", ret); } diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 60303998bd61..067a9cedcad9 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -778,6 +778,59 @@ int xe_vm_userptr_check_repin(struct xe_vm *vm) list_empty_careful(&vm->userptr.invalidated)) ? 0 : -EAGAIN; } +/** + * xe_vm_add_fault_entry_pf() - Add pagefault to vm fault list + * @vm: The VM. + * @pf: The pagefault. + * + * This function takes the data from the pagefault @pf and saves it to @vm->faults.list. + * + * The function exits silently if the list is full, and reports a warning if the pagefault + * could not be saved to the list. 
+ */ +void xe_vm_add_fault_entry_pf(struct xe_vm *vm, struct xe_gt_pagefault *pf) +{ + struct xe_vm_fault_entry *e = NULL; + + spin_lock(&vm->faults.lock); + + if (vm->faults.len >= MAX_FAULTS_SAVED_PER_VM) + goto out; + + e = kzalloc(sizeof(*e), GFP_KERNEL); + if (!e) { + drm_warn(&vm->xe->drm, + "Could not allocate memory for fault %i!", + vm->faults.len); + goto out; + } + + e->address = pf->page_addr; + e->address_type = pf->address_type; + e->address_precision = 1; + e->fault_level = pf->fault_level; + e->engine_class = pf->engine_class; + e->engine_instance = pf->engine_instance; + + list_add_tail(&e->list, &vm->faults.list); + vm->faults.len++; +out: + spin_unlock(&vm->faults.lock); +} + +static void xe_vm_clear_fault_entries(struct xe_vm *vm) +{ + struct xe_vm_fault_entry *e, *tmp; + + spin_lock(&vm->faults.lock); + list_for_each_entry_safe(e, tmp, &vm->faults.list, list) { + list_del(&e->list); + kfree(e); + } + vm->faults.len = 0; + spin_unlock(&vm->faults.lock); +} + static int xe_vma_ops_alloc(struct xe_vma_ops *vops, bool array_of_binds) { int i; @@ -1660,6 +1713,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) init_rwsem(&vm->userptr.notifier_lock); spin_lock_init(&vm->userptr.invalidated_lock); + INIT_LIST_HEAD(&vm->faults.list); + spin_lock_init(&vm->faults.lock); + ttm_lru_bulk_move_init(&vm->lru_bulk_move); INIT_WORK(&vm->destroy_work, vm_destroy_work_func); @@ -1930,6 +1986,8 @@ void xe_vm_close_and_put(struct xe_vm *vm) } up_write(&xe->usm.lock); + xe_vm_clear_fault_entries(vm); + for_each_tile(tile, xe, id) xe_range_fence_tree_fini(&vm->rftree[id]); diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h index 0ef811fc2bde..9bd7e93824da 100644 --- a/drivers/gpu/drm/xe/xe_vm.h +++ b/drivers/gpu/drm/xe/xe_vm.h @@ -12,6 +12,12 @@ #include "xe_map.h" #include "xe_vm_types.h" +/** + * MAX_FAULTS_SAVED_PER_VM - Maximum number of faults each vm can store before future + * faults are discarded to prevent memory overuse + */ 
+#define MAX_FAULTS_SAVED_PER_VM 50 + struct drm_device; struct drm_printer; struct drm_file; @@ -22,6 +28,7 @@ struct dma_fence; struct xe_exec_queue; struct xe_file; +struct xe_gt_pagefault; struct xe_sync_entry; struct xe_svm_range; struct drm_exec; @@ -257,6 +264,8 @@ int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma); int xe_vma_userptr_check_repin(struct xe_userptr_vma *uvma); +void xe_vm_add_fault_entry_pf(struct xe_vm *vm, struct xe_gt_pagefault *pf); + bool xe_vm_validate_should_retry(struct drm_exec *exec, int err, ktime_t *end); int xe_vm_lock_vma(struct drm_exec *exec, struct xe_vma *vma); diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index 84fa41b9fa20..da5beace981b 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -19,6 +19,7 @@ #include "xe_range_fence.h" struct xe_bo; +struct xe_pagefault; struct xe_svm_range; struct xe_sync_entry; struct xe_user_fence; @@ -142,6 +143,26 @@ struct xe_userptr_vma { struct xe_device; +/** + * struct xe_vm_fault_entry - Elements of vm->faults.list + * @list: link into @xe_vm.faults.list + * @address: address of the fault + * @address_type: type of address access that resulted in fault + * @address_precision: precision of faulted address + * @fault_level: fault level of the fault + * @engine_class: class of engine fault was reported on + * @engine_instance: instance of engine fault was reported on + */ +struct xe_vm_fault_entry { + struct list_head list; + u64 address; + u32 address_type; + u32 address_precision; + u8 fault_level; + u8 engine_class; + u8 engine_instance; +}; + struct xe_vm { /** @gpuvm: base GPUVM used to track VMAs */ struct drm_gpuvm gpuvm; @@ -305,6 +326,18 @@ struct xe_vm { bool capture_once; } error_capture; + /** + * @faults: List of all faults associated with this VM + */ + struct { + /** @faults.lock: lock protecting @faults.list */ + spinlock_t lock; + /** @faults.list: list of xe_vm_fault_entry entries */ + struct 
list_head list; + /** @faults.len: length of @faults.list */ + unsigned int len; + } faults; + /** * @tlb_flush_seqno: Required TLB flush seqno for the next exec. * protected by the vm resv. From patchwork Thu Mar 13 18:34:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Cavitt X-Patchwork-Id: 14015771 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2C106C282DE for ; Thu, 13 Mar 2025 18:34:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 606FB10E920; Thu, 13 Mar 2025 18:34:25 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="loosrCnJ"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6DC2E10E90B; Thu, 13 Mar 2025 18:34:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741890859; x=1773426859; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UBa/NK3Hc+CHZCKFqa7vqXPMaY1raQMPTJhtjnQWB3Y=; b=loosrCnJOK0zRLPGqJ8I6QnlKMZ/KTNgjLQSDfo7v1/pUv41uf6j57wR 1Rn0yxEJpohI1JLCwo3X1LfNER5+tmouTqARLLsj+TJuV+BcE3pYVWhLd kto/IKYHB+0WJ5yJXpqiSlL84E7CLzmzXWBrfTSGtlPfW8hut/TA7rxwk xUr0Rv4Y1l24JnuozzW48HMGKO6sMnOiDIA/0TZoDBUSOfceLCau72mSZ dUNg+XPSvShjEq2MR13VlnRt6XbQ6RCTxa2gJ3ES4nTCzSDVXL9LUQhWt f8UTNprBd3NIDFOihXcQTBjMoEWjUrsYw069yxnWcqi3Xq2L451FF85sP g==; X-CSE-ConnectionGUID: FHVPVV+fRnOhBRDo4HkkuQ== X-CSE-MsgGUID: bN9V0ehsSOWd8j4NqE5i2w== 
From: Jonathan Cavitt
To: intel-xe@lists.freedesktop.org
Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com, jonathan.cavitt@intel.com,
 joonas.lahtinen@linux.intel.com, matthew.brost@intel.com,
 jianxun.zhang@intel.com, shuicheng.lin@intel.com,
 dri-devel@lists.freedesktop.org, Michal.Wajdeczko@intel.com,
 michal.mzorek@intel.com
Subject: [PATCH v8 6/6] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
Date: Thu, 13 Mar 2025 18:34:08 +0000
Message-ID: <20250313183415.133863-7-jonathan.cavitt@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250313183415.133863-1-jonathan.cavitt@intel.com>
References: <20250313183415.133863-1-jonathan.cavitt@intel.com>

Add support for userspace to request a list of observed faults
from a specified VM.

v2:
- Only allow querying of failed pagefaults (Matt Brost)

v3:
- Remove unnecessary size parameter from helper function, as it
  is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jianxun)
- Set address_precision to 1 (Jianxun)
- Report max size instead of dynamic size for memory allocation
  purposes.  Total memory usage is reported separately.

v4:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)

v5:
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
- Update fill_property_pfs to eliminate need for kzalloc (Jianxun)

v6:
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
  (jcavitt)

v7:
- Revert back to using xe_vm_get_property_ioctl
- Apply better copy_to_user logic (jcavitt)

Signed-off-by: Jonathan Cavitt
Suggested-by: Matthew Brost
Cc: Jianxun Zhang
Cc: Shuicheng Lin
---
 drivers/gpu/drm/xe/xe_device.c |   3 +
 drivers/gpu/drm/xe/xe_vm.c     | 134 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h     |   2 +
 3 files changed, 139 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index b2f656b2a563..74e510cb0e47 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -194,6 +194,9 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl,
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
+			  DRM_RENDER_ALLOW),
+
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 067a9cedcad9..521f0032d9e2 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -42,6 +42,14 @@
 #include "xe_wa.h"
 #include "xe_hmm.h"
 
+static const u16 xe_to_user_engine_class[] = {
+	[XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER,
+	[XE_ENGINE_CLASS_COPY] = DRM_XE_ENGINE_CLASS_COPY,
+	[XE_ENGINE_CLASS_VIDEO_DECODE] = DRM_XE_ENGINE_CLASS_VIDEO_DECODE,
+	[XE_ENGINE_CLASS_VIDEO_ENHANCE] = DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE,
+	[XE_ENGINE_CLASS_COMPUTE] = DRM_XE_ENGINE_CLASS_COMPUTE,
+};
+
 static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
 {
 	return vm->gpuvm.r_obj;
@@ -3551,6 +3559,132 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	return err;
 }
 
+static int xe_vm_get_property_size(struct xe_vm *vm, u32 property)
+{
+	int size = -EINVAL;
+
+	switch (property) {
+	case DRM_XE_VM_GET_PROPERTY_FAULTS:
+		spin_lock(&vm->faults.lock);
+		size = vm->faults.len * sizeof(struct xe_vm_fault);
+		spin_unlock(&vm->faults.lock);
+		break;
+	default:
+		break;
+	}
+	return size;
+}
+
+static int xe_vm_get_property_verify_size(struct xe_vm *vm, u32 property,
+					  int expected, int actual)
+{
+	switch (property) {
+	case DRM_XE_VM_GET_PROPERTY_FAULTS:
+		/*
+		 * Number of faults may increase between calls to
+		 * xe_vm_get_property_ioctl, so just report the
+		 * number of faults the user requests if it's less
+		 * than or equal to the number of faults in the VM
+		 * fault array.
+		 */
+		if (actual < expected)
+			return -EINVAL;
+		break;
+	default:
+		if (actual != expected)
+			return -EINVAL;
+		break;
+	}
+	return 0;
+}
+
+static int fill_faults(struct xe_vm *vm,
+		       struct drm_xe_vm_get_property *args)
+{
+	struct xe_vm_fault __user *usr_ptr = u64_to_user_ptr(args->data);
+	struct xe_vm_fault store = { 0 };
+	struct xe_vm_fault_entry *entry;
+	int ret = 0, i = 0, count;
+
+	count = args->size / sizeof(struct xe_vm_fault);
+
+	spin_lock(&vm->faults.lock);
+	list_for_each_entry(entry, &vm->faults.list, list) {
+		if (i++ == count)
+			break;
+
+		memset(&store, 0, sizeof(struct xe_vm_fault));
+
+		store.address = entry->address;
+		store.address_type = entry->address_type;
+		store.address_precision = entry->address_precision;
+		store.fault_level = entry->fault_level;
+		store.engine_class = xe_to_user_engine_class[entry->engine_class];
+		store.engine_instance = entry->engine_instance;
+
+		ret = copy_to_user(usr_ptr, &store, sizeof(struct xe_vm_fault));
+		if (ret)
+			break;
+
+		usr_ptr++;
+	}
+	spin_unlock(&vm->faults.lock);
+
+	return ret;
+}
+
+static int xe_vm_get_property_fill_data(struct xe_vm *vm,
+					struct drm_xe_vm_get_property *args)
+{
+	switch (args->property) {
+	case DRM_XE_VM_GET_PROPERTY_FAULTS:
+		return fill_faults(vm, args);
+	default:
+		break;
+	}
+	return -EINVAL;
+}
+
+int xe_vm_get_property_ioctl(struct drm_device *drm, void *data,
+			     struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(drm);
+	struct xe_file *xef = to_xe_file(file);
+	struct drm_xe_vm_get_property *args = data;
+	struct xe_vm *vm;
+	int size, ret = 0;
+
+	if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
+		return -EINVAL;
+
+	vm = xe_vm_lookup(xef, args->vm_id);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -ENOENT;
+
+	size = xe_vm_get_property_size(vm, args->property);
+
+	if (size < 0) {
+		ret = size;
+		goto put_vm;
+	} else if (!args->size) {
+		args->size = size;
+		goto put_vm;
+	}
+
+	ret = xe_vm_get_property_verify_size(vm, args->property,
+					     args->size, size);
+	if (XE_IOCTL_DBG(xe, ret)) {
+		ret = -EINVAL;
+		goto put_vm;
+	}
+
+	ret = xe_vm_get_property_fill_data(vm, args);
+
+put_vm:
+	xe_vm_put(vm);
+	return ret;
+}
+
 /**
  * xe_vm_bind_kernel_bo - bind a kernel BO to a VM
  * @vm: VM to bind the BO to
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 9bd7e93824da..63ec22458e04 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -196,6 +196,8 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
 int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file);
+int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file);
 
 void xe_vm_close_and_put(struct xe_vm *vm);
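
For readers following the size handling in xe_vm_get_property_ioctl, the
two-call protocol can be modelled in plain userspace C. This is only a
sketch under stated assumptions, not the driver code: struct fault_record,
struct fault_store, fault_store_add() and fault_store_query() are
hypothetical names invented for illustration, the array stands in for
vm->faults.list, and locking and copy_to_user() are left out. It mirrors
three behaviours visible in the patches: only the first 50 faults are kept,
a query with size 0 reports the number of bytes currently stored, and a
query asking for more than is stored is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FAULTS_SAVED 50 /* mirrors MAX_FAULTS_SAVED_PER_VM in the patch */

/* Hypothetical record layout; it only stands in for the uapi fault struct. */
struct fault_record {
	uint64_t address;
	uint32_t address_type;
	uint32_t address_precision;
	uint8_t fault_level;
	uint8_t engine_class;
	uint8_t engine_instance;
};

/* Hypothetical model of vm->faults: a bounded store keeping the first N. */
struct fault_store {
	struct fault_record rec[MAX_FAULTS_SAVED];
	unsigned int len;
};

/* Returns 1 if the fault was stored, 0 if the store was already full. */
static int fault_store_add(struct fault_store *s, const struct fault_record *r)
{
	assert(s->len <= MAX_FAULTS_SAVED);
	if (s->len >= MAX_FAULTS_SAVED)
		return 0; /* later faults are dropped silently */
	s->rec[s->len++] = *r;
	return 1;
}

/*
 * Model of the two-call query.
 *   *size == 0: write the number of bytes currently stored to *size.
 *   *size  > 0: fail if it exceeds the stored bytes, otherwise copy
 *               *size / sizeof(struct fault_record) records, oldest first.
 * Returns 0 on success and -1 when the requested size is too large.
 */
static int fault_store_query(const struct fault_store *s, void *buf, size_t *size)
{
	size_t avail = (size_t)s->len * sizeof(struct fault_record);
	size_t count;

	if (*size == 0) {
		*size = avail;
		return 0;
	}
	if (*size > avail)
		return -1;

	count = *size / sizeof(struct fault_record);
	memcpy(buf, s->rec, count * sizeof(struct fault_record));
	return 0;
}
```

A caller would typically query once with size 0, allocate that many bytes,
and query again with the reported size; asking for fewer bytes returns only
the oldest records.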