From patchwork Mon Nov 18 23:37:29 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879191
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 01/29] dma-fence: Add dma_fence_preempt base class
Date: Mon, 18 Nov 2024 15:37:29 -0800
Message-Id: <20241118233757.2374041-2-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

Add a dma_fence_preempt base class with driver ops to implement preemption, based on the existing Xe
preemptive fence implementation. Annotated to ensure correct driver usage. Cc: Dave Airlie Cc: Simona Vetter Cc: Christian Koenig Signed-off-by: Matthew Brost --- drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/dma-fence-preempt.c | 133 ++++++++++++++++++++++++++++ include/linux/dma-fence-preempt.h | 56 ++++++++++++ 3 files changed, 190 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-preempt.c create mode 100644 include/linux/dma-fence-preempt.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 70ec901edf2c..c25500bb38b5 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ - dma-fence-unwrap.o dma-resv.o + dma-fence-preempt.o dma-fence-unwrap.o dma-resv.o obj-$(CONFIG_DMABUF_HEAPS) += dma-heap.o obj-$(CONFIG_DMABUF_HEAPS) += heaps/ obj-$(CONFIG_SYNC_FILE) += sync_file.o diff --git a/drivers/dma-buf/dma-fence-preempt.c b/drivers/dma-buf/dma-fence-preempt.c new file mode 100644 index 000000000000..6e6ce7ea7421 --- /dev/null +++ b/drivers/dma-buf/dma-fence-preempt.c @@ -0,0 +1,133 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2024 Intel Corporation + */ + +#include +#include + +static void dma_fence_preempt_work_func(struct work_struct *w) +{ + bool cookie = dma_fence_begin_signalling(); + struct dma_fence_preempt *pfence = + container_of(w, typeof(*pfence), work); + const struct dma_fence_preempt_ops *ops = pfence->ops; + int err = pfence->base.error; + + if (!err) { + err = ops->preempt_wait(pfence); + if (err) + dma_fence_set_error(&pfence->base, err); + } + + dma_fence_signal(&pfence->base); + ops->preempt_finished(pfence); + + dma_fence_end_signalling(cookie); +} + +static const char * +dma_fence_preempt_get_driver_name(struct dma_fence *fence) +{ + return "dma_fence_preempt"; +} + +static const char * +dma_fence_preempt_get_timeline_name(struct dma_fence *fence) +{ + return "ordered"; +} + +static void dma_fence_preempt_issue(struct dma_fence_preempt *pfence) +{ + int err; + + err = pfence->ops->preempt(pfence); + if (err) + dma_fence_set_error(&pfence->base, err); + + queue_work(pfence->wq, &pfence->work); +} + +static void dma_fence_preempt_cb(struct dma_fence *fence, + struct dma_fence_cb *cb) +{ + struct dma_fence_preempt *pfence = + container_of(cb, typeof(*pfence), cb); + + dma_fence_preempt_issue(pfence); +} + +static void dma_fence_preempt_delay(struct dma_fence_preempt *pfence) +{ + struct dma_fence *fence; + int err; + + fence = pfence->ops->preempt_delay(pfence); + if (WARN_ON_ONCE(!fence || IS_ERR(fence))) + return; + + err = dma_fence_add_callback(fence, &pfence->cb, dma_fence_preempt_cb); + if (err == -ENOENT) + dma_fence_preempt_issue(pfence); +} + +static bool dma_fence_preempt_enable_signaling(struct dma_fence *fence) +{ + struct dma_fence_preempt *pfence = + container_of(fence, typeof(*pfence), base); + + if (pfence->ops->preempt_delay) + dma_fence_preempt_delay(pfence); + else + dma_fence_preempt_issue(pfence); + + return true; +} + +static const struct dma_fence_ops preempt_fence_ops = { + .get_driver_name = dma_fence_preempt_get_driver_name, + .get_timeline_name = dma_fence_preempt_get_timeline_name, + .enable_signaling = dma_fence_preempt_enable_signaling, +}; + +/** + * dma_fence_is_preempt() - Is preempt fence + * + * @fence: Preempt fence + * + * Return: True if preempt fence, False otherwise + */ +bool dma_fence_is_preempt(const struct dma_fence *fence) +{ + return 
fence->ops == &preempt_fence_ops; +} +EXPORT_SYMBOL(dma_fence_is_preempt); + +/** + * dma_fence_preempt_init() - Initialize preempt fence + * + * @fence: Preempt fence + * @ops: Preempt fence operations + * @wq: Work queue for preempt wait, should have WQ_MEM_RECLAIM set + * @context: Fence context + * @seqno: Fence sequence number + */ +void dma_fence_preempt_init(struct dma_fence_preempt *fence, + const struct dma_fence_preempt_ops *ops, + struct workqueue_struct *wq, + u64 context, u64 seqno) +{ + /* + * XXX: We really want to check wq for WQ_MEM_RECLAIM here but + * workqueue_struct is private. + */ + + fence->ops = ops; + fence->wq = wq; + INIT_WORK(&fence->work, dma_fence_preempt_work_func); + spin_lock_init(&fence->lock); + dma_fence_init(&fence->base, &preempt_fence_ops, + &fence->lock, context, seqno); +} +EXPORT_SYMBOL(dma_fence_preempt_init); diff --git a/include/linux/dma-fence-preempt.h b/include/linux/dma-fence-preempt.h new file mode 100644 index 000000000000..28d803f89527 --- /dev/null +++ b/include/linux/dma-fence-preempt.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2024 Intel Corporation + */ + +#ifndef __LINUX_DMA_FENCE_PREEMPT_H +#define __LINUX_DMA_FENCE_PREEMPT_H + +#include +#include + +struct dma_fence_preempt; +struct dma_resv; + +/** + * struct dma_fence_preempt_ops - Preempt fence operations + * + * These functions should be implemented by the driver. + */ +struct dma_fence_preempt_ops { + /** @preempt_delay: Preempt execution with a delay */ + struct dma_fence *(*preempt_delay)(struct dma_fence_preempt *fence); + /** @preempt: Preempt execution */ + int (*preempt)(struct dma_fence_preempt *fence); + /** @preempt_wait: Wait for preempt of execution to complete */ + int (*preempt_wait)(struct dma_fence_preempt *fence); + /** @preempt_finished: Signal that the preempt has finished */ + void (*preempt_finished)(struct dma_fence_preempt *fence); +}; + +/** + * struct dma_fence_preempt - Embedded preempt fence base class + */ +struct dma_fence_preempt { + /** @base: Fence base class */ + struct dma_fence base; + /** @lock: Spinlock for fence handling */ + spinlock_t lock; + /** @cb: Callback for preempt delay */ + struct dma_fence_cb cb; + /** @ops: Preempt fence operations */ + const struct dma_fence_preempt_ops *ops; + /** @wq: Work queue for preempt wait */ + struct workqueue_struct *wq; + /** @work: Work struct for preempt wait */ + struct work_struct work; +}; + +bool dma_fence_is_preempt(const struct dma_fence *fence);

void dma_fence_preempt_init(struct dma_fence_preempt *fence, + const struct dma_fence_preempt_ops *ops, + struct workqueue_struct *wq, + u64 context, u64 seqno); + +#endif
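To make the intended calling convention concrete, here is a minimal, hypothetical driver-side sketch of wiring up the base class (the foo_* names and hardware helpers are invented for illustration; only the ops table and dma_fence_preempt_init() come from this patch):

struct foo_preempt_fence {
	struct dma_fence_preempt base;	/* base class must be embedded */
	struct foo_hw_ctx *ctx;		/* hypothetical driver state */
};

static struct foo_preempt_fence *to_foo_fence(struct dma_fence_preempt *fence)
{
	return container_of(fence, struct foo_preempt_fence, base);
}

/* Issue the preemption request; must not block. */
static int foo_preempt(struct dma_fence_preempt *fence)
{
	return foo_hw_request_preempt(to_foo_fence(fence)->ctx);
}

/* Runs from @wq; may block until the hardware has been preempted. */
static int foo_preempt_wait(struct dma_fence_preempt *fence)
{
	return foo_hw_wait_preempted(to_foo_fence(fence)->ctx);
}

/* Called after the fence has signaled; drop driver references here. */
static void foo_preempt_finished(struct dma_fence_preempt *fence)
{
	foo_hw_ctx_put(to_foo_fence(fence)->ctx);
}

static const struct dma_fence_preempt_ops foo_preempt_ops = {
	.preempt = foo_preempt,
	.preempt_wait = foo_preempt_wait,
	.preempt_finished = foo_preempt_finished,
};

static void foo_arm_preempt_fence(struct foo_preempt_fence *fence,
				  struct workqueue_struct *wq,
				  u64 context, u64 seqno)
{
	/* @wq should have WQ_MEM_RECLAIM set, per the kernel-doc above. */
	dma_fence_preempt_init(&fence->base, &foo_preempt_ops, wq,
			       context, seqno);
}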
From patchwork Mon Nov 18 23:37:30 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879195
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 02/29] dma-fence: Add dma_fence_user_fence
Date: Mon, 18 Nov 2024 15:37:30 -0800
Message-Id: <20241118233757.2374041-3-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

Normalize user fence attachment to a DMA fence. A user fence is a simple seqno write to memory, implemented by attaching a DMA fence callback that writes out the seqno. The intended use case is importing a dma-fence into the kernel and exporting a user fence. Helpers are added to allocate, attach, and free a dma_fence_user_fence.
Cc: Dave Airlie
Cc: Simona Vetter
Cc: Christian Koenig
Signed-off-by: Matthew Brost
---
 drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/dma-fence-user-fence.c | 73 ++++++++++++++++++++++++++ include/linux/dma-fence-user-fence.h | 31 +++++++++++ 3 files changed, 105 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-user-fence.c create mode 100644 include/linux/dma-fence-user-fence.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index c25500bb38b5..ba9ba339319e 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ - dma-fence-preempt.o dma-fence-unwrap.o dma-resv.o + dma-fence-preempt.o dma-fence-unwrap.o dma-fence-user-fence.o dma-resv.o obj-$(CONFIG_DMABUF_HEAPS) += dma-heap.o obj-$(CONFIG_DMABUF_HEAPS) += heaps/ obj-$(CONFIG_SYNC_FILE) += sync_file.o diff --git a/drivers/dma-buf/dma-fence-user-fence.c b/drivers/dma-buf/dma-fence-user-fence.c new file mode 100644 index 000000000000..5a4b289bacb8 --- /dev/null +++ b/drivers/dma-buf/dma-fence-user-fence.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2024 Intel Corporation + */ + +#include +#include + +static void user_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb) +{ + struct dma_fence_user_fence *user_fence = + container_of(cb, struct dma_fence_user_fence, cb); + + if (user_fence->map.is_iomem) + writeq(user_fence->seqno, user_fence->map.vaddr_iomem); + else + *(u64 *)user_fence->map.vaddr = user_fence->seqno; + + dma_fence_user_fence_free(user_fence); +} + +/** + * dma_fence_user_fence_alloc() - Allocate user fence + * + * Return: Allocated struct dma_fence_user_fence on success, NULL on failure + */ +struct dma_fence_user_fence *dma_fence_user_fence_alloc(void) +{ + return kmalloc(sizeof(struct dma_fence_user_fence), GFP_KERNEL); +} +EXPORT_SYMBOL(dma_fence_user_fence_alloc); + +/** + * dma_fence_user_fence_free() - Free user fence + * + * Free user fence. Should only be called if dma_fence_user_fence_attach() has + * not been called, to clean up the original allocation from + * dma_fence_user_fence_alloc(). + */ +void dma_fence_user_fence_free(struct dma_fence_user_fence *user_fence) +{ + kfree(user_fence); +} +EXPORT_SYMBOL(dma_fence_user_fence_free); + +/** + * dma_fence_user_fence_attach() - Attach user fence to dma-fence + * + * @fence: fence + * @user_fence: user fence + * @map: IOSYS map to write seqno to + * @seqno: seqno to write to IOSYS map + * + * Attach a user fence, which is a seqno write to an IOSYS map, to a DMA fence. + * The caller must guarantee that the memory in the IOSYS map doesn't move + * before the fence signals. This is typically done by installing the DMA fence + * into the BO's DMA reservation bookkeeping slot from which the IOSYS map was + * derived.
+ */ +void dma_fence_user_fence_attach(struct dma_fence *fence, + struct dma_fence_user_fence *user_fence, + struct iosys_map *map, u64 seqno) +{ + int err; + + user_fence->map = *map; + user_fence->seqno = seqno; + + err = dma_fence_add_callback(fence, &user_fence->cb, user_fence_cb); + if (err == -ENOENT) + user_fence_cb(NULL, &user_fence->cb); +} +EXPORT_SYMBOL(dma_fence_user_fence_attach); diff --git a/include/linux/dma-fence-user-fence.h b/include/linux/dma-fence-user-fence.h new file mode 100644 index 000000000000..8678129c7d56 --- /dev/null +++ b/include/linux/dma-fence-user-fence.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2024 Intel Corporation + */ + +#ifndef __LINUX_DMA_FENCE_USER_FENCE_H +#define __LINUX_DMA_FENCE_USER_FENCE_H + +#include +#include + +/** struct dma_fence_user_fence - User fence */ +struct dma_fence_user_fence { + /** @cb: dma-fence callback used to attach user fence to dma-fence */ + struct dma_fence_cb cb; + /** @map: IOSYS map to write seqno to */ + struct iosys_map map; + /** @seqno: seqno to write to IOSYS map */ + u64 seqno; +}; + +struct dma_fence_user_fence *dma_fence_user_fence_alloc(void); + +void dma_fence_user_fence_free(struct dma_fence_user_fence *user_fence); + +void dma_fence_user_fence_attach(struct dma_fence *fence, + struct dma_fence_user_fence *user_fence, + struct iosys_map *map, + u64 seqno); + +#endif
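As a usage illustration, exporting a user fence for an existing dma-fence then looks like the hypothetical helper below (foo_export_user_fence() is not part of this patch; @map is assumed to cover memory that userspace polls):

static int foo_export_user_fence(struct dma_fence *fence,
				 struct iosys_map *map, u64 seqno)
{
	struct dma_fence_user_fence *user_fence;

	user_fence = dma_fence_user_fence_alloc();
	if (!user_fence)
		return -ENOMEM;

	/*
	 * Ownership transfers here: the seqno is written and user_fence is
	 * freed from the fence callback, or immediately if @fence has
	 * already signaled (dma_fence_add_callback() returning -ENOENT).
	 */
	dma_fence_user_fence_attach(fence, user_fence, map, seqno);

	return 0;
}

Userspace then waits by polling the 64-bit seqno location until it reaches the exported value.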
From patchwork Mon Nov 18 23:37:31 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879198
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 03/29] drm/xe: Use dma_fence_preempt base class
Date: Mon, 18 Nov 2024 15:37:31 -0800
Message-Id: <20241118233757.2374041-4-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

Use the dma_fence_preempt base class in Xe instead of open-coding the preemption implementation.

Cc: Dave Airlie
Cc: Simona Vetter
Cc: Christian Koenig
Signed-off-by: Matthew Brost
---
 drivers/dma-buf/dma-fence-preempt.c | 5 +- drivers/gpu/drm/xe/xe_guc_submit.c | 3 + drivers/gpu/drm/xe/xe_hw_engine_group.c | 4 +- drivers/gpu/drm/xe/xe_preempt_fence.c | 80 ++++++--------------- drivers/gpu/drm/xe/xe_preempt_fence.h | 2 +- drivers/gpu/drm/xe/xe_preempt_fence_types.h | 11 +-- 6 files changed, 34 insertions(+), 71 deletions(-) diff --git a/drivers/dma-buf/dma-fence-preempt.c b/drivers/dma-buf/dma-fence-preempt.c index 6e6ce7ea7421..bcc5e5cec919 100644 --- a/drivers/dma-buf/dma-fence-preempt.c +++ b/drivers/dma-buf/dma-fence-preempt.c @@ -8,11 +8,11 @@ static void dma_fence_preempt_work_func(struct work_struct *w) { - bool cookie = dma_fence_begin_signalling(); struct dma_fence_preempt *pfence = container_of(w, typeof(*pfence), work); const struct dma_fence_preempt_ops *ops = pfence->ops; int err = pfence->base.error; + bool cookie = dma_fence_begin_signalling(); if (!err) { err = ops->preempt_wait(pfence); @@ -23,6 +23,7 @@ static void dma_fence_preempt_work_func(struct work_struct *w) dma_fence_signal(&pfence->base); ops->preempt_finished(pfence); + /* The entire worker is the signaling path, thus annotate the entirety */ dma_fence_end_signalling(cookie); } @@ -109,7 +110,7 @@ EXPORT_SYMBOL(dma_fence_is_preempt); * * @fence: Preempt fence * @ops: Preempt fence operations - * @wq: Work queue for preempt wait, should have WQ_MEM_RECLAIM set + * @wq: Work queue for preempt wait, must have WQ_MEM_RECLAIM set * @context: Fence context * @seqno: Fence sequence number */ diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index f9ecee5364d8..58a3f4bb3887 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -1603,6 +1603,9 @@ static int guc_exec_queue_suspend_wait(struct xe_exec_queue *q) struct xe_guc *guc = exec_queue_to_guc(q); int ret; + if (exec_queue_reset(q) || exec_queue_killed_or_banned_or_wedged(q)) + return -ECANCELED; + /* * Likely don't need to check exec_queue_killed() as we clear * suspend_pending upon kill but to be paranoid but races in
which diff --git a/drivers/gpu/drm/xe/xe_hw_engine_group.c b/drivers/gpu/drm/xe/xe_hw_engine_group.c index 82750520a90a..8ed5410c3964 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_group.c +++ b/drivers/gpu/drm/xe/xe_hw_engine_group.c @@ -163,7 +163,7 @@ int xe_hw_engine_group_add_exec_queue(struct xe_hw_engine_group *group, struct x if (xe_vm_in_fault_mode(q->vm) && group->cur_mode == EXEC_MODE_DMA_FENCE) { q->ops->suspend(q); err = q->ops->suspend_wait(q); - if (err) + if (err == -ETIME) goto err_suspend; xe_hw_engine_group_resume_faulting_lr_jobs(group); @@ -236,7 +236,7 @@ static int xe_hw_engine_group_suspend_faulting_lr_jobs(struct xe_hw_engine_group continue; err = q->ops->suspend_wait(q); - if (err) + if (err == -ETIME) goto err_suspend; } diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c index 83fbeea5aa20..80a8bc82f3cc 100644 --- a/drivers/gpu/drm/xe/xe_preempt_fence.c +++ b/drivers/gpu/drm/xe/xe_preempt_fence.c @@ -4,73 +4,40 @@ */ #include "xe_preempt_fence.h" - -#include - #include "xe_exec_queue.h" #include "xe_vm.h" -static void preempt_fence_work_func(struct work_struct *w) +static struct xe_exec_queue *to_exec_queue(struct dma_fence_preempt *fence) { - bool cookie = dma_fence_begin_signalling(); - struct xe_preempt_fence *pfence = - container_of(w, typeof(*pfence), preempt_work); - struct xe_exec_queue *q = pfence->q; - - if (pfence->error) { - dma_fence_set_error(&pfence->base, pfence->error); - } else if (!q->ops->reset_status(q)) { - int err = q->ops->suspend_wait(q); - - if (err) - dma_fence_set_error(&pfence->base, err); - } else { - dma_fence_set_error(&pfence->base, -ENOENT); - } - - dma_fence_signal(&pfence->base); - /* - * Opt for keep everything in the fence critical section. This looks really strange since we - * have just signalled the fence, however the preempt fences are all signalled via single - * global ordered-wq, therefore anything that happens in this callback can easily block - * progress on the entire wq, which itself may prevent other published preempt fences from - * ever signalling. Therefore try to keep everything here in the callback in the fence - * critical section. For example if something below grabs a scary lock like vm->lock, - * lockdep should complain since we also hold that lock whilst waiting on preempt fences to - * complete. 
- */ - xe_vm_queue_rebind_worker(q->vm); - xe_exec_queue_put(q); - dma_fence_end_signalling(cookie); + return container_of(fence, struct xe_preempt_fence, base)->q; } -static const char * -preempt_fence_get_driver_name(struct dma_fence *fence) +static int xe_preempt_fence_preempt(struct dma_fence_preempt *fence) { - return "xe"; + struct xe_exec_queue *q = to_exec_queue(fence); + + return q->ops->suspend(q); } -static const char * -preempt_fence_get_timeline_name(struct dma_fence *fence) +static int xe_preempt_fence_preempt_wait(struct dma_fence_preempt *fence) { - return "preempt"; + struct xe_exec_queue *q = to_exec_queue(fence); + + return q->ops->suspend_wait(q); } -static bool preempt_fence_enable_signaling(struct dma_fence *fence) +static void xe_preempt_fence_preempt_finished(struct dma_fence_preempt *fence) { - struct xe_preempt_fence *pfence = - container_of(fence, typeof(*pfence), base); - struct xe_exec_queue *q = pfence->q; + struct xe_exec_queue *q = to_exec_queue(fence); - pfence->error = q->ops->suspend(q); - queue_work(q->vm->xe->preempt_fence_wq, &pfence->preempt_work); - return true; + xe_vm_queue_rebind_worker(q->vm); + xe_exec_queue_put(q); } -static const struct dma_fence_ops preempt_fence_ops = { - .get_driver_name = preempt_fence_get_driver_name, - .get_timeline_name = preempt_fence_get_timeline_name, - .enable_signaling = preempt_fence_enable_signaling, +static const struct dma_fence_preempt_ops xe_preempt_fence_ops = { + .preempt = xe_preempt_fence_preempt, + .preempt_wait = xe_preempt_fence_preempt_wait, + .preempt_finished = xe_preempt_fence_preempt_finished, }; /** @@ -95,7 +62,6 @@ struct xe_preempt_fence *xe_preempt_fence_alloc(void) return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(&pfence->link); - INIT_WORK(&pfence->preempt_work, preempt_fence_work_func); return pfence; } @@ -134,11 +100,11 @@ xe_preempt_fence_arm(struct xe_preempt_fence *pfence, struct xe_exec_queue *q, { list_del_init(&pfence->link); pfence->q = xe_exec_queue_get(q); - spin_lock_init(&pfence->lock); - dma_fence_init(&pfence->base, &preempt_fence_ops, - &pfence->lock, context, seqno); - return &pfence->base; + dma_fence_preempt_init(&pfence->base, &xe_preempt_fence_ops, + q->vm->xe->preempt_fence_wq, context, seqno); + + return &pfence->base.base; } /** @@ -169,5 +135,5 @@ xe_preempt_fence_create(struct xe_exec_queue *q, bool xe_fence_is_xe_preempt(const struct dma_fence *fence) { - return fence->ops == &preempt_fence_ops; + return dma_fence_is_preempt(fence); } diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.h b/drivers/gpu/drm/xe/xe_preempt_fence.h index 9406c6fea525..7b56d12c0786 100644 --- a/drivers/gpu/drm/xe/xe_preempt_fence.h +++ b/drivers/gpu/drm/xe/xe_preempt_fence.h @@ -25,7 +25,7 @@ xe_preempt_fence_arm(struct xe_preempt_fence *pfence, struct xe_exec_queue *q, static inline struct xe_preempt_fence * to_preempt_fence(struct dma_fence *fence) { - return container_of(fence, struct xe_preempt_fence, base); + return container_of(fence, struct xe_preempt_fence, base.base); } /** diff --git a/drivers/gpu/drm/xe/xe_preempt_fence_types.h b/drivers/gpu/drm/xe/xe_preempt_fence_types.h index 312c3372a49f..f12b89f7dc35 100644 --- a/drivers/gpu/drm/xe/xe_preempt_fence_types.h +++ b/drivers/gpu/drm/xe/xe_preempt_fence_types.h @@ -6,8 +6,7 @@ #ifndef _XE_PREEMPT_FENCE_TYPES_H_ #define _XE_PREEMPT_FENCE_TYPES_H_ -#include -#include +#include struct xe_exec_queue; @@ -18,17 +17,11 @@ struct xe_exec_queue; */ struct xe_preempt_fence { /** @base: dma fence base */ - struct dma_fence base; + struct 
dma_fence_preempt base; /** @link: link into list of pending preempt fences */ struct list_head link; /** @q: exec queue for this preempt fence */ struct xe_exec_queue *q; - /** @preempt_work: work struct which issues preemption */ - struct work_struct preempt_work; - /** @lock: dma-fence fence lock */ - spinlock_t lock; - /** @error: preempt fence is in error state */ - int error; }; #endif
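For reference, the resulting control flow, assembled from this patch plus patch 01 (Xe implements no preempt_delay op here; foo_force_preempt() is a hypothetical caller, not part of the series):

/*
 * dma_fence_enable_sw_signaling(&pfence->base.base)
 *   -> dma_fence_preempt_enable_signaling()
 *      -> xe_preempt_fence_preempt()        (q->ops->suspend(q))
 *      -> queue_work(q->vm->xe->preempt_fence_wq, &pfence->base.work)
 * dma_fence_preempt_work_func()
 *   -> xe_preempt_fence_preempt_wait()      (q->ops->suspend_wait(q))
 *   -> dma_fence_signal(&pfence->base.base)
 *   -> xe_preempt_fence_preempt_finished()  (rebind worker, queue put)
 */
static long foo_force_preempt(struct xe_preempt_fence *pfence)
{
	struct dma_fence *fence = &pfence->base.base;	/* two-level embedding */

	dma_fence_enable_sw_signaling(fence);
	return dma_fence_wait(fence, false);
}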
From patchwork Mon Nov 18 23:37:32 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879193
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 04/29] drm/xe: Allocate doorbells for UMD exec queues
Date: Mon, 18 Nov 2024 15:37:32 -0800
Message-Id: <20241118233757.2374041-5-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

These doorbells will be mapped to user space for UMD submission. Add infrastructure to the GuC submission backend to manage these.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_exec_queue_types.h | 2 + drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 7 ++ drivers/gpu/drm/xe/xe_guc_submit.c | 107 +++++++++++++++++-- 3 files changed, 106 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h index 1158b6062a6c..7f68587d4021 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h @@ -83,6 +83,8 @@ struct xe_exec_queue { #define EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD BIT(3) /* kernel exec_queue only, set priority to highest level */ #define EXEC_QUEUE_FLAG_HIGH_PRIORITY BIT(4) +/* queue used for UMD submission */ +#define EXEC_QUEUE_FLAG_UMD_SUBMISSION BIT(5) /** * @flags: flags for this exec queue, should statically setup aside from ban diff --git a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h index 4c39f01e4f52..2d53af75ed75 100644 --- a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h +++ b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h @@ -47,6 +47,13 @@ struct xe_guc_exec_queue { u16 id; /** @suspend_wait: wait queue used to wait on pending suspends */ wait_queue_head_t suspend_wait; + /** @db: doorbell state */ + struct { + /** @db.id: doorbell ID */ + int id; + /** @db.dpa: doorbell device physical address */ + u64 dpa; + } db; /** @suspend_pending: a suspend of the exec_queue is pending */ bool suspend_pending; }; diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 58a3f4bb3887..cc7a98c1343e 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -29,6 +29,7 @@ #include "xe_guc.h" #include "xe_guc_capture.h" #include "xe_guc_ct.h" +#include "xe_guc_db_mgr.h" #include "xe_guc_exec_queue_types.h" #include "xe_guc_id_mgr.h" #include "xe_guc_submit_types.h" @@ -67,6 +68,7 @@ exec_queue_to_guc(struct xe_exec_queue *q) #define EXEC_QUEUE_STATE_BANNED (1 << 9) #define EXEC_QUEUE_STATE_CHECK_TIMEOUT (1 << 10) #define EXEC_QUEUE_STATE_EXTRA_REF (1 << 11) +#define EXEC_QUEUE_STATE_DB_REGISTERED (1 << 12) static bool exec_queue_registered(struct xe_exec_queue *q) { @@ -218,6 +220,16 @@ static void set_exec_queue_extra_ref(struct xe_exec_queue *q) atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state); } +static bool exec_queue_doorbell_registered(struct xe_exec_queue *q) +{ + return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_DB_REGISTERED; +} + +static void set_exec_queue_doorbell_registered(struct xe_exec_queue *q) +{ + atomic_or(EXEC_QUEUE_STATE_DB_REGISTERED, &q->guc->state); +} + static bool exec_queue_killed_or_banned_or_wedged(struct xe_exec_queue *q) { return (atomic_read(&q->guc->state) & @@ -354,13 +366,6 @@ static int alloc_guc_id(struct xe_guc *guc, struct xe_exec_queue *q) return ret; } -static void release_guc_id(struct xe_guc *guc, struct xe_exec_queue *q) -{ - mutex_lock(&guc->submission_state.lock); - __release_guc_id(guc, q, q->width); - mutex_unlock(&guc->submission_state.lock); -} - struct exec_queue_policy { u32 count; struct guc_update_exec_queue_policy
h2g; @@ -1238,7 +1243,13 @@ static void __guc_exec_queue_fini_async(struct work_struct *w) if (xe_exec_queue_is_lr(q)) cancel_work_sync(&ge->lr_tdr); - release_guc_id(guc, q); + + mutex_lock(&guc->submission_state.lock); + if (q->guc->db.id >= 0) + xe_guc_db_mgr_release_id_locked(&guc->dbm, q->guc->db.id); + __release_guc_id(guc, q, q->width); + mutex_unlock(&guc->submission_state.lock); + xe_sched_entity_fini(&ge->entity); xe_sched_fini(&ge->sched); @@ -1273,6 +1284,8 @@ static void __guc_exec_queue_fini(struct xe_guc *guc, struct xe_exec_queue *q) guc_exec_queue_fini_async(q); } +static void deallocate_doorbell(struct xe_guc *guc, u16 guc_id); + static void __guc_exec_queue_process_msg_cleanup(struct xe_sched_msg *msg) { struct xe_exec_queue *q = msg->private_data; @@ -1281,6 +1294,9 @@ static void __guc_exec_queue_process_msg_cleanup(struct xe_sched_msg *msg) xe_gt_assert(guc_to_gt(guc), !(q->flags & EXEC_QUEUE_FLAG_PERMANENT)); trace_xe_exec_queue_cleanup_entity(q); + if (exec_queue_doorbell_registered(q)) + deallocate_doorbell(guc, q->guc->id); + if (exec_queue_registered(q)) disable_scheduling_deregister(guc, q); else @@ -1399,6 +1415,53 @@ static void guc_exec_queue_process_msg(struct xe_sched_msg *msg) xe_pm_runtime_put(xe); } +static int allocate_doorbell(struct xe_guc *guc, u16 guc_id, int doorbell_id, + u64 gpa) +{ + u32 action[] = { + XE_GUC_ACTION_ALLOCATE_DOORBELL, + guc_id, + doorbell_id, + lower_32_bits(gpa), + upper_32_bits(gpa), + 0, + }; + + return xe_guc_ct_send_block(&guc->ct, action, ARRAY_SIZE(action)); +} + +static void deallocate_doorbell(struct xe_guc *guc, u16 guc_id) +{ + u32 action[] = { + XE_GUC_ACTION_DEALLOCATE_DOORBELL, + guc_id + }; + + xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0); +} + +#define GUC_MMIO_DB_BAR_OFFSET SZ_4M + +static int create_doorbell(struct xe_guc *guc, struct xe_exec_queue *q) +{ + int ret; + + set_exec_queue_doorbell_registered(q); + xe_guc_submit_reset_wait(guc); + + q->guc->db.dpa = GUC_MMIO_DB_BAR_OFFSET + PAGE_SIZE * q->guc->db.id; + register_exec_queue(q); + enable_scheduling(q); + + ret = allocate_doorbell(guc, q->guc->id, q->guc->db.id, q->guc->db.dpa); + if (ret) { + disable_scheduling_deregister(guc, q); + return ret; + } + + return 0; +} + static const struct drm_sched_backend_ops drm_sched_ops = { .run_job = guc_exec_queue_run_job, .free_job = guc_exec_queue_free_job, @@ -1415,7 +1478,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q) struct xe_guc *guc = exec_queue_to_guc(q); struct xe_guc_exec_queue *ge; long timeout; - int err, i; + int err, i, db_id = 0; xe_gt_assert(guc_to_gt(guc), xe_device_uc_enabled(guc_to_xe(guc))); @@ -1458,14 +1521,35 @@ static int guc_exec_queue_init(struct xe_exec_queue *q) if (xe_guc_read_stopped(guc)) xe_sched_stop(sched); + q->guc->db.id = -1; + if (q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION) { + db_id = xe_guc_db_mgr_reserve_id_locked(&guc->dbm); + if (db_id < 0) { + err = db_id; + goto err_id; + } + } + mutex_unlock(&guc->submission_state.lock); + if (q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION) { + q->guc->db.id = db_id; + err = create_doorbell(guc, q); + if (err) + goto err_db; + } + xe_exec_queue_assign_name(q, q->guc->id); trace_xe_exec_queue_create(q); return 0; +err_db: + mutex_lock(&guc->submission_state.lock); + xe_guc_db_mgr_release_id_locked(&guc->dbm, q->guc->db.id); +err_id: + __release_guc_id(guc, q, q->width); err_entity: mutex_unlock(&guc->submission_state.lock); xe_sched_entity_fini(&ge->entity); @@ -1699,7 +1783,10 @@ static void 
guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q) struct xe_sched_job *job = xe_sched_first_pending_job(sched); bool ban = false; - if (job) { + if (exec_queue_doorbell_registered(q)) { + /* TODO: Ban via UMD shim too */ + ban = true; + } else if (job) { if ((xe_sched_job_started(job) && !xe_sched_job_completed(job)) || xe_sched_invalidate_job(job, 2)) {
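The doorbell placement used by create_doorbell() earlier in this patch is a simple linear carve-out in the GuC doorbell MMIO BAR. A worked sketch of the arithmetic (constants from the patch; the helper is hypothetical and only restates the in-line computation):

/*
 * q->guc->db.dpa = GUC_MMIO_DB_BAR_OFFSET + PAGE_SIZE * q->guc->db.id,
 * so with 4K pages doorbell 0 sits at 0x400000, doorbell 1 at 0x401000,
 * doorbell 2 at 0x402000, and so on.
 */
static u64 foo_doorbell_dpa(int db_id)
{
	return SZ_4M + PAGE_SIZE * (u64)db_id; /* GUC_MMIO_DB_BAR_OFFSET == SZ_4M */
}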
From patchwork Mon Nov 18 23:37:33 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879194
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 05/29] drm/xe: Add doorbell ID to snapshot capture
Date: Mon, 18 Nov 2024 15:37:33 -0800
Message-Id: <20241118233757.2374041-6-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

Useful for debugging hangs with doorbells.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ drivers/gpu/drm/xe/xe_guc_submit_types.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index cc7a98c1343e..c226c7b3245d 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -2227,6 +2227,7 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q) return NULL; snapshot->guc.id = q->guc->id; + snapshot->guc.db_id = q->guc->db.id; memcpy(&snapshot->name, &q->name, sizeof(snapshot->name)); snapshot->class = q->class; snapshot->logical_mask = q->logical_mask; @@ -2321,6 +2322,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps drm_printf(p, "\tClass: %d\n", snapshot->class); drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask); drm_printf(p, "\tWidth: %d\n", snapshot->width); + drm_printf(p, "\tDoorbell ID: %d\n", snapshot->guc.db_id); drm_printf(p, "\tRef: %d\n", snapshot->refcount); drm_printf(p, "\tTimeout: %ld (ms)\n", snapshot->sched_timeout); drm_printf(p, "\tTimeslice: %u (us)\n", diff --git a/drivers/gpu/drm/xe/xe_guc_submit_types.h b/drivers/gpu/drm/xe/xe_guc_submit_types.h index dc7456c34583..12fef7848b78 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit_types.h +++ b/drivers/gpu/drm/xe/xe_guc_submit_types.h @@ -113,6 +113,8 @@ struct xe_guc_submit_exec_queue_snapshot { u32 wqi_tail; /** @guc.id: GuC id for this exec_queue */ u16 id; + /** @guc.db_id: Doorbell id */ + u16 db_id; } guc; /**
From patchwork Mon Nov 18 23:37:34 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879203
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 06/29] drm/xe: Break submission ring out into its own BO
Date: Mon, 18 Nov 2024 15:37:34 -0800
Message-Id: <20241118233757.2374041-7-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

Start laying the groundwork for UMD submission. This will allow mmapping the submission ring to user space.
Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_lrc.c | 38 +++++++++++++++++++++++++------ drivers/gpu/drm/xe/xe_lrc_types.h | 9 ++++++-- 2 files changed, 38 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 22e58c6e2a35..758648b6a711 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -632,7 +632,7 @@ static inline u32 __xe_lrc_ring_offset(struct xe_lrc *lrc) u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc) { - return lrc->ring.size; + return 0; } /* Make the magic macros work */ @@ -712,7 +712,21 @@ static inline u32 __maybe_unused __xe_lrc_##elem##_ggtt_addr(struct xe_lrc *lrc) return xe_bo_ggtt_addr(lrc->bo) + __xe_lrc_##elem##_offset(lrc); \ } \ -DECL_MAP_ADDR_HELPERS(ring) +#define DECL_MAP_RING_ADDR_HELPERS(elem) \ +static inline struct iosys_map __xe_lrc_##elem##_map(struct xe_lrc *lrc) \ +{ \ + struct iosys_map map = lrc->submission_ring->vmap; \ +\ + xe_assert(lrc_to_xe(lrc), !iosys_map_is_null(&map)); \ + iosys_map_incr(&map, __xe_lrc_##elem##_offset(lrc)); \ + return map; \ +} \ +static inline u32 __maybe_unused __xe_lrc_##elem##_ggtt_addr(struct xe_lrc *lrc) \ +{ \ + return xe_bo_ggtt_addr(lrc->submission_ring) + __xe_lrc_##elem##_offset(lrc); \ +} \ + +DECL_MAP_RING_ADDR_HELPERS(ring) DECL_MAP_ADDR_HELPERS(pphwsp) DECL_MAP_ADDR_HELPERS(seqno) DECL_MAP_ADDR_HELPERS(regs) @@ -722,6 +736,7 @@ DECL_MAP_ADDR_HELPERS(ctx_timestamp) DECL_MAP_ADDR_HELPERS(parallel) DECL_MAP_ADDR_HELPERS(indirect_ring) +#undef DECL_MAP_RING_ADDR_HELPERS #undef DECL_MAP_ADDR_HELPERS /** @@ -866,10 +881,8 @@ static void xe_lrc_set_ppgtt(struct xe_lrc *lrc, struct xe_vm *vm) static void xe_lrc_finish(struct xe_lrc *lrc) { xe_hw_fence_ctx_finish(&lrc->fence_ctx); - xe_bo_lock(lrc->bo, false); - xe_bo_unpin(lrc->bo); - xe_bo_unlock(lrc->bo); - xe_bo_put(lrc->bo); + xe_bo_unpin_map_no_vm(lrc->bo); + xe_bo_unpin_map_no_vm(lrc->submission_ring); } #define PVC_CTX_ASID (0x2e + 1) @@ -889,7 +902,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, kref_init(&lrc->refcount); lrc->flags = 0; - lrc_size = ring_size + xe_gt_lrc_size(gt, hwe->class); + lrc_size = xe_gt_lrc_size(gt, hwe->class); if (xe_gt_has_indirect_ring_state(gt)) lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE; @@ -905,6 +918,17 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, if (IS_ERR(lrc->bo)) return PTR_ERR(lrc->bo); + lrc->submission_ring = xe_bo_create_pin_map(xe, tile, vm, ring_size, + ttm_bo_type_kernel, + XE_BO_FLAG_VRAM_IF_DGFX(tile) | + XE_BO_FLAG_GGTT | + XE_BO_FLAG_GGTT_INVALIDATE); + if (IS_ERR(lrc->submission_ring)) { + err = PTR_ERR(lrc->submission_ring); + lrc->submission_ring = NULL; + goto err_lrc_finish; + } + lrc->size = lrc_size; lrc->tile = gt_to_tile(hwe->gt); lrc->ring.size = ring_size; diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h index 71ecb453f811..3ad9ac2d644f 100644 --- a/drivers/gpu/drm/xe/xe_lrc_types.h +++ b/drivers/gpu/drm/xe/xe_lrc_types.h @@ -17,11 +17,16 @@ struct xe_bo; */ struct xe_lrc { /** - * @bo: buffer object (memory) for logical ring context, per process HW - * status page, and submission ring. + * @bo: buffer object (memory) for logical ring context and per process + * HW status page.
*/ struct xe_bo *bo; + /** + * @submission_ring: buffer object (memory) for submission_ring + */ + struct xe_bo *submission_ring; + /** @size: size of lrc including any indirect ring state page */ u32 size;
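For reference, since the macro machinery above can be hard to read in diff form, the new DECL_MAP_RING_ADDR_HELPERS(ring) invocation expands to roughly the following; the point is that the ring's CPU map and GGTT address now come from the separate submission_ring BO instead of lrc->bo:

static inline struct iosys_map __xe_lrc_ring_map(struct xe_lrc *lrc)
{
	struct iosys_map map = lrc->submission_ring->vmap;

	xe_assert(lrc_to_xe(lrc), !iosys_map_is_null(&map));
	iosys_map_incr(&map, __xe_lrc_ring_offset(lrc));
	return map;
}

static inline u32 __maybe_unused __xe_lrc_ring_ggtt_addr(struct xe_lrc *lrc)
{
	return xe_bo_ggtt_addr(lrc->submission_ring) + __xe_lrc_ring_offset(lrc);
}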
From patchwork Mon Nov 18 23:37:35 2024
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13879197
From: Matthew Brost
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com
Subject: [RFC PATCH 07/29] drm/xe: Break indirect ring state out into its own BO
Date: Mon, 18 Nov 2024 15:37:35 -0800
Message-Id: <20241118233757.2374041-8-matthew.brost@intel.com>
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com>
References: <20241118233757.2374041-1-matthew.brost@intel.com>

Start laying the groundwork for UMD submission. This will allow mmapping the indirect ring state to user space.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_lrc.c | 79 ++++++++++++++++++++++--------- drivers/gpu/drm/xe/xe_lrc_types.h | 7 ++- 2 files changed, 63 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 758648b6a711..e3c1773191bd 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -74,10 +74,6 @@ size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class) size = 2 * SZ_4K; } - /* Add indirect ring state page */ - if (xe_gt_has_indirect_ring_state(gt)) - size += LRC_INDIRECT_RING_STATE_SIZE; - return size; } @@ -694,8 +690,7 @@ static u32 __xe_lrc_ctx_timestamp_offset(struct xe_lrc *lrc) static inline u32 __xe_lrc_indirect_ring_offset(struct xe_lrc *lrc) { - /* Indirect ring state page is at the very end of LRC */ - return lrc->size - LRC_INDIRECT_RING_STATE_SIZE; + return 0; } #define DECL_MAP_ADDR_HELPERS(elem) \ @@ -726,6 +721,20 @@ static inline u32 __maybe_unused __xe_lrc_##elem##_ggtt_addr(struct xe_lrc *lrc) return xe_bo_ggtt_addr(lrc->submission_ring) + __xe_lrc_##elem##_offset(lrc); \ } \ +#define DECL_MAP_INDIRECT_ADDR_HELPERS(elem) \ +static inline struct iosys_map __xe_lrc_##elem##_map(struct xe_lrc *lrc) \ +{ \ + struct iosys_map map = lrc->indirect_state->vmap; \ +\ + xe_assert(lrc_to_xe(lrc), !iosys_map_is_null(&map)); \ + iosys_map_incr(&map, __xe_lrc_##elem##_offset(lrc)); \ + return map; \ +} \ +static inline u32 __maybe_unused __xe_lrc_##elem##_ggtt_addr(struct xe_lrc *lrc) \ +{ \ + return xe_bo_ggtt_addr(lrc->indirect_state) + __xe_lrc_##elem##_offset(lrc); \ +} \ + DECL_MAP_RING_ADDR_HELPERS(ring) DECL_MAP_ADDR_HELPERS(pphwsp) DECL_MAP_ADDR_HELPERS(seqno) @@ -734,8 +743,9 @@ DECL_MAP_ADDR_HELPERS(start_seqno) DECL_MAP_ADDR_HELPERS(ctx_job_timestamp) DECL_MAP_ADDR_HELPERS(ctx_timestamp) DECL_MAP_ADDR_HELPERS(parallel) -DECL_MAP_ADDR_HELPERS(indirect_ring) +DECL_MAP_INDIRECT_ADDR_HELPERS(indirect_ring) +#undef DECL_MAP_INDIRECT_ADDR_HELPERS #undef DECL_MAP_RING_ADDR_HELPERS #undef DECL_MAP_ADDR_HELPERS @@ -845,25 +855,27 @@ void xe_lrc_write_ctx_reg(struct xe_lrc *lrc, int reg_nr, u32 val) xe_map_write32(xe, &map, val); } -static void *empty_lrc_data(struct xe_hw_engine *hwe) +static void *empty_lrc_data(struct xe_hw_engine *hwe, bool has_default) { struct xe_gt *gt = hwe->gt; void *data; u32 *regs; - data = kzalloc(xe_gt_lrc_size(gt, hwe->class), GFP_KERNEL); + data = kzalloc(xe_gt_lrc_size(gt, hwe->class) + + LRC_INDIRECT_RING_STATE_SIZE, GFP_KERNEL); if (!data) return NULL; /* 1st page: Per-Process of HW status Page */ + if (!has_default) { + regs = data + LRC_PPHWSP_SIZE; + set_offsets(regs, reg_offsets(gt_to_xe(gt), hwe->class), hwe); + set_context_control(regs, hwe); + set_memory_based_intr(regs, hwe); + reset_stop_ring(regs, hwe); + } if (xe_gt_has_indirect_ring_state(gt)) { - regs = data + xe_gt_lrc_size(gt, hwe->class) - - LRC_INDIRECT_RING_STATE_SIZE; + regs = data + xe_gt_lrc_size(gt, hwe->class); set_offsets(regs, xe2_indirect_ring_state_offsets, hwe); } @@ -883,6 +895,7 @@ static void xe_lrc_finish(struct xe_lrc *lrc)
 	xe_hw_fence_ctx_finish(&lrc->fence_ctx);
 	xe_bo_unpin_map_no_vm(lrc->bo);
 	xe_bo_unpin_map_no_vm(lrc->submission_ring);
+	xe_bo_unpin_map_no_vm(lrc->indirect_state);
 }
 
 #define PVC_CTX_ASID		(0x2e + 1)
@@ -903,8 +916,6 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	kref_init(&lrc->refcount);
 	lrc->flags = 0;
 	lrc_size = xe_gt_lrc_size(gt, hwe->class);
-	if (xe_gt_has_indirect_ring_state(gt))
-		lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
 
 	/*
 	 * FIXME: Perma-pinning LRC as we don't yet support moving GGTT address
@@ -929,6 +940,22 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 		goto err_lrc_finish;
 	}
 
+	if (xe_gt_has_indirect_ring_state(gt)) {
+		lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
+
+		lrc->indirect_state = xe_bo_create_pin_map(xe, tile, vm,
+						LRC_INDIRECT_RING_STATE_SIZE,
+						ttm_bo_type_kernel,
+						XE_BO_FLAG_VRAM_IF_DGFX(tile) |
+						XE_BO_FLAG_GGTT |
+						XE_BO_FLAG_GGTT_INVALIDATE);
+		if (IS_ERR(lrc->indirect_state)) {
+			err = PTR_ERR(lrc->indirect_state);
+			lrc->indirect_state = NULL;
+			goto err_lrc_finish;
+		}
+	}
+
 	lrc->size = lrc_size;
 	lrc->tile = gt_to_tile(hwe->gt);
 	lrc->ring.size = ring_size;
@@ -938,8 +965,8 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt, hwe->fence_irq,
 			     hwe->name);
 
-	if (!gt->default_lrc[hwe->class]) {
-		init_data = empty_lrc_data(hwe);
+	if (!gt->default_lrc[hwe->class] || xe_gt_has_indirect_ring_state(gt)) {
+		init_data = empty_lrc_data(hwe, !!gt->default_lrc[hwe->class]);
 		if (!init_data) {
 			err = -ENOMEM;
 			goto err_lrc_finish;
@@ -951,7 +978,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	 * values
 	 */
 	map = __xe_lrc_pphwsp_map(lrc);
-	if (!init_data) {
+	if (gt->default_lrc[hwe->class]) {
 		xe_map_memset(xe, &map, 0, 0, LRC_PPHWSP_SIZE);	/* PPHWSP */
 		xe_map_memcpy_to(xe, &map, LRC_PPHWSP_SIZE,
 				 gt->default_lrc[hwe->class] + LRC_PPHWSP_SIZE,
@@ -959,9 +986,17 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	} else {
 		xe_map_memcpy_to(xe, &map, 0, init_data,
 				 xe_gt_lrc_size(gt, hwe->class));
-		kfree(init_data);
 	}
 
+	if (xe_gt_has_indirect_ring_state(gt)) {
+		map = __xe_lrc_indirect_ring_map(lrc);
+		xe_map_memcpy_to(xe, &map, 0, init_data +
+				 xe_gt_lrc_size(gt, hwe->class),
+				 LRC_INDIRECT_RING_STATE_SIZE);
+	}
+
+	kfree(init_data);
+
 	if (vm) {
 		xe_lrc_set_ppgtt(lrc, vm);
 
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
index 3ad9ac2d644f..3be708c82313 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -27,7 +27,12 @@ struct xe_lrc {
 	 */
 	struct xe_bo *submission_ring;
 
-	/** @size: size of lrc including any indirect ring state page */
+	/**
+	 * @indirect_state: buffer object (memory) for indirect state
+	 */
+	struct xe_bo *indirect_state;
+
+	/** @size: size of lrc */
 	u32 size;
 
 	/** @tile: tile which this LRC belongs to */

From patchwork Mon Nov 18 23:37:36 2024
From: Matthew Brost
Subject: [RFC PATCH 08/29] drm/xe: Clear GGTT in xe_bo_restore_kernel
Date: Mon, 18 Nov 2024 15:37:36 -0800
Message-Id: <20241118233757.2374041-9-matthew.brost@intel.com>
Part of what xe_bo_restore_kernel does is restore BOs' GGTT mappings, which
may have been lost during a power state change. What is missing is restoring
the GGTT entries that have no BO mapping to a known state (e.g., scratch
pages). Update xe_bo_restore_kernel to clear the unused (hole) ranges of the
GGTT before restoring the BOs' GGTT mappings.
v2:
 - Include missing local change of tile and id variable (CI)
v3:
 - Fixed kernel doc (CI)
v4:
 - Only clear holes (CI)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Lucas De Marchi
Cc: Matthew Auld
Signed-off-by: Matthew Brost
Cc: # v6.8+
---
 drivers/gpu/drm/xe/xe_bo_evict.c |  8 +++++++-
 drivers/gpu/drm/xe/xe_ggtt.c     | 19 ++++++++++++++++---
 drivers/gpu/drm/xe/xe_ggtt.h     |  2 ++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
index 8fb2be061003..d7bb3dbb41d6 100644
--- a/drivers/gpu/drm/xe/xe_bo_evict.c
+++ b/drivers/gpu/drm/xe/xe_bo_evict.c
@@ -123,7 +123,8 @@ int xe_bo_evict_all(struct xe_device *xe)
  * @xe: xe device
  *
  * Move kernel BOs from temporary (typically system) memory to VRAM via CPU. All
- * moves done via TTM calls.
+ * moves done via TTM calls. All GGTT mappings are restored too, first by
+ * clearing the GGTT to a known state and then restoring each BO's mapping.
  *
  * This function should be called early, before trying to init the GT, on device
  * resume.
@@ -131,8 +132,13 @@ int xe_bo_evict_all(struct xe_device *xe)
 int xe_bo_restore_kernel(struct xe_device *xe)
 {
 	struct xe_bo *bo;
+	struct xe_tile *tile;
+	u8 id;
 	int ret;
 
+	for_each_tile(tile, xe, id)
+		xe_ggtt_clear(tile->mem.ggtt);
+
 	spin_lock(&xe->pinned.lock);
 	for (;;) {
 		bo = list_first_entry_or_null(&xe->pinned.evicted,
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 558fac8bb6fb..2fc498b89878 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -140,7 +140,7 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
 	ggtt_update_access_counter(ggtt);
 }
 
-static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
+static void __xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
 {
 	u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
 	u64 end = start + size - 1;
@@ -160,6 +160,19 @@ static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
 	}
 }
 
+static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt);
+
+/**
+ * xe_ggtt_clear() - GGTT clear
+ * @ggtt: the &xe_ggtt to be cleared
+ *
+ * Clear the GGTT to a known state.
+ */
+void xe_ggtt_clear(struct xe_ggtt *ggtt)
+{
+	xe_ggtt_initial_clear(ggtt);
+}
+
 static void ggtt_fini_early(struct drm_device *drm, void *arg)
 {
 	struct xe_ggtt *ggtt = arg;
@@ -277,7 +290,7 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
 	/* Display may have allocated inside ggtt, so be careful with clearing here */
 	mutex_lock(&ggtt->lock);
 	drm_mm_for_each_hole(hole, &ggtt->mm, start, end)
-		xe_ggtt_clear(ggtt, start, end - start);
+		__xe_ggtt_clear(ggtt, start, end - start);
 
 	xe_ggtt_invalidate(ggtt);
 	mutex_unlock(&ggtt->lock);
@@ -294,7 +307,7 @@ static void ggtt_node_remove(struct xe_ggtt_node *node)
 
 	mutex_lock(&ggtt->lock);
 	if (bound)
-		xe_ggtt_clear(ggtt, node->base.start, node->base.size);
+		__xe_ggtt_clear(ggtt, node->base.start, node->base.size);
 	drm_mm_remove_node(&node->base);
 	node->base.size = 0;
 	mutex_unlock(&ggtt->lock);
diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
index 27e7d67de004..b7ae440cdebf 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.h
+++ b/drivers/gpu/drm/xe/xe_ggtt.h
@@ -13,6 +13,8 @@ struct drm_printer;
 int xe_ggtt_init_early(struct xe_ggtt *ggtt);
 int xe_ggtt_init(struct xe_ggtt *ggtt);
 
+void xe_ggtt_clear(struct xe_ggtt *ggtt);
+
 struct xe_ggtt_node *xe_ggtt_node_init(struct xe_ggtt *ggtt);
 void xe_ggtt_node_fini(struct xe_ggtt_node *node);
 int xe_ggtt_node_insert_balloon(struct xe_ggtt_node *node,

From patchwork Mon Nov 18 23:37:37 2024
From: Matthew Brost
Subject: [RFC PATCH 09/29] FIXME: drm/xe: Add pad to ring and indirect state
Date: Mon, 18 Nov 2024 15:37:37 -0800
Message-Id: <20241118233757.2374041-10-matthew.brost@intel.com>
It is unclear why, but without this padding intermittent hangs occur on GuC
context switching.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_lrc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index e3c1773191bd..9633e5e700f6 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -929,7 +929,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	if (IS_ERR(lrc->bo))
 		return PTR_ERR(lrc->bo);
 
-	lrc->submission_ring = xe_bo_create_pin_map(xe, tile, vm, ring_size,
+	lrc->submission_ring = xe_bo_create_pin_map(xe, tile, vm, SZ_32K,
 						    ttm_bo_type_kernel,
 						    XE_BO_FLAG_VRAM_IF_DGFX(tile) |
 						    XE_BO_FLAG_GGTT |
@@ -943,8 +943,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	if (xe_gt_has_indirect_ring_state(gt)) {
 		lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
 
-		lrc->indirect_state = xe_bo_create_pin_map(xe, tile, vm,
-						LRC_INDIRECT_RING_STATE_SIZE,
+		lrc->indirect_state = xe_bo_create_pin_map(xe, tile, vm, SZ_8K,
 						ttm_bo_type_kernel,
 						XE_BO_FLAG_VRAM_IF_DGFX(tile) |
 						XE_BO_FLAG_GGTT |
From patchwork Mon Nov 18 23:37:38 2024
From: Matthew Brost
Subject: [RFC PATCH 10/29] drm/xe: Enable indirect ring on media GT
Date: Mon, 18 Nov 2024 15:37:38 -0800
Message-Id: <20241118233757.2374041-11-matthew.brost@intel.com>

The media GT supports indirect ring state, which is required for UMD
submission, so enable it by default.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 9b81e7d00a86..a27450e63cf9 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -209,6 +209,7 @@ static const struct xe_media_desc media_xelpmp = {
 
 static const struct xe_media_desc media_xe2 = {
 	.name = "Xe2_LPM / Xe2_HPM / Xe3_LPM",
+	.has_indirect_ring_state = 1,
 	.hw_engine_mask =
 		GENMASK(XE_HW_ENGINE_VCS7, XE_HW_ENGINE_VCS0) |
 		GENMASK(XE_HW_ENGINE_VECS3, XE_HW_ENGINE_VECS0) |
From patchwork Mon Nov 18 23:37:39 2024
From: Matthew Brost
Subject: [RFC PATCH 11/29] drm/xe: Don't add pinned mappings to VM bulk move
Date: Mon, 18 Nov 2024 15:37:39 -0800
Message-Id: <20241118233757.2374041-12-matthew.brost@intel.com>

We don't want pinned kernel resources (the ring, indirect state) in the VM's
bulk move, as these are unevictable.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_bo.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 549866da5cd1..96dbc88b1f55 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1470,6 +1470,9 @@ __xe_bo_create_locked(struct xe_device *xe,
 {
 	struct xe_bo *bo = NULL;
 	int err;
+	bool want_bulk = vm && !xe_vm_in_fault_mode(vm) &&
+		flags & XE_BO_FLAG_USER &&
+		!(flags & (XE_BO_FLAG_PINNED | XE_BO_FLAG_GGTT));
 
 	if (vm)
 		xe_vm_assert_held(vm);
@@ -1488,9 +1491,7 @@ __xe_bo_create_locked(struct xe_device *xe,
 	}
 
 	bo = ___xe_bo_create_locked(xe, bo, tile, vm ? xe_vm_resv(vm) : NULL,
-				    vm && !xe_vm_in_fault_mode(vm) &&
-				    flags & XE_BO_FLAG_USER ?
-				    &vm->lru_bulk_move : NULL, size,
+				    want_bulk ? &vm->lru_bulk_move : NULL, size,
 				    cpu_caching, type, flags);
 	if (IS_ERR(bo))
 		return bo;
@@ -1781,9 +1782,6 @@ int xe_bo_pin(struct xe_bo *bo)
 	struct xe_device *xe = xe_bo_device(bo);
 	int err;
 
-	/* We currently don't expect user BO to be pinned */
-	xe_assert(xe, !xe_bo_is_user(bo));
-
 	/* Pinned object must be in GGTT or have pinned flag */
 	xe_assert(xe, bo->flags & (XE_BO_FLAG_PINNED | XE_BO_FLAG_GGTT));

From patchwork Mon Nov 18 23:37:40 2024
From: Matthew Brost
Subject: [RFC PATCH 12/29] drm/xe: Add exec queue post init extension processing
Date: Mon, 18 Nov 2024 15:37:40 -0800
Message-Id: <20241118233757.2374041-13-matthew.brost@intel.com>
Add exec queue post-init extension processing, which is needed for more
complex extensions that return data to the user.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_exec_queue.c | 48 ++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index aab9e561153d..f402988b4fc0 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -33,6 +33,8 @@ enum xe_exec_queue_sched_prop {
 
 static int exec_queue_user_extensions(struct xe_device *xe, struct xe_exec_queue *q,
 				      u64 extensions, int ext_number);
+static int exec_queue_user_extensions_post_init(struct xe_device *xe, struct xe_exec_queue *q,
+						u64 extensions, int ext_number);
 
 static void __xe_exec_queue_free(struct xe_exec_queue *q)
 {
@@ -446,6 +448,10 @@ static const xe_exec_queue_user_extension_fn exec_queue_user_extension_funcs[] = {
 	[DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = exec_queue_user_ext_set_property,
 };
 
+static const xe_exec_queue_user_extension_fn exec_queue_user_extension_post_init_funcs[] = {
+	[DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = NULL,
+};
+
 #define MAX_USER_EXTENSIONS	16
 static int exec_queue_user_extensions(struct xe_device *xe, struct xe_exec_queue *q,
 				      u64 extensions, int ext_number)
@@ -480,6 +486,42 @@ static int exec_queue_user_extensions(struct xe_device *xe, struct xe_exec_queue
 	return 0;
 }
 
+static int exec_queue_user_extensions_post_init(struct xe_device *xe, struct xe_exec_queue *q,
+						u64 extensions, int ext_number)
+{
+	u64 __user *address = u64_to_user_ptr(extensions);
+	struct drm_xe_user_extension ext;
+	int err;
+	u32 idx;
+
+	if (XE_IOCTL_DBG(xe, ext_number >= MAX_USER_EXTENSIONS))
+		return -E2BIG;
+
+	err = __copy_from_user(&ext, address, sizeof(ext));
+	if (XE_IOCTL_DBG(xe, err))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, ext.pad) ||
+	    XE_IOCTL_DBG(xe, ext.name >=
+			 ARRAY_SIZE(exec_queue_user_extension_post_init_funcs)))
+		return -EINVAL;
+
+	idx = array_index_nospec(ext.name,
+				 ARRAY_SIZE(exec_queue_user_extension_post_init_funcs));
+	if (exec_queue_user_extension_post_init_funcs[idx]) {
+		err = exec_queue_user_extension_post_init_funcs[idx](xe, q, extensions);
+		if (XE_IOCTL_DBG(xe, err))
+			return err;
+	}
+
+	if (ext.next_extension)
+		return exec_queue_user_extensions_post_init(xe, q,
+							    ext.next_extension,
+							    ++ext_number);
+
+	return 0;
+}
+
 static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 				      struct drm_xe_engine_class_instance *eci,
 				      u16 width, u16 num_placements)
@@ -647,6 +689,12 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 
 	q->xef = xe_file_get(xef);
 
+	if (args->extensions) {
+		err = exec_queue_user_extensions_post_init(xe, q, args->extensions, 0);
+		if (err)
+			goto kill_exec_queue;
+	}
+
 	/* user id alloc must always be last in ioctl to prevent UAF */
 	err = xa_alloc(&xef->exec_queue.xa, &id, q, xa_limit_32b, GFP_KERNEL);
 	if (err)
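For context, the userspace half of this walk is simply a chain of struct
drm_xe_user_extension headers, each embedded at the start of an extension
struct and linked through next_extension. A minimal sketch, not part of the
patch (the set-property payload is just an arbitrary example of a chained
extension):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <drm/xe_drm.h>

	/* Sketch: create an exec queue with one chained extension. */
	static int create_queue_with_ext(int fd, uint32_t vm_id,
					 struct drm_xe_engine_class_instance *eci)
	{
		struct drm_xe_ext_set_property prop = {
			.base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
			.base.next_extension = 0,	/* 0 terminates the chain */
			.property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY,
			.value = 0,
		};
		struct drm_xe_exec_queue_create create = {
			.extensions = (uintptr_t)&prop,	/* head of the chain */
			.width = 1,
			.num_placements = 1,
			.vm_id = vm_id,
			.instances = (uintptr_t)eci,
		};

		if (ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, &create))
			return -1;

		return (int)create.exec_queue_id;
	}

The post-init pass added here walks the same chain a second time after the
queue is live, so extensions that need to return data (such as the usermap
extension defined later in this series) can fill in their output fields.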
From patchwork Mon Nov 18 23:37:41 2024
From: Matthew Brost
Subject: [RFC PATCH 13/29] drm/xe/mmap: Add mmap support for PCI memory barrier
Date: Mon, 18 Nov 2024 15:37:41 -0800
Message-Id: <20241118233757.2374041-14-matthew.brost@intel.com>

From: Tejas Upadhyay

To avoid requiring userspace to use MI_MEM_FENCE, add a mechanism for
userspace to generate a PCI memory barrier with low overhead (both an IOCTL
call and a write to VRAM would add overhead).
This is implemented by memory-mapping a page as uncached that is backed by
MMIO on the dGPU, thus allowing userspace to write to the page without
invoking an IOCTL. The MMIO region is selected so that it is not accessible
from the PCI bus; the MMIO writes themselves are ignored, but the PCI memory
barrier still takes effect, as the MMIO filtering happens after the barrier.

When the special, predefined offset is detected in mmap(), the 4K page
containing the last page of the doorbell MMIO range is mapped to userspace
for this purpose. To let the user query the special offset, a flag is added
to the mmap_offset ioctl, which is used as follows:

	struct drm_xe_gem_mmap_offset mmo = {
		.handle = 0, /* this must be 0 */
		.flags = DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER,
	};

	igt_ioctl(fd, DRM_IOCTL_XE_GEM_MMAP_OFFSET, &mmo);
	map = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fd, mmo.offset);

Note: Test coverage is added by the IGT series at
https://patchwork.freedesktop.org/series/140368/. The UMD PR implementing
this will be attached to this patch once it is ready.

V6 (MAuld):
 - Move physical mmap to fault handler
 - Modify kernel-doc and attach UMD PR when ready
V5 (MAuld):
 - Return invalid early in case of non-4K PAGE_SIZE
 - Format kernel-doc and add note for 4K PAGE_SIZE HW limit
V4 (MAuld):
 - Add kernel-doc for uapi change
 - Restrict page size to 4K
V3 (MAuld):
 - Remove offset definition from UAPI to be able to change it later
 - Edit commit message for special flag addition
V2 (MAuld):
 - Add fault handler with dummy page to handle device unplug
 - Add build check for special offset to be below normal start page
 - Test d3hot; mapping seems to be valid in d3hot as well
 - Add more info to commit message

Cc: Matthew Auld
Cc: Michal Mrozek
Signed-off-by: Tejas Upadhyay
Reviewed-by: Matthew Auld
---
 drivers/gpu/drm/xe/xe_bo.c     |  16 ++++-
 drivers/gpu/drm/xe/xe_bo.h     |   2 +
 drivers/gpu/drm/xe/xe_device.c | 103 ++++++++++++++++++++++++++++++++-
 include/uapi/drm/xe_drm.h      |  29 +++++++++-
 4 files changed, 147 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 96dbc88b1f55..f948262e607f 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -2138,9 +2138,23 @@ int xe_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
 	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, args->flags))
+	if (XE_IOCTL_DBG(xe, args->flags &
+			 ~DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER))
 		return -EINVAL;
 
+	if (args->flags & DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER) {
+		if (XE_IOCTL_DBG(xe, args->handle))
+			return -EINVAL;
+
+		if (XE_IOCTL_DBG(xe, PAGE_SIZE > SZ_4K))
+			return -EINVAL;
+
+		BUILD_BUG_ON(((XE_PCI_BARRIER_MMAP_OFFSET >> XE_PTE_SHIFT) +
+			      SZ_4K) >= DRM_FILE_PAGE_OFFSET_START);
+		args->offset = XE_PCI_BARRIER_MMAP_OFFSET;
+		return 0;
+	}
+
 	gem_obj = drm_gem_object_lookup(file, args->handle);
 	if (XE_IOCTL_DBG(xe, !gem_obj))
 		return -ENOENT;
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 7fa44a0138b0..e7724965d3f1 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -63,6 +63,8 @@
 
 #define XE_BO_PROPS_INVALID	(-1)
 
+#define XE_PCI_BARRIER_MMAP_OFFSET	(0x50 << XE_PTE_SHIFT)
+
 struct sg_table;
 
 struct xe_bo *xe_bo_alloc(void);
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 930bb2750e2e..f6069db795e7 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -231,12 +231,113 @@ static long xe_drm_compat_ioctl(struct file *file, unsigned int cmd, unsigned lo
 #define xe_drm_compat_ioctl NULL
 #endif
 
+static void barrier_open(struct vm_area_struct *vma)
+{
+	drm_dev_get(vma->vm_private_data);
+}
+
+static void barrier_close(struct vm_area_struct *vma)
+{
+	drm_dev_put(vma->vm_private_data);
+}
+
+static void barrier_release_dummy_page(struct drm_device *dev, void *res)
+{
+	struct page *dummy_page = (struct page *)res;
+
+	__free_page(dummy_page);
+}
+
+static vm_fault_t barrier_fault(struct vm_fault *vmf)
+{
+	struct drm_device *dev = vmf->vma->vm_private_data;
+	struct vm_area_struct *vma = vmf->vma;
+	vm_fault_t ret = VM_FAULT_NOPAGE;
+	pgprot_t prot;
+	int idx;
+
+	prot = vm_get_page_prot(vma->vm_flags);
+
+	if (drm_dev_enter(dev, &idx)) {
+		unsigned long pfn;
+
+#define LAST_DB_PAGE_OFFSET 0x7ff001
+		pfn = PHYS_PFN(pci_resource_start(to_pci_dev(dev->dev), 0) +
+			       LAST_DB_PAGE_OFFSET);
+		ret = vmf_insert_pfn_prot(vma, vma->vm_start, pfn,
+					  pgprot_noncached(prot));
+		drm_dev_exit(idx);
+	} else {
+		struct page *page;
+
+		/* Allocate new dummy page to map all the VA range in this VMA to it */
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!page)
+			return VM_FAULT_OOM;
+
+		/* Set the page to be freed using drmm release action */
+		if (drmm_add_action_or_reset(dev, barrier_release_dummy_page, page))
+			return VM_FAULT_OOM;
+
+		ret = vmf_insert_pfn_prot(vma, vma->vm_start, page_to_pfn(page),
+					  prot);
+	}
+
+	return ret;
+}
+
+static const struct vm_operations_struct vm_ops_barrier = {
+	.open = barrier_open,
+	.close = barrier_close,
+	.fault = barrier_fault,
+};
+
+static int xe_pci_barrier_mmap(struct file *filp,
+			       struct vm_area_struct *vma)
+{
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+
+	if (vma->vm_end - vma->vm_start > SZ_4K)
+		return -EINVAL;
+
+	if (is_cow_mapping(vma->vm_flags))
+		return -EINVAL;
+
+	if (vma->vm_flags & (VM_READ | VM_EXEC))
+		return -EINVAL;
+
+	vm_flags_clear(vma, VM_MAYREAD | VM_MAYEXEC);
+	vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO);
+	vma->vm_ops = &vm_ops_barrier;
+	vma->vm_private_data = dev;
+	drm_dev_get(vma->vm_private_data);
+
+	return 0;
+}
+
+static int xe_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+
+	if (drm_dev_is_unplugged(dev))
+		return -ENODEV;
+
+	switch (vma->vm_pgoff) {
+	case XE_PCI_BARRIER_MMAP_OFFSET >> XE_PTE_SHIFT:
+		return xe_pci_barrier_mmap(filp, vma);
+	}
+
+	return drm_gem_mmap(filp, vma);
+}
+
 static const struct file_operations xe_driver_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_open,
 	.release = drm_release_noglobal,
 	.unlocked_ioctl = xe_drm_ioctl,
-	.mmap = drm_gem_mmap,
+	.mmap = xe_mmap,
 	.poll = drm_poll,
 	.read = drm_read,
 	.compat_ioctl = xe_drm_compat_ioctl,
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 4a8a4a63e99c..6490b16b1217 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -811,6 +811,32 @@ struct drm_xe_gem_create {
 
 /**
  * struct drm_xe_gem_mmap_offset - Input of &DRM_IOCTL_XE_GEM_MMAP_OFFSET
+ *
+ * The @flags can be:
+ *  - %DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER - For user to query the special
+ *    offset for use in the mmap ioctl. Writing to the returned mmap address
+ *    will generate a PCI memory barrier with low overhead (avoiding an IOCTL
+ *    call as well as a write to VRAM, which would also add overhead), acting
+ *    like an MI_MEM_FENCE instruction.
+ *
+ * Note: The mmap size can be at most 4K, due to HW limitations. As a result
+ * this interface is only supported on CPU architectures that support 4K page
+ * size. The mmap_offset ioctl will detect this and gracefully return an
+ * error, where userspace is expected to have a different fallback method for
+ * triggering a barrier.
+ *
+ * Roughly the usage would be as follows:
+ *
+ * .. code-block:: C
+ *
+ *	struct drm_xe_gem_mmap_offset mmo = {
+ *		.handle = 0, // must be set to 0
+ *		.flags = DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER,
+ *	};
+ *
+ *	err = ioctl(fd, DRM_IOCTL_XE_GEM_MMAP_OFFSET, &mmo);
+ *	map = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fd, mmo.offset);
+ *	map[i] = 0xdeadbeef; // issue barrier
  */
 struct drm_xe_gem_mmap_offset {
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -819,7 +845,8 @@ struct drm_xe_gem_mmap_offset {
 	/** @handle: Handle for the object being mapped. */
 	__u32 handle;
 
-	/** @flags: Must be zero */
+#define DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER	(1 << 0)
+	/** @flags: Flags */
 	__u32 flags;
 
 	/** @offset: The fake offset to use for subsequent mmap call */
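Putting the flag, the fake offset, and the write-only mapping rule together,
a self-contained userspace sequence might look like this; a sketch only, with
error handling trimmed:

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <drm/xe_drm.h>

	/* Map the PCI barrier page; returns NULL on failure (sketch only). */
	static volatile uint32_t *map_pci_barrier(int fd)
	{
		struct drm_xe_gem_mmap_offset mmo = {
			.handle = 0,	/* must be 0 for the barrier offset */
			.flags = DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER,
		};
		void *map;

		/* Fails gracefully on architectures with PAGE_SIZE > 4K */
		if (ioctl(fd, DRM_IOCTL_XE_GEM_MMAP_OFFSET, &mmo))
			return NULL;

		/* PROT_WRITE only: the mmap path rejects readable/executable VMAs */
		map = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, mmo.offset);
		return map == MAP_FAILED ? NULL : (volatile uint32_t *)map;
	}

	/* Any store to the page issues the barrier; the value is ignored. */
	static inline void pci_barrier(volatile uint32_t *barrier)
	{
		barrier[0] = 0;
	}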
From patchwork Mon Nov 18 23:37:42 2024
From: Matthew Brost
Subject: [RFC PATCH 14/29] drm/xe: Add support for mmapping doorbells to user space
Date: Mon, 18 Nov 2024 15:37:42 -0800
Message-Id: <20241118233757.2374041-15-matthew.brost@intel.com>

Doorbells need to be mapped to user space for UMD direct submission; add
support for this.

FIXME: Wildly insecure as anyone can pick the MMIO doorbell offset; this will
need to be randomized and a unique offset tied to the FD. Can be done in
later revs before upstreaming.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_bo.h     |  3 ++
 drivers/gpu/drm/xe/xe_device.c | 73 ++++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index e7724965d3f1..2772d42ac057 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -64,6 +64,9 @@
 #define XE_BO_PROPS_INVALID	(-1)
 
 #define XE_PCI_BARRIER_MMAP_OFFSET	(0x50 << XE_PTE_SHIFT)
+#define XE_MMIO_DOORBELL_MMAP_OFFSET	(0x100 << XE_PTE_SHIFT)
+#define XE_MMIO_DOORBELL_PFN_START	(SZ_4M >> XE_PTE_SHIFT)
+#define XE_MMIO_DOORBELL_PFN_COUNT	(256)
 
 struct sg_table;
 
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index f6069db795e7..bbdff4308b2e 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -316,6 +316,75 @@ static int xe_pci_barrier_mmap(struct file *filp,
 	return 0;
 }
 
+static vm_fault_t doorbell_fault(struct vm_fault *vmf)
+{
+	struct drm_device *dev = vmf->vma->vm_private_data;
+	struct vm_area_struct *vma = vmf->vma;
+	vm_fault_t ret = VM_FAULT_NOPAGE;
+	pgprot_t prot;
+	int idx;
+
+	prot = vm_get_page_prot(vma->vm_flags);
+
+	if (drm_dev_enter(dev, &idx)) {
+		unsigned long pfn;
+
+		pfn = PHYS_PFN(pci_resource_start(to_pci_dev(dev->dev), 0) +
+			       (XE_MMIO_DOORBELL_PFN_START << XE_PTE_SHIFT));
+		pfn += vma->vm_pgoff & (XE_MMIO_DOORBELL_PFN_COUNT - 1);
+
+		ret = vmf_insert_pfn_prot(vma, vma->vm_start, pfn,
+					  pgprot_noncached(prot));
+		drm_dev_exit(idx);
+	} else {
+		struct page *page;
+
+		/* Allocate new dummy page to map all the VA range in this VMA to it */
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!page)
+			return VM_FAULT_OOM;
+
+		/* Set the page to be freed using drmm release action */
+		if (drmm_add_action_or_reset(dev, barrier_release_dummy_page, page))
+			return VM_FAULT_OOM;
+
+		ret = vmf_insert_pfn_prot(vma, vma->vm_start, page_to_pfn(page),
+					  prot);
+	}
+
+	return ret;
+}
+
+static const struct vm_operations_struct vm_ops_doorbell = {
+	.open = barrier_open,
+	.close = barrier_close,
+	.fault = doorbell_fault,
+};
+
+static int xe_mmio_doorbell_mmap(struct file *filp,
+				 struct vm_area_struct *vma)
+{
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+
+	if (vma->vm_end - vma->vm_start > SZ_4K)
+		return -EINVAL;
+
+	if (is_cow_mapping(vma->vm_flags))
+		return -EINVAL;
+
+	if (vma->vm_flags & VM_EXEC)
+		return -EINVAL;
+
+	vm_flags_clear(vma, VM_MAYEXEC);
+	vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO);
+	vma->vm_ops = &vm_ops_doorbell;
+	vma->vm_private_data = dev;
+	drm_dev_get(vma->vm_private_data);
+
+	return 0;
+}
+
 static int xe_mmap(struct file *filp, struct vm_area_struct *vma)
 {
 	struct drm_file *priv = filp->private_data;
@@ -327,6 +396,10 @@ static int xe_mmap(struct file *filp, struct vm_area_struct *vma)
 	switch (vma->vm_pgoff) {
 	case XE_PCI_BARRIER_MMAP_OFFSET >> XE_PTE_SHIFT:
 		return xe_pci_barrier_mmap(filp, vma);
+	case (XE_MMIO_DOORBELL_MMAP_OFFSET >> XE_PTE_SHIFT) ...
+	     ((XE_MMIO_DOORBELL_MMAP_OFFSET >> XE_PTE_SHIFT) +
+	      XE_MMIO_DOORBELL_PFN_COUNT - 1):
+		return xe_mmio_doorbell_mmap(filp, vma);
 	}
 
 	return drm_gem_mmap(filp, vma);
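With the fixed offset scheme above, the doorbell for a given ID would be
reached by mmapping one page at XE_MMIO_DOORBELL_MMAP_OFFSET plus the ID in
pages. A sketch, assuming 4K pages and hard-coding the kernel-internal
constants since they are not exported through the uAPI header (per the FIXME,
this layout is a placeholder and will change):

	#include <stdint.h>
	#include <sys/types.h>
	#include <sys/mman.h>

	/* Kernel-internal values from xe_bo.h above; placeholders, not uAPI */
	#define DOORBELL_MMAP_OFFSET	(0x100ull << 12)	/* XE_MMIO_DOORBELL_MMAP_OFFSET */
	#define DOORBELL_COUNT		256			/* XE_MMIO_DOORBELL_PFN_COUNT */

	/* Map one 4K doorbell page for db_id; sketch only. */
	static void *map_doorbell(int fd, uint32_t db_id)
	{
		off_t offset = DOORBELL_MMAP_OFFSET + ((off_t)db_id << 12);
		void *map;

		if (db_id >= DOORBELL_COUNT)
			return NULL;

		/* Read/write is fine; only executable mappings are rejected */
		map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
			   fd, offset);
		return map == MAP_FAILED ? NULL : map;
	}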
From patchwork Mon Nov 18 23:37:43 2024
From: Matthew Brost
Subject: [RFC PATCH 15/29] drm/xe: Add support for mmapping submission ring and indirect ring state to user space
Date: Mon, 18 Nov 2024 15:37:43 -0800
Message-Id: <20241118233757.2374041-16-matthew.brost@intel.com>

The ring and indirect ring state need to be mapped to user space for UMD
direct submission; add support for this.

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_bo.c         |  3 ---
 drivers/gpu/drm/xe/xe_exec_queue.c |  2 +-
 drivers/gpu/drm/xe/xe_execlist.c   |  2 +-
 drivers/gpu/drm/xe/xe_lrc.c        | 29 ++++++++++++++++++++++-------
 drivers/gpu/drm/xe/xe_lrc.h        |  4 ++--
 5 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index f948262e607f..a87871f1cb95 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1311,9 +1311,6 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 	size_t aligned_size;
 	int err;
 
-	/* Only kernel objects should set GT */
-	xe_assert(xe, !tile || type == ttm_bo_type_kernel);
-
 	if (XE_WARN_ON(!size)) {
 		xe_bo_free(bo);
 		return ERR_PTR(-EINVAL);
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index f402988b4fc0..aef5b130e7f8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -119,7 +119,7 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q)
 	}
 
 	for (i = 0; i < q->width; ++i) {
-		q->lrc[i] = xe_lrc_create(q->hwe, q->vm, SZ_16K);
+		q->lrc[i] = xe_lrc_create(q, q->hwe, q->vm, SZ_16K);
 		if (IS_ERR(q->lrc[i])) {
 			err = PTR_ERR(q->lrc[i]);
 			goto err_unlock;
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index a8c416a48812..93f76280d453 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -265,7 +265,7 @@ struct xe_execlist_port *xe_execlist_port_create(struct xe_device *xe,
 
 	port->hwe = hwe;
 
-	port->lrc = xe_lrc_create(hwe, NULL, SZ_16K);
+	port->lrc = xe_lrc_create(NULL, hwe, NULL, SZ_16K);
 	if (IS_ERR(port->lrc)) {
 		err = PTR_ERR(port->lrc);
 		goto err;
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 9633e5e700f6..8a79470b52ae 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -901,8 +901,9 @@ static void xe_lrc_finish(struct xe_lrc *lrc)
 #define PVC_CTX_ASID		(0x2e + 1)
 #define PVC_CTX_ACC_CTR_THOLD	(0x2a + 1)
 
-static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
-		       struct xe_vm *vm, u32 ring_size)
+static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q,
+		       struct xe_hw_engine *hwe, struct xe_vm *vm,
+		       u32 ring_size)
 {
 	struct xe_gt *gt = hwe->gt;
 	struct xe_tile *tile = gt_to_tile(gt);
@@ -911,6 +912,11 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	void *init_data = NULL;
 	u32 arb_enable;
 	u32 lrc_size;
+	bool user_queue = q && q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION;
+	enum ttm_bo_type submit_type = user_queue ? ttm_bo_type_device :
+		ttm_bo_type_kernel;
+	unsigned int submit_flags = user_queue ? XE_BO_FLAG_USER : 0;
 	int err;
 
 	kref_init(&lrc->refcount);
@@ -930,7 +936,8 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 		return PTR_ERR(lrc->bo);
 
 	lrc->submission_ring = xe_bo_create_pin_map(xe, tile, vm, SZ_32K,
-						    ttm_bo_type_kernel,
+						    submit_type,
+						    submit_flags |
 						    XE_BO_FLAG_VRAM_IF_DGFX(tile) |
 						    XE_BO_FLAG_GGTT |
 						    XE_BO_FLAG_GGTT_INVALIDATE);
@@ -944,7 +951,8 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 		lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
 
 		lrc->indirect_state = xe_bo_create_pin_map(xe, tile, vm, SZ_8K,
-						ttm_bo_type_kernel,
+						submit_type,
+						submit_flags |
 						XE_BO_FLAG_VRAM_IF_DGFX(tile) |
 						XE_BO_FLAG_GGTT |
 						XE_BO_FLAG_GGTT_INVALIDATE);
@@ -955,6 +963,12 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 		}
 	}
 
+	/* Wait for clear */
+	if (user_queue)
+		dma_resv_wait_timeout(xe_vm_resv(vm),
+				      DMA_RESV_USAGE_KERNEL,
+				      false, MAX_SCHEDULE_TIMEOUT);
+
 	lrc->size = lrc_size;
 	lrc->tile = gt_to_tile(hwe->gt);
 	lrc->ring.size = ring_size;
@@ -1060,6 +1074,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 
 /**
  * xe_lrc_create - Create a LRC
+ * @q: Execution queue
  * @hwe: Hardware Engine
  * @vm: The VM (address space)
  * @ring_size: LRC ring size
@@ -1069,8 +1084,8 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
  * Return pointer to created LRC upon success and an error pointer
  * upon failure.
  */
-struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm,
-			     u32 ring_size)
+struct xe_lrc *xe_lrc_create(struct xe_exec_queue *q, struct xe_hw_engine *hwe,
+			     struct xe_vm *vm, u32 ring_size)
 {
 	struct xe_lrc *lrc;
 	int err;
@@ -1079,7 +1094,7 @@ struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm,
 	if (!lrc)
 		return ERR_PTR(-ENOMEM);
 
-	err = xe_lrc_init(lrc, hwe, vm, ring_size);
+	err = xe_lrc_init(lrc, q, hwe, vm, ring_size);
 	if (err) {
 		kfree(lrc);
 		return ERR_PTR(err);
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index b459dcab8787..23d71283c79d 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -41,8 +41,8 @@ struct xe_lrc_snapshot {
 
 #define LRC_PPHWSP_SCRATCH_ADDR	(0x34 * 4)
 
-struct xe_lrc *xe_lrc_create(struct xe_hw_engine *hwe, struct xe_vm *vm,
-			     u32 ring_size);
+struct xe_lrc *xe_lrc_create(struct xe_exec_queue *q, struct xe_hw_engine *hwe,
+			     struct xe_vm *vm, u32 ring_size);
 void xe_lrc_destroy(struct kref *ref);
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 105C110E579; Mon, 18 Nov 2024 23:37:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973048; x=1763509048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rdI4PLJjh43e/IG8RS2pfdO39jrbnYw1BScUbD5C7zQ=; b=dkza6VuPE8i8sVI4ZAu8q5k5NcceIPzOr1nFCpzuqnxUjOBtyFxta4W6 h8jAalbIJBCc3DiAGAMlUVDdNEGtYcNYYDSsMolS4FVSN3xpQBdXJsfbS C1LsUFthWmQAFHrq+fzYbClOMeK4nGDDjQVbDMwGcG8akRiwz6Glzv8xf fSNfthdnGOm0xdFzjO6BIDOnUXGpYD2lx0LRMY3bRGJnYHJzTeidYi9G0 SMWochathY2/dAoSj+49ljsZzs4qzi5KKC7Q4hhpfNbkDA5DdhbgfKXDc CU/qkmaosOSyaJcY0M+Szf9bN8cSRXV0vyqNUD4AdIAbQXWdMQaNGoXnZ A==; X-CSE-ConnectionGUID: dTaREb26Sk+VQlhdJoWE5Q== X-CSE-MsgGUID: ESOwHV/oSwuhn2lMS9HBLA== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878949" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878949" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:27 -0800 X-CSE-ConnectionGUID: MsQ484wxTmO0UM5j1lkOOw== X-CSE-MsgGUID: xzLkV2jxT5yGjYaAeQ6J7Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521727" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:27 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 16/29] drm/xe/uapi: Define UMD exec queue mapping uAPI Date: Mon, 18 Nov 2024 15:37:44 -0800 Message-Id: <20241118233757.2374041-17-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Define the UMD exec queue mapping uAPI. The submit ring, indirect LRC state (ring head, tail, etc...), and doorbell are securely mapped to user space. The ring is mapped at a VM PPGTT address, while the indirect LRC state and doorbell mappings are provided via fake offsets, like BOs.
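Illustrative user space sketch of the intended flow (not part of this patch; fd, vm_id, instance, and a ring BO already bound UC at ring_addr are assumed placeholders, and error handling is omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/xe_drm.h>

static int create_usermap_queue(int fd, uint32_t vm_id,
				struct drm_xe_engine_class_instance *instance,
				uint64_t ring_addr)
{
	struct drm_xe_exec_queue_ext_usermap ext = {
		.base.name = DRM_XE_EXEC_QUEUE_EXTENSION_USERMAP,
		.version = DRM_XE_EXEC_QUEUE_USERMAP_VERSION_XE2_REV0,
		.ring_size = 0x10000,	/* 64k: 4k aligned, within the 4k-2M range */
		.ring_addr = ring_addr,	/* ring VA in the VM, mapped UC */
	};
	struct drm_xe_exec_queue_create create = {
		.extensions = (uintptr_t)&ext,
		.width = 1,		/* usermap queues require width == 1 */
		.num_placements = 1,
		.vm_id = vm_id,
		.instances = (uintptr_t)instance,
	};
	void *ring_state, *db_page;

	if (ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, &create))
		return -1;

	/* The KMD writes the fake offsets and BO handle back into ext */
	ring_state = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED,
			  fd, ext.indirect_ring_state_offset);
	db_page = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED,
		       fd, ext.doorbell_offset);

	/* The doorbell sits at ext.doorbell_page_offset within db_page;
	 * ext.indirect_ring_state_handle must eventually be closed by the
	 * UMD. Stashing ring_state / db_page in the UMD context is omitted. */

	return create.exec_queue_id;
}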
Signed-off-by: Matthew Brost --- include/uapi/drm/xe_drm.h | 56 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 6490b16b1217..9356a714a2e0 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -1111,6 +1111,61 @@ struct drm_xe_vm_bind { __u64 reserved[2]; }; +/** + * struct drm_xe_exec_queue_ext_usermap + */ +struct drm_xe_exec_queue_ext_usermap { + /** @base: base user extension */ + struct drm_xe_user_extension base; + + /** @flags: MBZ */ + __u32 flags; + + /** @version: Version of usermap */ +#define DRM_XE_EXEC_QUEUE_USERMAP_VERSION_XE2_REV0 0 + __u32 version; + + /** + * @ring_size: The ring size.
4k-2M valid, must be 4k aligned. User + * space has to pad allocation / mapping to avoid prefetch faults. + * Prefetch size is platform dependent. + */ + __u32 ring_size; + + /** @pad: MBZ */ + __u32 pad; + + /** + * @ring_addr: Ring address mapped within the VM, should be mapped as + * UC. + */ + __u64 ring_addr; + + /** + * @indirect_ring_state_offset: The fake indirect ring state offset to + * use for subsequent mmap call. Always 4k in size. + */ + __u64 indirect_ring_state_offset; + + /** + * @doorbell_offset: The fake doorbell offset to use for subsequent mmap + * call. Always 4k in size. + */ + __u64 doorbell_offset; + + /** @doorbell_page_offset: The doorbell offset within the mmapped page */ + __u32 doorbell_page_offset; + + /** + * @indirect_ring_state_handle: Indirect ring state buffer object + * handle. Allocated by KMD and must be closed by user. + */ + __u32 indirect_ring_state_handle; + + /** @reserved: Reserved */ + __u64 reserved[2]; }; + /** * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE * @@ -1138,6 +1193,7 @@ struct drm_xe_exec_queue_create { #define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY 0 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE 1 +#define DRM_XE_EXEC_QUEUE_EXTENSION_USERMAP 1 /** @extensions: Pointer to the first extension struct, if any */ __u64 extensions; From patchwork Mon Nov 18 23:37:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879199 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 431A5D60D09 for ; Mon, 18 Nov 2024 23:37:35 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3CC7B10E584; Mon, 18 Nov 2024 23:37:30 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="W3VOVwfJ"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3FE2A10E57A; Mon, 18 Nov 2024 23:37:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973048; x=1763509048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gNdp0QyGV4i1yaEhzaz39MS69VT1BTl28gqiah5Po/o=; b=W3VOVwfJBN6J76Y/Nyd5x8/epHY+Ona0XQMNYzzOpxdVWaSovTZSTcnE AQ/UxwgqXQj1fn/ueMB+PCpvWWmFGJMYbIzCqO/oeqy/9+a4VlHvicDZH H1rilTYZO9Mv+3Ov3scNd+qPdn1mdcyJt/HSYSo4ZPSlchqpQ0NLmr6rv 2jtQ/vwXyJpItWrwcuZ0gE+p//ShzyTwUfLZ4cx60X8QwxZYcjmr9nkZ9 Fb2nmboQQZV8wJgezTKUGQwrxJVWAa8mAVWjYSeTiK5Q0n35laR6JA7z4 1Cd30hzMsiBYlhctG9sUY21GqUnz07ivR1FcPdXOn2T4QmyDGDVmCn1um A==; X-CSE-ConnectionGUID: JEF8QO7sQzGrRW2K2XGV3g== X-CSE-MsgGUID: ztXMgF0rR3SrV2KGSd/82g== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878956" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878956" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:28 -0800 X-CSE-ConnectionGUID: QNIgtjt/THizqv1WLVDANA== X-CSE-MsgGUID: r2gxfw5ZQDOKgSot8h/NWg== X-ExtLoop1: 1 X-IronPort-AV:
E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521731" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:28 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 17/29] drm/xe: Add usermap exec queue extension Date: Mon, 18 Nov 2024 15:37:45 -0800 Message-Id: <20241118233757.2374041-18-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Implement the uAPI which maps submit rings, indirect LRC state, and doorbells to user space. This is required for UMD direct submission. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_exec_queue.c | 125 ++++++++++++++++++++++- drivers/gpu/drm/xe/xe_exec_queue_types.h | 13 +++ drivers/gpu/drm/xe/xe_execlist.c | 2 +- drivers/gpu/drm/xe/xe_lrc.c | 59 +++++++---- drivers/gpu/drm/xe/xe_lrc.h | 2 +- 5 files changed, 176 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index aef5b130e7f8..c8d45133eb59 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -11,6 +11,7 @@ #include #include +#include "xe_bo.h" #include "xe_device.h" #include "xe_gt.h" #include "xe_hw_engine_class_sysfs.h" @@ -38,12 +39,18 @@ static int exec_queue_user_extensions_post_init(struct xe_device *xe, struct xe_ static void __xe_exec_queue_free(struct xe_exec_queue *q) { + struct xe_device *xe = q->vm ? q->vm->xe : NULL; + if (q->vm) xe_vm_put(q->vm); if (q->xef) xe_file_put(q->xef); + if (q->usermap) + xe_pm_runtime_put(xe); + + kfree(q->usermap); kfree(q); } @@ -110,6 +117,8 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe, static int __xe_exec_queue_init(struct xe_exec_queue *q) { struct xe_vm *vm = q->vm; + u64 ring_addr = q->usermap ? q->usermap->ring_addr : 0; + u32 ring_size = q->usermap ?
q->usermap->ring_size : SZ_16K; int i, err; if (vm) { @@ -119,7 +128,8 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q) } for (i = 0; i < q->width; ++i) { - q->lrc[i] = xe_lrc_create(q, q->hwe, q->vm, SZ_16K); + q->lrc[i] = xe_lrc_create(q, q->hwe, q->vm, ring_size, + ring_addr); if (IS_ERR(q->lrc[i])) { err = PTR_ERR(q->lrc[i]); goto err_unlock; @@ -444,12 +454,125 @@ typedef int (*xe_exec_queue_user_extension_fn)(struct xe_device *xe, struct xe_exec_queue *q, u64 extension); +static int exec_queue_user_ext_usermap(struct xe_device *xe, + struct xe_exec_queue *q, + u64 extension) +{ + u64 __user *address = u64_to_user_ptr(extension); + struct drm_xe_exec_queue_ext_usermap ext; + int err; + + /* Just parse args and make sure they are sane */ + + if (XE_IOCTL_DBG(xe, !xe_gt_has_indirect_ring_state(q->gt))) + return -EOPNOTSUPP; + + if (XE_IOCTL_DBG(xe, q->width != 1)) + return -EOPNOTSUPP; + + if (XE_IOCTL_DBG(xe, q->flags & (EXEC_QUEUE_FLAG_KERNEL | + EXEC_QUEUE_FLAG_PERMANENT | + EXEC_QUEUE_FLAG_VM | + EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD))) + return -EOPNOTSUPP; + + if (XE_IOCTL_DBG(xe, q->width != 1)) + return -EOPNOTSUPP; + + /* + * XXX: More or less free to support this but targeting Mesa for now as + * LR mode has ULLS. + */ + if (XE_IOCTL_DBG(xe, xe_vm_in_lr_mode(q->vm))) + return -EOPNOTSUPP; + + if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION)) + return -EINVAL; + + err = __copy_from_user(&ext, address, sizeof(ext)); + if (XE_IOCTL_DBG(xe, err)) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, ext.reserved[0] || ext.reserved[1])) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, ext.pad)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, ext.flags)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, ext.ring_size < SZ_4K || + ext.ring_size > SZ_2M || + ext.ring_size & ~PAGE_MASK)) + return -EINVAL; + + if (XE_IOCTL_DBG(xe, ext.version != + DRM_XE_EXEC_QUEUE_USERMAP_VERSION_XE2_REV0)) + return -EINVAL; + + q->usermap = kzalloc(sizeof(struct xe_exec_queue_usermap), GFP_KERNEL); + if (!q->usermap) + return -ENOMEM; + + q->usermap->ring_size = ext.ring_size; + q->usermap->ring_addr = ext.ring_addr; + + xe_pm_runtime_get_noresume(xe); + q->flags |= EXEC_QUEUE_FLAG_UMD_SUBMISSION; + + return 0; +} + +static int exec_queue_user_ext_post_init_usermap(struct xe_device *xe, + struct xe_exec_queue *q, + u64 extension) +{ + struct drm_xe_exec_queue_ext_usermap ext; + struct xe_lrc *lrc = q->lrc[0]; + u64 __user *address = u64_to_user_ptr(extension); + u32 indirect_ring_state_handle; + int err; + + err = __copy_from_user(&ext, address, sizeof(ext)); + if (XE_IOCTL_DBG(xe, err)) + return -EFAULT; + + err = drm_gem_handle_create(q->xef->drm, + &lrc->indirect_state->ttm.base, + &indirect_ring_state_handle); + if (err) + return err; + + ext.indirect_ring_state_offset = + drm_vma_node_offset_addr(&lrc->indirect_state->ttm.base.vma_node); + ext.indirect_ring_state_handle = indirect_ring_state_handle; + ext.doorbell_offset = XE_MMIO_DOORBELL_MMAP_OFFSET + + SZ_4K * q->guc->db.id; + ext.doorbell_page_offset = 0; + + err = copy_to_user(address, &ext, sizeof(ext)); + if (XE_IOCTL_DBG(xe, err)) { + err = -EFAULT; + goto close_handles; + } + + return 0; + +close_handles: + drm_gem_handle_delete(q->xef->drm, indirect_ring_state_handle); + + return err; +} + static const xe_exec_queue_user_extension_fn exec_queue_user_extension_funcs[] = { [DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = exec_queue_user_ext_set_property, + [DRM_XE_EXEC_QUEUE_EXTENSION_USERMAP] = exec_queue_user_ext_usermap, }; static const 
xe_exec_queue_user_extension_fn exec_queue_user_extension_post_init_funcs[] = { [DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = NULL, + [DRM_XE_EXEC_QUEUE_EXTENSION_USERMAP] = exec_queue_user_ext_post_init_usermap, }; #define MAX_USER_EXTENSIONS 16 diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h index 7f68587d4021..b30b5ee910fa 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h @@ -31,6 +31,16 @@ enum xe_exec_queue_priority { XE_EXEC_QUEUE_PRIORITY_COUNT }; +/** + * struct xe_exec_queue_usermap - Execution queue usermap (UMD submission) + */ +struct xe_exec_queue_usermap { + /** @ring_addr: ring address (PPGTT) */ + u64 ring_addr; + /** @ring_size: ring size */ + u32 ring_size; +}; + /** * struct xe_exec_queue - Execution queue * @@ -130,6 +140,9 @@ struct xe_exec_queue { struct list_head link; } lr; + /** @usermap: user map interface */ + struct xe_exec_queue_usermap *usermap; + /** @ops: submission backend exec queue operations */ const struct xe_exec_queue_ops *ops; diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c index 93f76280d453..803c84b2e4ed 100644 --- a/drivers/gpu/drm/xe/xe_execlist.c +++ b/drivers/gpu/drm/xe/xe_execlist.c @@ -265,7 +265,7 @@ struct xe_execlist_port *xe_execlist_port_create(struct xe_device *xe, port->hwe = hwe; - port->lrc = xe_lrc_create(NULL, hwe, NULL, SZ_16K); + port->lrc = xe_lrc_create(NULL, hwe, NULL, SZ_16K, 0); if (IS_ERR(port->lrc)) { err = PTR_ERR(port->lrc); goto err; diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 8a79470b52ae..8d5a65724c04 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -903,7 +903,7 @@ static void xe_lrc_finish(struct xe_lrc *lrc) static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, struct xe_hw_engine *hwe, struct xe_vm *vm, - u32 ring_size) + u32 ring_size, u64 ring_addr) { struct xe_gt *gt = hwe->gt; struct xe_tile *tile = gt_to_tile(gt); @@ -919,6 +919,8 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, XE_BO_FLAG_USER : 0; int err; + xe_assert(xe, (!user_queue && !ring_addr) || (user_queue && ring_addr)); + kref_init(&lrc->refcount); lrc->flags = 0; lrc_size = xe_gt_lrc_size(gt, hwe->class); @@ -935,16 +937,18 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, if (IS_ERR(lrc->bo)) return PTR_ERR(lrc->bo); - lrc->submission_ring = xe_bo_create_pin_map(xe, tile, vm, SZ_32K, - submit_type, - submit_flags | - XE_BO_FLAG_VRAM_IF_DGFX(tile) | - XE_BO_FLAG_GGTT | - XE_BO_FLAG_GGTT_INVALIDATE); - if (IS_ERR(lrc->submission_ring)) { - err = PTR_ERR(lrc->submission_ring); - lrc->submission_ring = NULL; - goto err_lrc_finish; + if (!user_queue) { + lrc->submission_ring = xe_bo_create_pin_map(xe, tile, vm, SZ_32K, + submit_type, + submit_flags | + XE_BO_FLAG_VRAM_IF_DGFX(tile) | + XE_BO_FLAG_GGTT | + XE_BO_FLAG_GGTT_INVALIDATE); + if (IS_ERR(lrc->submission_ring)) { + err = PTR_ERR(lrc->submission_ring); + lrc->submission_ring = NULL; + goto err_lrc_finish; + } } if (xe_gt_has_indirect_ring_state(gt)) { @@ -1018,12 +1022,19 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, } if (xe_gt_has_indirect_ring_state(gt)) { - xe_lrc_write_ctx_reg(lrc, CTX_INDIRECT_RING_STATE, - __xe_lrc_indirect_ring_ggtt_addr(lrc)); - - xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START, - __xe_lrc_ring_ggtt_addr(lrc)); - xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START_UDW, 0); + if 
(ring_addr) { /* PPGTT */ + xe_lrc_write_ctx_reg(lrc, CTX_INDIRECT_RING_STATE, + __xe_lrc_indirect_ring_ggtt_addr(lrc) | BIT(0)); + xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START, + ring_addr); + } else { + xe_lrc_write_ctx_reg(lrc, CTX_INDIRECT_RING_STATE, + __xe_lrc_indirect_ring_ggtt_addr(lrc)); + xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START, + __xe_lrc_ring_ggtt_addr(lrc)); + } + xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START_UDW, + ring_addr >> 32); xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_HEAD, 0); xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_TAIL, lrc->ring.tail); xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_CTL, @@ -1056,8 +1067,10 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, lrc->desc |= FIELD_PREP(LRC_ENGINE_CLASS, hwe->class); } - arb_enable = MI_ARB_ON_OFF | MI_ARB_ENABLE; - xe_lrc_write_ring(lrc, &arb_enable, sizeof(arb_enable)); + if (lrc->submission_ring) { + arb_enable = MI_ARB_ON_OFF | MI_ARB_ENABLE; + xe_lrc_write_ring(lrc, &arb_enable, sizeof(arb_enable)); + } map = __xe_lrc_seqno_map(lrc); xe_map_write32(lrc_to_xe(lrc), &map, lrc->fence_ctx.next_seqno - 1); @@ -1078,6 +1091,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, * @hwe: Hardware Engine * @vm: The VM (address space) * @ring_size: LRC ring size + * @ring_addr: LRC ring address, only valid for usermap queues * * Allocate and initialize the Logical Ring Context (LRC). * @@ -1085,7 +1099,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, * upon failure. */ struct xe_lrc *xe_lrc_create(struct xe_exec_queue *q, struct xe_hw_engine *hwe, - struct xe_vm *vm, u32 ring_size) + struct xe_vm *vm, u32 ring_size, u64 ring_addr) { struct xe_lrc *lrc; int err; @@ -1094,7 +1108,7 @@ struct xe_lrc *xe_lrc_create(struct xe_exec_queue *q, struct xe_hw_engine *hwe, if (!lrc) return ERR_PTR(-ENOMEM); - err = xe_lrc_init(lrc, q, hwe, vm, ring_size); + err = xe_lrc_init(lrc, q, hwe, vm, ring_size, ring_addr); if (err) { kfree(lrc); return ERR_PTR(err); @@ -1717,7 +1731,8 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc) xe_vm_get(lrc->bo->vm); snapshot->context_desc = xe_lrc_ggtt_addr(lrc); - snapshot->ring_addr = __xe_lrc_ring_ggtt_addr(lrc); + snapshot->ring_addr = lrc->submission_ring ? 
+ __xe_lrc_ring_ggtt_addr(lrc) : 0; snapshot->indirect_context_desc = xe_lrc_indirect_ring_ggtt_addr(lrc); snapshot->head = xe_lrc_ring_head(lrc); snapshot->tail.internal = lrc->ring.tail; diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h index 23d71283c79d..a7facfa8bf51 100644 --- a/drivers/gpu/drm/xe/xe_lrc.h +++ b/drivers/gpu/drm/xe/xe_lrc.h @@ -42,7 +42,7 @@ struct xe_lrc_snapshot { #define LRC_PPHWSP_SCRATCH_ADDR (0x34 * 4) struct xe_lrc *xe_lrc_create(struct xe_exec_queue *q, struct xe_hw_engine *hwe, - struct xe_vm *vm, u32 ring_size); + struct xe_vm *vm, u32 ring_size, u64 ring_addr); void xe_lrc_destroy(struct kref *ref); /** From patchwork Mon Nov 18 23:37:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879206 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B23AAD60CF8 for ; Mon, 18 Nov 2024 23:37:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 25F5410E594; Mon, 18 Nov 2024 23:37:32 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="eg9Bvd/o"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7ABDC10E582; Mon, 18 Nov 2024 23:37:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973048; x=1763509048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gg7AO0QPwabBwsHQ+3ISwQCdQZLih9895VdKNFkARtg=; b=eg9Bvd/ogwkJ1NmiJULektuDcha+xjsh9BymujM1HA9zS+xaUkWvxSIV dWfM7rlzRMKtd3A3FVf3J4//MueCpec6jzvHsSZlxbI8ktAW3yy8wwmD7 1Yjrp9qRTLMSKcZLh8tWEP3wGDOLeybezIQV50FoSU72ElUrAVXmXyOyO dcuP5P5ECQ3eZq7uS7S+X3K1EoDXHtQpJY0IKIKMvwnSTcZzTlFXcdIJc XezSNhdZJK1ywYlMtlAXnUjs49jFACOvvx3W6svPJxXHjJp+mXNpoBPef Bv701m88v/Fk3lERTqHL1HW3bbgpP4W9p+5DwautDLub1h1WuwiYyaLeh A==; X-CSE-ConnectionGUID: 9q0Cf4JhQ4uDnUq+Q+fjvA== X-CSE-MsgGUID: bEYu9MPeTBiKtm96e7LxFw== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878963" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878963" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:28 -0800 X-CSE-ConnectionGUID: KtyOM3iSQ0GBgaABnL0/fA== X-CSE-MsgGUID: UMfm82nGTqKZRELvZrig4g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521738" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:28 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 18/29] drm/xe: Drop EXEC_QUEUE_FLAG_UMD_SUBMISSION flag 
Date: Mon, 18 Nov 2024 15:37:46 -0800 Message-Id: <20241118233757.2374041-19-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Use xe_exec_queue_is_usermap helper instead. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_exec_queue.c | 3 +-- drivers/gpu/drm/xe/xe_exec_queue.h | 5 +++++ drivers/gpu/drm/xe/xe_exec_queue_types.h | 2 -- drivers/gpu/drm/xe/xe_guc_submit.c | 4 ++-- drivers/gpu/drm/xe/xe_lrc.c | 4 ++-- 5 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index c8d45133eb59..a22f089ccec6 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -486,7 +486,7 @@ static int exec_queue_user_ext_usermap(struct xe_device *xe, if (XE_IOCTL_DBG(xe, xe_vm_in_lr_mode(q->vm))) return -EOPNOTSUPP; - if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION)) + if (XE_IOCTL_DBG(xe, xe_exec_queue_is_usermap(q))) return -EINVAL; err = __copy_from_user(&ext, address, sizeof(ext)); @@ -519,7 +519,6 @@ static int exec_queue_user_ext_usermap(struct xe_device *xe, q->usermap->ring_addr = ext.ring_addr; xe_pm_runtime_get_noresume(xe); - q->flags |= EXEC_QUEUE_FLAG_UMD_SUBMISSION; return 0; } diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h index 90c7f73eab88..a4a1dbf5b977 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.h +++ b/drivers/gpu/drm/xe/xe_exec_queue.h @@ -57,6 +57,11 @@ static inline bool xe_exec_queue_is_parallel(struct xe_exec_queue *q) return q->width > 1; } +static inline bool xe_exec_queue_is_usermap(struct xe_exec_queue *q) +{ + return !!q->usermap; +} + bool xe_exec_queue_is_lr(struct xe_exec_queue *q); bool xe_exec_queue_ring_full(struct xe_exec_queue *q); diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h index b30b5ee910fa..26ce85b8d163 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h @@ -93,8 +93,6 @@ struct xe_exec_queue { #define EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD BIT(3) /* kernel exec_queue only, set priority to highest level */ #define EXEC_QUEUE_FLAG_HIGH_PRIORITY BIT(4) -/* queue used for UMD submission */ -#define EXEC_QUEUE_FLAG_UMD_SUBMISSION BIT(5) /** * @flags: flags for this exec queue, should statically setup aside from ban diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index c226c7b3245d..59d2e08797f5 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -1522,7 +1522,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q) xe_sched_stop(sched); q->guc->db.id = -1; - if (q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION) { + if (xe_exec_queue_is_usermap(q)) { db_id = xe_guc_db_mgr_reserve_id_locked(&guc->dbm); if (db_id < 0) { err = db_id; @@ -1532,7 +1532,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q) mutex_unlock(&guc->submission_state.lock); - if (q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION) { + if (xe_exec_queue_is_usermap(q)) { q->guc->db.id = db_id; err = create_doorbell(guc, q); if (err) diff 
--git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 8d5a65724c04..e8675624966d 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -18,7 +18,7 @@ #include "xe_bo.h" #include "xe_device.h" #include "xe_drm_client.h" -#include "xe_exec_queue_types.h" +#include "xe_exec_queue.h" #include "xe_gt.h" #include "xe_gt_printk.h" #include "xe_hw_fence.h" @@ -912,7 +912,7 @@ static int xe_lrc_init(struct xe_lrc *lrc, struct xe_exec_queue *q, void *init_data = NULL; u32 arb_enable; u32 lrc_size; - bool user_queue = q && q->flags & EXEC_QUEUE_FLAG_UMD_SUBMISSION; + bool user_queue = q && xe_exec_queue_is_usermap(q); enum ttm_bo_type submit_type = user_queue ? ttm_bo_type_device : ttm_bo_type_kernel; unsigned int submit_flags = user_queue ? XE_BO_FLAG_USER : 0; int err; From patchwork Mon Nov 18 23:37:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879218 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 314ABD60D07 for ; Mon, 18 Nov 2024 23:37:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B026F10E595; Mon, 18 Nov 2024 23:37:42 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="bOkQS1C5"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id C3CE010E582; Mon, 18 Nov 2024 23:37:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973048; x=1763509048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bj9M7C0DQr35+AV9jiUejnxNFjJG4cp8nso19AZvrOQ=; b=bOkQS1C5M+HtDzSp9JwtUaRFdZKNndqUYWvM88Aigmq637pUOINKxWTY Ne3vkv3KkUjZ9azWAp46ukWnkKZSVvfkZlxF9MYTYJB8BaQkgDPe6NWnv dPSCD6m+Nx+F7r+ngUQCNcgYJSvak/1Od3MYlcy7O+pFeuXy2W6P7i1Lk bMDtnjSVJEnBoY5zVaAuNsq7tIrKIWRuOScFXZDiFwGINb0AtnHcNKmFS RABPRfoIaLdCTv0Onsciv6fU1NOawjhEX1VjX8OrGIBt148Yfphgz28D8 Xt35OueJ1I4NK2KV3/rOSzT2cN6YRCNnyYQ64SPazQT/E5GBUbSUQWoYn A==; X-CSE-ConnectionGUID: 29zr3746SfyMXGyHThNdCQ== X-CSE-MsgGUID: uyI2M4uIQiyx3XGdnhEbUw== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878970" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878970" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:28 -0800 X-CSE-ConnectionGUID: mgNJIdMKS2aiUJA+oBaMNQ== X-CSE-MsgGUID: /DNBMEWITjSdVBd8qs2IFQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521741" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:28 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com,
steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 19/29] drm/xe: Do not allow usermap exec queues in exec IOCTL Date: Mon, 18 Nov 2024 15:37:47 -0800 Message-Id: <20241118233757.2374041-20-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Not supported at the moment; something may be needed for the case where no doorbells are available.
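For illustration, the rejected path looks roughly like this from user space (a sketch only; fd, usermap_queue_id, and batch_addr are placeholders):

	/* Exec IOCTL on a queue created with the usermap extension */
	struct drm_xe_exec exec = {
		.exec_queue_id = usermap_queue_id,
		.num_batch_buffer = 1,
		.address = batch_addr,
	};
	int ret = ioctl(fd, DRM_IOCTL_XE_EXEC, &exec);
	/* Expected after this patch: ret == -1 with errno == EINVAL */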
Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_exec.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c index 31cca938956f..898e4718d639 100644 --- a/drivers/gpu/drm/xe/xe_exec.c +++ b/drivers/gpu/drm/xe/xe_exec.c @@ -132,7 +132,8 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file) if (XE_IOCTL_DBG(xe, !q)) return -ENOENT; - if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_VM)) { + if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_VM || + xe_exec_queue_is_usermap(q))) { err = -EINVAL; goto err_exec_queue; } From patchwork Mon Nov 18 23:37:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879204 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D7235D60D06 for ; Mon, 18 Nov 2024 23:37:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5434D10E58D; Mon, 18 Nov 2024 23:37:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="YGolGYUI"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4DCE610E584; Mon, 18 Nov 2024 23:37:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973049; x=1763509049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KSwdYjxf8PXwMIB2tQPack53L/KWH/z20YDuNNF5+gY=; b=YGolGYUIA0KlS1sjyiNw4bk27Lce/hdcZPxUxmrg/jgUu9AiZhRrX4LT mApyUh7GwjR0A0eZNXlzvTrHsC8G+aEeXgAL9FNHH7K2Tafyv3fic/U8N qGfTGaLJjYMyZqsgawRzPOeNxKpGle9rw8MqotvVsXYPFaLn83Hr6Lhoh iKW17tGb6M/X3Vdtj4F7T3w4NRP0nuQkwgIf5sNPRX4Zp8uGZ7SEKzVHd G9imvqA6F7VvZ+PsOpijYeFOEBin6vVqEGNVSX8nagGay3NqT0bAcz50d yYUMU6+yEOX4PDZ2x4oOuEwLJ+1wJCR/QhSNV6V/ELy8lkKFreVLUddPD Q==; X-CSE-ConnectionGUID: +h39/CaTTp2PmkAAaz5WwA== X-CSE-MsgGUID: k7PXYSGCSDWfTNFf7DS+aw== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878979" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878979" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 X-CSE-ConnectionGUID: qagDw0mgSiu0LYh/NOLwtw==
X-CSE-MsgGUID: F9p/60siQsOdSKfi1Wqo9g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521747" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 20/29] drm/xe: Teach GuC backend to kill usermap queues Date: Mon, 18 Nov 2024 15:37:48 -0800 Message-Id: <20241118233757.2374041-21-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Usermap exec queue teardown (kill) differs from that of other exec queues as no job is available, a doorbell is mapped, and the kill should be immediate. A follow-up could unify LR queue cleanup with usermap, but keep this a separate flow for now. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 2 +- drivers/gpu/drm/xe/xe_guc_submit.c | 56 +++++++++++++++++++- 2 files changed, 55 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h index 2d53af75ed75..c6c58e414b19 100644 --- a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h +++ b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h @@ -29,7 +29,7 @@ struct xe_guc_exec_queue { * a message needs to sent through the GPU scheduler but memory * allocations are not allowed.
*/ -#define MAX_STATIC_MSG_TYPE 3 +#define MAX_STATIC_MSG_TYPE 4 struct xe_sched_msg static_msgs[MAX_STATIC_MSG_TYPE]; /** @lr_tdr: long running TDR worker */ struct work_struct lr_tdr; diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 59d2e08797f5..82071a0ec91e 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -230,6 +230,11 @@ static void set_exec_queue_doorbell_registered(struct xe_exec_queue *q) atomic_or(EXEC_QUEUE_STATE_DB_REGISTERED, &q->guc->state); } +static void clear_exec_queue_doorbell_registered(struct xe_exec_queue *q) +{ + atomic_and(~EXEC_QUEUE_STATE_DB_REGISTERED, &q->guc->state); +} + static bool exec_queue_killed_or_banned_or_wedged(struct xe_exec_queue *q) { return (atomic_read(&q->guc->state) & @@ -798,6 +803,8 @@ static void disable_scheduling_deregister(struct xe_guc *guc, G2H_LEN_DW_DEREGISTER_CONTEXT, 2); } +static void guc_exec_queue_kill_user(struct xe_exec_queue *q); + static void xe_guc_exec_queue_trigger_cleanup(struct xe_exec_queue *q) { struct xe_guc *guc = exec_queue_to_guc(q); @@ -806,7 +813,9 @@ static void xe_guc_exec_queue_trigger_cleanup(struct xe_exec_queue *q) /** to wakeup xe_wait_user_fence ioctl if exec queue is reset */ wake_up_all(&xe->ufence_wq); - if (xe_exec_queue_is_lr(q)) + if (xe_exec_queue_is_usermap(q)) + guc_exec_queue_kill_user(q); + else if (xe_exec_queue_is_lr(q)) queue_work(guc_to_gt(guc)->ordered_wq, &q->guc->lr_tdr); else xe_sched_tdr_queue_imm(&q->guc->sched); @@ -1294,8 +1303,10 @@ static void __guc_exec_queue_process_msg_cleanup(struct xe_sched_msg *msg) xe_gt_assert(guc_to_gt(guc), !(q->flags & EXEC_QUEUE_FLAG_PERMANENT)); trace_xe_exec_queue_cleanup_entity(q); - if (exec_queue_doorbell_registered(q)) + if (exec_queue_doorbell_registered(q)) { + clear_exec_queue_doorbell_registered(q); deallocate_doorbell(guc, q->guc->id); + } if (exec_queue_registered(q)) disable_scheduling_deregister(guc, q); @@ -1382,10 +1393,29 @@ static void __guc_exec_queue_process_msg_resume(struct xe_sched_msg *msg) } } +static void __guc_exec_queue_process_msg_kill_user(struct xe_sched_msg *msg) +{ + struct xe_exec_queue *q = msg->private_data; + struct xe_guc *guc = exec_queue_to_guc(q); + + if (!xe_lrc_ring_is_idle(q->lrc[0])) + xe_gt_dbg(q->gt, "Killing non-idle usermap queue: guc_id=%d", + q->guc->id); + + if (exec_queue_doorbell_registered(q)) { + clear_exec_queue_doorbell_registered(q); + deallocate_doorbell(guc, q->guc->id); + } + + if (exec_queue_registered(q)) + disable_scheduling_deregister(guc, q); +} + #define CLEANUP 1 /* Non-zero values to catch uninitialized msg */ #define SET_SCHED_PROPS 2 #define SUSPEND 3 #define RESUME 4 +#define KILL_USER 5 #define OPCODE_MASK 0xf #define MSG_LOCKED BIT(8) @@ -1408,6 +1438,9 @@ static void guc_exec_queue_process_msg(struct xe_sched_msg *msg) case RESUME: __guc_exec_queue_process_msg_resume(msg); break; + case KILL_USER: + __guc_exec_queue_process_msg_kill_user(msg); + break; default: XE_WARN_ON("Unknown message type"); } @@ -1600,6 +1633,7 @@ static bool guc_exec_queue_try_add_msg(struct xe_exec_queue *q, #define STATIC_MSG_CLEANUP 0 #define STATIC_MSG_SUSPEND 1 #define STATIC_MSG_RESUME 2 +#define STATIC_MSG_KILL_USER 3 static void guc_exec_queue_fini(struct xe_exec_queue *q) { struct xe_sched_msg *msg = q->guc->static_msgs + STATIC_MSG_CLEANUP; @@ -1725,6 +1759,24 @@ static void guc_exec_queue_resume(struct xe_exec_queue *q) xe_sched_msg_unlock(sched); } +static void guc_exec_queue_kill_user(struct xe_exec_queue 
*q) +{ + struct xe_gpu_scheduler *sched = &q->guc->sched; + struct xe_sched_msg *msg = q->guc->static_msgs + STATIC_MSG_KILL_USER; + + if (exec_queue_extra_ref(q)) + return; + + set_exec_queue_banned(q); + + xe_sched_msg_lock(sched); + if (guc_exec_queue_try_add_msg(q, msg, KILL_USER)) { + set_exec_queue_extra_ref(q); + xe_exec_queue_get(q); + } + xe_sched_msg_unlock(sched); +} + static bool guc_exec_queue_reset_status(struct xe_exec_queue *q) { return exec_queue_reset(q) || exec_queue_killed_or_banned_or_wedged(q); From patchwork Mon Nov 18 23:37:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879213 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 087E7D60D00 for ; Mon, 18 Nov 2024 23:37:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E7CD110E5B4; Mon, 18 Nov 2024 23:37:33 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="bLyenX8o"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 64FAB10E585; Mon, 18 Nov 2024 23:37:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973049; x=1763509049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0CfghZNeiZ/DK/IeKHwi+ZoI/7jLbVs/H+RcKIZeBk8=; b=bLyenX8oX4Zc5hs7E/X0nBRp24yCGpjuH32yAdgnWDFk+tMhX9Mrw669 YH+MCi8RxjGJELZXgJk0pf5QcUNR6MmL03+zrCKiHSntZ+pN19RfAUemu 2pgPkatiVV/GhR6jf0iXvIRkU4FAYXvmFEX+Ch3qvDIb/rUQo9r0F9f7D LL1cqFxSB2yKCtQ2fhQhs3WxTA/JkyipVE1BHjUiEGy5/4F24JnlZgRDy BA4+5zlNF63NwokgcAjxI+ez6zrwFCXOOEe6fLbgZN6ndQSQseD1yWvM9 MPPclIjxWzZaXEkMidgIhUHY1FIDdEIVtzDXE/3z+4WDFKJwcQURcV+G2 g==; X-CSE-ConnectionGUID: nJCT9MhXTQ6Q/45tF3t3CA== X-CSE-MsgGUID: /HGdwq34S9y2Ao+cTUGiFQ== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878985" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878985" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 X-CSE-ConnectionGUID: +AHPvOiWR0KGnh/+KkCv0g== X-CSE-MsgGUID: hugmTivTTHObyNN8UJGazw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521750" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 21/29] drm/xe: Enable preempt fences on usermap queues Date: Mon, 18 Nov 2024 15:37:49 -0800 Message-Id: <20241118233757.2374041-22-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 
In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Preempt fences are used by usermap queues to implement dynamic memory management (BO eviction, userptr invalidation); enable preempt fences on usermap queues. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_exec_queue.c | 3 ++- drivers/gpu/drm/xe/xe_pt.c | 3 +-- drivers/gpu/drm/xe/xe_vm.c | 18 ++++++++---------- drivers/gpu/drm/xe/xe_vm.h | 2 +- 4 files changed, 12 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index a22f089ccec6..987584090263 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -794,7 +794,8 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data, if (IS_ERR(q)) return PTR_ERR(q); - if (xe_vm_in_preempt_fence_mode(vm)) { + if (xe_vm_in_preempt_fence_mode(vm) || + xe_exec_queue_is_usermap(q)) { q->lr.context = dma_fence_context_alloc(1); err = xe_vm_add_compute_exec_queue(vm, q); diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c index 684dc075deac..a75667346ab3 100644 --- a/drivers/gpu/drm/xe/xe_pt.c +++ b/drivers/gpu/drm/xe/xe_pt.c @@ -1882,8 +1882,7 @@ static void bind_op_commit(struct xe_vm *vm, struct xe_tile *tile, * the rebind worker */ if (pt_update_ops->wait_vm_bookkeep && - xe_vm_in_preempt_fence_mode(vm) && - !current->mm) + vm->preempt.num_exec_queues && !current->mm) xe_vm_queue_rebind_worker(vm); } diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 2e67648ed512..16bc1b82d950 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -229,7 +229,8 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q) int err; bool wait; - xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm)); + xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm) || + xe_exec_queue_is_usermap(q)); down_write(&vm->lock); err = drm_gpuvm_exec_lock(&vm_exec); @@ -280,7 +281,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q) */ void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q) { - if (!xe_vm_in_preempt_fence_mode(vm)) + if (!xe_vm_in_preempt_fence_mode(vm) && !xe_exec_queue_is_usermap(q)) return; down_write(&vm->lock); @@ -487,7 +488,7 @@ static void preempt_rebind_work_func(struct work_struct *w) long wait; int __maybe_unused tries = 0; - xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm)); + xe_assert(vm->xe, !xe_vm_in_fault_mode(vm)); trace_xe_vm_rebind_worker_enter(vm); down_write(&vm->lock); @@ -1467,10 +1468,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) vm->batch_invalidate_tlb = true; } - if (vm->flags & XE_VM_FLAG_LR_MODE) { - INIT_WORK(&vm->preempt.rebind_work, preempt_rebind_work_func); + INIT_WORK(&vm->preempt.rebind_work, preempt_rebind_work_func); + if (vm->flags & XE_VM_FLAG_LR_MODE) vm->batch_invalidate_tlb = false; - } /* Fill pt_root after allocating scratch tables */ for_each_tile(tile, xe, id) { @@ -1543,8 +1543,7 @@ void xe_vm_close_and_put(struct xe_vm *vm) xe_assert(xe, !vm->preempt.num_exec_queues); xe_vm_close(vm); - if (xe_vm_in_preempt_fence_mode(vm)) -
flush_work(&vm->preempt.rebind_work); + flush_work(&vm->preempt.rebind_work); down_write(&vm->lock); for_each_tile(tile, xe, id) { @@ -1644,8 +1643,7 @@ static void vm_destroy_work_func(struct work_struct *w) /* xe_vm_close_and_put was not called? */ xe_assert(xe, !vm->size); - if (xe_vm_in_preempt_fence_mode(vm)) - flush_work(&vm->preempt.rebind_work); + flush_work(&vm->preempt.rebind_work); mutex_destroy(&vm->snap_mutex); diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h index c864dba35e1d..4391dbaeba51 100644 --- a/drivers/gpu/drm/xe/xe_vm.h +++ b/drivers/gpu/drm/xe/xe_vm.h @@ -216,7 +216,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma); static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm) { - xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm)); + xe_assert(vm->xe, !xe_vm_in_fault_mode(vm)); queue_work(vm->xe->ordered_wq, &vm->preempt.rebind_work); } From patchwork Mon Nov 18 23:37:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879205 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D727FD60D05 for ; Mon, 18 Nov 2024 23:37:41 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 544D810E590; Mon, 18 Nov 2024 23:37:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="OAfTxZfw"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id C8EFC10E584; Mon, 18 Nov 2024 23:37:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973050; x=1763509050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CB/UAMQz/584WOERxoPhXPZsBITEWPivbuUiVFlOkiU=; b=OAfTxZfw18Ur2cMNSW4GhLcgIy5RG8qJCits1dVVfzEVa4bDjkywSsra 4lkqqPDKqQRpOIc0/A/YPvWzjb+VqevZys5Y5S1nJVSBCWWNhVbB2+s2h ylV/xv5A6Lh0nRJywl0IMMoaRIaq/TYZYoaa68uhrEm7H3vMJGMKh1tCe eq4+r9si6DL8OYEzRcYbOynPUfHl69WKNGc2eaO6QQWnCBwhj6SJWl9qi YZZP5YmlMO7dSU9VoFqFhKFPXL29Q5pyemGetnP+IZHaO3hkU+EBQwr5W ywuS1Ltd5z1TDyS1gpdXBvUpZkohCi6Ml9fEJmXhYf03UmXX1o88AdYKL A==; X-CSE-ConnectionGUID: eSNS9VhWQWa41KooKh2bgw== X-CSE-MsgGUID: FZh0D1W4QQ6hdgItCPzBkg== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878991" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878991" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 X-CSE-ConnectionGUID: cMFx8zK2RLiqrX+ZzLwxnA== X-CSE-MsgGUID: 7FQIaW/JQbuZ2vyyvfKt3A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521754" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, 
thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 22/29] drm/xe/uapi: Add uAPI to convert user semaphore to / from drm syncobj Date: Mon, 18 Nov 2024 15:37:50 -0800 Message-Id: <20241118233757.2374041-23-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Simple interface to allow user space to share user syncs with kernel syncs (dma-fences). The idea is also that, when user syncs are converted to kernel syncs, preemption is guarded against until the kernel sync signals. This is required to adhere to dma-fencing rules (no memory allocations done in the path of a dma-fence; resume after preemption requires memory allocations). FIXME: uAPI likely to change, perhaps in a generic DRM way. Currently enough for a PoC and to enable initial Mesa development. Signed-off-by: Matthew Brost --- include/uapi/drm/xe_drm.h | 62 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+) diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 9356a714a2e0..0cd473d2d91b 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -102,6 +102,7 @@ extern "C" { #define DRM_XE_EXEC 0x09 #define DRM_XE_WAIT_USER_FENCE 0x0a #define DRM_XE_OBSERVATION 0x0b +#define DRM_XE_VM_CONVERT_FENCE 0x0c /* Must be kept compact -- no holes */ @@ -117,6 +118,7 @@ extern "C" { #define DRM_IOCTL_XE_EXEC DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec) #define DRM_IOCTL_XE_WAIT_USER_FENCE DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence) #define DRM_IOCTL_XE_OBSERVATION DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OBSERVATION, struct drm_xe_observation_param) +#define DRM_IOCTL_XE_VM_CONVERT_FENCE DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_CONVERT_FENCE, struct drm_xe_vm_convert_fence) /** * DOC: Xe IOCTL Extensions @@ -1796,6 +1798,66 @@ struct drm_xe_oa_stream_info { __u64 reserved[3]; }; +/** + * struct drm_xe_semaphore - Semaphore + */ +struct drm_xe_semaphore { + /** + * @handle: Handle for the semaphore. Must be bound to the VM when + * passed into drm_xe_vm_convert_fence. + */ + __u32 handle; + + /** @offset: Offset in BO for semaphore, must be QW aligned */ + __u32 offset; + + /** @seqno: Sequence number of semaphore */ + __u64 seqno; + + /** @token: Semaphore token - MBZ as not supported yet */ + __u64 token; + + /** @reserved: reserved for future use */ + __u64 reserved[2]; }; + +/** + * struct drm_xe_vm_convert_fence - Convert semaphore to / from syncobj + * + * DRM_XE_SYNC_FLAG_SIGNAL set indicates semaphore -> syncobj + * DRM_XE_SYNC_FLAG_SIGNAL clear indicates syncobj -> semaphore + */ +struct drm_xe_vm_convert_fence { + /** + * @extensions: Pointer to the first extension struct, if any + */ + __u64 extensions; + + /** @vm_id: VM ID */ + __u32 vm_id; + + /** @flags: Flags - MBZ */ + __u32 flags; + + /** @pad: MBZ */ + __u32 pad; + + /** + * @num_syncs: Number of struct drm_xe_sync and struct drm_xe_semaphore + * in arrays.
+ */ + __u32 num_syncs; + + /** @syncs: Pointer to struct drm_xe_sync array. */ + __u64 syncs; + + /** @semaphores: Pointer to struct drm_xe_semaphore array. */ + __u64 semaphores; + + /** @reserved: reserved for future use */ + __u64 reserved[2]; +}; + #if defined(__cplusplus) } #endif From patchwork Mon Nov 18 23:37:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879219 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 17A05D60CFD for ; Mon, 18 Nov 2024 23:37:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EBB5510E576; Mon, 18 Nov 2024 23:37:48 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="h9qzgR2+"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 07C3210E585; Mon, 18 Nov 2024 23:37:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973050; x=1763509050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=C02NA37fN9dm3BF7bGK31pqMn+aFCtASLHdm9hCh/k4=; b=h9qzgR2+V3epog6qoHy9w/BMBRnIGSdiij6ePFyKQPcapEKtJsRUmKRB UwlXYUiEUn6wduV9V/REmFMvonUftD91rb2g/g14d8r2kugjXAVtSlvic hvyFtGPhChk6XiPNGIe0aNVb3UBi/DCVduf45mKdz1+spVTk2a0CSc62b ucOKPKQ8LFTe50lj8gTBEfcsgIJ3TyE6GlS3sC0AX7yZ0L6X/6Mp64+ZY PID3aq1V6IjhfFMP3Rbs87RwRKaIJ1XEvSHiZ/fy07YSEYIwkplIsea1r 1JWXqyLxoBWkkSayoNm4yvka/X/NV11iAsaNYImAqGJ6y9khsQLUfW2go w==; X-CSE-ConnectionGUID: 23QZUSSJSrOiUsUfTuYNvA== X-CSE-MsgGUID: g4Dn8+UrT7yibgruFP+btA== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31878998" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31878998" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 X-CSE-ConnectionGUID: ZjHVHXT9QDaHpvEo7aDu5A== X-CSE-MsgGUID: VLvUIeWJSt+IcI6yzvRDcA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521759" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:29 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 23/29] drm/xe: Add user fence IRQ handler Date: Mon, 18 Nov 2024 15:37:51 -0800 Message-Id: <20241118233757.2374041-24-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct 
Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Imported user fences will not be tied to a specific queue or hardware engine class. Therefore, a device IRQ handler is needed to signal the associated exported DMA fences. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_device.c | 4 ++++ drivers/gpu/drm/xe/xe_device_types.h | 3 +++ drivers/gpu/drm/xe/xe_hw_engine.c | 4 +++- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index bbdff4308b2e..573b5f3df0c8 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -39,6 +39,7 @@ #include "xe_gt_sriov_vf.h" #include "xe_guc.h" #include "xe_hw_engine_group.h" +#include "xe_hw_fence.h" #include "xe_hwmon.h" #include "xe_irq.h" #include "xe_memirq.h" @@ -902,6 +903,7 @@ int xe_device_probe(struct xe_device *xe) if (err) goto err; + xe_hw_fence_irq_init(&xe->user_fence_irq); for_each_gt(gt, xe, id) { last_gt = id; @@ -944,6 +946,7 @@ int xe_device_probe(struct xe_device *xe) xe_oa_fini(xe); err_fini_gt: + xe_hw_fence_irq_finish(&xe->user_fence_irq); for_each_gt(gt, xe, id) { if (id < last_gt) xe_gt_remove(gt); @@ -979,6 +982,7 @@ void xe_device_remove(struct xe_device *xe) xe_heci_gsc_fini(xe); + xe_hw_fence_irq_finish(&xe->user_fence_irq); for_each_gt(gt, xe, id) xe_gt_remove(gt); } diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 8592f1b02db1..3ac118c6f85e 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -507,6 +507,9 @@ struct xe_device { int mode; } wedged; + /** @user_fence_irq: User fence IRQ handler */ + struct xe_hw_fence_irq user_fence_irq; + #ifdef TEST_VM_OPS_ERROR /** * @vm_inject_error_position: inject errors at different places in VM diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index c4b0dc3be39c..2c9aa5343971 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -822,8 +822,10 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec) if (hwe->irq_handler) hwe->irq_handler(hwe, intr_vec); - if (intr_vec & GT_RENDER_USER_INTERRUPT) + if (intr_vec & GT_RENDER_USER_INTERRUPT) { + xe_hw_fence_irq_run(&gt_to_xe(hwe->gt)->user_fence_irq); xe_hw_fence_irq_run(hwe->fence_irq); + } } /** From patchwork Mon Nov 18 23:37:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879208 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9615DD60CFD for ; Mon, 18 Nov 2024 23:37:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 848CF10E599; Mon, 18 Nov 2024 23:37:32 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="KqhQzOLF"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4775110E585; Mon, 18
Nov 2024 23:37:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973050; x=1763509050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iQ6a/LouePdc9Se4be/vtSdyyt0+f/h9tau0Yc4zn40=; b=KqhQzOLFGs2vTMPKNDT2yUtg5I+0szvMPrKF/IjfcGjez5gMzUdFzlWa jUZmctFq4YQlz3N1q8RmAtTYlDUfAxj7se62PiWKMLgDU/zBxYLBUUYhi MekaqSQ443c1HtPAQhd97yYYWSCLgs1rI7ma9wzAmQldt7ZnJcc9BbZK6 s5z6CsrPRetVwoS4I4TTjF6xpT9zDL1t8l7lQ8br+yS6P9MUg8EiEw2L+ S7kMRgFz54tj4ICpPepdbor6KTM0vyAL6uXigMsgjb8OEAg5pjSdSAY/S cHC+Iuejao//cocr8ehce3rVGVmRXrtNcrXQjIZHfZO7Q4oXr4RwmH6wT Q==; X-CSE-ConnectionGUID: /TEUt9GISwmI43Z2W/SlMA== X-CSE-MsgGUID: 9RSKHzQSS5WCZrCg6ahpow== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31879006" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31879006" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:30 -0800 X-CSE-ConnectionGUID: k4yaSghsRkO7WL2M2VwqCg== X-CSE-MsgGUID: QWrVrlftRVSAwhup1vC3IA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521762" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:30 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 24/29] drm/xe: Add xe_hw_fence_user_init Date: Mon, 18 Nov 2024 15:37:52 -0800 Message-Id: <20241118233757.2374041-25-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add xe_hw_fence_user_init, which can create a struct xe_hw_fence from user input rather than from internal LRC state. This is used to import user fences and export them as dma-fences.
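
For illustration, a minimal sketch of how a caller might wrap a user semaphore (bo + offset + target seqno) with this helper. It mirrors the xe_sync usage later in this series; xe_hw_fence_alloc() from earlier in the series is assumed, error handling is trimmed, and the bo is assumed to be vmapped:

static struct dma_fence *user_semaphore_to_fence(struct xe_device *xe,
						 struct xe_bo *bo,
						 u64 offset, u64 seqno)
{
	struct iosys_map vmap = bo->vmap;
	struct dma_fence *fence;

	fence = xe_hw_fence_alloc();
	if (IS_ERR(fence))
		return fence;

	/* Point the seqno map at the semaphore value inside the bo */
	iosys_map_incr(&vmap, offset);

	/*
	 * The fence is signaled once the memory at vmap reaches seqno,
	 * driven by the device-level user_fence_irq handler added in
	 * the earlier patch of this series.
	 */
	xe_hw_fence_user_init(fence, xe, vmap, seqno);

	return fence;
}
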
Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_hw_fence.c | 17 +++++++++++++++++ drivers/gpu/drm/xe/xe_hw_fence.h | 3 +++ 2 files changed, 20 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c index 0b4f12be3692..2ea4d8bca6eb 100644 --- a/drivers/gpu/drm/xe/xe_hw_fence.c +++ b/drivers/gpu/drm/xe/xe_hw_fence.c @@ -263,3 +263,20 @@ void xe_hw_fence_init(struct dma_fence *fence, struct xe_hw_fence_ctx *ctx, trace_xe_hw_fence_create(hw_fence); } + +void xe_hw_fence_user_init(struct dma_fence *fence, struct xe_device *xe, + struct iosys_map seqno_map, u64 seqno) +{ + struct xe_hw_fence *hw_fence = + container_of(fence, typeof(*hw_fence), dma); + + hw_fence->xe = xe; + snprintf(hw_fence->name, sizeof(hw_fence->name), "user"); + hw_fence->seqno_map = seqno_map; + + INIT_LIST_HEAD(&hw_fence->irq_link); + dma_fence_init(fence, &xe_hw_fence_ops, &xe->user_fence_irq.lock, + dma_fence_context_alloc(1), seqno); + + trace_xe_hw_fence_create(hw_fence); +} diff --git a/drivers/gpu/drm/xe/xe_hw_fence.h b/drivers/gpu/drm/xe/xe_hw_fence.h index f13a1c4982c7..76571ef2ef36 100644 --- a/drivers/gpu/drm/xe/xe_hw_fence.h +++ b/drivers/gpu/drm/xe/xe_hw_fence.h @@ -30,4 +30,7 @@ void xe_hw_fence_free(struct dma_fence *fence); void xe_hw_fence_init(struct dma_fence *fence, struct xe_hw_fence_ctx *ctx, struct iosys_map seqno_map); +void xe_hw_fence_user_init(struct dma_fence *fence, struct xe_device *xe, + struct iosys_map seqno_map, u64 seqno); + #endif From patchwork Mon Nov 18 23:37:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 627D4D60D00 for ; Mon, 18 Nov 2024 23:37:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B955C10E59D; Mon, 18 Nov 2024 23:37:37 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="gJZ2BAEj"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 96F9F10E589; Mon, 18 Nov 2024 23:37:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973050; x=1763509050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vE/DWvj18UUkeQDzMSwfAfb5jQfsXWBL099DaFUoIv0=; b=gJZ2BAEjj+djXgJ5rpMcZ2uTsKqM/IhhoTYWfRJctl7jxafsqrGDFx52 90+4ef2ayKXxwlWfcY3G1Z1tEgojoKFyZWbAytiDshxkHr2c0m0D5//n6 qYN+qXEr97nsI2dX2m/w30V5BKhOKyT8DVcS9LptBDtF2XId6dPzA1itc KLO2kzf8wXqKClMgyU5sTmfcLKc3VeVi/B64iTVBQchcZC4kLb6U8WhMI rBxFAAyHDHEkKQeSedOCxmaXHBwYWz6euJkzpTLw7G/577D3Vd8PN7WWt wQp+z6YybatHZi5cOSe1CpKObjekHBN14SDtqj5ESARnFsDsYTUsim1n2 A==; X-CSE-ConnectionGUID: 4Dh34+zySlq5UYX7MUtgfA== X-CSE-MsgGUID: ZCXsLvouRsSeQSxkggLF4Q== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31879013" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31879013" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:30 -0800 X-CSE-ConnectionGUID: jRODvD4ESWWes/WGO2Ozug== X-CSE-MsgGUID: e03epWlYQrq/muXEinFyaA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521766" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:30 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 25/29] drm/xe: Add a message lock to the Xe GPU scheduler Date: Mon, 18 Nov 2024 15:37:53 -0800 Message-Id: <20241118233757.2374041-26-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Stop abusing the job list lock for messages; use a dedicated lock. This lock will soon be taken in IRQ contexts, so irqsave is used for simplicity. This can be tweaked in a follow-up as needed. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_gpu_scheduler.c | 19 ++++++++++++------- drivers/gpu/drm/xe/xe_gpu_scheduler.h | 12 ++++-------- drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 ++ drivers/gpu/drm/xe/xe_guc_submit.c | 15 +++++++++------ 4 files changed, 27 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c index 50361b4638f9..55ccfb587523 100644 --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c @@ -14,25 +14,27 @@ static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched) static void xe_sched_process_msg_queue_if_ready(struct xe_gpu_scheduler *sched) { struct xe_sched_msg *msg; + unsigned long flags; - xe_sched_msg_lock(sched); + xe_sched_msg_lock(sched, flags); msg = list_first_entry_or_null(&sched->msgs, struct xe_sched_msg, link); if (msg) xe_sched_process_msg_queue(sched); - xe_sched_msg_unlock(sched); + xe_sched_msg_unlock(sched, flags); } static struct xe_sched_msg * xe_sched_get_msg(struct xe_gpu_scheduler *sched) { struct xe_sched_msg *msg; + unsigned long flags; - xe_sched_msg_lock(sched); + xe_sched_msg_lock(sched, flags); msg = list_first_entry_or_null(&sched->msgs, struct xe_sched_msg, link); if (msg) list_del_init(&msg->link); - xe_sched_msg_unlock(sched); + xe_sched_msg_unlock(sched, flags); return msg; } @@ -64,6 +66,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched, struct device *dev) { sched->ops = xe_ops; + spin_lock_init(&sched->msg_lock); INIT_LIST_HEAD(&sched->msgs); INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work); @@ -98,15 +101,17 @@ void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched) void xe_sched_add_msg(struct xe_gpu_scheduler *sched, struct xe_sched_msg *msg) { - xe_sched_msg_lock(sched); + unsigned long flags; + + xe_sched_msg_lock(sched, flags); xe_sched_add_msg_locked(sched, msg); -
xe_sched_msg_unlock(sched); + xe_sched_msg_unlock(sched, flags); } void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched, struct xe_sched_msg *msg) { - lockdep_assert_held(&sched->base.job_list_lock); + lockdep_assert_held(&sched->msg_lock); list_add_tail(&msg->link, &sched->msgs); xe_sched_process_msg_queue(sched); diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h index c250ea773491..3238de26dcfe 100644 --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h @@ -29,15 +29,11 @@ void xe_sched_add_msg(struct xe_gpu_scheduler *sched, void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched, struct xe_sched_msg *msg); -static inline void xe_sched_msg_lock(struct xe_gpu_scheduler *sched) -{ - spin_lock(&sched->base.job_list_lock); -} +#define xe_sched_msg_lock(sched, flags) \ + spin_lock_irqsave(&sched->msg_lock, flags) -static inline void xe_sched_msg_unlock(struct xe_gpu_scheduler *sched) -{ - spin_unlock(&sched->base.job_list_lock); -} +#define xe_sched_msg_unlock(sched, flags) \ + spin_unlock_irqrestore(&sched->msg_lock, flags) static inline void xe_sched_stop(struct xe_gpu_scheduler *sched) { diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h index 6731b13da8bb..c8e0352ef941 100644 --- a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h @@ -47,6 +47,8 @@ struct xe_gpu_scheduler { const struct xe_sched_backend_ops *ops; /** @msgs: list of messages to be processed in @work_process_msg */ struct list_head msgs; + /** @msg_lock: Lock for messages */ + spinlock_t msg_lock; /** @work_process_msg: processes messages */ struct work_struct work_process_msg; }; diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 82071a0ec91e..3efd2000c0a2 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -1704,14 +1704,15 @@ static int guc_exec_queue_suspend(struct xe_exec_queue *q) { struct xe_gpu_scheduler *sched = &q->guc->sched; struct xe_sched_msg *msg = q->guc->static_msgs + STATIC_MSG_SUSPEND; + unsigned long flags; if (exec_queue_killed_or_banned_or_wedged(q)) return -EINVAL; - xe_sched_msg_lock(sched); + xe_sched_msg_lock(sched, flags); if (guc_exec_queue_try_add_msg(q, msg, SUSPEND)) q->guc->suspend_pending = true; - xe_sched_msg_unlock(sched); + xe_sched_msg_unlock(sched, flags); return 0; } @@ -1751,30 +1752,32 @@ static void guc_exec_queue_resume(struct xe_exec_queue *q) struct xe_gpu_scheduler *sched = &q->guc->sched; struct xe_sched_msg *msg = q->guc->static_msgs + STATIC_MSG_RESUME; struct xe_guc *guc = exec_queue_to_guc(q); + unsigned long flags; xe_gt_assert(guc_to_gt(guc), !q->guc->suspend_pending); - xe_sched_msg_lock(sched); + xe_sched_msg_lock(sched, flags); guc_exec_queue_try_add_msg(q, msg, RESUME); - xe_sched_msg_unlock(sched); + xe_sched_msg_unlock(sched, flags); } static void guc_exec_queue_kill_user(struct xe_exec_queue *q) { struct xe_gpu_scheduler *sched = &q->guc->sched; struct xe_sched_msg *msg = q->guc->static_msgs + STATIC_MSG_KILL_USER; + unsigned long flags; if (exec_queue_extra_ref(q)) return; set_exec_queue_banned(q); - xe_sched_msg_lock(sched); + xe_sched_msg_lock(sched, flags); if (guc_exec_queue_try_add_msg(q, msg, KILL_USER)) { set_exec_queue_extra_ref(q); xe_exec_queue_get(q); } - xe_sched_msg_unlock(sched); + xe_sched_msg_unlock(sched, flags); } static bool guc_exec_queue_reset_status(struct xe_exec_queue *q) From 
patchwork Mon Nov 18 23:37:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879220 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D2333D60CF8 for ; Mon, 18 Nov 2024 23:37:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2976D10E29D; Mon, 18 Nov 2024 23:37:52 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="PEOAep6O"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id E295410E58F; Mon, 18 Nov 2024 23:37:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973051; x=1763509051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qor/rz+zhJl8ucNnVdsiBtpFdqatd8c7wJgJh3n3K7A=; b=PEOAep6OxJKRv6vrD1HmJ6fdZVqGCiUs7DQAlp5xESug4t58EVhD0tNw AU82wvg530ndAhy7iSTqMTq3xy1BhEbAXjrfsKnkn/HvGjU13yFnW/HLt pFAJvySD4ZwWhvTG6W2RN9UYnXSgILJUPO9DjnD8ZFpVkzPV6P1hW9ffX V1rdki/EpUTwK9lNAO1z5WnAOPALiCL1ELBs6o7Fe24gpKuAvQeAkoLAS z3/FxpLVVhjhvu5E2KNJpzcAz1lWykwnhY10rgWNvUi1W1G+GBO87IK0C LJ3g9xP1as0TCMNkR18dg+WPthgmUXIrlwssIx/niJO0Pb9F68RxSJt6u g==; X-CSE-ConnectionGUID: l7TRPBVDQMKUcurHPTAh0A== X-CSE-MsgGUID: W/G8w5etT9Wlo48+QjGwUg== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31879020" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31879020" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:30 -0800 X-CSE-ConnectionGUID: mv8Ar+PbS8qB3xQ3cSvYhQ== X-CSE-MsgGUID: wetTYL7dT2ic7HmwkopKZg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521771" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:30 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 26/29] drm/xe: Always wait on preempt fences in vma_check_userptr Date: Mon, 18 Nov 2024 15:37:54 -0800 Message-Id: <20241118233757.2374041-27-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" The assumption that only a VM in preempt fence mode has preempt fences attached is not true;
preempt fences can be attached to a dma-resv VM if user queues are open. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_pt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c index a75667346ab3..1efe17b0b1f8 100644 --- a/drivers/gpu/drm/xe/xe_pt.c +++ b/drivers/gpu/drm/xe/xe_pt.c @@ -1231,7 +1231,7 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma, &vm->userptr.invalidated); spin_unlock(&vm->userptr.invalidated_lock); - if (xe_vm_in_preempt_fence_mode(vm)) { + if (vm->preempt.num_exec_queues) { struct dma_resv_iter cursor; struct dma_fence *fence; long err; From patchwork Mon Nov 18 23:37:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879215 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8C50DD60CF8 for ; Mon, 18 Nov 2024 23:37:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0F6A110E5BA; Mon, 18 Nov 2024 23:37:35 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="TXPX2W67"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 33D0889F4F; Mon, 18 Nov 2024 23:37:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973051; x=1763509051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0osXRmnKpiY3+lFHIipsNkgn0vpsvm78bLb55LfZ/Ws=; b=TXPX2W67rNSqpNY+C4R0VFEE+bCWAtrgngd2EE6TZueXowa5uTklMYjb IMAi9JOgcmwZO6tI6+B/MV2ynK0nDrPGNabKWERjGt0MrQ0LxSDwNoX7N HHh3sadLZWw3GZgja2H6Y9NAZcCkhPidZqeBCWEGITp/V6DyvlzXKGSfx okGMNX6jucjRuV8PE/BQmzyaGNSVWhthgxAcOfMsXbnEdcADBCVZr6q7F 5ewIUbvFHrElUHmwyk1vExe6V3PomYU4NmODfYEC5cBSQKSyaVnr7Da10 o61DQ+eCicwvWAOXpPzMAPpHYtS91JJN5ypVURatk1qcxJ0/tLQpgT2K1 Q==; X-CSE-ConnectionGUID: 9eK1QwB4TLShL1Qp/ozEPg== X-CSE-MsgGUID: 55ANa1bgT4uIvWeddDO1TA== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31879026" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31879026" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:31 -0800 X-CSE-ConnectionGUID: Tom5VpZIQDG9eqGJ34LnpA== X-CSE-MsgGUID: RkdrvsVOQm2ILJQRG5eHsQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521775" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:31 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 27/29] drm/xe: Teach xe_sync layer about drm_xe_semaphore 
Date: Mon, 18 Nov 2024 15:37:55 -0800 Message-Id: <20241118233757.2374041-28-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Teach the xe_sync layer about drm_xe_semaphore, which is used to import and export user fences. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_sync.c | 90 ++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_sync.h | 8 +++ drivers/gpu/drm/xe/xe_sync_types.h | 5 +- 3 files changed, 102 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c index 42f5bebd09e5..ac4510ad52a9 100644 --- a/drivers/gpu/drm/xe/xe_sync.c +++ b/drivers/gpu/drm/xe/xe_sync.c @@ -6,6 +6,7 @@ #include "xe_sync.h" #include +#include #include #include #include @@ -14,11 +15,15 @@ #include #include +#include "xe_bo.h" #include "xe_device_types.h" #include "xe_exec_queue.h" +#include "xe_hw_fence.h" #include "xe_macros.h" #include "xe_sched_job_types.h" +#define IS_UNINSTALLED_HW_FENCE BIT(31) + struct xe_user_fence { struct xe_device *xe; struct kref refcount; @@ -211,6 +216,74 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef, return 0; } +int xe_sync_semaphore_parse(struct xe_device *xe, struct xe_file *xef, + struct xe_sync_entry *sync, + struct drm_xe_semaphore __user *semaphore_user, + unsigned int flags) +{ + struct drm_xe_semaphore semaphore_in; + struct drm_gem_object *gem_obj; + struct xe_bo *bo; + + if (copy_from_user(&semaphore_in, semaphore_user, + sizeof(*semaphore_user))) + return -EFAULT; + + if (XE_IOCTL_DBG(xe, semaphore_in.offset & 0x7 || + !semaphore_in.handle || semaphore_in.token || + semaphore_in.reserved[0] || semaphore_in.reserved[1])) + return -EINVAL; + + gem_obj = drm_gem_object_lookup(xef->drm, semaphore_in.handle); + if (XE_IOCTL_DBG(xe, !gem_obj)) + return -ENOENT; + + bo = gem_to_xe_bo(gem_obj); + + if (XE_IOCTL_DBG(xe, bo->size < semaphore_in.offset)) { + xe_bo_put(bo); + return -EINVAL; + } + + if (flags & DRM_XE_SYNC_FLAG_SIGNAL) { + struct iosys_map vmap = sync->bo->vmap; + struct dma_fence *fence; + + sync->chain_fence = dma_fence_chain_alloc(); + if (!sync->chain_fence) { + xe_bo_put(bo); + dma_fence_chain_free(sync->chain_fence); + return -ENOMEM; + } + + fence = xe_hw_fence_alloc(); + if (IS_ERR(fence)) { + xe_bo_put(bo); + return PTR_ERR(fence); + } + + vmap = bo->vmap; + iosys_map_incr(&vmap, semaphore_in.offset); + + xe_hw_fence_user_init(fence, xe, vmap, semaphore_in.seqno); + sync->fence = fence; + sync->flags = IS_UNINSTALLED_HW_FENCE; + } else { + sync->user_fence = dma_fence_user_fence_alloc(); + if (XE_IOCTL_DBG(xe, !sync->user_fence)) { + xe_bo_put(bo); + return PTR_ERR(sync->ufence); + } + + sync->addr = semaphore_in.offset; + sync->timeline_value = semaphore_in.seqno; + sync->flags = DRM_XE_SYNC_FLAG_SIGNAL; + } + sync->bo = bo; + + return 0; +} + int xe_sync_entry_add_deps(struct xe_sync_entry *sync, struct xe_sched_job *job) { if (sync->fence) @@ -249,17 +322,34 @@ void xe_sync_entry_signal(struct xe_sync_entry *sync, struct dma_fence *fence) user_fence_put(sync->ufence); dma_fence_put(fence); } + } else if (sync->user_fence) { + struct iosys_map vmap
= sync->bo->vmap; + + iosys_map_incr(&vmap, sync->addr); + dma_fence_user_fence_attach(fence, sync->user_fence, + &vmap, sync->timeline_value); + sync->user_fence = NULL; } } +void xe_sync_entry_hw_fence_installed(struct xe_sync_entry *sync) +{ + sync->flags &= ~IS_UNINSTALLED_HW_FENCE; +} + void xe_sync_entry_cleanup(struct xe_sync_entry *sync) { if (sync->syncobj) drm_syncobj_put(sync->syncobj); + xe_bo_put(sync->bo); + if (sync->flags & IS_UNINSTALLED_HW_FENCE) + dma_fence_set_error(sync->fence, -ECANCELED); dma_fence_put(sync->fence); dma_fence_chain_free(sync->chain_fence); if (sync->ufence) user_fence_put(sync->ufence); + if (sync->user_fence) + dma_fence_user_fence_free(sync->user_fence); } /** diff --git a/drivers/gpu/drm/xe/xe_sync.h b/drivers/gpu/drm/xe/xe_sync.h index 256ffc1e54dc..fd56929e37cc 100644 --- a/drivers/gpu/drm/xe/xe_sync.h +++ b/drivers/gpu/drm/xe/xe_sync.h @@ -8,6 +8,9 @@ #include "xe_sync_types.h" +struct drm_xe_semaphore; +struct drm_xe_sync; + struct xe_device; struct xe_exec_queue; struct xe_file; @@ -22,10 +25,15 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef, struct xe_sync_entry *sync, struct drm_xe_sync __user *sync_user, unsigned int flags); +int xe_sync_semaphore_parse(struct xe_device *xe, struct xe_file *xef, + struct xe_sync_entry *sync, + struct drm_xe_semaphore __user *semaphore_user, + unsigned int flags); int xe_sync_entry_add_deps(struct xe_sync_entry *sync, struct xe_sched_job *job); void xe_sync_entry_signal(struct xe_sync_entry *sync, struct dma_fence *fence); +void xe_sync_entry_hw_fence_installed(struct xe_sync_entry *sync); void xe_sync_entry_cleanup(struct xe_sync_entry *sync); struct dma_fence * xe_sync_in_fence_get(struct xe_sync_entry *sync, int num_sync, diff --git a/drivers/gpu/drm/xe/xe_sync_types.h b/drivers/gpu/drm/xe/xe_sync_types.h index 30ac3f51993b..28e846c29122 100644 --- a/drivers/gpu/drm/xe/xe_sync_types.h +++ b/drivers/gpu/drm/xe/xe_sync_types.h @@ -11,14 +11,17 @@ struct drm_syncobj; struct dma_fence; struct dma_fence_chain; -struct drm_xe_sync; +struct dma_fence_user_fence; struct user_fence; +struct xe_bo; struct xe_sync_entry { struct drm_syncobj *syncobj; struct dma_fence *fence; struct dma_fence_chain *chain_fence; struct xe_user_fence *ufence; + struct dma_fence_user_fence *user_fence; + struct xe_bo *bo; u64 addr; u64 timeline_value; u32 type; From patchwork Mon Nov 18 23:37:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 86686D60D06 for ; Mon, 18 Nov 2024 23:37:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7925F10E5B6; Mon, 18 Nov 2024 23:37:37 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="NM1yHNfx"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 94EE110E56D; Mon, 18 Nov 2024 23:37:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; 
q=dns/txt; s=Intel; t=1731973051; x=1763509051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XTgAsegDjFV05M1nrJP912+yjN4Cgy2+lhTXrEG8ScA=; b=NM1yHNfxL0rOyijXvVyXLD4fJCrpApgSGGJxtaKmTbaayowJo92/D4om JVbToBV4inT+ZJHpNAifCu1b5OLuOyrABmz4OelAYreSbFvi4KoQpHanc yhNiXbj1bF2rcqgqgneBm25QK6hv/A3UVrrSzYgSwDz0Y23BxhUm9TfUy PrPYu9vx9wRNLoP7qnED8Ky+Zv59KMW9dCfKpebzZHeDXadrpnt81IE2+ aUjQ1juK5eIXujg6OnYncvdh7X3MUP9u0VvnlRNiMdIA/Z6v1SfZnb3Qt nyKAbsciU/ZyRmWAENhYCtFYle4Hs4lBp9dhlllAYSRWDwaVJnWCFK4uR Q==; X-CSE-ConnectionGUID: BfOCh3fjR6+TJWTyxqCMQA== X-CSE-MsgGUID: N5OBIMm5TGi1Q+QynKcD6A== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31879032" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31879032" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:31 -0800 X-CSE-ConnectionGUID: ZqBJvNbJThChO9X9u8LOXw== X-CSE-MsgGUID: O0jIXyqFSEuCYEMugpxLfA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521778" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:31 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 28/29] drm/xe: Add VM convert fence IOCTL Date: Mon, 18 Nov 2024 15:37:56 -0800 Message-Id: <20241118233757.2374041-29-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Basically a version of the resume worker which also converts user syncs to kernel syncs (dma-fences) and vice versa. The exported dma-fences in the conversion guard against preemption, which is required to avoid breaking dma-fence rules (no memory allocations in the signaling path of a dma-fence, while resume requires memory allocations).
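
As an illustration of the intended uapi flow, a hedged userspace sketch of the export direction (user semaphore -> drm_syncobj) follows. The ioctl request macro name and the exact drm_xe_semaphore layout are assumptions inferred from this series' uapi additions, not final uapi:

#include <stdint.h>
#include <sys/ioctl.h>
#include "xe_drm.h"	/* uapi from this series; names assumed */

/* Export: turn a user semaphore (bo + offset + seqno) into a syncobj fence */
static int export_user_fence(int fd, uint32_t vm_id, uint32_t bo_handle,
			     uint64_t offset, uint64_t seqno,
			     uint32_t syncobj_handle)
{
	struct drm_xe_semaphore sem = {
		.handle = bo_handle,	/* GEM bo holding the semaphore value */
		.offset = offset,	/* must be 8-byte aligned */
		.seqno = seqno,		/* fence signals when value >= seqno */
	};
	struct drm_xe_sync sync = {
		.type = DRM_XE_SYNC_TYPE_SYNCOBJ,
		.flags = DRM_XE_SYNC_FLAG_SIGNAL, /* semaphore -> dma-fence */
		.handle = syncobj_handle,
	};
	struct drm_xe_vm_convert_fence args = {
		.vm_id = vm_id,
		.num_syncs = 1,
		.syncs = (uintptr_t)&sync,
		.semaphores = (uintptr_t)&sem,
	};

	/* DRM_IOCTL_XE_VM_CONVERT_FENCE is assumed from the uapi patch */
	return ioctl(fd, DRM_IOCTL_XE_VM_CONVERT_FENCE, &args);
}

Without DRM_XE_SYNC_FLAG_SIGNAL the conversion runs the other way: the kernel attaches a user fence that writes seqno into the semaphore once the dma-fence carried by the sync entry signals.
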
Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_device.c | 1 + drivers/gpu/drm/xe/xe_preempt_fence.c | 9 + drivers/gpu/drm/xe/xe_vm.c | 247 +++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_vm.h | 2 + drivers/gpu/drm/xe/xe_vm_types.h | 4 + 5 files changed, 254 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index 573b5f3df0c8..56dd26eddd92 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -191,6 +191,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = { DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(XE_VM_CONVERT_FENCE, xe_vm_convert_fence_ioctl, DRM_RENDER_ALLOW), }; static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c index 80a8bc82f3cc..c225f3cc82a3 100644 --- a/drivers/gpu/drm/xe/xe_preempt_fence.c +++ b/drivers/gpu/drm/xe/xe_preempt_fence.c @@ -12,6 +12,14 @@ static struct xe_exec_queue *to_exec_queue(struct dma_fence_preempt *fence) return container_of(fence, struct xe_preempt_fence, base)->q; } +static struct dma_fence * +xe_preempt_fence_preempt_delay(struct dma_fence_preempt *fence) +{ + struct xe_exec_queue *q = to_exec_queue(fence); + + return q->vm->preempt.exported_fence ?: dma_fence_get_stub(); +} + static int xe_preempt_fence_preempt(struct dma_fence_preempt *fence) { struct xe_exec_queue *q = to_exec_queue(fence); @@ -35,6 +43,7 @@ static void xe_preempt_fence_preempt_finished(struct dma_fence_preempt *fence) } static const struct dma_fence_preempt_ops xe_preempt_fence_ops = { + .preempt_delay = xe_preempt_fence_preempt_delay, .preempt = xe_preempt_fence_preempt, .preempt_wait = xe_preempt_fence_preempt_wait, .preempt_finished = xe_preempt_fence_preempt_finished, diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 16bc1b82d950..5078aeea2bd8 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -6,6 +6,7 @@ #include "xe_vm.h" #include +#include #include #include @@ -441,29 +442,44 @@ int xe_vm_validate_rebind(struct xe_vm *vm, struct drm_exec *exec, } static int xe_preempt_work_begin(struct drm_exec *exec, struct xe_vm *vm, - bool *done) + int extra_fence_count, bool *done) { int err; + *done = false; + err = drm_gpuvm_prepare_vm(&vm->gpuvm, exec, 0); if (err) return err; - if (xe_vm_is_idle(vm)) { + if (xe_vm_in_preempt_fence_mode(vm) && xe_vm_is_idle(vm)) { vm->preempt.rebind_deactivated = true; *done = true; return 0; } + err = drm_gpuvm_prepare_objects(&vm->gpuvm, exec, 0); + if (err) + return err; + if (!preempt_fences_waiting(vm)) { *done = true; + + if (extra_fence_count) { + struct drm_gem_object *obj; + unsigned long index; + + drm_exec_for_each_locked_object(exec, index, obj) { + err = dma_resv_reserve_fences(obj->resv, + extra_fence_count); + if (err) + return err; + } + } + return 0; } - err = drm_gpuvm_prepare_objects(&vm->gpuvm, exec, 0); - if (err) - return err; - err = wait_for_existing_preempt_fences(vm); if (err) return err; @@ -474,7 +490,8 @@ static int xe_preempt_work_begin(struct drm_exec *exec, struct xe_vm *vm, * The fence reservation here is intended for the new preempt fences * we attach at the end of the rebind work. 
*/ - return xe_vm_validate_rebind(vm, exec, vm->preempt.num_exec_queues); + return xe_vm_validate_rebind(vm, exec, vm->preempt.num_exec_queues + + extra_fence_count); } static void preempt_rebind_work_func(struct work_struct *w) @@ -509,9 +526,9 @@ static void preempt_rebind_work_func(struct work_struct *w) drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); drm_exec_until_all_locked(&exec) { - bool done = false; + bool done; - err = xe_preempt_work_begin(&exec, vm, &done); + err = xe_preempt_work_begin(&exec, vm, 0, &done); drm_exec_retry_on_contention(&exec); if (err || done) { drm_exec_fini(&exec); @@ -1638,6 +1655,7 @@ static void vm_destroy_work_func(struct work_struct *w) container_of(w, struct xe_vm, destroy_work); struct xe_device *xe = vm->xe; struct xe_tile *tile; + struct dma_fence *fence; u8 id; /* xe_vm_close_and_put was not called? */ @@ -1660,6 +1678,9 @@ static void vm_destroy_work_func(struct work_struct *w) if (vm->xef) xe_file_put(vm->xef); + dma_fence_chain_for_each(fence, vm->preempt.exported_fence); + dma_fence_put(vm->preempt.exported_fence); + kfree(vm); } @@ -3403,3 +3424,211 @@ void xe_vm_snapshot_free(struct xe_vm_snapshot *snap) } kvfree(snap); } + +static int check_semaphores(struct xe_vm *vm, struct xe_sync_entry *syncs, + struct drm_exec *exec, int num_syncs) +{ + int i, j; + + for (i = 0; i < num_syncs; ++i) { + struct xe_bo *bo = syncs[i].bo; + struct drm_gem_object *obj = &bo->ttm.base; + + if (bo->vm == vm) + continue; + + for (j = 0; j < exec->num_objects; ++j) { + if (obj == exec->objects[j]) + break; + } + + if (j == exec->num_objects) + return -EINVAL; + } + + return 0; +} + +int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct xe_device *xe = to_xe_device(dev); + struct xe_file *xef = to_xe_file(file); + struct drm_xe_vm_convert_fence __user *args = data; + struct drm_xe_sync __user *syncs_user; + struct drm_xe_semaphore __user *semaphores_user; + struct xe_sync_entry *syncs = NULL; + struct xe_vm *vm; + int err = 0, i, num_syncs = 0; + bool done = false; + struct drm_exec exec; + unsigned int fence_count = 0; + LIST_HEAD(preempt_fences); + ktime_t end = 0; + long wait; + int __maybe_unused tries = 0; + struct dma_fence *fence, *prev = NULL; + + if (XE_IOCTL_DBG(xe, args->extensions || args->flags || + args->reserved[0] || args->reserved[1] || + args->pad)) + return -EINVAL; + + vm = xe_vm_lookup(xef, args->vm_id); + if (XE_IOCTL_DBG(xe, !vm)) + return -EINVAL; + + err = down_write_killable(&vm->lock); + if (err) + goto put_vm; + + if (XE_IOCTL_DBG(xe, xe_vm_is_closed_or_banned(vm))) { + err = -ENOENT; + goto release_vm_lock; + } + + syncs = kcalloc(args->num_syncs * 2, sizeof(*syncs), GFP_KERNEL); + if (!syncs) { + err = -ENOMEM; + goto release_vm_lock; + } + + syncs_user = u64_to_user_ptr(args->syncs); + semaphores_user = u64_to_user_ptr(args->semaphores); + for (i = 0; i < args->num_syncs; i++, num_syncs++) { + struct xe_sync_entry *sync = &syncs[i]; + struct xe_sync_entry *semaphore_sync = + &syncs[args->num_syncs + i]; + + err = xe_sync_entry_parse(xe, xef, sync, &syncs_user[i], + SYNC_PARSE_FLAG_DISALLOW_USER_FENCE); + if (err) + goto release_syncs; + + err = xe_sync_semaphore_parse(xe, xef, semaphore_sync, + &semaphores_user[i], + sync->flags); + if (err) { + xe_sync_entry_cleanup(&syncs[i]); + goto release_syncs; + } + } + +retry: + if (xe_vm_userptr_check_repin(vm)) { + err = xe_vm_userptr_pin(vm); + if (err) + goto release_syncs; + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 
0); + + drm_exec_until_all_locked(&exec) { + err = xe_preempt_work_begin(&exec, vm, num_syncs, &done); + drm_exec_retry_on_contention(&exec); + if (err) { + drm_exec_fini(&exec); + if (err && xe_vm_validate_should_retry(&exec, err, &end)) + err = -EAGAIN; + + goto release_syncs; + } + } + + if (XE_IOCTL_DBG(xe, check_semaphores(vm, syncs + num_syncs, + &exec, num_syncs))) { + err = -EINVAL; + goto out_unlock; + } + + if (!done) { + err = alloc_preempt_fences(vm, &preempt_fences, &fence_count); + if (err) + goto out_unlock; + + wait = dma_resv_wait_timeout(xe_vm_resv(vm), + DMA_RESV_USAGE_KERNEL, + false, MAX_SCHEDULE_TIMEOUT); + if (wait <= 0) { + err = -ETIME; + goto out_unlock; + } + } + +#define retry_required(__tries, __vm) \ + (IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT) ? \ + (!(__tries)++ || __xe_vm_userptr_needs_repin(__vm)) : \ + __xe_vm_userptr_needs_repin(__vm)) + + down_read(&vm->userptr.notifier_lock); + if (retry_required(tries, vm)) { + up_read(&vm->userptr.notifier_lock); + err = -EAGAIN; + goto out_unlock; + } + +#undef retry_required + + /* Point of no return. */ + xe_assert(vm->xe, list_empty(&vm->rebind_list)); + + for (i = 0; i < num_syncs; i++) { + struct xe_sync_entry *sync = &syncs[i]; + struct xe_sync_entry *semaphore_sync = &syncs[num_syncs + i]; + + if (sync->flags & DRM_XE_SYNC_FLAG_SIGNAL) { + xe_sync_entry_signal(sync, semaphore_sync->fence); + xe_sync_entry_hw_fence_installed(semaphore_sync); + + dma_fence_put(prev); + prev = dma_fence_get(vm->preempt.exported_fence); + + dma_fence_chain_init(semaphore_sync->chain_fence, + prev, semaphore_sync->fence, + vm->preempt.seqno++); + + vm->preempt.exported_fence = + &semaphore_sync->chain_fence->base; + semaphore_sync->chain_fence = NULL; + + semaphore_sync->fence = NULL; /* Ref owned by chain */ + } else { + xe_sync_entry_signal(semaphore_sync, sync->fence); + drm_gpuvm_resv_add_fence(&vm->gpuvm, &exec, + dma_fence_chain_contained(sync->fence), + DMA_RESV_USAGE_BOOKKEEP, + DMA_RESV_USAGE_BOOKKEEP); + } + } + + dma_fence_chain_for_each(fence, prev); + dma_fence_put(prev); + + if (!done) { + spin_lock(&vm->xe->ttm.lru_lock); + ttm_lru_bulk_move_tail(&vm->lru_bulk_move); + spin_unlock(&vm->xe->ttm.lru_lock); + + arm_preempt_fences(vm, &preempt_fences); + resume_and_reinstall_preempt_fences(vm, &exec); + } + up_read(&vm->userptr.notifier_lock); + +out_unlock: + drm_exec_fini(&exec); +release_syncs: + while (err != -EAGAIN && num_syncs--) { + xe_sync_entry_cleanup(&syncs[num_syncs]); + xe_sync_entry_cleanup(&syncs[args->num_syncs + num_syncs]); + } +release_vm_lock: + if (err == -EAGAIN) + goto retry; + up_write(&vm->lock); +put_vm: + xe_vm_put(vm); + free_preempt_fences(&preempt_fences); + kfree(syncs); + + return err; +} diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h index 4391dbaeba51..c1c70239cc91 100644 --- a/drivers/gpu/drm/xe/xe_vm.h +++ b/drivers/gpu/drm/xe/xe_vm.h @@ -181,6 +181,8 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data, struct drm_file *file); int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); void xe_vm_close_and_put(struct xe_vm *vm); diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index 7f9a303e51d8..c5cb83722706 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -254,6 +254,10 @@ struct xe_vm { * BOs */ struct work_struct rebind_work; + /** @seqno: Seqno of exported 
dma-fences */ + u64 seqno; + /** @exported_fence: Chain of exported dma-fences */ + struct dma_fence *exported_fence; } preempt; /** @um: unified memory state */ From patchwork Mon Nov 18 23:37:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13879211 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 59240D60D0B for ; Mon, 18 Nov 2024 23:37:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 613C710E5B5; Mon, 18 Nov 2024 23:37:33 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="QmErJ+oN"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.17]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0241C10E592; Mon, 18 Nov 2024 23:37:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731973052; x=1763509052; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=k6npIro/CDXRVdYry+Z7wVHzj1KL9gz5MwjkQUsk+8Y=; b=QmErJ+oNGVpf3Xo/AyQk3AvaXNx99XveupDettHBdxBnDt+kxUE2l2Jb KLcxHGjBW30bWqV7MnCx1mEMsftPK/b+NH+R3CeTlNhX3j3iJSMOqo9LN YTOhgx6NSygOL19d6X+dTIHqFTs9N6cyTBjEI+5vaM/pNPcgZWWu1X0tq 2LEIT2wVWWkLvWvYGn/AE+y5XtZCdiKolyGPyElV1Cfs6kcu3ycOwQw8m b1ePMOUWbVYAg98jMoy+W2W76DVmLzBJkwxEXY4eBfdsE72OrbmHVnciE nf9Vj9hFtG2S3Bi9D2gwNdvOEjUq2NKmQ1w5ZeEA603Bdx+5uqp430nB/ g==; X-CSE-ConnectionGUID: y3+dy3ahR8uf5aMpLwP5Pw== X-CSE-MsgGUID: zyomD7BiSqON0jIJMLsNTA== X-IronPort-AV: E=McAfee;i="6700,10204,11260"; a="31879040" X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="31879040" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:31 -0800 X-CSE-ConnectionGUID: 4ewmGeeVSXSAonED7NscQw== X-CSE-MsgGUID: fWWSngICRRqLPLlUCOYldQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,165,1728975600"; d="scan'208";a="89521783" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by fmviesa008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2024 15:37:31 -0800 From: Matthew Brost To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kenneth.w.graunke@intel.com, lionel.g.landwerlin@intel.com, jose.souza@intel.com, simona.vetter@ffwll.ch, thomas.hellstrom@linux.intel.com, boris.brezillon@collabora.com, airlied@gmail.com, christian.koenig@amd.com, mihail.atanassov@arm.com, steven.price@arm.com, shashank.sharma@amd.com Subject: [RFC PATCH 29/29] drm/xe: Add user fence TDR Date: Mon, 18 Nov 2024 15:37:57 -0800 Message-Id: <20241118233757.2374041-30-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241118233757.2374041-1-matthew.brost@intel.com> References: <20241118233757.2374041-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We cannot let user fences exported as dma-fences run forever. Add a TDR to protect against this. If the TDR fires, the entire VM is killed, as dma-fences are not tied to an individual queue. Signed-off-by: Matthew Brost --- drivers/gpu/drm/xe/xe_vm.c | 164 +++++++++++++++++++++++++++++-- drivers/gpu/drm/xe/xe_vm_types.h | 22 +++++ 2 files changed, 179 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 5078aeea2bd8..8b475e76bfe0 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -30,6 +30,7 @@ #include "xe_exec_queue.h" #include "xe_gt_pagefault.h" #include "xe_gt_tlb_invalidation.h" +#include "xe_hw_fence.h" #include "xe_migrate.h" #include "xe_pat.h" #include "xe_pm.h" @@ -336,11 +337,15 @@ void xe_vm_kill(struct xe_vm *vm, bool unlocked) if (unlocked) xe_vm_lock(vm, false); - vm->flags |= XE_VM_FLAG_BANNED; - trace_xe_vm_kill(vm); + if (!(vm->flags |= XE_VM_FLAG_BANNED)) { + vm->flags |= XE_VM_FLAG_BANNED; + trace_xe_vm_kill(vm); - list_for_each_entry(q, &vm->preempt.exec_queues, lr.link) - q->ops->kill(q); + list_for_each_entry(q, &vm->preempt.exec_queues, lr.link) + q->ops->kill(q); + + /* TODO: Unmap usermap doorbells */ + } if (unlocked) xe_vm_unlock(vm); @@ -1393,6 +1398,9 @@ static void xe_vm_free_scratch(struct xe_vm *vm) } } +static void userfence_tdr(struct work_struct *w); +static void userfence_kill(struct work_struct *w); + struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) { struct drm_gem_object *vm_resv_obj; @@ -1517,6 +1525,12 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) } } + spin_lock_init(&vm->userfence.lock); + INIT_LIST_HEAD(&vm->userfence.pending_list); + vm->userfence.timeout = HZ * 5; + INIT_DELAYED_WORK(&vm->userfence.tdr, userfence_tdr); + INIT_WORK(&vm->userfence.kill_work, userfence_kill); + if (number_tiles > 1) vm->composite_fence_ctx = dma_fence_context_alloc(1); @@ -1562,6 +1576,9 @@ void xe_vm_close_and_put(struct xe_vm *vm) xe_vm_close(vm); flush_work(&vm->preempt.rebind_work); + flush_delayed_work(&vm->userfence.tdr); + flush_work(&vm->userfence.kill_work); + down_write(&vm->lock); for_each_tile(tile, xe, id) { if (vm->q[id]) @@ -3449,6 +3466,114 @@ static int check_semaphores(struct xe_vm *vm, struct xe_sync_entry *syncs, return 0; } +struct tdr_item { + struct dma_fence *fence; + struct xe_vm *vm; + struct list_head link; + struct dma_fence_cb cb; + u64 deadline; +}; + +static void userfence_kill(struct work_struct *w) +{ + struct xe_vm *vm = + container_of(w, struct xe_vm, userfence.kill_work); + + down_write(&vm->lock); + xe_vm_kill(vm, true); + up_write(&vm->lock); +} + +static void userfence_tdr(struct work_struct *w) +{ + struct xe_vm *vm = + container_of(w, struct xe_vm, userfence.tdr.work); + struct tdr_item *tdr_item; + bool timeout = false, cookie = dma_fence_begin_signalling(); + + xe_hw_fence_irq_stop(&vm->xe->user_fence_irq); + + spin_lock_irq(&vm->userfence.lock); + list_for_each_entry(tdr_item, &vm->userfence.pending_list, link) { + if (!dma_fence_is_signaled(tdr_item->fence)) { + drm_notice(&vm->xe->drm, + "Timedout usermap fence: seqno=%llu, deadline=%llu, jiffies=%llu", + tdr_item->fence->seqno, tdr_item->deadline, + get_jiffies_64()); + dma_fence_set_error(tdr_item->fence, -ETIME); + timeout = true; + vm->userfence.timeout = 0; + } + } + spin_unlock_irq(&vm->userfence.lock); + + xe_hw_fence_irq_start(&vm->xe->user_fence_irq); + + /* + * This is dma-fence signaling path so
we cannot take the locks requires + * to kill a VM. Defer killing to a worker. + */ + if (timeout) + schedule_work(&vm->userfence.kill_work); + + dma_fence_end_signalling(cookie); +} + +static void userfence_fence_cb(struct dma_fence *fence, + struct dma_fence_cb *cb) +{ + struct tdr_item *next, *tdr_item = container_of(cb, struct tdr_item, cb); + struct xe_vm *vm = tdr_item->vm; + struct xe_gt *gt = xe_device_get_gt(vm->xe, 0); + + if (fence) + spin_lock(&vm->userfence.lock); + else + spin_lock_irq(&vm->userfence.lock); + + list_del(&tdr_item->link); + next = list_first_entry_or_null(&vm->userfence.pending_list, + typeof(*next), link); + if (next) + mod_delayed_work(gt->ordered_wq, &vm->userfence.tdr, + next->deadline - get_jiffies_64()); + else + cancel_delayed_work(&vm->userfence.tdr); + + if (fence) + spin_unlock(&vm->userfence.lock); + else + spin_unlock_irq(&vm->userfence.lock); + + dma_fence_put(tdr_item->fence); + xe_vm_put(tdr_item->vm); + kfree(tdr_item); +} + +static void userfence_tdr_add(struct xe_vm *vm, struct tdr_item *tdr_item, + struct dma_fence *fence) +{ + struct xe_gt *gt = xe_device_get_gt(vm->xe, 0); + int ret; + + tdr_item->fence = dma_fence_get(fence); + tdr_item->vm = xe_vm_get(vm); + INIT_LIST_HEAD(&tdr_item->link); + tdr_item->deadline = vm->userfence.timeout + get_jiffies_64(); + + spin_lock_irq(&vm->userfence.lock); + list_add_tail(&tdr_item->link, &vm->userfence.pending_list); + if (list_is_singular(&vm->userfence.pending_list)) + mod_delayed_work(gt->ordered_wq, + &vm->userfence.tdr, + vm->userfence.timeout); + spin_unlock_irq(&vm->userfence.lock); + + ret = dma_fence_add_callback(fence, &tdr_item->cb, userfence_fence_cb); + if (ret == -ENOENT) + userfence_fence_cb(NULL, &tdr_item->cb); +} + int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -3459,6 +3584,7 @@ int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, struct drm_xe_semaphore __user *semaphores_user; struct xe_sync_entry *syncs = NULL; struct xe_vm *vm; + struct tdr_item **tdr_items = NULL; int err = 0, i, num_syncs = 0; bool done = false; struct drm_exec exec; @@ -3493,6 +3619,12 @@ int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, goto release_vm_lock; } + tdr_items = kcalloc(args->num_syncs, sizeof(*tdr_items), GFP_KERNEL); + if (!tdr_items) { + err = -ENOMEM; + goto release_vm_lock; + } + syncs_user = u64_to_user_ptr(args->syncs); semaphores_user = u64_to_user_ptr(args->semaphores); for (i = 0; i < args->num_syncs; i++, num_syncs++) { @@ -3505,6 +3637,15 @@ int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, if (err) goto release_syncs; + if (sync->flags & DRM_XE_SYNC_FLAG_SIGNAL) { + tdr_items[i] = kmalloc(sizeof(struct tdr_item), + GFP_KERNEL); + if (!tdr_items[i]) { + xe_sync_entry_cleanup(&syncs[i]); + goto release_syncs; + } + } + err = xe_sync_semaphore_parse(xe, xef, semaphore_sync, &semaphores_user[i], sync->flags); @@ -3591,6 +3732,10 @@ int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, &semaphore_sync->chain_fence->base; semaphore_sync->chain_fence = NULL; + userfence_tdr_add(vm, tdr_items[i], + semaphore_sync->fence); + tdr_items[i] = 0; + semaphore_sync->fence = NULL; /* Ref owned by chain */ } else { xe_sync_entry_signal(semaphore_sync, sync->fence); @@ -3617,9 +3762,13 @@ int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, out_unlock: drm_exec_fini(&exec); release_syncs: - while (err != -EAGAIN && num_syncs--) { - xe_sync_entry_cleanup(&syncs[num_syncs]); - 
xe_sync_entry_cleanup(&syncs[args->num_syncs + num_syncs]); + if (err != -EAGAIN) { + for (i = 0; i < num_syncs; ++i) + kfree(tdr_items[i]); + while (num_syncs--) { + xe_sync_entry_cleanup(&syncs[num_syncs]); + xe_sync_entry_cleanup(&syncs[args->num_syncs + num_syncs]); + } } release_vm_lock: if (err == -EAGAIN) @@ -3629,6 +3778,7 @@ int xe_vm_convert_fence_ioctl(struct drm_device *dev, void *data, xe_vm_put(vm); free_preempt_fences(&preempt_fences); kfree(syncs); + kfree(tdr_items); return err; } diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index c5cb83722706..49cac5716f72 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -260,6 +260,28 @@ struct xe_vm { struct dma_fence *exported_fence; } preempt; + /** @userfence: User fence state */ + struct { + /** + * @userfence.lock: fence lock + */ + spinlock_t lock; + /** + * @userfence.pending_list: pending fence list, protected by + * userfence.lock + */ + struct list_head pending_list; + /** @userfence.tdr: fence TDR */ + struct delayed_work tdr; + /** @userfence.kill_work */ + struct work_struct kill_work; + /** + * @userfence.timeout: Fence timeout period, protected by + * userfence.lock + */ + u32 timeout; + } userfence; + /** @um: unified memory state */ struct { /** @asid: address space ID, unique to each VM */
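
To make the TDR lifecycle in this last patch easier to follow, a condensed, hedged restatement of its arming protocol: a single delayed work is keyed to the earliest pending deadline, armed by the first item added and re-armed or cancelled as head items complete. The sketch below assumes the struct tdr_item and vm->userfence state above; the workqueue choice, error paths, and the IRQ-context distinction in the real callback are elided:

static void tdr_pending_add(struct xe_vm *vm, struct tdr_item *item,
			    struct workqueue_struct *wq)
{
	item->deadline = get_jiffies_64() + vm->userfence.timeout;

	spin_lock_irq(&vm->userfence.lock);
	list_add_tail(&item->link, &vm->userfence.pending_list);
	/* First pending fence arms the watchdog; later ones ride along */
	if (list_is_singular(&vm->userfence.pending_list))
		mod_delayed_work(wq, &vm->userfence.tdr,
				 vm->userfence.timeout);
	spin_unlock_irq(&vm->userfence.lock);
}

static void tdr_pending_complete(struct xe_vm *vm, struct tdr_item *item,
				 struct workqueue_struct *wq)
{
	struct tdr_item *next;

	spin_lock_irq(&vm->userfence.lock);
	list_del(&item->link);
	next = list_first_entry_or_null(&vm->userfence.pending_list,
					typeof(*next), link);
	if (next)
		/* Re-arm for the next-oldest pending fence */
		mod_delayed_work(wq, &vm->userfence.tdr,
				 next->deadline - get_jiffies_64());
	else
		cancel_delayed_work(&vm->userfence.tdr);
	spin_unlock_irq(&vm->userfence.lock);
}
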