From patchwork Mon Nov 27 18:48:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470100 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DCBAFC4167B for ; Mon, 27 Nov 2023 18:59:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4034510E3D4; Mon, 27 Nov 2023 18:59:43 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 07CA710E3D4 for ; Mon, 27 Nov 2023 18:59:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=uLJc/0q+PREavE8ipSgwZnKn6Rpre4aSziTiaa+cIMg=; b=HYIlCDFfNH+pZbTdTNNwCBpoPe 7JajgREWAKmiBwrpRWyl52yhTboEnJWtCnF+7aaDEA2oAKIZLkqqF4etoeaRdZKqEnc/Le1cXVtdP 9FvsBcfDBra7GPDjCjVMZBPVa6gJRW0KilgFRH1sxJpDQ/BsZxZHL0F8N9kbz2Yef+Y2OYhVjKTvQ OgzG2Q9Fo+XcNLKIRzJDnKLpkxT1R2fzAsYlfzw0cnYnNp4U1uEs5ptHc2MOcojY3CZY7Dc4WvpEJ FWdbScLrT9UcbWl1LSEKdwqFVCeQCdnwucRkjuIgA9pKhV7bxG8Mh+peBmNemZhJKacOSmYftpaYT OeIf9i0g==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpO-007jiM-GD; Mon, 27 Nov 2023 19:59:31 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 01/17] drm/v3d: Remove unused function header Date: Mon, 27 Nov 2023 15:48:19 -0300 Message-ID: <20231127185723.10348-3-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen v3d_mmu_get_offset header was added but the function was never defined. Just remove it. Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 4c59fefaa0b4..d8b392494eba 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -420,8 +420,6 @@ void v3d_irq_disable(struct v3d_dev *v3d); void v3d_irq_reset(struct v3d_dev *v3d); /* v3d_mmu.c */ -int v3d_mmu_get_offset(struct drm_file *file_priv, struct v3d_bo *bo, - u32 *offset); int v3d_mmu_set_page_table(struct v3d_dev *v3d); void v3d_mmu_insert_ptes(struct v3d_bo *bo); void v3d_mmu_remove_ptes(struct v3d_bo *bo); From patchwork Mon Nov 27 18:48:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470101 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1A7E1C4167B for ; Mon, 27 Nov 2023 18:59:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 67EEA10E3D6; Mon, 27 Nov 2023 18:59:47 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id A889710E3D6 for ; Mon, 27 Nov 2023 18:59:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=an1lX0ioHk1+o4uXh9o++hKOIPYhWN18tudolFVniWQ=; b=Ul0gTLUAf7YRp4BKCmz7Uc4jJt uW8we4xlZOePWcfiwZVUIqKOAYzKd5F/MbdveuE7q6JarK2q0XTdCyRnytOeXMi9jQvRM5Sub4H6N +u7HIcq4Ty7lU8u6krg31b24vCQlyJ7VwshEu4puXU8GEZTmyncfO6NIAgB9BGke8uwySX2sEfvib n20IxhNM11g8u0bo4cLQz4+0JhJVrATm2EG9fEWXPn7ddiDV5Kf5tKC7JdRm0cqq0ZPbNJAM6ZCvI dppAJP0tKyGGK1nFULROwVDp+auMkyhHgUoAYkmkBrxSD5699/ZzKVY8hr86B/VyTouP7+4mEnyk6 BlCsoCPA==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpS-007jiM-OC; Mon, 27 Nov 2023 19:59:35 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 02/17] drm/v3d: Move wait BO ioctl to the v3d_bo file Date: Mon, 27 Nov 2023 15:48:20 -0300 Message-ID: <20231127185723.10348-4-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen IOCTLs related to BO operations reside on the file v3d_bo.c. The wait BO ioctl is the only IOCTL regarding BOs that is placed in a different file. So, move it to the v3d_bo.c file. Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_bo.c | 33 +++++++++++++++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_drv.h | 4 ++-- drivers/gpu/drm/v3d/v3d_gem.c | 33 --------------------------------- 3 files changed, 35 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c index 8b3229a37c6d..357a0da7e16a 100644 --- a/drivers/gpu/drm/v3d/v3d_bo.c +++ b/drivers/gpu/drm/v3d/v3d_bo.c @@ -233,3 +233,36 @@ int v3d_get_bo_offset_ioctl(struct drm_device *dev, void *data, drm_gem_object_put(gem_obj); return 0; } + +int +v3d_wait_bo_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + int ret; + struct drm_v3d_wait_bo *args = data; + ktime_t start = ktime_get(); + u64 delta_ns; + unsigned long timeout_jiffies = + nsecs_to_jiffies_timeout(args->timeout_ns); + + if (args->pad != 0) + return -EINVAL; + + ret = drm_gem_dma_resv_wait(file_priv, args->handle, + true, timeout_jiffies); + + /* Decrement the user's timeout, in case we got interrupted + * such that the ioctl will be restarted. + */ + delta_ns = ktime_to_ns(ktime_sub(ktime_get(), start)); + if (delta_ns < args->timeout_ns) + args->timeout_ns -= delta_ns; + else + args->timeout_ns = 0; + + /* Asked to wait beyond the jiffie/scheduler precision? */ + if (ret == -ETIME && args->timeout_ns) + ret = -EAGAIN; + + return ret; +} diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index d8b392494eba..fc04372d5cbd 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -385,6 +385,8 @@ int v3d_mmap_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_get_bo_offset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int v3d_wait_bo_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); struct drm_gem_object *v3d_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt); @@ -405,8 +407,6 @@ int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); -int v3d_wait_bo_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); void v3d_job_cleanup(struct v3d_job *job); void v3d_job_put(struct v3d_job *job); void v3d_reset(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 9d2ac23c29e3..26f62a48d226 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -363,39 +363,6 @@ void v3d_job_put(struct v3d_job *job) kref_put(&job->refcount, job->free); } -int -v3d_wait_bo_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - int ret; - struct drm_v3d_wait_bo *args = data; - ktime_t start = ktime_get(); - u64 delta_ns; - unsigned long timeout_jiffies = - nsecs_to_jiffies_timeout(args->timeout_ns); - - if (args->pad != 0) - return -EINVAL; - - ret = drm_gem_dma_resv_wait(file_priv, args->handle, - true, timeout_jiffies); - - /* Decrement the user's timeout, in case we got interrupted - * such that the ioctl will be restarted. - */ - delta_ns = ktime_to_ns(ktime_sub(ktime_get(), start)); - if (delta_ns < args->timeout_ns) - args->timeout_ns -= delta_ns; - else - args->timeout_ns = 0; - - /* Asked to wait beyond the jiffie/scheduler precision? */ - if (ret == -ETIME && args->timeout_ns) - ret = -EAGAIN; - - return ret; -} - static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, void **container, size_t size, void (*free)(struct kref *ref), From patchwork Mon Nov 27 18:48:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470102 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 22B63C4167B for ; Mon, 27 Nov 2023 18:59:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7F13510E3D8; Mon, 27 Nov 2023 18:59:52 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id C106310E3D8 for ; Mon, 27 Nov 2023 18:59:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=WDslCOYtjkafCIv8EgIlWK96t5onglOZvomeZ1SQoLU=; b=PeTZDtvP568srhdMHalsXR/9bj HykImv+6hak3hAB+PbJ83mLSTkTJnDYaiO/bq48kQSN1a8/uGE42N3hF/isV+YY1GGKbCUjAfkVav j79ljF207sEjDudgLGT49nDhNw15/+/M45JS6fnXphf5v5fY0I9kA5K3+oEai/pLGgmHAYA+vq6/S qE0Dupkk3MDLVdTG9ll0Kgb2EWNWwPzuZJWnu1J0rrcFDH456C+Ag7w9D24NsG5eIm0nUnBxKhK5S wALNUbyOqpEpMXB9NcCyvCCJ5LwnW/4JnMlyRFrO8W8MFvK8AWiT5ftB2cAbVTT6KYjpz/d5o9Dx0 D3WP2vaw==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpW-007jiM-O9; Mon, 27 Nov 2023 19:59:39 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 03/17] drm/v3d: Detach job submissions IOCTLs to a new specific file Date: Mon, 27 Nov 2023 15:48:21 -0300 Message-ID: <20231127185723.10348-5-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen We will include a new job submission type, the CPU job submission. For readability and maintability, separate the job submission IOCTLs and related operations from v3d_gem.c. Minor fix in the CSD submission kernel doc: CSD (texture formatting) -> CSD (compute shader). Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/Makefile | 3 +- drivers/gpu/drm/v3d/v3d_drv.h | 12 +- drivers/gpu/drm/v3d/v3d_gem.c | 735 ------------------------------ drivers/gpu/drm/v3d/v3d_submit.c | 744 +++++++++++++++++++++++++++++++ 4 files changed, 753 insertions(+), 741 deletions(-) create mode 100644 drivers/gpu/drm/v3d/v3d_submit.c diff --git a/drivers/gpu/drm/v3d/Makefile b/drivers/gpu/drm/v3d/Makefile index 4b21b20e4998..b7d673f1153b 100644 --- a/drivers/gpu/drm/v3d/Makefile +++ b/drivers/gpu/drm/v3d/Makefile @@ -12,7 +12,8 @@ v3d-y := \ v3d_perfmon.o \ v3d_trace_points.o \ v3d_sched.o \ - v3d_sysfs.o + v3d_sysfs.o \ + v3d_submit.o v3d-$(CONFIG_DEBUG_FS) += v3d_debugfs.o diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index fc04372d5cbd..4db9ace66024 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -401,17 +401,19 @@ struct dma_fence *v3d_fence_create(struct v3d_dev *v3d, enum v3d_queue queue); /* v3d_gem.c */ int v3d_gem_init(struct drm_device *dev); void v3d_gem_destroy(struct drm_device *dev); +void v3d_reset(struct v3d_dev *v3d); +void v3d_invalidate_caches(struct v3d_dev *v3d); +void v3d_clean_caches(struct v3d_dev *v3d); + +/* v3d_submit.c */ +void v3d_job_cleanup(struct v3d_job *job); +void v3d_job_put(struct v3d_job *job); int v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); -void v3d_job_cleanup(struct v3d_job *job); -void v3d_job_put(struct v3d_job *job); -void v3d_reset(struct v3d_dev *v3d); -void v3d_invalidate_caches(struct v3d_dev *v3d); -void v3d_clean_caches(struct v3d_dev *v3d); /* v3d_irq.c */ int v3d_irq_init(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 26f62a48d226..afc565078c78 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -11,8 +11,6 @@ #include #include -#include -#include #include "v3d_drv.h" #include "v3d_regs.h" @@ -241,739 +239,6 @@ v3d_invalidate_caches(struct v3d_dev *v3d) v3d_invalidate_slices(v3d, 0); } -/* Takes the reservation lock on all the BOs being referenced, so that - * at queue submit time we can update the reservations. - * - * We don't lock the RCL the tile alloc/state BOs, or overflow memory - * (all of which are on exec->unref_list). They're entirely private - * to v3d, so we don't attach dma-buf fences to them. - */ -static int -v3d_lock_bo_reservations(struct v3d_job *job, - struct ww_acquire_ctx *acquire_ctx) -{ - int i, ret; - - ret = drm_gem_lock_reservations(job->bo, job->bo_count, acquire_ctx); - if (ret) - return ret; - - for (i = 0; i < job->bo_count; i++) { - ret = dma_resv_reserve_fences(job->bo[i]->resv, 1); - if (ret) - goto fail; - - ret = drm_sched_job_add_implicit_dependencies(&job->base, - job->bo[i], true); - if (ret) - goto fail; - } - - return 0; - -fail: - drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); - return ret; -} - -/** - * v3d_lookup_bos() - Sets up job->bo[] with the GEM objects - * referenced by the job. - * @dev: DRM device - * @file_priv: DRM file for this fd - * @job: V3D job being set up - * @bo_handles: GEM handles - * @bo_count: Number of GEM handles passed in - * - * The command validator needs to reference BOs by their index within - * the submitted job's BO list. This does the validation of the job's - * BO list and reference counting for the lifetime of the job. - * - * Note that this function doesn't need to unreference the BOs on - * failure, because that will happen at v3d_exec_cleanup() time. - */ -static int -v3d_lookup_bos(struct drm_device *dev, - struct drm_file *file_priv, - struct v3d_job *job, - u64 bo_handles, - u32 bo_count) -{ - job->bo_count = bo_count; - - if (!job->bo_count) { - /* See comment on bo_index for why we have to check - * this. - */ - DRM_DEBUG("Rendering requires BOs\n"); - return -EINVAL; - } - - return drm_gem_objects_lookup(file_priv, - (void __user *)(uintptr_t)bo_handles, - job->bo_count, &job->bo); -} - -static void -v3d_job_free(struct kref *ref) -{ - struct v3d_job *job = container_of(ref, struct v3d_job, refcount); - int i; - - if (job->bo) { - for (i = 0; i < job->bo_count; i++) - drm_gem_object_put(job->bo[i]); - kvfree(job->bo); - } - - dma_fence_put(job->irq_fence); - dma_fence_put(job->done_fence); - - if (job->perfmon) - v3d_perfmon_put(job->perfmon); - - kfree(job); -} - -static void -v3d_render_job_free(struct kref *ref) -{ - struct v3d_render_job *job = container_of(ref, struct v3d_render_job, - base.refcount); - struct v3d_bo *bo, *save; - - list_for_each_entry_safe(bo, save, &job->unref_list, unref_head) { - drm_gem_object_put(&bo->base.base); - } - - v3d_job_free(ref); -} - -void v3d_job_cleanup(struct v3d_job *job) -{ - if (!job) - return; - - drm_sched_job_cleanup(&job->base); - v3d_job_put(job); -} - -void v3d_job_put(struct v3d_job *job) -{ - kref_put(&job->refcount, job->free); -} - -static int -v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, - void **container, size_t size, void (*free)(struct kref *ref), - u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) -{ - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct v3d_job *job; - bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); - int ret, i; - - *container = kcalloc(1, size, GFP_KERNEL); - if (!*container) { - DRM_ERROR("Cannot allocate memory for v3d job."); - return -ENOMEM; - } - - job = *container; - job->v3d = v3d; - job->free = free; - job->file = file_priv; - - ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], - 1, v3d_priv); - if (ret) - goto fail; - - if (has_multisync) { - if (se->in_sync_count && se->wait_stage == queue) { - struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs); - - for (i = 0; i < se->in_sync_count; i++) { - struct drm_v3d_sem in; - - if (copy_from_user(&in, handle++, sizeof(in))) { - ret = -EFAULT; - DRM_DEBUG("Failed to copy wait dep handle.\n"); - goto fail_deps; - } - ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in.handle, 0); - - // TODO: Investigate why this was filtered out for the IOCTL. - if (ret && ret != -ENOENT) - goto fail_deps; - } - } - } else { - ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in_sync, 0); - - // TODO: Investigate why this was filtered out for the IOCTL. - if (ret && ret != -ENOENT) - goto fail_deps; - } - - kref_init(&job->refcount); - - return 0; - -fail_deps: - drm_sched_job_cleanup(&job->base); -fail: - kfree(*container); - *container = NULL; - - return ret; -} - -static void -v3d_push_job(struct v3d_job *job) -{ - drm_sched_job_arm(&job->base); - - job->done_fence = dma_fence_get(&job->base.s_fence->finished); - - /* put by scheduler job completion */ - kref_get(&job->refcount); - - drm_sched_entity_push_job(&job->base); -} - -static void -v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, - struct v3d_job *job, - struct ww_acquire_ctx *acquire_ctx, - u32 out_sync, - struct v3d_submit_ext *se, - struct dma_fence *done_fence) -{ - struct drm_syncobj *sync_out; - bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); - int i; - - for (i = 0; i < job->bo_count; i++) { - /* XXX: Use shared fences for read-only objects. */ - dma_resv_add_fence(job->bo[i]->resv, job->done_fence, - DMA_RESV_USAGE_WRITE); - } - - drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); - - /* Update the return sync object for the job */ - /* If it only supports a single signal semaphore*/ - if (!has_multisync) { - sync_out = drm_syncobj_find(file_priv, out_sync); - if (sync_out) { - drm_syncobj_replace_fence(sync_out, done_fence); - drm_syncobj_put(sync_out); - } - return; - } - - /* If multiple semaphores extension is supported */ - if (se->out_sync_count) { - for (i = 0; i < se->out_sync_count; i++) { - drm_syncobj_replace_fence(se->out_syncs[i].syncobj, - done_fence); - drm_syncobj_put(se->out_syncs[i].syncobj); - } - kvfree(se->out_syncs); - } -} - -static void -v3d_put_multisync_post_deps(struct v3d_submit_ext *se) -{ - unsigned int i; - - if (!(se && se->out_sync_count)) - return; - - for (i = 0; i < se->out_sync_count; i++) - drm_syncobj_put(se->out_syncs[i].syncobj); - kvfree(se->out_syncs); -} - -static int -v3d_get_multisync_post_deps(struct drm_file *file_priv, - struct v3d_submit_ext *se, - u32 count, u64 handles) -{ - struct drm_v3d_sem __user *post_deps; - int i, ret; - - if (!count) - return 0; - - se->out_syncs = (struct v3d_submit_outsync *) - kvmalloc_array(count, - sizeof(struct v3d_submit_outsync), - GFP_KERNEL); - if (!se->out_syncs) - return -ENOMEM; - - post_deps = u64_to_user_ptr(handles); - - for (i = 0; i < count; i++) { - struct drm_v3d_sem out; - - if (copy_from_user(&out, post_deps++, sizeof(out))) { - ret = -EFAULT; - DRM_DEBUG("Failed to copy post dep handles\n"); - goto fail; - } - - se->out_syncs[i].syncobj = drm_syncobj_find(file_priv, - out.handle); - if (!se->out_syncs[i].syncobj) { - ret = -EINVAL; - goto fail; - } - } - se->out_sync_count = count; - - return 0; - -fail: - for (i--; i >= 0; i--) - drm_syncobj_put(se->out_syncs[i].syncobj); - kvfree(se->out_syncs); - - return ret; -} - -/* Get data for multiple binary semaphores synchronization. Parse syncobj - * to be signaled when job completes (out_sync). - */ -static int -v3d_get_multisync_submit_deps(struct drm_file *file_priv, - struct drm_v3d_extension __user *ext, - void *data) -{ - struct drm_v3d_multi_sync multisync; - struct v3d_submit_ext *se = data; - int ret; - - if (copy_from_user(&multisync, ext, sizeof(multisync))) - return -EFAULT; - - if (multisync.pad) - return -EINVAL; - - ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count, - multisync.out_syncs); - if (ret) - return ret; - - se->in_sync_count = multisync.in_sync_count; - se->in_syncs = multisync.in_syncs; - se->flags |= DRM_V3D_EXT_ID_MULTI_SYNC; - se->wait_stage = multisync.wait_stage; - - return 0; -} - -/* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data - * according to the extension id (name). - */ -static int -v3d_get_extensions(struct drm_file *file_priv, - u64 ext_handles, - void *data) -{ - struct drm_v3d_extension __user *user_ext; - int ret; - - user_ext = u64_to_user_ptr(ext_handles); - while (user_ext) { - struct drm_v3d_extension ext; - - if (copy_from_user(&ext, user_ext, sizeof(ext))) { - DRM_DEBUG("Failed to copy submit extension\n"); - return -EFAULT; - } - - switch (ext.id) { - case DRM_V3D_EXT_ID_MULTI_SYNC: - ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data); - if (ret) - return ret; - break; - default: - DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); - return -EINVAL; - } - - user_ext = u64_to_user_ptr(ext.next); - } - - return 0; -} - -/** - * v3d_submit_cl_ioctl() - Submits a job (frame) to the V3D. - * @dev: DRM device - * @data: ioctl argument - * @file_priv: DRM file for this fd - * - * This is the main entrypoint for userspace to submit a 3D frame to - * the GPU. Userspace provides the binner command list (if - * applicable), and the kernel sets up the render command list to draw - * to the framebuffer described in the ioctl, using the command lists - * that the 3D engine's binner will produce. - */ -int -v3d_submit_cl_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct v3d_dev *v3d = to_v3d_dev(dev); - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct drm_v3d_submit_cl *args = data; - struct v3d_submit_ext se = {0}; - struct v3d_bin_job *bin = NULL; - struct v3d_render_job *render = NULL; - struct v3d_job *clean_job = NULL; - struct v3d_job *last_job; - struct ww_acquire_ctx acquire_ctx; - int ret = 0; - - trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end); - - if (args->pad) - return -EINVAL; - - if (args->flags && - args->flags & ~(DRM_V3D_SUBMIT_CL_FLUSH_CACHE | - DRM_V3D_SUBMIT_EXTENSION)) { - DRM_INFO("invalid flags: %d\n", args->flags); - return -EINVAL; - } - - if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); - if (ret) { - DRM_DEBUG("Failed to get extensions.\n"); - return ret; - } - } - - ret = v3d_job_init(v3d, file_priv, (void *)&render, sizeof(*render), - v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER); - if (ret) - goto fail; - - render->start = args->rcl_start; - render->end = args->rcl_end; - INIT_LIST_HEAD(&render->unref_list); - - if (args->bcl_start != args->bcl_end) { - ret = v3d_job_init(v3d, file_priv, (void *)&bin, sizeof(*bin), - v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); - if (ret) - goto fail; - - bin->start = args->bcl_start; - bin->end = args->bcl_end; - bin->qma = args->qma; - bin->qms = args->qms; - bin->qts = args->qts; - bin->render = render; - } - - if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) { - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); - if (ret) - goto fail; - - last_job = clean_job; - } else { - last_job = &render->base; - } - - ret = v3d_lookup_bos(dev, file_priv, last_job, - args->bo_handles, args->bo_handle_count); - if (ret) - goto fail; - - ret = v3d_lock_bo_reservations(last_job, &acquire_ctx); - if (ret) - goto fail; - - if (args->perfmon_id) { - render->base.perfmon = v3d_perfmon_find(v3d_priv, - args->perfmon_id); - - if (!render->base.perfmon) { - ret = -ENOENT; - goto fail_perfmon; - } - } - - mutex_lock(&v3d->sched_lock); - if (bin) { - bin->base.perfmon = render->base.perfmon; - v3d_perfmon_get(bin->base.perfmon); - v3d_push_job(&bin->base); - - ret = drm_sched_job_add_dependency(&render->base.base, - dma_fence_get(bin->base.done_fence)); - if (ret) - goto fail_unreserve; - } - - v3d_push_job(&render->base); - - if (clean_job) { - struct dma_fence *render_fence = - dma_fence_get(render->base.done_fence); - ret = drm_sched_job_add_dependency(&clean_job->base, - render_fence); - if (ret) - goto fail_unreserve; - clean_job->perfmon = render->base.perfmon; - v3d_perfmon_get(clean_job->perfmon); - v3d_push_job(clean_job); - } - - mutex_unlock(&v3d->sched_lock); - - v3d_attach_fences_and_unlock_reservation(file_priv, - last_job, - &acquire_ctx, - args->out_sync, - &se, - last_job->done_fence); - - if (bin) - v3d_job_put(&bin->base); - v3d_job_put(&render->base); - if (clean_job) - v3d_job_put(clean_job); - - return 0; - -fail_unreserve: - mutex_unlock(&v3d->sched_lock); -fail_perfmon: - drm_gem_unlock_reservations(last_job->bo, - last_job->bo_count, &acquire_ctx); -fail: - v3d_job_cleanup((void *)bin); - v3d_job_cleanup((void *)render); - v3d_job_cleanup(clean_job); - v3d_put_multisync_post_deps(&se); - - return ret; -} - -/** - * v3d_submit_tfu_ioctl() - Submits a TFU (texture formatting) job to the V3D. - * @dev: DRM device - * @data: ioctl argument - * @file_priv: DRM file for this fd - * - * Userspace provides the register setup for the TFU, which we don't - * need to validate since the TFU is behind the MMU. - */ -int -v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct v3d_dev *v3d = to_v3d_dev(dev); - struct drm_v3d_submit_tfu *args = data; - struct v3d_submit_ext se = {0}; - struct v3d_tfu_job *job = NULL; - struct ww_acquire_ctx acquire_ctx; - int ret = 0; - - trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia); - - if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { - DRM_DEBUG("invalid flags: %d\n", args->flags); - return -EINVAL; - } - - if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); - if (ret) { - DRM_DEBUG("Failed to get extensions.\n"); - return ret; - } - } - - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), - v3d_job_free, args->in_sync, &se, V3D_TFU); - if (ret) - goto fail; - - job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), - sizeof(*job->base.bo), GFP_KERNEL); - if (!job->base.bo) { - ret = -ENOMEM; - goto fail; - } - - job->args = *args; - - for (job->base.bo_count = 0; - job->base.bo_count < ARRAY_SIZE(args->bo_handles); - job->base.bo_count++) { - struct drm_gem_object *bo; - - if (!args->bo_handles[job->base.bo_count]) - break; - - bo = drm_gem_object_lookup(file_priv, args->bo_handles[job->base.bo_count]); - if (!bo) { - DRM_DEBUG("Failed to look up GEM BO %d: %d\n", - job->base.bo_count, - args->bo_handles[job->base.bo_count]); - ret = -ENOENT; - goto fail; - } - job->base.bo[job->base.bo_count] = bo; - } - - ret = v3d_lock_bo_reservations(&job->base, &acquire_ctx); - if (ret) - goto fail; - - mutex_lock(&v3d->sched_lock); - v3d_push_job(&job->base); - mutex_unlock(&v3d->sched_lock); - - v3d_attach_fences_and_unlock_reservation(file_priv, - &job->base, &acquire_ctx, - args->out_sync, - &se, - job->base.done_fence); - - v3d_job_put(&job->base); - - return 0; - -fail: - v3d_job_cleanup((void *)job); - v3d_put_multisync_post_deps(&se); - - return ret; -} - -/** - * v3d_submit_csd_ioctl() - Submits a CSD (texture formatting) job to the V3D. - * @dev: DRM device - * @data: ioctl argument - * @file_priv: DRM file for this fd - * - * Userspace provides the register setup for the CSD, which we don't - * need to validate since the CSD is behind the MMU. - */ -int -v3d_submit_csd_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct v3d_dev *v3d = to_v3d_dev(dev); - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct drm_v3d_submit_csd *args = data; - struct v3d_submit_ext se = {0}; - struct v3d_csd_job *job = NULL; - struct v3d_job *clean_job = NULL; - struct ww_acquire_ctx acquire_ctx; - int ret; - - trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]); - - if (args->pad) - return -EINVAL; - - if (!v3d_has_csd(v3d)) { - DRM_DEBUG("Attempting CSD submit on non-CSD hardware\n"); - return -EINVAL; - } - - if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { - DRM_INFO("invalid flags: %d\n", args->flags); - return -EINVAL; - } - - if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); - if (ret) { - DRM_DEBUG("Failed to get extensions.\n"); - return ret; - } - } - - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), - v3d_job_free, args->in_sync, &se, V3D_CSD); - if (ret) - goto fail; - - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); - if (ret) - goto fail; - - job->args = *args; - - ret = v3d_lookup_bos(dev, file_priv, clean_job, - args->bo_handles, args->bo_handle_count); - if (ret) - goto fail; - - ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx); - if (ret) - goto fail; - - if (args->perfmon_id) { - job->base.perfmon = v3d_perfmon_find(v3d_priv, - args->perfmon_id); - if (!job->base.perfmon) { - ret = -ENOENT; - goto fail_perfmon; - } - } - - mutex_lock(&v3d->sched_lock); - v3d_push_job(&job->base); - - ret = drm_sched_job_add_dependency(&clean_job->base, - dma_fence_get(job->base.done_fence)); - if (ret) - goto fail_unreserve; - - v3d_push_job(clean_job); - mutex_unlock(&v3d->sched_lock); - - v3d_attach_fences_and_unlock_reservation(file_priv, - clean_job, - &acquire_ctx, - args->out_sync, - &se, - clean_job->done_fence); - - v3d_job_put(&job->base); - v3d_job_put(clean_job); - - return 0; - -fail_unreserve: - mutex_unlock(&v3d->sched_lock); -fail_perfmon: - drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count, - &acquire_ctx); -fail: - v3d_job_cleanup((void *)job); - v3d_job_cleanup(clean_job); - v3d_put_multisync_post_deps(&se); - - return ret; -} - int v3d_gem_init(struct drm_device *dev) { diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c new file mode 100644 index 000000000000..f36214002f37 --- /dev/null +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -0,0 +1,744 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2014-2018 Broadcom + * Copyright (C) 2023 Raspberry Pi + */ + +#include + +#include "v3d_drv.h" +#include "v3d_regs.h" +#include "v3d_trace.h" + +/* Takes the reservation lock on all the BOs being referenced, so that + * at queue submit time we can update the reservations. + * + * We don't lock the RCL the tile alloc/state BOs, or overflow memory + * (all of which are on exec->unref_list). They're entirely private + * to v3d, so we don't attach dma-buf fences to them. + */ +static int +v3d_lock_bo_reservations(struct v3d_job *job, + struct ww_acquire_ctx *acquire_ctx) +{ + int i, ret; + + ret = drm_gem_lock_reservations(job->bo, job->bo_count, acquire_ctx); + if (ret) + return ret; + + for (i = 0; i < job->bo_count; i++) { + ret = dma_resv_reserve_fences(job->bo[i]->resv, 1); + if (ret) + goto fail; + + ret = drm_sched_job_add_implicit_dependencies(&job->base, + job->bo[i], true); + if (ret) + goto fail; + } + + return 0; + +fail: + drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); + return ret; +} + +/** + * v3d_lookup_bos() - Sets up job->bo[] with the GEM objects + * referenced by the job. + * @dev: DRM device + * @file_priv: DRM file for this fd + * @job: V3D job being set up + * @bo_handles: GEM handles + * @bo_count: Number of GEM handles passed in + * + * The command validator needs to reference BOs by their index within + * the submitted job's BO list. This does the validation of the job's + * BO list and reference counting for the lifetime of the job. + * + * Note that this function doesn't need to unreference the BOs on + * failure, because that will happen at v3d_exec_cleanup() time. + */ +static int +v3d_lookup_bos(struct drm_device *dev, + struct drm_file *file_priv, + struct v3d_job *job, + u64 bo_handles, + u32 bo_count) +{ + job->bo_count = bo_count; + + if (!job->bo_count) { + /* See comment on bo_index for why we have to check + * this. + */ + DRM_DEBUG("Rendering requires BOs\n"); + return -EINVAL; + } + + return drm_gem_objects_lookup(file_priv, + (void __user *)(uintptr_t)bo_handles, + job->bo_count, &job->bo); +} + +static void +v3d_job_free(struct kref *ref) +{ + struct v3d_job *job = container_of(ref, struct v3d_job, refcount); + int i; + + if (job->bo) { + for (i = 0; i < job->bo_count; i++) + drm_gem_object_put(job->bo[i]); + kvfree(job->bo); + } + + dma_fence_put(job->irq_fence); + dma_fence_put(job->done_fence); + + if (job->perfmon) + v3d_perfmon_put(job->perfmon); + + kfree(job); +} + +static void +v3d_render_job_free(struct kref *ref) +{ + struct v3d_render_job *job = container_of(ref, struct v3d_render_job, + base.refcount); + struct v3d_bo *bo, *save; + + list_for_each_entry_safe(bo, save, &job->unref_list, unref_head) { + drm_gem_object_put(&bo->base.base); + } + + v3d_job_free(ref); +} + +void v3d_job_cleanup(struct v3d_job *job) +{ + if (!job) + return; + + drm_sched_job_cleanup(&job->base); + v3d_job_put(job); +} + +void v3d_job_put(struct v3d_job *job) +{ + kref_put(&job->refcount, job->free); +} + +static int +v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, + void **container, size_t size, void (*free)(struct kref *ref), + u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) +{ + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct v3d_job *job; + bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); + int ret, i; + + *container = kcalloc(1, size, GFP_KERNEL); + if (!*container) { + DRM_ERROR("Cannot allocate memory for v3d job."); + return -ENOMEM; + } + + job = *container; + job->v3d = v3d; + job->free = free; + job->file = file_priv; + + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], + 1, v3d_priv); + if (ret) + goto fail; + + if (has_multisync) { + if (se->in_sync_count && se->wait_stage == queue) { + struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs); + + for (i = 0; i < se->in_sync_count; i++) { + struct drm_v3d_sem in; + + if (copy_from_user(&in, handle++, sizeof(in))) { + ret = -EFAULT; + DRM_DEBUG("Failed to copy wait dep handle.\n"); + goto fail_deps; + } + ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in.handle, 0); + + // TODO: Investigate why this was filtered out for the IOCTL. + if (ret && ret != -ENOENT) + goto fail_deps; + } + } + } else { + ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in_sync, 0); + + // TODO: Investigate why this was filtered out for the IOCTL. + if (ret && ret != -ENOENT) + goto fail_deps; + } + + kref_init(&job->refcount); + + return 0; + +fail_deps: + drm_sched_job_cleanup(&job->base); +fail: + kfree(*container); + *container = NULL; + + return ret; +} + +static void +v3d_push_job(struct v3d_job *job) +{ + drm_sched_job_arm(&job->base); + + job->done_fence = dma_fence_get(&job->base.s_fence->finished); + + /* put by scheduler job completion */ + kref_get(&job->refcount); + + drm_sched_entity_push_job(&job->base); +} + +static void +v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, + struct v3d_job *job, + struct ww_acquire_ctx *acquire_ctx, + u32 out_sync, + struct v3d_submit_ext *se, + struct dma_fence *done_fence) +{ + struct drm_syncobj *sync_out; + bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); + int i; + + for (i = 0; i < job->bo_count; i++) { + /* XXX: Use shared fences for read-only objects. */ + dma_resv_add_fence(job->bo[i]->resv, job->done_fence, + DMA_RESV_USAGE_WRITE); + } + + drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); + + /* Update the return sync object for the job */ + /* If it only supports a single signal semaphore*/ + if (!has_multisync) { + sync_out = drm_syncobj_find(file_priv, out_sync); + if (sync_out) { + drm_syncobj_replace_fence(sync_out, done_fence); + drm_syncobj_put(sync_out); + } + return; + } + + /* If multiple semaphores extension is supported */ + if (se->out_sync_count) { + for (i = 0; i < se->out_sync_count; i++) { + drm_syncobj_replace_fence(se->out_syncs[i].syncobj, + done_fence); + drm_syncobj_put(se->out_syncs[i].syncobj); + } + kvfree(se->out_syncs); + } +} + +static void +v3d_put_multisync_post_deps(struct v3d_submit_ext *se) +{ + unsigned int i; + + if (!(se && se->out_sync_count)) + return; + + for (i = 0; i < se->out_sync_count; i++) + drm_syncobj_put(se->out_syncs[i].syncobj); + kvfree(se->out_syncs); +} + +static int +v3d_get_multisync_post_deps(struct drm_file *file_priv, + struct v3d_submit_ext *se, + u32 count, u64 handles) +{ + struct drm_v3d_sem __user *post_deps; + int i, ret; + + if (!count) + return 0; + + se->out_syncs = (struct v3d_submit_outsync *) + kvmalloc_array(count, + sizeof(struct v3d_submit_outsync), + GFP_KERNEL); + if (!se->out_syncs) + return -ENOMEM; + + post_deps = u64_to_user_ptr(handles); + + for (i = 0; i < count; i++) { + struct drm_v3d_sem out; + + if (copy_from_user(&out, post_deps++, sizeof(out))) { + ret = -EFAULT; + DRM_DEBUG("Failed to copy post dep handles\n"); + goto fail; + } + + se->out_syncs[i].syncobj = drm_syncobj_find(file_priv, + out.handle); + if (!se->out_syncs[i].syncobj) { + ret = -EINVAL; + goto fail; + } + } + se->out_sync_count = count; + + return 0; + +fail: + for (i--; i >= 0; i--) + drm_syncobj_put(se->out_syncs[i].syncobj); + kvfree(se->out_syncs); + + return ret; +} + +/* Get data for multiple binary semaphores synchronization. Parse syncobj + * to be signaled when job completes (out_sync). + */ +static int +v3d_get_multisync_submit_deps(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + void *data) +{ + struct drm_v3d_multi_sync multisync; + struct v3d_submit_ext *se = data; + int ret; + + if (copy_from_user(&multisync, ext, sizeof(multisync))) + return -EFAULT; + + if (multisync.pad) + return -EINVAL; + + ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count, + multisync.out_syncs); + if (ret) + return ret; + + se->in_sync_count = multisync.in_sync_count; + se->in_syncs = multisync.in_syncs; + se->flags |= DRM_V3D_EXT_ID_MULTI_SYNC; + se->wait_stage = multisync.wait_stage; + + return 0; +} + +/* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data + * according to the extension id (name). + */ +static int +v3d_get_extensions(struct drm_file *file_priv, + u64 ext_handles, + void *data) +{ + struct drm_v3d_extension __user *user_ext; + int ret; + + user_ext = u64_to_user_ptr(ext_handles); + while (user_ext) { + struct drm_v3d_extension ext; + + if (copy_from_user(&ext, user_ext, sizeof(ext))) { + DRM_DEBUG("Failed to copy submit extension\n"); + return -EFAULT; + } + + switch (ext.id) { + case DRM_V3D_EXT_ID_MULTI_SYNC: + ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data); + if (ret) + return ret; + break; + default: + DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); + return -EINVAL; + } + + user_ext = u64_to_user_ptr(ext.next); + } + + return 0; +} + +/** + * v3d_submit_cl_ioctl() - Submits a job (frame) to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * This is the main entrypoint for userspace to submit a 3D frame to + * the GPU. Userspace provides the binner command list (if + * applicable), and the kernel sets up the render command list to draw + * to the framebuffer described in the ioctl, using the command lists + * that the 3D engine's binner will produce. + */ +int +v3d_submit_cl_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct drm_v3d_submit_cl *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_bin_job *bin = NULL; + struct v3d_render_job *render = NULL; + struct v3d_job *clean_job = NULL; + struct v3d_job *last_job; + struct ww_acquire_ctx acquire_ctx; + int ret = 0; + + trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end); + + if (args->pad) + return -EINVAL; + + if (args->flags && + args->flags & ~(DRM_V3D_SUBMIT_CL_FLUSH_CACHE | + DRM_V3D_SUBMIT_EXTENSION)) { + DRM_INFO("invalid flags: %d\n", args->flags); + return -EINVAL; + } + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + return ret; + } + } + + ret = v3d_job_init(v3d, file_priv, (void *)&render, sizeof(*render), + v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER); + if (ret) + goto fail; + + render->start = args->rcl_start; + render->end = args->rcl_end; + INIT_LIST_HEAD(&render->unref_list); + + if (args->bcl_start != args->bcl_end) { + ret = v3d_job_init(v3d, file_priv, (void *)&bin, sizeof(*bin), + v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); + if (ret) + goto fail; + + bin->start = args->bcl_start; + bin->end = args->bcl_end; + bin->qma = args->qma; + bin->qms = args->qms; + bin->qts = args->qts; + bin->render = render; + } + + if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) { + ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); + if (ret) + goto fail; + + last_job = clean_job; + } else { + last_job = &render->base; + } + + ret = v3d_lookup_bos(dev, file_priv, last_job, + args->bo_handles, args->bo_handle_count); + if (ret) + goto fail; + + ret = v3d_lock_bo_reservations(last_job, &acquire_ctx); + if (ret) + goto fail; + + if (args->perfmon_id) { + render->base.perfmon = v3d_perfmon_find(v3d_priv, + args->perfmon_id); + + if (!render->base.perfmon) { + ret = -ENOENT; + goto fail_perfmon; + } + } + + mutex_lock(&v3d->sched_lock); + if (bin) { + bin->base.perfmon = render->base.perfmon; + v3d_perfmon_get(bin->base.perfmon); + v3d_push_job(&bin->base); + + ret = drm_sched_job_add_dependency(&render->base.base, + dma_fence_get(bin->base.done_fence)); + if (ret) + goto fail_unreserve; + } + + v3d_push_job(&render->base); + + if (clean_job) { + struct dma_fence *render_fence = + dma_fence_get(render->base.done_fence); + ret = drm_sched_job_add_dependency(&clean_job->base, + render_fence); + if (ret) + goto fail_unreserve; + clean_job->perfmon = render->base.perfmon; + v3d_perfmon_get(clean_job->perfmon); + v3d_push_job(clean_job); + } + + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + last_job, + &acquire_ctx, + args->out_sync, + &se, + last_job->done_fence); + + if (bin) + v3d_job_put(&bin->base); + v3d_job_put(&render->base); + if (clean_job) + v3d_job_put(clean_job); + + return 0; + +fail_unreserve: + mutex_unlock(&v3d->sched_lock); +fail_perfmon: + drm_gem_unlock_reservations(last_job->bo, + last_job->bo_count, &acquire_ctx); +fail: + v3d_job_cleanup((void *)bin); + v3d_job_cleanup((void *)render); + v3d_job_cleanup(clean_job); + v3d_put_multisync_post_deps(&se); + + return ret; +} + +/** + * v3d_submit_tfu_ioctl() - Submits a TFU (texture formatting) job to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * Userspace provides the register setup for the TFU, which we don't + * need to validate since the TFU is behind the MMU. + */ +int +v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct drm_v3d_submit_tfu *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_tfu_job *job = NULL; + struct ww_acquire_ctx acquire_ctx; + int ret = 0; + + trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia); + + if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { + DRM_DEBUG("invalid flags: %d\n", args->flags); + return -EINVAL; + } + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + return ret; + } + } + + ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + v3d_job_free, args->in_sync, &se, V3D_TFU); + if (ret) + goto fail; + + job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), + sizeof(*job->base.bo), GFP_KERNEL); + if (!job->base.bo) { + ret = -ENOMEM; + goto fail; + } + + job->args = *args; + + for (job->base.bo_count = 0; + job->base.bo_count < ARRAY_SIZE(args->bo_handles); + job->base.bo_count++) { + struct drm_gem_object *bo; + + if (!args->bo_handles[job->base.bo_count]) + break; + + bo = drm_gem_object_lookup(file_priv, args->bo_handles[job->base.bo_count]); + if (!bo) { + DRM_DEBUG("Failed to look up GEM BO %d: %d\n", + job->base.bo_count, + args->bo_handles[job->base.bo_count]); + ret = -ENOENT; + goto fail; + } + job->base.bo[job->base.bo_count] = bo; + } + + ret = v3d_lock_bo_reservations(&job->base, &acquire_ctx); + if (ret) + goto fail; + + mutex_lock(&v3d->sched_lock); + v3d_push_job(&job->base); + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + &job->base, &acquire_ctx, + args->out_sync, + &se, + job->base.done_fence); + + v3d_job_put(&job->base); + + return 0; + +fail: + v3d_job_cleanup((void *)job); + v3d_put_multisync_post_deps(&se); + + return ret; +} + +/** + * v3d_submit_csd_ioctl() - Submits a CSD (compute shader) job to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * Userspace provides the register setup for the CSD, which we don't + * need to validate since the CSD is behind the MMU. + */ +int +v3d_submit_csd_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct drm_v3d_submit_csd *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_csd_job *job = NULL; + struct v3d_job *clean_job = NULL; + struct ww_acquire_ctx acquire_ctx; + int ret; + + trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]); + + if (args->pad) + return -EINVAL; + + if (!v3d_has_csd(v3d)) { + DRM_DEBUG("Attempting CSD submit on non-CSD hardware\n"); + return -EINVAL; + } + + if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { + DRM_INFO("invalid flags: %d\n", args->flags); + return -EINVAL; + } + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + return ret; + } + } + + ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + v3d_job_free, args->in_sync, &se, V3D_CSD); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); + if (ret) + goto fail; + + job->args = *args; + + ret = v3d_lookup_bos(dev, file_priv, clean_job, + args->bo_handles, args->bo_handle_count); + if (ret) + goto fail; + + ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx); + if (ret) + goto fail; + + if (args->perfmon_id) { + job->base.perfmon = v3d_perfmon_find(v3d_priv, + args->perfmon_id); + if (!job->base.perfmon) { + ret = -ENOENT; + goto fail_perfmon; + } + } + + mutex_lock(&v3d->sched_lock); + v3d_push_job(&job->base); + + ret = drm_sched_job_add_dependency(&clean_job->base, + dma_fence_get(job->base.done_fence)); + if (ret) + goto fail_unreserve; + + v3d_push_job(clean_job); + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + clean_job, + &acquire_ctx, + args->out_sync, + &se, + clean_job->done_fence); + + v3d_job_put(&job->base); + v3d_job_put(clean_job); + + return 0; + +fail_unreserve: + mutex_unlock(&v3d->sched_lock); +fail_perfmon: + drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count, + &acquire_ctx); +fail: + v3d_job_cleanup((void *)job); + v3d_job_cleanup(clean_job); + v3d_put_multisync_post_deps(&se); + + return ret; +} From patchwork Mon Nov 27 18:48:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470103 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0A7C8C07D5A for ; Mon, 27 Nov 2023 18:59:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3CE5510E3DC; Mon, 27 Nov 2023 18:59:56 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9D44B10E3DA for ; Mon, 27 Nov 2023 18:59:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=4MfYPkwJklsa7UW6lxV4E4N4hOFvK7atmvCKzpaWVoQ=; b=b5MkF/WKS47mpdlA20n8QBymVC UNenk2SFqXwWpGq1Bhbb8qF9GMVATshUzrKlvW2RW5EjhqWsECuzwH8ydMFEw19yGG40Os6puEnNv zLSycpolTDeExFSioU5OC+POop0Jv9Pem52vWdLXcng+/Vv7JpV92rKe4faksJTquEQ7CxN1g8YEK 19Eo6LGXSw2Vu6AoIocYzhcwMNOYodSJ8m2EeZnpR7x6rKymmFXERyjrEhk1vlfBVWf8pDQYceoau 2CqlNMubxBP2UzTNlml9B7y+gmjA5bWW8Uz/u5QrFqP6HVaK/ofpB3kqHssdoq8AWHnJZzPxuh9Gq zjmIpIpA==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpb-007jiM-GY; Mon, 27 Nov 2023 19:59:44 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 04/17] drm/v3d: Simplify job refcount handling Date: Mon, 27 Nov 2023 15:48:22 -0300 Message-ID: <20231127185723.10348-6-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen Instead of checking if the job is NULL every time we call the function, check it inside the function. Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_submit.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index f36214002f37..a0caf9c499bb 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -129,6 +129,9 @@ void v3d_job_cleanup(struct v3d_job *job) void v3d_job_put(struct v3d_job *job) { + if (!job) + return; + kref_put(&job->refcount, job->free); } @@ -517,11 +520,9 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, &se, last_job->done_fence); - if (bin) - v3d_job_put(&bin->base); + v3d_job_put(&bin->base); v3d_job_put(&render->base); - if (clean_job) - v3d_job_put(clean_job); + v3d_job_put(clean_job); return 0; From patchwork Mon Nov 27 18:48:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470104 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2EC3C4167B for ; Mon, 27 Nov 2023 19:00:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E2F7410E3DE; Mon, 27 Nov 2023 19:00:00 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1C32810E3DB for ; Mon, 27 Nov 2023 18:59:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=5nE2qoJ48amFKeCsCsNOMQlSLEl2ujMYQdEfeaaj0Ck=; b=gyTR6V60pMp6Iyn/DvGksdt+8c WSErNxNhCM0DP8Uy8j/vl78Rt+l+pJQZEBaq3D22R+DQijEaMbaIX4VdqRi4abYa5iRhHk5mfvNgm mHPM9SW4hZMoGn7b9X4Gy4bb+12WHZ2RFg3eZPToD1Ndo2ghBaBHrbfmDa530dwDsBF+xfdWv290L lFEJWKC23q1Q3/3YPbUshHNefE7vccgJGwwgS4bur5oEATJhrZCkHhdZ1AUkoXbG5783e4hgG+lej A9VNIccmzK6NkpwjkC6fcTZO5FoPOtFI7Q8EdSP0XqQMS1RECV8G5406PcWFXPcXwPiKoeQvKJpjR YN9zCd9g==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpf-007jiM-OY; Mon, 27 Nov 2023 19:59:48 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 05/17] drm/v3d: Don't allow two multisync extensions in the same job Date: Mon, 27 Nov 2023 15:48:23 -0300 Message-ID: <20231127185723.10348-7-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Currently, two multisync extensions can be added to the same job and only the last multisync extension will be used. To avoid this vulnerability, don't allow two multisync extensions in the same job. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_submit.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index a0caf9c499bb..10141dc2820a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -329,6 +329,11 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, struct v3d_submit_ext *se = data; int ret; + if (se->in_sync_count || se->out_sync_count) { + DRM_DEBUG("Two multisync extensions were added to the same job."); + return -EINVAL; + } + if (copy_from_user(&multisync, ext, sizeof(multisync))) return -EFAULT; From patchwork Mon Nov 27 18:48:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470106 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 152D8C4167B for ; Mon, 27 Nov 2023 19:00:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CF95E10E3E0; Mon, 27 Nov 2023 19:00:11 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0444910E3DF for ; Mon, 27 Nov 2023 19:00:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=qO0E3068RUMYPupr8AluFHQe2TqeU1l4qAr68BXaEN0=; b=bqRAMMsqYXJaFDjq4uxU5SJhBk y1zG9UcB3QFnsOtcbnrp+Pui2jI9Frn8lS8wxWRunjSPaB0ga7o/zqiMmjnGxhYUbvYKy69rbpWPr DtZDJI5PMhF+ZL+d9U/20PRqhv3ERoyheej2c86QGSEFD0hNnCpp2nfb8OPuDtBeTzSy6OgKQezZP rmIEB4L7Cx/9rN7BcWgjEHXFLMKYj1Usl7JdCRqQjVEIEFAEouHzQxXzUDKHjgBtC4r+CySh8W90q rZbKgPITHu9ldzabfxWuUyvku/phi472VXoHrkMQzM9TXBYrOxcgBm6anXlB+1WP4nHvy8FJYi6Au F07OEjfw==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpk-007jiM-NB; Mon, 27 Nov 2023 19:59:53 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 06/17] drm/v3d: Decouple job allocation from job initiation Date: Mon, 27 Nov 2023 15:48:24 -0300 Message-ID: <20231127185723.10348-8-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We want to allow the IOCTLs to allocate the job without initiating it. This will be useful for the CPU job submission IOCTL, as the CPU job has the need to use information from the user extensions. Currently, the user extensions are parsed before the job allocation, making it impossible to fill the CPU job when parsing the user extensions. Therefore, decouple the job allocation from the job initiation. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_submit.c | 64 ++++++++++++++++++++++---------- 1 file changed, 44 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 10141dc2820a..148900283c2a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -135,23 +135,27 @@ void v3d_job_put(struct v3d_job *job) kref_put(&job->refcount, job->free); } +static int +v3d_job_allocate(void **container, size_t size) +{ + *container = kcalloc(1, size, GFP_KERNEL); + if (!*container) { + DRM_ERROR("Cannot allocate memory for V3D job.\n"); + return -ENOMEM; + } + + return 0; +} + static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, - void **container, size_t size, void (*free)(struct kref *ref), + struct v3d_job *job, void (*free)(struct kref *ref), u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) { struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct v3d_job *job; bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); int ret, i; - *container = kcalloc(1, size, GFP_KERNEL); - if (!*container) { - DRM_ERROR("Cannot allocate memory for v3d job."); - return -ENOMEM; - } - - job = *container; job->v3d = v3d; job->free = free; job->file = file_priv; @@ -159,7 +163,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], 1, v3d_priv); if (ret) - goto fail; + return ret; if (has_multisync) { if (se->in_sync_count && se->wait_stage == queue) { @@ -194,10 +198,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, fail_deps: drm_sched_job_cleanup(&job->base); -fail: - kfree(*container); - *container = NULL; - return ret; } @@ -437,7 +437,11 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_init(v3d, file_priv, (void *)&render, sizeof(*render), + ret = v3d_job_allocate((void *)&render, sizeof(*render)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &render->base, v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER); if (ret) goto fail; @@ -447,7 +451,11 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, INIT_LIST_HEAD(&render->unref_list); if (args->bcl_start != args->bcl_end) { - ret = v3d_job_init(v3d, file_priv, (void *)&bin, sizeof(*bin), + ret = v3d_job_allocate((void *)&bin, sizeof(*bin)); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, &bin->base, v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); if (ret) goto fail; @@ -461,7 +469,11 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) { - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + ret = v3d_job_allocate((void *)&clean_job, sizeof(*clean_job)); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); if (ret) goto fail; @@ -580,7 +592,11 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + ret = v3d_job_allocate((void *)&job, sizeof(*job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &job->base, v3d_job_free, args->in_sync, &se, V3D_TFU); if (ret) goto fail; @@ -683,12 +699,20 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + ret = v3d_job_allocate((void *)&job, sizeof(*job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &job->base, v3d_job_free, args->in_sync, &se, V3D_CSD); if (ret) goto fail; - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + ret = v3d_job_allocate((void *)&clean_job, sizeof(*clean_job)); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); if (ret) goto fail; From patchwork Mon Nov 27 18:48:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CE823C4167B for ; Mon, 27 Nov 2023 19:00:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 182F810E3E4; Mon, 27 Nov 2023 19:00:13 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6347910E3DB for ; Mon, 27 Nov 2023 19:00:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=hSCZRbu/LMhn8v42GWO+IS3KPsDJX1RTIrJVR7QwJcM=; b=frN9Jo2Ky6yCVcsVocjMDRd1Fy HHhqkseyGrm78LxthYuEV1EBbMhlBBNwNPOvYL5DgIRNhzCSq19vLskDCZAzsKZc/RC9l/1mFt8X9 1JKzNFrl3AQTn0Yxa+wULdfzZ5QgPFHCmwTy/2NfeOtbcK1+Xz5tiCyd98F2oHwOO6C+lefc8oR84 dFDVwMysSlmtY6XAV3rZIxOXamR4BYCTEd/BiwBXu5Vhry+ZpwHecubns/LVkcgcjOYIbXL1yYELI M8QY4q5UaBJvD+WnPfEZqG75f3OKtV4jivHn9a+ynol3WH9VshPPsF9XSUjJ93RFcS0Gq3XOwgzXG LryLAZlQ==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpp-007jiM-2V; Mon, 27 Nov 2023 19:59:57 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 07/17] drm/v3d: Add a CPU job submission Date: Mon, 27 Nov 2023 15:48:25 -0300 Message-ID: <20231127185723.10348-9-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen Create a new type of job, a CPU job. A CPU job is a type of job that performs operations that requires CPU intervention. The overall idea is to use user extensions to enable different types of CPU job, allowing the CPU job to perform different operations according to the type of user extension. The user extension ID identify the type of CPU job that must be dealt. Having a CPU job is interesting for synchronization purposes as a CPU job has a queue like any other V3D job and can be synchoronized by the multisync extension. Signed-off-by: Melissa Wen Co-developed-by: Maíra Canal Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.c | 4 ++ drivers/gpu/drm/v3d/v3d_drv.h | 16 +++++- drivers/gpu/drm/v3d/v3d_sched.c | 57 +++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 86 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 17 +++++++ 5 files changed, 179 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c index 44a1ca57d6a4..3debf37e7d9b 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.c +++ b/drivers/gpu/drm/v3d/v3d_drv.c @@ -91,6 +91,9 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data, case DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT: args->value = 1; return 0; + case DRM_V3D_PARAM_SUPPORTS_CPU_QUEUE: + args->value = 1; + return 0; default: DRM_DEBUG("Unknown parameter %d\n", args->param); return -EINVAL; @@ -189,6 +192,7 @@ static const struct drm_ioctl_desc v3d_drm_ioctls[] = { DRM_IOCTL_DEF_DRV(V3D_PERFMON_CREATE, v3d_perfmon_create_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(V3D_PERFMON_DESTROY, v3d_perfmon_destroy_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(V3D_PERFMON_GET_VALUES, v3d_perfmon_get_values_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(V3D_SUBMIT_CPU, v3d_submit_cpu_ioctl, DRM_RENDER_ALLOW | DRM_AUTH), }; static const struct drm_driver v3d_drm_driver = { diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 4db9ace66024..2246a0e29955 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -19,7 +19,7 @@ struct reset_control; #define GMP_GRANULARITY (128 * 1024) -#define V3D_MAX_QUEUES (V3D_CACHE_CLEAN + 1) +#define V3D_MAX_QUEUES (V3D_CPU + 1) static inline char *v3d_queue_to_string(enum v3d_queue queue) { @@ -29,6 +29,7 @@ static inline char *v3d_queue_to_string(enum v3d_queue queue) case V3D_TFU: return "tfu"; case V3D_CSD: return "csd"; case V3D_CACHE_CLEAN: return "cache_clean"; + case V3D_CPU: return "cpu"; } return "UNKNOWN"; } @@ -122,6 +123,7 @@ struct v3d_dev { struct v3d_render_job *render_job; struct v3d_tfu_job *tfu_job; struct v3d_csd_job *csd_job; + struct v3d_cpu_job *cpu_job; struct v3d_queue_state queue[V3D_MAX_QUEUES]; @@ -312,6 +314,16 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; }; +enum v3d_cpu_job_type {}; + +struct v3d_cpu_job { + struct v3d_job base; + + enum v3d_cpu_job_type job_type; +}; + +typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); + struct v3d_submit_outsync { struct drm_syncobj *syncobj; }; @@ -414,6 +426,8 @@ int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); /* v3d_irq.c */ int v3d_irq_init(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index fccbea2a5f2e..c89e92fc614c 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -55,6 +55,12 @@ to_csd_job(struct drm_sched_job *sched_job) return container_of(sched_job, struct v3d_csd_job, base.base); } +static struct v3d_cpu_job * +to_cpu_job(struct drm_sched_job *sched_job) +{ + return container_of(sched_job, struct v3d_cpu_job, base.base); +} + static void v3d_sched_job_free(struct drm_sched_job *sched_job) { @@ -262,6 +268,42 @@ v3d_csd_job_run(struct drm_sched_job *sched_job) return fence; } +static const v3d_cpu_job_fn cpu_job_function[] = { }; + +static struct dma_fence * +v3d_cpu_job_run(struct drm_sched_job *sched_job) +{ + struct v3d_cpu_job *job = to_cpu_job(sched_job); + struct v3d_dev *v3d = job->base.v3d; + struct v3d_file_priv *file = job->base.file->driver_priv; + u64 runtime; + + v3d->cpu_job = job; + + if (job->job_type >= ARRAY_SIZE(cpu_job_function)) { + DRM_DEBUG_DRIVER("Unknown CPU job: %d\n", job->job_type); + return NULL; + } + + file->start_ns[V3D_CPU] = local_clock(); + v3d->queue[V3D_CPU].start_ns = file->start_ns[V3D_CPU]; + + cpu_job_function[job->job_type](job); + + runtime = local_clock() - file->start_ns[V3D_CPU]; + + file->enabled_ns[V3D_CPU] += runtime; + v3d->queue[V3D_CPU].enabled_ns += runtime; + + file->jobs_sent[V3D_CPU]++; + v3d->queue[V3D_CPU].jobs_sent++; + + file->start_ns[V3D_CPU] = 0; + v3d->queue[V3D_CPU].start_ns = 0; + + return NULL; +} + static struct dma_fence * v3d_cache_clean_job_run(struct drm_sched_job *sched_job) { @@ -416,6 +458,12 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = { .free_job = v3d_sched_job_free }; +static const struct drm_sched_backend_ops v3d_cpu_sched_ops = { + .run_job = v3d_cpu_job_run, + .timedout_job = v3d_generic_job_timedout, + .free_job = v3d_sched_job_free +}; + int v3d_sched_init(struct v3d_dev *v3d) { @@ -471,6 +519,15 @@ v3d_sched_init(struct v3d_dev *v3d) goto fail; } + ret = drm_sched_init(&v3d->queue[V3D_CPU].sched, + &v3d_cpu_sched_ops, NULL, + DRM_SCHED_PRIORITY_COUNT, + 1, job_hang_limit, + msecs_to_jiffies(hang_limit_ms), NULL, + NULL, "v3d_cpu", v3d->drm.dev); + if (ret) + goto fail; + return 0; fail: diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 148900283c2a..918e28b1d00a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -772,3 +772,89 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, return ret; } + +static const unsigned int cpu_job_bo_handle_count[] = { }; + +/** + * v3d_submit_cpu_ioctl() - Submits a CPU job to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * Userspace specifies the CPU job type and data required to perform its + * operations through the drm_v3d_extension struct. + */ +int +v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct drm_v3d_submit_cpu *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_cpu_job *cpu_job = NULL; + struct ww_acquire_ctx acquire_ctx; + int ret; + + if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { + DRM_INFO("Invalid flags: %d\n", args->flags); + return -EINVAL; + } + + ret = v3d_job_allocate((void *)&cpu_job, sizeof(*cpu_job)); + if (ret) + return ret; + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + goto fail; + } + } + + /* Every CPU job must have a CPU job user extension */ + if (!cpu_job->job_type) { + DRM_DEBUG("CPU job must have a CPU job user extension.\n"); + goto fail; + } + + if (args->bo_handle_count != cpu_job_bo_handle_count[cpu_job->job_type]) { + DRM_DEBUG("This CPU job was not submitted with the proper number of BOs.\n"); + goto fail; + } + + ret = v3d_job_init(v3d, file_priv, &cpu_job->base, + v3d_job_free, 0, &se, V3D_CPU); + if (ret) + goto fail; + + if (args->bo_handle_count) { + ret = v3d_lookup_bos(dev, file_priv, &cpu_job->base, + args->bo_handles, args->bo_handle_count); + if (ret) + goto fail; + + ret = v3d_lock_bo_reservations(&cpu_job->base, &acquire_ctx); + if (ret) + goto fail; + } + + mutex_lock(&v3d->sched_lock); + v3d_push_job(&cpu_job->base); + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + &cpu_job->base, + &acquire_ctx, 0, + NULL, cpu_job->base.done_fence); + + v3d_job_put(&cpu_job->base); + + return 0; + +fail: + v3d_job_cleanup((void *)cpu_job); + v3d_put_multisync_post_deps(&se); + + return ret; +} diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 1a7d7a689de3..00abef9d0db7 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -41,6 +41,7 @@ extern "C" { #define DRM_V3D_PERFMON_CREATE 0x08 #define DRM_V3D_PERFMON_DESTROY 0x09 #define DRM_V3D_PERFMON_GET_VALUES 0x0a +#define DRM_V3D_SUBMIT_CPU 0x0b #define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl) #define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo) @@ -56,6 +57,7 @@ extern "C" { struct drm_v3d_perfmon_destroy) #define DRM_IOCTL_V3D_PERFMON_GET_VALUES DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_PERFMON_GET_VALUES, \ struct drm_v3d_perfmon_get_values) +#define DRM_IOCTL_V3D_SUBMIT_CPU DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CPU, struct drm_v3d_submit_cpu) #define DRM_V3D_SUBMIT_CL_FLUSH_CACHE 0x01 #define DRM_V3D_SUBMIT_EXTENSION 0x02 @@ -93,6 +95,7 @@ enum v3d_queue { V3D_TFU, V3D_CSD, V3D_CACHE_CLEAN, + V3D_CPU, }; /** @@ -276,6 +279,7 @@ enum drm_v3d_param { DRM_V3D_PARAM_SUPPORTS_CACHE_FLUSH, DRM_V3D_PARAM_SUPPORTS_PERFMON, DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT, + DRM_V3D_PARAM_SUPPORTS_CPU_QUEUE, }; struct drm_v3d_get_param { @@ -361,6 +365,19 @@ struct drm_v3d_submit_csd { __u32 pad; }; +struct drm_v3d_submit_cpu { + /* Pointer to a u32 array of the BOs that are referenced by the job. */ + __u64 bo_handles; + + /* Number of BO handles passed in (size is that times 4). */ + __u32 bo_handle_count; + + __u32 flags; + + /* Pointer to an array of ioctl extensions*/ + __u64 extensions; +}; + enum { V3D_PERFCNT_FEP_VALID_PRIMTS_NO_PIXELS, V3D_PERFCNT_FEP_VALID_PRIMS, From patchwork Mon Nov 27 18:48:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470107 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8C7C4C07D5A for ; Mon, 27 Nov 2023 19:00:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E8A0B10E3E2; Mon, 27 Nov 2023 19:00:12 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id E7E2B10E3DB for ; Mon, 27 Nov 2023 19:00:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=c3D9LKo1cPAH9/f4DLFIp0Vb1/L98eEDQhFQP8Pqx8E=; b=nVkpU+ciSHbzlNFCKQFPSTD9kJ otqJidOkbS9uuq63G55I1u7QseMUZi5CsP/QB3oqA1o5NGqSax+QRE/XeA1cOS8FQL7dApGK41Lye BjzNfD4AABVfkygmqIeGxUdyRvRKWK9RqKoxJCROX1NdrsiqFd8H3ZvpqMOtIw7DhKTtw57dBHXyt ar9KtpMF6LQ59/MYgRRwN2ko2ZcXqenIynrhvNfKNCEprzxqHWlT8cgpBgiU14cBMVFfSIFkbwhKQ IuhSGd2X+Y5gKLLHdwAaIbVU82mpA2+NeUWNupFMbG9oQ9Un8hEzGrdQs67G+6lH1lGble9PpO0Cg sjFYMLKg==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpt-007jiM-Ca; Mon, 27 Nov 2023 20:00:01 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 08/17] drm/v3d: Use v3d_get_extensions() to parse CPU job data Date: Mon, 27 Nov 2023 15:48:26 -0300 Message-ID: <20231127185723.10348-10-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Currently, v3d_get_extensions() only parses multisync data and assigns it to the `struct v3d_submit_ext`. But, to implement the CPU job with user extensions, we want v3d_get_extensions() to be able to parse CPU job data and assign it to the `struct v3d_cpu_job`. Therefore, allow the function v3d_get_extensions() to use `struct v3d_cpu_job *` as a parameter. If the `struct v3d_cpu_job *` is assigned to NULL, it means that the job is a GPU job and CPU job extensions should be rejected. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_submit.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 918e28b1d00a..124935547f17 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -323,10 +323,9 @@ v3d_get_multisync_post_deps(struct drm_file *file_priv, static int v3d_get_multisync_submit_deps(struct drm_file *file_priv, struct drm_v3d_extension __user *ext, - void *data) + struct v3d_submit_ext *se) { struct drm_v3d_multi_sync multisync; - struct v3d_submit_ext *se = data; int ret; if (se->in_sync_count || se->out_sync_count) { @@ -340,7 +339,7 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, if (multisync.pad) return -EINVAL; - ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count, + ret = v3d_get_multisync_post_deps(file_priv, se, multisync.out_sync_count, multisync.out_syncs); if (ret) return ret; @@ -359,7 +358,8 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, static int v3d_get_extensions(struct drm_file *file_priv, u64 ext_handles, - void *data) + struct v3d_submit_ext *se, + struct v3d_cpu_job *job) { struct drm_v3d_extension __user *user_ext; int ret; @@ -375,15 +375,16 @@ v3d_get_extensions(struct drm_file *file_priv, switch (ext.id) { case DRM_V3D_EXT_ID_MULTI_SYNC: - ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data); - if (ret) - return ret; + ret = v3d_get_multisync_submit_deps(file_priv, user_ext, se); break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; } + if (ret) + return ret; + user_ext = u64_to_user_ptr(ext.next); } @@ -430,7 +431,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); return ret; @@ -585,7 +586,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); return ret; @@ -692,7 +693,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); return ret; @@ -805,7 +806,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, return ret; if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, cpu_job); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); goto fail; From patchwork Mon Nov 27 18:48:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 975C8C4167B for ; Mon, 27 Nov 2023 19:00:28 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AE69110E3E6; Mon, 27 Nov 2023 19:00:27 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1F90C10E3DB for ; Mon, 27 Nov 2023 19:00:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=uqov3897rRMYgqlae47+1CPx2uh5SHUP6FbfaO38SH8=; b=EV4MVCnj7ee2BgG/m1DGMH8rcR whLatYmrwMHMwrfBLZj9nDaqMlKBfarzAmmXo+/7pZxHiYwHqG34OXRrpofgTvpFpOTamz+JgcvN1 nLYBwk7yEOSyXdU5zipjoPHAgRdUE0IJnp7cgiB5HqqxBdmRB5mR7kYpj/GTfddEBJBRErt20ZbJ2 2LQhaV90f/FdIqUd5XTmN1E35/xKYoNUNB0eLsiY6Q7eCfyz6xbyL7RsLwoqXwcjaKvMH26WrQGRZ YyDR5cfE0SDomX9GdnU25rqRxoCgWUDFQl5KcoNIg2Uc0FNlQ0p4uHC/HiQ/f9tDuFl/gj9iip8y4 +zXLdeVA==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gpx-007jiM-Ng; Mon, 27 Nov 2023 20:00:06 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 09/17] drm/v3d: Create tracepoints to track the CPU job Date: Mon, 27 Nov 2023 15:48:27 -0300 Message-ID: <20231127185723.10348-11-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Create tracepoints to track the three major events of a CPU job lifetime: 1. Submission of a `v3d_submit_cpu` IOCTL 2. Beginning of the execution of a CPU job 3. Ending of the execution of a CPU job Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_sched.c | 4 +++ drivers/gpu/drm/v3d/v3d_submit.c | 2 ++ drivers/gpu/drm/v3d/v3d_trace.h | 57 ++++++++++++++++++++++++++++++++ 3 files changed, 63 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index c89e92fc614c..ebbd00840a73 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -288,8 +288,12 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job) file->start_ns[V3D_CPU] = local_clock(); v3d->queue[V3D_CPU].start_ns = file->start_ns[V3D_CPU]; + trace_v3d_cpu_job_begin(&v3d->drm, job->job_type); + cpu_job_function[job->job_type](job); + trace_v3d_cpu_job_end(&v3d->drm, job->job_type); + runtime = local_clock() - file->start_ns[V3D_CPU]; file->enabled_ns[V3D_CPU] += runtime; diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 124935547f17..c134b113b181 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -824,6 +824,8 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, goto fail; } + trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type); + ret = v3d_job_init(v3d, file_priv, &cpu_job->base, v3d_job_free, 0, &se, V3D_CPU); if (ret) diff --git a/drivers/gpu/drm/v3d/v3d_trace.h b/drivers/gpu/drm/v3d/v3d_trace.h index 7aa8dc356e54..06086ece6e9e 100644 --- a/drivers/gpu/drm/v3d/v3d_trace.h +++ b/drivers/gpu/drm/v3d/v3d_trace.h @@ -225,6 +225,63 @@ TRACE_EVENT(v3d_submit_csd, __entry->seqno) ); +TRACE_EVENT(v3d_submit_cpu_ioctl, + TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type), + TP_ARGS(dev, job_type), + + TP_STRUCT__entry( + __field(u32, dev) + __field(enum v3d_cpu_job_type, job_type) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->job_type = job_type; + ), + + TP_printk("dev=%u, job_type=%d", + __entry->dev, + __entry->job_type) +); + +TRACE_EVENT(v3d_cpu_job_begin, + TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type), + TP_ARGS(dev, job_type), + + TP_STRUCT__entry( + __field(u32, dev) + __field(enum v3d_cpu_job_type, job_type) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->job_type = job_type; + ), + + TP_printk("dev=%u, job_type=%d", + __entry->dev, + __entry->job_type) +); + +TRACE_EVENT(v3d_cpu_job_end, + TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type), + TP_ARGS(dev, job_type), + + TP_STRUCT__entry( + __field(u32, dev) + __field(enum v3d_cpu_job_type, job_type) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->job_type = job_type; + ), + + TP_printk("dev=%u, job_type=%d", + __entry->dev, + __entry->job_type) +); + TRACE_EVENT(v3d_cache_clean_begin, TP_PROTO(struct drm_device *dev), TP_ARGS(dev), From patchwork Mon Nov 27 18:48:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470111 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 02BFDC0755A for ; Mon, 27 Nov 2023 19:00:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A5C6C10E3E7; Mon, 27 Nov 2023 19:00:29 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id DC9BE10E3DB for ; Mon, 27 Nov 2023 19:00:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=HMnP0nsUbp8jy9rjF58RVwPtyVIB0xPs3y4AvaHil6c=; b=f/dwvuvQCaWEs5ZKT5G1hHmJx9 Dm8IuFdOnrvpLQkXwzgf7rUAUVbnkSRzTcXnwdHg4PRyevPfNNlrE9lmq0oZU0p/elDej0jP1GhbK 4UcqJ/4BlnX10o2rsSXynbxi219fv0+u/dUsdBP+ufs23Jc6XX47sIR7OjDepo5yOmW0/yaDOFMNV aSWojTuLxp0uE+eYg2fRVbKOiP4HBZM+qvlKb7nhI2VGhKo5wE+0hGlB1+dhnjTri8lt9xWVZriln OtIc94qeyR5Tr1FMBl/p0TaKt3/FKtnz0D5F6H5hMdc1Mey6h9WIb9jcMyXbDjIha5IEJ8JuWeORl ouRqErow==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gq2-007jiM-9G; Mon, 27 Nov 2023 20:00:10 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 10/17] drm/v3d: Detach the CSD job BO setup Date: Mon, 27 Nov 2023 15:48:28 -0300 Message-ID: <20231127185723.10348-12-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen Detach CSD job setup from CSD submission ioctl to reuse it in CPU submission ioctl for indirect CSD job. Signed-off-by: Melissa Wen Co-developed-by: Maíra Canal Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_submit.c | 68 ++++++++++++++++++++------------ 1 file changed, 42 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index c134b113b181..eb26fe1e27e3 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -256,6 +256,45 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, } } +static int +v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv, + struct v3d_dev *v3d, + struct drm_v3d_submit_csd *args, + struct v3d_csd_job **job, + struct v3d_job **clean_job, + struct v3d_submit_ext *se, + struct ww_acquire_ctx *acquire_ctx) +{ + int ret; + + ret = v3d_job_allocate((void *)job, sizeof(**job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &(*job)->base, + v3d_job_free, args->in_sync, se, V3D_CSD); + if (ret) + return ret; + + ret = v3d_job_allocate((void *)clean_job, sizeof(**clean_job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, *clean_job, + v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); + if (ret) + return ret; + + (*job)->args = *args; + + ret = v3d_lookup_bos(&v3d->drm, file_priv, *clean_job, + args->bo_handles, args->bo_handle_count); + if (ret) + return ret; + + return v3d_lock_bo_reservations(*clean_job, acquire_ctx); +} + static void v3d_put_multisync_post_deps(struct v3d_submit_ext *se) { @@ -700,32 +739,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_allocate((void *)&job, sizeof(*job)); - if (ret) - return ret; - - ret = v3d_job_init(v3d, file_priv, &job->base, - v3d_job_free, args->in_sync, &se, V3D_CSD); - if (ret) - goto fail; - - ret = v3d_job_allocate((void *)&clean_job, sizeof(*clean_job)); - if (ret) - goto fail; - - ret = v3d_job_init(v3d, file_priv, clean_job, - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); - if (ret) - goto fail; - - job->args = *args; - - ret = v3d_lookup_bos(dev, file_priv, clean_job, - args->bo_handles, args->bo_handle_count); - if (ret) - goto fail; - - ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx); + ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d, args, + &job, &clean_job, &se, + &acquire_ctx); if (ret) goto fail; From patchwork Mon Nov 27 18:48:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470110 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 75A94C07D5A for ; Mon, 27 Nov 2023 19:00:29 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8666E10E3DB; Mon, 27 Nov 2023 19:00:28 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id DF7C910E3DB for ; Mon, 27 Nov 2023 19:00:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=3pjxdTIr4LkGzYZHCn63mYo8eOHl+Mm+tfB4KcX1fOk=; b=jnBsB8OrKRVMOwEL2lDJCVVW4T ZohFmZBvHkIkXKO0+b4vw9w/4XMSiVSC/+NBczDAHh1rZw0WYB8SXzWppMJugjKoLJAIu+eVm/7lh F7bNu97ar6Xw8XEvH0F/Ggt114xtMLJ6NjP9/OGDROc6Hcw6WB9fYuU+HZ898j/jqUPmNWFQIBQW/ I0VATdJzf6ZNCQmc47TmJWcF0/qGANNBhQeCZQhs2ifAxmxmTulsv+EI5QYXtTZY75Rbd4lzVntLi 0bjz0Jn1YlYOwj7qRQu8t1+MIOX2rBaOpRFxN1RaIGyyxo+owj62qBB688Jh/ClVHbx+hy8B6WaBF uylP6ong==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gq6-007jiM-Pd; Mon, 27 Nov 2023 20:00:15 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 11/17] drm/v3d: Enable BO mapping Date: Mon, 27 Nov 2023 15:48:29 -0300 Message-ID: <20231127185723.10348-13-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" For the indirect CSD CPU job, we will need to access the internal contents of the BO with the dispatch parameters. Therefore, create methods to allow the mapping and unmapping of the BO. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_bo.c | 18 ++++++++++++++++++ drivers/gpu/drm/v3d/v3d_drv.h | 4 ++++ 2 files changed, 22 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c index 357a0da7e16a..1bdfac8beafd 100644 --- a/drivers/gpu/drm/v3d/v3d_bo.c +++ b/drivers/gpu/drm/v3d/v3d_bo.c @@ -33,6 +33,9 @@ void v3d_free_object(struct drm_gem_object *obj) struct v3d_dev *v3d = to_v3d_dev(obj->dev); struct v3d_bo *bo = to_v3d_bo(obj); + if (bo->vaddr) + v3d_put_bo_vaddr(bo); + v3d_mmu_remove_ptes(bo); mutex_lock(&v3d->bo_lock); @@ -134,6 +137,7 @@ struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv, if (IS_ERR(shmem_obj)) return ERR_CAST(shmem_obj); bo = to_v3d_bo(&shmem_obj->base); + bo->vaddr = NULL; ret = v3d_bo_create_finish(&shmem_obj->base); if (ret) @@ -167,6 +171,20 @@ v3d_prime_import_sg_table(struct drm_device *dev, return obj; } +void v3d_get_bo_vaddr(struct v3d_bo *bo) +{ + struct drm_gem_shmem_object *obj = &bo->base; + + bo->vaddr = vmap(obj->pages, obj->base.size >> PAGE_SHIFT, VM_MAP, + pgprot_writecombine(PAGE_KERNEL)); +} + +void v3d_put_bo_vaddr(struct v3d_bo *bo) +{ + vunmap(bo->vaddr); + bo->vaddr = NULL; +} + int v3d_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 2246a0e29955..39d62915cdd6 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -202,6 +202,8 @@ struct v3d_bo { * v3d_render_job->unref_list */ struct list_head unref_head; + + void *vaddr; }; static inline struct v3d_bo * @@ -391,6 +393,8 @@ struct drm_gem_object *v3d_create_object(struct drm_device *dev, size_t size); void v3d_free_object(struct drm_gem_object *gem_obj); struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv, size_t size); +void v3d_get_bo_vaddr(struct v3d_bo *bo); +void v3d_put_bo_vaddr(struct v3d_bo *bo); int v3d_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_mmap_bo_ioctl(struct drm_device *dev, void *data, From patchwork Mon Nov 27 18:48:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470112 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AA4E3C07D5B for ; Mon, 27 Nov 2023 19:00:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 252A810E3E8; Mon, 27 Nov 2023 19:00:31 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9FA0510E3DB for ; Mon, 27 Nov 2023 19:00:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=uMI0QCITOk9vMzQxG86clxMKpeglXn6x3j01D+W0rGU=; b=f8OWCOqflj9Ngo0st3HkN37svS Yg9mPrY/g4vOgJ/+l/G0LzQPVGOQjlieof9HUNCSUh3y60iUcgldwuM17a+Wvr22Uw4jKkbCnh3Ae egeAcmyRX+PUWJXnbZrS/AY3zzmvmmovyCv/cZg1MA2fP4rykzS5eT4oIfNTNA83v7iH/5HPztW7Q elMYLXPZqV00CDGMiiRAkxBTy0O9Idqu9PEviHU33TML44oqu/+6DTl32Utv63eTZhnmWodUW4hhG 5CQbeNcyKHXjsdJ8VFTfbJlWeYecCsCGNyLmmvQKOVD2kZAyEn1lfbnNdxzhJpM+oVMJaUtgNsnd7 Vj0WCIhw==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gqB-007jiM-QC; Mon, 27 Nov 2023 20:00:20 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 12/17] drm/v3d: Create a CPU job extension for a indirect CSD job Date: Mon, 27 Nov 2023 15:48:30 -0300 Message-ID: <20231127185723.10348-14-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. An indirect CSD job is a job that, when executed in the queue, will map the indirect buffer, read the dispatch parameters, and submit a regular dispatch. Therefore, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of an indirect CSD. This user extension will allow the creation of a CSD job linked to a CPU job. The CPU job will wait for the indirect CSD job dependencies and, once they are signaled, it will update the CSD job parameters. Co-developed-by: Melissa Wen Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 31 ++++++++- drivers/gpu/drm/v3d/v3d_sched.c | 41 +++++++++++- drivers/gpu/drm/v3d/v3d_submit.c | 104 ++++++++++++++++++++++++++++++- include/uapi/drm/v3d_drm.h | 43 ++++++++++++- 4 files changed, 213 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 39d62915cdd6..202c0d4b04a5 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -316,12 +316,41 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; }; -enum v3d_cpu_job_type {}; +enum v3d_cpu_job_type { + V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, +}; + +struct v3d_indirect_csd_info { + /* Indirect CSD */ + struct v3d_csd_job *job; + + /* Clean cache job associated to the Indirect CSD job */ + struct v3d_job *clean_job; + + /* Offset within the BO where the workgroup counts are stored */ + u32 offset; + + /* Workgroups size */ + u32 wg_size; + + /* Indices of the uniforms with the workgroup dispatch counts + * in the uniform stream. + */ + u32 wg_uniform_offsets[3]; + + /* Indirect BO */ + struct drm_gem_object *indirect; + + /* Context of the Indirect CSD job */ + struct ww_acquire_ctx acquire_ctx; +}; struct v3d_cpu_job { struct v3d_job base; enum v3d_cpu_job_type job_type; + + struct v3d_indirect_csd_info indirect_csd; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index ebbd00840a73..597e4ec3d28d 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -25,6 +25,8 @@ #include "v3d_regs.h" #include "v3d_trace.h" +#define V3D_CSD_CFG012_WG_COUNT_SHIFT 16 + static struct v3d_job * to_v3d_job(struct drm_sched_job *sched_job) { @@ -268,7 +270,44 @@ v3d_csd_job_run(struct drm_sched_job *sched_job) return fence; } -static const v3d_cpu_job_fn cpu_job_function[] = { }; +static void +v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job) +{ + struct v3d_indirect_csd_info *indirect_csd = &job->indirect_csd; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + struct v3d_bo *indirect = to_v3d_bo(indirect_csd->indirect); + struct drm_v3d_submit_csd *args = &indirect_csd->job->args; + u32 *wg_counts; + + v3d_get_bo_vaddr(bo); + v3d_get_bo_vaddr(indirect); + + wg_counts = (uint32_t *) (bo->vaddr + indirect_csd->offset); + + if (wg_counts[0] == 0 || wg_counts[1] == 0 || wg_counts[2] == 0) + return; + + args->cfg[0] = wg_counts[0] << V3D_CSD_CFG012_WG_COUNT_SHIFT; + args->cfg[1] = wg_counts[1] << V3D_CSD_CFG012_WG_COUNT_SHIFT; + args->cfg[2] = wg_counts[2] << V3D_CSD_CFG012_WG_COUNT_SHIFT; + args->cfg[4] = DIV_ROUND_UP(indirect_csd->wg_size, 16) * + (wg_counts[0] * wg_counts[1] * wg_counts[2]) - 1; + + for (int i = 0; i < 3; i++) { + /* 0xffffffff indicates that the uniform rewrite is not needed */ + if (indirect_csd->wg_uniform_offsets[i] != 0xffffffff) { + u32 uniform_idx = indirect_csd->wg_uniform_offsets[i]; + ((uint32_t *) indirect->vaddr)[uniform_idx] = wg_counts[i]; + } + } + + v3d_put_bo_vaddr(indirect); + v3d_put_bo_vaddr(bo); +} + +static const v3d_cpu_job_fn cpu_job_function[] = { + [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, +}; static struct dma_fence * v3d_cpu_job_run(struct drm_sched_job *sched_job) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index eb26fe1e27e3..0320695b941b 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -391,6 +391,48 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, return 0; } +/* Get data for the indirect CSD job submission. */ +static int +v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct v3d_dev *v3d = v3d_priv->v3d; + struct drm_v3d_indirect_csd indirect_csd; + struct v3d_indirect_csd_info *info = &job->indirect_csd; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(&indirect_csd, ext, sizeof(indirect_csd))) + return -EFAULT; + + if (!v3d_has_csd(v3d)) { + DRM_DEBUG("Attempting CSD submit on non-CSD hardware.\n"); + return -EINVAL; + } + + job->job_type = V3D_CPU_JOB_TYPE_INDIRECT_CSD; + info->offset = indirect_csd.offset; + info->wg_size = indirect_csd.wg_size; + memcpy(&info->wg_uniform_offsets, &indirect_csd.wg_uniform_offsets, + sizeof(indirect_csd.wg_uniform_offsets)); + + info->indirect = drm_gem_object_lookup(file_priv, indirect_csd.indirect); + + return v3d_setup_csd_jobs_and_bos(file_priv, v3d, &indirect_csd.submit, + &info->job, &info->clean_job, + NULL, &info->acquire_ctx); +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -416,6 +458,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_MULTI_SYNC: ret = v3d_get_multisync_submit_deps(file_priv, user_ext, se); break; + case DRM_V3D_EXT_ID_CPU_INDIRECT_CSD: + ret = v3d_get_cpu_indirect_csd_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -790,7 +835,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, return ret; } -static const unsigned int cpu_job_bo_handle_count[] = { }; +static const unsigned int cpu_job_bo_handle_count[] = { + [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, +}; /** * v3d_submit_cpu_ioctl() - Submits a CPU job to the V3D. @@ -808,7 +855,10 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct drm_v3d_submit_cpu *args = data; struct v3d_submit_ext se = {0}; + struct v3d_submit_ext *out_se = NULL; struct v3d_cpu_job *cpu_job = NULL; + struct v3d_csd_job *csd_job = NULL; + struct v3d_job *clean_job = NULL; struct ww_acquire_ctx acquire_ctx; int ret; @@ -847,6 +897,9 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, if (ret) goto fail; + clean_job = cpu_job->indirect_csd.clean_job; + csd_job = cpu_job->indirect_csd.job; + if (args->bo_handle_count) { ret = v3d_lookup_bos(dev, file_priv, &cpu_job->base, args->bo_handles, args->bo_handle_count); @@ -860,19 +913,66 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, mutex_lock(&v3d->sched_lock); v3d_push_job(&cpu_job->base); + + switch (cpu_job->job_type) { + case V3D_CPU_JOB_TYPE_INDIRECT_CSD: + ret = drm_sched_job_add_dependency(&csd_job->base.base, + dma_fence_get(cpu_job->base.done_fence)); + if (ret) + goto fail_unreserve; + + v3d_push_job(&csd_job->base); + + ret = drm_sched_job_add_dependency(&clean_job->base, + dma_fence_get(csd_job->base.done_fence)); + if (ret) + goto fail_unreserve; + + v3d_push_job(clean_job); + + break; + default: + break; + } mutex_unlock(&v3d->sched_lock); + out_se = (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) ? NULL : &se; + v3d_attach_fences_and_unlock_reservation(file_priv, &cpu_job->base, &acquire_ctx, 0, - NULL, cpu_job->base.done_fence); + out_se, cpu_job->base.done_fence); + + switch (cpu_job->job_type) { + case V3D_CPU_JOB_TYPE_INDIRECT_CSD: + v3d_attach_fences_and_unlock_reservation(file_priv, + clean_job, + &cpu_job->indirect_csd.acquire_ctx, + 0, &se, clean_job->done_fence); + break; + default: + break; + } v3d_job_put(&cpu_job->base); + v3d_job_put(&csd_job->base); + v3d_job_put(clean_job); return 0; +fail_unreserve: + mutex_unlock(&v3d->sched_lock); + + drm_gem_unlock_reservations(cpu_job->base.bo, cpu_job->base.bo_count, + &acquire_ctx); + + drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count, + &cpu_job->indirect_csd.acquire_ctx); + fail: v3d_job_cleanup((void *)cpu_job); + v3d_job_cleanup((void *)csd_job); + v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); return ret; diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 00abef9d0db7..0c0f47782528 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -71,7 +71,8 @@ extern "C" { struct drm_v3d_extension { __u64 next; __u32 id; -#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 +#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 +#define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 __u32 flags; /* mbz */ }; @@ -365,8 +366,46 @@ struct drm_v3d_submit_csd { __u32 pad; }; +/** + * struct drm_v3d_indirect_csd - ioctl extension for the CPU job to create an + * indirect CSD + * + * When an extension of DRM_V3D_EXT_ID_CPU_INDIRECT_CSD id is defined, it + * points to this extension to define a indirect CSD submission. It creates a + * CPU job linked to a CSD job. The CPU job waits for the indirect CSD + * dependencies and, once they are signaled, it updates the CSD job config + * before allowing the CSD job execution. + */ +struct drm_v3d_indirect_csd { + struct drm_v3d_extension base; + + /* Indirect CSD */ + struct drm_v3d_submit_csd submit; + + /* Handle of the indirect BO, that should be also attached to the + * indirect CSD. + */ + __u32 indirect; + + /* Offset within the BO where the workgroup counts are stored */ + __u32 offset; + + /* Workgroups size */ + __u32 wg_size; + + /* Indices of the uniforms with the workgroup dispatch counts + * in the uniform stream. If the uniform rewrite is not needed, + * the offset must be 0xffffffff. + */ + __u32 wg_uniform_offsets[3]; +}; + struct drm_v3d_submit_cpu { - /* Pointer to a u32 array of the BOs that are referenced by the job. */ + /* Pointer to a u32 array of the BOs that are referenced by the job. + * + * For DRM_V3D_EXT_ID_CPU_INDIRECT_CSD, it must contain only one BO, + * that contains the workgroup counts. + */ __u64 bo_handles; /* Number of BO handles passed in (size is that times 4). */ From patchwork Mon Nov 27 18:48:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470113 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C1958C4167B for ; Mon, 27 Nov 2023 19:00:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2F72E10E3E5; Mon, 27 Nov 2023 19:00:47 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id C79EC10E3EB for ; Mon, 27 Nov 2023 19:00:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=tkwSRbUqSNWZRpNVa8ChwV7I46/Xz9VOreFD73RwF2I=; b=i28rHpMu1FDaNOOkOlDKTTkcEp YHrsYaNQ3Fx9r9oREK0b7/k+OiVWX+zZytbCNL4KAp+fhZfDHM0naDpqpYMV9PswKWZyQjmaIKjFw BP1u51JqCArO/ygomJ6ClYHW9fb9hQUn/kBwt/Yn7TGcGrosT9/sVl0AcCjTZb0Qf8uO+iLOWDJCh XhJCE4dYx7Zajsn0un/orDb2HulUoULJOJVAaAyQdmHqvh4TpkYwqm+vxJDADhqopBg4aOK1teGb3 X4zSxIE78BO2t0c1kVHqL+rLZOCuaNEz3rxFjWcC7vZnEIIHAYC6bfsmFO2Ui/Q9qR6OmacWfnuFJ ism2HowQ==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gqG-007jiM-6u; Mon, 27 Nov 2023 20:00:24 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 13/17] drm/v3d: Create a CPU job extension for the timestamp query job Date: Mon, 27 Nov 2023 15:48:31 -0300 Message-ID: <20231127185723.10348-15-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A timestamp query job is a job that calculates the query timestamp and updates the query availability by signaling a syncobj. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a timestamp query job. This user extension will allow the creation of a CPU job that performs the timestamp query calculation and updates the timestamp BO with the proper value. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 17 +++++++++ drivers/gpu/drm/v3d/v3d_sched.c | 40 +++++++++++++++++++- drivers/gpu/drm/v3d/v3d_submit.c | 63 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 30 +++++++++++++++ 4 files changed, 149 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 202c0d4b04a5..dd86e745c260 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -318,6 +318,15 @@ struct v3d_csd_job { enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, + V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, +}; + +struct v3d_timestamp_query { + /* Offset of this query in the timestamp BO for its value. */ + u32 offset; + + /* Syncobj that indicates the timestamp availability */ + struct drm_syncobj *syncobj; }; struct v3d_indirect_csd_info { @@ -345,12 +354,20 @@ struct v3d_indirect_csd_info { struct ww_acquire_ctx acquire_ctx; }; +struct v3d_timestamp_query_info { + struct v3d_timestamp_query *queries; + + u32 count; +}; + struct v3d_cpu_job { struct v3d_job base; enum v3d_cpu_job_type job_type; struct v3d_indirect_csd_info indirect_csd; + + struct v3d_timestamp_query_info timestamp_query; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 597e4ec3d28d..f4cffefb6398 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -21,6 +21,8 @@ #include #include +#include + #include "v3d_drv.h" #include "v3d_regs.h" #include "v3d_trace.h" @@ -71,6 +73,21 @@ v3d_sched_job_free(struct drm_sched_job *sched_job) v3d_job_cleanup(job); } +static void +v3d_cpu_job_free(struct drm_sched_job *sched_job) +{ + struct v3d_cpu_job *job = to_cpu_job(sched_job); + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + + if (timestamp_query->queries) { + for (int i = 0; i < timestamp_query->count; i++) + drm_syncobj_put(timestamp_query->queries[i].syncobj); + kvfree(timestamp_query->queries); + } + + v3d_job_cleanup(&job->base); +} + static void v3d_switch_perfmon(struct v3d_dev *v3d, struct v3d_job *job) { @@ -305,8 +322,29 @@ v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +v3d_timestamp_query(struct v3d_cpu_job *job) +{ + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + u8 *value_addr; + + v3d_get_bo_vaddr(bo); + + for (int i = 0; i < timestamp_query->count; i++) { + value_addr = ((u8 *) bo->vaddr) + timestamp_query->queries[i].offset; + *((u64 *) value_addr) = i == 0 ? ktime_get_ns() : 0ull; + + drm_syncobj_replace_fence(timestamp_query->queries[i].syncobj, + job->base.done_fence); + } + + v3d_put_bo_vaddr(bo); +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, + [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, }; static struct dma_fence * @@ -504,7 +542,7 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = { static const struct drm_sched_backend_ops v3d_cpu_sched_ops = { .run_job = v3d_cpu_job_run, .timedout_job = v3d_generic_job_timedout, - .free_job = v3d_sched_job_free + .free_job = v3d_cpu_job_free }; int diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 0320695b941b..83e029e786ea 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -433,6 +433,64 @@ v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv, NULL, &info->acquire_ctx); } +/* Get data for the query timestamp job submission. */ +static int +v3d_get_cpu_timestamp_query_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *offsets, *syncs; + struct drm_v3d_timestamp_query timestamp; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(×tamp, ext, sizeof(timestamp))) + return -EFAULT; + + if (timestamp.pad) + return -EINVAL; + + job->job_type = V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY; + + job->timestamp_query.queries = kvmalloc_array(timestamp.count, + sizeof(struct v3d_timestamp_query), + GFP_KERNEL); + if (!job->timestamp_query.queries) + return -ENOMEM; + + offsets = u64_to_user_ptr(timestamp.offsets); + syncs = u64_to_user_ptr(timestamp.syncs); + + for (int i = 0; i < timestamp.count; i++) { + u32 offset, sync; + + if (copy_from_user(&offset, offsets++, sizeof(offset))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].offset = offset; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + } + job->timestamp_query.count = timestamp.count; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -461,6 +519,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_INDIRECT_CSD: ret = v3d_get_cpu_indirect_csd_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY: + ret = v3d_get_cpu_timestamp_query_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -837,6 +898,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, + [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, }; /** @@ -974,6 +1036,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, v3d_job_cleanup((void *)csd_job); v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); + kvfree(cpu_job->timestamp_query.queries); return ret; } diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 0c0f47782528..239801b5e117 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -73,6 +73,7 @@ struct drm_v3d_extension { __u32 id; #define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 #define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 +#define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 __u32 flags; /* mbz */ }; @@ -400,11 +401,40 @@ struct drm_v3d_indirect_csd { __u32 wg_uniform_offsets[3]; }; +/** + * struct drm_v3d_timestamp_query - ioctl extension for the CPU job to calculate + * a timestamp query + * + * When an extension DRM_V3D_EXT_ID_TIMESTAMP_QUERY is defined, it points to + * this extension to define a timestamp query submission. This CPU job will + * calculate the timestamp query and update the query value within the + * timestamp BO. Moreover, it will signal the timestamp syncobj to indicate + * query availability. + */ +struct drm_v3d_timestamp_query { + struct drm_v3d_extension base; + + /* Array of queries' offsets within the timestamp BO for their value */ + __u64 offsets; + + /* Array of timestamp's syncobjs to indicate its availability */ + __u64 syncs; + + /* Number of queries */ + __u32 count; + + /* mbz */ + __u32 pad; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * * For DRM_V3D_EXT_ID_CPU_INDIRECT_CSD, it must contain only one BO, * that contains the workgroup counts. + * + * For DRM_V3D_EXT_ID_TIMESTAMP_QUERY, it must contain only one BO, + * that will contain the timestamp. */ __u64 bo_handles; From patchwork Mon Nov 27 18:48:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470114 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7D759C07D5A for ; Mon, 27 Nov 2023 19:00:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 86B6010E3EC; Mon, 27 Nov 2023 19:00:53 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2F34B10E3DA for ; Mon, 27 Nov 2023 19:00:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=pISkqw68B3zYuF5SFRapXoE4tm5x/Xr4nsB+Yj7rJbM=; b=bhBvJkxBh+S6TPqZemGjTpX5q6 iPT7FiX3D8SHAfpody2Tj8pKDRfEdO7zyHSjz1xdm4h9CMNFgzGqxIkXpcoMOC22r+RFK8BF0J06R qmn2tk2zOJSM+0mUuqPoD0+gnRbJdfe9Jh1AUGHrAGEiKy9xO3u5d/4o91Rn9KpEM4IusWRcrqzOn qU0tZi1SKPapGgX1Q6kHR1c9lxeWFmUgWYXtFwrej25ulB5WQrhKbFa81og0In/rYmb+YXh1SemAJ uP5ulj2f3RJ3YKVIOPZzTXjTkkl+HW1SHBfjsrABBhrLnxRYjlFXnsx+8zto1PO1yzJK6Gr2psRXX /17xwlQg==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gqK-007jiM-Jw; Mon, 27 Nov 2023 20:00:29 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 14/17] drm/v3d: Create a CPU job extension for the reset timestamp job Date: Mon, 27 Nov 2023 15:48:32 -0300 Message-ID: <20231127185723.10348-16-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A reset timestamp job is a job that resets the timestamp queries based on the value offset of the first query. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a reset timestamp job. This user extension will allow the creation of a CPU job that resets the timestamp value in the timestamp BO and resets the availability syncobj. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 1 + drivers/gpu/drm/v3d/v3d_sched.c | 21 +++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 52 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 27 +++++++++++++++++ 4 files changed, 101 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index dd86e745c260..3988407635ed 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -319,6 +319,7 @@ struct v3d_csd_job { enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, + V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, }; struct v3d_timestamp_query { diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index f4cffefb6398..3a435f621b9b 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -342,9 +342,30 @@ v3d_timestamp_query(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +v3d_reset_timestamp_queries(struct v3d_cpu_job *job) +{ + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_timestamp_query *queries = timestamp_query->queries; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + u8 *value_addr; + + v3d_get_bo_vaddr(bo); + + for (int i = 0; i < timestamp_query->count; i++) { + value_addr = ((u8 *) bo->vaddr) + queries[i].offset; + *((u64 *) value_addr) = 0; + + drm_syncobj_replace_fence(queries[i].syncobj, NULL); + } + + v3d_put_bo_vaddr(bo); +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, + [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 83e029e786ea..1c719416e26a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -491,6 +491,54 @@ v3d_get_cpu_timestamp_query_params(struct drm_file *file_priv, return 0; } +static int +v3d_get_cpu_reset_timestamp_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *syncs; + struct drm_v3d_reset_timestamp_query reset; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(&reset, ext, sizeof(reset))) + return -EFAULT; + + job->job_type = V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY; + + job->timestamp_query.queries = kvmalloc_array(reset.count, + sizeof(struct v3d_timestamp_query), + GFP_KERNEL); + if (!job->timestamp_query.queries) + return -ENOMEM; + + syncs = u64_to_user_ptr(reset.syncs); + + for (int i = 0; i < reset.count; i++) { + u32 sync; + + job->timestamp_query.queries[i].offset = reset.offset + 8 * i; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + } + job->timestamp_query.count = reset.count; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -522,6 +570,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY: ret = v3d_get_cpu_timestamp_query_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY: + ret = v3d_get_cpu_reset_timestamp_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -899,6 +950,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, + [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, }; /** diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 239801b5e117..930f8d07f088 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -74,6 +74,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 #define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 #define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 +#define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 __u32 flags; /* mbz */ }; @@ -427,6 +428,29 @@ struct drm_v3d_timestamp_query { __u32 pad; }; +/** + * struct drm_v3d_reset_timestamp_query - ioctl extension for the CPU job to + * reset timestamp queries + * + * When an extension DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY is defined, it + * points to this extension to define a reset timestamp submission. This CPU + * job will reset the timestamp queries based on value offset of the first + * query. Moreover, it will reset the timestamp syncobj to reset query + * availability. + */ +struct drm_v3d_reset_timestamp_query { + struct drm_v3d_extension base; + + /* Array of timestamp's syncobjs to indicate its availability */ + __u64 syncs; + + /* Offset of the first query within the timestamp BO for its value */ + __u32 offset; + + /* Number of queries */ + __u32 count; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -435,6 +459,9 @@ struct drm_v3d_submit_cpu { * * For DRM_V3D_EXT_ID_TIMESTAMP_QUERY, it must contain only one BO, * that will contain the timestamp. + * + * For DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY, it must contain only + * one BO, that contains the timestamp. */ __u64 bo_handles; From patchwork Mon Nov 27 18:48:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470115 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BDE06C4167B for ; Mon, 27 Nov 2023 19:00:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CE69110E3ED; Mon, 27 Nov 2023 19:00:55 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id D7FFC10E3EE for ; Mon, 27 Nov 2023 19:00:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=OIJCu2edircauhqrwPzHBMvx1O2KoWIpubpVkdX1G/k=; b=ENlaMZgA4pti3HDo5ST0Z3Izwv EU+gXpyxnE9KX6zLV2MZE2B+8XBge9zFBKvMlvWtgP1cJRaRrQy7Rfb2uqBEdlrEL3XvUS1lNM0le Lz+QtxtgUIPMe1VvFA2PUXjE1HAwCdtzxp0GgOYLUCzfM8sUJPVFWKe6q11BkmsrNDo2TkP6LsVxx IwW48v9cOsKXNNHX+/EJcuVJtxilio2dWoqTpXbQEHzxP6aH40zsbE70ks/uDMVCiSAlnFvmlymwU wREYYfB8OVedHnFVw7VqGofBn4WXW9fqbBa49izjYlNLkq+FgEurtSI0Va2GAAt3TVz1MbXn/FWwW l/e8wpJw==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gqP-007jiM-3z; Mon, 27 Nov 2023 20:00:34 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 15/17] drm/v3d: Create a CPU job extension to copy timestamp query to a buffer Date: Mon, 27 Nov 2023 15:48:33 -0300 Message-ID: <20231127185723.10348-17-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A copy timestamp query job is a job that copy the complete or partial result of a query to a buffer. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a copy timestamp query job. This user extension will allow the creation of a CPU job that copy the results of a timestamp query to a BO with the possibility to indicate the timestamp availability with a availability bit. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 20 +++++++++ drivers/gpu/drm/v3d/v3d_sched.c | 53 ++++++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 69 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 45 +++++++++++++++++++++ 4 files changed, 187 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 3988407635ed..5058a354fffd 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -320,6 +320,7 @@ enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, + V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY, }; struct v3d_timestamp_query { @@ -361,6 +362,23 @@ struct v3d_timestamp_query_info { u32 count; }; +struct v3d_copy_query_results_info { + /* Define if should write to buffer using 64 or 32 bits */ + bool do_64bit; + + /* Define if it can write to buffer even if the query is not available */ + bool do_partial; + + /* Define if it should write availability bit to buffer */ + bool availability_bit; + + /* Offset of the copy buffer in the BO */ + u32 offset; + + /* Stride of the copy buffer in the BO */ + u32 stride; +}; + struct v3d_cpu_job { struct v3d_job base; @@ -369,6 +387,8 @@ struct v3d_cpu_job { struct v3d_indirect_csd_info indirect_csd; struct v3d_timestamp_query_info timestamp_query; + + struct v3d_copy_query_results_info copy; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 3a435f621b9b..07c897cd3423 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -361,11 +361,64 @@ v3d_reset_timestamp_queries(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +write_to_buffer(void *dst, u32 idx, bool do_64bit, u64 value) +{ + if (do_64bit) { + u64 *dst64 = (u64 *) dst; + dst64[idx] = value; + } else { + u32 *dst32 = (u32 *) dst; + dst32[idx] = (u32) value; + } +} + +static void +v3d_copy_query_results(struct v3d_cpu_job *job) +{ + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_timestamp_query *queries = timestamp_query->queries; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + struct v3d_bo *timestamp = to_v3d_bo(job->base.bo[1]); + struct v3d_copy_query_results_info *copy = &job->copy; + struct dma_fence *fence; + u8 *query_addr; + bool available, write_result; + u8 *data; + int i; + + v3d_get_bo_vaddr(bo); + v3d_get_bo_vaddr(timestamp); + + data = ((u8 *) bo->vaddr) + copy->offset; + + for (i = 0; i < timestamp_query->count; i++) { + fence = drm_syncobj_fence_get(queries[i].syncobj); + available = fence ? dma_fence_is_signaled(fence) : false; + + write_result = available || copy->do_partial; + if (write_result) { + query_addr = ((u8 *) timestamp->vaddr) + queries[i].offset; + write_to_buffer(data, 0, copy->do_64bit, *((u64 *) query_addr)); + } + + if (copy->availability_bit) + write_to_buffer(data, 1, copy->do_64bit, available ? 1u : 0u); + + data += copy->stride; + + dma_fence_put(fence); + } + + v3d_put_bo_vaddr(timestamp); + v3d_put_bo_vaddr(bo); +} static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, + [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = v3d_copy_query_results, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 1c719416e26a..bafd49c6440c 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -539,6 +539,71 @@ v3d_get_cpu_reset_timestamp_params(struct drm_file *file_priv, return 0; } +/* Get data for the copy timestamp query results job submission. */ +static int +v3d_get_cpu_copy_query_results_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *offsets, *syncs; + struct drm_v3d_copy_timestamp_query copy; + int i; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(©, ext, sizeof(copy))) + return -EFAULT; + + if (copy.pad) + return -EINVAL; + + job->job_type = V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY; + + job->timestamp_query.queries = kvmalloc_array(copy.count, + sizeof(struct v3d_timestamp_query), + GFP_KERNEL); + if (!job->timestamp_query.queries) + return -ENOMEM; + + offsets = u64_to_user_ptr(copy.offsets); + syncs = u64_to_user_ptr(copy.syncs); + + for (i = 0; i < copy.count; i++) { + u32 offset, sync; + + if (copy_from_user(&offset, offsets++, sizeof(offset))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].offset = offset; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + } + job->timestamp_query.count = copy.count; + + job->copy.do_64bit = copy.do_64bit; + job->copy.do_partial = copy.do_partial; + job->copy.availability_bit = copy.availability_bit; + job->copy.offset = copy.offset; + job->copy.stride = copy.stride; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -573,6 +638,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY: ret = v3d_get_cpu_reset_timestamp_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY: + ret = v3d_get_cpu_copy_query_results_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -951,6 +1019,7 @@ static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, + [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 2, }; /** diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 930f8d07f088..a3ae1f220291 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -75,6 +75,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 #define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 #define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 +#define DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY 0x05 __u32 flags; /* mbz */ }; @@ -451,6 +452,46 @@ struct drm_v3d_reset_timestamp_query { __u32 count; }; +/** + * struct drm_v3d_copy_timestamp_query - ioctl extension for the CPU job to copy + * query results to a buffer + * + * When an extension DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY is defined, it + * points to this extension to define a copy timestamp query submission. This + * CPU job will copy the timestamp queries results to a BO with the offset + * and stride defined in the extension. + */ +struct drm_v3d_copy_timestamp_query { + struct drm_v3d_extension base; + + /* Define if should write to buffer using 64 or 32 bits */ + __u8 do_64bit; + + /* Define if it can write to buffer even if the query is not available */ + __u8 do_partial; + + /* Define if it should write availability bit to buffer */ + __u8 availability_bit; + + /* mbz */ + __u8 pad; + + /* Offset of the buffer in the BO */ + __u32 offset; + + /* Stride of the buffer in the BO */ + __u32 stride; + + /* Number of queries */ + __u32 count; + + /* Array of queries' offsets within the timestamp BO for their value */ + __u64 offsets; + + /* Array of timestamp's syncobjs to indicate its availability */ + __u64 syncs; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -462,6 +503,10 @@ struct drm_v3d_submit_cpu { * * For DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY, it must contain only * one BO, that contains the timestamp. + * + * For DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY, it must contain two + * BOs. The first is the BO for which the timestamp queries results + * will be written to. The second is the BO that contains the timestamp. */ __u64 bo_handles; From patchwork Mon Nov 27 18:48:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470117 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F2CA9C07D5B for ; Mon, 27 Nov 2023 19:00:58 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 29A9810E3EF; Mon, 27 Nov 2023 19:00:58 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1CFD110E3EE for ; Mon, 27 Nov 2023 19:00:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=n3z6/OmNltRLmJdYsIrfAJ0puP2fWFFVAr41jeR3Tjc=; b=L8ElCum/z3CO1KOCf9nj5FV9jm YjExcPizQ0PkbIsaH5lgUNmThYrIKR93Gw7pWy7tcgkgSz8lm4sflLm/3zhBR0+7bBhi6X2RnxcKd lm4m7bxHqeNfli9OjcJPgmGhiefVPHv45vGHzAabaS+z2slrfJQ6k6XdJgNmueNKPFUqOgxE5OV4f QS/qtSQX7gr+DwXgV29tMymkwq4NCAhT6VTGTTVHADFFgMDmqPqFSFG26pkTAV0KiFRmvpzNcSt1/ vM9jBLwskS1bN7u1W0hTkvSEup5EedxaORUqh+BxIRw4WM3Mo31HCVxEZMYc7FluD6hiWhjiNiJGS frqmlPbg==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gqT-007jiM-UG; Mon, 27 Nov 2023 20:00:38 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 16/17] drm/v3d: Create a CPU job extension for the reset performance query job Date: Mon, 27 Nov 2023 15:48:34 -0300 Message-ID: <20231127185723.10348-18-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A reset performance query job is a job that resets the performance queries by resetting the values of the perfmons. Moreover, we also reset the syncobjs related to the availability of the query. So, create a user extension for the CPU job that enables the creation of a reset performance job. This user extension will allow the creation of a CPU job that resets the perfmons values and resets the availability syncobj. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 28 ++++++++++++ drivers/gpu/drm/v3d/v3d_sched.c | 37 ++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 73 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 30 +++++++++++++ 4 files changed, 168 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 5058a354fffd..0f7f80ad8d88 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -321,6 +321,7 @@ enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY, + V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY, }; struct v3d_timestamp_query { @@ -331,6 +332,18 @@ struct v3d_timestamp_query { struct drm_syncobj *syncobj; }; +/* Number of perfmons required to handle all supported performance counters */ +#define V3D_MAX_PERFMONS DIV_ROUND_UP(V3D_PERFCNT_NUM, \ + DRM_V3D_MAX_PERF_COUNTERS) + +struct v3d_performance_query { + /* Performance monitor IDs for this query */ + u32 kperfmon_ids[V3D_MAX_PERFMONS]; + + /* Syncobj that indicates the query availability */ + struct drm_syncobj *syncobj; +}; + struct v3d_indirect_csd_info { /* Indirect CSD */ struct v3d_csd_job *job; @@ -362,6 +375,19 @@ struct v3d_timestamp_query_info { u32 count; }; +struct v3d_performance_query_info { + struct v3d_performance_query *queries; + + /* Number of performance queries */ + u32 count; + + /* Number of performance monitors related to that query pool */ + u32 nperfmons; + + /* Number of performance counters related to that query pool */ + u32 ncounters; +}; + struct v3d_copy_query_results_info { /* Define if should write to buffer using 64 or 32 bits */ bool do_64bit; @@ -389,6 +415,8 @@ struct v3d_cpu_job { struct v3d_timestamp_query_info timestamp_query; struct v3d_copy_query_results_info copy; + + struct v3d_performance_query_info performance_query; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 07c897cd3423..452c4a1db52e 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -78,6 +78,7 @@ v3d_cpu_job_free(struct drm_sched_job *sched_job) { struct v3d_cpu_job *job = to_cpu_job(sched_job); struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_performance_query_info *performance_query = &job->performance_query; if (timestamp_query->queries) { for (int i = 0; i < timestamp_query->count; i++) @@ -85,6 +86,12 @@ v3d_cpu_job_free(struct drm_sched_job *sched_job) kvfree(timestamp_query->queries); } + if (performance_query->queries) { + for (int i = 0; i < performance_query->count; i++) + drm_syncobj_put(performance_query->queries[i].syncobj); + kvfree(performance_query->queries); + } + v3d_job_cleanup(&job->base); } @@ -361,6 +368,7 @@ v3d_reset_timestamp_queries(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } + static void write_to_buffer(void *dst, u32 idx, bool do_64bit, u64 value) { @@ -414,11 +422,40 @@ v3d_copy_query_results(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +v3d_reset_performance_queries(struct v3d_cpu_job *job) +{ + struct v3d_performance_query_info *performance_query = &job->performance_query; + struct v3d_file_priv *v3d_priv = job->base.file->driver_priv; + struct v3d_dev *v3d = job->base.v3d; + struct v3d_perfmon *perfmon; + + for (int i = 0; i < performance_query->count; i++) { + for (int j = 0; j < performance_query->nperfmons; j++) { + perfmon = v3d_perfmon_find(v3d_priv, + performance_query->queries[i].kperfmon_ids[j]); + if (!perfmon) { + DRM_DEBUG("Failed to find perfmon."); + continue; + } + + v3d_perfmon_stop(v3d, perfmon, false); + + memset(perfmon->values, 0, perfmon->ncounters * sizeof(u64)); + + v3d_perfmon_put(perfmon); + } + + drm_syncobj_replace_fence(performance_query->queries[i].syncobj, NULL); + } +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = v3d_copy_query_results, + [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = v3d_reset_performance_queries, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index bafd49c6440c..20af8ae14831 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -604,6 +604,74 @@ v3d_get_cpu_copy_query_results_params(struct drm_file *file_priv, return 0; } +static int +v3d_get_cpu_reset_performance_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *syncs; + u64 __user *kperfmon_ids; + struct drm_v3d_reset_performance_query reset; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(&reset, ext, sizeof(reset))) + return -EFAULT; + + job->job_type = V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY; + + job->performance_query.queries = kvmalloc_array(reset.count, + sizeof(struct v3d_performance_query), + GFP_KERNEL); + if (!job->performance_query.queries) + return -ENOMEM; + + syncs = u64_to_user_ptr(reset.syncs); + kperfmon_ids = u64_to_user_ptr(reset.kperfmon_ids); + + for (int i = 0; i < reset.count; i++) { + u32 sync; + u64 ids; + u32 __user *ids_pointer; + u32 id; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + + if (copy_from_user(&ids, kperfmon_ids++, sizeof(ids))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + ids_pointer = u64_to_user_ptr(ids); + + for (int j = 0; j < reset.nperfmons; j++) { + if (copy_from_user(&id, ids_pointer++, sizeof(id))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].kperfmon_ids[j] = id; + } + } + job->performance_query.count = reset.count; + job->performance_query.nperfmons = reset.nperfmons; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -641,6 +709,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY: ret = v3d_get_cpu_copy_query_results_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY: + ret = v3d_get_cpu_reset_performance_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -1020,6 +1091,7 @@ static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 2, + [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = 0, }; /** @@ -1158,6 +1230,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); kvfree(cpu_job->timestamp_query.queries); + kvfree(cpu_job->performance_query.queries); return ret; } diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index a3ae1f220291..76a02d2c01e6 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -76,6 +76,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 #define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 #define DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY 0x05 +#define DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY 0x06 __u32 flags; /* mbz */ }; @@ -492,6 +493,32 @@ struct drm_v3d_copy_timestamp_query { __u64 syncs; }; +/** + * struct drm_v3d_reset_performance_query - ioctl extension for the CPU job to + * reset performance queries + * + * When an extension DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY is defined, it + * points to this extension to define a reset performance submission. This CPU + * job will reset the performance queries by resetting the values of the + * performance monitors. Moreover, it will reset the syncobj to reset query + * availability. + */ +struct drm_v3d_reset_performance_query { + struct drm_v3d_extension base; + + /* Array of performance queries's syncobjs to indicate its availability */ + __u64 syncs; + + /* Number of queries */ + __u32 count; + + /* Number of performance monitors */ + __u32 nperfmons; + + /* Array of u64 user-pointers that point to an array of kperfmon_ids */ + __u64 kperfmon_ids; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -507,6 +534,9 @@ struct drm_v3d_submit_cpu { * For DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY, it must contain two * BOs. The first is the BO for which the timestamp queries results * will be written to. The second is the BO that contains the timestamp. + * + * For DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY, it must contain no + * BOs. */ __u64 bo_handles; From patchwork Mon Nov 27 18:48:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13470116 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 71DFCC0755A for ; Mon, 27 Nov 2023 19:00:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 287FB10E3EB; Mon, 27 Nov 2023 19:00:56 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2FBE810E3F0 for ; Mon, 27 Nov 2023 19:00:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=ObzC/3SX+nS+wOkkXQ/jYYbF80eiNaCoXzJj8BrLUbw=; b=BcDrLTOit4/g38BwwK5hNW642o daQ1QqalfsLhR2EcOeWmlxNhxi383xZbmcTUfROKFFM64n5an4yCCYeALYW3KjkBwPlq77KcBugEI loy7TnI2kQeehXAkREqluLRw00LlJAOReOd6+VPnD2UbOVkJwyh7e0BZ4UOAUgOfN7ejUNiyQz6qd RcCXkZ2Mt7mwCV2GUY4ItB8FNssVNjWa4wZpB7fC34VjFkQrlmQd//MvhoRAoT9ZNmXmHjPPgJYH7 8MBS0+grkJr6/T4K4ruC778kl6s2CRe4JToKC6IPPP8rfiF8jCNyIXjv87rqBsl7k0tWcPULQZEYO 17xw4wrA==; Received: from [177.34.168.16] (helo=morissey..) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r7gqY-007jiM-7d; Mon, 27 Nov 2023 20:00:42 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v3 17/17] drm/v3d: Create a CPU job extension for the copy performance query job Date: Mon, 27 Nov 2023 15:48:35 -0300 Message-ID: <20231127185723.10348-19-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231127185723.10348-2-mcanal@igalia.com> References: <20231127185723.10348-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , Boris Brezillon , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A copy performance query job is a job that copy the complete or partial result of a query to a buffer. In order to copy the result of a performance query to a buffer, we need to get the values from the performance monitors. So, create a user extension for the CPU job that enables the creation of a copy performance query job. This user extension will allow the creation of a CPU job that copy the results of a performance query to a BO with the possibility to indicate the availability with a availability bit. Signed-off-by: Maíra Canal --- drivers/gpu/drm/v3d/v3d_drv.h | 1 + drivers/gpu/drm/v3d/v3d_sched.c | 66 +++++++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 82 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 50 +++++++++++++++++++ 4 files changed, 199 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 0f7f80ad8d88..3c7d58866570 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -322,6 +322,7 @@ enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY, + V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY, }; struct v3d_timestamp_query { diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 452c4a1db52e..203c32ed99d4 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -450,12 +450,78 @@ v3d_reset_performance_queries(struct v3d_cpu_job *job) } } +static void +v3d_write_performance_query_result(struct v3d_cpu_job *job, void *data, u32 query) +{ + struct v3d_performance_query_info *performance_query = &job->performance_query; + struct v3d_copy_query_results_info *copy = &job->copy; + struct v3d_file_priv *v3d_priv = job->base.file->driver_priv; + struct v3d_dev *v3d = job->base.v3d; + struct v3d_perfmon *perfmon; + u64 counter_values[V3D_PERFCNT_NUM]; + + for (int i = 0; i < performance_query->nperfmons; i++) { + perfmon = v3d_perfmon_find(v3d_priv, + performance_query->queries[query].kperfmon_ids[i]); + if (!perfmon) { + DRM_DEBUG("Failed to find perfmon."); + continue; + } + + v3d_perfmon_stop(v3d, perfmon, true); + + memcpy(&counter_values[i * DRM_V3D_MAX_PERF_COUNTERS], perfmon->values, + perfmon->ncounters * sizeof(u64)); + + v3d_perfmon_put(perfmon); + } + + for (int i = 0; i < performance_query->ncounters; i++) + write_to_buffer(data, i, copy->do_64bit, counter_values[i]); +} + + +static void +v3d_copy_performance_query(struct v3d_cpu_job *job) +{ + struct v3d_performance_query_info *performance_query = &job->performance_query; + struct v3d_copy_query_results_info *copy = &job->copy; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + struct dma_fence *fence; + bool available, write_result; + u8 *data; + + v3d_get_bo_vaddr(bo); + + data = ((u8 *) bo->vaddr) + copy->offset; + + for (int i = 0; i < performance_query->count; i++) { + fence = drm_syncobj_fence_get(performance_query->queries[i].syncobj); + available = fence ? dma_fence_is_signaled(fence) : false; + + write_result = available || copy->do_partial; + if (write_result) + v3d_write_performance_query_result(job, data, i); + + if (copy->availability_bit) + write_to_buffer(data, performance_query->ncounters, + copy->do_64bit, available ? 1u : 0u); + + data += copy->stride; + + dma_fence_put(fence); + } + + v3d_put_bo_vaddr(bo); +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = v3d_copy_query_results, [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = v3d_reset_performance_queries, + [V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY] = v3d_copy_performance_query, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 20af8ae14831..d7a9da2484fd 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -672,6 +672,84 @@ v3d_get_cpu_reset_performance_params(struct drm_file *file_priv, return 0; } +static int +v3d_get_cpu_copy_performance_query_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *syncs; + u64 __user *kperfmon_ids; + struct drm_v3d_copy_performance_query copy; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(©, ext, sizeof(copy))) + return -EFAULT; + + if (copy.pad) + return -EINVAL; + + job->job_type = V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY; + + job->performance_query.queries = kvmalloc_array(copy.count, + sizeof(struct v3d_performance_query), + GFP_KERNEL); + if (!job->performance_query.queries) + return -ENOMEM; + + syncs = u64_to_user_ptr(copy.syncs); + kperfmon_ids = u64_to_user_ptr(copy.kperfmon_ids); + + for (int i = 0; i < copy.count; i++) { + u32 sync; + u64 ids; + u32 __user *ids_pointer; + u32 id; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + + if (copy_from_user(&ids, kperfmon_ids++, sizeof(ids))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + ids_pointer = u64_to_user_ptr(ids); + + for (int j = 0; j < copy.nperfmons; j++) { + if (copy_from_user(&id, ids_pointer++, sizeof(id))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].kperfmon_ids[j] = id; + } + } + job->performance_query.count = copy.count; + job->performance_query.nperfmons = copy.nperfmons; + job->performance_query.ncounters = copy.ncounters; + + job->copy.do_64bit = copy.do_64bit; + job->copy.do_partial = copy.do_partial; + job->copy.availability_bit = copy.availability_bit; + job->copy.offset = copy.offset; + job->copy.stride = copy.stride; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -712,6 +790,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY: ret = v3d_get_cpu_reset_performance_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY: + ret = v3d_get_cpu_copy_performance_query_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -1092,6 +1173,7 @@ static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 2, [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = 0, + [V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY] = 1, }; /** diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 76a02d2c01e6..9b99d554ef9c 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -77,6 +77,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 #define DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY 0x05 #define DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY 0x06 +#define DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY 0x07 __u32 flags; /* mbz */ }; @@ -519,6 +520,52 @@ struct drm_v3d_reset_performance_query { __u64 kperfmon_ids; }; +/** + * struct drm_v3d_copy_performance_query - ioctl extension for the CPU job to copy + * performance query results to a buffer + * + * When an extension DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY is defined, it + * points to this extension to define a copy performance query submission. This + * CPU job will copy the performance queries results to a BO with the offset + * and stride defined in the extension. + */ +struct drm_v3d_copy_performance_query { + struct drm_v3d_extension base; + + /* Define if should write to buffer using 64 or 32 bits */ + __u8 do_64bit; + + /* Define if it can write to buffer even if the query is not available */ + __u8 do_partial; + + /* Define if it should write availability bit to buffer */ + __u8 availability_bit; + + /* mbz */ + __u8 pad; + + /* Offset of the buffer in the BO */ + __u32 offset; + + /* Stride of the buffer in the BO */ + __u32 stride; + + /* Number of performance monitors */ + __u32 nperfmons; + + /* Number of performance counters related to this query pool */ + __u32 ncounters; + + /* Number of queries */ + __u32 count; + + /* Array of performance queries's syncobjs to indicate its availability */ + __u64 syncs; + + /* Array of u64 user-pointers that point to an array of kperfmon_ids */ + __u64 kperfmon_ids; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -537,6 +584,9 @@ struct drm_v3d_submit_cpu { * * For DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY, it must contain no * BOs. + * + * For DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY, it must contain one + * BO, for which the performance queries will be written to. */ __u64 bo_handles;