From patchwork Thu Nov 30 16:40:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474718 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B599FC4167B for ; Thu, 30 Nov 2023 16:45:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E8E0B10E72F; Thu, 30 Nov 2023 16:45:12 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id D627D10E72F for ; Thu, 30 Nov 2023 16:45:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=NQPfvidAO9rRShgNUNnBv+pClshCvM70dgU+3LBW0M8=; b=rTCmbFsr0VyNj6zhn+sig21ZAL bTzNcW9Gp4EpCSrEp+w5QQMGu9NoJo4xpcL8ntSlVi5NL9VPWuMoxdHe0jtwBZJg+XTqF6VQlsP9h +2YlzlhWDuM0cpSInTYm0ig/6XeFo7xyEjoKxT25wYW/GTk0DFFzsxaf7PHq4eZ6zSqR9cqDdbSWZ n19jvxEA18ocbhR+Oi0oH1r+GKjmXQ4IT8x+Y7zsuzWJf5+zeuH3ySkBlgZdBrhMN8qm9Rw8vCLqU S0N2lxMydvowWbq264H/W9kd/TNzJn1Rzz/IMpd1cnW9iz0G6xLJQSlyRg/ZJWiqrPWKqrv67eJbK rVDVVMRQ==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8k9u-008scY-Gj; Thu, 30 Nov 2023 17:45:03 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 01/17] drm/v3d: Remove unused function header Date: Thu, 30 Nov 2023 13:40:24 -0300 Message-ID: <20231130164420.932823-3-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen v3d_mmu_get_offset header was added but the function was never defined. Just remove it. Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 4c59fefaa0b4..d8b392494eba 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -420,8 +420,6 @@ void v3d_irq_disable(struct v3d_dev *v3d); void v3d_irq_reset(struct v3d_dev *v3d); /* v3d_mmu.c */ -int v3d_mmu_get_offset(struct drm_file *file_priv, struct v3d_bo *bo, - u32 *offset); int v3d_mmu_set_page_table(struct v3d_dev *v3d); void v3d_mmu_insert_ptes(struct v3d_bo *bo); void v3d_mmu_remove_ptes(struct v3d_bo *bo); From patchwork Thu Nov 30 16:40:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474720 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4477FC10DC2 for ; Thu, 30 Nov 2023 16:45:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6D7EC10E733; Thu, 30 Nov 2023 16:45:22 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 76E7910E732 for ; Thu, 30 Nov 2023 16:45:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=pbxcQ8iWwdrbRajbi0maiBflYX/Xs059SsiwpGRI+fU=; b=h0gaOUS0Wn2np5k8gUCki/Xs9u qBd68voX3jxPkJg8ppd/dYwIBNa4lP6FcMbmu/kC3GEvIWuTrUM3K3avB546gfJni2vW0VDbm/ZbK S6zRwZeAEAptfsCDrKhbFxNcDs40InQaBmX69T3rWE6uImYaGuyb4+D7O7bPvvWWLqRvrmyzpyHh1 U0UycSH+TU3crWz9mTuZcxja7rdA0teRRICMX6KhQBqDPcHkYS5m8uV/aOFGKBK6+kqrQJqD4I17T +vqZFsIY2PVILgbEO3cMbLtkWtVj84A/Sf9etuhuFk5iJ2jItxDS09jhbUlO9Q5PN1fGWSngyOYLR BU9nf6iQ==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8k9y-008scY-DW; Thu, 30 Nov 2023 17:45:07 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 02/17] drm/v3d: Move wait BO ioctl to the v3d_bo file Date: Thu, 30 Nov 2023 13:40:25 -0300 Message-ID: <20231130164420.932823-4-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen IOCTLs related to BO operations reside on the file v3d_bo.c. The wait BO ioctl is the only IOCTL regarding BOs that is placed in a different file. So, move it to the v3d_bo.c file. Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_bo.c | 33 +++++++++++++++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_drv.h | 4 ++-- drivers/gpu/drm/v3d/v3d_gem.c | 33 --------------------------------- 3 files changed, 35 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c index 8b3229a37c6d..357a0da7e16a 100644 --- a/drivers/gpu/drm/v3d/v3d_bo.c +++ b/drivers/gpu/drm/v3d/v3d_bo.c @@ -233,3 +233,36 @@ int v3d_get_bo_offset_ioctl(struct drm_device *dev, void *data, drm_gem_object_put(gem_obj); return 0; } + +int +v3d_wait_bo_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + int ret; + struct drm_v3d_wait_bo *args = data; + ktime_t start = ktime_get(); + u64 delta_ns; + unsigned long timeout_jiffies = + nsecs_to_jiffies_timeout(args->timeout_ns); + + if (args->pad != 0) + return -EINVAL; + + ret = drm_gem_dma_resv_wait(file_priv, args->handle, + true, timeout_jiffies); + + /* Decrement the user's timeout, in case we got interrupted + * such that the ioctl will be restarted. + */ + delta_ns = ktime_to_ns(ktime_sub(ktime_get(), start)); + if (delta_ns < args->timeout_ns) + args->timeout_ns -= delta_ns; + else + args->timeout_ns = 0; + + /* Asked to wait beyond the jiffie/scheduler precision? */ + if (ret == -ETIME && args->timeout_ns) + ret = -EAGAIN; + + return ret; +} diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index d8b392494eba..fc04372d5cbd 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -385,6 +385,8 @@ int v3d_mmap_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_get_bo_offset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int v3d_wait_bo_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); struct drm_gem_object *v3d_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt); @@ -405,8 +407,6 @@ int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); -int v3d_wait_bo_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); void v3d_job_cleanup(struct v3d_job *job); void v3d_job_put(struct v3d_job *job); void v3d_reset(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 9d2ac23c29e3..26f62a48d226 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -363,39 +363,6 @@ void v3d_job_put(struct v3d_job *job) kref_put(&job->refcount, job->free); } -int -v3d_wait_bo_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - int ret; - struct drm_v3d_wait_bo *args = data; - ktime_t start = ktime_get(); - u64 delta_ns; - unsigned long timeout_jiffies = - nsecs_to_jiffies_timeout(args->timeout_ns); - - if (args->pad != 0) - return -EINVAL; - - ret = drm_gem_dma_resv_wait(file_priv, args->handle, - true, timeout_jiffies); - - /* Decrement the user's timeout, in case we got interrupted - * such that the ioctl will be restarted. - */ - delta_ns = ktime_to_ns(ktime_sub(ktime_get(), start)); - if (delta_ns < args->timeout_ns) - args->timeout_ns -= delta_ns; - else - args->timeout_ns = 0; - - /* Asked to wait beyond the jiffie/scheduler precision? */ - if (ret == -ETIME && args->timeout_ns) - ret = -EAGAIN; - - return ret; -} - static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, void **container, size_t size, void (*free)(struct kref *ref), From patchwork Thu Nov 30 16:40:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D654AC4167B for ; Thu, 30 Nov 2023 16:45:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 890FF10E732; Thu, 30 Nov 2023 16:45:23 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5A0A510E732 for ; Thu, 30 Nov 2023 16:45:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=cJ/U4+vutet4RnDW9/7yzoRRAzwOVPANEffs0MF42tc=; b=OIp1Kkw4lXrnivs3BuQf/ZtA0J iwWRyU+ASVSwxo8EFZ0mwCOyjS/9xH6Sk9L1uI6M3g+YnaGDer20+ahwBce80S2hk9FWAo5Yvza/1 T9tFMiVFTKkmZ71ezmbnB/hl4wKIYnEv3fXz9PICyYVpr3N4ra032iq4gH6pzg2Z/UhaU2b7vFB0i fVYjHtMJs+LDvZrkdlqpP5HjgLB7BSjr/KS2vtq6ItmTdHkAiLqjprRkYtGZvWuN85Dy7nT9YkK8/ zEDe9sw5HXSdU8EhVjHjJAQBHBraq2H1/kTUgw8VcHXKq+digtxvJYldN8OWSAdLntx+A7dj9OeSV EnxxbTLg==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kA2-008scY-B8; Thu, 30 Nov 2023 17:45:11 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 03/17] drm/v3d: Detach job submissions IOCTLs to a new specific file Date: Thu, 30 Nov 2023 13:40:26 -0300 Message-ID: <20231130164420.932823-5-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen We will include a new job submission type, the CPU job submission. For readability and maintability, separate the job submission IOCTLs and related operations from v3d_gem.c. Minor fix in the CSD submission kernel doc: CSD (texture formatting) -> CSD (compute shader). Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/Makefile | 3 +- drivers/gpu/drm/v3d/v3d_drv.h | 12 +- drivers/gpu/drm/v3d/v3d_gem.c | 735 ------------------------------ drivers/gpu/drm/v3d/v3d_submit.c | 744 +++++++++++++++++++++++++++++++ 4 files changed, 753 insertions(+), 741 deletions(-) create mode 100644 drivers/gpu/drm/v3d/v3d_submit.c diff --git a/drivers/gpu/drm/v3d/Makefile b/drivers/gpu/drm/v3d/Makefile index 4b21b20e4998..b7d673f1153b 100644 --- a/drivers/gpu/drm/v3d/Makefile +++ b/drivers/gpu/drm/v3d/Makefile @@ -12,7 +12,8 @@ v3d-y := \ v3d_perfmon.o \ v3d_trace_points.o \ v3d_sched.o \ - v3d_sysfs.o + v3d_sysfs.o \ + v3d_submit.o v3d-$(CONFIG_DEBUG_FS) += v3d_debugfs.o diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index fc04372d5cbd..4db9ace66024 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -401,17 +401,19 @@ struct dma_fence *v3d_fence_create(struct v3d_dev *v3d, enum v3d_queue queue); /* v3d_gem.c */ int v3d_gem_init(struct drm_device *dev); void v3d_gem_destroy(struct drm_device *dev); +void v3d_reset(struct v3d_dev *v3d); +void v3d_invalidate_caches(struct v3d_dev *v3d); +void v3d_clean_caches(struct v3d_dev *v3d); + +/* v3d_submit.c */ +void v3d_job_cleanup(struct v3d_job *job); +void v3d_job_put(struct v3d_job *job); int v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); -void v3d_job_cleanup(struct v3d_job *job); -void v3d_job_put(struct v3d_job *job); -void v3d_reset(struct v3d_dev *v3d); -void v3d_invalidate_caches(struct v3d_dev *v3d); -void v3d_clean_caches(struct v3d_dev *v3d); /* v3d_irq.c */ int v3d_irq_init(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 26f62a48d226..afc565078c78 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -11,8 +11,6 @@ #include #include -#include -#include #include "v3d_drv.h" #include "v3d_regs.h" @@ -241,739 +239,6 @@ v3d_invalidate_caches(struct v3d_dev *v3d) v3d_invalidate_slices(v3d, 0); } -/* Takes the reservation lock on all the BOs being referenced, so that - * at queue submit time we can update the reservations. - * - * We don't lock the RCL the tile alloc/state BOs, or overflow memory - * (all of which are on exec->unref_list). They're entirely private - * to v3d, so we don't attach dma-buf fences to them. - */ -static int -v3d_lock_bo_reservations(struct v3d_job *job, - struct ww_acquire_ctx *acquire_ctx) -{ - int i, ret; - - ret = drm_gem_lock_reservations(job->bo, job->bo_count, acquire_ctx); - if (ret) - return ret; - - for (i = 0; i < job->bo_count; i++) { - ret = dma_resv_reserve_fences(job->bo[i]->resv, 1); - if (ret) - goto fail; - - ret = drm_sched_job_add_implicit_dependencies(&job->base, - job->bo[i], true); - if (ret) - goto fail; - } - - return 0; - -fail: - drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); - return ret; -} - -/** - * v3d_lookup_bos() - Sets up job->bo[] with the GEM objects - * referenced by the job. - * @dev: DRM device - * @file_priv: DRM file for this fd - * @job: V3D job being set up - * @bo_handles: GEM handles - * @bo_count: Number of GEM handles passed in - * - * The command validator needs to reference BOs by their index within - * the submitted job's BO list. This does the validation of the job's - * BO list and reference counting for the lifetime of the job. - * - * Note that this function doesn't need to unreference the BOs on - * failure, because that will happen at v3d_exec_cleanup() time. - */ -static int -v3d_lookup_bos(struct drm_device *dev, - struct drm_file *file_priv, - struct v3d_job *job, - u64 bo_handles, - u32 bo_count) -{ - job->bo_count = bo_count; - - if (!job->bo_count) { - /* See comment on bo_index for why we have to check - * this. - */ - DRM_DEBUG("Rendering requires BOs\n"); - return -EINVAL; - } - - return drm_gem_objects_lookup(file_priv, - (void __user *)(uintptr_t)bo_handles, - job->bo_count, &job->bo); -} - -static void -v3d_job_free(struct kref *ref) -{ - struct v3d_job *job = container_of(ref, struct v3d_job, refcount); - int i; - - if (job->bo) { - for (i = 0; i < job->bo_count; i++) - drm_gem_object_put(job->bo[i]); - kvfree(job->bo); - } - - dma_fence_put(job->irq_fence); - dma_fence_put(job->done_fence); - - if (job->perfmon) - v3d_perfmon_put(job->perfmon); - - kfree(job); -} - -static void -v3d_render_job_free(struct kref *ref) -{ - struct v3d_render_job *job = container_of(ref, struct v3d_render_job, - base.refcount); - struct v3d_bo *bo, *save; - - list_for_each_entry_safe(bo, save, &job->unref_list, unref_head) { - drm_gem_object_put(&bo->base.base); - } - - v3d_job_free(ref); -} - -void v3d_job_cleanup(struct v3d_job *job) -{ - if (!job) - return; - - drm_sched_job_cleanup(&job->base); - v3d_job_put(job); -} - -void v3d_job_put(struct v3d_job *job) -{ - kref_put(&job->refcount, job->free); -} - -static int -v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, - void **container, size_t size, void (*free)(struct kref *ref), - u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) -{ - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct v3d_job *job; - bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); - int ret, i; - - *container = kcalloc(1, size, GFP_KERNEL); - if (!*container) { - DRM_ERROR("Cannot allocate memory for v3d job."); - return -ENOMEM; - } - - job = *container; - job->v3d = v3d; - job->free = free; - job->file = file_priv; - - ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], - 1, v3d_priv); - if (ret) - goto fail; - - if (has_multisync) { - if (se->in_sync_count && se->wait_stage == queue) { - struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs); - - for (i = 0; i < se->in_sync_count; i++) { - struct drm_v3d_sem in; - - if (copy_from_user(&in, handle++, sizeof(in))) { - ret = -EFAULT; - DRM_DEBUG("Failed to copy wait dep handle.\n"); - goto fail_deps; - } - ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in.handle, 0); - - // TODO: Investigate why this was filtered out for the IOCTL. - if (ret && ret != -ENOENT) - goto fail_deps; - } - } - } else { - ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in_sync, 0); - - // TODO: Investigate why this was filtered out for the IOCTL. - if (ret && ret != -ENOENT) - goto fail_deps; - } - - kref_init(&job->refcount); - - return 0; - -fail_deps: - drm_sched_job_cleanup(&job->base); -fail: - kfree(*container); - *container = NULL; - - return ret; -} - -static void -v3d_push_job(struct v3d_job *job) -{ - drm_sched_job_arm(&job->base); - - job->done_fence = dma_fence_get(&job->base.s_fence->finished); - - /* put by scheduler job completion */ - kref_get(&job->refcount); - - drm_sched_entity_push_job(&job->base); -} - -static void -v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, - struct v3d_job *job, - struct ww_acquire_ctx *acquire_ctx, - u32 out_sync, - struct v3d_submit_ext *se, - struct dma_fence *done_fence) -{ - struct drm_syncobj *sync_out; - bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); - int i; - - for (i = 0; i < job->bo_count; i++) { - /* XXX: Use shared fences for read-only objects. */ - dma_resv_add_fence(job->bo[i]->resv, job->done_fence, - DMA_RESV_USAGE_WRITE); - } - - drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); - - /* Update the return sync object for the job */ - /* If it only supports a single signal semaphore*/ - if (!has_multisync) { - sync_out = drm_syncobj_find(file_priv, out_sync); - if (sync_out) { - drm_syncobj_replace_fence(sync_out, done_fence); - drm_syncobj_put(sync_out); - } - return; - } - - /* If multiple semaphores extension is supported */ - if (se->out_sync_count) { - for (i = 0; i < se->out_sync_count; i++) { - drm_syncobj_replace_fence(se->out_syncs[i].syncobj, - done_fence); - drm_syncobj_put(se->out_syncs[i].syncobj); - } - kvfree(se->out_syncs); - } -} - -static void -v3d_put_multisync_post_deps(struct v3d_submit_ext *se) -{ - unsigned int i; - - if (!(se && se->out_sync_count)) - return; - - for (i = 0; i < se->out_sync_count; i++) - drm_syncobj_put(se->out_syncs[i].syncobj); - kvfree(se->out_syncs); -} - -static int -v3d_get_multisync_post_deps(struct drm_file *file_priv, - struct v3d_submit_ext *se, - u32 count, u64 handles) -{ - struct drm_v3d_sem __user *post_deps; - int i, ret; - - if (!count) - return 0; - - se->out_syncs = (struct v3d_submit_outsync *) - kvmalloc_array(count, - sizeof(struct v3d_submit_outsync), - GFP_KERNEL); - if (!se->out_syncs) - return -ENOMEM; - - post_deps = u64_to_user_ptr(handles); - - for (i = 0; i < count; i++) { - struct drm_v3d_sem out; - - if (copy_from_user(&out, post_deps++, sizeof(out))) { - ret = -EFAULT; - DRM_DEBUG("Failed to copy post dep handles\n"); - goto fail; - } - - se->out_syncs[i].syncobj = drm_syncobj_find(file_priv, - out.handle); - if (!se->out_syncs[i].syncobj) { - ret = -EINVAL; - goto fail; - } - } - se->out_sync_count = count; - - return 0; - -fail: - for (i--; i >= 0; i--) - drm_syncobj_put(se->out_syncs[i].syncobj); - kvfree(se->out_syncs); - - return ret; -} - -/* Get data for multiple binary semaphores synchronization. Parse syncobj - * to be signaled when job completes (out_sync). - */ -static int -v3d_get_multisync_submit_deps(struct drm_file *file_priv, - struct drm_v3d_extension __user *ext, - void *data) -{ - struct drm_v3d_multi_sync multisync; - struct v3d_submit_ext *se = data; - int ret; - - if (copy_from_user(&multisync, ext, sizeof(multisync))) - return -EFAULT; - - if (multisync.pad) - return -EINVAL; - - ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count, - multisync.out_syncs); - if (ret) - return ret; - - se->in_sync_count = multisync.in_sync_count; - se->in_syncs = multisync.in_syncs; - se->flags |= DRM_V3D_EXT_ID_MULTI_SYNC; - se->wait_stage = multisync.wait_stage; - - return 0; -} - -/* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data - * according to the extension id (name). - */ -static int -v3d_get_extensions(struct drm_file *file_priv, - u64 ext_handles, - void *data) -{ - struct drm_v3d_extension __user *user_ext; - int ret; - - user_ext = u64_to_user_ptr(ext_handles); - while (user_ext) { - struct drm_v3d_extension ext; - - if (copy_from_user(&ext, user_ext, sizeof(ext))) { - DRM_DEBUG("Failed to copy submit extension\n"); - return -EFAULT; - } - - switch (ext.id) { - case DRM_V3D_EXT_ID_MULTI_SYNC: - ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data); - if (ret) - return ret; - break; - default: - DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); - return -EINVAL; - } - - user_ext = u64_to_user_ptr(ext.next); - } - - return 0; -} - -/** - * v3d_submit_cl_ioctl() - Submits a job (frame) to the V3D. - * @dev: DRM device - * @data: ioctl argument - * @file_priv: DRM file for this fd - * - * This is the main entrypoint for userspace to submit a 3D frame to - * the GPU. Userspace provides the binner command list (if - * applicable), and the kernel sets up the render command list to draw - * to the framebuffer described in the ioctl, using the command lists - * that the 3D engine's binner will produce. - */ -int -v3d_submit_cl_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct v3d_dev *v3d = to_v3d_dev(dev); - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct drm_v3d_submit_cl *args = data; - struct v3d_submit_ext se = {0}; - struct v3d_bin_job *bin = NULL; - struct v3d_render_job *render = NULL; - struct v3d_job *clean_job = NULL; - struct v3d_job *last_job; - struct ww_acquire_ctx acquire_ctx; - int ret = 0; - - trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end); - - if (args->pad) - return -EINVAL; - - if (args->flags && - args->flags & ~(DRM_V3D_SUBMIT_CL_FLUSH_CACHE | - DRM_V3D_SUBMIT_EXTENSION)) { - DRM_INFO("invalid flags: %d\n", args->flags); - return -EINVAL; - } - - if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); - if (ret) { - DRM_DEBUG("Failed to get extensions.\n"); - return ret; - } - } - - ret = v3d_job_init(v3d, file_priv, (void *)&render, sizeof(*render), - v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER); - if (ret) - goto fail; - - render->start = args->rcl_start; - render->end = args->rcl_end; - INIT_LIST_HEAD(&render->unref_list); - - if (args->bcl_start != args->bcl_end) { - ret = v3d_job_init(v3d, file_priv, (void *)&bin, sizeof(*bin), - v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); - if (ret) - goto fail; - - bin->start = args->bcl_start; - bin->end = args->bcl_end; - bin->qma = args->qma; - bin->qms = args->qms; - bin->qts = args->qts; - bin->render = render; - } - - if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) { - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); - if (ret) - goto fail; - - last_job = clean_job; - } else { - last_job = &render->base; - } - - ret = v3d_lookup_bos(dev, file_priv, last_job, - args->bo_handles, args->bo_handle_count); - if (ret) - goto fail; - - ret = v3d_lock_bo_reservations(last_job, &acquire_ctx); - if (ret) - goto fail; - - if (args->perfmon_id) { - render->base.perfmon = v3d_perfmon_find(v3d_priv, - args->perfmon_id); - - if (!render->base.perfmon) { - ret = -ENOENT; - goto fail_perfmon; - } - } - - mutex_lock(&v3d->sched_lock); - if (bin) { - bin->base.perfmon = render->base.perfmon; - v3d_perfmon_get(bin->base.perfmon); - v3d_push_job(&bin->base); - - ret = drm_sched_job_add_dependency(&render->base.base, - dma_fence_get(bin->base.done_fence)); - if (ret) - goto fail_unreserve; - } - - v3d_push_job(&render->base); - - if (clean_job) { - struct dma_fence *render_fence = - dma_fence_get(render->base.done_fence); - ret = drm_sched_job_add_dependency(&clean_job->base, - render_fence); - if (ret) - goto fail_unreserve; - clean_job->perfmon = render->base.perfmon; - v3d_perfmon_get(clean_job->perfmon); - v3d_push_job(clean_job); - } - - mutex_unlock(&v3d->sched_lock); - - v3d_attach_fences_and_unlock_reservation(file_priv, - last_job, - &acquire_ctx, - args->out_sync, - &se, - last_job->done_fence); - - if (bin) - v3d_job_put(&bin->base); - v3d_job_put(&render->base); - if (clean_job) - v3d_job_put(clean_job); - - return 0; - -fail_unreserve: - mutex_unlock(&v3d->sched_lock); -fail_perfmon: - drm_gem_unlock_reservations(last_job->bo, - last_job->bo_count, &acquire_ctx); -fail: - v3d_job_cleanup((void *)bin); - v3d_job_cleanup((void *)render); - v3d_job_cleanup(clean_job); - v3d_put_multisync_post_deps(&se); - - return ret; -} - -/** - * v3d_submit_tfu_ioctl() - Submits a TFU (texture formatting) job to the V3D. - * @dev: DRM device - * @data: ioctl argument - * @file_priv: DRM file for this fd - * - * Userspace provides the register setup for the TFU, which we don't - * need to validate since the TFU is behind the MMU. - */ -int -v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct v3d_dev *v3d = to_v3d_dev(dev); - struct drm_v3d_submit_tfu *args = data; - struct v3d_submit_ext se = {0}; - struct v3d_tfu_job *job = NULL; - struct ww_acquire_ctx acquire_ctx; - int ret = 0; - - trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia); - - if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { - DRM_DEBUG("invalid flags: %d\n", args->flags); - return -EINVAL; - } - - if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); - if (ret) { - DRM_DEBUG("Failed to get extensions.\n"); - return ret; - } - } - - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), - v3d_job_free, args->in_sync, &se, V3D_TFU); - if (ret) - goto fail; - - job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), - sizeof(*job->base.bo), GFP_KERNEL); - if (!job->base.bo) { - ret = -ENOMEM; - goto fail; - } - - job->args = *args; - - for (job->base.bo_count = 0; - job->base.bo_count < ARRAY_SIZE(args->bo_handles); - job->base.bo_count++) { - struct drm_gem_object *bo; - - if (!args->bo_handles[job->base.bo_count]) - break; - - bo = drm_gem_object_lookup(file_priv, args->bo_handles[job->base.bo_count]); - if (!bo) { - DRM_DEBUG("Failed to look up GEM BO %d: %d\n", - job->base.bo_count, - args->bo_handles[job->base.bo_count]); - ret = -ENOENT; - goto fail; - } - job->base.bo[job->base.bo_count] = bo; - } - - ret = v3d_lock_bo_reservations(&job->base, &acquire_ctx); - if (ret) - goto fail; - - mutex_lock(&v3d->sched_lock); - v3d_push_job(&job->base); - mutex_unlock(&v3d->sched_lock); - - v3d_attach_fences_and_unlock_reservation(file_priv, - &job->base, &acquire_ctx, - args->out_sync, - &se, - job->base.done_fence); - - v3d_job_put(&job->base); - - return 0; - -fail: - v3d_job_cleanup((void *)job); - v3d_put_multisync_post_deps(&se); - - return ret; -} - -/** - * v3d_submit_csd_ioctl() - Submits a CSD (texture formatting) job to the V3D. - * @dev: DRM device - * @data: ioctl argument - * @file_priv: DRM file for this fd - * - * Userspace provides the register setup for the CSD, which we don't - * need to validate since the CSD is behind the MMU. - */ -int -v3d_submit_csd_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct v3d_dev *v3d = to_v3d_dev(dev); - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct drm_v3d_submit_csd *args = data; - struct v3d_submit_ext se = {0}; - struct v3d_csd_job *job = NULL; - struct v3d_job *clean_job = NULL; - struct ww_acquire_ctx acquire_ctx; - int ret; - - trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]); - - if (args->pad) - return -EINVAL; - - if (!v3d_has_csd(v3d)) { - DRM_DEBUG("Attempting CSD submit on non-CSD hardware\n"); - return -EINVAL; - } - - if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { - DRM_INFO("invalid flags: %d\n", args->flags); - return -EINVAL; - } - - if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); - if (ret) { - DRM_DEBUG("Failed to get extensions.\n"); - return ret; - } - } - - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), - v3d_job_free, args->in_sync, &se, V3D_CSD); - if (ret) - goto fail; - - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); - if (ret) - goto fail; - - job->args = *args; - - ret = v3d_lookup_bos(dev, file_priv, clean_job, - args->bo_handles, args->bo_handle_count); - if (ret) - goto fail; - - ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx); - if (ret) - goto fail; - - if (args->perfmon_id) { - job->base.perfmon = v3d_perfmon_find(v3d_priv, - args->perfmon_id); - if (!job->base.perfmon) { - ret = -ENOENT; - goto fail_perfmon; - } - } - - mutex_lock(&v3d->sched_lock); - v3d_push_job(&job->base); - - ret = drm_sched_job_add_dependency(&clean_job->base, - dma_fence_get(job->base.done_fence)); - if (ret) - goto fail_unreserve; - - v3d_push_job(clean_job); - mutex_unlock(&v3d->sched_lock); - - v3d_attach_fences_and_unlock_reservation(file_priv, - clean_job, - &acquire_ctx, - args->out_sync, - &se, - clean_job->done_fence); - - v3d_job_put(&job->base); - v3d_job_put(clean_job); - - return 0; - -fail_unreserve: - mutex_unlock(&v3d->sched_lock); -fail_perfmon: - drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count, - &acquire_ctx); -fail: - v3d_job_cleanup((void *)job); - v3d_job_cleanup(clean_job); - v3d_put_multisync_post_deps(&se); - - return ret; -} - int v3d_gem_init(struct drm_device *dev) { diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c new file mode 100644 index 000000000000..f36214002f37 --- /dev/null +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -0,0 +1,744 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2014-2018 Broadcom + * Copyright (C) 2023 Raspberry Pi + */ + +#include + +#include "v3d_drv.h" +#include "v3d_regs.h" +#include "v3d_trace.h" + +/* Takes the reservation lock on all the BOs being referenced, so that + * at queue submit time we can update the reservations. + * + * We don't lock the RCL the tile alloc/state BOs, or overflow memory + * (all of which are on exec->unref_list). They're entirely private + * to v3d, so we don't attach dma-buf fences to them. + */ +static int +v3d_lock_bo_reservations(struct v3d_job *job, + struct ww_acquire_ctx *acquire_ctx) +{ + int i, ret; + + ret = drm_gem_lock_reservations(job->bo, job->bo_count, acquire_ctx); + if (ret) + return ret; + + for (i = 0; i < job->bo_count; i++) { + ret = dma_resv_reserve_fences(job->bo[i]->resv, 1); + if (ret) + goto fail; + + ret = drm_sched_job_add_implicit_dependencies(&job->base, + job->bo[i], true); + if (ret) + goto fail; + } + + return 0; + +fail: + drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); + return ret; +} + +/** + * v3d_lookup_bos() - Sets up job->bo[] with the GEM objects + * referenced by the job. + * @dev: DRM device + * @file_priv: DRM file for this fd + * @job: V3D job being set up + * @bo_handles: GEM handles + * @bo_count: Number of GEM handles passed in + * + * The command validator needs to reference BOs by their index within + * the submitted job's BO list. This does the validation of the job's + * BO list and reference counting for the lifetime of the job. + * + * Note that this function doesn't need to unreference the BOs on + * failure, because that will happen at v3d_exec_cleanup() time. + */ +static int +v3d_lookup_bos(struct drm_device *dev, + struct drm_file *file_priv, + struct v3d_job *job, + u64 bo_handles, + u32 bo_count) +{ + job->bo_count = bo_count; + + if (!job->bo_count) { + /* See comment on bo_index for why we have to check + * this. + */ + DRM_DEBUG("Rendering requires BOs\n"); + return -EINVAL; + } + + return drm_gem_objects_lookup(file_priv, + (void __user *)(uintptr_t)bo_handles, + job->bo_count, &job->bo); +} + +static void +v3d_job_free(struct kref *ref) +{ + struct v3d_job *job = container_of(ref, struct v3d_job, refcount); + int i; + + if (job->bo) { + for (i = 0; i < job->bo_count; i++) + drm_gem_object_put(job->bo[i]); + kvfree(job->bo); + } + + dma_fence_put(job->irq_fence); + dma_fence_put(job->done_fence); + + if (job->perfmon) + v3d_perfmon_put(job->perfmon); + + kfree(job); +} + +static void +v3d_render_job_free(struct kref *ref) +{ + struct v3d_render_job *job = container_of(ref, struct v3d_render_job, + base.refcount); + struct v3d_bo *bo, *save; + + list_for_each_entry_safe(bo, save, &job->unref_list, unref_head) { + drm_gem_object_put(&bo->base.base); + } + + v3d_job_free(ref); +} + +void v3d_job_cleanup(struct v3d_job *job) +{ + if (!job) + return; + + drm_sched_job_cleanup(&job->base); + v3d_job_put(job); +} + +void v3d_job_put(struct v3d_job *job) +{ + kref_put(&job->refcount, job->free); +} + +static int +v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, + void **container, size_t size, void (*free)(struct kref *ref), + u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) +{ + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct v3d_job *job; + bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); + int ret, i; + + *container = kcalloc(1, size, GFP_KERNEL); + if (!*container) { + DRM_ERROR("Cannot allocate memory for v3d job."); + return -ENOMEM; + } + + job = *container; + job->v3d = v3d; + job->free = free; + job->file = file_priv; + + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], + 1, v3d_priv); + if (ret) + goto fail; + + if (has_multisync) { + if (se->in_sync_count && se->wait_stage == queue) { + struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs); + + for (i = 0; i < se->in_sync_count; i++) { + struct drm_v3d_sem in; + + if (copy_from_user(&in, handle++, sizeof(in))) { + ret = -EFAULT; + DRM_DEBUG("Failed to copy wait dep handle.\n"); + goto fail_deps; + } + ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in.handle, 0); + + // TODO: Investigate why this was filtered out for the IOCTL. + if (ret && ret != -ENOENT) + goto fail_deps; + } + } + } else { + ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in_sync, 0); + + // TODO: Investigate why this was filtered out for the IOCTL. + if (ret && ret != -ENOENT) + goto fail_deps; + } + + kref_init(&job->refcount); + + return 0; + +fail_deps: + drm_sched_job_cleanup(&job->base); +fail: + kfree(*container); + *container = NULL; + + return ret; +} + +static void +v3d_push_job(struct v3d_job *job) +{ + drm_sched_job_arm(&job->base); + + job->done_fence = dma_fence_get(&job->base.s_fence->finished); + + /* put by scheduler job completion */ + kref_get(&job->refcount); + + drm_sched_entity_push_job(&job->base); +} + +static void +v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, + struct v3d_job *job, + struct ww_acquire_ctx *acquire_ctx, + u32 out_sync, + struct v3d_submit_ext *se, + struct dma_fence *done_fence) +{ + struct drm_syncobj *sync_out; + bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); + int i; + + for (i = 0; i < job->bo_count; i++) { + /* XXX: Use shared fences for read-only objects. */ + dma_resv_add_fence(job->bo[i]->resv, job->done_fence, + DMA_RESV_USAGE_WRITE); + } + + drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); + + /* Update the return sync object for the job */ + /* If it only supports a single signal semaphore*/ + if (!has_multisync) { + sync_out = drm_syncobj_find(file_priv, out_sync); + if (sync_out) { + drm_syncobj_replace_fence(sync_out, done_fence); + drm_syncobj_put(sync_out); + } + return; + } + + /* If multiple semaphores extension is supported */ + if (se->out_sync_count) { + for (i = 0; i < se->out_sync_count; i++) { + drm_syncobj_replace_fence(se->out_syncs[i].syncobj, + done_fence); + drm_syncobj_put(se->out_syncs[i].syncobj); + } + kvfree(se->out_syncs); + } +} + +static void +v3d_put_multisync_post_deps(struct v3d_submit_ext *se) +{ + unsigned int i; + + if (!(se && se->out_sync_count)) + return; + + for (i = 0; i < se->out_sync_count; i++) + drm_syncobj_put(se->out_syncs[i].syncobj); + kvfree(se->out_syncs); +} + +static int +v3d_get_multisync_post_deps(struct drm_file *file_priv, + struct v3d_submit_ext *se, + u32 count, u64 handles) +{ + struct drm_v3d_sem __user *post_deps; + int i, ret; + + if (!count) + return 0; + + se->out_syncs = (struct v3d_submit_outsync *) + kvmalloc_array(count, + sizeof(struct v3d_submit_outsync), + GFP_KERNEL); + if (!se->out_syncs) + return -ENOMEM; + + post_deps = u64_to_user_ptr(handles); + + for (i = 0; i < count; i++) { + struct drm_v3d_sem out; + + if (copy_from_user(&out, post_deps++, sizeof(out))) { + ret = -EFAULT; + DRM_DEBUG("Failed to copy post dep handles\n"); + goto fail; + } + + se->out_syncs[i].syncobj = drm_syncobj_find(file_priv, + out.handle); + if (!se->out_syncs[i].syncobj) { + ret = -EINVAL; + goto fail; + } + } + se->out_sync_count = count; + + return 0; + +fail: + for (i--; i >= 0; i--) + drm_syncobj_put(se->out_syncs[i].syncobj); + kvfree(se->out_syncs); + + return ret; +} + +/* Get data for multiple binary semaphores synchronization. Parse syncobj + * to be signaled when job completes (out_sync). + */ +static int +v3d_get_multisync_submit_deps(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + void *data) +{ + struct drm_v3d_multi_sync multisync; + struct v3d_submit_ext *se = data; + int ret; + + if (copy_from_user(&multisync, ext, sizeof(multisync))) + return -EFAULT; + + if (multisync.pad) + return -EINVAL; + + ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count, + multisync.out_syncs); + if (ret) + return ret; + + se->in_sync_count = multisync.in_sync_count; + se->in_syncs = multisync.in_syncs; + se->flags |= DRM_V3D_EXT_ID_MULTI_SYNC; + se->wait_stage = multisync.wait_stage; + + return 0; +} + +/* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data + * according to the extension id (name). + */ +static int +v3d_get_extensions(struct drm_file *file_priv, + u64 ext_handles, + void *data) +{ + struct drm_v3d_extension __user *user_ext; + int ret; + + user_ext = u64_to_user_ptr(ext_handles); + while (user_ext) { + struct drm_v3d_extension ext; + + if (copy_from_user(&ext, user_ext, sizeof(ext))) { + DRM_DEBUG("Failed to copy submit extension\n"); + return -EFAULT; + } + + switch (ext.id) { + case DRM_V3D_EXT_ID_MULTI_SYNC: + ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data); + if (ret) + return ret; + break; + default: + DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); + return -EINVAL; + } + + user_ext = u64_to_user_ptr(ext.next); + } + + return 0; +} + +/** + * v3d_submit_cl_ioctl() - Submits a job (frame) to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * This is the main entrypoint for userspace to submit a 3D frame to + * the GPU. Userspace provides the binner command list (if + * applicable), and the kernel sets up the render command list to draw + * to the framebuffer described in the ioctl, using the command lists + * that the 3D engine's binner will produce. + */ +int +v3d_submit_cl_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct drm_v3d_submit_cl *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_bin_job *bin = NULL; + struct v3d_render_job *render = NULL; + struct v3d_job *clean_job = NULL; + struct v3d_job *last_job; + struct ww_acquire_ctx acquire_ctx; + int ret = 0; + + trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end); + + if (args->pad) + return -EINVAL; + + if (args->flags && + args->flags & ~(DRM_V3D_SUBMIT_CL_FLUSH_CACHE | + DRM_V3D_SUBMIT_EXTENSION)) { + DRM_INFO("invalid flags: %d\n", args->flags); + return -EINVAL; + } + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + return ret; + } + } + + ret = v3d_job_init(v3d, file_priv, (void *)&render, sizeof(*render), + v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER); + if (ret) + goto fail; + + render->start = args->rcl_start; + render->end = args->rcl_end; + INIT_LIST_HEAD(&render->unref_list); + + if (args->bcl_start != args->bcl_end) { + ret = v3d_job_init(v3d, file_priv, (void *)&bin, sizeof(*bin), + v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); + if (ret) + goto fail; + + bin->start = args->bcl_start; + bin->end = args->bcl_end; + bin->qma = args->qma; + bin->qms = args->qms; + bin->qts = args->qts; + bin->render = render; + } + + if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) { + ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); + if (ret) + goto fail; + + last_job = clean_job; + } else { + last_job = &render->base; + } + + ret = v3d_lookup_bos(dev, file_priv, last_job, + args->bo_handles, args->bo_handle_count); + if (ret) + goto fail; + + ret = v3d_lock_bo_reservations(last_job, &acquire_ctx); + if (ret) + goto fail; + + if (args->perfmon_id) { + render->base.perfmon = v3d_perfmon_find(v3d_priv, + args->perfmon_id); + + if (!render->base.perfmon) { + ret = -ENOENT; + goto fail_perfmon; + } + } + + mutex_lock(&v3d->sched_lock); + if (bin) { + bin->base.perfmon = render->base.perfmon; + v3d_perfmon_get(bin->base.perfmon); + v3d_push_job(&bin->base); + + ret = drm_sched_job_add_dependency(&render->base.base, + dma_fence_get(bin->base.done_fence)); + if (ret) + goto fail_unreserve; + } + + v3d_push_job(&render->base); + + if (clean_job) { + struct dma_fence *render_fence = + dma_fence_get(render->base.done_fence); + ret = drm_sched_job_add_dependency(&clean_job->base, + render_fence); + if (ret) + goto fail_unreserve; + clean_job->perfmon = render->base.perfmon; + v3d_perfmon_get(clean_job->perfmon); + v3d_push_job(clean_job); + } + + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + last_job, + &acquire_ctx, + args->out_sync, + &se, + last_job->done_fence); + + if (bin) + v3d_job_put(&bin->base); + v3d_job_put(&render->base); + if (clean_job) + v3d_job_put(clean_job); + + return 0; + +fail_unreserve: + mutex_unlock(&v3d->sched_lock); +fail_perfmon: + drm_gem_unlock_reservations(last_job->bo, + last_job->bo_count, &acquire_ctx); +fail: + v3d_job_cleanup((void *)bin); + v3d_job_cleanup((void *)render); + v3d_job_cleanup(clean_job); + v3d_put_multisync_post_deps(&se); + + return ret; +} + +/** + * v3d_submit_tfu_ioctl() - Submits a TFU (texture formatting) job to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * Userspace provides the register setup for the TFU, which we don't + * need to validate since the TFU is behind the MMU. + */ +int +v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct drm_v3d_submit_tfu *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_tfu_job *job = NULL; + struct ww_acquire_ctx acquire_ctx; + int ret = 0; + + trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia); + + if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { + DRM_DEBUG("invalid flags: %d\n", args->flags); + return -EINVAL; + } + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + return ret; + } + } + + ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + v3d_job_free, args->in_sync, &se, V3D_TFU); + if (ret) + goto fail; + + job->base.bo = kcalloc(ARRAY_SIZE(args->bo_handles), + sizeof(*job->base.bo), GFP_KERNEL); + if (!job->base.bo) { + ret = -ENOMEM; + goto fail; + } + + job->args = *args; + + for (job->base.bo_count = 0; + job->base.bo_count < ARRAY_SIZE(args->bo_handles); + job->base.bo_count++) { + struct drm_gem_object *bo; + + if (!args->bo_handles[job->base.bo_count]) + break; + + bo = drm_gem_object_lookup(file_priv, args->bo_handles[job->base.bo_count]); + if (!bo) { + DRM_DEBUG("Failed to look up GEM BO %d: %d\n", + job->base.bo_count, + args->bo_handles[job->base.bo_count]); + ret = -ENOENT; + goto fail; + } + job->base.bo[job->base.bo_count] = bo; + } + + ret = v3d_lock_bo_reservations(&job->base, &acquire_ctx); + if (ret) + goto fail; + + mutex_lock(&v3d->sched_lock); + v3d_push_job(&job->base); + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + &job->base, &acquire_ctx, + args->out_sync, + &se, + job->base.done_fence); + + v3d_job_put(&job->base); + + return 0; + +fail: + v3d_job_cleanup((void *)job); + v3d_put_multisync_post_deps(&se); + + return ret; +} + +/** + * v3d_submit_csd_ioctl() - Submits a CSD (compute shader) job to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * Userspace provides the register setup for the CSD, which we don't + * need to validate since the CSD is behind the MMU. + */ +int +v3d_submit_csd_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct drm_v3d_submit_csd *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_csd_job *job = NULL; + struct v3d_job *clean_job = NULL; + struct ww_acquire_ctx acquire_ctx; + int ret; + + trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]); + + if (args->pad) + return -EINVAL; + + if (!v3d_has_csd(v3d)) { + DRM_DEBUG("Attempting CSD submit on non-CSD hardware\n"); + return -EINVAL; + } + + if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { + DRM_INFO("invalid flags: %d\n", args->flags); + return -EINVAL; + } + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + return ret; + } + } + + ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + v3d_job_free, args->in_sync, &se, V3D_CSD); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); + if (ret) + goto fail; + + job->args = *args; + + ret = v3d_lookup_bos(dev, file_priv, clean_job, + args->bo_handles, args->bo_handle_count); + if (ret) + goto fail; + + ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx); + if (ret) + goto fail; + + if (args->perfmon_id) { + job->base.perfmon = v3d_perfmon_find(v3d_priv, + args->perfmon_id); + if (!job->base.perfmon) { + ret = -ENOENT; + goto fail_perfmon; + } + } + + mutex_lock(&v3d->sched_lock); + v3d_push_job(&job->base); + + ret = drm_sched_job_add_dependency(&clean_job->base, + dma_fence_get(job->base.done_fence)); + if (ret) + goto fail_unreserve; + + v3d_push_job(clean_job); + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + clean_job, + &acquire_ctx, + args->out_sync, + &se, + clean_job->done_fence); + + v3d_job_put(&job->base); + v3d_job_put(clean_job); + + return 0; + +fail_unreserve: + mutex_unlock(&v3d->sched_lock); +fail_perfmon: + drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count, + &acquire_ctx); +fail: + v3d_job_cleanup((void *)job); + v3d_job_cleanup(clean_job); + v3d_put_multisync_post_deps(&se); + + return ret; +} From patchwork Thu Nov 30 16:40:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474722 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8B597C4167B for ; Thu, 30 Nov 2023 16:45:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E5F4610E736; Thu, 30 Nov 2023 16:45:31 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 514A510E732 for ; Thu, 30 Nov 2023 16:45:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=f/vG7TxPL8Pv8sZmuHhzZoJmOkFLRMJMPmeGSQLe0EM=; b=F2/07+4ONP0ZHXN2wK9EEYxNTz YieJLXSEhDqZKhwfD5V9ZThHoTD1mDceDSnfircfEruAwwvSr+KBm6n8iEzNTIQ7qpN8m/VXbI4OP xSr+1CZ2DjBR34saf+fLL/IqFf9+beS+hTu/XWZ9wB427TdrwgHm4AKC0DIPddanNlxrj8SHvNKw/ m++T8Jj5JgbIxPNvbpfckXonZtXs61x3LyUKqydAjJaPkft3Fpod01SoVQZ6vS5S+ca8OSEFpkBYb FCTW07YB8CPSzZM2SdT48DTIsdjhHlqd17lTPjT5d6tkK3Z9NsgB/MB6pq4ImyHSo+y2u0WMpbWlr Zmlx51yg==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kA6-008scY-G7; Thu, 30 Nov 2023 17:45:15 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 04/17] drm/v3d: Simplify job refcount handling Date: Thu, 30 Nov 2023 13:40:27 -0300 Message-ID: <20231130164420.932823-6-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen Instead of checking if the job is NULL every time we call the function, check it inside the function. Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_submit.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index f36214002f37..a0caf9c499bb 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -129,6 +129,9 @@ void v3d_job_cleanup(struct v3d_job *job) void v3d_job_put(struct v3d_job *job) { + if (!job) + return; + kref_put(&job->refcount, job->free); } @@ -517,11 +520,9 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, &se, last_job->done_fence); - if (bin) - v3d_job_put(&bin->base); + v3d_job_put(&bin->base); v3d_job_put(&render->base); - if (clean_job) - v3d_job_put(clean_job); + v3d_job_put(clean_job); return 0; From patchwork Thu Nov 30 16:40:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E319CC4167B for ; Thu, 30 Nov 2023 16:45:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5600D10E738; Thu, 30 Nov 2023 16:45:36 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id A8EFF10E736 for ; Thu, 30 Nov 2023 16:45:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=0nUzX4gsVmUg41H9Ed/+mVKvCLFiFk+LgC3jIn6sBY4=; b=OzYVeTZwDixIgJiG7CalX7gJEd fH0A0lcv6tFm2f1Vo8C3QgA8GzTqpDzy0SljY7hcYJsx99ZPsok/96YOS4zY8Xk4OTQ9YGXnI5RnZ Tm9b3AEHUdIn9nZGEvrNIIKResVZzOtvdqE6nhkaJquNgnmql69FK7XGaL7s7g85WGuMl0OWt6ylt hmTK84tzAq5yw3hRdbPwQJu2GlOMg1AIFxcrvRJqF1lTQwnGwCWqQnKmaAmfuT9D/7YpMORjySChU dQL3BOq/4+7Lh322HpcQjk6sdTtyK4GpUNqxdJgU7heyUFCnBEFChH8IpP05+xaVGhcAU/Y2roKD4 jMWOfw8g==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAA-008scY-JB; Thu, 30 Nov 2023 17:45:19 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 05/17] drm/v3d: Don't allow two multisync extensions in the same job Date: Thu, 30 Nov 2023 13:40:28 -0300 Message-ID: <20231130164420.932823-7-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Currently, two multisync extensions can be added to the same job and only the last multisync extension will be used. To avoid this vulnerability, don't allow two multisync extensions in the same job. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_submit.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index a0caf9c499bb..10141dc2820a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -329,6 +329,11 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, struct v3d_submit_ext *se = data; int ret; + if (se->in_sync_count || se->out_sync_count) { + DRM_DEBUG("Two multisync extensions were added to the same job."); + return -EINVAL; + } + if (copy_from_user(&multisync, ext, sizeof(multisync))) return -EFAULT; From patchwork Thu Nov 30 16:40:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 44184C4167B for ; Thu, 30 Nov 2023 16:45:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A21F710E73A; Thu, 30 Nov 2023 16:45:41 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0595E10E736 for ; Thu, 30 Nov 2023 16:45:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=nE78RnxMs21nb8+2/mKnuWSjI1uVM/9ma0urIcrNWrc=; b=mUfFbnJrHSHqz2tvzqjH3j3tlr 7sGzP074fyatzTnP42Q+lfms8DGvVgGMBbv2h/csrvkqxtqZLrvcS4oCsQe8GLgUB14UgjV98TM1O b/JhXs+nb+QIjDST7MihFvkIJyezsrk624os/mXzkwn/Zo1y8bd723LfXplclORyPQy1daDDLo5u3 m1KXwNYlySNFxLNrk6BkJ81eOMu61AKKCh6RGngg/cNje/uPiB0t03oDdVXx5N3pVuiNg0f6AJjkM hulQC5KVSVDvAKed3ymm6OiBoDzH0IbUYrfMMDnoTxe463qYnF9U7V/PY8LiWD6cnrJEHEw/GudnY 6fFcJ9cA==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAE-008scY-IR; Thu, 30 Nov 2023 17:45:23 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 06/17] drm/v3d: Decouple job allocation from job initiation Date: Thu, 30 Nov 2023 13:40:29 -0300 Message-ID: <20231130164420.932823-8-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We want to allow the IOCTLs to allocate the job without initiating it. This will be useful for the CPU job submission IOCTL, as the CPU job has the need to use information from the user extensions. Currently, the user extensions are parsed before the job allocation, making it impossible to fill the CPU job when parsing the user extensions. Therefore, decouple the job allocation from the job initiation. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_submit.c | 64 ++++++++++++++++++++++---------- 1 file changed, 44 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 10141dc2820a..148900283c2a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -135,23 +135,27 @@ void v3d_job_put(struct v3d_job *job) kref_put(&job->refcount, job->free); } +static int +v3d_job_allocate(void **container, size_t size) +{ + *container = kcalloc(1, size, GFP_KERNEL); + if (!*container) { + DRM_ERROR("Cannot allocate memory for V3D job.\n"); + return -ENOMEM; + } + + return 0; +} + static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, - void **container, size_t size, void (*free)(struct kref *ref), + struct v3d_job *job, void (*free)(struct kref *ref), u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue) { struct v3d_file_priv *v3d_priv = file_priv->driver_priv; - struct v3d_job *job; bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC); int ret, i; - *container = kcalloc(1, size, GFP_KERNEL); - if (!*container) { - DRM_ERROR("Cannot allocate memory for v3d job."); - return -ENOMEM; - } - - job = *container; job->v3d = v3d; job->free = free; job->file = file_priv; @@ -159,7 +163,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], 1, v3d_priv); if (ret) - goto fail; + return ret; if (has_multisync) { if (se->in_sync_count && se->wait_stage == queue) { @@ -194,10 +198,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, fail_deps: drm_sched_job_cleanup(&job->base); -fail: - kfree(*container); - *container = NULL; - return ret; } @@ -437,7 +437,11 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_init(v3d, file_priv, (void *)&render, sizeof(*render), + ret = v3d_job_allocate((void *)&render, sizeof(*render)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &render->base, v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER); if (ret) goto fail; @@ -447,7 +451,11 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, INIT_LIST_HEAD(&render->unref_list); if (args->bcl_start != args->bcl_end) { - ret = v3d_job_init(v3d, file_priv, (void *)&bin, sizeof(*bin), + ret = v3d_job_allocate((void *)&bin, sizeof(*bin)); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, &bin->base, v3d_job_free, args->in_sync_bcl, &se, V3D_BIN); if (ret) goto fail; @@ -461,7 +469,11 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) { - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + ret = v3d_job_allocate((void *)&clean_job, sizeof(*clean_job)); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); if (ret) goto fail; @@ -580,7 +592,11 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + ret = v3d_job_allocate((void *)&job, sizeof(*job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &job->base, v3d_job_free, args->in_sync, &se, V3D_TFU); if (ret) goto fail; @@ -683,12 +699,20 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_init(v3d, file_priv, (void *)&job, sizeof(*job), + ret = v3d_job_allocate((void *)&job, sizeof(*job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &job->base, v3d_job_free, args->in_sync, &se, V3D_CSD); if (ret) goto fail; - ret = v3d_job_init(v3d, file_priv, (void *)&clean_job, sizeof(*clean_job), + ret = v3d_job_allocate((void *)&clean_job, sizeof(*clean_job)); + if (ret) + goto fail; + + ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); if (ret) goto fail; From patchwork Thu Nov 30 16:40:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C0D17C10DC2 for ; Thu, 30 Nov 2023 16:45:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8D22810E73D; Thu, 30 Nov 2023 16:45:52 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5135310E738 for ; Thu, 30 Nov 2023 16:45:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=mwXdrsbgDHD0VIynQhVTxyfGMeNZw7DDN8X3NddOLNs=; b=aKj2nuxcMJWIEVMbHOnXL6fW4j CSjOUXIF4TbX0VJnc8joJWK/kgnK0RkGqgAcILjtJjkToe9S4LJpTIa0R/L8wMIwfT+zrYv6olqtP 7PADodY6QZhXjxLwMy4M3IbN8Zv+smaA2VGT40ucvFjfbXuZ9OOTx1PsFXMy+w1AHLZERTu8pPrNR Jfrxuo0fLV3AA8HIxXx25z+r6lE7PjzoHDRHLqWL7LPAuxK3eTFcjJkSiUjSeRlN7AzW+0oFm3Dl4 FYVw/SrWEylGqYKAtfT3Sq444ttpOJ9/od18t006fZUIKffKpStFeB8XfXF5pedwdc3dsHW6Db1YC yAyVngsA==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAI-008scY-MB; Thu, 30 Nov 2023 17:45:27 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 07/17] drm/v3d: Add a CPU job submission Date: Thu, 30 Nov 2023 13:40:30 -0300 Message-ID: <20231130164420.932823-9-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen Create a new type of job, a CPU job. A CPU job is a type of job that performs operations that requires CPU intervention. The overall idea is to use user extensions to enable different types of CPU job, allowing the CPU job to perform different operations according to the type of user extension. The user extension ID identify the type of CPU job that must be dealt. Having a CPU job is interesting for synchronization purposes as a CPU job has a queue like any other V3D job and can be synchoronized by the multisync extension. Signed-off-by: Melissa Wen Co-developed-by: Maíra Canal Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.c | 4 ++ drivers/gpu/drm/v3d/v3d_drv.h | 16 +++++- drivers/gpu/drm/v3d/v3d_sched.c | 57 +++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 86 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 17 +++++++ 5 files changed, 179 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.c b/drivers/gpu/drm/v3d/v3d_drv.c index 44a1ca57d6a4..3debf37e7d9b 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.c +++ b/drivers/gpu/drm/v3d/v3d_drv.c @@ -91,6 +91,9 @@ static int v3d_get_param_ioctl(struct drm_device *dev, void *data, case DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT: args->value = 1; return 0; + case DRM_V3D_PARAM_SUPPORTS_CPU_QUEUE: + args->value = 1; + return 0; default: DRM_DEBUG("Unknown parameter %d\n", args->param); return -EINVAL; @@ -189,6 +192,7 @@ static const struct drm_ioctl_desc v3d_drm_ioctls[] = { DRM_IOCTL_DEF_DRV(V3D_PERFMON_CREATE, v3d_perfmon_create_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(V3D_PERFMON_DESTROY, v3d_perfmon_destroy_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(V3D_PERFMON_GET_VALUES, v3d_perfmon_get_values_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(V3D_SUBMIT_CPU, v3d_submit_cpu_ioctl, DRM_RENDER_ALLOW | DRM_AUTH), }; static const struct drm_driver v3d_drm_driver = { diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 4db9ace66024..2246a0e29955 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -19,7 +19,7 @@ struct reset_control; #define GMP_GRANULARITY (128 * 1024) -#define V3D_MAX_QUEUES (V3D_CACHE_CLEAN + 1) +#define V3D_MAX_QUEUES (V3D_CPU + 1) static inline char *v3d_queue_to_string(enum v3d_queue queue) { @@ -29,6 +29,7 @@ static inline char *v3d_queue_to_string(enum v3d_queue queue) case V3D_TFU: return "tfu"; case V3D_CSD: return "csd"; case V3D_CACHE_CLEAN: return "cache_clean"; + case V3D_CPU: return "cpu"; } return "UNKNOWN"; } @@ -122,6 +123,7 @@ struct v3d_dev { struct v3d_render_job *render_job; struct v3d_tfu_job *tfu_job; struct v3d_csd_job *csd_job; + struct v3d_cpu_job *cpu_job; struct v3d_queue_state queue[V3D_MAX_QUEUES]; @@ -312,6 +314,16 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; }; +enum v3d_cpu_job_type {}; + +struct v3d_cpu_job { + struct v3d_job base; + + enum v3d_cpu_job_type job_type; +}; + +typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); + struct v3d_submit_outsync { struct drm_syncobj *syncobj; }; @@ -414,6 +426,8 @@ int v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); /* v3d_irq.c */ int v3d_irq_init(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index fccbea2a5f2e..c89e92fc614c 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -55,6 +55,12 @@ to_csd_job(struct drm_sched_job *sched_job) return container_of(sched_job, struct v3d_csd_job, base.base); } +static struct v3d_cpu_job * +to_cpu_job(struct drm_sched_job *sched_job) +{ + return container_of(sched_job, struct v3d_cpu_job, base.base); +} + static void v3d_sched_job_free(struct drm_sched_job *sched_job) { @@ -262,6 +268,42 @@ v3d_csd_job_run(struct drm_sched_job *sched_job) return fence; } +static const v3d_cpu_job_fn cpu_job_function[] = { }; + +static struct dma_fence * +v3d_cpu_job_run(struct drm_sched_job *sched_job) +{ + struct v3d_cpu_job *job = to_cpu_job(sched_job); + struct v3d_dev *v3d = job->base.v3d; + struct v3d_file_priv *file = job->base.file->driver_priv; + u64 runtime; + + v3d->cpu_job = job; + + if (job->job_type >= ARRAY_SIZE(cpu_job_function)) { + DRM_DEBUG_DRIVER("Unknown CPU job: %d\n", job->job_type); + return NULL; + } + + file->start_ns[V3D_CPU] = local_clock(); + v3d->queue[V3D_CPU].start_ns = file->start_ns[V3D_CPU]; + + cpu_job_function[job->job_type](job); + + runtime = local_clock() - file->start_ns[V3D_CPU]; + + file->enabled_ns[V3D_CPU] += runtime; + v3d->queue[V3D_CPU].enabled_ns += runtime; + + file->jobs_sent[V3D_CPU]++; + v3d->queue[V3D_CPU].jobs_sent++; + + file->start_ns[V3D_CPU] = 0; + v3d->queue[V3D_CPU].start_ns = 0; + + return NULL; +} + static struct dma_fence * v3d_cache_clean_job_run(struct drm_sched_job *sched_job) { @@ -416,6 +458,12 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = { .free_job = v3d_sched_job_free }; +static const struct drm_sched_backend_ops v3d_cpu_sched_ops = { + .run_job = v3d_cpu_job_run, + .timedout_job = v3d_generic_job_timedout, + .free_job = v3d_sched_job_free +}; + int v3d_sched_init(struct v3d_dev *v3d) { @@ -471,6 +519,15 @@ v3d_sched_init(struct v3d_dev *v3d) goto fail; } + ret = drm_sched_init(&v3d->queue[V3D_CPU].sched, + &v3d_cpu_sched_ops, NULL, + DRM_SCHED_PRIORITY_COUNT, + 1, job_hang_limit, + msecs_to_jiffies(hang_limit_ms), NULL, + NULL, "v3d_cpu", v3d->drm.dev); + if (ret) + goto fail; + return 0; fail: diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 148900283c2a..918e28b1d00a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -772,3 +772,89 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, return ret; } + +static const unsigned int cpu_job_bo_handle_count[] = { }; + +/** + * v3d_submit_cpu_ioctl() - Submits a CPU job to the V3D. + * @dev: DRM device + * @data: ioctl argument + * @file_priv: DRM file for this fd + * + * Userspace specifies the CPU job type and data required to perform its + * operations through the drm_v3d_extension struct. + */ +int +v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct v3d_dev *v3d = to_v3d_dev(dev); + struct drm_v3d_submit_cpu *args = data; + struct v3d_submit_ext se = {0}; + struct v3d_cpu_job *cpu_job = NULL; + struct ww_acquire_ctx acquire_ctx; + int ret; + + if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) { + DRM_INFO("Invalid flags: %d\n", args->flags); + return -EINVAL; + } + + ret = v3d_job_allocate((void *)&cpu_job, sizeof(*cpu_job)); + if (ret) + return ret; + + if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { + ret = v3d_get_extensions(file_priv, args->extensions, &se); + if (ret) { + DRM_DEBUG("Failed to get extensions.\n"); + goto fail; + } + } + + /* Every CPU job must have a CPU job user extension */ + if (!cpu_job->job_type) { + DRM_DEBUG("CPU job must have a CPU job user extension.\n"); + goto fail; + } + + if (args->bo_handle_count != cpu_job_bo_handle_count[cpu_job->job_type]) { + DRM_DEBUG("This CPU job was not submitted with the proper number of BOs.\n"); + goto fail; + } + + ret = v3d_job_init(v3d, file_priv, &cpu_job->base, + v3d_job_free, 0, &se, V3D_CPU); + if (ret) + goto fail; + + if (args->bo_handle_count) { + ret = v3d_lookup_bos(dev, file_priv, &cpu_job->base, + args->bo_handles, args->bo_handle_count); + if (ret) + goto fail; + + ret = v3d_lock_bo_reservations(&cpu_job->base, &acquire_ctx); + if (ret) + goto fail; + } + + mutex_lock(&v3d->sched_lock); + v3d_push_job(&cpu_job->base); + mutex_unlock(&v3d->sched_lock); + + v3d_attach_fences_and_unlock_reservation(file_priv, + &cpu_job->base, + &acquire_ctx, 0, + NULL, cpu_job->base.done_fence); + + v3d_job_put(&cpu_job->base); + + return 0; + +fail: + v3d_job_cleanup((void *)cpu_job); + v3d_put_multisync_post_deps(&se); + + return ret; +} diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 1a7d7a689de3..00abef9d0db7 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -41,6 +41,7 @@ extern "C" { #define DRM_V3D_PERFMON_CREATE 0x08 #define DRM_V3D_PERFMON_DESTROY 0x09 #define DRM_V3D_PERFMON_GET_VALUES 0x0a +#define DRM_V3D_SUBMIT_CPU 0x0b #define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl) #define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo) @@ -56,6 +57,7 @@ extern "C" { struct drm_v3d_perfmon_destroy) #define DRM_IOCTL_V3D_PERFMON_GET_VALUES DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_PERFMON_GET_VALUES, \ struct drm_v3d_perfmon_get_values) +#define DRM_IOCTL_V3D_SUBMIT_CPU DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CPU, struct drm_v3d_submit_cpu) #define DRM_V3D_SUBMIT_CL_FLUSH_CACHE 0x01 #define DRM_V3D_SUBMIT_EXTENSION 0x02 @@ -93,6 +95,7 @@ enum v3d_queue { V3D_TFU, V3D_CSD, V3D_CACHE_CLEAN, + V3D_CPU, }; /** @@ -276,6 +279,7 @@ enum drm_v3d_param { DRM_V3D_PARAM_SUPPORTS_CACHE_FLUSH, DRM_V3D_PARAM_SUPPORTS_PERFMON, DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT, + DRM_V3D_PARAM_SUPPORTS_CPU_QUEUE, }; struct drm_v3d_get_param { @@ -361,6 +365,19 @@ struct drm_v3d_submit_csd { __u32 pad; }; +struct drm_v3d_submit_cpu { + /* Pointer to a u32 array of the BOs that are referenced by the job. */ + __u64 bo_handles; + + /* Number of BO handles passed in (size is that times 4). */ + __u32 bo_handle_count; + + __u32 flags; + + /* Pointer to an array of ioctl extensions*/ + __u64 extensions; +}; + enum { V3D_PERFCNT_FEP_VALID_PRIMTS_NO_PIXELS, V3D_PERFCNT_FEP_VALID_PRIMS, From patchwork Thu Nov 30 16:40:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474726 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8D43BC4167B for ; Thu, 30 Nov 2023 16:45:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D740210E735; Thu, 30 Nov 2023 16:45:51 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8A80710E73A for ; Thu, 30 Nov 2023 16:45:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=z9pQd6xkTS6aJE+aFf0TjetFbhZxKU8C6UJLNKmF8so=; b=mBdjyKyR5pREarqEhw2qNwBToY muEkl238HQ3+TmdfA1yLY3SeUtp8zkc9hBAj9dBsAQk/1FMI3ZXhlEXenlh3tlqb38lz6olONmpfc n83W2It3OoqqjpfyXrFM5cEPUABjJ25xFoj9QK6duIrh1uuGHLvq+vD7s5PrlEOOiRC+JdLPf0Mnn ABPdxsnh6cOMp2y01U+QO9sjqIqSCgH6qkzld+r/6CEjBXKf3esHK92L79QLwcAZtQ6xJQk3r83LL ok84mDhVKsw2T3/L0KN4MFEOh1EdMphB6tZIqx/O5pukaX7i2/tMQE3GAVh/grDIAIz2GBmeMDhrP WVCrI3AQ==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAM-008scY-PB; Thu, 30 Nov 2023 17:45:31 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 08/17] drm/v3d: Use v3d_get_extensions() to parse CPU job data Date: Thu, 30 Nov 2023 13:40:31 -0300 Message-ID: <20231130164420.932823-10-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Currently, v3d_get_extensions() only parses multisync data and assigns it to the `struct v3d_submit_ext`. But, to implement the CPU job with user extensions, we want v3d_get_extensions() to be able to parse CPU job data and assign it to the `struct v3d_cpu_job`. Therefore, allow the function v3d_get_extensions() to use `struct v3d_cpu_job *` as a parameter. If the `struct v3d_cpu_job *` is assigned to NULL, it means that the job is a GPU job and CPU job extensions should be rejected. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_submit.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 918e28b1d00a..124935547f17 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -323,10 +323,9 @@ v3d_get_multisync_post_deps(struct drm_file *file_priv, static int v3d_get_multisync_submit_deps(struct drm_file *file_priv, struct drm_v3d_extension __user *ext, - void *data) + struct v3d_submit_ext *se) { struct drm_v3d_multi_sync multisync; - struct v3d_submit_ext *se = data; int ret; if (se->in_sync_count || se->out_sync_count) { @@ -340,7 +339,7 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, if (multisync.pad) return -EINVAL; - ret = v3d_get_multisync_post_deps(file_priv, data, multisync.out_sync_count, + ret = v3d_get_multisync_post_deps(file_priv, se, multisync.out_sync_count, multisync.out_syncs); if (ret) return ret; @@ -359,7 +358,8 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, static int v3d_get_extensions(struct drm_file *file_priv, u64 ext_handles, - void *data) + struct v3d_submit_ext *se, + struct v3d_cpu_job *job) { struct drm_v3d_extension __user *user_ext; int ret; @@ -375,15 +375,16 @@ v3d_get_extensions(struct drm_file *file_priv, switch (ext.id) { case DRM_V3D_EXT_ID_MULTI_SYNC: - ret = v3d_get_multisync_submit_deps(file_priv, user_ext, data); - if (ret) - return ret; + ret = v3d_get_multisync_submit_deps(file_priv, user_ext, se); break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; } + if (ret) + return ret; + user_ext = u64_to_user_ptr(ext.next); } @@ -430,7 +431,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); return ret; @@ -585,7 +586,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); return ret; @@ -692,7 +693,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, } if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, NULL); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); return ret; @@ -805,7 +806,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, return ret; if (args->flags & DRM_V3D_SUBMIT_EXTENSION) { - ret = v3d_get_extensions(file_priv, args->extensions, &se); + ret = v3d_get_extensions(file_priv, args->extensions, &se, cpu_job); if (ret) { DRM_DEBUG("Failed to get extensions.\n"); goto fail; From patchwork Thu Nov 30 16:40:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474725 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0610AC4167B for ; Thu, 30 Nov 2023 16:45:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6117010E734; Thu, 30 Nov 2023 16:45:46 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id EA4D910E73B for ; Thu, 30 Nov 2023 16:45:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=7oSZPgtm7FaRyKeqjrX4/MeX1CTnWNaij67tA0rkpQk=; b=nebhAEQXwcsyBRl9iP/dujW/lB Yax4R54nuqbJQlTsjhiO+QYbC+8QKPO2YT3JnFn2bEcqEovi7N5zZHGWMrermq1ijnoSV8MhMdVd3 eCp5/EsLSMyMKocIrGJXL9ILVbPZfRVFeY9xAL9erA6qp4jZKHM135LBDV/vk4wI7lHO6a/uvzve0 bGj5/BvOctbKauCtuVX6xLek7+9ARdAn2zJEvt27djbgZ2KZZe4YWHIe5hGGkaDOQLpTSn6YQORLg aQZFuRJiGpMEdjoYh5+ia8gM4X6uQG27IuIhuszdhLkvVbuTxXXlc6YiBL2903S1wN0iDAaDNWUV8 xYjUPCNQ==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAQ-008scY-VI; Thu, 30 Nov 2023 17:45:35 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 09/17] drm/v3d: Create tracepoints to track the CPU job Date: Thu, 30 Nov 2023 13:40:32 -0300 Message-ID: <20231130164420.932823-11-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Create tracepoints to track the three major events of a CPU job lifetime: 1. Submission of a `v3d_submit_cpu` IOCTL 2. Beginning of the execution of a CPU job 3. Ending of the execution of a CPU job Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_sched.c | 4 +++ drivers/gpu/drm/v3d/v3d_submit.c | 2 ++ drivers/gpu/drm/v3d/v3d_trace.h | 57 ++++++++++++++++++++++++++++++++ 3 files changed, 63 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index c89e92fc614c..ebbd00840a73 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -288,8 +288,12 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job) file->start_ns[V3D_CPU] = local_clock(); v3d->queue[V3D_CPU].start_ns = file->start_ns[V3D_CPU]; + trace_v3d_cpu_job_begin(&v3d->drm, job->job_type); + cpu_job_function[job->job_type](job); + trace_v3d_cpu_job_end(&v3d->drm, job->job_type); + runtime = local_clock() - file->start_ns[V3D_CPU]; file->enabled_ns[V3D_CPU] += runtime; diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 124935547f17..c134b113b181 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -824,6 +824,8 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, goto fail; } + trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type); + ret = v3d_job_init(v3d, file_priv, &cpu_job->base, v3d_job_free, 0, &se, V3D_CPU); if (ret) diff --git a/drivers/gpu/drm/v3d/v3d_trace.h b/drivers/gpu/drm/v3d/v3d_trace.h index 7aa8dc356e54..06086ece6e9e 100644 --- a/drivers/gpu/drm/v3d/v3d_trace.h +++ b/drivers/gpu/drm/v3d/v3d_trace.h @@ -225,6 +225,63 @@ TRACE_EVENT(v3d_submit_csd, __entry->seqno) ); +TRACE_EVENT(v3d_submit_cpu_ioctl, + TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type), + TP_ARGS(dev, job_type), + + TP_STRUCT__entry( + __field(u32, dev) + __field(enum v3d_cpu_job_type, job_type) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->job_type = job_type; + ), + + TP_printk("dev=%u, job_type=%d", + __entry->dev, + __entry->job_type) +); + +TRACE_EVENT(v3d_cpu_job_begin, + TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type), + TP_ARGS(dev, job_type), + + TP_STRUCT__entry( + __field(u32, dev) + __field(enum v3d_cpu_job_type, job_type) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->job_type = job_type; + ), + + TP_printk("dev=%u, job_type=%d", + __entry->dev, + __entry->job_type) +); + +TRACE_EVENT(v3d_cpu_job_end, + TP_PROTO(struct drm_device *dev, enum v3d_cpu_job_type job_type), + TP_ARGS(dev, job_type), + + TP_STRUCT__entry( + __field(u32, dev) + __field(enum v3d_cpu_job_type, job_type) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->job_type = job_type; + ), + + TP_printk("dev=%u, job_type=%d", + __entry->dev, + __entry->job_type) +); + TRACE_EVENT(v3d_cache_clean_begin, TP_PROTO(struct drm_device *dev), TP_ARGS(dev), From patchwork Thu Nov 30 16:40:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B5F89C10DC2 for ; Thu, 30 Nov 2023 16:45:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5C93F10E73C; Thu, 30 Nov 2023 16:45:52 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id CFBF110E73B for ; Thu, 30 Nov 2023 16:45:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=2bEx5ZYj5LGz7iCdYYN1Fw/uSuAxrxRG4eoNL3L2caw=; b=HGnKmu0e0NJLwM2dmBQWNBQhRy 9aN+KCiKoRwJjBcMFn+qIcB8O9B7HCu/DzqmUmISA8JPmwF3bcD2W726JH/O8fqunw3dIYhFnx1dE s4xnme04IcSJbM25sG0y11EXRMj6rO4BHHYmgjkC7t2XpGAflyMD1s4ehHL7hee1RllhB9ZKdX7vf +60tn7S3eo6UaTQ029Gnq7A+N4DvO7jgb2ogDo5+Qc4ab/PjCLfzxZh0j2z5jnz3vIy1OEICec2o6 cRFkyeCE0pOopHphyjYWxhcjq1HG9CQhWn6k1WYJqXhh9j7H6g5YMcoOatMFXFgxg8cpXtXMzNdWV zqAP3vmw==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAU-008scY-Tq; Thu, 30 Nov 2023 17:45:39 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 10/17] drm/v3d: Detach the CSD job BO setup Date: Thu, 30 Nov 2023 13:40:33 -0300 Message-ID: <20231130164420.932823-12-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Melissa Wen Detach CSD job setup from CSD submission ioctl to reuse it in CPU submission ioctl for indirect CSD job. Signed-off-by: Melissa Wen Co-developed-by: Maíra Canal Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_submit.c | 68 ++++++++++++++++++++------------ 1 file changed, 42 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index c134b113b181..eb26fe1e27e3 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -256,6 +256,45 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv, } } +static int +v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv, + struct v3d_dev *v3d, + struct drm_v3d_submit_csd *args, + struct v3d_csd_job **job, + struct v3d_job **clean_job, + struct v3d_submit_ext *se, + struct ww_acquire_ctx *acquire_ctx) +{ + int ret; + + ret = v3d_job_allocate((void *)job, sizeof(**job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, &(*job)->base, + v3d_job_free, args->in_sync, se, V3D_CSD); + if (ret) + return ret; + + ret = v3d_job_allocate((void *)clean_job, sizeof(**clean_job)); + if (ret) + return ret; + + ret = v3d_job_init(v3d, file_priv, *clean_job, + v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); + if (ret) + return ret; + + (*job)->args = *args; + + ret = v3d_lookup_bos(&v3d->drm, file_priv, *clean_job, + args->bo_handles, args->bo_handle_count); + if (ret) + return ret; + + return v3d_lock_bo_reservations(*clean_job, acquire_ctx); +} + static void v3d_put_multisync_post_deps(struct v3d_submit_ext *se) { @@ -700,32 +739,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, } } - ret = v3d_job_allocate((void *)&job, sizeof(*job)); - if (ret) - return ret; - - ret = v3d_job_init(v3d, file_priv, &job->base, - v3d_job_free, args->in_sync, &se, V3D_CSD); - if (ret) - goto fail; - - ret = v3d_job_allocate((void *)&clean_job, sizeof(*clean_job)); - if (ret) - goto fail; - - ret = v3d_job_init(v3d, file_priv, clean_job, - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN); - if (ret) - goto fail; - - job->args = *args; - - ret = v3d_lookup_bos(dev, file_priv, clean_job, - args->bo_handles, args->bo_handle_count); - if (ret) - goto fail; - - ret = v3d_lock_bo_reservations(clean_job, &acquire_ctx); + ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d, args, + &job, &clean_job, &se, + &acquire_ctx); if (ret) goto fail; From patchwork Thu Nov 30 16:40:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474728 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3DB70C4167B for ; Thu, 30 Nov 2023 16:45:55 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3724810E73B; Thu, 30 Nov 2023 16:45:52 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id B03B810E735 for ; Thu, 30 Nov 2023 16:45:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=TwSjfJS74t0A3hbDuy5BWZ/zgHuKemzueEp/VFBcWOM=; b=etY1aAoHJY5UNxutOpLI3cNz0y JD0nMplRtZZSw0edoFtxGlOnJ71FkbhfT7jCCh3vH6t3z8Nn/17vEkvPNm7S7lSTX4iL4sjaWLhwp wi3H57t1cmtEZX+h9gBsZkSLY0KZsEfw6JvI1eUpsCmdEM5fh18pB0TPzv+qLD1RggOuTjjF8D0FA ws31k/lYNqdllxA/eKsBgLqg5pEcVEItYrgMcGZ0+rbiQ2fs2UfK6Kjq1tim9ghQ/NWifEDpTN7TR no9RVSkn0p7uM3l6me5tIm8scVEVp2fbDccRuiqYLc/cYvXui64FK+KplsRAHrwnq3Wr9POiK7JoT mRN7i3QA==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAY-008scY-Ih; Thu, 30 Nov 2023 17:45:43 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 11/17] drm/v3d: Enable BO mapping Date: Thu, 30 Nov 2023 13:40:34 -0300 Message-ID: <20231130164420.932823-13-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" For the indirect CSD CPU job, we will need to access the internal contents of the BO with the dispatch parameters. Therefore, create methods to allow the mapping and unmapping of the BO. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_bo.c | 18 ++++++++++++++++++ drivers/gpu/drm/v3d/v3d_drv.h | 4 ++++ 2 files changed, 22 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c index 357a0da7e16a..1bdfac8beafd 100644 --- a/drivers/gpu/drm/v3d/v3d_bo.c +++ b/drivers/gpu/drm/v3d/v3d_bo.c @@ -33,6 +33,9 @@ void v3d_free_object(struct drm_gem_object *obj) struct v3d_dev *v3d = to_v3d_dev(obj->dev); struct v3d_bo *bo = to_v3d_bo(obj); + if (bo->vaddr) + v3d_put_bo_vaddr(bo); + v3d_mmu_remove_ptes(bo); mutex_lock(&v3d->bo_lock); @@ -134,6 +137,7 @@ struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv, if (IS_ERR(shmem_obj)) return ERR_CAST(shmem_obj); bo = to_v3d_bo(&shmem_obj->base); + bo->vaddr = NULL; ret = v3d_bo_create_finish(&shmem_obj->base); if (ret) @@ -167,6 +171,20 @@ v3d_prime_import_sg_table(struct drm_device *dev, return obj; } +void v3d_get_bo_vaddr(struct v3d_bo *bo) +{ + struct drm_gem_shmem_object *obj = &bo->base; + + bo->vaddr = vmap(obj->pages, obj->base.size >> PAGE_SHIFT, VM_MAP, + pgprot_writecombine(PAGE_KERNEL)); +} + +void v3d_put_bo_vaddr(struct v3d_bo *bo) +{ + vunmap(bo->vaddr); + bo->vaddr = NULL; +} + int v3d_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 2246a0e29955..39d62915cdd6 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -202,6 +202,8 @@ struct v3d_bo { * v3d_render_job->unref_list */ struct list_head unref_head; + + void *vaddr; }; static inline struct v3d_bo * @@ -391,6 +393,8 @@ struct drm_gem_object *v3d_create_object(struct drm_device *dev, size_t size); void v3d_free_object(struct drm_gem_object *gem_obj); struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv, size_t size); +void v3d_get_bo_vaddr(struct v3d_bo *bo); +void v3d_put_bo_vaddr(struct v3d_bo *bo); int v3d_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_mmap_bo_ioctl(struct drm_device *dev, void *data, From patchwork Thu Nov 30 16:40:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7C068C4167B for ; Thu, 30 Nov 2023 16:46:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D756210E745; Thu, 30 Nov 2023 16:46:21 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4F22710E73F for ; Thu, 30 Nov 2023 16:45:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=DCuWOwc80fZqx7N4nYcwajFphta6B4TroFlPXaYyvSM=; b=sVos7Ziu8mBzdwS4nU/OuOc0gA da+R3qAig7yyGOqyJlC+ytA2D4FJtmsIUhdEhv9yutnMVyLuDPJxCCZmRtuSL2xYMLr7t0mtK2pDM 1LX3vDkNH5OyFXMPSOff8d2jXPbvwlPj8VfZLPy7FxTRh9uyeQI+5cAfWqPe8tIlAQz3jdr+rbIg8 BAGeZAA+1GCUp6BII7WbaBbW/uDjRfGJcWdXGwUfXh4/Eq78z+hdWNsm8Q+92KPzyjSSDtovaDN/i g8xEkwpLDyYaGCFXIa/AAgWKCGTBC5dd7FE9uw9kWepY9qqgsRghJI5h0jUC9WomqA4Ml82hrjYS9 ll5Pdy+Q==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAc-008scY-4f; Thu, 30 Nov 2023 17:45:46 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 12/17] drm/v3d: Create a CPU job extension for a indirect CSD job Date: Thu, 30 Nov 2023 13:40:35 -0300 Message-ID: <20231130164420.932823-14-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. An indirect CSD job is a job that, when executed in the queue, will map the indirect buffer, read the dispatch parameters, and submit a regular dispatch. Therefore, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of an indirect CSD. This user extension will allow the creation of a CSD job linked to a CPU job. The CPU job will wait for the indirect CSD job dependencies and, once they are signaled, it will update the CSD job parameters. Co-developed-by: Melissa Wen Signed-off-by: Melissa Wen Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 31 ++++++++- drivers/gpu/drm/v3d/v3d_sched.c | 41 +++++++++++- drivers/gpu/drm/v3d/v3d_submit.c | 104 ++++++++++++++++++++++++++++++- include/uapi/drm/v3d_drm.h | 43 ++++++++++++- 4 files changed, 213 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 39d62915cdd6..202c0d4b04a5 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -316,12 +316,41 @@ struct v3d_csd_job { struct drm_v3d_submit_csd args; }; -enum v3d_cpu_job_type {}; +enum v3d_cpu_job_type { + V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, +}; + +struct v3d_indirect_csd_info { + /* Indirect CSD */ + struct v3d_csd_job *job; + + /* Clean cache job associated to the Indirect CSD job */ + struct v3d_job *clean_job; + + /* Offset within the BO where the workgroup counts are stored */ + u32 offset; + + /* Workgroups size */ + u32 wg_size; + + /* Indices of the uniforms with the workgroup dispatch counts + * in the uniform stream. + */ + u32 wg_uniform_offsets[3]; + + /* Indirect BO */ + struct drm_gem_object *indirect; + + /* Context of the Indirect CSD job */ + struct ww_acquire_ctx acquire_ctx; +}; struct v3d_cpu_job { struct v3d_job base; enum v3d_cpu_job_type job_type; + + struct v3d_indirect_csd_info indirect_csd; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index ebbd00840a73..597e4ec3d28d 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -25,6 +25,8 @@ #include "v3d_regs.h" #include "v3d_trace.h" +#define V3D_CSD_CFG012_WG_COUNT_SHIFT 16 + static struct v3d_job * to_v3d_job(struct drm_sched_job *sched_job) { @@ -268,7 +270,44 @@ v3d_csd_job_run(struct drm_sched_job *sched_job) return fence; } -static const v3d_cpu_job_fn cpu_job_function[] = { }; +static void +v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job) +{ + struct v3d_indirect_csd_info *indirect_csd = &job->indirect_csd; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + struct v3d_bo *indirect = to_v3d_bo(indirect_csd->indirect); + struct drm_v3d_submit_csd *args = &indirect_csd->job->args; + u32 *wg_counts; + + v3d_get_bo_vaddr(bo); + v3d_get_bo_vaddr(indirect); + + wg_counts = (uint32_t *) (bo->vaddr + indirect_csd->offset); + + if (wg_counts[0] == 0 || wg_counts[1] == 0 || wg_counts[2] == 0) + return; + + args->cfg[0] = wg_counts[0] << V3D_CSD_CFG012_WG_COUNT_SHIFT; + args->cfg[1] = wg_counts[1] << V3D_CSD_CFG012_WG_COUNT_SHIFT; + args->cfg[2] = wg_counts[2] << V3D_CSD_CFG012_WG_COUNT_SHIFT; + args->cfg[4] = DIV_ROUND_UP(indirect_csd->wg_size, 16) * + (wg_counts[0] * wg_counts[1] * wg_counts[2]) - 1; + + for (int i = 0; i < 3; i++) { + /* 0xffffffff indicates that the uniform rewrite is not needed */ + if (indirect_csd->wg_uniform_offsets[i] != 0xffffffff) { + u32 uniform_idx = indirect_csd->wg_uniform_offsets[i]; + ((uint32_t *) indirect->vaddr)[uniform_idx] = wg_counts[i]; + } + } + + v3d_put_bo_vaddr(indirect); + v3d_put_bo_vaddr(bo); +} + +static const v3d_cpu_job_fn cpu_job_function[] = { + [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, +}; static struct dma_fence * v3d_cpu_job_run(struct drm_sched_job *sched_job) diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index eb26fe1e27e3..0320695b941b 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -391,6 +391,48 @@ v3d_get_multisync_submit_deps(struct drm_file *file_priv, return 0; } +/* Get data for the indirect CSD job submission. */ +static int +v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; + struct v3d_dev *v3d = v3d_priv->v3d; + struct drm_v3d_indirect_csd indirect_csd; + struct v3d_indirect_csd_info *info = &job->indirect_csd; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(&indirect_csd, ext, sizeof(indirect_csd))) + return -EFAULT; + + if (!v3d_has_csd(v3d)) { + DRM_DEBUG("Attempting CSD submit on non-CSD hardware.\n"); + return -EINVAL; + } + + job->job_type = V3D_CPU_JOB_TYPE_INDIRECT_CSD; + info->offset = indirect_csd.offset; + info->wg_size = indirect_csd.wg_size; + memcpy(&info->wg_uniform_offsets, &indirect_csd.wg_uniform_offsets, + sizeof(indirect_csd.wg_uniform_offsets)); + + info->indirect = drm_gem_object_lookup(file_priv, indirect_csd.indirect); + + return v3d_setup_csd_jobs_and_bos(file_priv, v3d, &indirect_csd.submit, + &info->job, &info->clean_job, + NULL, &info->acquire_ctx); +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -416,6 +458,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_MULTI_SYNC: ret = v3d_get_multisync_submit_deps(file_priv, user_ext, se); break; + case DRM_V3D_EXT_ID_CPU_INDIRECT_CSD: + ret = v3d_get_cpu_indirect_csd_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -790,7 +835,9 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, return ret; } -static const unsigned int cpu_job_bo_handle_count[] = { }; +static const unsigned int cpu_job_bo_handle_count[] = { + [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, +}; /** * v3d_submit_cpu_ioctl() - Submits a CPU job to the V3D. @@ -808,7 +855,10 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, struct v3d_dev *v3d = to_v3d_dev(dev); struct drm_v3d_submit_cpu *args = data; struct v3d_submit_ext se = {0}; + struct v3d_submit_ext *out_se = NULL; struct v3d_cpu_job *cpu_job = NULL; + struct v3d_csd_job *csd_job = NULL; + struct v3d_job *clean_job = NULL; struct ww_acquire_ctx acquire_ctx; int ret; @@ -847,6 +897,9 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, if (ret) goto fail; + clean_job = cpu_job->indirect_csd.clean_job; + csd_job = cpu_job->indirect_csd.job; + if (args->bo_handle_count) { ret = v3d_lookup_bos(dev, file_priv, &cpu_job->base, args->bo_handles, args->bo_handle_count); @@ -860,19 +913,66 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, mutex_lock(&v3d->sched_lock); v3d_push_job(&cpu_job->base); + + switch (cpu_job->job_type) { + case V3D_CPU_JOB_TYPE_INDIRECT_CSD: + ret = drm_sched_job_add_dependency(&csd_job->base.base, + dma_fence_get(cpu_job->base.done_fence)); + if (ret) + goto fail_unreserve; + + v3d_push_job(&csd_job->base); + + ret = drm_sched_job_add_dependency(&clean_job->base, + dma_fence_get(csd_job->base.done_fence)); + if (ret) + goto fail_unreserve; + + v3d_push_job(clean_job); + + break; + default: + break; + } mutex_unlock(&v3d->sched_lock); + out_se = (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) ? NULL : &se; + v3d_attach_fences_and_unlock_reservation(file_priv, &cpu_job->base, &acquire_ctx, 0, - NULL, cpu_job->base.done_fence); + out_se, cpu_job->base.done_fence); + + switch (cpu_job->job_type) { + case V3D_CPU_JOB_TYPE_INDIRECT_CSD: + v3d_attach_fences_and_unlock_reservation(file_priv, + clean_job, + &cpu_job->indirect_csd.acquire_ctx, + 0, &se, clean_job->done_fence); + break; + default: + break; + } v3d_job_put(&cpu_job->base); + v3d_job_put(&csd_job->base); + v3d_job_put(clean_job); return 0; +fail_unreserve: + mutex_unlock(&v3d->sched_lock); + + drm_gem_unlock_reservations(cpu_job->base.bo, cpu_job->base.bo_count, + &acquire_ctx); + + drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count, + &cpu_job->indirect_csd.acquire_ctx); + fail: v3d_job_cleanup((void *)cpu_job); + v3d_job_cleanup((void *)csd_job); + v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); return ret; diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 00abef9d0db7..0c0f47782528 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -71,7 +71,8 @@ extern "C" { struct drm_v3d_extension { __u64 next; __u32 id; -#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 +#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 +#define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 __u32 flags; /* mbz */ }; @@ -365,8 +366,46 @@ struct drm_v3d_submit_csd { __u32 pad; }; +/** + * struct drm_v3d_indirect_csd - ioctl extension for the CPU job to create an + * indirect CSD + * + * When an extension of DRM_V3D_EXT_ID_CPU_INDIRECT_CSD id is defined, it + * points to this extension to define a indirect CSD submission. It creates a + * CPU job linked to a CSD job. The CPU job waits for the indirect CSD + * dependencies and, once they are signaled, it updates the CSD job config + * before allowing the CSD job execution. + */ +struct drm_v3d_indirect_csd { + struct drm_v3d_extension base; + + /* Indirect CSD */ + struct drm_v3d_submit_csd submit; + + /* Handle of the indirect BO, that should be also attached to the + * indirect CSD. + */ + __u32 indirect; + + /* Offset within the BO where the workgroup counts are stored */ + __u32 offset; + + /* Workgroups size */ + __u32 wg_size; + + /* Indices of the uniforms with the workgroup dispatch counts + * in the uniform stream. If the uniform rewrite is not needed, + * the offset must be 0xffffffff. + */ + __u32 wg_uniform_offsets[3]; +}; + struct drm_v3d_submit_cpu { - /* Pointer to a u32 array of the BOs that are referenced by the job. */ + /* Pointer to a u32 array of the BOs that are referenced by the job. + * + * For DRM_V3D_EXT_ID_CPU_INDIRECT_CSD, it must contain only one BO, + * that contains the workgroup counts. + */ __u64 bo_handles; /* Number of BO handles passed in (size is that times 4). */ From patchwork Thu Nov 30 16:40:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474730 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DAFF1C4167B for ; Thu, 30 Nov 2023 16:46:10 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 21C9310E73F; Thu, 30 Nov 2023 16:46:10 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id A8B0710E73F for ; Thu, 30 Nov 2023 16:45:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=QfZ8sdMytiyG2rQqUXEfaHjW6zQZJoQ9iC1hwUGa8Rs=; b=Uz+DzjsrD+X0ldS3MSexviazY/ fG0h8SWobs22t0euVVYCyyesf/0hP920fyaoj6WwOCpLgsByg63PxSGNduPuYcOnT9lmOpM44x5Vn U/w+/bW/UUeRmBySTq2fS/RcwDKBO8f7vav1E/lw0DoKYQIIcRRLaABlOyhbspFv4TGdAhO+a232z Z6r0p/nzGYD8pHEGp3LEk5K+dTR19J6cm66EBUHAbs4rflGWoxv2h198CQpcsn+etapY45eCG3eGy 1tYL29953pkJlGAE5k/06VY8Aey/ohjpfjYVBBxo4TLsYIcM6luTkzrqq6WAdv3uLfHChX4pZAqoh /RiVs6vQ==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAf-008scY-Tf; Thu, 30 Nov 2023 17:45:50 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 13/17] drm/v3d: Create a CPU job extension for the timestamp query job Date: Thu, 30 Nov 2023 13:40:36 -0300 Message-ID: <20231130164420.932823-15-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A timestamp query job is a job that calculates the query timestamp and updates the query availability by signaling a syncobj. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a timestamp query job. This user extension will allow the creation of a CPU job that performs the timestamp query calculation and updates the timestamp BO with the proper value. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 17 +++++++++ drivers/gpu/drm/v3d/v3d_sched.c | 40 +++++++++++++++++++- drivers/gpu/drm/v3d/v3d_submit.c | 63 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 30 +++++++++++++++ 4 files changed, 149 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 202c0d4b04a5..dd86e745c260 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -318,6 +318,15 @@ struct v3d_csd_job { enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, + V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, +}; + +struct v3d_timestamp_query { + /* Offset of this query in the timestamp BO for its value. */ + u32 offset; + + /* Syncobj that indicates the timestamp availability */ + struct drm_syncobj *syncobj; }; struct v3d_indirect_csd_info { @@ -345,12 +354,20 @@ struct v3d_indirect_csd_info { struct ww_acquire_ctx acquire_ctx; }; +struct v3d_timestamp_query_info { + struct v3d_timestamp_query *queries; + + u32 count; +}; + struct v3d_cpu_job { struct v3d_job base; enum v3d_cpu_job_type job_type; struct v3d_indirect_csd_info indirect_csd; + + struct v3d_timestamp_query_info timestamp_query; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 597e4ec3d28d..f4cffefb6398 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -21,6 +21,8 @@ #include #include +#include + #include "v3d_drv.h" #include "v3d_regs.h" #include "v3d_trace.h" @@ -71,6 +73,21 @@ v3d_sched_job_free(struct drm_sched_job *sched_job) v3d_job_cleanup(job); } +static void +v3d_cpu_job_free(struct drm_sched_job *sched_job) +{ + struct v3d_cpu_job *job = to_cpu_job(sched_job); + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + + if (timestamp_query->queries) { + for (int i = 0; i < timestamp_query->count; i++) + drm_syncobj_put(timestamp_query->queries[i].syncobj); + kvfree(timestamp_query->queries); + } + + v3d_job_cleanup(&job->base); +} + static void v3d_switch_perfmon(struct v3d_dev *v3d, struct v3d_job *job) { @@ -305,8 +322,29 @@ v3d_rewrite_csd_job_wg_counts_from_indirect(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +v3d_timestamp_query(struct v3d_cpu_job *job) +{ + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + u8 *value_addr; + + v3d_get_bo_vaddr(bo); + + for (int i = 0; i < timestamp_query->count; i++) { + value_addr = ((u8 *) bo->vaddr) + timestamp_query->queries[i].offset; + *((u64 *) value_addr) = i == 0 ? ktime_get_ns() : 0ull; + + drm_syncobj_replace_fence(timestamp_query->queries[i].syncobj, + job->base.done_fence); + } + + v3d_put_bo_vaddr(bo); +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, + [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, }; static struct dma_fence * @@ -504,7 +542,7 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = { static const struct drm_sched_backend_ops v3d_cpu_sched_ops = { .run_job = v3d_cpu_job_run, .timedout_job = v3d_generic_job_timedout, - .free_job = v3d_sched_job_free + .free_job = v3d_cpu_job_free }; int diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 0320695b941b..83e029e786ea 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -433,6 +433,64 @@ v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv, NULL, &info->acquire_ctx); } +/* Get data for the query timestamp job submission. */ +static int +v3d_get_cpu_timestamp_query_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *offsets, *syncs; + struct drm_v3d_timestamp_query timestamp; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(×tamp, ext, sizeof(timestamp))) + return -EFAULT; + + if (timestamp.pad) + return -EINVAL; + + job->job_type = V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY; + + job->timestamp_query.queries = kvmalloc_array(timestamp.count, + sizeof(struct v3d_timestamp_query), + GFP_KERNEL); + if (!job->timestamp_query.queries) + return -ENOMEM; + + offsets = u64_to_user_ptr(timestamp.offsets); + syncs = u64_to_user_ptr(timestamp.syncs); + + for (int i = 0; i < timestamp.count; i++) { + u32 offset, sync; + + if (copy_from_user(&offset, offsets++, sizeof(offset))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].offset = offset; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + } + job->timestamp_query.count = timestamp.count; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -461,6 +519,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_INDIRECT_CSD: ret = v3d_get_cpu_indirect_csd_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY: + ret = v3d_get_cpu_timestamp_query_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -837,6 +898,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, + [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, }; /** @@ -974,6 +1036,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, v3d_job_cleanup((void *)csd_job); v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); + kvfree(cpu_job->timestamp_query.queries); return ret; } diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 0c0f47782528..239801b5e117 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -73,6 +73,7 @@ struct drm_v3d_extension { __u32 id; #define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 #define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 +#define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 __u32 flags; /* mbz */ }; @@ -400,11 +401,40 @@ struct drm_v3d_indirect_csd { __u32 wg_uniform_offsets[3]; }; +/** + * struct drm_v3d_timestamp_query - ioctl extension for the CPU job to calculate + * a timestamp query + * + * When an extension DRM_V3D_EXT_ID_TIMESTAMP_QUERY is defined, it points to + * this extension to define a timestamp query submission. This CPU job will + * calculate the timestamp query and update the query value within the + * timestamp BO. Moreover, it will signal the timestamp syncobj to indicate + * query availability. + */ +struct drm_v3d_timestamp_query { + struct drm_v3d_extension base; + + /* Array of queries' offsets within the timestamp BO for their value */ + __u64 offsets; + + /* Array of timestamp's syncobjs to indicate its availability */ + __u64 syncs; + + /* Number of queries */ + __u32 count; + + /* mbz */ + __u32 pad; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * * For DRM_V3D_EXT_ID_CPU_INDIRECT_CSD, it must contain only one BO, * that contains the workgroup counts. + * + * For DRM_V3D_EXT_ID_TIMESTAMP_QUERY, it must contain only one BO, + * that will contain the timestamp. */ __u64 bo_handles; From patchwork Thu Nov 30 16:40:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474731 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 50C66C10DC2 for ; Thu, 30 Nov 2023 16:46:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2758A10E740; Thu, 30 Nov 2023 16:46:10 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3F54010E752 for ; Thu, 30 Nov 2023 16:46:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=+4Gkf1Rr+bQkO2IQwgoneHDYOxys5ACJlBxKmMYZe7o=; b=hgZu0ZxJrDXxmTJxTV59OlrNn0 FVPfi8UjGAjR/5g9I9XSSFbvosItBXQjXjZqbgxvE5Mp6i6rd77jdmbyvm4DYx3PvrtWAYdab6Y+k sBzT+eInn1KRnxEq88YnLMCFQA1xH5S7c3mb1NY5hTYms2WYzA4LvEzswVzdKcJ5txpbTybNOCHak 64jho98UwTfW7rckW+R9ImDxSo4JJO6/dNDaCog6/kXws/Hhv7+ACwVzFmJdUmOalMHdovbvDPCuQ n4z3fCwBb8n1zCG4N9kEl8hOMRM1NbZu/4T/W0ps3wUljBaLtA6ZvdrceoowQISqMpn3t/eiA6wRF MqqmUmLA==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAj-008scY-EB; Thu, 30 Nov 2023 17:45:53 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 14/17] drm/v3d: Create a CPU job extension for the reset timestamp job Date: Thu, 30 Nov 2023 13:40:37 -0300 Message-ID: <20231130164420.932823-16-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A reset timestamp job is a job that resets the timestamp queries based on the value offset of the first query. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a reset timestamp job. This user extension will allow the creation of a CPU job that resets the timestamp value in the timestamp BO and resets the availability syncobj. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 1 + drivers/gpu/drm/v3d/v3d_sched.c | 21 +++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 52 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 27 +++++++++++++++++ 4 files changed, 101 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index dd86e745c260..3988407635ed 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -319,6 +319,7 @@ struct v3d_csd_job { enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, + V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, }; struct v3d_timestamp_query { diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index f4cffefb6398..3a435f621b9b 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -342,9 +342,30 @@ v3d_timestamp_query(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +v3d_reset_timestamp_queries(struct v3d_cpu_job *job) +{ + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_timestamp_query *queries = timestamp_query->queries; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + u8 *value_addr; + + v3d_get_bo_vaddr(bo); + + for (int i = 0; i < timestamp_query->count; i++) { + value_addr = ((u8 *) bo->vaddr) + queries[i].offset; + *((u64 *) value_addr) = 0; + + drm_syncobj_replace_fence(queries[i].syncobj, NULL); + } + + v3d_put_bo_vaddr(bo); +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, + [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 83e029e786ea..1c719416e26a 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -491,6 +491,54 @@ v3d_get_cpu_timestamp_query_params(struct drm_file *file_priv, return 0; } +static int +v3d_get_cpu_reset_timestamp_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *syncs; + struct drm_v3d_reset_timestamp_query reset; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(&reset, ext, sizeof(reset))) + return -EFAULT; + + job->job_type = V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY; + + job->timestamp_query.queries = kvmalloc_array(reset.count, + sizeof(struct v3d_timestamp_query), + GFP_KERNEL); + if (!job->timestamp_query.queries) + return -ENOMEM; + + syncs = u64_to_user_ptr(reset.syncs); + + for (int i = 0; i < reset.count; i++) { + u32 sync; + + job->timestamp_query.queries[i].offset = reset.offset + 8 * i; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + } + job->timestamp_query.count = reset.count; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -522,6 +570,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY: ret = v3d_get_cpu_timestamp_query_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY: + ret = v3d_get_cpu_reset_timestamp_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -899,6 +950,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, + [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, }; /** diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 239801b5e117..930f8d07f088 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -74,6 +74,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_MULTI_SYNC 0x01 #define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 #define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 +#define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 __u32 flags; /* mbz */ }; @@ -427,6 +428,29 @@ struct drm_v3d_timestamp_query { __u32 pad; }; +/** + * struct drm_v3d_reset_timestamp_query - ioctl extension for the CPU job to + * reset timestamp queries + * + * When an extension DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY is defined, it + * points to this extension to define a reset timestamp submission. This CPU + * job will reset the timestamp queries based on value offset of the first + * query. Moreover, it will reset the timestamp syncobj to reset query + * availability. + */ +struct drm_v3d_reset_timestamp_query { + struct drm_v3d_extension base; + + /* Array of timestamp's syncobjs to indicate its availability */ + __u64 syncs; + + /* Offset of the first query within the timestamp BO for its value */ + __u32 offset; + + /* Number of queries */ + __u32 count; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -435,6 +459,9 @@ struct drm_v3d_submit_cpu { * * For DRM_V3D_EXT_ID_TIMESTAMP_QUERY, it must contain only one BO, * that will contain the timestamp. + * + * For DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY, it must contain only + * one BO, that contains the timestamp. */ __u64 bo_handles; From patchwork Thu Nov 30 16:40:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474734 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0BCFFC10DC2 for ; Thu, 30 Nov 2023 16:46:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B10DB10E74A; Thu, 30 Nov 2023 16:46:22 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id D457510E741 for ; Thu, 30 Nov 2023 16:46:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=5jDrhugZE6UU1Lf2u+b+5RByAL5o94BVphHSx6StmBE=; b=U4sVwlQozWcmIKumVuLimvvnDk 9LHdtSt/ncX6tUN8GT4rRJoF8DCtux70uPHD0f9mGJUZ5ZyFw9c8DSNsLpzYJD4bBt+oyAlQo1wLS EOPKplgBviNYW2z3LvVBQI4OaD/SxVRgBtmUvvvex7Kqg9V+S6wt74DQmgf542i0hgtytlSUyeaQM Y5PZO/nRfUV+V8xI1QVSmYw/ePlHbD+SpAoli7gQFUhQtPlmOcr4r5YavfkaX6pF0z6iocjosUs5l PGFoeT2SVaCeCciidHtKQig178H4pifwwlOmCKWnyZc46tVbYv5cBAR3u/qLqm04RCjhGmbPdTRc2 9P/Vb8dg==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAm-008scY-Vb; Thu, 30 Nov 2023 17:45:57 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 15/17] drm/v3d: Create a CPU job extension to copy timestamp query to a buffer Date: Thu, 30 Nov 2023 13:40:38 -0300 Message-ID: <20231130164420.932823-17-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A copy timestamp query job is a job that copy the complete or partial result of a query to a buffer. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a copy timestamp query job. This user extension will allow the creation of a CPU job that copy the results of a timestamp query to a BO with the possibility to indicate the timestamp availability with a availability bit. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 20 +++++++++ drivers/gpu/drm/v3d/v3d_sched.c | 53 ++++++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 69 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 45 +++++++++++++++++++++ 4 files changed, 187 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 3988407635ed..5058a354fffd 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -320,6 +320,7 @@ enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_INDIRECT_CSD = 1, V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, + V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY, }; struct v3d_timestamp_query { @@ -361,6 +362,23 @@ struct v3d_timestamp_query_info { u32 count; }; +struct v3d_copy_query_results_info { + /* Define if should write to buffer using 64 or 32 bits */ + bool do_64bit; + + /* Define if it can write to buffer even if the query is not available */ + bool do_partial; + + /* Define if it should write availability bit to buffer */ + bool availability_bit; + + /* Offset of the copy buffer in the BO */ + u32 offset; + + /* Stride of the copy buffer in the BO */ + u32 stride; +}; + struct v3d_cpu_job { struct v3d_job base; @@ -369,6 +387,8 @@ struct v3d_cpu_job { struct v3d_indirect_csd_info indirect_csd; struct v3d_timestamp_query_info timestamp_query; + + struct v3d_copy_query_results_info copy; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 3a435f621b9b..07c897cd3423 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -361,11 +361,64 @@ v3d_reset_timestamp_queries(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +write_to_buffer(void *dst, u32 idx, bool do_64bit, u64 value) +{ + if (do_64bit) { + u64 *dst64 = (u64 *) dst; + dst64[idx] = value; + } else { + u32 *dst32 = (u32 *) dst; + dst32[idx] = (u32) value; + } +} + +static void +v3d_copy_query_results(struct v3d_cpu_job *job) +{ + struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_timestamp_query *queries = timestamp_query->queries; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + struct v3d_bo *timestamp = to_v3d_bo(job->base.bo[1]); + struct v3d_copy_query_results_info *copy = &job->copy; + struct dma_fence *fence; + u8 *query_addr; + bool available, write_result; + u8 *data; + int i; + + v3d_get_bo_vaddr(bo); + v3d_get_bo_vaddr(timestamp); + + data = ((u8 *) bo->vaddr) + copy->offset; + + for (i = 0; i < timestamp_query->count; i++) { + fence = drm_syncobj_fence_get(queries[i].syncobj); + available = fence ? dma_fence_is_signaled(fence) : false; + + write_result = available || copy->do_partial; + if (write_result) { + query_addr = ((u8 *) timestamp->vaddr) + queries[i].offset; + write_to_buffer(data, 0, copy->do_64bit, *((u64 *) query_addr)); + } + + if (copy->availability_bit) + write_to_buffer(data, 1, copy->do_64bit, available ? 1u : 0u); + + data += copy->stride; + + dma_fence_put(fence); + } + + v3d_put_bo_vaddr(timestamp); + v3d_put_bo_vaddr(bo); +} static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, + [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = v3d_copy_query_results, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 1c719416e26a..bafd49c6440c 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -539,6 +539,71 @@ v3d_get_cpu_reset_timestamp_params(struct drm_file *file_priv, return 0; } +/* Get data for the copy timestamp query results job submission. */ +static int +v3d_get_cpu_copy_query_results_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *offsets, *syncs; + struct drm_v3d_copy_timestamp_query copy; + int i; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(©, ext, sizeof(copy))) + return -EFAULT; + + if (copy.pad) + return -EINVAL; + + job->job_type = V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY; + + job->timestamp_query.queries = kvmalloc_array(copy.count, + sizeof(struct v3d_timestamp_query), + GFP_KERNEL); + if (!job->timestamp_query.queries) + return -ENOMEM; + + offsets = u64_to_user_ptr(copy.offsets); + syncs = u64_to_user_ptr(copy.syncs); + + for (i = 0; i < copy.count; i++) { + u32 offset, sync; + + if (copy_from_user(&offset, offsets++, sizeof(offset))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].offset = offset; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->timestamp_query.queries); + return -EFAULT; + } + + job->timestamp_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + } + job->timestamp_query.count = copy.count; + + job->copy.do_64bit = copy.do_64bit; + job->copy.do_partial = copy.do_partial; + job->copy.availability_bit = copy.availability_bit; + job->copy.offset = copy.offset; + job->copy.stride = copy.stride; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -573,6 +638,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY: ret = v3d_get_cpu_reset_timestamp_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY: + ret = v3d_get_cpu_copy_query_results_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -951,6 +1019,7 @@ static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = 1, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, + [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 2, }; /** diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index 930f8d07f088..f7defbf3d043 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -75,6 +75,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_CPU_INDIRECT_CSD 0x02 #define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 #define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 +#define DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY 0x05 __u32 flags; /* mbz */ }; @@ -451,6 +452,46 @@ struct drm_v3d_reset_timestamp_query { __u32 count; }; +/** + * struct drm_v3d_copy_timestamp_query - ioctl extension for the CPU job to copy + * query results to a buffer + * + * When an extension DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY is defined, it + * points to this extension to define a copy timestamp query submission. This + * CPU job will copy the timestamp queries results to a BO with the offset + * and stride defined in the extension. + */ +struct drm_v3d_copy_timestamp_query { + struct drm_v3d_extension base; + + /* Define if should write to buffer using 64 or 32 bits */ + __u8 do_64bit; + + /* Define if it can write to buffer even if the query is not available */ + __u8 do_partial; + + /* Define if it should write availability bit to buffer */ + __u8 availability_bit; + + /* mbz */ + __u8 pad; + + /* Offset of the buffer in the BO */ + __u32 offset; + + /* Stride of the buffer in the BO */ + __u32 stride; + + /* Number of queries */ + __u32 count; + + /* Array of queries' offsets within the timestamp BO for their value */ + __u64 offsets; + + /* Array of timestamp's syncobjs to indicate its availability */ + __u64 syncs; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -462,6 +503,10 @@ struct drm_v3d_submit_cpu { * * For DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY, it must contain only * one BO, that contains the timestamp. + * + * For DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY, it must contain two + * BOs. The first is the BO where the timestamp queries will be written + * to. The second is the BO that contains the timestamp. */ __u64 bo_handles; From patchwork Thu Nov 30 16:40:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474732 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E61D1C07CA9 for ; Thu, 30 Nov 2023 16:46:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9885C10E742; Thu, 30 Nov 2023 16:46:11 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id F0AF110E742 for ; Thu, 30 Nov 2023 16:46:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=Ug+j3HcVirfC3zvlULiblw55kw8k0lFlQPhSnPQZ5s4=; b=pXAIUxyQJ4hIr7b7jS2gIlSIA1 H2pNWGWDbco/pGuk+coB0N6E5909wQTzlNMh2anf1yDA0LBoyCfCRZQ78zGV+uOh8Fle6jXAwYzcs aX7giZ76NCn7Bc+Va/6QKErSrwvcANKGi4IbNLtUVQf6VCyW9i8GfCL4OHfvZ9fNozM4IdykOeKE+ D+fydKHOs5FDZcS5GBPNu47sZOhNg4XXsC5xAeS/JLx7Mv1i9Go8sgzriyeM5V2eOeB+uoFIK36s/ Cizg7+byeSvLUWHI50h9ylAB5AV0iAJ3cCQk2aP4rekncISlzbDGNvi/CFJOwILx3+/x4JkWgYlaj nbOQmYYw==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAq-008scY-Py; Thu, 30 Nov 2023 17:46:01 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 16/17] drm/v3d: Create a CPU job extension for the reset performance query job Date: Thu, 30 Nov 2023 13:40:39 -0300 Message-ID: <20231130164420.932823-18-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A reset performance query job is a job that resets the performance queries by resetting the values of the perfmons. Moreover, we also reset the syncobjs related to the availability of the query. So, create a user extension for the CPU job that enables the creation of a reset performance job. This user extension will allow the creation of a CPU job that resets the perfmons values and resets the availability syncobj. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 28 ++++++++++++ drivers/gpu/drm/v3d/v3d_sched.c | 37 ++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 73 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 30 +++++++++++++ 4 files changed, 168 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 5058a354fffd..0f7f80ad8d88 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -321,6 +321,7 @@ enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY, + V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY, }; struct v3d_timestamp_query { @@ -331,6 +332,18 @@ struct v3d_timestamp_query { struct drm_syncobj *syncobj; }; +/* Number of perfmons required to handle all supported performance counters */ +#define V3D_MAX_PERFMONS DIV_ROUND_UP(V3D_PERFCNT_NUM, \ + DRM_V3D_MAX_PERF_COUNTERS) + +struct v3d_performance_query { + /* Performance monitor IDs for this query */ + u32 kperfmon_ids[V3D_MAX_PERFMONS]; + + /* Syncobj that indicates the query availability */ + struct drm_syncobj *syncobj; +}; + struct v3d_indirect_csd_info { /* Indirect CSD */ struct v3d_csd_job *job; @@ -362,6 +375,19 @@ struct v3d_timestamp_query_info { u32 count; }; +struct v3d_performance_query_info { + struct v3d_performance_query *queries; + + /* Number of performance queries */ + u32 count; + + /* Number of performance monitors related to that query pool */ + u32 nperfmons; + + /* Number of performance counters related to that query pool */ + u32 ncounters; +}; + struct v3d_copy_query_results_info { /* Define if should write to buffer using 64 or 32 bits */ bool do_64bit; @@ -389,6 +415,8 @@ struct v3d_cpu_job { struct v3d_timestamp_query_info timestamp_query; struct v3d_copy_query_results_info copy; + + struct v3d_performance_query_info performance_query; }; typedef void (*v3d_cpu_job_fn)(struct v3d_cpu_job *); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 07c897cd3423..452c4a1db52e 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -78,6 +78,7 @@ v3d_cpu_job_free(struct drm_sched_job *sched_job) { struct v3d_cpu_job *job = to_cpu_job(sched_job); struct v3d_timestamp_query_info *timestamp_query = &job->timestamp_query; + struct v3d_performance_query_info *performance_query = &job->performance_query; if (timestamp_query->queries) { for (int i = 0; i < timestamp_query->count; i++) @@ -85,6 +86,12 @@ v3d_cpu_job_free(struct drm_sched_job *sched_job) kvfree(timestamp_query->queries); } + if (performance_query->queries) { + for (int i = 0; i < performance_query->count; i++) + drm_syncobj_put(performance_query->queries[i].syncobj); + kvfree(performance_query->queries); + } + v3d_job_cleanup(&job->base); } @@ -361,6 +368,7 @@ v3d_reset_timestamp_queries(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } + static void write_to_buffer(void *dst, u32 idx, bool do_64bit, u64 value) { @@ -414,11 +422,40 @@ v3d_copy_query_results(struct v3d_cpu_job *job) v3d_put_bo_vaddr(bo); } +static void +v3d_reset_performance_queries(struct v3d_cpu_job *job) +{ + struct v3d_performance_query_info *performance_query = &job->performance_query; + struct v3d_file_priv *v3d_priv = job->base.file->driver_priv; + struct v3d_dev *v3d = job->base.v3d; + struct v3d_perfmon *perfmon; + + for (int i = 0; i < performance_query->count; i++) { + for (int j = 0; j < performance_query->nperfmons; j++) { + perfmon = v3d_perfmon_find(v3d_priv, + performance_query->queries[i].kperfmon_ids[j]); + if (!perfmon) { + DRM_DEBUG("Failed to find perfmon."); + continue; + } + + v3d_perfmon_stop(v3d, perfmon, false); + + memset(perfmon->values, 0, perfmon->ncounters * sizeof(u64)); + + v3d_perfmon_put(perfmon); + } + + drm_syncobj_replace_fence(performance_query->queries[i].syncobj, NULL); + } +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = v3d_copy_query_results, + [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = v3d_reset_performance_queries, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index bafd49c6440c..20af8ae14831 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -604,6 +604,74 @@ v3d_get_cpu_copy_query_results_params(struct drm_file *file_priv, return 0; } +static int +v3d_get_cpu_reset_performance_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *syncs; + u64 __user *kperfmon_ids; + struct drm_v3d_reset_performance_query reset; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(&reset, ext, sizeof(reset))) + return -EFAULT; + + job->job_type = V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY; + + job->performance_query.queries = kvmalloc_array(reset.count, + sizeof(struct v3d_performance_query), + GFP_KERNEL); + if (!job->performance_query.queries) + return -ENOMEM; + + syncs = u64_to_user_ptr(reset.syncs); + kperfmon_ids = u64_to_user_ptr(reset.kperfmon_ids); + + for (int i = 0; i < reset.count; i++) { + u32 sync; + u64 ids; + u32 __user *ids_pointer; + u32 id; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + + if (copy_from_user(&ids, kperfmon_ids++, sizeof(ids))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + ids_pointer = u64_to_user_ptr(ids); + + for (int j = 0; j < reset.nperfmons; j++) { + if (copy_from_user(&id, ids_pointer++, sizeof(id))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].kperfmon_ids[j] = id; + } + } + job->performance_query.count = reset.count; + job->performance_query.nperfmons = reset.nperfmons; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -641,6 +709,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY: ret = v3d_get_cpu_copy_query_results_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY: + ret = v3d_get_cpu_reset_performance_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -1020,6 +1091,7 @@ static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 2, + [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = 0, }; /** @@ -1158,6 +1230,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); kvfree(cpu_job->timestamp_query.queries); + kvfree(cpu_job->performance_query.queries); return ret; } diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index f7defbf3d043..d609bdf29fd6 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -76,6 +76,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_CPU_TIMESTAMP_QUERY 0x03 #define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 #define DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY 0x05 +#define DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY 0x06 __u32 flags; /* mbz */ }; @@ -492,6 +493,32 @@ struct drm_v3d_copy_timestamp_query { __u64 syncs; }; +/** + * struct drm_v3d_reset_performance_query - ioctl extension for the CPU job to + * reset performance queries + * + * When an extension DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY is defined, it + * points to this extension to define a reset performance submission. This CPU + * job will reset the performance queries by resetting the values of the + * performance monitors. Moreover, it will reset the syncobj to reset query + * availability. + */ +struct drm_v3d_reset_performance_query { + struct drm_v3d_extension base; + + /* Array of performance queries's syncobjs to indicate its availability */ + __u64 syncs; + + /* Number of queries */ + __u32 count; + + /* Number of performance monitors */ + __u32 nperfmons; + + /* Array of u64 user-pointers that point to an array of kperfmon_ids */ + __u64 kperfmon_ids; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -507,6 +534,9 @@ struct drm_v3d_submit_cpu { * For DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY, it must contain two * BOs. The first is the BO where the timestamp queries will be written * to. The second is the BO that contains the timestamp. + * + * For DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY, it must contain no + * BOs. */ __u64 bo_handles; From patchwork Thu Nov 30 16:40:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ma=C3=ADra_Canal?= X-Patchwork-Id: 13474735 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C95DCC07CA9 for ; Thu, 30 Nov 2023 16:46:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0113010E744; Thu, 30 Nov 2023 16:46:23 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id A060410E744 for ; Thu, 30 Nov 2023 16:46:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=KhGjbdM/a7A1tWLDIsZWcd3Fa0O/uPVcUQ3oPWovFBY=; b=MznUfNLTWNPlwpbY0ryjQUdyY+ k1Eq5NE0HohKUTDAqOm5Qu79OMYXR1nAMRGW7IQAR9mI28VKJvhWf9QOWlY+6cKDdt4BAt2WLQe7d UlXF+7dObf1BqDZnD/yr+NW2u/jTrDVKS3VjAwb+xwsQ5pXvTDAOW82zABGtOznziS5NF2WMMaDkF ONEi2gH5Ff5eIckV27eC2qY47r04nrllamLGCJ5DKIZDx/8Nm8A2pWw+7jg4JbNihxwRoaeaJCvQ4 L5f6uIcIonB4tG6vErxC3Aj0shgWSrF5g6fydkUqe4W1Q/uyXWAlfi/S+JGdNKAfdKlvavGt/FOVg qNp2+3/Q==; Received: from [177.34.169.138] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1r8kAu-008scY-Fx; Thu, 30 Nov 2023 17:46:05 +0100 From: =?utf-8?q?Ma=C3=ADra_Canal?= To: Melissa Wen , Iago Toral , David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann Subject: [PATCH v4 17/17] drm/v3d: Create a CPU job extension for the copy performance query job Date: Thu, 30 Nov 2023 13:40:40 -0300 Message-ID: <20231130164420.932823-19-mcanal@igalia.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231130164420.932823-2-mcanal@igalia.com> References: <20231130164420.932823-2-mcanal@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Ma=C3=ADra_Canal?= , kernel-dev@igalia.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A CPU job is a type of job that performs operations that requires CPU intervention. A copy performance query job is a job that copy the complete or partial result of a query to a buffer. In order to copy the result of a performance query to a buffer, we need to get the values from the performance monitors. So, create a user extension for the CPU job that enables the creation of a copy performance query job. This user extension will allow the creation of a CPU job that copy the results of a performance query to a BO with the possibility to indicate the availability with a availability bit. Signed-off-by: Maíra Canal Reviewed-by: Iago Toral Quiroga --- drivers/gpu/drm/v3d/v3d_drv.h | 1 + drivers/gpu/drm/v3d/v3d_sched.c | 66 +++++++++++++++++++++++++ drivers/gpu/drm/v3d/v3d_submit.c | 82 ++++++++++++++++++++++++++++++++ include/uapi/drm/v3d_drm.h | 50 +++++++++++++++++++ 4 files changed, 199 insertions(+) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 0f7f80ad8d88..3c7d58866570 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -322,6 +322,7 @@ enum v3d_cpu_job_type { V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY, V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY, + V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY, }; struct v3d_timestamp_query { diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 452c4a1db52e..203c32ed99d4 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -450,12 +450,78 @@ v3d_reset_performance_queries(struct v3d_cpu_job *job) } } +static void +v3d_write_performance_query_result(struct v3d_cpu_job *job, void *data, u32 query) +{ + struct v3d_performance_query_info *performance_query = &job->performance_query; + struct v3d_copy_query_results_info *copy = &job->copy; + struct v3d_file_priv *v3d_priv = job->base.file->driver_priv; + struct v3d_dev *v3d = job->base.v3d; + struct v3d_perfmon *perfmon; + u64 counter_values[V3D_PERFCNT_NUM]; + + for (int i = 0; i < performance_query->nperfmons; i++) { + perfmon = v3d_perfmon_find(v3d_priv, + performance_query->queries[query].kperfmon_ids[i]); + if (!perfmon) { + DRM_DEBUG("Failed to find perfmon."); + continue; + } + + v3d_perfmon_stop(v3d, perfmon, true); + + memcpy(&counter_values[i * DRM_V3D_MAX_PERF_COUNTERS], perfmon->values, + perfmon->ncounters * sizeof(u64)); + + v3d_perfmon_put(perfmon); + } + + for (int i = 0; i < performance_query->ncounters; i++) + write_to_buffer(data, i, copy->do_64bit, counter_values[i]); +} + + +static void +v3d_copy_performance_query(struct v3d_cpu_job *job) +{ + struct v3d_performance_query_info *performance_query = &job->performance_query; + struct v3d_copy_query_results_info *copy = &job->copy; + struct v3d_bo *bo = to_v3d_bo(job->base.bo[0]); + struct dma_fence *fence; + bool available, write_result; + u8 *data; + + v3d_get_bo_vaddr(bo); + + data = ((u8 *) bo->vaddr) + copy->offset; + + for (int i = 0; i < performance_query->count; i++) { + fence = drm_syncobj_fence_get(performance_query->queries[i].syncobj); + available = fence ? dma_fence_is_signaled(fence) : false; + + write_result = available || copy->do_partial; + if (write_result) + v3d_write_performance_query_result(job, data, i); + + if (copy->availability_bit) + write_to_buffer(data, performance_query->ncounters, + copy->do_64bit, available ? 1u : 0u); + + data += copy->stride; + + dma_fence_put(fence); + } + + v3d_put_bo_vaddr(bo); +} + static const v3d_cpu_job_fn cpu_job_function[] = { [V3D_CPU_JOB_TYPE_INDIRECT_CSD] = v3d_rewrite_csd_job_wg_counts_from_indirect, [V3D_CPU_JOB_TYPE_TIMESTAMP_QUERY] = v3d_timestamp_query, [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = v3d_reset_timestamp_queries, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = v3d_copy_query_results, [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = v3d_reset_performance_queries, + [V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY] = v3d_copy_performance_query, }; static struct dma_fence * diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index 20af8ae14831..d7a9da2484fd 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -672,6 +672,84 @@ v3d_get_cpu_reset_performance_params(struct drm_file *file_priv, return 0; } +static int +v3d_get_cpu_copy_performance_query_params(struct drm_file *file_priv, + struct drm_v3d_extension __user *ext, + struct v3d_cpu_job *job) +{ + u32 __user *syncs; + u64 __user *kperfmon_ids; + struct drm_v3d_copy_performance_query copy; + + if (!job) { + DRM_DEBUG("CPU job extension was attached to a GPU job.\n"); + return -EINVAL; + } + + if (job->job_type) { + DRM_DEBUG("Two CPU job extensions were added to the same CPU job.\n"); + return -EINVAL; + } + + if (copy_from_user(©, ext, sizeof(copy))) + return -EFAULT; + + if (copy.pad) + return -EINVAL; + + job->job_type = V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY; + + job->performance_query.queries = kvmalloc_array(copy.count, + sizeof(struct v3d_performance_query), + GFP_KERNEL); + if (!job->performance_query.queries) + return -ENOMEM; + + syncs = u64_to_user_ptr(copy.syncs); + kperfmon_ids = u64_to_user_ptr(copy.kperfmon_ids); + + for (int i = 0; i < copy.count; i++) { + u32 sync; + u64 ids; + u32 __user *ids_pointer; + u32 id; + + if (copy_from_user(&sync, syncs++, sizeof(sync))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].syncobj = drm_syncobj_find(file_priv, sync); + + if (copy_from_user(&ids, kperfmon_ids++, sizeof(ids))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + ids_pointer = u64_to_user_ptr(ids); + + for (int j = 0; j < copy.nperfmons; j++) { + if (copy_from_user(&id, ids_pointer++, sizeof(id))) { + kvfree(job->performance_query.queries); + return -EFAULT; + } + + job->performance_query.queries[i].kperfmon_ids[j] = id; + } + } + job->performance_query.count = copy.count; + job->performance_query.nperfmons = copy.nperfmons; + job->performance_query.ncounters = copy.ncounters; + + job->copy.do_64bit = copy.do_64bit; + job->copy.do_partial = copy.do_partial; + job->copy.availability_bit = copy.availability_bit; + job->copy.offset = copy.offset; + job->copy.stride = copy.stride; + + return 0; +} + /* Whenever userspace sets ioctl extensions, v3d_get_extensions parses data * according to the extension id (name). */ @@ -712,6 +790,9 @@ v3d_get_extensions(struct drm_file *file_priv, case DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY: ret = v3d_get_cpu_reset_performance_params(file_priv, user_ext, job); break; + case DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY: + ret = v3d_get_cpu_copy_performance_query_params(file_priv, user_ext, job); + break; default: DRM_DEBUG_DRIVER("Unknown extension id: %d\n", ext.id); return -EINVAL; @@ -1092,6 +1173,7 @@ static const unsigned int cpu_job_bo_handle_count[] = { [V3D_CPU_JOB_TYPE_RESET_TIMESTAMP_QUERY] = 1, [V3D_CPU_JOB_TYPE_COPY_TIMESTAMP_QUERY] = 2, [V3D_CPU_JOB_TYPE_RESET_PERFORMANCE_QUERY] = 0, + [V3D_CPU_JOB_TYPE_COPY_PERFORMANCE_QUERY] = 1, }; /** diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h index d609bdf29fd6..dce1835eced4 100644 --- a/include/uapi/drm/v3d_drm.h +++ b/include/uapi/drm/v3d_drm.h @@ -77,6 +77,7 @@ struct drm_v3d_extension { #define DRM_V3D_EXT_ID_CPU_RESET_TIMESTAMP_QUERY 0x04 #define DRM_V3D_EXT_ID_CPU_COPY_TIMESTAMP_QUERY 0x05 #define DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY 0x06 +#define DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY 0x07 __u32 flags; /* mbz */ }; @@ -519,6 +520,52 @@ struct drm_v3d_reset_performance_query { __u64 kperfmon_ids; }; +/** + * struct drm_v3d_copy_performance_query - ioctl extension for the CPU job to copy + * performance query results to a buffer + * + * When an extension DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY is defined, it + * points to this extension to define a copy performance query submission. This + * CPU job will copy the performance queries results to a BO with the offset + * and stride defined in the extension. + */ +struct drm_v3d_copy_performance_query { + struct drm_v3d_extension base; + + /* Define if should write to buffer using 64 or 32 bits */ + __u8 do_64bit; + + /* Define if it can write to buffer even if the query is not available */ + __u8 do_partial; + + /* Define if it should write availability bit to buffer */ + __u8 availability_bit; + + /* mbz */ + __u8 pad; + + /* Offset of the buffer in the BO */ + __u32 offset; + + /* Stride of the buffer in the BO */ + __u32 stride; + + /* Number of performance monitors */ + __u32 nperfmons; + + /* Number of performance counters related to this query pool */ + __u32 ncounters; + + /* Number of queries */ + __u32 count; + + /* Array of performance queries's syncobjs to indicate its availability */ + __u64 syncs; + + /* Array of u64 user-pointers that point to an array of kperfmon_ids */ + __u64 kperfmon_ids; +}; + struct drm_v3d_submit_cpu { /* Pointer to a u32 array of the BOs that are referenced by the job. * @@ -537,6 +584,9 @@ struct drm_v3d_submit_cpu { * * For DRM_V3D_EXT_ID_CPU_RESET_PERFORMANCE_QUERY, it must contain no * BOs. + * + * For DRM_V3D_EXT_ID_CPU_COPY_PERFORMANCE_QUERY, it must contain one + * BO, where the performance queries will be written. */ __u64 bo_handles;