From patchwork Mon Dec 30 16:52:46 2024
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 13923360
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 01/14] drm/sched: Delete unused update_job_credits
Date: Mon, 30 Dec 2024 16:52:46 +0000
Message-ID: <20241230165259.95855-2-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

No driver is using the update_job_credits() scheduler vfunc, so let's
remove it.
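For illustration only, a driver-side implementation of the hook being
deleted could have looked roughly like the sketch below, going by the
kernel-doc removed further down: recompute how many credits the job still
needs, for example by skipping commands whose native fences have already
signalled. The foo_* names are hypothetical and not taken from any
in-tree driver.

	static u32 foo_update_job_credits(struct drm_sched_job *sched_job)
	{
		struct foo_job *job = to_foo_job(sched_job); /* hypothetical wrapper */
		u32 credits = 0;
		unsigned int i;

		/* Only count ring space for commands still waiting on their fence. */
		for (i = 0; i < job->num_cmds; i++)
			if (!dma_fence_is_signaled(job->cmd_fence[i]))
				credits += job->cmd_credits[i];

		/* Zero credits would bypass job-flow control, so clamp to one. */
		return credits ? credits : 1;
	}

Since no driver ever implemented the callback, both the hook and the
per-job credit update it implied can go away.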
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_main.c | 13 ------------- include/drm/gpu_scheduler.h | 13 ------------- 2 files changed, 26 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 7ce25281c74c..1734c17aeea5 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -64,12 +64,6 @@ * credit limit, the job won't be executed. Instead, the scheduler will wait * until the credit count has decreased enough to not overflow its credit limit. * This implies waiting for previously executed jobs. - * - * Optionally, drivers may register a callback (update_job_credits) provided by - * struct drm_sched_backend_ops to update the job's credits dynamically. The - * scheduler executes this callback every time the scheduler considers a job for - * execution and subsequently checks whether the job fits the scheduler's credit - * limit. */ #include @@ -133,13 +127,6 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, if (!s_job) return false; - if (sched->ops->update_job_credits) { - s_job->credits = sched->ops->update_job_credits(s_job); - - drm_WARN(sched, !s_job->credits, - "Jobs with zero credits bypass job-flow control.\n"); - } - /* If a job exceeds the credit limit, truncate it to the credit limit * itself to guarantee forward progress. */ diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 95e17504e46a..e2e6af8849c6 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -476,19 +476,6 @@ struct drm_sched_backend_ops { * and it's time to clean it up. */ void (*free_job)(struct drm_sched_job *sched_job); - - /** - * @update_job_credits: Called when the scheduler is considering this - * job for execution. - * - * This callback returns the number of credits the job would take if - * pushed to the hardware. Drivers may use this to dynamically update - * the job's credit count. For instance, deduct the number of credits - * for already signalled native fences. - * - * This callback is optional. 
- */ - u32 (*update_job_credits)(struct drm_sched_job *sched_job); }; /**

From patchwork Mon Dec 30 16:52:47 2024
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 13923353
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 02/14] drm/sched: Remove idle entity from tree
Date: Mon, 30 Dec 2024 16:52:47 +0000
Message-ID: <20241230165259.95855-3-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

There is no need to keep entities with no queued jobs in the tree, so
let's remove an entity once its last job has been consumed. This keeps
the tree smaller, which is both tidier and more efficient.
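Condensed from the hunk below, the FIFO bookkeeping in
drm_sched_entity_pop_job() then becomes: keep the entity in the run
queue's rb-tree only while it still has something queued.

	struct drm_sched_job *next;
	struct drm_sched_rq *rq;

	spin_lock(&entity->lock);
	rq = entity->rq;
	spin_lock(&rq->lock);

	next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
	if (next)
		drm_sched_rq_update_fifo_locked(entity, rq, next->submit_ts);
	else
		drm_sched_rq_remove_fifo_locked(entity, rq); /* idle, leave the tree */

	spin_unlock(&rq->lock);
	spin_unlock(&entity->lock);

Note that both locks are now taken unconditionally, since the removal
path needs the same protection as the re-insertion path.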
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 15 ++++++++------- drivers/gpu/drm/scheduler/sched_main.c | 4 ++-- include/drm/gpu_scheduler.h | 2 ++ 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 69bcf0e99d57..8e910586979e 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -512,19 +512,20 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) */ if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) { struct drm_sched_job *next; + struct drm_sched_rq *rq; + spin_lock(&entity->lock); + rq = entity->rq; + spin_lock(&rq->lock); next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); if (next) { - struct drm_sched_rq *rq; - - spin_lock(&entity->lock); - rq = entity->rq; - spin_lock(&rq->lock); drm_sched_rq_update_fifo_locked(entity, rq, next->submit_ts); - spin_unlock(&rq->lock); - spin_unlock(&entity->lock); + } else { + drm_sched_rq_remove_fifo_locked(entity, rq); } + spin_unlock(&rq->lock); + spin_unlock(&entity->lock); } /* Jobs and entities might have different lifecycles. Since we're diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 1734c17aeea5..9beb4c611988 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -146,8 +146,8 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting); } -static void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq) +void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq) { if (!RB_EMPTY_NODE(&entity->rb_tree_node)) { rb_erase_cached(&entity->rb_tree_node, &rq->rb_tree_root); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index e2e6af8849c6..978ca621cc13 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -591,6 +591,8 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq, void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity); +void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq); void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, struct drm_sched_rq *rq, ktime_t ts); From patchwork Mon Dec 30 16:52:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A7CAAE77188 for ; Mon, 30 Dec 2024 16:53:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D34AF10E52B; Mon, 30 Dec 2024 16:53:13 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="aI0Fs+ze"; dkim-atps=neutral Received: from fanzine2.igalia.com 
(fanzine.igalia.com); Mon, 30 Dec 2024 16:53:07 +0000 (UTC)
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 03/14] drm/sched: Implement RR via FIFO
Date: Mon, 30 Dec 2024 16:52:48 +0000
Message-ID: <20241230165259.95855-4-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

Round-robin is not the default scheduling policy and it is unclear how
much it is actually used. It can, however, be implemented on top of the
FIFO data structures if we invent a fake submit timestamp which
monotonically increases inside each drm_sched_rq instance. Then, instead
of remembering which entity the scheduler worker picked last, we simply
bump the picked one to the back of the tree and achieve the same
round-robin behaviour.

The advantage is that we consolidate to a single code path and remove a
bunch of code. The downside is that round-robin mode now needs to take a
lock on the job pop path, but that should not be noticeable.
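The fake timestamp is just a per-run-queue counter handed out under the
rq lock; the sketch below mirrors the helper added in this patch.

	static ktime_t
	drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq)
	{
		lockdep_assert_held(&rq->lock);

		rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1);

		return rq->rr_deadline;
	}

Every time an entity is queued or re-queued in round-robin mode it is
keyed with the next value, so it sorts behind everything already in the
rb-tree and the existing FIFO pop logic walks the entities round-robin.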
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 53 ++++++++----- drivers/gpu/drm/scheduler/sched_main.c | 99 +++--------------------- include/drm/gpu_scheduler.h | 5 +- 3 files changed, 48 insertions(+), 109 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 8e910586979e..cb5f596b48b7 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -473,9 +473,20 @@ drm_sched_job_dependency(struct drm_sched_job *job, return NULL; } +static ktime_t +drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq) +{ + lockdep_assert_held(&rq->lock); + + rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1); + + return rq->rr_deadline; +} + struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) { - struct drm_sched_job *sched_job; + struct drm_sched_job *sched_job, *next_job; + struct drm_sched_rq *rq; sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); if (!sched_job) @@ -510,24 +521,28 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) * Update the entity's location in the min heap according to * the timestamp of the next job, if any. */ - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) { - struct drm_sched_job *next; - struct drm_sched_rq *rq; + next_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); - spin_lock(&entity->lock); - rq = entity->rq; - spin_lock(&rq->lock); - next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); - if (next) { - drm_sched_rq_update_fifo_locked(entity, rq, - next->submit_ts); - } else { - drm_sched_rq_remove_fifo_locked(entity, rq); - } - spin_unlock(&rq->lock); - spin_unlock(&entity->lock); + spin_lock(&entity->lock); + rq = entity->rq; + spin_lock(&rq->lock); + + if (next_job) { + ktime_t ts; + + if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) + ts = next_job->submit_ts; + else + ts = drm_sched_rq_get_rr_deadline(rq); + + drm_sched_rq_update_fifo_locked(entity, rq, ts); + } else { + drm_sched_rq_remove_fifo_locked(entity, rq); } + spin_unlock(&rq->lock); + spin_unlock(&entity->lock); + /* Jobs and entities might have different lifecycles. Since we're * removing the job from the entities queue, set the jobs entity pointer * to NULL to prevent any future access of the entity through this job. 
@@ -624,9 +639,9 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) spin_lock(&rq->lock); drm_sched_rq_add_entity(rq, entity); - - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) - drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); + if (drm_sched_policy == DRM_SCHED_POLICY_RR) + submit_ts = drm_sched_rq_get_rr_deadline(rq); + drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); spin_unlock(&rq->lock); spin_unlock(&entity->lock); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 9beb4c611988..eb22b1b7de36 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -189,7 +189,6 @@ static void drm_sched_rq_init(struct drm_gpu_scheduler *sched, spin_lock_init(&rq->lock); INIT_LIST_HEAD(&rq->entities); rq->rb_tree_root = RB_ROOT_CACHED; - rq->current_entity = NULL; rq->sched = sched; } @@ -235,82 +234,13 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, atomic_dec(rq->sched->score); list_del_init(&entity->list); - if (rq->current_entity == entity) - rq->current_entity = NULL; - - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) - drm_sched_rq_remove_fifo_locked(entity, rq); + drm_sched_rq_remove_fifo_locked(entity, rq); spin_unlock(&rq->lock); } /** - * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run - * - * @sched: the gpu scheduler - * @rq: scheduler run queue to check. - * - * Try to find the next ready entity. - * - * Return an entity if one is found; return an error-pointer (!NULL) if an - * entity was ready, but the scheduler had insufficient credits to accommodate - * its job; return NULL, if no ready entity was found. - */ -static struct drm_sched_entity * -drm_sched_rq_select_entity_rr(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq) -{ - struct drm_sched_entity *entity; - - spin_lock(&rq->lock); - - entity = rq->current_entity; - if (entity) { - list_for_each_entry_continue(entity, &rq->entities, list) { - if (drm_sched_entity_is_ready(entity)) { - /* If we can't queue yet, preserve the current - * entity in terms of fairness. - */ - if (!drm_sched_can_queue(sched, entity)) { - spin_unlock(&rq->lock); - return ERR_PTR(-ENOSPC); - } - - rq->current_entity = entity; - reinit_completion(&entity->entity_idle); - spin_unlock(&rq->lock); - return entity; - } - } - } - - list_for_each_entry(entity, &rq->entities, list) { - if (drm_sched_entity_is_ready(entity)) { - /* If we can't queue yet, preserve the current entity in - * terms of fairness. - */ - if (!drm_sched_can_queue(sched, entity)) { - spin_unlock(&rq->lock); - return ERR_PTR(-ENOSPC); - } - - rq->current_entity = entity; - reinit_completion(&entity->entity_idle); - spin_unlock(&rq->lock); - return entity; - } - - if (entity == rq->current_entity) - break; - } - - spin_unlock(&rq->lock); - - return NULL; -} - -/** - * drm_sched_rq_select_entity_fifo - Select an entity which provides a job to run + * drm_sched_rq_select_entity - Select an entity which provides a job to run * * @sched: the gpu scheduler * @rq: scheduler run queue to check. @@ -322,32 +252,29 @@ drm_sched_rq_select_entity_rr(struct drm_gpu_scheduler *sched, * its job; return NULL, if no ready entity was found. 
*/ static struct drm_sched_entity * -drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq) +drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, + struct drm_sched_rq *rq) { + struct drm_sched_entity *entity = NULL; struct rb_node *rb; spin_lock(&rq->lock); for (rb = rb_first_cached(&rq->rb_tree_root); rb; rb = rb_next(rb)) { - struct drm_sched_entity *entity; - entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node); if (drm_sched_entity_is_ready(entity)) { - /* If we can't queue yet, preserve the current entity in - * terms of fairness. - */ if (!drm_sched_can_queue(sched, entity)) { - spin_unlock(&rq->lock); - return ERR_PTR(-ENOSPC); + entity = ERR_PTR(-ENOSPC); + break; } reinit_completion(&entity->entity_idle); break; } + entity = NULL; } spin_unlock(&rq->lock); - return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL; + return entity; } /** @@ -1045,20 +972,18 @@ void drm_sched_wakeup(struct drm_gpu_scheduler *sched) static struct drm_sched_entity * drm_sched_select_entity(struct drm_gpu_scheduler *sched) { - struct drm_sched_entity *entity; + struct drm_sched_entity *entity = NULL; int i; /* Start with the highest priority. */ for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { - entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ? - drm_sched_rq_select_entity_fifo(sched, sched->sched_rq[i]) : - drm_sched_rq_select_entity_rr(sched, sched->sched_rq[i]); + entity = drm_sched_rq_select_entity(sched, sched->sched_rq[i]); if (entity) break; } - return IS_ERR(entity) ? NULL : entity; + return IS_ERR(entity) ? NULL : entity; } /** diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 978ca621cc13..db65600732b9 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -245,8 +245,7 @@ struct drm_sched_entity { * struct drm_sched_rq - queue of entities to be scheduled. * * @sched: the scheduler to which this rq belongs to. - * @lock: protects @entities, @rb_tree_root and @current_entity. - * @current_entity: the entity which is to be scheduled. + * @lock: protects @entities, @rb_tree_root and @rr_deadline. * @entities: list of the entities to be scheduled.
* @rb_tree_root: root of time based priority queue of entities for FIFO scheduling * @@ -259,7 +258,7 @@ struct drm_sched_rq { spinlock_t lock; /* Following members are protected by the @lock: */ - struct drm_sched_entity *current_entity; + ktime_t rr_deadline; struct list_head entities; struct rb_root_cached rb_tree_root; };

From patchwork Mon Dec 30 16:52:49 2024
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 13923355
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 04/14] drm/sched: Consolidate entity run queue management
Date: Mon, 30 Dec 2024 16:52:49 +0000
Message-ID: <20241230165259.95855-5-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

Move the code dealing with entities entering and exiting run queues into
helpers, to logically separate it from the code dealing with jobs
entering and exiting entities.
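After this change the entity-side code reduces to calls into the new rq
helpers. Roughly, condensed from the hunks below, the push path becomes:

	sched = drm_sched_rq_add_entity(entity->rq, entity, submit_ts);
	if (sched)
		drm_sched_wakeup(sched);

and the pop path becomes a single call:

	drm_sched_rq_pop_entity(entity->rq, entity);

with the entity/rq locking, the list and rb-tree manipulation, and the
RR versus FIFO timestamp selection now living behind those helpers.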
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 66 ++-------------- drivers/gpu/drm/scheduler/sched_main.c | 98 +++++++++++++++++++----- include/drm/gpu_scheduler.h | 12 +-- 3 files changed, 90 insertions(+), 86 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index cb5f596b48b7..b93da068585e 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -473,20 +473,9 @@ drm_sched_job_dependency(struct drm_sched_job *job, return NULL; } -static ktime_t -drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq) -{ - lockdep_assert_held(&rq->lock); - - rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1); - - return rq->rr_deadline; -} - struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) { - struct drm_sched_job *sched_job, *next_job; - struct drm_sched_rq *rq; + struct drm_sched_job *sched_job; sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); if (!sched_job) @@ -516,32 +505,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) smp_wmb(); spsc_queue_pop(&entity->job_queue); - - /* - * Update the entity's location in the min heap according to - * the timestamp of the next job, if any. - */ - next_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); - - spin_lock(&entity->lock); - rq = entity->rq; - spin_lock(&rq->lock); - - if (next_job) { - ktime_t ts; - - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) - ts = next_job->submit_ts; - else - ts = drm_sched_rq_get_rr_deadline(rq); - - drm_sched_rq_update_fifo_locked(entity, rq, ts); - } else { - drm_sched_rq_remove_fifo_locked(entity, rq); - } - - spin_unlock(&rq->lock); - spin_unlock(&entity->lock); + drm_sched_rq_pop_entity(entity->rq, entity); /* Jobs and entities might have different lifecycles. 
Since we're * removing the job from the entities queue, set the jobs entity pointer @@ -623,30 +587,10 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) /* first job wakes up scheduler */ if (first) { struct drm_gpu_scheduler *sched; - struct drm_sched_rq *rq; - /* Add the entity to the run queue */ - spin_lock(&entity->lock); - if (entity->stopped) { - spin_unlock(&entity->lock); - - DRM_ERROR("Trying to push to a killed entity\n"); - return; - } - - rq = entity->rq; - sched = rq->sched; - - spin_lock(&rq->lock); - drm_sched_rq_add_entity(rq, entity); - if (drm_sched_policy == DRM_SCHED_POLICY_RR) - submit_ts = drm_sched_rq_get_rr_deadline(rq); - drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); - - spin_unlock(&rq->lock); - spin_unlock(&entity->lock); - - drm_sched_wakeup(sched); + sched = drm_sched_rq_add_entity(entity->rq, entity, submit_ts); + if (sched) + drm_sched_wakeup(sched); } } EXPORT_SYMBOL(drm_sched_entity_push_job); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index eb22b1b7de36..52c1a71d48e1 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -146,18 +146,19 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting); } -void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq) +static void __drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq) { - if (!RB_EMPTY_NODE(&entity->rb_tree_node)) { - rb_erase_cached(&entity->rb_tree_node, &rq->rb_tree_root); - RB_CLEAR_NODE(&entity->rb_tree_node); - } + lockdep_assert_held(&entity->lock); + lockdep_assert_held(&rq->lock); + + rb_erase_cached(&entity->rb_tree_node, &rq->rb_tree_root); + RB_CLEAR_NODE(&entity->rb_tree_node); } -void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq, - ktime_t ts) +static void __drm_sched_rq_add_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq, + ktime_t ts) { /* * Both locks need to be grabbed, one to protect from entity->rq change @@ -167,8 +168,6 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, lockdep_assert_held(&entity->lock); lockdep_assert_held(&rq->lock); - drm_sched_rq_remove_fifo_locked(entity, rq); - entity->oldest_job_waiting = ts; rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, @@ -192,6 +191,16 @@ static void drm_sched_rq_init(struct drm_gpu_scheduler *sched, rq->sched = sched; } +static ktime_t +drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq) +{ + lockdep_assert_held(&rq->lock); + + rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1); + + return rq->rr_deadline; +} + /** * drm_sched_rq_add_entity - add an entity * @@ -199,18 +208,41 @@ static void drm_sched_rq_init(struct drm_gpu_scheduler *sched, * @entity: scheduler entity * * Adds a scheduler entity to the run queue. + * + * Returns a DRM scheduler pre-selected to handle this entity. 
*/ -void drm_sched_rq_add_entity(struct drm_sched_rq *rq, - struct drm_sched_entity *entity) +struct drm_gpu_scheduler * +drm_sched_rq_add_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity, + ktime_t ts) { - lockdep_assert_held(&entity->lock); - lockdep_assert_held(&rq->lock); + struct drm_gpu_scheduler *sched; + + if (entity->stopped) { + DRM_ERROR("Trying to push to a killed entity\n"); + return NULL; + } + + spin_lock(&entity->lock); + spin_lock(&rq->lock); + + sched = rq->sched; + atomic_inc(sched->score); if (!list_empty(&entity->list)) - return; + list_add_tail(&entity->list, &rq->entities); - atomic_inc(rq->sched->score); - list_add_tail(&entity->list, &rq->entities); + if (drm_sched_policy == DRM_SCHED_POLICY_RR) + ts = drm_sched_rq_get_rr_deadline(rq); + + if (!RB_EMPTY_NODE(&entity->rb_tree_node)) + __drm_sched_rq_remove_fifo_locked(entity, rq); + __drm_sched_rq_add_fifo_locked(entity, rq, ts); + + spin_unlock(&rq->lock); + spin_unlock(&entity->lock); + + return sched; } /** @@ -234,11 +266,39 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, atomic_dec(rq->sched->score); list_del_init(&entity->list); - drm_sched_rq_remove_fifo_locked(entity, rq); + if (!RB_EMPTY_NODE(&entity->rb_tree_node)) + __drm_sched_rq_remove_fifo_locked(entity, rq); spin_unlock(&rq->lock); } +void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity) +{ + struct drm_sched_job *next_job; + + next_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); + + spin_lock(&entity->lock); + spin_lock(&rq->lock); + + __drm_sched_rq_remove_fifo_locked(entity, rq); + + if (next_job) { + ktime_t ts; + + if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) + ts = next_job->submit_ts; + else + ts = drm_sched_rq_get_rr_deadline(rq); + + __drm_sched_rq_add_fifo_locked(entity, rq, ts); + } + + spin_unlock(&rq->lock); + spin_unlock(&entity->lock); +} + /** * drm_sched_rq_select_entity - Select an entity which provides a job to run * diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index db65600732b9..23d5b1b0b048 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -585,15 +585,15 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence, struct drm_sched_entity *entity); void drm_sched_fault(struct drm_gpu_scheduler *sched); -void drm_sched_rq_add_entity(struct drm_sched_rq *rq, - struct drm_sched_entity *entity); +struct drm_gpu_scheduler * +drm_sched_rq_add_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity, + ktime_t ts); void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity); -void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq); -void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq, ktime_t ts); +void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity); int drm_sched_entity_init(struct drm_sched_entity *entity, enum drm_sched_priority priority, From patchwork Mon Dec 30 16:52:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923358 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
smtp.lore.kernel.org; Mon, 30 Dec 2024 16:53:17 +0000 (UTC)
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 05/14] drm/sched: Move run queue related code into a separate file
Date: Mon, 30 Dec 2024 16:52:50 +0000
Message-ID: <20241230165259.95855-6-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

Let's move all the code dealing with struct drm_sched_rq into a separate
compilation unit. The advantage is that sched_main.c is left with a
clearer set of responsibilities.

Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/Makefile | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 207 +------------------------ drivers/gpu/drm/scheduler/sched_rq.c | 203 ++++++++++++++++++++++++ include/drm/gpu_scheduler.h | 13 ++ 4 files changed, 219 insertions(+), 206 deletions(-) create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c diff --git a/drivers/gpu/drm/scheduler/Makefile b/drivers/gpu/drm/scheduler/Makefile index 53863621829f..d11d83e285e7 100644 --- a/drivers/gpu/drm/scheduler/Makefile +++ b/drivers/gpu/drm/scheduler/Makefile @@ -20,6 +20,6 @@ # OTHER DEALINGS IN THE SOFTWARE.
# # -gpu-sched-y := sched_main.o sched_fence.o sched_entity.o +gpu-sched-y := sched_main.o sched_fence.o sched_entity.o sched_rq.o obj-$(CONFIG_DRM_SCHED) += gpu-sched.o diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 52c1a71d48e1..5c92784bb533 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -87,9 +87,6 @@ static struct lockdep_map drm_sched_lockdep_map = { }; #endif -#define to_drm_sched_job(sched_job) \ - container_of((sched_job), struct drm_sched_job, queue_node) - int drm_sched_policy = DRM_SCHED_POLICY_FIFO; /** @@ -118,8 +115,8 @@ static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched) * Return true if we can push at least one more job from @entity, false * otherwise. */ -static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, - struct drm_sched_entity *entity) +bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, + struct drm_sched_entity *entity) { struct drm_sched_job *s_job; @@ -137,206 +134,6 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, return drm_sched_available_credits(sched) >= s_job->credits; } -static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, - const struct rb_node *b) -{ - struct drm_sched_entity *ent_a = rb_entry((a), struct drm_sched_entity, rb_tree_node); - struct drm_sched_entity *ent_b = rb_entry((b), struct drm_sched_entity, rb_tree_node); - - return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting); -} - -static void __drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq) -{ - lockdep_assert_held(&entity->lock); - lockdep_assert_held(&rq->lock); - - rb_erase_cached(&entity->rb_tree_node, &rq->rb_tree_root); - RB_CLEAR_NODE(&entity->rb_tree_node); -} - -static void __drm_sched_rq_add_fifo_locked(struct drm_sched_entity *entity, - struct drm_sched_rq *rq, - ktime_t ts) -{ - /* - * Both locks need to be grabbed, one to protect from entity->rq change - * for entity from within concurrent drm_sched_entity_select_rq and the - * other to update the rb tree structure. - */ - lockdep_assert_held(&entity->lock); - lockdep_assert_held(&rq->lock); - - entity->oldest_job_waiting = ts; - - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, - drm_sched_entity_compare_before); -} - -/** - * drm_sched_rq_init - initialize a given run queue struct - * - * @sched: scheduler instance to associate with this run queue - * @rq: scheduler run queue - * - * Initializes a scheduler runqueue. - */ -static void drm_sched_rq_init(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq) -{ - spin_lock_init(&rq->lock); - INIT_LIST_HEAD(&rq->entities); - rq->rb_tree_root = RB_ROOT_CACHED; - rq->sched = sched; -} - -static ktime_t -drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq) -{ - lockdep_assert_held(&rq->lock); - - rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1); - - return rq->rr_deadline; -} - -/** - * drm_sched_rq_add_entity - add an entity - * - * @rq: scheduler run queue - * @entity: scheduler entity - * - * Adds a scheduler entity to the run queue. - * - * Returns a DRM scheduler pre-selected to handle this entity. 
- */ -struct drm_gpu_scheduler * -drm_sched_rq_add_entity(struct drm_sched_rq *rq, - struct drm_sched_entity *entity, - ktime_t ts) -{ - struct drm_gpu_scheduler *sched; - - if (entity->stopped) { - DRM_ERROR("Trying to push to a killed entity\n"); - return NULL; - } - - spin_lock(&entity->lock); - spin_lock(&rq->lock); - - sched = rq->sched; - atomic_inc(sched->score); - - if (!list_empty(&entity->list)) - list_add_tail(&entity->list, &rq->entities); - - if (drm_sched_policy == DRM_SCHED_POLICY_RR) - ts = drm_sched_rq_get_rr_deadline(rq); - - if (!RB_EMPTY_NODE(&entity->rb_tree_node)) - __drm_sched_rq_remove_fifo_locked(entity, rq); - __drm_sched_rq_add_fifo_locked(entity, rq, ts); - - spin_unlock(&rq->lock); - spin_unlock(&entity->lock); - - return sched; -} - -/** - * drm_sched_rq_remove_entity - remove an entity - * - * @rq: scheduler run queue - * @entity: scheduler entity - * - * Removes a scheduler entity from the run queue. - */ -void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, - struct drm_sched_entity *entity) -{ - lockdep_assert_held(&entity->lock); - - if (list_empty(&entity->list)) - return; - - spin_lock(&rq->lock); - - atomic_dec(rq->sched->score); - list_del_init(&entity->list); - - if (!RB_EMPTY_NODE(&entity->rb_tree_node)) - __drm_sched_rq_remove_fifo_locked(entity, rq); - - spin_unlock(&rq->lock); -} - -void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, - struct drm_sched_entity *entity) -{ - struct drm_sched_job *next_job; - - next_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); - - spin_lock(&entity->lock); - spin_lock(&rq->lock); - - __drm_sched_rq_remove_fifo_locked(entity, rq); - - if (next_job) { - ktime_t ts; - - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) - ts = next_job->submit_ts; - else - ts = drm_sched_rq_get_rr_deadline(rq); - - __drm_sched_rq_add_fifo_locked(entity, rq, ts); - } - - spin_unlock(&rq->lock); - spin_unlock(&entity->lock); -} - -/** - * drm_sched_rq_select_entity - Select an entity which provides a job to run - * - * @sched: the gpu scheduler - * @rq: scheduler run queue to check. - * - * Find oldest waiting ready entity. - * - * Return an entity if one is found; return an error-pointer (!NULL) if an - * entity was ready, but the scheduler had insufficient credits to accommodate - * its job; return NULL, if no ready entity was found. 
- */ -static struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq) -{ - struct drm_sched_entity *entity = NULL; - struct rb_node *rb; - - spin_lock(&rq->lock); - for (rb = rb_first_cached(&rq->rb_tree_root); rb; rb = rb_next(rb)) { - entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node); - if (drm_sched_entity_is_ready(entity)) { - if (!drm_sched_can_queue(sched, entity)) { - entity = ERR_PTR(-ENOSPC); - break; - } - - reinit_completion(&entity->entity_idle); - break; - } - entity = NULL; - } - spin_unlock(&rq->lock); - - return entity; -} - /** * drm_sched_run_job_queue - enqueue run-job work * @sched: scheduler instance diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c new file mode 100644 index 000000000000..5b31e5434d12 --- /dev/null +++ b/drivers/gpu/drm/scheduler/sched_rq.c @@ -0,0 +1,203 @@ +#include + +#include +#include + +static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, + const struct rb_node *b) +{ + struct drm_sched_entity *ent_a = rb_entry((a), struct drm_sched_entity, rb_tree_node); + struct drm_sched_entity *ent_b = rb_entry((b), struct drm_sched_entity, rb_tree_node); + + return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting); +} + +static void __drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq) +{ + lockdep_assert_held(&entity->lock); + lockdep_assert_held(&rq->lock); + + rb_erase_cached(&entity->rb_tree_node, &rq->rb_tree_root); + RB_CLEAR_NODE(&entity->rb_tree_node); +} + +static void __drm_sched_rq_add_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq, + ktime_t ts) +{ + /* + * Both locks need to be grabbed, one to protect from entity->rq change + * for entity from within concurrent drm_sched_entity_select_rq and the + * other to update the rb tree structure. + */ + lockdep_assert_held(&entity->lock); + lockdep_assert_held(&rq->lock); + + entity->oldest_job_waiting = ts; + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, + drm_sched_entity_compare_before); +} + +/** + * drm_sched_rq_init - initialize a given run queue struct + * + * @sched: scheduler instance to associate with this run queue + * @rq: scheduler run queue + * + * Initializes a scheduler runqueue. + */ +void drm_sched_rq_init(struct drm_gpu_scheduler *sched, + struct drm_sched_rq *rq) +{ + spin_lock_init(&rq->lock); + INIT_LIST_HEAD(&rq->entities); + rq->rb_tree_root = RB_ROOT_CACHED; + rq->sched = sched; +} + +static ktime_t +drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq) +{ + lockdep_assert_held(&rq->lock); + + rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1); + + return rq->rr_deadline; +} + +/** + * drm_sched_rq_add_entity - add an entity + * + * @rq: scheduler run queue + * @entity: scheduler entity + * + * Adds a scheduler entity to the run queue. + * + * Returns a DRM scheduler pre-selected to handle this entity. 
+ */ +struct drm_gpu_scheduler * +drm_sched_rq_add_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity, + ktime_t ts) +{ + struct drm_gpu_scheduler *sched; + + if (entity->stopped) { + DRM_ERROR("Trying to push to a killed entity\n"); + return NULL; + } + + spin_lock(&entity->lock); + spin_lock(&rq->lock); + + sched = rq->sched; + atomic_inc(sched->score); + + if (!list_empty(&entity->list)) + list_add_tail(&entity->list, &rq->entities); + + if (drm_sched_policy == DRM_SCHED_POLICY_RR) + ts = drm_sched_rq_get_rr_deadline(rq); + + if (!RB_EMPTY_NODE(&entity->rb_tree_node)) + __drm_sched_rq_remove_fifo_locked(entity, rq); + __drm_sched_rq_add_fifo_locked(entity, rq, ts); + + spin_unlock(&rq->lock); + spin_unlock(&entity->lock); + + return sched; +} + +/** + * drm_sched_rq_remove_entity - remove an entity + * + * @rq: scheduler run queue + * @entity: scheduler entity + * + * Removes a scheduler entity from the run queue. + */ +void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity) +{ + lockdep_assert_held(&entity->lock); + + if (list_empty(&entity->list)) + return; + + spin_lock(&rq->lock); + + atomic_dec(rq->sched->score); + list_del_init(&entity->list); + + if (!RB_EMPTY_NODE(&entity->rb_tree_node)) + __drm_sched_rq_remove_fifo_locked(entity, rq); + + spin_unlock(&rq->lock); +} + +void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, + struct drm_sched_entity *entity) +{ + struct drm_sched_job *next_job; + + next_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); + + spin_lock(&entity->lock); + spin_lock(&rq->lock); + + __drm_sched_rq_remove_fifo_locked(entity, rq); + + if (next_job) { + ktime_t ts; + + if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) + ts = next_job->submit_ts; + else + ts = drm_sched_rq_get_rr_deadline(rq); + + __drm_sched_rq_add_fifo_locked(entity, rq, ts); + } + + spin_unlock(&rq->lock); + spin_unlock(&entity->lock); +} + +/** + * drm_sched_rq_select_entity - Select an entity which provides a job to run + * + * @sched: the gpu scheduler + * @rq: scheduler run queue to check. + * + * Find oldest waiting ready entity. + * + * Return an entity if one is found; return an error-pointer (!NULL) if an + * entity was ready, but the scheduler had insufficient credits to accommodate + * its job; return NULL, if no ready entity was found. 
+ */ +struct drm_sched_entity * +drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, + struct drm_sched_rq *rq) +{ + struct drm_sched_entity *entity = NULL; + struct rb_node *rb; + + spin_lock(&rq->lock); + for (rb = rb_first_cached(&rq->rb_tree_root); rb; rb = rb_next(rb)) { + entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node); + if (drm_sched_entity_is_ready(entity)) { + if (!drm_sched_can_queue(sched, entity)) { + entity = ERR_PTR(-ENOSPC); + break; + } + + reinit_completion(&entity->entity_idle); + break; + } + entity = NULL; + } + spin_unlock(&rq->lock); + + return entity; +} diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 23d5b1b0b048..45879a755f34 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -386,6 +386,9 @@ struct drm_sched_job { ktime_t submit_ts; }; +#define to_drm_sched_job(sched_job) \ + container_of((sched_job), struct drm_sched_job, queue_node) + static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job, int threshold) { @@ -547,6 +550,10 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, atomic_t *score, const char *name, struct device *dev); void drm_sched_fini(struct drm_gpu_scheduler *sched); + +bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, + struct drm_sched_entity *entity); + int drm_sched_job_init(struct drm_sched_job *job, struct drm_sched_entity *entity, u32 credits, void *owner); @@ -585,6 +592,9 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence, struct drm_sched_entity *entity); void drm_sched_fault(struct drm_gpu_scheduler *sched); +void drm_sched_rq_init(struct drm_gpu_scheduler *sched, + struct drm_sched_rq *rq); + struct drm_gpu_scheduler * drm_sched_rq_add_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity, @@ -594,6 +604,9 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity); +struct drm_sched_entity * +drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, + struct drm_sched_rq *rq); int drm_sched_entity_init(struct drm_sched_entity *entity, enum drm_sched_priority priority, From patchwork Mon Dec 30 16:52:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0C851E7718F for ; Mon, 30 Dec 2024 16:53:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DD5C010E534; Mon, 30 Dec 2024 16:53:19 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="fnmAAVIA"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id D480110E527 for ; Mon, 30 Dec 2024 16:53:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: 
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 06/14] drm/sched: Ignore own fence earlier
Date: Mon, 30 Dec 2024 16:52:51 +0000
Message-ID: <20241230165259.95855-7-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

If a job depends on another job from the same context it is naturally
ordered by the submission queue, so we can ignore such fences before they
are ever added to the dependency tracking xarray.
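Concretely, drm_sched_job_add_dependency() can then drop such fences up
front; the check below is condensed from the hunk that follows. An
entity's fence_context and fence_context + 1 are the contexts of the
scheduled and finished fences of its own jobs, so a match means the
dependency is one of our own and submission order already covers it.

	if (fence->context == entity->fence_context ||
	    fence->context == entity->fence_context + 1) {
		/* Scheduled/finished fence from a job on the same entity,
		 * ordering is guaranteed by the submission queue. */
		dma_fence_put(fence);
		return 0;
	}

The same test used to live in drm_sched_entity_add_dependency_cb() and is
simply being moved to the earlier, cheaper spot.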
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 11 ----------- drivers/gpu/drm/scheduler/sched_main.c | 12 ++++++++++++ 2 files changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index b93da068585e..2c342c7b9712 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -412,17 +412,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity) struct dma_fence *fence = entity->dependency; struct drm_sched_fence *s_fence; - if (fence->context == entity->fence_context || - fence->context == entity->fence_context + 1) { - /* - * Fence is a scheduled/finished fence from a job - * which belongs to the same entity, we can ignore - * fences from ourself - */ - dma_fence_put(entity->dependency); - return false; - } - s_fence = to_drm_sched_fence(fence); if (!fence->error && s_fence && s_fence->sched == sched && !test_bit(DRM_SCHED_FENCE_DONT_PIPELINE, &fence->flags)) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 5c92784bb533..34ed22c6482e 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -648,6 +648,7 @@ EXPORT_SYMBOL(drm_sched_job_arm); int drm_sched_job_add_dependency(struct drm_sched_job *job, struct dma_fence *fence) { + struct drm_sched_entity *entity = job->entity; struct dma_fence *entry; unsigned long index; u32 id = 0; @@ -656,6 +657,17 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job, if (!fence) return 0; + if (fence->context == entity->fence_context || + fence->context == entity->fence_context + 1) { + /* + * Fence is a scheduled/finished fence from a job + * which belongs to the same entity, we can ignore + * fences from ourself + */ + dma_fence_put(fence); + return 0; + } + /* Deduplicate if we already depend on a fence from the same context. * This lets the size of the array of deps scale with the number of * engines involved, rather than the number of BOs. 
From patchwork Mon Dec 30 16:52:52 2024
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 13923356
From: Tvrtko Ursulin
To: dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Tvrtko Ursulin, Christian König, Danilo Krummrich, Matthew Brost, Philipp Stanner
Subject: [RFC 07/14] drm/sched: Resolve same scheduler dependencies earlier
Date: Mon, 30 Dec 2024 16:52:52 +0000
Message-ID: <20241230165259.95855-8-tursulin@igalia.com>
In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com>
References: <20241230165259.95855-1-tursulin@igalia.com>

From: Tvrtko Ursulin

When a job's dependency is on the same scheduler we pipeline the two
directly to the backend by replacing the dependency with the scheduled
fence instead of the finished fence, and let the backend handle the
ordering. Instead of doing this fence substitution at the time of popping
the job (when the entity is selected by the worker), we can do it
earlier, when preparing the job for submission.
We add a new helper drm_sched_job_prepare_dependecies() which is run before pushing the job, but inside the arm+push block, guaranteeing that the final scheduler instance has been assigned and is fixed. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 72 +++++++----------------- drivers/gpu/drm/scheduler/sched_main.c | 23 ++++++++ include/drm/gpu_scheduler.h | 2 +- 3 files changed, 45 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 2c342c7b9712..608bc43ff256 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -360,31 +360,6 @@ void drm_sched_entity_destroy(struct drm_sched_entity *entity) } EXPORT_SYMBOL(drm_sched_entity_destroy); -/* drm_sched_entity_clear_dep - callback to clear the entities dependency */ -static void drm_sched_entity_clear_dep(struct dma_fence *f, - struct dma_fence_cb *cb) -{ - struct drm_sched_entity *entity = - container_of(cb, struct drm_sched_entity, cb); - - entity->dependency = NULL; - dma_fence_put(f); -} - -/* - * drm_sched_entity_wakeup - callback to clear the entity's dependency and - * wake up the scheduler - */ -static void drm_sched_entity_wakeup(struct dma_fence *f, - struct dma_fence_cb *cb) -{ - struct drm_sched_entity *entity = - container_of(cb, struct drm_sched_entity, cb); - - drm_sched_entity_clear_dep(f, cb); - drm_sched_wakeup(entity->rq->sched); -} - /** * drm_sched_entity_set_priority - Sets priority of the entity * @@ -402,41 +377,34 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity, } EXPORT_SYMBOL(drm_sched_entity_set_priority); +/* + * drm_sched_entity_wakeup - callback to clear the entity's dependency and + * wake up the scheduler + */ +static void drm_sched_entity_wakeup(struct dma_fence *f, + struct dma_fence_cb *cb) +{ + struct drm_sched_entity *entity = + container_of(cb, struct drm_sched_entity, cb); + + entity->dependency = NULL; + dma_fence_put(f); + drm_sched_wakeup(entity->rq->sched); +} + /* * Add a callback to the current dependency of the entity to wake up the * scheduler when the entity becomes available.
*/ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity) { - struct drm_gpu_scheduler *sched = entity->rq->sched; struct dma_fence *fence = entity->dependency; - struct drm_sched_fence *s_fence; - s_fence = to_drm_sched_fence(fence); - if (!fence->error && s_fence && s_fence->sched == sched && - !test_bit(DRM_SCHED_FENCE_DONT_PIPELINE, &fence->flags)) { - - /* - * Fence is from the same scheduler, only need to wait for - * it to be scheduled - */ - fence = dma_fence_get(&s_fence->scheduled); - dma_fence_put(entity->dependency); - entity->dependency = fence; - if (!dma_fence_add_callback(fence, &entity->cb, - drm_sched_entity_clear_dep)) - return true; - - /* Ignore it when it is already scheduled */ - dma_fence_put(fence); - return false; - } - - if (!dma_fence_add_callback(entity->dependency, &entity->cb, + if (!dma_fence_add_callback(fence, &entity->cb, drm_sched_entity_wakeup)) return true; - dma_fence_put(entity->dependency); + dma_fence_put(fence); return false; } @@ -558,19 +526,21 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) void drm_sched_entity_push_job(struct drm_sched_job *sched_job) { struct drm_sched_entity *entity = sched_job->entity; + ktime_t submit_ts = ktime_get(); bool first; - ktime_t submit_ts; trace_drm_sched_job(sched_job, entity); atomic_inc(entity->rq->sched->score); WRITE_ONCE(entity->last_user, current->group_leader); + drm_sched_job_prepare_dependecies(sched_job); + /* * After the sched_job is pushed into the entity queue, it may be * completed and freed up at any time. We can no longer access it. * Make sure to set the submit_ts first, to avoid a race. */ - sched_job->submit_ts = submit_ts = ktime_get(); + sched_job->submit_ts = submit_ts; first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); /* first job wakes up scheduler */ diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 34ed22c6482e..ba9b0274b185 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -780,6 +780,29 @@ int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job, } EXPORT_SYMBOL(drm_sched_job_add_implicit_dependencies); +void drm_sched_job_prepare_dependecies(struct drm_sched_job *job) +{ + struct drm_gpu_scheduler *sched = job->sched; + struct dma_fence *fence; + unsigned long index; + + xa_for_each(&job->dependencies, index, fence) { + struct drm_sched_fence *s_fence = to_drm_sched_fence(fence); + + if (fence->error || !s_fence || s_fence->sched != sched || + test_bit(DRM_SCHED_FENCE_DONT_PIPELINE, &fence->flags)) + continue; + + /* + * Fence is from the same scheduler, only need to wait for + * it to be scheduled. 
+ */ + xa_store(&job->dependencies, index, + dma_fence_get(&s_fence->scheduled), GFP_KERNEL); + dma_fence_put(fence); + } +} + /** * drm_sched_job_cleanup - clean up scheduler job resources * @job: scheduler job to clean up diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 45879a755f34..6fee85e45d45 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -570,7 +570,7 @@ int drm_sched_job_add_resv_dependencies(struct drm_sched_job *job, int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job, struct drm_gem_object *obj, bool write); - +void drm_sched_job_prepare_dependecies(struct drm_sched_job *job); void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, struct drm_gpu_scheduler **sched_list, From patchwork Mon Dec 30 16:52:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A24F9E77197 for ; Mon, 30 Dec 2024 16:53:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 679EC10E527; Mon, 30 Dec 2024 16:53:14 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="lG8u4C7b"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5141C10E527 for ; Mon, 30 Dec 2024 16:53:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=GoHRmVAqTIBKjrVNJ4uCm6ZnB3h8aIWBEMw80LDAx+U=; b=lG8u4C7bDftwkL+Gp/vNVo5uqm YXRuFE+yQMwtFP2PoIO5QHl1T2vaCtgT3/EigYUOzeUqOp2vLsFnE4BlZ6OCdMjDKCyhjE5tirEWF +VxUcO9RLqdDNr9oOeK7UJKJ2iGiQYcbUF+gQ+nOwSs2T52wipAifQOSMulVvD1/sZ3N5YGO2ELIa cV/tEu6Gt6CWRPZrAmKrBqvcDenWkDEhKI79ENvweEsrB67u8+nCWw4ST4hV2jX9LgSWPxhjbIQ80 b6CqJE5erogbFtPJIWmScxwUse9BXHy2Xz/+vvnLe2A5xUPUIRUmmFUMvpKu4SLF9evmcCix/XP8L NftVq9cQ==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tSJ0v-009Zw2-I6; Mon, 30 Dec 2024 17:53:09 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Danilo Krummrich , Matthew Brost , Philipp Stanner Subject: [RFC 08/14] drm/sched: Add deadline policy Date: Mon, 30 Dec 2024 16:52:53 +0000 Message-ID: <20241230165259.95855-9-tursulin@igalia.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> References: <20241230165259.95855-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct 
Rendering Infrastructure - Development From: Tvrtko Ursulin The deadline scheduling policy is intended as a fairer flavour of FIFO, with two main advantages: it can naturally connect with dma-fence deadlines, and it can do away with multiple run queues per scheduler. From the latter comes the fairness advantage. Whereas the current FIFO policy will always starve low priority entities behind normal ones, and normal behind high, and so on, deadline tracks all runnable entities in a single run queue and assigns them deadlines based on priority. Instead of being ordered strictly by priority, jobs and entities become ordered by deadlines. This means that a later higher priority submission can still overtake an earlier lower priority one, but eventually the lower priority work will get its turn even if high priority work is constantly being fed in. The current mapping of priority to deadline is somewhat arbitrary and looks like this (submit timestamp plus a constant offset in microseconds): static const unsigned int d_us[] = { [DRM_SCHED_PRIORITY_KERNEL] = 100, [DRM_SCHED_PRIORITY_HIGH] = 1000, [DRM_SCHED_PRIORITY_NORMAL] = 5000, [DRM_SCHED_PRIORITY_LOW] = 100000, }; Assuming simultaneous submission of one normal and one low priority job at time "t", they will get respective deadlines of t+5ms and t+100ms. Hence normal will run first and low will run after it, or at the latest 100ms after it was submitted in case other higher priority submissions overtake it in the meantime. Because the deadline policy does not need multiple run queues, removing the FIFO and RR policies later would allow for a significant simplification of the code base by reducing the scheduler to run queue relationship from 1:N to 1:1. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 53 +++++++++++++++++++----- drivers/gpu/drm/scheduler/sched_main.c | 14 ++++--- drivers/gpu/drm/scheduler/sched_rq.c | 5 ++- include/drm/gpu_scheduler.h | 10 ++++- 4 files changed, 64 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 608bc43ff256..6928ec19ec23 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -71,6 +71,8 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, entity->guilty = guilty; entity->num_sched_list = num_sched_list; entity->priority = priority; + entity->rq_priority = drm_sched_policy == DRM_SCHED_POLICY_DEADLINE ? + DRM_SCHED_PRIORITY_KERNEL : priority; /* * It's perfectly valid to initialize an entity without having a valid * scheduler attached. It's just not valid to use the scheduler before it @@ -87,17 +89,23 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, */ pr_warn("%s: called with uninitialized scheduler\n", __func__); } else if (num_sched_list) { - /* The "priority" of an entity cannot exceed the number of run-queues of a - * scheduler. Protect against num_rqs being 0, by converting to signed. Choose - * the lowest priority available. + enum drm_sched_priority p = entity->priority; + + /* + * The "priority" of an entity cannot exceed the number of + * run-queues of a scheduler. Protect against num_rqs being 0, + * by converting to signed. Choose the lowest priority + * available.
*/ - if (entity->priority >= sched_list[0]->num_rqs) { - drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n", - entity->priority, sched_list[0]->num_rqs); - entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1, - (s32) DRM_SCHED_PRIORITY_KERNEL); + if (p >= sched_list[0]->num_user_rqs) { + drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_user_rqs:%u\n", + p, sched_list[0]->num_user_rqs); + p = max_t(s32, + (s32)sched_list[0]->num_user_rqs - 1, + (s32)DRM_SCHED_PRIORITY_KERNEL); + entity->priority = p; } - entity->rq = sched_list[0]->sched_rq[entity->priority]; + entity->rq = sched_list[0]->sched_rq[entity->rq_priority]; } init_completion(&entity->entity_idle); @@ -377,6 +385,27 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity, } EXPORT_SYMBOL(drm_sched_entity_set_priority); +static ktime_t +__drm_sched_entity_get_job_deadline(struct drm_sched_entity *entity, + ktime_t submit_ts) +{ + static const unsigned int d_us[] = { + [DRM_SCHED_PRIORITY_KERNEL] = 100, + [DRM_SCHED_PRIORITY_HIGH] = 1000, + [DRM_SCHED_PRIORITY_NORMAL] = 5000, + [DRM_SCHED_PRIORITY_LOW] = 100000, + }; + + return ktime_add_us(submit_ts, d_us[entity->priority]); +} + +ktime_t +drm_sched_entity_get_job_deadline(struct drm_sched_entity *entity, + struct drm_sched_job *job) +{ + return __drm_sched_entity_get_job_deadline(entity, job->submit_ts); +} + /* * drm_sched_entity_wakeup - callback to clear the entity's dependency and * wake up the scheduler @@ -503,7 +532,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) spin_lock(&entity->lock); sched = drm_sched_pick_best(entity->sched_list, entity->num_sched_list); - rq = sched ? sched->sched_rq[entity->priority] : NULL; + rq = sched ? sched->sched_rq[entity->rq_priority] : NULL; if (rq != entity->rq) { drm_sched_rq_remove_entity(entity->rq, entity); entity->rq = rq; @@ -547,6 +576,10 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) if (first) { struct drm_gpu_scheduler *sched; + if (drm_sched_policy == DRM_SCHED_POLICY_DEADLINE) + submit_ts = __drm_sched_entity_get_job_deadline(entity, + submit_ts); + sched = drm_sched_rq_add_entity(entity->rq, entity, submit_ts); if (sched) drm_sched_wakeup(sched); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ba9b0274b185..433bef85eeaf 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -87,13 +87,13 @@ static struct lockdep_map drm_sched_lockdep_map = { }; #endif -int drm_sched_policy = DRM_SCHED_POLICY_FIFO; +int drm_sched_policy = DRM_SCHED_POLICY_DEADLINE; /** * DOC: sched_policy (int) * Used to override default entities scheduling policy in a run queue. 
*/ -MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO (default)."); +MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO, " __stringify(DRM_SCHED_POLICY_DEADLINE) " = Virtual deadline (default)."); module_param_named(sched_policy, drm_sched_policy, int, 0444); static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched) @@ -1109,11 +1109,15 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, sched->own_submit_wq = true; } - sched->sched_rq = kmalloc_array(num_rqs, sizeof(*sched->sched_rq), + sched->num_user_rqs = num_rqs; + sched->num_rqs = drm_sched_policy != DRM_SCHED_POLICY_DEADLINE ? + num_rqs : 1; + sched->sched_rq = kmalloc_array(sched->num_rqs, + sizeof(*sched->sched_rq), GFP_KERNEL | __GFP_ZERO); if (!sched->sched_rq) goto Out_check_own; - sched->num_rqs = num_rqs; + for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { sched->sched_rq[i] = kzalloc(sizeof(*sched->sched_rq[i]), GFP_KERNEL); if (!sched->sched_rq[i]) @@ -1227,7 +1231,7 @@ void drm_sched_increase_karma(struct drm_sched_job *bad) if (bad->s_priority != DRM_SCHED_PRIORITY_KERNEL) { atomic_inc(&bad->karma); - for (i = DRM_SCHED_PRIORITY_HIGH; i < sched->num_rqs; i++) { + for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { struct drm_sched_rq *rq = sched->sched_rq[i]; spin_lock(&rq->lock); diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c index 5b31e5434d12..a6bb21250350 100644 --- a/drivers/gpu/drm/scheduler/sched_rq.c +++ b/drivers/gpu/drm/scheduler/sched_rq.c @@ -152,7 +152,10 @@ void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, if (next_job) { ktime_t ts; - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) + if (drm_sched_policy == DRM_SCHED_POLICY_DEADLINE) + ts = drm_sched_entity_get_job_deadline(entity, + next_job); + else if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) ts = next_job->submit_ts; else ts = drm_sched_rq_get_rr_deadline(rq); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 6fee85e45d45..7532071fbea8 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -74,8 +74,9 @@ enum drm_sched_priority { /* Used to choose between FIFO and RR job-scheduling */ extern int drm_sched_policy; -#define DRM_SCHED_POLICY_RR 0 -#define DRM_SCHED_POLICY_FIFO 1 +#define DRM_SCHED_POLICY_RR 0 +#define DRM_SCHED_POLICY_FIFO 1 +#define DRM_SCHED_POLICY_DEADLINE 2 /** * struct drm_sched_entity - A wrapper around a job queue (typically @@ -153,6 +154,8 @@ struct drm_sched_entity { */ struct spsc_queue job_queue; + enum drm_sched_priority rq_priority; + /** * @fence_seq: * @@ -522,6 +525,7 @@ struct drm_gpu_scheduler { long timeout; const char *name; u32 num_rqs; + u32 num_user_rqs; struct drm_sched_rq **sched_rq; wait_queue_head_t job_scheduled; atomic64_t job_id_count; @@ -623,6 +627,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity, enum drm_sched_priority priority); bool drm_sched_entity_is_ready(struct drm_sched_entity *entity); int drm_sched_entity_error(struct drm_sched_entity *entity); +ktime_t drm_sched_entity_get_job_deadline(struct drm_sched_entity *entity, + struct drm_sched_job *job); struct drm_sched_fence *drm_sched_fence_alloc( struct drm_sched_entity *s_entity, void *owner); From patchwork Mon Dec 
30 16:52:54 2024 X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923362 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com Subject: [RFC 09/14] drm/sched: Remove FIFO and RR and simplify to a single run queue Date: Mon, 30 Dec 2024 16:52:54 +0000 Message-ID: <20241230165259.95855-10-tursulin@igalia.com> In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> From: Tvrtko Ursulin If the new deadline policy is at least as good as FIFO, and we can afford to remove round-robin, we can simplify the scheduler code by making the scheduler to run queue relationship always 1:1 and removing some code. Also, now that the FIFO policy is gone, the tree of entities is no longer a FIFO tree, so rename it to just the tree.
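For reference, a small standalone sketch of the priority-to-deadline mapping this single run queue relies on, assuming the d_us[] offsets introduced in the previous patch (plain userspace C; everything else here is invented for the example and is not the scheduler API). It shows why a later high priority submission can overtake a normal one while a low priority job still has a bounded wait:

/*
 * Userspace model: deadline = submit time + constant, priority dependent
 * offset. Offsets mirror the d_us[] table from the deadline policy patch.
 */
#include <stdio.h>

enum model_prio { P_KERNEL, P_HIGH, P_NORMAL, P_LOW };

static const unsigned int d_us[] = {
        [P_KERNEL] = 100,
        [P_HIGH]   = 1000,
        [P_NORMAL] = 5000,
        [P_LOW]    = 100000,
};

static unsigned long long deadline_us(unsigned long long submit_us,
                                      enum model_prio prio)
{
        return submit_us + d_us[prio];
}

int main(void)
{
        unsigned long long t = 1000000; /* normal and low both submitted at t */

        printf("normal: %llu us\n", deadline_us(t, P_NORMAL));      /* t + 5ms */
        printf("low:    %llu us\n", deadline_us(t, P_LOW));         /* t + 100ms */
        /* A high priority job submitted 2ms later still overtakes normal,
         * but the low job's deadline is untouched, so it runs at the latest
         * ~100ms after its submission even under constant high priority load. */
        printf("high:   %llu us\n", deadline_us(t + 2000, P_HIGH)); /* t + 3ms */
        return 0;
}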
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 23 ++-- drivers/gpu/drm/scheduler/sched_entity.c | 30 +---- drivers/gpu/drm/scheduler/sched_main.c | 136 ++++++----------------- drivers/gpu/drm/scheduler/sched_rq.c | 35 ++---- include/drm/gpu_scheduler.h | 13 +-- 5 files changed, 56 insertions(+), 181 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index b9d08bc96581..918b6d4919e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -418,25 +418,22 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched) { + struct drm_sched_rq *rq = sched->rq; + struct drm_sched_entity *s_entity; struct drm_sched_job *s_job; - struct drm_sched_entity *s_entity = NULL; - int i; /* Signal all jobs not yet scheduled */ - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { - struct drm_sched_rq *rq = sched->sched_rq[i]; - spin_lock(&rq->lock); - list_for_each_entry(s_entity, &rq->entities, list) { - while ((s_job = to_drm_sched_job(spsc_queue_pop(&s_entity->job_queue)))) { - struct drm_sched_fence *s_fence = s_job->s_fence; + spin_lock(&rq->lock); + list_for_each_entry(s_entity, &rq->entities, list) { + while ((s_job = to_drm_sched_job(spsc_queue_pop(&s_entity->job_queue)))) { + struct drm_sched_fence *s_fence = s_job->s_fence; - dma_fence_signal(&s_fence->scheduled); - dma_fence_set_error(&s_fence->finished, -EHWPOISON); - dma_fence_signal(&s_fence->finished); - } + dma_fence_signal(&s_fence->scheduled); + dma_fence_set_error(&s_fence->finished, -EHWPOISON); + dma_fence_signal(&s_fence->finished); } - spin_unlock(&rq->lock); } + spin_unlock(&rq->lock); /* Signal all jobs already scheduled to HW */ list_for_each_entry(s_job, &sched->pending_list, list) { diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 6928ec19ec23..14bc3f797079 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -71,8 +71,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, entity->guilty = guilty; entity->num_sched_list = num_sched_list; entity->priority = priority; - entity->rq_priority = drm_sched_policy == DRM_SCHED_POLICY_DEADLINE ? - DRM_SCHED_PRIORITY_KERNEL : priority; /* * It's perfectly valid to initialize an entity without having a valid * scheduler attached. It's just not valid to use the scheduler before it @@ -82,30 +80,14 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, RCU_INIT_POINTER(entity->last_scheduled, NULL); RB_CLEAR_NODE(&entity->rb_tree_node); - if (num_sched_list && !sched_list[0]->sched_rq) { + if (num_sched_list && !sched_list[0]->rq) { /* Since every entry covered by num_sched_list * should be non-NULL and therefore we warn drivers * not to do this and to fix their DRM calling order. */ pr_warn("%s: called with uninitialized scheduler\n", __func__); } else if (num_sched_list) { - enum drm_sched_priority p = entity->priority; - - /* - * The "priority" of an entity cannot exceed the number of - * run-queues of a scheduler. Protect against num_rqs being 0, - * by converting to signed. Choose the lowest priority - * available. 
- */ - if (p >= sched_list[0]->num_user_rqs) { - drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_user_rqs:%u\n", - p, sched_list[0]->num_user_rqs); - p = max_t(s32, - (s32)sched_list[0]->num_user_rqs - 1, - (s32)DRM_SCHED_PRIORITY_KERNEL); - entity->priority = p; - } - entity->rq = sched_list[0]->sched_rq[entity->rq_priority]; + entity->rq = sched_list[0]->rq; } init_completion(&entity->entity_idle); @@ -532,7 +514,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) spin_lock(&entity->lock); sched = drm_sched_pick_best(entity->sched_list, entity->num_sched_list); - rq = sched ? sched->sched_rq[entity->rq_priority] : NULL; + rq = sched ? sched->rq : NULL; if (rq != entity->rq) { drm_sched_rq_remove_entity(entity->rq, entity); entity->rq = rq; @@ -576,10 +558,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) if (first) { struct drm_gpu_scheduler *sched; - if (drm_sched_policy == DRM_SCHED_POLICY_DEADLINE) - submit_ts = __drm_sched_entity_get_job_deadline(entity, - submit_ts); - + submit_ts = __drm_sched_entity_get_job_deadline(entity, + submit_ts); sched = drm_sched_rq_add_entity(entity->rq, entity, submit_ts); if (sched) drm_sched_wakeup(sched); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 433bef85eeaf..4ba9ed27a8a6 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -87,15 +87,6 @@ static struct lockdep_map drm_sched_lockdep_map = { }; #endif -int drm_sched_policy = DRM_SCHED_POLICY_DEADLINE; - -/** - * DOC: sched_policy (int) - * Used to override default entities scheduling policy in a run queue. - */ -MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO, " __stringify(DRM_SCHED_POLICY_DEADLINE) " = Virtual deadline (default)."); -module_param_named(sched_policy, drm_sched_policy, int, 0444); - static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched) { u32 credits; @@ -850,34 +841,6 @@ void drm_sched_wakeup(struct drm_gpu_scheduler *sched) drm_sched_run_job_queue(sched); } -/** - * drm_sched_select_entity - Select next entity to process - * - * @sched: scheduler instance - * - * Return an entity to process or NULL if none are found. - * - * Note, that we break out of the for-loop when "entity" is non-null, which can - * also be an error-pointer--this assures we don't process lower priority - * run-queues. See comments in the respectively called functions. - */ -static struct drm_sched_entity * -drm_sched_select_entity(struct drm_gpu_scheduler *sched) -{ - struct drm_sched_entity *entity = NULL; - int i; - - /* Start with the highest priority. - */ - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { - entity = drm_sched_rq_select_entity(sched, sched->sched_rq[i]); - if (entity) - break; - } - - return IS_ERR(entity) ? 
NULL : entity;; -} - /** * drm_sched_get_finished_job - fetch the next finished job to be destroyed * @@ -1000,8 +963,8 @@ static void drm_sched_run_job_work(struct work_struct *w) return; /* Find entity with a ready job */ - entity = drm_sched_select_entity(sched); - if (!entity) + entity = drm_sched_rq_select_entity(sched, sched->rq); + if (IS_ERR_OR_NULL(entity)) return; /* No more work */ sched_job = drm_sched_entity_pop_job(entity); @@ -1047,7 +1010,7 @@ static void drm_sched_run_job_work(struct work_struct *w) * @ops: backend operations for this scheduler * @submit_wq: workqueue to use for submission. If NULL, an ordered wq is * allocated and used - * @num_rqs: number of runqueues, one for each priority, up to DRM_SCHED_PRIORITY_COUNT + * @num_rqs: deprecated and ignored * @credit_limit: the number of credits this scheduler can hold from all jobs * @hang_limit: number of times to allow a job to hang before dropping it * @timeout: timeout value in jiffies for the scheduler @@ -1066,8 +1029,6 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, long timeout, struct workqueue_struct *timeout_wq, atomic_t *score, const char *name, struct device *dev) { - int i; - sched->ops = ops; sched->credit_limit = credit_limit; sched->name = name; @@ -1077,13 +1038,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, sched->score = score ? score : &sched->_score; sched->dev = dev; - if (num_rqs > DRM_SCHED_PRIORITY_COUNT) { - /* This is a gross violation--tell drivers what the problem is. - */ - drm_err(sched, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n", - __func__); - return -EINVAL; - } else if (sched->sched_rq) { + if (sched->rq) { /* Not an error, but warn anyway so drivers can * fine-tune their DRM calling order, and return all * is good. @@ -1109,21 +1064,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, sched->own_submit_wq = true; } - sched->num_user_rqs = num_rqs; - sched->num_rqs = drm_sched_policy != DRM_SCHED_POLICY_DEADLINE ? 
- num_rqs : 1; - sched->sched_rq = kmalloc_array(sched->num_rqs, - sizeof(*sched->sched_rq), - GFP_KERNEL | __GFP_ZERO); - if (!sched->sched_rq) + sched->rq = kmalloc(sizeof(*sched->rq), GFP_KERNEL | __GFP_ZERO); + if (!sched->rq) goto Out_check_own; - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { - sched->sched_rq[i] = kzalloc(sizeof(*sched->sched_rq[i]), GFP_KERNEL); - if (!sched->sched_rq[i]) - goto Out_unroll; - drm_sched_rq_init(sched, sched->sched_rq[i]); - } + drm_sched_rq_init(sched, sched->rq); init_waitqueue_head(&sched->job_scheduled); INIT_LIST_HEAD(&sched->pending_list); @@ -1135,15 +1080,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, atomic_set(&sched->_score, 0); atomic64_set(&sched->job_id_count, 0); sched->pause_submit = false; - sched->ready = true; return 0; -Out_unroll: - for (--i ; i >= DRM_SCHED_PRIORITY_KERNEL; i--) - kfree(sched->sched_rq[i]); - kfree(sched->sched_rq); - sched->sched_rq = NULL; Out_check_own: if (sched->own_submit_wq) destroy_workqueue(sched->submit_wq); @@ -1174,25 +1113,21 @@ EXPORT_SYMBOL(drm_sched_init); */ void drm_sched_fini(struct drm_gpu_scheduler *sched) { + + struct drm_sched_rq *rq = sched->rq; struct drm_sched_entity *s_entity; - int i; drm_sched_wqueue_stop(sched); - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { - struct drm_sched_rq *rq = sched->sched_rq[i]; - - spin_lock(&rq->lock); - list_for_each_entry(s_entity, &rq->entities, list) - /* - * Prevents reinsertion and marks job_queue as idle, - * it will be removed from the rq in drm_sched_entity_fini() - * eventually - */ - s_entity->stopped = true; - spin_unlock(&rq->lock); - kfree(sched->sched_rq[i]); - } + spin_lock(&rq->lock); + list_for_each_entry(s_entity, &rq->entities, list) + /* + * Prevents reinsertion and marks job_queue as idle, + * it will be removed from the rq in drm_sched_entity_fini() + * eventually + */ + s_entity->stopped = true; + spin_unlock(&rq->lock); /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ wake_up_all(&sched->job_scheduled); @@ -1203,8 +1138,8 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched) if (sched->own_submit_wq) destroy_workqueue(sched->submit_wq); sched->ready = false; - kfree(sched->sched_rq); - sched->sched_rq = NULL; + kfree(sched->rq); + sched->rq = NULL; } EXPORT_SYMBOL(drm_sched_fini); @@ -1219,35 +1154,28 @@ EXPORT_SYMBOL(drm_sched_fini); */ void drm_sched_increase_karma(struct drm_sched_job *bad) { - int i; - struct drm_sched_entity *tmp; - struct drm_sched_entity *entity; struct drm_gpu_scheduler *sched = bad->sched; + struct drm_sched_entity *entity, *tmp; + struct drm_sched_rq *rq = sched->rq; /* don't change @bad's karma if it's from KERNEL RQ, * because sometimes GPU hang would cause kernel jobs (like VM updating jobs) * corrupt but keep in mind that kernel jobs always considered good. 
*/ - if (bad->s_priority != DRM_SCHED_PRIORITY_KERNEL) { - atomic_inc(&bad->karma); + if (bad->s_priority == DRM_SCHED_PRIORITY_KERNEL) + return; - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { - struct drm_sched_rq *rq = sched->sched_rq[i]; + atomic_inc(&bad->karma); - spin_lock(&rq->lock); - list_for_each_entry_safe(entity, tmp, &rq->entities, list) { - if (bad->s_fence->scheduled.context == - entity->fence_context) { - if (entity->guilty) - atomic_set(entity->guilty, 1); - break; - } - } - spin_unlock(&rq->lock); - if (&entity->list != &rq->entities) - break; + spin_lock(&rq->lock); + list_for_each_entry_safe(entity, tmp, &rq->entities, list) { + if (bad->s_fence->scheduled.context == entity->fence_context) { + if (entity->guilty) + atomic_set(entity->guilty, 1); + break; } } + spin_unlock(&rq->lock); } EXPORT_SYMBOL(drm_sched_increase_karma); diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c index a6bb21250350..0b7a2b8b48db 100644 --- a/drivers/gpu/drm/scheduler/sched_rq.c +++ b/drivers/gpu/drm/scheduler/sched_rq.c @@ -12,7 +12,7 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting); } -static void __drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, +static void __drm_sched_rq_remove_tree_locked(struct drm_sched_entity *entity, struct drm_sched_rq *rq) { lockdep_assert_held(&entity->lock); @@ -22,7 +22,7 @@ static void __drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, RB_CLEAR_NODE(&entity->rb_tree_node); } -static void __drm_sched_rq_add_fifo_locked(struct drm_sched_entity *entity, +static void __drm_sched_rq_add_tree_locked(struct drm_sched_entity *entity, struct drm_sched_rq *rq, ktime_t ts) { @@ -56,16 +56,6 @@ void drm_sched_rq_init(struct drm_gpu_scheduler *sched, rq->sched = sched; } -static ktime_t -drm_sched_rq_get_rr_deadline(struct drm_sched_rq *rq) -{ - lockdep_assert_held(&rq->lock); - - rq->rr_deadline = ktime_add_ns(rq->rr_deadline, 1); - - return rq->rr_deadline; -} - /** * drm_sched_rq_add_entity - add an entity * @@ -97,12 +87,9 @@ drm_sched_rq_add_entity(struct drm_sched_rq *rq, if (!list_empty(&entity->list)) list_add_tail(&entity->list, &rq->entities); - if (drm_sched_policy == DRM_SCHED_POLICY_RR) - ts = drm_sched_rq_get_rr_deadline(rq); - if (!RB_EMPTY_NODE(&entity->rb_tree_node)) - __drm_sched_rq_remove_fifo_locked(entity, rq); - __drm_sched_rq_add_fifo_locked(entity, rq, ts); + __drm_sched_rq_remove_tree_locked(entity, rq); + __drm_sched_rq_add_tree_locked(entity, rq, ts); spin_unlock(&rq->lock); spin_unlock(&entity->lock); @@ -132,7 +119,7 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, list_del_init(&entity->list); if (!RB_EMPTY_NODE(&entity->rb_tree_node)) - __drm_sched_rq_remove_fifo_locked(entity, rq); + __drm_sched_rq_remove_tree_locked(entity, rq); spin_unlock(&rq->lock); } @@ -147,20 +134,14 @@ void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, spin_lock(&entity->lock); spin_lock(&rq->lock); - __drm_sched_rq_remove_fifo_locked(entity, rq); + __drm_sched_rq_remove_tree_locked(entity, rq); if (next_job) { ktime_t ts; - if (drm_sched_policy == DRM_SCHED_POLICY_DEADLINE) - ts = drm_sched_entity_get_job_deadline(entity, - next_job); - else if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) - ts = next_job->submit_ts; - else - ts = drm_sched_rq_get_rr_deadline(rq); + ts = drm_sched_entity_get_job_deadline(entity, next_job); - 
__drm_sched_rq_add_fifo_locked(entity, rq, ts); + __drm_sched_rq_add_tree_locked(entity, rq, ts); } spin_unlock(&rq->lock); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 7532071fbea8..93f6fcfe3ba0 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -71,13 +71,6 @@ enum drm_sched_priority { DRM_SCHED_PRIORITY_COUNT }; -/* Used to choose between FIFO and RR job-scheduling */ -extern int drm_sched_policy; - -#define DRM_SCHED_POLICY_RR 0 -#define DRM_SCHED_POLICY_FIFO 1 -#define DRM_SCHED_POLICY_DEADLINE 2 - /** * struct drm_sched_entity - A wrapper around a job queue (typically * attached to the DRM file_priv). @@ -154,8 +147,6 @@ struct drm_sched_entity { */ struct spsc_queue job_queue; - enum drm_sched_priority rq_priority; - /** * @fence_seq: * @@ -524,9 +515,7 @@ struct drm_gpu_scheduler { atomic_t credit_count; long timeout; const char *name; - u32 num_rqs; - u32 num_user_rqs; - struct drm_sched_rq **sched_rq; + struct drm_sched_rq *rq; wait_queue_head_t job_scheduled; atomic64_t job_id_count; struct workqueue_struct *submit_wq; From patchwork Mon Dec 30 16:52:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923364 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A4E0EE77194 for ; Mon, 30 Dec 2024 16:53:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 34E7A10E53C; Mon, 30 Dec 2024 16:53:21 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="APZKJGtF"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id BC14110E527 for ; Mon, 30 Dec 2024 16:53:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=Ww5seEkRAR8bzEkdNm61yKYPAq+nK+Ts6tqpjG72nAg=; b=APZKJGtFxM1gWjQEEH3VYlpvLV alHYGPKZ5XB8TfhxXskobWXXGmOD1W8bj8WZaf+p6uHGmHq46B5I+AzPhJqYYSfJBye2V/OjqJ3wK BzhjkGkwzeyLRsvLgQEyg3YZHA+1XtGrpXUWZRqCISE1xUKgNZi/GuwembY7R9hR83Qo3ubXgbZ3p xZugj75CjyHT/Kg6ZtszQIv+ej6RnC4iP3CElm5pzfJAw9/HjUCjNG/VN8K4cepSdYp1SDG3U3VL1 t0zRrsAQXOTKHtVbCsJ1ovkgyrLh77m3PF49wmi7HxBLbQtJcpYrBVNNg0wP+KrvzAzD0vpY6vP6H 9u8uWy+w==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tSJ0x-009ZwL-04; Mon, 30 Dec 2024 17:53:11 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Danilo Krummrich , Matthew Brost , Philipp Stanner Subject: [RFC 10/14] drm/sched: Queue all free credits in one worker invocation Date: Mon, 30 
Dec 2024 16:52:55 +0000 Message-ID: <20241230165259.95855-11-tursulin@igalia.com> In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> From: Tvrtko Ursulin There is no reason to queue just a single job when the scheduler can take more, only to then re-queue the worker to queue the next one. We can simply feed the hardware with as much as it can take in one go and hopefully win some latency. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_main.c | 112 +++++++++++-------------- drivers/gpu/drm/scheduler/sched_rq.c | 19 ++--- include/drm/gpu_scheduler.h | 3 - 3 files changed, 58 insertions(+), 76 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 4ba9ed27a8a6..6f4ea8a2ca17 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -98,33 +98,6 @@ static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched) return credits; } -/** - * drm_sched_can_queue -- Can we queue more to the hardware? - * @sched: scheduler instance - * @entity: the scheduler entity - * - * Return true if we can push at least one more job from @entity, false - * otherwise. - */ -bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, - struct drm_sched_entity *entity) -{ - struct drm_sched_job *s_job; - - s_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); - if (!s_job) - return false; - - /* If a job exceeds the credit limit, truncate it to the credit limit * itself to guarantee forward progress.
- */ - if (drm_WARN(sched, s_job->credits > sched->credit_limit, - "Jobs may not exceed the credit limit, truncate.\n")) - s_job->credits = sched->credit_limit; - - return drm_sched_available_credits(sched) >= s_job->credits; -} - /** * drm_sched_run_job_queue - enqueue run-job work * @sched: scheduler instance @@ -174,6 +147,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result) atomic_sub(s_job->credits, &sched->credit_count); atomic_dec(sched->score); + drm_sched_run_job_queue(sched); trace_drm_sched_process_job(s_fence); @@ -941,7 +915,6 @@ static void drm_sched_free_job_work(struct work_struct *w) sched->ops->free_job(job); drm_sched_run_free_queue(sched); - drm_sched_run_job_queue(sched); } /** @@ -953,54 +926,71 @@ static void drm_sched_run_job_work(struct work_struct *w) { struct drm_gpu_scheduler *sched = container_of(w, struct drm_gpu_scheduler, work_run_job); + u32 job_credits, submitted_credits = 0; struct drm_sched_entity *entity; - struct dma_fence *fence; - struct drm_sched_fence *s_fence; struct drm_sched_job *sched_job; - int r; + struct dma_fence *fence; if (READ_ONCE(sched->pause_submit)) return; - /* Find entity with a ready job */ - entity = drm_sched_rq_select_entity(sched, sched->rq); - if (IS_ERR_OR_NULL(entity)) - return; /* No more work */ + for (;;) { + /* Find entity with a ready job */ + entity = drm_sched_rq_select_entity(sched, sched->rq); + if (!entity) + break; /* No more work */ - sched_job = drm_sched_entity_pop_job(entity); - if (!sched_job) { + /* + * If a job exceeds the credit limit truncate it to guarantee + * forward progress. + */ + sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); + job_credits = sched_job->credits; + if (drm_WARN_ONCE(sched, job_credits > sched->credit_limit, + "Jobs may not exceed the credit limit, truncating.\n")) + job_credits = sched_job->credits = sched->credit_limit; + + if (job_credits > drm_sched_available_credits(sched)) { + complete_all(&entity->entity_idle); + break; + } + + sched_job = drm_sched_entity_pop_job(entity); complete_all(&entity->entity_idle); - drm_sched_run_job_queue(sched); - return; - } + if (!sched_job) { + /* Top entity is not yet runnable after all */ + continue; + } - s_fence = sched_job->s_fence; + drm_sched_job_begin(sched_job); + trace_drm_run_job(sched_job, entity); + submitted_credits += job_credits; + atomic_add(job_credits, &sched->credit_count); - atomic_add(sched_job->credits, &sched->credit_count); - drm_sched_job_begin(sched_job); + fence = sched->ops->run_job(sched_job); + drm_sched_fence_scheduled(sched_job->s_fence, fence); - trace_drm_run_job(sched_job, entity); - fence = sched->ops->run_job(sched_job); - complete_all(&entity->entity_idle); - drm_sched_fence_scheduled(s_fence, fence); + if (!IS_ERR_OR_NULL(fence)) { + int r; - if (!IS_ERR_OR_NULL(fence)) { - /* Drop for original kref_init of the fence */ - dma_fence_put(fence); + /* Drop for original kref_init of the fence */ + dma_fence_put(fence); - r = dma_fence_add_callback(fence, &sched_job->cb, - drm_sched_job_done_cb); - if (r == -ENOENT) - drm_sched_job_done(sched_job, fence->error); - else if (r) - DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r); - } else { - drm_sched_job_done(sched_job, IS_ERR(fence) ? 
- PTR_ERR(fence) : 0); + r = dma_fence_add_callback(fence, &sched_job->cb, + drm_sched_job_done_cb); + if (r == -ENOENT) + drm_sched_job_done(sched_job, fence->error); + else if (r) + DRM_DEV_ERROR(sched->dev, + "fence add callback failed (%d)\n", r); + } else { + drm_sched_job_done(sched_job, IS_ERR(fence) ? + PTR_ERR(fence) : 0); + } } - wake_up(&sched->job_scheduled); - drm_sched_run_job_queue(sched); + if (submitted_credits) + wake_up(&sched->job_scheduled); } /** diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c index 0b7a2b8b48db..1a454384ab25 100644 --- a/drivers/gpu/drm/scheduler/sched_rq.c +++ b/drivers/gpu/drm/scheduler/sched_rq.c @@ -156,9 +156,7 @@ void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, * * Find oldest waiting ready entity. * - * Return an entity if one is found; return an error-pointer (!NULL) if an - * entity was ready, but the scheduler had insufficient credits to accommodate - * its job; return NULL, if no ready entity was found. + * Return an entity if one is found or NULL if no ready entity was found. */ struct drm_sched_entity * drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, @@ -170,16 +168,13 @@ drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, spin_lock(&rq->lock); for (rb = rb_first_cached(&rq->rb_tree_root); rb; rb = rb_next(rb)) { entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node); - if (drm_sched_entity_is_ready(entity)) { - if (!drm_sched_can_queue(sched, entity)) { - entity = ERR_PTR(-ENOSPC); - break; - } - - reinit_completion(&entity->entity_idle); - break; + if (!drm_sched_entity_is_ready(entity)) { + entity = NULL; + continue; } - entity = NULL; + + reinit_completion(&entity->entity_idle); + break; } spin_unlock(&rq->lock); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 93f6fcfe3ba0..85f3a0d5a7be 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -544,9 +544,6 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, void drm_sched_fini(struct drm_gpu_scheduler *sched); -bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, - struct drm_sched_entity *entity); - int drm_sched_job_init(struct drm_sched_job *job, struct drm_sched_entity *entity, u32 credits, void *owner); From patchwork Mon Dec 30 16:52:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D2380E77197 for ; Mon, 30 Dec 2024 16:53:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DD73310E538; Mon, 30 Dec 2024 16:53:19 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="AK3IkuD8"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id E165610E533 for ; Mon, 30 Dec 2024 16:53:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; 
From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com Subject: [RFC 11/14] drm/sched: Connect with dma-fence deadlines Date: Mon, 30 Dec 2024 16:52:56 +0000 Message-ID: <20241230165259.95855-12-tursulin@igalia.com> In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> From: Tvrtko Ursulin Now that the scheduling policy is deadline based, it feels natural to allow propagating externally set dma-fence deadlines to the scheduler. Scheduler deadlines are not a guarantee, but since the dma-fence deadline facility is already in use by userspace, let's wire it up.
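As a rough illustration of the wiring (standalone userspace C with invented names, not the actual drm_sched code): the entity's position in the run queue tree is keyed by the earlier of the policy-assigned deadline and any deadline userspace requested on the job's finished fence, so an external deadline can only pull work forward, never push it back:

/*
 * Userspace model: an externally requested fence deadline tightens the
 * queueing deadline derived from the submit timestamp; a later external
 * deadline changes nothing.
 */
#include <stdio.h>

static unsigned long long effective_deadline(unsigned long long policy_deadline,
                                             unsigned long long fence_deadline)
{
        return fence_deadline < policy_deadline ? fence_deadline : policy_deadline;
}

int main(void)
{
        unsigned long long policy = 1005000; /* e.g. submit + 5ms for normal */

        /* Compositor asks for an earlier deadline: the job moves up the tree. */
        printf("%llu\n", effective_deadline(policy, 1002000)); /* 1002000 */
        /* A later requested deadline is ignored for ordering purposes. */
        printf("%llu\n", effective_deadline(policy, 2000000)); /* 1005000 */
        return 0;
}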
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner Cc: Rob Clark --- drivers/gpu/drm/scheduler/sched_entity.c | 30 +++++++++++++++++++++++- drivers/gpu/drm/scheduler/sched_fence.c | 3 +++ drivers/gpu/drm/scheduler/sched_rq.c | 16 +++++++++++++ include/drm/gpu_scheduler.h | 8 +++++++ 4 files changed, 56 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 14bc3f797079..c5a4c04b2455 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -385,7 +385,24 @@ ktime_t drm_sched_entity_get_job_deadline(struct drm_sched_entity *entity, struct drm_sched_job *job) { - return __drm_sched_entity_get_job_deadline(entity, job->submit_ts); + struct drm_sched_fence *s_fence = job->s_fence; + struct dma_fence *fence = &s_fence->finished; + ktime_t deadline; + + deadline = __drm_sched_entity_get_job_deadline(entity, job->submit_ts); + if (test_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT, &fence->flags) && + ktime_before(s_fence->deadline, deadline)) + deadline = s_fence->deadline; + + return deadline; +} + +void drm_sched_entity_set_deadline(struct drm_sched_entity *entity, + ktime_t deadline) +{ + spin_lock(&entity->lock); + drm_sched_rq_update_deadline(entity->rq, entity, deadline); + spin_unlock(&entity->lock); } /* @@ -536,8 +553,11 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) */ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) { + struct drm_sched_fence *s_fence = sched_job->s_fence; struct drm_sched_entity *entity = sched_job->entity; + struct dma_fence *fence = &s_fence->finished; ktime_t submit_ts = ktime_get(); + ktime_t fence_deadline; bool first; trace_drm_sched_job(sched_job, entity); @@ -552,6 +572,11 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) * Make sure to set the submit_ts first, to avoid a race. 
*/ sched_job->submit_ts = submit_ts; + if (test_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT, &fence->flags)) + fence_deadline = s_fence->deadline; + else + fence_deadline = KTIME_MAX; + first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); /* first job wakes up scheduler */ @@ -560,6 +585,9 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) submit_ts = __drm_sched_entity_get_job_deadline(entity, submit_ts); + if (ktime_before(fence_deadline, submit_ts)) + submit_ts = fence_deadline; + sched = drm_sched_rq_add_entity(entity->rq, entity, submit_ts); if (sched) drm_sched_wakeup(sched); diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c index 0f35f009b9d3..dfc7f50d4e0d 100644 --- a/drivers/gpu/drm/scheduler/sched_fence.c +++ b/drivers/gpu/drm/scheduler/sched_fence.c @@ -168,6 +168,8 @@ static void drm_sched_fence_set_deadline_finished(struct dma_fence *f, spin_unlock_irqrestore(&fence->lock, flags); + drm_sched_entity_set_deadline(fence->entity, deadline); + /* * smp_load_aquire() to ensure that if we are racing another * thread calling drm_sched_fence_set_parent(), that we see @@ -223,6 +225,7 @@ void drm_sched_fence_init(struct drm_sched_fence *fence, { unsigned seq; + fence->entity = entity; fence->sched = entity->rq->sched; seq = atomic_inc_return(&entity->fence_seq); dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled, diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c index 1a454384ab25..e96c8ca9c54b 100644 --- a/drivers/gpu/drm/scheduler/sched_rq.c +++ b/drivers/gpu/drm/scheduler/sched_rq.c @@ -148,6 +148,22 @@ void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, spin_unlock(&entity->lock); } +void drm_sched_rq_update_deadline(struct drm_sched_rq *rq, + struct drm_sched_entity *entity, + ktime_t deadline) +{ + lockdep_assert_held(&entity->lock); + + if (ktime_before(deadline, entity->oldest_job_waiting)) { + spin_lock(&rq->lock); + if (!RB_EMPTY_NODE(&entity->rb_tree_node)) { + __drm_sched_rq_remove_tree_locked(entity, rq); + __drm_sched_rq_add_tree_locked(entity, rq, deadline); + } + spin_unlock(&rq->lock); + } +} + /** * drm_sched_rq_select_entity - Select an entity which provides a job to run * diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 85f3a0d5a7be..c68dce8af063 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -291,6 +291,9 @@ struct drm_sched_fence { * &drm_sched_fence.finished fence once parent is signalled. */ struct dma_fence *parent; + + struct drm_sched_entity *entity; + /** * @sched: the scheduler instance to which the job having this struct * belongs to. 
@@ -597,6 +600,9 @@ void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, struct drm_sched_entity * drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, struct drm_sched_rq *rq); +void drm_sched_rq_update_deadline(struct drm_sched_rq *rq, + struct drm_sched_entity *entity, + ktime_t deadline); int drm_sched_entity_init(struct drm_sched_entity *entity, enum drm_sched_priority priority, @@ -612,6 +618,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job); void drm_sched_entity_set_priority(struct drm_sched_entity *entity, enum drm_sched_priority priority); bool drm_sched_entity_is_ready(struct drm_sched_entity *entity); +void drm_sched_entity_set_deadline(struct drm_sched_entity *entity, + ktime_t deadline); int drm_sched_entity_error(struct drm_sched_entity *entity); ktime_t drm_sched_entity_get_job_deadline(struct drm_sched_entity *entity, struct drm_sched_job *job); From patchwork Mon Dec 30 16:52:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923366 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CC097E77188 for ; Mon, 30 Dec 2024 16:53:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4E68210E53E; Mon, 30 Dec 2024 16:53:40 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="SLw1aaDM"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3448310E538 for ; Mon, 30 Dec 2024 16:53:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=5ZBSFZxJvSu22KmqQfkECcJnDs4JmpClHHvldnIdaoM=; b=SLw1aaDMVnfpb7BbhtBMaRi1T7 gfcBkMyZkdUdpQ/WqdoIF6DC1e9OhZNjqt3OjvbfXK7VpiHGVcQXRIWUM2YVm0bV5VCvHPR4IyrGq mFdpus7/DDvhIVmuECbH2xWLzkJypdRZSPevKsmsITW2/d8wpNWIlyTCT6esdWrJ2UWAL2Yw3ryYQ jUOt6FCs/p30lYCHQZGr6+jDXTZ0fkfNifSCA0l/GlhKaRkxpJZKu4zaf55WIs4FdfBj5Nezb3Bso 6Ej5uMDke82AI87zoamUVlb39pnMWT4o3Laaym0Ct61LdC5I9qZdpldI0WAXFAAl5Kg70xe3sYVAf y8AeqDuQ==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tSJ0y-009Zwa-Eu; Mon, 30 Dec 2024 17:53:12 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Danilo Krummrich , Matthew Brost , Philipp Stanner Subject: [RFC 12/14] drm/sched: Embed run queue singleton into the scheduler Date: Mon, 30 Dec 2024 16:52:57 +0000 Message-ID: <20241230165259.95855-13-tursulin@igalia.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> References: 
<20241230165259.95855-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Now that the run queue to scheduler relationship is always 1:1 we can embed it (the run queue) directly in the scheduler struct and save on some allocation error handling code and such. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 5 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 8 ++++-- drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 8 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 8 +++--- drivers/gpu/drm/scheduler/sched_entity.c | 27 ++++++++---------- drivers/gpu/drm/scheduler/sched_fence.c | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 31 ++++----------------- drivers/gpu/drm/scheduler/sched_rq.c | 17 +++++------ include/drm/gpu_scheduler.h | 11 ++------ 11 files changed, 56 insertions(+), 73 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index d891ab779ca7..25028ac48844 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1108,7 +1108,8 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p) if (p->gang_size > 1 && !p->adev->vm_manager.concurrent_flush) { for (i = 0; i < p->gang_size; ++i) { struct drm_sched_entity *entity = p->entities[i]; - struct drm_gpu_scheduler *sched = entity->rq->sched; + struct drm_gpu_scheduler *sched = + container_of(entity->rq, typeof(*sched), rq); struct amdgpu_ring *ring = to_amdgpu_ring(sched); if (amdgpu_vmid_uses_reserved(adev, vm, ring->vm_hub)) @@ -1233,7 +1234,8 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p) return r; } - sched = p->gang_leader->base.entity->rq->sched; + sched = container_of(p->gang_leader->base.entity->rq, typeof(*sched), + rq); while ((fence = amdgpu_sync_get_fence(&p->sync))) { struct drm_sched_fence *s_fence = to_drm_sched_fence(fence); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 918b6d4919e1..f7abe413044e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -349,7 +349,9 @@ static struct dma_fence * amdgpu_job_prepare_job(struct drm_sched_job *sched_job, struct drm_sched_entity *s_entity) { - struct amdgpu_ring *ring = to_amdgpu_ring(s_entity->rq->sched); + struct drm_gpu_scheduler *sched = + container_of(s_entity->rq, typeof(*sched), rq); + struct amdgpu_ring *ring = to_amdgpu_ring(sched); struct amdgpu_job *job = to_amdgpu_job(sched_job); struct dma_fence *fence = NULL; int r; @@ -418,7 +420,7 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched) { - struct drm_sched_rq *rq = sched->rq; + struct drm_sched_rq *rq = &sched->rq; struct drm_sched_entity *s_entity; struct drm_sched_job *s_job; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h index ce6b9ba967ff..d6872baeba1e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h @@ -85,7 +85,10 
@@ struct amdgpu_job { static inline struct amdgpu_ring *amdgpu_job_ring(struct amdgpu_job *job) { - return to_amdgpu_ring(job->base.entity->rq->sched); + struct drm_gpu_scheduler *sched = + container_of(job->base.entity->rq, typeof(*sched), rq); + + return to_amdgpu_ring(sched); } int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h index 383fce40d4dd..a3819ed20d27 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h @@ -145,6 +145,7 @@ TRACE_EVENT(amdgpu_cs, struct amdgpu_ib *ib), TP_ARGS(p, job, ib), TP_STRUCT__entry( + __field(struct drm_gpu_scheduler *, sched) __field(struct amdgpu_bo_list *, bo_list) __field(u32, ring) __field(u32, dw) @@ -152,11 +153,14 @@ TRACE_EVENT(amdgpu_cs, ), TP_fast_assign( + __entry->sched = container_of(job->base.entity->rq, + typeof(*__entry->sched), + rq); __entry->bo_list = p->bo_list; - __entry->ring = to_amdgpu_ring(job->base.entity->rq->sched)->idx; + __entry->ring = to_amdgpu_ring(__entry->sched)->idx; __entry->dw = ib->length_dw; __entry->fences = amdgpu_fence_count_emitted( - to_amdgpu_ring(job->base.entity->rq->sched)); + to_amdgpu_ring(__entry->sched)); ), TP_printk("bo_list=%p, ring=%u, dw=%u, fences=%u", __entry->bo_list, __entry->ring, __entry->dw, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c index 46d9fb433ab2..42f2bfb30af1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c @@ -105,13 +105,13 @@ static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p, static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p, struct dma_fence **fence) { + struct drm_gpu_scheduler *sched = + container_of(p->vm->delayed.rq, typeof(*sched), rq); + struct amdgpu_ring *ring = + container_of(sched, struct amdgpu_ring, sched); struct amdgpu_ib *ib = p->job->ibs; - struct amdgpu_ring *ring; struct dma_fence *f; - ring = container_of(p->vm->delayed.rq->sched, struct amdgpu_ring, - sched); - WARN_ON(ib->length_dw == 0); amdgpu_ring_pad_ib(ring, ib); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c index e209b5e101df..182744c5f0cf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c @@ -420,15 +420,15 @@ int amdgpu_xcp_open_device(struct amdgpu_device *adev, void amdgpu_xcp_release_sched(struct amdgpu_device *adev, struct amdgpu_ctx_entity *entity) { - struct drm_gpu_scheduler *sched; - struct amdgpu_ring *ring; + struct drm_gpu_scheduler *sched = + container_of(entity->entity.rq, typeof(*sched), rq); if (!adev->xcp_mgr) return; - sched = entity->entity.rq->sched; if (sched->ready) { - ring = to_amdgpu_ring(entity->entity.rq->sched); + struct amdgpu_ring *ring = to_amdgpu_ring(sched); + atomic_dec(&adev->xcp_mgr->xcp[ring->xcp_id].ref_cnt); } } diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index c5a4c04b2455..dc5105ca8381 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -77,19 +77,12 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, * is initialized itself. */ entity->sched_list = num_sched_list > 1 ? sched_list : NULL; + if (num_sched_list) { + entity->sched_list = num_sched_list > 1 ? 
sched_list : NULL; + entity->rq = &sched_list[0]->rq; + } RCU_INIT_POINTER(entity->last_scheduled, NULL); RB_CLEAR_NODE(&entity->rb_tree_node); - - if (num_sched_list && !sched_list[0]->rq) { - /* Since every entry covered by num_sched_list - * should be non-NULL and therefore we warn drivers - * not to do this and to fix their DRM calling order. - */ - pr_warn("%s: called with uninitialized scheduler\n", __func__); - } else if (num_sched_list) { - entity->rq = sched_list[0]->rq; - } - init_completion(&entity->entity_idle); /* We start in an idle state. */ @@ -279,7 +272,7 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout) if (!entity->rq) return 0; - sched = entity->rq->sched; + sched = container_of(entity->rq, typeof(*sched), rq); /** * The client will not queue more IBs during this fini, consume existing * queued IBs or discard them on SIGKILL @@ -414,10 +407,12 @@ static void drm_sched_entity_wakeup(struct dma_fence *f, { struct drm_sched_entity *entity = container_of(cb, struct drm_sched_entity, cb); + struct drm_gpu_scheduler *sched = + container_of(entity->rq, typeof(*sched), rq); entity->dependency = NULL; dma_fence_put(f); - drm_sched_wakeup(entity->rq->sched); + drm_sched_wakeup(sched); } /* @@ -531,7 +526,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) spin_lock(&entity->lock); sched = drm_sched_pick_best(entity->sched_list, entity->num_sched_list); - rq = sched ? sched->rq : NULL; + rq = sched ? &sched->rq : NULL; if (rq != entity->rq) { drm_sched_rq_remove_entity(entity->rq, entity); entity->rq = rq; @@ -556,12 +551,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) struct drm_sched_fence *s_fence = sched_job->s_fence; struct drm_sched_entity *entity = sched_job->entity; struct dma_fence *fence = &s_fence->finished; + struct drm_gpu_scheduler *sched = + container_of(entity->rq, typeof(*sched), rq); ktime_t submit_ts = ktime_get(); ktime_t fence_deadline; bool first; trace_drm_sched_job(sched_job, entity); - atomic_inc(entity->rq->sched->score); + atomic_inc(sched->score); WRITE_ONCE(entity->last_user, current->group_leader); drm_sched_job_prepare_dependecies(sched_job); diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c index dfc7f50d4e0d..a0f8fbba6d7e 100644 --- a/drivers/gpu/drm/scheduler/sched_fence.c +++ b/drivers/gpu/drm/scheduler/sched_fence.c @@ -226,7 +226,7 @@ void drm_sched_fence_init(struct drm_sched_fence *fence, unsigned seq; fence->entity = entity; - fence->sched = entity->rq->sched; + fence->sched = container_of(entity->rq, typeof(*fence->sched), rq); seq = atomic_inc_return(&entity->fence_seq); dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled, &fence->lock, entity->fence_context, seq); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 6f4ea8a2ca17..67bf0bec3309 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -590,7 +590,7 @@ void drm_sched_job_arm(struct drm_sched_job *job) BUG_ON(!entity); drm_sched_entity_select_rq(entity); - sched = entity->rq->sched; + sched = container_of(entity->rq, typeof(*sched), rq); job->sched = sched; job->s_priority = entity->priority; @@ -936,7 +936,7 @@ static void drm_sched_run_job_work(struct work_struct *w) for (;;) { /* Find entity with a ready job */ - entity = drm_sched_rq_select_entity(sched, sched->rq); + entity = drm_sched_rq_select_entity(sched); if (!entity) break; /* No more work */ @@ -1028,15 
+1028,6 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, sched->score = score ? score : &sched->_score; sched->dev = dev; - if (sched->rq) { - /* Not an error, but warn anyway so drivers can - * fine-tune their DRM calling order, and return all - * is good. - */ - drm_warn(sched, "%s: scheduler already initialized!\n", __func__); - return 0; - } - if (submit_wq) { sched->submit_wq = submit_wq; sched->own_submit_wq = false; @@ -1054,11 +1045,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, sched->own_submit_wq = true; } - sched->rq = kmalloc(sizeof(*sched->rq), GFP_KERNEL | __GFP_ZERO); - if (!sched->rq) - goto Out_check_own; - - drm_sched_rq_init(sched, sched->rq); + drm_sched_rq_init(sched); init_waitqueue_head(&sched->job_scheduled); INIT_LIST_HEAD(&sched->pending_list); @@ -1072,12 +1059,6 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, sched->pause_submit = false; sched->ready = true; return 0; - -Out_check_own: - if (sched->own_submit_wq) - destroy_workqueue(sched->submit_wq); - drm_err(sched, "%s: Failed to setup GPU scheduler--out of memory\n", __func__); - return -ENOMEM; } EXPORT_SYMBOL(drm_sched_init); @@ -1104,7 +1085,7 @@ EXPORT_SYMBOL(drm_sched_init); void drm_sched_fini(struct drm_gpu_scheduler *sched) { - struct drm_sched_rq *rq = sched->rq; + struct drm_sched_rq *rq = &sched->rq; struct drm_sched_entity *s_entity; drm_sched_wqueue_stop(sched); @@ -1128,8 +1109,6 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched) if (sched->own_submit_wq) destroy_workqueue(sched->submit_wq); sched->ready = false; - kfree(sched->rq); - sched->rq = NULL; } EXPORT_SYMBOL(drm_sched_fini); @@ -1146,7 +1125,7 @@ void drm_sched_increase_karma(struct drm_sched_job *bad) { struct drm_gpu_scheduler *sched = bad->sched; struct drm_sched_entity *entity, *tmp; - struct drm_sched_rq *rq = sched->rq; + struct drm_sched_rq *rq = &sched->rq; /* don't change @bad's karma if it's from KERNEL RQ, * because sometimes GPU hang would cause kernel jobs (like VM updating jobs) diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c index e96c8ca9c54b..2956d719c42d 100644 --- a/drivers/gpu/drm/scheduler/sched_rq.c +++ b/drivers/gpu/drm/scheduler/sched_rq.c @@ -47,13 +47,13 @@ static void __drm_sched_rq_add_tree_locked(struct drm_sched_entity *entity, * * Initializes a scheduler runqueue. 
*/ -void drm_sched_rq_init(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq) +void drm_sched_rq_init(struct drm_gpu_scheduler *sched) { + struct drm_sched_rq *rq = &sched->rq; + spin_lock_init(&rq->lock); INIT_LIST_HEAD(&rq->entities); rq->rb_tree_root = RB_ROOT_CACHED; - rq->sched = sched; } /** @@ -71,7 +71,7 @@ drm_sched_rq_add_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity, ktime_t ts) { - struct drm_gpu_scheduler *sched; + struct drm_gpu_scheduler *sched = container_of(rq, typeof(*sched), rq); if (entity->stopped) { DRM_ERROR("Trying to push to a killed entity\n"); @@ -81,7 +81,6 @@ drm_sched_rq_add_entity(struct drm_sched_rq *rq, spin_lock(&entity->lock); spin_lock(&rq->lock); - sched = rq->sched; atomic_inc(sched->score); if (!list_empty(&entity->list)) @@ -108,6 +107,8 @@ drm_sched_rq_add_entity(struct drm_sched_rq *rq, void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity) { + struct drm_gpu_scheduler *sched = container_of(rq, typeof(*sched), rq); + lockdep_assert_held(&entity->lock); if (list_empty(&entity->list)) @@ -115,7 +116,7 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, spin_lock(&rq->lock); - atomic_dec(rq->sched->score); + atomic_dec(sched->score); list_del_init(&entity->list); if (!RB_EMPTY_NODE(&entity->rb_tree_node)) @@ -175,10 +176,10 @@ void drm_sched_rq_update_deadline(struct drm_sched_rq *rq, * Return an entity if one is found or NULL if no ready entity was found. */ struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq) +drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched) { struct drm_sched_entity *entity = NULL; + struct drm_sched_rq *rq = &sched->rq; struct rb_node *rb; spin_lock(&rq->lock); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index c68dce8af063..7b29f45aa1da 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -238,7 +238,6 @@ struct drm_sched_entity { /** * struct drm_sched_rq - queue of entities to be scheduled. * - * @sched: the scheduler to which this rq belongs to. * @lock: protects @entities, @rb_tree_root and @rr_deadline. * @entities: list of the entities to be scheduled. * @rb_tree_root: root of time based priority queue of entities for FIFO scheduling @@ -248,8 +247,6 @@ struct drm_sched_entity { * the next entity to emit commands from. 
*/ struct drm_sched_rq { - struct drm_gpu_scheduler *sched; - spinlock_t lock; /* Following members are protected by the @lock: */ ktime_t rr_deadline; @@ -518,7 +515,7 @@ struct drm_gpu_scheduler { atomic_t credit_count; long timeout; const char *name; - struct drm_sched_rq *rq; + struct drm_sched_rq rq; wait_queue_head_t job_scheduled; atomic64_t job_id_count; struct workqueue_struct *submit_wq; @@ -585,8 +582,7 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence, struct drm_sched_entity *entity); void drm_sched_fault(struct drm_gpu_scheduler *sched); -void drm_sched_rq_init(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq); +void drm_sched_rq_init(struct drm_gpu_scheduler *sched); struct drm_gpu_scheduler * drm_sched_rq_add_entity(struct drm_sched_rq *rq, @@ -598,8 +594,7 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, void drm_sched_rq_pop_entity(struct drm_sched_rq *rq, struct drm_sched_entity *entity); struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched, - struct drm_sched_rq *rq); +drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched); void drm_sched_rq_update_deadline(struct drm_sched_rq *rq, struct drm_sched_entity *entity, ktime_t deadline); From patchwork Mon Dec 30 16:52:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D521AE77188 for ; Mon, 30 Dec 2024 16:53:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5DD9F10E53D; Mon, 30 Dec 2024 16:53:36 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="eE11UjXF"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id EB15710E53C for ; Mon, 30 Dec 2024 16:53:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=B1ijnfowIcz+OPNmF5MSit1uqKERNe7n5nKccOg1KDI=; b=eE11UjXFcEGRRwRsH/RbAcP5yE zqfo2IoCAZGYlhku7TBfIjnS0rMTg18hiY8tCzIvtX7asTpjSM9edjOi/AyJYSCWe++nQHmWPv5ca yVem3LDBGXcK2fHCSZtV2qhxB6yS1CWhMnRVq4bn2vFAX7PT5ldsWJrO/36sgSU9rC0VUBstb3wKk Ezd9r5ek8W3qA+uUSPvzi/azUcNfPfCSZnsvaPYifhXAhCp0pbAqAPTlkSYaTDUJS4HHBHMeAUJuJ vyRvZuC1t4OhDyg6jHRQbykIkEt6PvsyxSMSQi4JC6p771FY5U62tjJkNty8XfRnkkRLzogqHXEGv 1Jf+y38A==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tSJ0z-009Zwm-5v; Mon, 30 Dec 2024 17:53:13 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Danilo 
Krummrich , Matthew Brost , Philipp Stanner Subject: [RFC 13/14] dma-fence: Add helper for custom fence context when merging fences Date: Mon, 30 Dec 2024 16:52:58 +0000 Message-ID: <20241230165259.95855-14-tursulin@igalia.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> References: <20241230165259.95855-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Add a dma_fence_unwrap_merge_context() helper which works exactly the same as the existing dma_fence_unwrap_merge() apart that instead of always allocating a new fence context it allows the caller to specify it. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/dma-buf/dma-fence-unwrap.c | 8 ++++---- include/linux/dma-fence-unwrap.h | 31 ++++++++++++++++++++++++++++-- 2 files changed, 33 insertions(+), 6 deletions(-) diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c index 2a059ac0ed27..fd9813a50532 100644 --- a/drivers/dma-buf/dma-fence-unwrap.c +++ b/drivers/dma-buf/dma-fence-unwrap.c @@ -80,7 +80,8 @@ static int fence_cmp(const void *_a, const void *_b) } /* Implementation for the dma_fence_merge() marco, don't use directly */ -struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, +struct dma_fence *__dma_fence_unwrap_merge(u64 context, + unsigned int num_fences, struct dma_fence **fences, struct dma_fence_unwrap *iter) { @@ -156,9 +157,8 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, count = ++j; if (count > 1) { - result = dma_fence_array_create(count, array, - dma_fence_context_alloc(1), - 1, false); + result = dma_fence_array_create(count, array, context, 1, + false); if (!result) { for (i = 0; i < count; i++) dma_fence_put(array[i]); diff --git a/include/linux/dma-fence-unwrap.h b/include/linux/dma-fence-unwrap.h index 66b1e56fbb81..12e8c43848ce 100644 --- a/include/linux/dma-fence-unwrap.h +++ b/include/linux/dma-fence-unwrap.h @@ -8,6 +8,8 @@ #ifndef __LINUX_DMA_FENCE_UNWRAP_H #define __LINUX_DMA_FENCE_UNWRAP_H +#include + struct dma_fence; /** @@ -48,7 +50,8 @@ struct dma_fence *dma_fence_unwrap_next(struct dma_fence_unwrap *cursor); for (fence = dma_fence_unwrap_first(head, cursor); fence; \ fence = dma_fence_unwrap_next(cursor)) -struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, +struct dma_fence *__dma_fence_unwrap_merge(u64 context, + unsigned int num_fences, struct dma_fence **fences, struct dma_fence_unwrap *cursors); @@ -69,7 +72,31 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, struct dma_fence *__f[] = { __VA_ARGS__ }; \ struct dma_fence_unwrap __c[ARRAY_SIZE(__f)]; \ \ - __dma_fence_unwrap_merge(ARRAY_SIZE(__f), __f, __c); \ + __dma_fence_unwrap_merge(dma_fence_context_alloc(1), \ + ARRAY_SIZE(__f), __f, __c); \ + }) + +/** + * dma_fence_unwrap_merge_context - unwrap and merge fences with context + * + * All fences given as parameters are unwrapped and merged back together as flat + * dma_fence_array. Useful if multiple containers need to be merged together. + * + * Implemented as a macro to allocate the necessary arrays on the stack and + * account the stack frame size to the caller. 
+ * + * Fence context is passed in and not automatically allocated. + * + * Returns NULL on memory allocation failure, a dma_fence object representing + * all the given fences otherwise. + */ +#define dma_fence_unwrap_merge_context(context, ...) \ + ({ \ + struct dma_fence *__f[] = { __VA_ARGS__ }; \ + struct dma_fence_unwrap __c[ARRAY_SIZE(__f)]; \ + \ + __dma_fence_unwrap_merge((context), \ + ARRAY_SIZE(__f), __f, __c); \ }) #endif From patchwork Mon Dec 30 16:52:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13923367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7C22BE7718F for ; Mon, 30 Dec 2024 16:53:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EC22C10E541; Mon, 30 Dec 2024 16:53:48 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="a30GCqFP"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id A174510E53D for ; Mon, 30 Dec 2024 16:53:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=iBD/wk4CmejBhKyFmyJET9X/tPPqndGE3JuAX26IoAE=; b=a30GCqFPCwJyHNP2wDaYQ5p7GW JCreQ0UsPyVdEjzJT0tHSV2yO+qiuZTWIjpNcXGEF9rGiWt2stlelfIELFHyI+9zD0vtTv4nYM1ba aD38Qw7yeNVtyqzCI3S61gJRY9v19sN1uZkkLktLJomd1nZAEAdB1hG6JiwQAg5EDf4336kv7KiG8 Ni2+5tybc35irMjFsoMV//dxzIj6CLR1+IPaY4OpFgk+uvlhriI3sudVTFQ97thQfBixo0piJPh8A c6sP1M6RZpZ6iz+HFYW9N0zv2yYtOQtrywfXwi5KLyovqGgfG3dcps7my00uVtU6mYc+pELYWrtyC nu6xy5/A==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tSJ0z-009Zwu-Sp; Mon, 30 Dec 2024 17:53:13 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Danilo Krummrich , Matthew Brost , Philipp Stanner Subject: [RFC 14/14] drm/sched: Resolve all job dependencies in one go Date: Mon, 30 Dec 2024 16:52:59 +0000 Message-ID: <20241230165259.95855-15-tursulin@igalia.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20241230165259.95855-1-tursulin@igalia.com> References: <20241230165259.95855-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Instead of maintaining the dependencies in an xarray and then handling them one by one every time the scheduler picks 
the same entity for execution (requiring potentially multiple worker invocations for a job to actually get submitted), let us maintain them in a dma_fence_array and allow the worker to resolve them in one go. The slightly increased cost of dma_fence_unwrap_merge() when adding many dependencies (which is not a typical case as far as I can see) is hopefully offset by reduced latency on the submission path. On the implementation side we allocate two more fence context numbers associated with each entity: one for the job dependency array and another for the entity cleanup path (seqnos are shared with the job). On the entity cleanup path we create a second dma-fence-array so we can add a single callback per job, which also handles all dependencies at once. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Danilo Krummrich Cc: Matthew Brost Cc: Philipp Stanner --- drivers/gpu/drm/scheduler/sched_entity.c | 101 +++++++++++++++------- drivers/gpu/drm/scheduler/sched_main.c | 104 ++++++++++++----------- include/drm/gpu_scheduler.h | 14 +-- 3 files changed, 126 insertions(+), 93 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index dc5105ca8381..6c7325719d75 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -21,6 +21,8 @@ * */ +#include +#include #include #include #include @@ -92,7 +94,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, spsc_queue_init(&entity->job_queue); atomic_set(&entity->fence_seq, 0); - entity->fence_context = dma_fence_context_alloc(2); + entity->fence_context = dma_fence_context_alloc(4); return 0; } @@ -183,39 +185,78 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, { struct drm_sched_job *job = container_of(cb, struct drm_sched_job, finish_cb); - unsigned long index; dma_fence_put(f); - /* Wait for all dependencies to avoid data corruptions */ - xa_for_each(&job->dependencies, index, f) { - struct drm_sched_fence *s_fence = to_drm_sched_fence(f); - - if (s_fence && f == &s_fence->scheduled) { - /* The dependencies array had a reference on the scheduled - * fence, and the finished fence refcount might have - * dropped to zero. Use dma_fence_get_rcu() so we get - * a NULL fence in that case. + /* Wait for all dependencies to avoid data corruption */ + if (!dma_fence_add_callback(job->deps_finished, &job->finish_cb, + drm_sched_entity_kill_jobs_cb)) + return; + + INIT_WORK(&job->work, drm_sched_entity_kill_jobs_work); + schedule_work(&job->work); +} + +static struct dma_fence *job_deps_finished(struct drm_sched_job *job) +{ + struct dma_fence_array *array; + unsigned int i, j, num_fences; + struct dma_fence **fences; + + if (job->deps) { + array = to_dma_fence_array(job->deps); + + if (array) + num_fences = array->num_fences; + else + num_fences = 1; + } else { + num_fences = 0; + } + + if (!num_fences) + return dma_fence_get_stub(); + + fences = kmalloc_array(num_fences, sizeof(*fences), GFP_KERNEL); + if (!fences) + return NULL; + + if (num_fences == 1) + fences[0] = job->deps; + else + memcpy(fences, array->fences, num_fences * sizeof(*fences)); + + for (i = 0, j = 0; i < num_fences; i++) { + struct dma_fence *fence = dma_fence_get(fences[i]); + struct drm_sched_fence *s_fence; + + s_fence = to_drm_sched_fence(fence); + if (s_fence && fence == &s_fence->scheduled) { + /* + * The dependencies array had a reference on the + * scheduled fence, and the finished fence refcount + * might have dropped to zero.
Use dma_fence_get_rcu() + * so we get a NULL fence in that case. */ - f = dma_fence_get_rcu(&s_fence->finished); + fence = dma_fence_get_rcu(&s_fence->finished); - /* Now that we have a reference on the finished fence, + /* + * Now that we have a reference on the finished fence, * we can release the reference the dependencies array * had on the scheduled fence. */ dma_fence_put(&s_fence->scheduled); } - xa_erase(&job->dependencies, index); - if (f && !dma_fence_add_callback(f, &job->finish_cb, - drm_sched_entity_kill_jobs_cb)) - return; - - dma_fence_put(f); + if (fence) + fences[j++] = fence; } - INIT_WORK(&job->work, drm_sched_entity_kill_jobs_work); - schedule_work(&job->work); + array = dma_fence_array_create(j, fences, + job->entity->fence_context + 3, 1, + false); + + return array ? &array->base : NULL; } /* Remove the entity from the scheduler and kill all pending jobs */ @@ -242,6 +283,11 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity) struct drm_sched_fence *s_fence = job->s_fence; dma_fence_get(&s_fence->finished); + + job->deps_finished = job_deps_finished(job); + if (WARN_ON_ONCE(!job->deps_finished)) + continue; + if (!prev || dma_fence_add_callback(prev, &job->finish_cb, drm_sched_entity_kill_jobs_cb)) drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb); @@ -435,17 +481,8 @@ static struct dma_fence * drm_sched_job_dependency(struct drm_sched_job *job, struct drm_sched_entity *entity) { - struct dma_fence *f; - - /* We keep the fence around, so we can iterate over all dependencies - * in drm_sched_entity_kill_jobs_cb() to ensure all deps are signaled - * before killing the job. - */ - f = xa_load(&job->dependencies, job->last_dependency); - if (f) { - job->last_dependency++; - return dma_fence_get(f); - } + if (job->deps && !dma_fence_is_signaled(job->deps)) + return dma_fence_get(job->deps); if (job->sched->ops->prepare_job) return job->sched->ops->prepare_job(job, entity); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 67bf0bec3309..869d831e94aa 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -69,6 +69,8 @@ #include #include #include +#include +#include #include #include @@ -564,8 +566,6 @@ int drm_sched_job_init(struct drm_sched_job *job, INIT_LIST_HEAD(&job->list); - xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC); - return 0; } EXPORT_SYMBOL(drm_sched_job_init); @@ -614,46 +614,34 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job, struct dma_fence *fence) { struct drm_sched_entity *entity = job->entity; - struct dma_fence *entry; - unsigned long index; - u32 id = 0; - int ret; + struct dma_fence *deps; + int ret = 0; if (!fence) return 0; - if (fence->context == entity->fence_context || - fence->context == entity->fence_context + 1) { + if (fence->context >= entity->fence_context && + fence->context <= entity->fence_context + 3) { /* * Fence is a scheduled/finished fence from a job * which belongs to the same entity, we can ignore * fences from ourself */ - dma_fence_put(fence); - return 0; + goto out_put; } - /* Deduplicate if we already depend on a fence from the same context. - * This lets the size of the array of deps scale with the number of - * engines involved, rather than the number of BOs. 
- */ - xa_for_each(&job->dependencies, index, entry) { - if (entry->context != fence->context) - continue; - - if (dma_fence_is_later(fence, entry)) { - dma_fence_put(entry); - xa_store(&job->dependencies, index, fence, GFP_KERNEL); - } else { - dma_fence_put(fence); - } - return 0; + deps = dma_fence_unwrap_merge_context(entity->fence_context + 2, + job->deps, fence); + if (!deps) { + ret = -ENOMEM; + goto out_put; } - ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL); - if (ret != 0) - dma_fence_put(fence); + dma_fence_put(job->deps); + job->deps = deps; +out_put: + dma_fence_put(fence); return ret; } EXPORT_SYMBOL(drm_sched_job_add_dependency); @@ -745,26 +733,49 @@ int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job, } EXPORT_SYMBOL(drm_sched_job_add_implicit_dependencies); +static bool +can_replace_fence(struct drm_gpu_scheduler *sched, + struct drm_sched_fence *s_fence, + struct dma_fence *fence) +{ + if (fence->error || !s_fence || s_fence->sched != sched || + test_bit(DRM_SCHED_FENCE_DONT_PIPELINE, &fence->flags)) + return false; + + return true; +} + void drm_sched_job_prepare_dependecies(struct drm_sched_job *job) { struct drm_gpu_scheduler *sched = job->sched; + struct drm_sched_fence *s_fence; + struct dma_fence_array *array; struct dma_fence *fence; - unsigned long index; - xa_for_each(&job->dependencies, index, fence) { - struct drm_sched_fence *s_fence = to_drm_sched_fence(fence); + if (!job->deps) + return; - if (fence->error || !s_fence || s_fence->sched != sched || - test_bit(DRM_SCHED_FENCE_DONT_PIPELINE, &fence->flags)) - continue; + array = to_dma_fence_array(job->deps); + if (!array) { + fence = job->deps; + s_fence = to_drm_sched_fence(fence); + if (can_replace_fence(sched, s_fence, fence)) { + job->deps = dma_fence_get(&s_fence->scheduled); + dma_fence_put(fence); + } + } else { + unsigned i; - /* - * Fence is from the same scheduler, only need to wait for - * it to be scheduled. 
- */ - xa_store(&job->dependencies, index, - dma_fence_get(&s_fence->scheduled), GFP_KERNEL); - dma_fence_put(fence); + for (i = 0; i < array->num_fences; i++) { + fence = array->fences[i]; + s_fence = to_drm_sched_fence(fence); + if (can_replace_fence(sched, s_fence, fence)) { + array->fences[i] = + dma_fence_get(&s_fence->scheduled); + dma_fence_put(fence); + } + } + array->base.seqno = job->s_fence->scheduled.seqno; } } @@ -783,24 +794,17 @@ void drm_sched_job_prepare_dependecies(struct drm_sched_job *job) */ void drm_sched_job_cleanup(struct drm_sched_job *job) { - struct dma_fence *fence; - unsigned long index; - if (kref_read(&job->s_fence->finished.refcount)) { /* drm_sched_job_arm() has been called */ dma_fence_put(&job->s_fence->finished); + dma_fence_put(job->deps); + dma_fence_put(job->deps_finished); } else { /* aborted job before committing to run it */ drm_sched_fence_free(job->s_fence); } job->s_fence = NULL; - - xa_for_each(&job->dependencies, index, fence) { - dma_fence_put(fence); - } - xa_destroy(&job->dependencies); - } EXPORT_SYMBOL(drm_sched_job_cleanup); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 7b29f45aa1da..525f1c1dd9c9 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -27,7 +27,6 @@ #include #include #include -#include #include #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000) @@ -360,17 +359,10 @@ struct drm_sched_job { enum drm_sched_priority s_priority; struct drm_sched_entity *entity; struct dma_fence_cb cb; - /** - * @dependencies: - * - * Contains the dependencies as struct dma_fence for this job, see - * drm_sched_job_add_dependency() and - * drm_sched_job_add_implicit_dependencies(). - */ - struct xarray dependencies; - /** @last_dependency: tracks @dependencies as they signal */ - unsigned long last_dependency; + struct dma_fence stub_fence; + struct dma_fence *deps; + struct dma_fence *deps_finished; /** * @submit_ts: