From patchwork Thu May 2 18:38:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Brezillon X-Patchwork-Id: 13651994 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 37645C4345F for ; Thu, 2 May 2024 18:38:19 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AD0F910E469; Thu, 2 May 2024 18:38:18 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=collabora.com header.i=@collabora.com header.b="qR0QXv3J"; dkim-atps=neutral Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8A8C010E469 for ; Thu, 2 May 2024 18:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1714675095; bh=07oaTuHE3BVClNZd6QA9gATtgBKtfrWT13Xo3pNnSPU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qR0QXv3J0PLs9jbYbB+wZi4T6UQTb16ggeXU5NtReIybytp120G+D8ARbKg6CcL6o DFVnxz+XxLWeGB4MYdiJouUukLmWKKJxygaB1EwutgMug5wTxQjfo3+rRXOlNcakOa Rf/qkzMZ/AH6+0z1FNtKdUTYlNbR7P3UYqi7xB1YIe1zcNgXZKrWvyyxQEjeBzqM97 Kwyok+xk57XGrAe0XIk0w7HYunm0ezbEpkx/uZxZIwmMr0q+5zyfnrVQZ3+lrkWjMO A+loRg8GFFGNmiq3nYgdr6GEJiwRmeHOw11T02FuX5E6bar4grOBK/uOqVRvHmrb+q 1MeaPyajO1NSg== Received: from localhost.localdomain (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by madrid.collaboradmins.com (Postfix) 
with ESMTPSA id 5D71C3782121; Thu, 2 May 2024 18:38:15 +0000 (UTC) From: Boris Brezillon To: Boris Brezillon , Steven Price , Liviu Dudau , =?utf-8?q?Adri=C3=A1n_Larumbe?= Cc: Christopher Healy , dri-devel@lists.freedesktop.org, kernel@collabora.com Subject: [PATCH 1/4] drm/panthor: Force an immediate reset on unrecoverable faults Date: Thu, 2 May 2024 20:38:09 +0200 Message-ID: <20240502183813.1612017-2-boris.brezillon@collabora.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240502183813.1612017-1-boris.brezillon@collabora.com> References: <20240502183813.1612017-1-boris.brezillon@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" If the FW reports an unrecoverable fault, we need to reset the GPU before we can start re-using it again. 
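For readers who want the control flow without the driver context, here is a rough user-space model of the decision this patch adds to the fatal-event path. It is only a sketch: the 0x41 value comes from the diff below, but struct model_sched, model_exception_type() and model_process_fatal_event() are hypothetical names, and the assumption that the exception type sits in the low 8 bits of CS_FATAL should be checked against the driver headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the diff; every other name here is hypothetical. */
#define MODEL_EXCEPTION_CS_UNRECOVERABLE 0x41u

/* Stand-in for the bits of scheduler/device state the handler touches. */
struct model_sched {
	bool reset_scheduled;	/* models panthor_device_schedule_reset() */
	bool tick_queued;	/* models the pending tick work */
};

/* Assumption: the exception type is encoded in the low 8 bits. */
static uint32_t model_exception_type(uint32_t fatal)
{
	return fatal & 0xffu;
}

static void model_process_fatal_event(struct model_sched *s, uint32_t fatal)
{
	assert(s);

	if (model_exception_type(fatal) == MODEL_EXCEPTION_CS_UNRECOVERABLE) {
		/* Unrecoverable: ask for a reset and stop scheduling groups. */
		s->reset_scheduled = true;
		s->tick_queued = false;
	} else {
		/* Any other fatal event: let the tick deal with the group. */
		s->tick_queued = true;
	}
}
```

Feeding 0x41 to model_process_fatal_event() leaves reset_scheduled set and tick_queued cleared; any other fatal code only sets tick_queued.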
Signed-off-by: Boris Brezillon Reviewed-by: Steven Price Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_device.c | 1 + drivers/gpu/drm/panthor/panthor_device.h | 1 + drivers/gpu/drm/panthor/panthor_sched.c | 11 ++++++++++- 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c index 75276cbeba20..4c5b54e7abb7 100644 --- a/drivers/gpu/drm/panthor/panthor_device.c +++ b/drivers/gpu/drm/panthor/panthor_device.c @@ -293,6 +293,7 @@ static const struct panthor_exception_info panthor_exception_infos[] = { PANTHOR_EXCEPTION(ACTIVE), PANTHOR_EXCEPTION(CS_RES_TERM), PANTHOR_EXCEPTION(CS_CONFIG_FAULT), + PANTHOR_EXCEPTION(CS_UNRECOVERABLE), PANTHOR_EXCEPTION(CS_ENDPOINT_FAULT), PANTHOR_EXCEPTION(CS_BUS_FAULT), PANTHOR_EXCEPTION(CS_INSTR_INVALID), diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h index 2fdd671b38fd..e388c0472ba7 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -216,6 +216,7 @@ enum drm_panthor_exception_type { DRM_PANTHOR_EXCEPTION_CS_RES_TERM = 0x0f, DRM_PANTHOR_EXCEPTION_MAX_NON_FAULT = 0x3f, DRM_PANTHOR_EXCEPTION_CS_CONFIG_FAULT = 0x40, + DRM_PANTHOR_EXCEPTION_CS_UNRECOVERABLE = 0x41, DRM_PANTHOR_EXCEPTION_CS_ENDPOINT_FAULT = 0x44, DRM_PANTHOR_EXCEPTION_CS_BUS_FAULT = 0x48, DRM_PANTHOR_EXCEPTION_CS_INSTR_INVALID = 0x49, diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index 7f16a4a14e9a..1d2708c3ab0a 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -1281,7 +1281,16 @@ cs_slot_process_fatal_event_locked(struct panthor_device *ptdev, if (group) group->fatal_queues |= BIT(cs_id); - sched_queue_delayed_work(sched, tick, 0); + if (CS_EXCEPTION_TYPE(fatal) == DRM_PANTHOR_EXCEPTION_CS_UNRECOVERABLE) { + /* If this exception is unrecoverable, queue a reset, and 
make + * sure we stop scheduling groups until the reset has happened. + */ + panthor_device_schedule_reset(ptdev); + cancel_delayed_work(&sched->tick_work); + } else { + sched_queue_delayed_work(sched, tick, 0); + } + drm_warn(&ptdev->base, "CSG slot %d CS slot: %d\n" "CS_FATAL.EXCEPTION_TYPE: 0x%x (%s)\n" From patchwork Thu May 2 18:38:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Brezillon X-Patchwork-Id: 13651997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 947A7C25B10 for ; Thu, 2 May 2024 18:38:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E9BB210EBE5; Thu, 2 May 2024 18:38:19 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=collabora.com header.i=@collabora.com header.b="OX/vGK2x"; dkim-atps=neutral Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) by gabe.freedesktop.org (Postfix) with ESMTPS id BCA4D10E568 for ; Thu, 2 May 2024 18:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1714675096; bh=2L24G+ekzA6uoxP0Wg1Jdp3eXNqrxTsU2kd/JSWX/sQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OX/vGK2xeiUMCtyvG0MvubFpPFNpEIaI+PvWGM2/3rSzXfjEWEHCNhsI7/2MDJFQl zOK0agI7/7hfo9aS8ZL7L4mUlAV5d/MdcbKKFnH+lOh6PFDivokgkYaQAaI3wOPj7O TsEqea50YNql+O4RKF7MMr/spsn8uZH/kO1UQKvcvMBUoLoqfwoVgMJZMayGH2gLpB ziGsasuUcRQFy553GPkv+tw3m5X+grkcRPgW+3vFv2D8Wyw9h987uO3D6y6gfBENU2 fTajOqh96jiopwyPhJq/Gk3pG0M1lYuZADjEFIGt5gDUBUocYhcK+VSr5InGBq3wCN kdxIwg+tMwyaA== Received: from 
localhost.localdomain (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by madrid.collaboradmins.com (Postfix) with ESMTPSA id 0A9563782135; Thu, 2 May 2024 18:38:15 +0000 (UTC) From: Boris Brezillon To: Boris Brezillon , Steven Price , Liviu Dudau , =?utf-8?q?Adri=C3=A1n_Larumbe?= Cc: Christopher Healy , dri-devel@lists.freedesktop.org, kernel@collabora.com Subject: [PATCH 2/4] drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level Date: Thu, 2 May 2024 20:38:10 +0200 Message-ID: <20240502183813.1612017-3-boris.brezillon@collabora.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240502183813.1612017-1-boris.brezillon@collabora.com> References: <20240502183813.1612017-1-boris.brezillon@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Avoids use-after-free situations when panthor_fw_unplug() is called and the kernel BO was mapped to the FW VM. 
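The ownership rule can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver code: struct model_vm, struct model_kbo and the model_*() helpers are hypothetical, and a "destroyed" flag replaces the real VM teardown so the lifetime can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted VM; "destroyed" stands in for the real release. */
struct model_vm {
	int refcount;
	bool destroyed;
};

/* Hypothetical kernel BO that remembers the VM it was mapped to. */
struct model_kbo {
	struct model_vm *vm;
};

static struct model_vm *model_vm_get(struct model_vm *vm)
{
	assert(vm);
	vm->refcount++;
	return vm;
}

static void model_vm_put(struct model_vm *vm)
{
	if (vm && --vm->refcount == 0)
		vm->destroyed = true;
}

static struct model_kbo *model_kbo_create(struct model_vm *vm)
{
	struct model_kbo *bo = calloc(1, sizeof(*bo));

	if (!bo)
		return NULL;

	/* The BO takes its own reference, so the VM outlives the mapping. */
	bo->vm = model_vm_get(vm);
	return bo;
}

static void model_kbo_destroy(struct model_kbo *bo)
{
	if (!bo)
		return;

	/* The unmap would use bo->vm here, which is still valid. */
	model_vm_put(bo->vm);
	free(bo);
}
```

In this model the VM is only marked destroyed once the last BO is gone, even if the original owner dropped its reference first, which is the ordering problem the patch addresses on unplug.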
Signed-off-by: Boris Brezillon Reviewed-by: Steven Price Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_fw.c | 4 ++-- drivers/gpu/drm/panthor/panthor_gem.c | 8 +++++--- drivers/gpu/drm/panthor/panthor_gem.h | 8 ++++++-- drivers/gpu/drm/panthor/panthor_heap.c | 8 ++++---- drivers/gpu/drm/panthor/panthor_sched.c | 11 +++++------ 5 files changed, 22 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c index 181395e2859a..b41685304a83 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -453,7 +453,7 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev, ret = panthor_kernel_bo_vmap(mem); if (ret) { - panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), mem); + panthor_kernel_bo_destroy(mem); return ERR_PTR(ret); } @@ -1133,7 +1133,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev) panthor_fw_stop(ptdev); list_for_each_entry(section, &ptdev->fw->sections, node) - panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), section->mem); + panthor_kernel_bo_destroy(section->mem); /* We intentionally don't call panthor_vm_idle() and let * panthor_mmu_unplug() release the AS we acquired with diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c index d6483266d0c2..38f560864879 100644 --- a/drivers/gpu/drm/panthor/panthor_gem.c +++ b/drivers/gpu/drm/panthor/panthor_gem.c @@ -26,18 +26,18 @@ static void panthor_gem_free_object(struct drm_gem_object *obj) /** * panthor_kernel_bo_destroy() - Destroy a kernel buffer object - * @vm: The VM this BO was mapped to. * @bo: Kernel buffer object to destroy. If NULL or an ERR_PTR(), the destruction * is skipped. 
*/ -void panthor_kernel_bo_destroy(struct panthor_vm *vm, - struct panthor_kernel_bo *bo) +void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo) { + struct panthor_vm *vm; int ret; if (IS_ERR_OR_NULL(bo)) return; + vm = bo->vm; panthor_kernel_bo_vunmap(bo); if (drm_WARN_ON(bo->obj->dev, @@ -53,6 +53,7 @@ void panthor_kernel_bo_destroy(struct panthor_vm *vm, drm_gem_object_put(bo->obj); out_free_bo: + panthor_vm_put(vm); kfree(bo); } @@ -106,6 +107,7 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm, if (ret) goto err_free_va; + kbo->vm = panthor_vm_get(vm); bo->exclusive_vm_root_gem = panthor_vm_root_gem(vm); drm_gem_object_get(bo->exclusive_vm_root_gem); bo->base.base.resv = bo->exclusive_vm_root_gem->resv; diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h index 3bccba394d00..e43021cf6d45 100644 --- a/drivers/gpu/drm/panthor/panthor_gem.h +++ b/drivers/gpu/drm/panthor/panthor_gem.h @@ -61,6 +61,11 @@ struct panthor_kernel_bo { */ struct drm_gem_object *obj; + /** + * @vm: VM this private buffer is attached to. + */ + struct panthor_vm *vm; + /** * @va_node: VA space allocated to this GEM. 
*/ @@ -136,7 +141,6 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm, size_t size, u32 bo_flags, u32 vm_map_flags, u64 gpu_va); -void panthor_kernel_bo_destroy(struct panthor_vm *vm, - struct panthor_kernel_bo *bo); +void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo); #endif /* __PANTHOR_GEM_H__ */ diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c index 143fa35f2e74..65921296a18c 100644 --- a/drivers/gpu/drm/panthor/panthor_heap.c +++ b/drivers/gpu/drm/panthor/panthor_heap.c @@ -127,7 +127,7 @@ static void panthor_free_heap_chunk(struct panthor_vm *vm, heap->chunk_count--; mutex_unlock(&heap->lock); - panthor_kernel_bo_destroy(vm, chunk->bo); + panthor_kernel_bo_destroy(chunk->bo); kfree(chunk); } @@ -183,7 +183,7 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev, return 0; err_destroy_bo: - panthor_kernel_bo_destroy(vm, chunk->bo); + panthor_kernel_bo_destroy(chunk->bo); err_free_chunk: kfree(chunk); @@ -391,7 +391,7 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool, mutex_unlock(&heap->lock); if (removed) { - panthor_kernel_bo_destroy(pool->vm, chunk->bo); + panthor_kernel_bo_destroy(chunk->bo); kfree(chunk); ret = 0; } else { @@ -587,7 +587,7 @@ void panthor_heap_pool_destroy(struct panthor_heap_pool *pool) drm_WARN_ON(&pool->ptdev->base, panthor_heap_destroy_locked(pool, i)); if (!IS_ERR_OR_NULL(pool->gpu_contexts)) - panthor_kernel_bo_destroy(pool->vm, pool->gpu_contexts); + panthor_kernel_bo_destroy(pool->gpu_contexts); /* Reflects the fact the pool has been destroyed. 
*/ pool->vm = NULL; diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index 1d2708c3ab0a..6ea094b00cf9 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -826,8 +826,8 @@ static void group_free_queue(struct panthor_group *group, struct panthor_queue * panthor_queue_put_syncwait_obj(queue); - panthor_kernel_bo_destroy(group->vm, queue->ringbuf); - panthor_kernel_bo_destroy(panthor_fw_vm(group->ptdev), queue->iface.mem); + panthor_kernel_bo_destroy(queue->ringbuf); + panthor_kernel_bo_destroy(queue->iface.mem); kfree(queue); } @@ -837,15 +837,14 @@ static void group_release_work(struct work_struct *work) struct panthor_group *group = container_of(work, struct panthor_group, release_work); - struct panthor_device *ptdev = group->ptdev; u32 i; for (i = 0; i < group->queue_count; i++) group_free_queue(group, group->queues[i]); - panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), group->suspend_buf); - panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), group->protm_suspend_buf); - panthor_kernel_bo_destroy(group->vm, group->syncobjs); + panthor_kernel_bo_destroy(group->suspend_buf); + panthor_kernel_bo_destroy(group->protm_suspend_buf); + panthor_kernel_bo_destroy(group->syncobjs); panthor_vm_put(group->vm); kfree(group); From patchwork Thu May 2 18:38:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Brezillon X-Patchwork-Id: 13651995 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0CE54C4345F for ; Thu, 2 May 2024 18:38:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by 
gabe.freedesktop.org (Postfix) with ESMTP id C043010E875; Thu, 2 May 2024 18:38:19 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=collabora.com header.i=@collabora.com header.b="48jhSxjD"; dkim-atps=neutral Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4C5F810E469 for ; Thu, 2 May 2024 18:38:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1714675097; bh=DjRIkPw7KmW0hMerrbyhJ7T5a8eVe4o2K3/wZxbRVtM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=48jhSxjDqtAg7gZ5LPX3YuB9oWbSQr70CXyczD1kOJg6n26NKKqkNm0WT9N5h46z6 uTTFQ+buWR2QitQS0dfMbUzf2//OQQECH4uYQf+rTPorexT1T5dgR6YE412V3T1SJW q/ejKFfp+UJg9nnQ9vxMbUNGf6+mOVPoj5BKniwgWy4QVeEZvFjvnJLOqgj6hBny9d ecWYss0mmx6yHcN8m5MaXbLXhcVQ4SJwA95ZSdffeNMS64Q+JV0g5UfOTzzS8EwEiQ gYbOHEM8DMhlykYMX3MY1gbjbcqvLSq2WvMNagqYGp2HrhxCQlo1gNv88/lQ3o/UXw uAGuO2DM9epmQ== Received: from localhost.localdomain (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by madrid.collaboradmins.com (Postfix) with ESMTPSA id AD5B0378214B; Thu, 2 May 2024 18:38:16 +0000 (UTC) From: Boris Brezillon To: Boris Brezillon , Steven Price , Liviu Dudau , =?utf-8?q?Adri=C3=A1n_Larumbe?= Cc: Christopher Healy , dri-devel@lists.freedesktop.org, kernel@collabora.com Subject: [PATCH 3/4] drm/panthor: Reset the FW VM to NULL on unplug Date: Thu, 2 May 2024 20:38:11 +0200 Message-ID: <20240502183813.1612017-4-boris.brezillon@collabora.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240502183813.1612017-1-boris.brezillon@collabora.com> References: <20240502183813.1612017-1-boris.brezillon@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This way we get NULL derefs instead of use-after-free if the FW VM is referenced after the device has been unplugged. Signed-off-by: Boris Brezillon Reviewed-by: Steven Price Acked-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_fw.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c index b41685304a83..93165961a6b5 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -1141,6 +1141,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev) * state to keep the active_refcnt balanced. */ panthor_vm_put(ptdev->fw->vm); + ptdev->fw->vm = NULL; panthor_gpu_power_off(ptdev, L2, ptdev->gpu_info.l2_present, 20000); } From patchwork Thu May 2 18:38:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Brezillon X-Patchwork-Id: 13651998 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 48D1DC4345F for ; Thu, 2 May 2024 18:38:29 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AD81611264A; Thu, 2 May 2024 18:38:22 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=collabora.com header.i=@collabora.com header.b="CsMlODdo"; dkim-atps=neutral Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) by 
gabe.freedesktop.org (Postfix) with ESMTPS id E94E510E4AA for ; Thu, 2 May 2024 18:38:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1714675097; bh=sGq86Z4cvZoFrppSbDn0yJqVzrXl1ZvAmMsBHw4vLzY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CsMlODdoLq4+AH8NUZPUrFXbxZrZeLxRdTygBqC0EN9fVHMKxZAXpQriYiMm/fAO4 kKuuPmSjFNC07Yk8PtKJNsGb2KJEcA5tRvfiAqCJiis6unPLmXxMg2CN/sGelU9zpw UEAqPehZDroWplFeW4HXJaQrhQRbSfZ/J4Zv7K37IVOtI9xTAH7aNge8MnxpN364ko 08RrpS6ggzM/46MxAs1fvItmWmc9fDMe+0cYEsrqwtTqxkDQosiIaiU7u/FHa9I++F 6kAQYZrx9DAAXIbDtc2+XvwLwWPMhK6NBO4TWepu9snrHOOOnbRSHyAf5ddc6R0LsC 1ZBnoiN4o+7hw== Received: from localhost.localdomain (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by madrid.collaboradmins.com (Postfix) with ESMTPSA id 5BFC0378214D; Thu, 2 May 2024 18:38:17 +0000 (UTC) From: Boris Brezillon To: Boris Brezillon , Steven Price , Liviu Dudau , =?utf-8?q?Adri=C3=A1n_Larumbe?= Cc: Christopher Healy , dri-devel@lists.freedesktop.org, kernel@collabora.com Subject: [PATCH 4/4] drm/panthor: Call panthor_sched_post_reset() even if the reset failed Date: Thu, 2 May 2024 20:38:12 +0200 Message-ID: <20240502183813.1612017-5-boris.brezillon@collabora.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240502183813.1612017-1-boris.brezillon@collabora.com> References: <20240502183813.1612017-1-boris.brezillon@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We need to undo what was done in panthor_sched_pre_reset() even if the reset failed. 
We just flag all previously running groups as terminated when that happens to unblock things. Signed-off-by: Boris Brezillon Reviewed-by: Steven Price Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_device.c | 7 +------ drivers/gpu/drm/panthor/panthor_sched.c | 19 ++++++++++++++----- drivers/gpu/drm/panthor/panthor_sched.h | 2 +- 3 files changed, 16 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c index 4c5b54e7abb7..4082c8f2951d 100644 --- a/drivers/gpu/drm/panthor/panthor_device.c +++ b/drivers/gpu/drm/panthor/panthor_device.c @@ -129,13 +129,8 @@ static void panthor_device_reset_work(struct work_struct *work) panthor_gpu_l2_power_on(ptdev); panthor_mmu_post_reset(ptdev); ret = panthor_fw_post_reset(ptdev); - if (ret) - goto out_dev_exit; - atomic_set(&ptdev->reset.pending, 0); - panthor_sched_post_reset(ptdev); - -out_dev_exit: + panthor_sched_post_reset(ptdev, ret != 0); drm_dev_exit(cookie); if (ret) { diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index 6ea094b00cf9..fc43ff62c77d 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -2728,15 +2728,22 @@ void panthor_sched_pre_reset(struct panthor_device *ptdev) mutex_unlock(&sched->reset.lock); } -void panthor_sched_post_reset(struct panthor_device *ptdev) +void panthor_sched_post_reset(struct panthor_device *ptdev, bool reset_failed) { struct panthor_scheduler *sched = ptdev->scheduler; struct panthor_group *group, *group_tmp; mutex_lock(&sched->reset.lock); - list_for_each_entry_safe(group, group_tmp, &sched->reset.stopped_groups, run_node) + list_for_each_entry_safe(group, group_tmp, &sched->reset.stopped_groups, run_node) { + /* Consider all previously running groups as terminated if the + * reset failed. 
+ */ + if (reset_failed) + group->state = PANTHOR_CS_GROUP_TERMINATED; + panthor_group_start(group); + } /* We're done resetting the GPU, clear the reset.in_progress bit so we can * kick the scheduler. @@ -2744,9 +2751,11 @@ void panthor_sched_post_reset(struct panthor_device *ptdev) atomic_set(&sched->reset.in_progress, false); mutex_unlock(&sched->reset.lock); - sched_queue_delayed_work(sched, tick, 0); - - sched_queue_work(sched, sync_upd); + /* No need to queue a tick and update syncs if the reset failed. */ + if (!reset_failed) { + sched_queue_delayed_work(sched, tick, 0); + sched_queue_work(sched, sync_upd); + } } static void group_sync_upd_work(struct work_struct *work) diff --git a/drivers/gpu/drm/panthor/panthor_sched.h b/drivers/gpu/drm/panthor/panthor_sched.h index 66438b1f331f..3a30d2328b30 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.h +++ b/drivers/gpu/drm/panthor/panthor_sched.h @@ -40,7 +40,7 @@ void panthor_group_pool_destroy(struct panthor_file *pfile); int panthor_sched_init(struct panthor_device *ptdev); void panthor_sched_unplug(struct panthor_device *ptdev); void panthor_sched_pre_reset(struct panthor_device *ptdev); -void panthor_sched_post_reset(struct panthor_device *ptdev); +void panthor_sched_post_reset(struct panthor_device *ptdev, bool reset_failed); void panthor_sched_suspend(struct panthor_device *ptdev); void panthor_sched_resume(struct panthor_device *ptdev);
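
The post-reset behaviour introduced by the last patch can be summarised with a small user-space model. This is a sketch only: the model_* types and model_post_reset() are hypothetical, the stopped-groups list is reduced to an array, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified group state. */
enum model_group_state {
	MODEL_GROUP_ACTIVE,
	MODEL_GROUP_TERMINATED,
};

struct model_group {
	enum model_group_state state;
	bool started;		/* models panthor_group_start() being called */
};

/* Hypothetical stand-in for the scheduler fields touched after a reset. */
struct model_reset_sched {
	bool reset_in_progress;
	bool tick_queued;
	bool sync_upd_queued;
};

static void model_post_reset(struct model_reset_sched *s,
			     struct model_group *groups, size_t count,
			     bool reset_failed)
{
	assert(s);

	for (size_t i = 0; i < count; i++) {
		/* A failed reset means these groups can never run again. */
		if (reset_failed)
			groups[i].state = MODEL_GROUP_TERMINATED;

		/* Always undo what the pre-reset step did to the group. */
		groups[i].started = true;
	}

	/* The reset is over either way. */
	s->reset_in_progress = false;

	/* Only kick the scheduler when the GPU is usable again. */
	if (!reset_failed) {
		s->tick_queued = true;
		s->sync_upd_queued = true;
	}
}
```

With reset_failed set, every group in the model ends up terminated but still restarted, and no tick or sync update is queued; with it cleared, group states are left alone and both works are queued.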