From patchwork Mon Sep 23 23:06:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Adrián Larumbe
X-Patchwork-Id: 13809997
From: Adrián Larumbe
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
 Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
 Sumit Semwal, Christian König
Cc: kernel@collabora.com, Adrián Larumbe,
 dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org
Subject: [PATCH v8 0/5] Support fdinfo runtime and memory stats on Panthor
Date: Tue, 24 Sep 2024 00:06:20 +0100
Message-ID: <20240923230912.2207320-1-adrian.larumbe@collabora.com>
X-Mailer: git-send-email 2.46.0
List-Id: Direct Rendering Infrastructure - Development

This patch series enables userspace utilities like gputop and nvtop to
query a render context's fdinfo file and figure out rates of engine
and memory utilisation.

Previous discussion can be found at
https://lore.kernel.org/dri-devel/20240920234436.207563-1-adrian.larumbe@collabora.com/

Changelog:
v8:
 - Fixed uninitialised stack variable bug that was triggering an invalid
   memory reference.
 - Added a few R-b tags to commits.

v7:
 - Fixed some documentation and sign mismatch errors reported by the
   kernel test bot.
 - Defined convenience macros for specifying CS instructions according to
   their profiled status.
 - Explicitly initialised instruction count for structure containing a
   job's instructions when calculating its amount of credits for the
   scheduler.
 - Some minor cosmetic nits.

v6:
 - Addressed some nits and style issues.
 - Enforced static assert equality of instruction buffer when calculating
   job credits or copying them into the ringbuffer.
 - Added an explanation of how job credits and profiled job size are
   calculated.
 - Broke down fdinfo enablement patch into two, one of them dealing with
   adding support for calculating the current and top operating device
   frequencies.
 - Fixed race at the time drm file-wide profiling stats are gathered from
   groups.

v5:
 - Moved profiling information slots into a per-queue BO and away from
   syncobjs.
 - Decide on size of profiling slots BO from size of CS for minimal
   profiled job.
 - Turn job and device profiling flag into a bit mask so that individual
   metrics can be enabled separately.
 - Shrunk ringbuffer slot size to that of a cache line.
 - Track profiling slot indices separately from the job's queue
   ringbuffer's.
 - Emit CS instructions one by one and tag them depending on profiling
   mask.
 - New helper for calculating job credits depending on profiling flags.
 - Add Documentation file for sysfs profiling knob.
 - fdinfo will only show engines or cycles tags if these are respectively
   enabled.

v4:
 - Fixed wrong assignment location for frequency values in Panthor's
   devfreq.
 - Removed the last two commits about registering size of internal BOs.
 - Rearranged patch series so that sysfs knob is done last and all the
   previous time sampling and fdinfo show dependencies are already in
   place.

v3:
 - Fixed some nits and removed useless bounds check in panthor_sched.c
 - Added support for sysfs profiling knob and optional job accounting.
 - Added new patches for calculating size of internal BOs.

v2:
 - Split original first patch in two, one for FW CS cycle and timestamp
   calculations and job accounting memory management, and a second one
   that enables fdinfo.
 - Moved NUM_INSTRS_PER_SLOT to the file prelude.
 - Removed nelem variable from the group's struct definition.
 - Precompute size of group's syncobj BO to avoid code duplication.
 - Some minor nits.

Adrián Larumbe (5):
  drm/panthor: introduce job cycle and timestamp accounting
  drm/panthor: record current and maximum device clock frequencies
  drm/panthor: add DRM fdinfo support
  drm/panthor: enable fdinfo for memory stats
  drm/panthor: add sysfs knob for enabling job profiling

 .../testing/sysfs-driver-panthor-profiling    |  10 +
 Documentation/gpu/panthor.rst                 |  46 +++
 drivers/gpu/drm/panthor/panthor_devfreq.c     |  18 +-
 drivers/gpu/drm/panthor/panthor_device.h      |  36 ++
 drivers/gpu/drm/panthor/panthor_drv.c         |  73 ++++
 drivers/gpu/drm/panthor/panthor_gem.c         |  12 +
 drivers/gpu/drm/panthor/panthor_sched.c       | 384 +++++++++++++++---
 drivers/gpu/drm/panthor/panthor_sched.h       |   2 +
 8 files changed, 531 insertions(+), 50 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-panthor-profiling
 create mode 100644 Documentation/gpu/panthor.rst
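
For readers unfamiliar with how a tool such as gputop might consume
these stats, here is a rough userspace sketch in Python. It is not taken
from the series: it assumes the keys follow the generic DRM usage-stats
format (drm-engine-<name>, drm-cycles-<name>, drm-total-<region>), and
the "panthor" engine name, the sample values and all helper names are
illustrative assumptions. A real tool would read the text from
/proc/<pid>/fdinfo/<fd> at two points in time instead of using the
embedded samples.

```python
# Illustrative fdinfo snapshots taken one second apart. The key layout follows
# the generic DRM usage-stats format; the "panthor" engine name and every value
# below are assumptions made for the sake of the example.
SAMPLE_T0 = (
    "drm-driver:\tpanthor\n"
    "drm-client-id:\t12\n"
    "drm-engine-panthor:\t1000000000 ns\n"
    "drm-cycles-panthor:\t500000000\n"
    "drm-total-memory:\t2048 KiB\n"
)
SAMPLE_T1 = (
    "drm-driver:\tpanthor\n"
    "drm-client-id:\t12\n"
    "drm-engine-panthor:\t1250000000 ns\n"
    "drm-cycles-panthor:\t700000000\n"
    "drm-total-memory:\t4096 KiB\n"
)

# Multipliers for the memory unit suffixes handled by this sketch.
UNITS = {"KiB": 1024, "MiB": 1024 ** 2}


def parse_fdinfo(text):
    """Return {key: (value, unit_or_None)} for numeric drm-* keys only."""
    stats = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or not key.startswith("drm-"):
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            unit = fields[1] if len(fields) > 1 else None
            stats[key] = (int(fields[0]), unit)
    return stats


def engine_utilisation(before, after, engine, wall_ns):
    """Fraction of the sampling interval during which the engine was busy."""
    key = f"drm-engine-{engine}"
    busy_ns = after[key][0] - before[key][0]
    return busy_ns / wall_ns


def average_frequency_hz(before, after, engine):
    """Average clock rate while busy: cycle delta divided by busy-time delta."""
    busy_ns = after[f"drm-engine-{engine}"][0] - before[f"drm-engine-{engine}"][0]
    cycles = after[f"drm-cycles-{engine}"][0] - before[f"drm-cycles-{engine}"][0]
    if busy_ns <= 0:
        return 0.0
    return cycles * 10 ** 9 / busy_ns


def memory_bytes(stats, key):
    """Convert a memory stat to bytes; a value without a suffix is in bytes."""
    value, unit = stats[key]
    return value * UNITS.get(unit, 1)


if __name__ == "__main__":
    t0, t1 = parse_fdinfo(SAMPLE_T0), parse_fdinfo(SAMPLE_T1)
    print("utilisation:", engine_utilisation(t0, t1, "panthor", 1_000_000_000))
    print("avg freq Hz:", average_frequency_hz(t0, t1, "panthor"))
    print("total bytes:", memory_bytes(t1, "drm-total-memory"))
```

Both the busy-time and the cycle counters are cumulative, so only the
difference between two samples is meaningful. Since the engines and
cycles tags are only shown when the corresponding profiling bits are
enabled, a real tool would also have to cope with either key being
absent.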