diff mbox series

[12/12] accel/ivpu: Share NPU busy time in sysfs

Message ID 20240508132931.2388996-1-jacek.lawrynowicz@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series accel/ivpu: Changes for 6.10 | expand

Commit Message

Jacek Lawrynowicz May 8, 2024, 1:29 p.m. UTC
From: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>

The driver tracks the time spent by NPU executing jobs
and shares it through sysfs `npu_busy_time_us` file.
It can be then used by user space applications to monitor device
utilization.

NPU is considered 'busy' starting with a first job submitted
to firmware and ending when there is no more jobs pending/executing.

Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
---
 drivers/accel/ivpu/Makefile     |  3 +-
 drivers/accel/ivpu/ivpu_drv.c   |  2 ++
 drivers/accel/ivpu/ivpu_drv.h   |  3 ++
 drivers/accel/ivpu/ivpu_job.c   | 23 ++++++++++++-
 drivers/accel/ivpu/ivpu_sysfs.c | 58 +++++++++++++++++++++++++++++++++
 drivers/accel/ivpu/ivpu_sysfs.h | 13 ++++++++
 6 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h

Comments

Jeffrey Hugo May 10, 2024, 4:55 p.m. UTC | #1
On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
> From: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
> 
> The driver tracks the time spent by NPU executing jobs
> and shares it through sysfs `npu_busy_time_us` file.
> It can be then used by user space applications to monitor device
> utilization.
> 
> NPU is considered 'busy' starting with a first job submitted
> to firmware and ending when there is no more jobs pending/executing.
> 
> Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>

This feels like something that would normally be handled by perf. Why 
not use that mechanism?

-Jeff
Jacek Lawrynowicz May 13, 2024, 10:22 a.m. UTC | #2
Hi,

On 10.05.2024 18:55, Jeffrey Hugo wrote:
> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>> From: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
>>
>> The driver tracks the time spent by NPU executing jobs
>> and shares it through sysfs `npu_busy_time_us` file.
>> It can be then used by user space applications to monitor device
>> utilization.
>>
>> NPU is considered 'busy' starting with a first job submitted
>> to firmware and ending when there is no more jobs pending/executing.
>>
>> Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
>> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> 
> This feels like something that would normally be handled by perf. Why not use that mechanism?

Yeah, probably but we had several request to provide easy to use interface for this metric that
could be integrated in various user space apps/tools that do not use ftrace.
Tvrtko Ursulin May 13, 2024, 10:45 a.m. UTC | #3
On 13/05/2024 11:22, Jacek Lawrynowicz wrote:
> Hi,
> 
> On 10.05.2024 18:55, Jeffrey Hugo wrote:
>> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>>> From: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
>>>
>>> The driver tracks the time spent by NPU executing jobs
>>> and shares it through sysfs `npu_busy_time_us` file.
>>> It can be then used by user space applications to monitor device
>>> utilization.
>>>
>>> NPU is considered 'busy' starting with a first job submitted
>>> to firmware and ending when there is no more jobs pending/executing.
>>>
>>> Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
>>> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
>>
>> This feels like something that would normally be handled by perf. Why not use that mechanism?
> 
> Yeah, probably but we had several request to provide easy to use interface for this metric that
> could be integrated in various user space apps/tools that do not use ftrace.

Probably more Perf/PMU aka performance counters? Which would be 
scriptable via $kernel/tools/perf or directly via perf_event_open(2) and 
read(2).

Note it is not easy to get right and in the i915 implementation (see 
i915_pmu.c) we have a known issue with PCI hot unplug and use after free 
which needs input from perf core folks.

Regards,

Tvrtko
Jacek Lawrynowicz May 13, 2024, 11:39 a.m. UTC | #4
Hi,

On 13.05.2024 12:45, Tvrtko Ursulin wrote:
> 
> On 13/05/2024 11:22, Jacek Lawrynowicz wrote:
>> Hi,
>>
>> On 10.05.2024 18:55, Jeffrey Hugo wrote:
>>> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>>>> From: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
>>>>
>>>> The driver tracks the time spent by NPU executing jobs
>>>> and shares it through sysfs `npu_busy_time_us` file.
>>>> It can be then used by user space applications to monitor device
>>>> utilization.
>>>>
>>>> NPU is considered 'busy' starting with a first job submitted
>>>> to firmware and ending when there is no more jobs pending/executing.
>>>>
>>>> Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
>>>> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
>>>
>>> This feels like something that would normally be handled by perf. Why not use that mechanism?
>>
>> Yeah, probably but we had several request to provide easy to use interface for this metric that
>> could be integrated in various user space apps/tools that do not use ftrace.
> 
> Probably more Perf/PMU aka performance counters? Which would be scriptable via $kernel/tools/perf or directly via perf_event_open(2) and read(2).
> 
> Note it is not easy to get right and in the i915 implementation (see i915_pmu.c) we have a known issue with PCI hot unplug and use after free which needs input from perf core folks.

OK, we will consider using perf/pmu for NPU but for the moment I would like to keep this sysfs interface.
It so simple it can be used from bash and it always can be removed if obsoleted by something fancier.
diff mbox series

Patch

diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index 726cf8f28ea3..f61d8d3320e2 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -14,7 +14,8 @@  intel_vpu-y := \
 	ivpu_mmu.o \
 	ivpu_mmu_context.o \
 	ivpu_ms.o \
-	ivpu_pm.o
+	ivpu_pm.o \
+	ivpu_sysfs.o
 
 intel_vpu-$(CONFIG_DEBUG_FS) += ivpu_debugfs.o
 
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 87c48fa8d719..b34d1766891c 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -28,6 +28,7 @@ 
 #include "ivpu_mmu_context.h"
 #include "ivpu_ms.h"
 #include "ivpu_pm.h"
+#include "ivpu_sysfs.h"
 
 #ifndef DRIVER_VERSION_STR
 #define DRIVER_VERSION_STR __stringify(DRM_IVPU_DRIVER_MAJOR) "." \
@@ -696,6 +697,7 @@  static int ivpu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return ret;
 
 	ivpu_debugfs_init(vdev);
+	ivpu_sysfs_init(vdev);
 
 	ret = drm_dev_register(&vdev->drm, 0);
 	if (ret) {
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index d55f0bdffd71..04a054080eff 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -135,6 +135,9 @@  struct ivpu_device {
 
 	atomic64_t unique_id_counter;
 
+	ktime_t busy_start_ts;
+	ktime_t busy_time;
+
 	struct {
 		int boot;
 		int jsm;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 1d7b4388eb3b..845181b48b3a 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -438,11 +438,28 @@  ivpu_job_create(struct ivpu_file_priv *file_priv, u32 engine_idx, u32 bo_count)
 	return NULL;
 }
 
+static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *vdev, u32 job_id)
+{
+	struct ivpu_job *job;
+
+	xa_lock(&vdev->submitted_jobs_xa);
+	job = __xa_erase(&vdev->submitted_jobs_xa, job_id);
+
+	if (xa_empty(&vdev->submitted_jobs_xa) && job) {
+		vdev->busy_time = ktime_add(ktime_sub(ktime_get(), vdev->busy_start_ts),
+					    vdev->busy_time);
+	}
+
+	xa_unlock(&vdev->submitted_jobs_xa);
+
+	return job;
+}
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
 {
 	struct ivpu_job *job;
 
-	job = xa_erase(&vdev->submitted_jobs_xa, job_id);
+	job = ivpu_job_remove_from_submitted_jobs(vdev, job_id);
 	if (!job)
 		return -ENOENT;
 
@@ -477,6 +494,7 @@  static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 	struct ivpu_device *vdev = job->vdev;
 	struct xa_limit job_id_range;
 	struct ivpu_cmdq *cmdq;
+	bool is_first_job;
 	int ret;
 
 	ret = ivpu_rpm_get(vdev);
@@ -497,6 +515,7 @@  static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 	job_id_range.max = job_id_range.min | JOB_ID_JOB_MASK;
 
 	xa_lock(&vdev->submitted_jobs_xa);
+	is_first_job = xa_empty(&vdev->submitted_jobs_xa);
 	ret = __xa_alloc(&vdev->submitted_jobs_xa, &job->job_id, job, job_id_range, GFP_KERNEL);
 	if (ret) {
 		ivpu_dbg(vdev, JOB, "Too many active jobs in ctx %d\n",
@@ -516,6 +535,8 @@  static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 		wmb(); /* Flush WC buffer for jobq header */
 	} else {
 		ivpu_cmdq_ring_db(vdev, cmdq);
+		if (is_first_job)
+			vdev->busy_start_ts = ktime_get();
 	}
 
 	ivpu_dbg(vdev, JOB, "Job submitted: id %3u ctx %2d engine %d prio %d addr 0x%llx next %d\n",
diff --git a/drivers/accel/ivpu/ivpu_sysfs.c b/drivers/accel/ivpu/ivpu_sysfs.c
new file mode 100644
index 000000000000..913669f1786e
--- /dev/null
+++ b/drivers/accel/ivpu/ivpu_sysfs.c
@@ -0,0 +1,58 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024 Intel Corporation
+ */
+
+#include <linux/device.h>
+#include <linux/err.h>
+
+#include "ivpu_hw.h"
+#include "ivpu_sysfs.h"
+
+/*
+ * npu_busy_time_us is the time that the device spent executing jobs.
+ * The time is counted when and only when there are jobs submitted to firmware.
+ *
+ * This time can be used to measure the utilization of NPU, either by calculating
+ * npu_busy_time_us difference between two timepoints (i.e. measuring the time
+ * that the NPU was active during some workload) or monitoring utilization percentage
+ * by reading npu_busy_time_us periodically.
+ *
+ * When reading the value periodically, it shouldn't be read too often as it may have
+ * an impact on job submission performance. Recommended period is 1 second.
+ */
+static ssize_t
+npu_busy_time_us_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct drm_device *drm = dev_get_drvdata(dev);
+	struct ivpu_device *vdev = to_ivpu_device(drm);
+	ktime_t total, now = 0;
+
+	xa_lock(&vdev->submitted_jobs_xa);
+	total = vdev->busy_time;
+	if (!xa_empty(&vdev->submitted_jobs_xa))
+		now = ktime_sub(ktime_get(), vdev->busy_start_ts);
+	xa_unlock(&vdev->submitted_jobs_xa);
+
+	return sysfs_emit(buf, "%lld\n", ktime_to_us(ktime_add(total, now)));
+}
+
+static DEVICE_ATTR_RO(npu_busy_time_us);
+
+static struct attribute *ivpu_dev_attrs[] = {
+	&dev_attr_npu_busy_time_us.attr,
+	NULL,
+};
+
+static struct attribute_group ivpu_dev_attr_group = {
+	.attrs = ivpu_dev_attrs,
+};
+
+void ivpu_sysfs_init(struct ivpu_device *vdev)
+{
+	int ret;
+
+	ret = devm_device_add_group(vdev->drm.dev, &ivpu_dev_attr_group);
+	if (ret)
+		ivpu_warn(vdev, "Failed to add group to device, ret %d", ret);
+}
diff --git a/drivers/accel/ivpu/ivpu_sysfs.h b/drivers/accel/ivpu/ivpu_sysfs.h
new file mode 100644
index 000000000000..9836f09b35a3
--- /dev/null
+++ b/drivers/accel/ivpu/ivpu_sysfs.h
@@ -0,0 +1,13 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2024 Intel Corporation
+ */
+
+#ifndef __IVPU_SYSFS_H__
+#define __IVPU_SYSFS_H__
+
+#include "ivpu_drv.h"
+
+void ivpu_sysfs_init(struct ivpu_device *vdev);
+
+#endif /* __IVPU_SYSFS_H__ */