[RFC,6/8] drm: Document fdinfo format specification

Message ID	20210715091820.1613726-7-tvrtko.ursulin@linux.intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=2zS0=MH=lists.freedesktop.org=dri-devel-bounces@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 621E1610C7 From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> To: Intel-gfx@lists.freedesktop.org Subject: [RFC 6/8] drm: Document fdinfo format specification Date: Thu, 15 Jul 2021 10:18:18 +0100 Message-Id: <20210715091820.1613726-7-tvrtko.ursulin@linux.intel.com> In-Reply-To: <20210715091820.1613726-1-tvrtko.ursulin@linux.intel.com> References: <20210715091820.1613726-1-tvrtko.ursulin@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Cc: David M Nieto <David.Nieto@amd.com>, =?utf-8?q?Christian_K=C3=B6nig?= <christian.koenig@amd.com>, dri-devel@lists.freedesktop.org, Tvrtko Ursulin <tvrtko.ursulin@intel.com> Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	Per client GPU stats \| expand [RFC,0/8] Per client GPU stats [RFC,1/8] drm/i915: Explicitly track DRM clients [RFC,2/8] drm/i915: Make GEM contexts track DRM clients [RFC,3/8] drm/i915: Track runtime spent in closed and unreachable GEM contexts [RFC,4/8] drm/i915: Track all user contexts per client [RFC,5/8] drm/i915: Track context current active time [RFC,6/8] drm: Document fdinfo format specification [RFC,7/8] drm/i915: Expose client engine utilisation via fdinfo [RFC,8/8] drm/amdgpu: Convert to common fdinfo format

Tvrtko Ursulin July 15, 2021, 9:18 a.m. UTC

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Proposal to standardise the fdinfo text format as optionally output by DRM
drivers.

Idea is that a simple but, well defined, spec will enable generic
userspace tools to be written while at the same time avoiding a more heavy
handed approach of adding a mid-layer to DRM.

i915 implements a subset of the spec, everything apart from the memory
stats currently, and a matching intel_gpu_top tool exists.

Open is to see if AMD can migrate to using the proposed GPU utilisation
key-value pairs, or if they are not workable to see whether to go
vendor specific, or if a standardised  alternative can be found which is
workable for both drivers.

Same for the memory utilisation key-value pairs proposal.

v2:
 * Update for removal of name and pid.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David M Nieto <David.Nieto@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 Documentation/gpu/drm-usage-stats.rst | 97 +++++++++++++++++++++++++++
 Documentation/gpu/index.rst           |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/gpu/drm-usage-stats.rst

Daniel Stone July 23, 2021, 4:43 p.m. UTC | #1

Hi Tvrtko,
Thanks for typing this up!

On Thu, 15 Jul 2021 at 10:18, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
> +Mandatory fully standardised keys
> +---------------------------------
> +
> +- drm-driver: <str>
> +
> +String shall contain a fixed string uniquely identified the driver handling
> +the device in question. For example name of the respective kernel module.

I think let's be more prescriptive and just say that it is the module name.

> +Optional fully standardised keys
> +--------------------------------
> +
> +- drm-pdev: <aaaa:bb.cc.d>
> +
> +For PCI devices this should contain the PCI slot address of the device in
> +question.

How about just major:minor of the DRM render node device it's attached to?

> +- drm-client-id: <uint>
> +
> +Unique value relating to the open DRM file descriptor used to distinguish
> +duplicated and shared file descriptors. Conceptually the value should map 1:1
> +to the in kernel representation of `struct drm_file` instances.
> +
> +Uniqueness of the value shall be either globally unique, or unique within the
> +scope of each device, in which case `drm-pdev` shall be present as well.
> +
> +Userspace should make sure to not double account any usage statistics by using
> +the above described criteria in order to associate data to individual clients.
> +
> +- drm-engine-<str>: <uint> ns
> +
> +GPUs usually contain multiple execution engines. Each shall be given a stable
> +and unique name (str), with possible values documented in the driver specific
> +documentation.
> +
> +Value shall be in specified time units which the respective GPU engine spent
> +busy executing workloads belonging to this client.
> +
> +Values are not required to be constantly monotonic if it makes the driver
> +implementation easier, but are required to catch up with the previously reported
> +larger value within a reasonable period. Upon observing a value lower than what
> +was previously read, userspace is expected to stay with that larger previous
> +value until a monotonic update is seen.

Yeah, that would work well for Mali/Panfrost. We can queue multiple
jobs in the hardware, which can either be striped across multiple
cores with an affinity mask (e.g. 3 cores for your client and 1 for
your compositor), or picked according to priority, or ...

The fine-grained performance counters (e.g. time spent waiting for
sampler) are only GPU-global. So if you have two jobs running
simultaneously, you have no idea who's responsible for what.

But it does give us coarse-grained counters which are accounted
per-job-slot, including exactly this metric: amount of 'GPU time'
(whatever that means) occupied by that job slot during the sampling
period. So we could support that nicely if we fenced job-slot updates
with register reads/writes.

Something I'm missing though is how we enable this information. Seems
like it would be best to either only do it whilst fdinfo is open (and
re-read it whenever you need an update), or on a per-driver sysfs
toggle, or ... ?

> +- drm-memory-<str>: <uint> [KiB|MiB]
> +
> +Each possible memory type which can be used to store buffer objects by the
> +GPU in question shall be given a stable and unique name to be returned as the
> +string here.
> +
> +Value shall reflect the amount of storage currently consumed by the buffer
> +object belong to this client, in the respective memory region.
> +
> +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> +indicating kibi- or mebi-bytes.

I'm a bit wary of the accounting here. Is it buffer allocations
originating from the client, in which case it conceptually clashes
with gralloc? Is it the client which last wrote to the buffer? The
client with the oldest open handle to the buffer? Other?

Cheers,
Daniel

Daniel Vetter July 23, 2021, 4:47 p.m. UTC | #2

On Fri, Jul 23, 2021 at 05:43:01PM +0100, Daniel Stone wrote:
> Hi Tvrtko,
> Thanks for typing this up!
> 
> On Thu, 15 Jul 2021 at 10:18, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
> > +Mandatory fully standardised keys
> > +---------------------------------
> > +
> > +- drm-driver: <str>
> > +
> > +String shall contain a fixed string uniquely identified the driver handling
> > +the device in question. For example name of the respective kernel module.
> 
> I think let's be more prescriptive and just say that it is the module name.

Just a quick comment on this one.

drm_driver.name is already uapi, so let's please not invent a new one. The
shared code should probably make sure drivers don't get this wrong. Maybe
good if we document the getverion ioctl, which also exposes this, and then
link between the two.
-Daniel

> 
> > +Optional fully standardised keys
> > +--------------------------------
> > +
> > +- drm-pdev: <aaaa:bb.cc.d>
> > +
> > +For PCI devices this should contain the PCI slot address of the device in
> > +question.
> 
> How about just major:minor of the DRM render node device it's attached to?
> 
> > +- drm-client-id: <uint>
> > +
> > +Unique value relating to the open DRM file descriptor used to distinguish
> > +duplicated and shared file descriptors. Conceptually the value should map 1:1
> > +to the in kernel representation of `struct drm_file` instances.
> > +
> > +Uniqueness of the value shall be either globally unique, or unique within the
> > +scope of each device, in which case `drm-pdev` shall be present as well.
> > +
> > +Userspace should make sure to not double account any usage statistics by using
> > +the above described criteria in order to associate data to individual clients.
> > +
> > +- drm-engine-<str>: <uint> ns
> > +
> > +GPUs usually contain multiple execution engines. Each shall be given a stable
> > +and unique name (str), with possible values documented in the driver specific
> > +documentation.
> > +
> > +Value shall be in specified time units which the respective GPU engine spent
> > +busy executing workloads belonging to this client.
> > +
> > +Values are not required to be constantly monotonic if it makes the driver
> > +implementation easier, but are required to catch up with the previously reported
> > +larger value within a reasonable period. Upon observing a value lower than what
> > +was previously read, userspace is expected to stay with that larger previous
> > +value until a monotonic update is seen.
> 
> Yeah, that would work well for Mali/Panfrost. We can queue multiple
> jobs in the hardware, which can either be striped across multiple
> cores with an affinity mask (e.g. 3 cores for your client and 1 for
> your compositor), or picked according to priority, or ...
> 
> The fine-grained performance counters (e.g. time spent waiting for
> sampler) are only GPU-global. So if you have two jobs running
> simultaneously, you have no idea who's responsible for what.
> 
> But it does give us coarse-grained counters which are accounted
> per-job-slot, including exactly this metric: amount of 'GPU time'
> (whatever that means) occupied by that job slot during the sampling
> period. So we could support that nicely if we fenced job-slot updates
> with register reads/writes.
> 
> Something I'm missing though is how we enable this information. Seems
> like it would be best to either only do it whilst fdinfo is open (and
> re-read it whenever you need an update), or on a per-driver sysfs
> toggle, or ... ?
> 
> > +- drm-memory-<str>: <uint> [KiB|MiB]
> > +
> > +Each possible memory type which can be used to store buffer objects by the
> > +GPU in question shall be given a stable and unique name to be returned as the
> > +string here.
> > +
> > +Value shall reflect the amount of storage currently consumed by the buffer
> > +object belong to this client, in the respective memory region.
> > +
> > +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> > +indicating kibi- or mebi-bytes.
> 
> I'm a bit wary of the accounting here. Is it buffer allocations
> originating from the client, in which case it conceptually clashes
> with gralloc? Is it the client which last wrote to the buffer? The
> client with the oldest open handle to the buffer? Other?
> 
> Cheers,
> Daniel

Nieto, David M July 23, 2021, 5:45 p.m. UTC | #3

[AMD Official Use Only]

I just want to make a comment that with this approach (the ns) calculating the percentage will take at least two reads of the fdinfo per pid over some time. Some engines may be able to provide a single shot percentage usage over an internal integration period. That is, for example, what we currently have implemented for that exact reason.

I'd like to propose that we add an optional set of fields for this. Also, I may have missed a message, but why did we remove the timstamp? It is needed for accurate measurements of engine usage.

David

Tvrtko Ursulin July 26, 2021, 8:57 a.m. UTC | #4

On 23/07/2021 17:43, Daniel Stone wrote:
> Hi Tvrtko,
> Thanks for typing this up!
> 
> On Thu, 15 Jul 2021 at 10:18, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> +Mandatory fully standardised keys
>> +---------------------------------
>> +
>> +- drm-driver: <str>
>> +
>> +String shall contain a fixed string uniquely identified the driver handling
>> +the device in question. For example name of the respective kernel module.
> 
> I think let's be more prescriptive and just say that it is the module name.

I liked the drm_driver.name the other Daniel made so I'll go with that.

>> +Optional fully standardised keys
>> +--------------------------------
>> +
>> +- drm-pdev: <aaaa:bb.cc.d>
>> +
>> +For PCI devices this should contain the PCI slot address of the device in
>> +question.
> 
> How about just major:minor of the DRM render node device it's attached to?

I don't have a strong opinion on this one. I can add it, but might keep 
the drm-dev tag under the optional list because it is handy for 
intel_gpu_top multi-device support. Or maybe the lookup to pci device is 
easier than I think now so okay, on my todo list to check.

>> +- drm-client-id: <uint>
>> +
>> +Unique value relating to the open DRM file descriptor used to distinguish
>> +duplicated and shared file descriptors. Conceptually the value should map 1:1
>> +to the in kernel representation of `struct drm_file` instances.
>> +
>> +Uniqueness of the value shall be either globally unique, or unique within the
>> +scope of each device, in which case `drm-pdev` shall be present as well.
>> +
>> +Userspace should make sure to not double account any usage statistics by using
>> +the above described criteria in order to associate data to individual clients.
>> +
>> +- drm-engine-<str>: <uint> ns
>> +
>> +GPUs usually contain multiple execution engines. Each shall be given a stable
>> +and unique name (str), with possible values documented in the driver specific
>> +documentation.
>> +
>> +Value shall be in specified time units which the respective GPU engine spent
>> +busy executing workloads belonging to this client.
>> +
>> +Values are not required to be constantly monotonic if it makes the driver
>> +implementation easier, but are required to catch up with the previously reported
>> +larger value within a reasonable period. Upon observing a value lower than what
>> +was previously read, userspace is expected to stay with that larger previous
>> +value until a monotonic update is seen.
> 
> Yeah, that would work well for Mali/Panfrost. We can queue multiple
> jobs in the hardware, which can either be striped across multiple
> cores with an affinity mask (e.g. 3 cores for your client and 1 for
> your compositor), or picked according to priority, or ...
> 
> The fine-grained performance counters (e.g. time spent waiting for
> sampler) are only GPU-global. So if you have two jobs running
> simultaneously, you have no idea who's responsible for what.
> 
> But it does give us coarse-grained counters which are accounted
> per-job-slot, including exactly this metric: amount of 'GPU time'
> (whatever that means) occupied by that job slot during the sampling
> period. So we could support that nicely if we fenced job-slot updates
> with register reads/writes.
> 
> Something I'm missing though is how we enable this information. Seems
> like it would be best to either only do it whilst fdinfo is open (and
> re-read it whenever you need an update), or on a per-driver sysfs
> toggle, or ... ?

Presumably there is non-trivial cost for querying this data on your driver?

Would it be workable to enable tracking on first use and stop some time 
after last? Just a thought which may have significant downsides from 
driver to driver.

>> +- drm-memory-<str>: <uint> [KiB|MiB]
>> +
>> +Each possible memory type which can be used to store buffer objects by the
>> +GPU in question shall be given a stable and unique name to be returned as the
>> +string here.
>> +
>> +Value shall reflect the amount of storage currently consumed by the buffer
>> +object belong to this client, in the respective memory region.
>> +
>> +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
>> +indicating kibi- or mebi-bytes.
> 
> I'm a bit wary of the accounting here. Is it buffer allocations
> originating from the client, in which case it conceptually clashes
> with gralloc? Is it the client which last wrote to the buffer? The
> client with the oldest open handle to the buffer? Other?

Haven't looked into AMD code here so know what they export.

Gralloc allocates buffer from it's own drm client and shares them or it 
is just a library which runs from a client context?

Regards,

Tvrtko

Tvrtko Ursulin July 26, 2021, 9:01 a.m. UTC | #5

On 23/07/2021 18:45, Nieto, David M wrote:
> [AMD Official Use Only]
> 
> 
> I just want to make a comment that with this approach (the ns) 
> calculating the percentage will take at least two reads of the fdinfo 
> per pid over some time. Some engines may be able to provide a single 
> shot percentage usage over an internal integration period. That is, for 
> example, what we currently have implemented for that exact reason.
> 
> I'd like to propose that we add an optional set of fields for this. 

Yes it is already like that in the text I've sent out. Because I was unclear how the amdgpu accounting works I called out for you guys to fill in the blanks in the last patch:

"""
Opens:
  * Does it work for AMD?
  * What are the semantics of AMD engine utilisation reported in percents?
    Can it align with what i915 does or needs to document the alternative
    in the specification document?

"""

"""
-- drm-engine-<str>: <uint> ns
+- drm-engine-<str>: <uint> [ns|%]
...
+Where time unit is given as a percentage...[AMD folks to fill the semantics
+and interpretation of that]...
"""

So if cumulative nanoseconds definitely do not work for you, could you please fill in those blanks?

> Also, I may have missed a message, but why did we remove the timstamp? 
> It is needed for accurate measurements of engine usage.

Hm I did not remove anything - I only renamed some of the fields output from amdgpu fdinfo.

Regards,

Tvrtko
  
> David
> ------------------------------------------------------------------------
> *From:* Daniel Vetter <daniel@ffwll.ch>
> *Sent:* Friday, July 23, 2021 9:47 AM
> *To:* Daniel Stone <daniel@fooishbar.org>
> *Cc:* Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>; intel-gfx 
> <Intel-gfx@lists.freedesktop.org>; Tvrtko Ursulin 
> <tvrtko.ursulin@intel.com>; Koenig, Christian 
> <Christian.Koenig@amd.com>; dri-devel <dri-devel@lists.freedesktop.org>; 
> Nieto, David M <David.Nieto@amd.com>
> *Subject:* Re: [RFC 6/8] drm: Document fdinfo format specification
> On Fri, Jul 23, 2021 at 05:43:01PM +0100, Daniel Stone wrote:
>> Hi Tvrtko,
>> Thanks for typing this up!
>> 
>> On Thu, 15 Jul 2021 at 10:18, Tvrtko Ursulin
>> <tvrtko.ursulin@linux.intel.com> wrote:
>> > +Mandatory fully standardised keys
>> > +---------------------------------
>> > +
>> > +- drm-driver: <str>
>> > +
>> > +String shall contain a fixed string uniquely identified the driver handling
>> > +the device in question. For example name of the respective kernel module.
>> 
>> I think let's be more prescriptive and just say that it is the module name.
> 
> Just a quick comment on this one.
> 
> drm_driver.name is already uapi, so let's please not invent a new one. The
> shared code should probably make sure drivers don't get this wrong. Maybe
> good if we document the getverion ioctl, which also exposes this, and then
> link between the two.
> -Daniel
> 
>> 
>> > +Optional fully standardised keys
>> > +--------------------------------
>> > +
>> > +- drm-pdev: <aaaa:bb.cc.d>
>> > +
>> > +For PCI devices this should contain the PCI slot address of the device in
>> > +question.
>> 
>> How about just major:minor of the DRM render node device it's attached to?
>> 
>> > +- drm-client-id: <uint>
>> > +
>> > +Unique value relating to the open DRM file descriptor used to distinguish
>> > +duplicated and shared file descriptors. Conceptually the value should map 1:1
>> > +to the in kernel representation of `struct drm_file` instances.
>> > +
>> > +Uniqueness of the value shall be either globally unique, or unique within the
>> > +scope of each device, in which case `drm-pdev` shall be present as well.
>> > +
>> > +Userspace should make sure to not double account any usage statistics by using
>> > +the above described criteria in order to associate data to individual clients.
>> > +
>> > +- drm-engine-<str>: <uint> ns
>> > +
>> > +GPUs usually contain multiple execution engines. Each shall be given a stable
>> > +and unique name (str), with possible values documented in the driver specific
>> > +documentation.
>> > +
>> > +Value shall be in specified time units which the respective GPU engine spent
>> > +busy executing workloads belonging to this client.
>> > +
>> > +Values are not required to be constantly monotonic if it makes the driver
>> > +implementation easier, but are required to catch up with the previously reported
>> > +larger value within a reasonable period. Upon observing a value lower than what
>> > +was previously read, userspace is expected to stay with that larger previous
>> > +value until a monotonic update is seen.
>> 
>> Yeah, that would work well for Mali/Panfrost. We can queue multiple
>> jobs in the hardware, which can either be striped across multiple
>> cores with an affinity mask (e.g. 3 cores for your client and 1 for
>> your compositor), or picked according to priority, or ...
>> 
>> The fine-grained performance counters (e.g. time spent waiting for
>> sampler) are only GPU-global. So if you have two jobs running
>> simultaneously, you have no idea who's responsible for what.
>> 
>> But it does give us coarse-grained counters which are accounted
>> per-job-slot, including exactly this metric: amount of 'GPU time'
>> (whatever that means) occupied by that job slot during the sampling
>> period. So we could support that nicely if we fenced job-slot updates
>> with register reads/writes.
>> 
>> Something I'm missing though is how we enable this information. Seems
>> like it would be best to either only do it whilst fdinfo is open (and
>> re-read it whenever you need an update), or on a per-driver sysfs
>> toggle, or ... ?
>> 
>> > +- drm-memory-<str>: <uint> [KiB|MiB]
>> > +
>> > +Each possible memory type which can be used to store buffer objects by the
>> > +GPU in question shall be given a stable and unique name to be returned as the
>> > +string here.
>> > +
>> > +Value shall reflect the amount of storage currently consumed by the buffer
>> > +object belong to this client, in the respective memory region.
>> > +
>> > +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
>> > +indicating kibi- or mebi-bytes.
>> 
>> I'm a bit wary of the accounting here. Is it buffer allocations
>> originating from the client, in which case it conceptually clashes
>> with gralloc? Is it the client which last wrote to the buffer? The
>> client with the oldest open handle to the buffer? Other?
>> 
>> Cheers,
>> Daniel
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7CDavid.Nieto%40amd.com%7Cda2d9f95ced44d09f66c08d94df991da%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637626556571460650%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=GrjAYg9tG2QX7z4BDaqa4wMPj2nFcvGo4xCmD8OzwNE%3D&amp;reserved=0 
> <https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7CDavid.Nieto%40amd.com%7Cda2d9f95ced44d09f66c08d94df991da%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637626556571460650%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=GrjAYg9tG2QX7z4BDaqa4wMPj2nFcvGo4xCmD8OzwNE%3D&amp;reserved=0>

[RFC,6/8] drm: Document fdinfo format specification

Commit Message

Comments

Patch